Using Machine Learning to Determine the Programming Language of Text

Importing the Necessary Libraries

Standard Python import statements:

```python
import pandas as pd
import numpy as np

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import LinearSVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.naive_bayes import MultinomialNB

from sklearn.model_selection import train_test_split, cross_val_score
from sklearn.utils import shuffle
from sklearn.metrics import precision_score, classification_report, accuracy_score

from sklearn.pipeline import FeatureUnion
from sklearn.preprocessing import LabelEncoder

import re
import time
```

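Retrieving and Parsing the Data

Most of my time in this challenge went into working out how to parse the data effectively: extracting the language name from the text, and then removing that information from the text so that it does not contaminate our training and test datasets.

Below is an example of two text strings/segments (each spans multiple lines and contains carriage returns):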

```
<pre lang="Swift">
@objc func handleTap(sender: UITapGestureRecognizer) {
    if let tappedSceneView = sender.view as? ARSCNView {
        let tapLocationInView = sender.location(in: tappedSceneView)
        let planeHitTest = tappedSceneView.hitTest(tapLocationInView,
            types: .existingPlaneUsingExtent)
        if !planeHitTest.isEmpty {
            addFurniture(hitTest: planeHitTest)
        }
    }
}
</pre>
<pre lang="JavaScript">
var my_dataset = [
    {
        id: "1",
        text: "Chairman & CEO",
        title: "Henry Bennett"
    },
    {
        id: "2",
        text: "Manager",
        title: "Mildred Kim"
    },
    {
        id: "3",
        text: "Technical Director",
        title: "Jerry Wagner"
    },
    { id: "1-2", from: "1", to: "2", type: "line" },
    { id: "1-3", from: "1", to: "3", type: "line" }
];
</pre>
```

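The tricky part was getting a regular expression to return the data inside the "<pre lang...></pre>" tags, and then creating another regular expression that returns only the "lang" portion of the "pre" tag.

It isn't pretty, and I'm sure it could be optimized, but it works: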

```python
def get_data():
    file_name = './LanguageSamples.txt'
    # read the whole sample file and close it again when done
    with open(file_name, 'r') as rawdata:
        lines = rawdata.readlines()
    return lines

def clean_data(input_lines):
    # find matches for all data within the pre tags
    all_found = re.findall(r'<pre[\s\S]*?<\/pre>', input_lines, re.MULTILINE)

    # clean the string of various tags
    clean_string = lambda x: (x.replace('&lt;', '<')
                               .replace('&gt;', '>')
                               .replace('</pre>', '')
                               .replace('\n', ''))
    all_found = [clean_string(item) for item in all_found]

    # get the language for all of the pre tags
    get_language = lambda x: re.findall(r'<pre lang="(.*?)">', x, re.MULTILINE)[0]
    lang_items = [get_language(item) for item in all_found]

    # remove all of the pre tags that contain the language
    remove_lang = lambda x: re.sub(r'<pre lang="(.*?)">', "", x)
    all_found = [remove_lang(item) for item in all_found]

    # return the text between the pre tags and their corresponding language
    return (all_found, lang_items)
```
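To make the two regular-expression steps concrete, here is a minimal sketch that runs over a short in-memory string instead of LanguageSamples.txt. The sample text and the helper name extract_samples are made up for illustration. It is a simplified variant that captures the language and the body with one pattern and keeps the line breaks inside each snippet, so it is not a drop-in copy of the function above.

```python
import re

# Hypothetical sample in the same <pre lang="..."> format as the data file.
SAMPLE = (
    '<pre lang="Python">\n'
    'print("hi")\n'
    '</pre>\n'
    'some text between blocks\n'
    '<pre lang="SQL">\n'
    'SELECT 1 &lt; 2;\n'
    '</pre>\n'
)

def extract_samples(text):
    """Return (snippets, languages) for every <pre lang="..."> block in text."""
    snippets, languages = [], []
    # One pattern with two groups: the lang attribute and the block body.
    for lang, body in re.findall(r'<pre lang="(.*?)">([\s\S]*?)</pre>', text):
        # Undo the HTML escaping of angle brackets.
        body = body.replace('&lt;', '<').replace('&gt;', '>')
        snippets.append(body.strip())
        languages.append(lang)
    return snippets, languages

snippets, languages = extract_samples(SAMPLE)
print(languages)  # ['Python', 'SQL']
print(snippets)   # ['print("hi")', 'SELECT 1 < 2;']
```

Text outside the pre tags is ignored, and the language label never appears in the returned snippets, which is the property the parsing step needs.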

Creating the Pandas DataFrame

Here we fetch the data, create a DataFrame, and populate it with the data.

```python
all_samples = ''.join(get_data())
cleaned_data, languages = clean_data(all_samples)

df = pd.DataFrame()
df['lang_text'] = languages
df['data'] = cleaned_data
```

Here is what our DataFrame looks like:

[Figure: the initial DataFrame]

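Creating the Categorical Column

Next we need to turn our "lang_text" column into a numeric column, because that is what many machine learning models expect for the "Y", or output, that they are trying to determine. To do this, we use LabelEncoder to convert our "lang_text" column into a categorical column.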

```python
lb_enc = LabelEncoder()
df['language'] = lb_enc.fit_transform(df['lang_text'])
```

Our DataFrame now looks like this:

[Figure: the DataFrame with the new column]
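Conceptually, LabelEncoder sorts the distinct labels and replaces each label with its position in that sorted list. The plain-Python sketch below mimics that behaviour on a made-up list of labels. The helper encode_labels is hypothetical and exists only for illustration; it is not part of scikit-learn.

```python
def encode_labels(labels):
    """Map each label to its index in the sorted list of distinct labels."""
    classes = sorted(set(labels))
    index_of = {label: i for i, label in enumerate(classes)}
    return classes, [index_of[label] for label in labels]

# Made-up labels for illustration only.
labels = ['Swift', 'JavaScript', 'C#', 'Swift', 'Python']
classes, encoded = encode_labels(labels)
print(classes)  # ['C#', 'JavaScript', 'Python', 'Swift']
print(encoded)  # [3, 1, 0, 3, 2]
```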

We can see how the column was encoded by running:

```python
lb_enc.classes_
```

This displays the following (the position in the array matches the integer value in the new "language" categorical column):

```
array(['ASM', 'ASP.NET', 'Angular', 'C#', 'C++', 'CSS', 'Delphi', 'HTML',
       'Java', 'JavaScript', 'Javascript', 'ObjectiveC', 'PERL', 'PHP',
       'Pascal', 'PowerShell', 'Powershell', 'Python', 'Razor', 'React',
       'Ruby', 'SQL', 'Scala', 'Swift', 'TypeScript', 'VB.NET', 'XML'], dtype=object)
```
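The array contains both 'JavaScript' and 'Javascript', and both 'PowerShell' and 'Powershell', so the encoder treats each spelling as a separate class. If that is not intended, one option is to normalise the labels before encoding. The sketch below shows one possible approach. The CANONICAL mapping and the normalize_label helper are assumptions made for illustration and do not appear in the original code.

```python
# Hypothetical mapping from lower-cased spellings to one canonical label.
CANONICAL = {
    'javascript': 'JavaScript',
    'powershell': 'PowerShell',
}

def normalize_label(label):
    """Return the canonical spelling for known case variants, else the label unchanged."""
    return CANONICAL.get(label.lower(), label)

raw = ['JavaScript', 'Javascript', 'PowerShell', 'Powershell', 'C#']
normalized = [normalize_label(label) for label in raw]
print(normalized)
# ['JavaScript', 'JavaScript', 'PowerShell', 'PowerShell', 'C#']
```

Applied to the DataFrame, this would be df['lang_text'].map(normalize_label) before calling fit_transform. The results reported in this article were produced without such a step, which is why the reports list 27 classes.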

Boilerplate Code

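The next steps are:

  1. Declare a function that outputs the training results
  2. Declare a function that trains and tests the models
  3. Declare a function that creates the models to test
  4. Shuffle the data
  5. Split the data into training and test sets
  6. Pass the data and the models into the train-and-test function and look at the results: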
```python
def output_accuracy(actual_y, predicted_y, model_name, train_time, predict_time):
    # print timing, overall accuracy and the per-class report for one model
    print('Model Name: ' + model_name)
    print('Train time: ', round(train_time, 2))
    print('Predict time: ', round(predict_time, 2))
    print('Model Accuracy: {:.4f}'.format(accuracy_score(actual_y, predicted_y)))
    print('')
    print(classification_report(actual_y, predicted_y, digits=4))
    print("=======================================================")

def test_models(X_train_input_raw, y_train_input, X_test_input_raw, y_test_input, models_dict):
    return_trained_models = {}

    # turn the raw text into TF-IDF features; fit on the training text only
    return_vectorizer = FeatureUnion([('tfidf_vect', TfidfVectorizer())])

    X_train = return_vectorizer.fit_transform(X_train_input_raw)
    X_test = return_vectorizer.transform(X_test_input_raw)

    for key in models_dict:
        model_name = key
        model = models_dict[key]
        t1 = time.time()
        model.fit(X_train, y_train_input)
        t2 = time.time()
        predicted_y = model.predict(X_test)
        t3 = time.time()

        output_accuracy(y_test_input, predicted_y, model_name, t2 - t1, t3 - t2)
        return_trained_models[model_name] = model

    return (return_trained_models, return_vectorizer)

def create_models():
    models = {}
    models['LinearSVC'] = LinearSVC()
    models['LogisticRegression'] = LogisticRegression()
    models['RandomForestClassifier'] = RandomForestClassifier()
    models['DecisionTreeClassifier'] = DecisionTreeClassifier()
    models['MultinomialNB'] = MultinomialNB()
    return models

X_input, y_input = shuffle(df['data'], df['language'], random_state=7)

# test_size=0.7 holds out 70% of the samples for testing
X_train_raw, X_test_raw, y_train, y_test = train_test_split(X_input, y_input, test_size=0.7)

models = create_models()
trained_models, fitted_vectorizer = test_models(X_train_raw, y_train, X_test_raw, y_test, models)
```
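Steps 4 and 5 can be pictured with nothing but the standard library. Note that test_size=0.7 in the call above holds out 70% of the rows for testing and trains on the remaining 30%. The sketch below imitates that idea. The helper shuffle_and_split is hypothetical, and it will not reproduce the exact rows that scikit-learn selects.

```python
import random

def shuffle_and_split(rows, test_size, seed):
    """Shuffle a copy of rows, then split it into (train, test) parts."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = round(len(shuffled) * test_size)
    return shuffled[n_test:], shuffled[:n_test]

rows = list(range(10))  # stand-in for (text, label) pairs
train, test = shuffle_and_split(rows, test_size=0.7, seed=7)
print(len(train), len(test))  # 3 7
```

Every row ends up in exactly one of the two parts, so nothing the model is scored on was seen during training.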

The results look like this:

```
Model Name: LinearSVC
Train time:  0.99
Predict time:  0.0
Model Accuracy: 0.9262

             precision    recall  f1-score   support

          0     1.0000    1.0000    1.0000         6
          1     1.0000    1.0000    1.0000         2
          2     1.0000    1.0000    1.0000         1
          3     0.8968    1.0000    0.9456       339
          4     0.9695    0.8527    0.9074       224
          5     0.9032    1.0000    0.9492        28
          6     0.7000    1.0000    0.8235         7
          7     0.9032    0.7568    0.8235        74
          8     0.7778    0.5833    0.6667        36
          9     0.9613    0.9255    0.9430       161
         10     1.0000    0.5000    0.6667         6
         11     1.0000    1.0000    1.0000        14
         12     1.0000    1.0000    1.0000         5
         13     1.0000    1.0000    1.0000         2
         14     1.0000    0.4545    0.6250        11
         15     1.0000    1.0000    1.0000         6
         16     1.0000    0.4000    0.5714         5
         17     0.9589    0.9589    0.9589        73
         18     1.0000    1.0000    1.0000         8
         19     0.7600    0.9268    0.8352        41
         20     0.1818    1.0000    0.3077         2
         21     1.0000    1.0000    1.0000       137
         22     1.0000    0.8750    0.9333        24
         23     1.0000    1.0000    1.0000         7
         24     1.0000    1.0000    1.0000        25
         25     0.9571    0.9571    0.9571        70
         26     0.9211    0.9722    0.9459       108

avg / total     0.9339    0.9262    0.9255      1422

=========================================================================
Model Name: DecisionTreeClassifier
Train time:  0.13
Predict time:  0.0
Model Accuracy: 0.9388

             precision    recall  f1-score   support

          0     1.0000    1.0000    1.0000         6
          1     1.0000    1.0000    1.0000         2
          2     1.0000    1.0000    1.0000         1
          3     0.9123    0.9204    0.9163       339
          4     0.8408    0.9196    0.8785       224
          5     1.0000    0.8929    0.9434        28
          6     1.0000    1.0000    1.0000         7
          7     1.0000    0.9595    0.9793        74
          8     0.9091    0.8333    0.8696        36
          9     0.9817    1.0000    0.9908       161
         10     1.0000    0.5000    0.6667         6
         11     1.0000    1.0000    1.0000        14
         12     1.0000    1.0000    1.0000         5
         13     1.0000    1.0000    1.0000         2
         14     1.0000    0.4545    0.6250        11
         15     1.0000    0.5000    0.6667         6
         16     1.0000    0.4000    0.5714         5
         17     1.0000    1.0000    1.0000        73
         18     1.0000    1.0000    1.0000         8
         19     0.9268    0.9268    0.9268        41
         20     1.0000    1.0000    1.0000         2
         21     1.0000    1.0000    1.0000       137
         22     1.0000    0.7500    0.8571        24
         23     1.0000    1.0000    1.0000         7
         24     0.6786    0.7600    0.7170        25
         25     1.0000    1.0000    1.0000        70
         26     1.0000    1.0000    1.0000       108

avg / total     0.9419    0.9388    0.9376      1422

=========================================================================
Model Name: LogisticRegression
Train time:  0.71
Predict time:  0.01
Model Accuracy: 0.9304

             precision    recall  f1-score   support

          0     1.0000    1.0000    1.0000         6
          1     1.0000    1.0000    1.0000         2
          2     1.0000    1.0000    1.0000         1
          3     0.9040    1.0000    0.9496       339
          4     0.9569    0.8929    0.9238       224
          5     0.9032    1.0000    0.9492        28
          6     0.7000    1.0000    0.8235         7
          7     0.8929    0.6757    0.7692        74
          8     0.8750    0.5833    0.7000        36
          9     0.9281    0.9627    0.9451       161
         10     1.0000    0.5000    0.6667         6
         11     1.0000    1.0000    1.0000        14
         12     1.0000    1.0000    1.0000         5
         13     1.0000    1.0000    1.0000         2
         14     1.0000    0.4545    0.6250        11
         15     1.0000    1.0000    1.0000         6
         16     1.0000    0.4000    0.5714         5
         17     0.9589    0.9589    0.9589        73
         18     1.0000    1.0000    1.0000         8
         19     0.7600    0.9268    0.8352        41
         20     1.0000    1.0000    1.0000         2
         21     1.0000    0.9781    0.9889       137
         22     1.0000    0.8750    0.9333        24
         23     1.0000    1.0000    1.0000         7
         24     1.0000    1.0000    1.0000        25
         25     0.9571    0.9571    0.9571        70
         26     0.9211    0.9722    0.9459       108

avg / total     0.9329    0.9304    0.9272      1422

=========================================================================
Model Name: RandomForestClassifier
Train time:  0.04
Predict time:  0.01
Model Accuracy: 0.9374

             precision    recall  f1-score   support

          0     1.0000    1.0000    1.0000         6
          1     1.0000    1.0000    1.0000         2
          2     1.0000    1.0000    1.0000         1
          3     0.8760    1.0000    0.9339       339
          4     0.9452    0.9241    0.9345       224
          5     0.9032    1.0000    0.9492        28
          6     0.7000    1.0000    0.8235         7
          7     1.0000    0.8378    0.9118        74
          8     1.0000    0.5278    0.6909        36
          9     0.9527    1.0000    0.9758       161
         10     1.0000    0.1667    0.2857         6
         11     1.0000    1.0000    1.0000        14
         12     1.0000    1.0000    1.0000         5
         13     1.0000    1.0000    1.0000         2
         14     1.0000    0.4545    0.6250        11
         15     1.0000    0.5000    0.6667         6
         16     1.0000    0.4000    0.5714         5
         17     1.0000    1.0000    1.0000        73
         18     1.0000    0.6250    0.7692         8
         19     0.9268    0.9268    0.9268        41
         20     0.0000    0.0000    0.0000         2
         21     1.0000    1.0000    1.0000       137
         22     1.0000    1.0000    1.0000        24
         23     1.0000    0.5714    0.7273         7
         24     1.0000    1.0000    1.0000        25
         25     1.0000    0.9571    0.9781        70
         26     0.8889    0.8889    0.8889       108

avg / total     0.9411    0.9374    0.9324      1422

=========================================================================
Model Name: MultinomialNB
Train time:  0.01
Predict time:  0.0
Model Accuracy: 0.8776

             precision    recall  f1-score   support

          0     1.0000    1.0000    1.0000         6
          1     0.0000    0.0000    0.0000         2
          2     0.0000    0.0000    0.0000         1
          3     0.8380    0.9764    0.9019       339
          4     1.0000    0.8750    0.9333       224
          5     1.0000    1.0000    1.0000        28
          6     1.0000    1.0000    1.0000         7
          7     0.6628    0.7703    0.7125        74
          8     1.0000    0.5833    0.7368        36
          9     0.8952    0.6894    0.7789       161
         10     1.0000    0.3333    0.5000         6
         11     1.0000    1.0000    1.0000        14
         12     1.0000    1.0000    1.0000         5
         13     0.0000    0.0000    0.0000         2
         14     1.0000    0.7273    0.8421        11
         15     1.0000    1.0000    1.0000         6
         16     1.0000    0.4000    0.5714         5
         17     1.0000    0.9178    0.9571        73
         18     0.8000    1.0000    0.8889         8
         19     0.4607    1.0000    0.6308        41
         20     0.0000    0.0000    0.0000         2
         21     1.0000    1.0000    1.0000       137
         22     1.0000    1.0000    1.0000        24
         23     1.0000    1.0000    1.0000         7
         24     0.8462    0.8800    0.8627        25
         25     0.8642    1.0000    0.9272        70
         26     0.9630    0.7222    0.8254       108

avg / total     0.8982    0.8776    0.8770      1422

=========================================================================
```
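When reading these reports, precision is the share of predictions for a class that were correct, recall is the share of that class's true samples that were found, and the F1 score is the harmonic mean of the two. The small worked check below recomputes two F1 values from the LinearSVC report using the precision and recall printed there. The helper f1_from is defined here for illustration only and is not a scikit-learn function.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Class 3 in the LinearSVC report: precision 0.8968, recall 1.0000.
print(round(f1_from(0.8968, 1.0), 4))  # 0.9456

# Class 20 in the same report: precision 0.1818, recall 1.0000.
print(round(f1_from(0.1818, 1.0), 4))  # 0.3077
```

Class 20 has only 2 test samples, so a handful of wrong predictions is enough to move its scores a long way. Rows with small support are best read with that in mind.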
