Tools series: TensorFlow Decision Forests (5): Using text and neural network features

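Contents

    • Setup
    • Using raw text as features
    • Using a pretrained text embedding
    • Training a decision tree and a neural network together
      • Building the models
      • Training and evaluating the models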

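Welcome to the intermediate tutorial for TensorFlow Decision Forests (TF-DF).
In this article, you will learn about some of the more advanced capabilities of TF-DF, including how to handle natural language features.

This article assumes you are familiar with the concepts introduced in the earlier decision forests article, in particular the installation of TF-DF.

In this article, you will:

  1. Train a Random Forest that natively consumes text features as categorical sets.

  2. Train a Random Forest that consumes text features through a TensorFlow Hub module. In this setting (transfer learning), the module has already been pretrained on a large text corpus.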

  3. Train a Random Forest and a neural network together. The Random Forest will consume the output of the neural network as its input.

Setup

# Install the TensorFlow Decision Forests library
!pip install tensorflow_decision_forests

Wurlitzer is needed to display the detailed training logs in Colabs (when using verbose=2 in the model constructor).

# Install wurlitzer, which makes C-level output visible in Jupyter notebooks
!pip install wurlitzer

Import the necessary libraries.

# Import the required libraries
import tensorflow_decision_forests as tfdf  # Decision forests

import os  # Operating system utilities
import numpy as np  # Numerical computing
import pandas as pd  # Data handling
import tensorflow as tf  # Deep learning
import math  # Math utilities

The hidden code cell below limits the output height in Colab.

#@title
# Import the required modules
from IPython.core.magic import register_line_magic
from IPython.display import Javascript
from IPython.display import display as ipy_display

# Define a line magic that sets the maximum height of a cell's output
@register_line_magic
def set_cell_height(size):
  # Run JavaScript that caps the height of the output frame
  ipy_display(
      Javascript("google.colab.output.setIframeHeight(0, true, {maxHeight: " +
                 str(size) + "})"))

Using raw text as features

TF-DF can consume categorical-set features natively. Categorical sets represent text features as bags of words (or n-grams).

For example: "The little blue dog" → {"the", "little", "blue", "dog"}
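To make the bag-of-words idea concrete, here is a minimal pure-Python sketch. The helper name to_categorical_set is hypothetical, and lowercasing plus whitespace splitting is an assumption chosen to reproduce the example above; it is not how TF-DF builds categorical sets internally.

```python
def to_categorical_set(sentence):
    """Return the set of lowercased, whitespace-separated tokens."""
    return set(sentence.lower().split())


tokens = to_categorical_set("The little blue dog")
print(sorted(tokens))

# Word order and repetitions are lost: only the presence of a word is kept.
repeated = to_categorical_set("the dog chased the other dog")
print(sorted(repeated))
```

Because the representation is a set, two sentences containing the same words in a different order map to the same feature value.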

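In this example, you will train a Random Forest on the Stanford Sentiment Treebank (SST) dataset. The objective of this dataset is to classify sentences as carrying a positive or negative sentiment. You will use the binary classification version of the dataset curated in TensorFlow Datasets.

Note: Training on categorical-set features can be expensive. In this article, you will train a small Random Forest with 30 trees.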

# Install the TensorFlow Datasets package
!pip install tensorflow-datasets -U --quiet
# Import the tensorflow_datasets library
import tensorflow_datasets as tfds

# Load the dataset
all_ds = tfds.load("glue/sst2")

# Display the first 3 examples of the test split
for example in all_ds["test"].take(3):
  # Print the name and value of each attribute of the example
  print({attr_name: attr_tensor.numpy() for attr_name, attr_tensor in example.items()})
{'idx': 163, 'label': -1, 'sentence': b'not even the hanson brothers can save it'}
{'idx': 131, 'label': -1, 'sentence': b'strong setup and ambitious goals fade as the film descends into unsophisticated scare tactics and b-film thuggery .'}
{'idx': 1579, 'label': -1, 'sentence': b'too timid to bring a sense of closure to an ugly chapter of the twentieth century .'}

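The dataset is modified as follows:

  1. The raw labels are integers in {-1, 1}, but the learning algorithm expects non-negative integer labels, e.g. {0, 1}. Therefore, the labels are transformed as follows: new_labels = (original_labels + 1) // 2 (integer division, as in the code).
  2. A batch size of 100 is applied to make reading the dataset more efficient.
  3. The sentence attribute needs to be tokenized, i.e. "hello world" -> ["hello", "world"].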
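The three modifications can be sketched in plain Python to check the arithmetic. The helper names (remap_label, tokenize, batches) are hypothetical; the batch size of 100 and the 67349 training examples are taken from the code and the training log of this article.

```python
def remap_label(label):
    """Map a label in {-1, 1} to {0, 1} with integer division."""
    return (label + 1) // 2


def tokenize(sentence):
    """Split a sentence on whitespace."""
    return sentence.split()


def batches(items, batch_size):
    """Group items into consecutive batches; the last one may be smaller."""
    return [items[i:i + batch_size] for i in range(0, len(items), batch_size)]


remapped = [remap_label(label) for label in (-1, 1)]
words = tokenize("hello world")
num_batches = len(batches(list(range(67349)), 100))
print(remapped, words, num_batches)
```

With 67349 examples and a batch size of 100, the sketch yields 674 batches, which agrees with the "Number of batches" line in the training log.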

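Note: This example does not use the test split of the dataset because it has no labels. If the test split had labels, you could concatenate the validation fold into the train fold (e.g. all_ds["train"].concatenate(all_ds["validation"])).

Details: Some decision forest learning algorithms do not need a validation dataset (e.g. Random Forests), while others do (e.g. Gradient Boosted Trees in some cases). Since each learning algorithm in TF-DF can use validation data differently, TF-DF handles the train/validation split internally. As a result, when you have both a training and a validation set, they can always be concatenated and given to the learning algorithm as a single input.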

# Define prepare_dataset, which processes a batch of examples
# The example argument is the input batch
def prepare_dataset(example):
  # Map the labels with (label + 1) // 2
  label = (example["label"] + 1) // 2
  # Split each sentence on whitespace and return the tokens under the "sentence" key
  return {"sentence" : tf.strings.split(example["sentence"])}, label

# Batch the "train" split (100 examples per batch) and apply prepare_dataset
train_ds = all_ds["train"].batch(100).map(prepare_dataset)
# Batch the "validation" split (100 examples per batch) and apply prepare_dataset
test_ds = all_ds["validation"].batch(100).map(prepare_dataset)

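Finally, train and evaluate the model as usual. TF-DF automatically detects multi-valued categorical features as categorical sets.

# Limit the cell output height to 300
%set_cell_height 300

# Use a Random Forest with 30 trees; verbose=2 prints detailed training logs
model_1 = tfdf.keras.RandomForestModel(num_trees=30, verbose=2)
# Train the model on train_ds
model_1.fit(x=train_ds)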
Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmpp9alip3z as temporary training directory
Reading training dataset...
Training tensor examples:
Features: {'sentence': tf.RaggedTensor(values=Tensor("data:0", shape=(None,), dtype=string), row_splits=Tensor("data_1:0", shape=(None,), dtype=int64))}
Label: Tensor("data_2:0", shape=(None,), dtype=int64)
Weights: None
Normalized tensor features:
 {'sentence': SemanticTensor(semantic=<Semantic.CATEGORICAL_SET: 4>, tensor=tf.RaggedTensor(values=Tensor("data:0", shape=(None,), dtype=string), row_splits=Tensor("data_1:0", shape=(None,), dtype=int64)))}
Training dataset read in 0:00:04.588912. Found 67349 examples.
Training model...
Standard output detected as not visible to the user e.g. running in a notebook. Creating a training log redirection. If training gets stuck, try calling tfdf.keras.set_training_logs_redirection(False).
[INFO 23-11-20 12:31:32.7845 UTC kernel.cc:771] Start Yggdrasil model training
[INFO 23-11-20 12:31:32.7845 UTC kernel.cc:772] Collect training examples
[INFO 23-11-20 12:31:32.7846 UTC kernel.cc:785] Dataspec guide:
column_guides {
  column_name_pattern: "^__LABEL$"
  type: CATEGORICAL
  categorial {
    min_vocab_frequency: 0
    max_vocab_count: -1
  }
}
default_column_guide {
  categorial {
    max_vocab_count: 2000
  }
  discretized_numerical {
    maximum_num_bins: 255
  }
}
ignore_columns_without_guides: false
detect_numerical_as_discretized_numerical: false

[INFO 23-11-20 12:31:32.7849 UTC kernel.cc:391] Number of batches: 674
[INFO 23-11-20 12:31:32.7849 UTC kernel.cc:392] Number of examples: 67349
[INFO 23-11-20 12:31:32.8290 UTC data_spec_inference.cc:305] 12816 item(s) have been pruned (i.e. they are considered out of dictionary) for the column sentence (2000 item(s) left) because min_value_count=5 and max_number_of_unique_values=2000
[INFO 23-11-20 12:31:32.8820 UTC kernel.cc:792] Training dataset:
Number of records: 67349
Number of columns: 2

Number of columns by type:
    CATEGORICAL_SET: 1 (50%)
    CATEGORICAL: 1 (50%)

Columns:

CATEGORICAL_SET: 1 (50%)
    1: "sentence" CATEGORICAL_SET has-dict vocab-size:2001 num-oods:10187 (15.1257%) most-frequent:"the" 27205 (40.3941%)

CATEGORICAL: 1 (50%)
    0: "__LABEL" CATEGORICAL integerized vocab-size:3 no-ood-item

Terminology:
    nas: Number of non-available (i.e. missing) values.
    ood: Out of dictionary.
    manually-defined: Attribute whose type is manually defined by the user, i.e., the type was not automatically inferred.
    tokenized: The attribute value is obtained through tokenization.
    has-dict: The attribute is attached to a string dictionary e.g. a categorical attribute stored as a string.
    vocab-size: Number of unique values.

[INFO 23-11-20 12:31:32.8821 UTC kernel.cc:808] Configure learner
[INFO 23-11-20 12:31:32.8823 UTC kernel.cc:822] Training config:
learner: "RANDOM_FOREST"
features: "^sentence$"
label: "^__LABEL$"
task: CLASSIFICATION
random_seed: 123456
metadata {
  framework: "TF Keras"
}
pure_serving_model: false
[yggdrasil_decision_forests.model.random_forest.proto.random_forest_config] {
  num_trees: 30
  decision_tree {
    max_depth: 16
    min_examples: 5
    in_split_min_examples_check: true
    keep_non_leaf_label_distribution: true
    num_candidate_attributes: 0
    missing_value_policy: GLOBAL_IMPUTATION
    allow_na_conditions: false
    categorical_set_greedy_forward {
      sampling: 0.1
      max_num_items: -1
      min_item_frequency: 1
    }
    growing_strategy_local {
    }
    categorical {
      cart {
      }
    }
    axis_aligned_split {
    }
    internal {
      sorting_strategy: PRESORTED
    }
    uplift {
      min_examples_in_treatment: 5
      split_score: KULLBACK_LEIBLER
    }
  }
  winner_take_all_inference: true
  compute_oob_performances: true
  compute_oob_variable_importances: false
  num_oob_variable_importances_permutations: 1
  bootstrap_training_dataset: true
  bootstrap_size_ratio: 1
  adapt_bootstrap_size_ratio_for_maximum_training_duration: false
  sampling_with_replacement: true
}

[INFO 23-11-20 12:31:32.8826 UTC kernel.cc:825] Deployment config:
cache_path: "/tmpfs/tmp/tmpp9alip3z/working_cache"
num_threads: 32
try_resume_training: true

[INFO 23-11-20 12:31:32.8828 UTC kernel.cc:887] Train model
[INFO 23-11-20 12:31:32.8836 UTC random_forest.cc:416] Training random forest on 67349 example(s) and 1 feature(s).
[INFO 23-11-20 12:32:02.2437 UTC random_forest.cc:802] Training of tree  1/30 (tree index:13) done accuracy:0.738731 logloss:9.4171
[INFO 23-11-20 12:32:12.3428 UTC random_forest.cc:802] Training of tree  3/30 (tree index:27) done accuracy:0.754745 logloss:6.47525
[INFO 23-11-20 12:32:17.6546 UTC random_forest.cc:802] Training of tree  13/30 (tree index:20) done accuracy:0.801813 logloss:2.334
[INFO 23-11-20 12:32:18.5584 UTC random_forest.cc:802] Training of tree  23/30 (tree index:15) done accuracy:0.81742 logloss:0.942096
[INFO 23-11-20 12:32:21.9457 UTC random_forest.cc:802] Training of tree  30/30 (tree index:21) done accuracy:0.821274 logloss:0.854486
[INFO 23-11-20 12:32:21.9462 UTC random_forest.cc:882] Final OOB metrics: accuracy:0.821274 logloss:0.854486
[INFO 23-11-20 12:32:21.9558 UTC kernel.cc:919] Export model in log directory: /tmpfs/tmp/tmpp9alip3z with prefix d2f2a624a65443d5
[INFO 23-11-20 12:32:21.9870 UTC kernel.cc:937] Save model in resources
[INFO 23-11-20 12:32:21.9901 UTC abstract_model.cc:881] Model self evaluation:
Number of predictions (without weights): 67349
Number of predictions (with weights): 67349
Task: CLASSIFICATION
Label: __LABEL

Accuracy: 0.821274  CI95[W][0.818828 0.8237]
LogLoss: : 0.854486
ErrorRate: : 0.178726

Default Accuracy: : 0.557826
Default LogLoss: : 0.686445
Default ErrorRate: : 0.442174

Confusion Table:
truth\prediction
       1      2
1  19593  10187
2   1850  35719
Total: 67349

[INFO 23-11-20 12:32:22.0155 UTC kernel.cc:1233] Loading model from path /tmpfs/tmp/tmpp9alip3z/model/ with prefix d2f2a624a65443d5
[INFO 23-11-20 12:32:22.3248 UTC decision_forest.cc:660] Model loaded with 30 root(s), 43180 node(s), and 1 input feature(s).
[INFO 23-11-20 12:32:22.3249 UTC abstract_model.cc:1344] Engine "RandomForestGeneric" built
[INFO 23-11-20 12:32:22.3249 UTC kernel.cc:1061] Use fast generic engine
Model trained in 0:00:49.561739
Compiling model...
Model compiled.
<keras.src.callbacks.History at 0x7fd79650ec70>

In the previous logs, note that sentence is a CATEGORICAL_SET feature.
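The same log also reports a self-evaluation with a confusion table, and its accuracy figures can be cross-checked by hand. The sketch below copies the four counts from the log; it assumes that "Default Accuracy" is the accuracy obtained by always predicting the most frequent true class, an interpretation the numbers agree with.

```python
# Rows are the true classes, columns the predicted classes (copied from the log).
confusion = [
    [19593, 10187],
    [1850, 35719],
]

total = sum(sum(row) for row in confusion)
# Correct predictions sit on the diagonal.
correct = sum(confusion[i][i] for i in range(len(confusion)))
accuracy = correct / total

# Assumed definition: always predict the most frequent true class.
default_accuracy = max(sum(row) for row in confusion) / total

print(f"total={total} accuracy={accuracy:.6f} default={default_accuracy:.6f}")
```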

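The model is evaluated as usual:

# Compile the model with accuracy as the evaluation metric
model_1.compile(metrics=["accuracy"])
# Evaluate on the test dataset; returns the loss and the accuracy
evaluation = model_1.evaluate(test_ds)

# Print the binary cross-entropy loss
print(f"BinaryCrossentropyloss: {evaluation[0]}")
# Print the accuracy
print(f"Accuracy: {evaluation[1]}")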
1/9 [==>...........................] - ETA: 4s - loss: 0.0000e+00 - accuracy: 0.8100
9/9 [==============================] - 1s 5ms/step - loss: 0.0000e+00 - accuracy: 0.7638
BinaryCrossentropyloss: 0.0
Accuracy: 0.7637614607810974

The training logs can be plotted as follows:

# Import matplotlib.pyplot for plotting
import matplotlib.pyplot as plt

# Get the training logs of the model
logs = model_1.make_inspector().training_logs()
# Plot the out-of-bag accuracy against the number of trees
plt.plot([log.num_trees for log in logs], [log.evaluation.accuracy for log in logs])
plt.xlabel("Number of trees")
plt.ylabel("Out-of-bag accuracy")
# End the cell with a statement so the notebook does not echo the last value
pass

More trees would probably help (I am sure of it, because I tried :p).

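Using a pretrained text embedding

The previous example trained a Random Forest on raw text features. This example uses a pretrained TF-Hub embedding to convert the text features into a dense embedding, and then trains a Random Forest on top of it. In this setting, the Random Forest only "sees" the numerical output of the embedding (i.e. it does not see the raw text).

In this experiment, you will use the Universal-Sentence-Encoder. Different pretrained embeddings may suit different types of text (e.g. different languages, different tasks), and also other types of structured features (e.g. images).

Note: This embedding module is large (1GB), so the final model will run more slowly than classical decision tree inference.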

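The embedding module can be applied in one of two places:

  1. During the dataset preparation.
  2. In the pre-processing stage of the model.

The second option is generally preferable: packaging the embedding inside the model makes the model easier to use (and harder to misuse).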
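The design argument for the second option can be illustrated without TensorFlow. In this sketch, toy_embed and toy_classifier are hypothetical stand-ins for a text embedding and a trained model, and ModelWithPreprocessing is a hypothetical wrapper; it only shows the idea of bundling the pre-processing with the model, not how TF-DF implements its preprocessing argument.

```python
def toy_embed(sentence):
    """Hypothetical embedding: [character count, word count]."""
    return [float(len(sentence)), float(len(sentence.split()))]


def toy_classifier(features):
    """Hypothetical model that only understands numeric features."""
    return 1 if features[1] >= 3 else 0


class ModelWithPreprocessing:
    """Bundle a pre-processing step with a model so callers pass raw inputs."""

    def __init__(self, preprocessing, model):
        self._preprocessing = preprocessing
        self._model = model

    def predict(self, raw_input):
        # The caller cannot forget, or mismatch, the pre-processing step.
        return self._model(self._preprocessing(raw_input))


packaged = ModelWithPreprocessing(toy_embed, toy_classifier)
prediction = packaged.predict("a surprisingly good film")
print(prediction)
```

With the first option, every caller would have to apply toy_embed before calling toy_classifier, and passing raw text by mistake would fail or silently misbehave.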

First, install TF-Hub:

# Install the latest version of the tensorflow-hub library
!pip install --upgrade tensorflow-hub

Unlike before, you do not need to tokenize the text.

# Define prepare_dataset, which processes a batch of examples
def prepare_dataset(example):
  # Map the labels with (label + 1) // 2
  label = (example["label"] + 1) // 2
  # Keep the raw sentence (no tokenization) under the "sentence" key
  return {"sentence" : example["sentence"]}, label

# Batch the "train" split (100 examples per batch) and apply prepare_dataset
train_ds = all_ds["train"].batch(100).map(prepare_dataset)
# Batch the "validation" split (100 examples per batch) and apply prepare_dataset
test_ds = all_ds["validation"].batch(100).map(prepare_dataset)

%set_cell_height 300

# Import the tensorflow_hub module
import tensorflow_hub as hub
# Use the Universal Sentence Encoder, version 4
hub_url = "https://tfhub.dev/google/universal-sentence-encoder/4"
# Wrap the module as a Keras layer
embedding = hub.KerasLayer(hub_url)

# Input layer taking the sentence as a string
sentence = tf.keras.layers.Input(shape=(), name="sentence", dtype=tf.string)
# Convert the sentence into an embedding vector
embedded_sentence = embedding(sentence)

# The raw input is the sentence; the processed input is its embedding
raw_inputs = {"sentence": sentence}
processed_inputs = {"embedded_sentence": embedded_sentence}
# Pre-processing model that maps the raw inputs to the processed inputs
preprocessor = tf.keras.Model(inputs=raw_inputs, outputs=processed_inputs)

# Random Forest with 100 trees that applies the pre-processing model to its inputs
model_2 = tfdf.keras.RandomForestModel(
    preprocessing=preprocessor,
    num_trees=100)

# Train the model on the training data
model_2.fit(x=train_ds)
Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmp2l8qenh8 as temporary training directory
Reading training dataset...
Training dataset read in 0:00:22.682140. Found 67349 examples.
Training model...
[INFO 23-11-20 12:33:16.6995 UTC kernel.cc:1233] Loading model from path /tmpfs/tmp/tmp2l8qenh8/model/ with prefix a883bbf674954d64
Model trained in 0:00:14.090027
Compiling model...
[INFO 23-11-20 12:33:18.4993 UTC decision_forest.cc:660] Model loaded with 100 root(s), 563552 node(s), and 512 input feature(s).
[INFO 23-11-20 12:33:18.4994 UTC abstract_model.cc:1344] Engine "RandomForestOptPred" built
[INFO 23-11-20 12:33:18.4996 UTC kernel.cc:1061] Use fast generic engine
Model compiled.
<keras.src.callbacks.History at 0x7fd690629e50>

# Compile the model
model_2.compile(metrics=["accuracy"])
# Evaluate the model
evaluation = model_2.evaluate(test_ds)

# Print the binary cross-entropy loss
print(f"BinaryCrossentropyloss: {evaluation[0]}")
# Print the accuracy
print(f"Accuracy: {evaluation[1]}")
1/9 [==>...........................] - ETA: 13s - loss: 0.0000e+00 - accuracy: 0.7800
4/9 [============>.................] - ETA: 0s - loss: 0.0000e+00 - accuracy: 0.8075 
7/9 [======================>.......] - ETA: 0s - loss: 0.0000e+00 - accuracy: 0.7886
9/9 [==============================] - 2s 18ms/step - loss: 0.0000e+00 - accuracy: 0.7798
BinaryCrossentropyloss: 0.0
Accuracy: 0.7798165082931519

Note that categorical sets and dense embeddings represent text differently, so it may be useful to use both strategies jointly.

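Training a decision tree and a neural network together

The previous example used a pretrained neural network (NN) to process the text features before passing them to the Random Forest. This example trains both the neural network and the Random Forest from scratch.

TF-DF's decision forests do not back-propagate gradients (although this is the subject of ongoing research). Therefore, the training happens in two stages:

  1. Train the neural network as a standard classification task:
example → [Normalize] → [Neural network*] → [Classification head] → prediction
*: Training.
  2. Replace the head of the neural network (the last layer and the softmax) with a Random Forest. Train the Random Forest as usual:
example → [Normalize] → [Neural network] → [Random Forest*] → prediction
*: Training.

Preparing the dataset

This example uses the Palmer's Penguins dataset (https://allisonhorst.github.io/palmerpenguins/articles/intro.html). See the beginner colab (beginner_colab.ipynb) for details.

First, download the raw data:

# Download penguins.csv and save it to /tmp/penguins.csv
!wget -q https://storage.googleapis.com/download.tensorflow.org/data/palmer_penguins/penguins.csv -O /tmp/penguins.csv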

Load the dataset into a Pandas Dataframe.

# Read the CSV file into dataset_df
dataset_df = pd.read_csv("/tmp/penguins.csv")

# Display the first 3 examples
dataset_df.head(3)

   species     island  bill_length_mm  bill_depth_mm  flipper_length_mm  body_mass_g     sex  year
0   Adelie  Torgersen            39.1           18.7              181.0       3750.0    male  2007
1   Adelie  Torgersen            39.5           17.4              186.0       3800.0  female  2007
2   Adelie  Torgersen            40.3           18.0              195.0       3250.0  female  2007

Prepare the dataset for training.

# Use "species" as the label
label = "species"

# Replace numerical NaN (the representation of missing values in a Pandas Dataframe) with 0.
# ...Neural networks do not handle numerical NaN well.
for col in dataset_df.columns:
  # Only numerical columns, i.e. columns whose dtype is neither str nor object
  if dataset_df[col].dtype not in [str, object]:
    dataset_df[col] = dataset_df[col].fillna(0)

# Split the dataset into a training and a testing dataset
def split_dataset(dataset, test_ratio=0.30):
  """Splits a pandas dataframe in two."""
  # Draw one uniform random number per row; rows below test_ratio go to the test set
  test_indices = np.random.rand(len(dataset)) < test_ratio
  return dataset[~test_indices], dataset[test_indices]

train_ds_pd, test_ds_pd = split_dataset(dataset_df)
print("{} examples in training, {} examples for testing.".format(
    len(train_ds_pd), len(test_ds_pd)))

# Convert the dataframes into TensorFlow datasets
train_ds = tfdf.keras.pd_dataframe_to_tf_dataset(train_ds_pd, label=label)
test_ds = tfdf.keras.pd_dataframe_to_tf_dataset(test_ds_pd, label=label)

248 examples in training, 96 examples for testing.
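Because the split draws one random number per row, the test set is only approximately 30% of the data (here 96 of 344 rows, about 28%) and changes from run to run. As a minimal sketch of the same masking idea with a fixed seed, using only the standard library: the function split_rows and its seed argument are hypothetical and not part of the tutorial code.

```python
import random


def split_rows(rows, test_ratio=0.30, seed=None):
    """Send each row to the test set with probability test_ratio."""
    rng = random.Random(seed)  # A fixed seed makes the split reproducible
    train, test = [], []
    for row in rows:
        if rng.random() < test_ratio:
            test.append(row)
        else:
            train.append(row)
    return train, test


rows = list(range(344))  # 344 = 248 + 96 rows, as in the output above
train_rows, test_rows = split_rows(rows, seed=1234)
print(len(train_rows), len(test_rows))
```

Every row lands in exactly one of the two sets, and calling split_rows again with the same seed returns the same split.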

Building the models

Next, create the neural network model using Keras' functional style.

To keep the example simple, this model only uses two inputs.

# Create the two input layers
input_1 = tf.keras.Input(shape=(1,), name="bill_length_mm", dtype="float")  # Bill length of the penguin, as a float
input_2 = tf.keras.Input(shape=(1,), name="island", dtype="string")  # Island where the penguin lives, as a string

# Group the two input layers in a list
nn_raw_inputs = [input_1, input_2]

Use preprocessing layers to convert the raw inputs into inputs appropriate for the neural network.

# Aliases for the preprocessing layers
Normalization = tf.keras.layers.Normalization
CategoryEncoding = tf.keras.layers.CategoryEncoding
StringLookup = tf.keras.layers.StringLookup

# Take the values of the "bill_length_mm" column as a 2-D array
values = train_ds_pd["bill_length_mm"].values[:, tf.newaxis]
# Create the Normalization layer
input_1_normalizer = Normalization()
# Adapt it to the data, i.e. compute the mean and variance
input_1_normalizer.adapt(values)

# Take the values of the "island" column
values = train_ds_pd["island"].values
# Create the StringLookup layer, which maps strings to integer indices
input_2_indexer = StringLookup(max_tokens=32)
# Adapt it to the data, i.e. build the vocabulary
input_2_indexer.adapt(values)

# Create the CategoryEncoding layer, which maps integer indices to a binary (multi-hot) encoding
input_2_onehot = CategoryEncoding(output_mode="binary", max_tokens=32)

# Normalize the numerical input
normalized_input_1 = input_1_normalizer(input_1)
# Map the string input to an integer index, then to a binary encoding
normalized_input_2 = input_2_onehot(input_2_indexer(input_2))

# Group the processed inputs
nn_processed_inputs = [normalized_input_1, normalized_input_2]

WARNING:tensorflow:max_tokens is deprecated, please use num_tokens instead.
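What the three preprocessing layers compute can be sketched in plain Python. The helpers fit_normalizer, fit_string_lookup and multi_hot are hypothetical; the sketch assumes standardization with the population variance and an index 0 reserved for out-of-vocabulary strings, and the island names other than Torgersen are illustrative values.

```python
import math
from collections import Counter


def fit_normalizer(values):
    """Learn mean and standard deviation, return a standardizing function."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    std = max(math.sqrt(variance), 1e-7)  # Guard against a constant column
    return lambda v: (v - mean) / std


def fit_string_lookup(values):
    """Index strings by decreasing frequency; 0 is kept for unknown strings."""
    vocabulary = [token for token, _ in Counter(values).most_common()]
    index = {token: i + 1 for i, token in enumerate(vocabulary)}
    return lambda token: index.get(token, 0)


def multi_hot(index, num_tokens):
    """Encode one integer index as a vector of zeros with a single one."""
    return [1.0 if i == index else 0.0 for i in range(num_tokens)]


bill_lengths = [39.1, 39.5, 40.3]
normalize = fit_normalizer(bill_lengths)
lookup = fit_string_lookup(["Biscoe", "Dream", "Biscoe", "Torgersen"])
encoded = multi_hot(lookup("Biscoe"), num_tokens=4)
print([round(normalize(v), 3) for v in bill_lengths], encoded)
```

The statistics and the vocabulary are learned from the training data only, which mirrors calling adapt on train_ds_pd in the code above.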

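Build the body of the neural network:

# Concatenate the processed inputs
y = tf.keras.layers.Concatenate()(nn_processed_inputs)
# Dense layer with 16 units and a relu6 activation
y = tf.keras.layers.Dense(16, activation=tf.nn.relu6)(y)
# Dense layer with 8 units and a relu activation, named "last"
last_layer = tf.keras.layers.Dense(8, activation=tf.nn.relu, name="last")(y)

# Classification head with 3 outputs, one logit per class.
# Note: the head is applied to y, not to last_layer, so the "last" layer is not
# part of nn_model (it is absent from the model summary) and is not updated
# when nn_model is trained. Applying the head to last_layer instead would
# train it together with the rest of the network.
classification_output = tf.keras.layers.Dense(3)(y)

# Model mapping nn_raw_inputs to classification_output
nn_model = tf.keras.models.Model(nn_raw_inputs, classification_output)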

This nn_model directly produces classification logits.

Next, create a decision forest model. It will operate on the high-level features that the neural network extracts just before the classification head.

# Drop the classification head of the neural network to obtain nn_without_head,
# then combine it with a decision forest in a single Keras model, df_and_nn_model
nn_without_head = tf.keras.models.Model(inputs=nn_model.inputs, outputs=last_layer)
df_and_nn_model = tfdf.keras.RandomForestModel(preprocessing=nn_without_head)

Warning: The `num_threads` constructor argument is not set and the number of CPU is os.cpu_count()=32 > 32. Setting num_threads to 32. Set num_threads manually to use more than 32 cpus.
Use /tmpfs/tmp/tmpzwv9a980 as temporary training directory

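Training and evaluating the models

The model is trained in two stages. First, train the neural network with its own classification head:

# Limit the cell output height to 300
%set_cell_height 300

# Compile the neural network
nn_model.compile(
    optimizer=tf.keras.optimizers.Adam(),  # Adam optimizer
    loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),  # Sparse categorical cross-entropy on logits
    metrics=["accuracy"])  # Report the accuracy

# Train for 10 epochs on the training dataset, validating on the test dataset
nn_model.fit(x=train_ds, validation_data=test_ds, epochs=10)
# Print a summary of the neural network
nn_model.summary()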
Epoch 1/10
/tmpfs/src/tf_docs_env/lib/python3.9/site-packages/keras/src/engine/functional.py:642: UserWarning: Input dict contained keys ['bill_depth_mm', 'flipper_length_mm', 'body_mass_g', 'sex', 'year'] which did not match any model input. They will be ignored by the model.
  inputs = self._flatten_to_reference_inputs(inputs)
WARNING: All log messages before absl::InitializeLog() is called are written to STDERR
I0000 00:00:1700483606.110085  457876 device_compiler.h:186] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.
1/1 [==============================] - 2s 2s/step - loss: 1.4043 - accuracy: 0.0161 - val_loss: 1.3502 - val_accuracy: 0.0208
Epoch 2/10
1/1 [==============================] - 0s 21ms/step - loss: 1.3963 - accuracy: 0.0161 - val_loss: 1.3436 - val_accuracy: 0.0208
Epoch 3/10
1/1 [==============================] - 0s 21ms/step - loss: 1.3885 - accuracy: 0.0161 - val_loss: 1.3371 - val_accuracy: 0.0208
Epoch 4/10
1/1 [==============================] - 0s 21ms/step - loss: 1.3809 - accuracy: 0.0161 - val_loss: 1.3305 - val_accuracy: 0.0208
Epoch 5/10
1/1 [==============================] - 0s 21ms/step - loss: 1.3733 - accuracy: 0.0161 - val_loss: 1.3241 - val_accuracy: 0.0208
Epoch 6/10
1/1 [==============================] - 0s 20ms/step - loss: 1.3658 - accuracy: 0.0161 - val_loss: 1.3177 - val_accuracy: 0.0208
Epoch 7/10
1/1 [==============================] - 0s 20ms/step - loss: 1.3584 - accuracy: 0.0161 - val_loss: 1.3113 - val_accuracy: 0.0208
Epoch 8/10
1/1 [==============================] - 0s 20ms/step - loss: 1.3511 - accuracy: 0.0081 - val_loss: 1.3050 - val_accuracy: 0.0208
Epoch 9/10
1/1 [==============================] - 0s 21ms/step - loss: 1.3440 - accuracy: 0.0081 - val_loss: 1.2988 - val_accuracy: 0.0208
Epoch 10/10
1/1 [==============================] - 0s 21ms/step - loss: 1.3369 - accuracy: 0.0121 - val_loss: 1.2927 - val_accuracy: 0.0312
Model: "model_1"
__________________________________________________________________________________________________
 Layer (type)                          Output Shape   Param #   Connected to
==================================================================================================
 island (InputLayer)                   [(None, 1)]    0         []
 bill_length_mm (InputLayer)           [(None, 1)]    0         []
 string_lookup (StringLookup)          (None, 1)      0         ['island[0][0]']
 normalization (Normalization)         (None, 1)      3         ['bill_length_mm[0][0]']
 category_encoding (CategoryEncoding)  (None, 32)     0         ['string_lookup[0][0]']
 concatenate (Concatenate)             (None, 33)     0         ['normalization[0][0]', 'category_encoding[0][0]']
 dense (Dense)                         (None, 16)     544       ['concatenate[0][0]']
 dense_1 (Dense)                       (None, 3)      51        ['dense[0][0]']
==================================================================================================
Total params: 598 (2.34 KB)
Trainable params: 595 (2.32 KB)
Non-trainable params: 3 (16.00 Byte)
__________________________________________________________________________________________________
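
The parameter counts reported in the summary can be checked by hand. The short sketch below recomputes them from the layer shapes shown above. The helper name `dense_params` is made up for illustration, and the breakdown of the 3 non-trainable values (mean, variance, and a count held by the Normalization layer) is an assumption about Keras internals rather than something the source states.

```python
def dense_params(n_inputs: int, n_units: int) -> int:
    """Parameters of a fully connected layer: one weight per input/unit pair plus one bias per unit."""
    return n_inputs * n_units + n_units


# Width of the concatenated vector: 1 normalized numeric value + 32 category-encoding slots.
concat_width = 1 + 32

hidden = dense_params(concat_width, 16)  # "dense" layer
output = dense_params(16, 3)             # "dense_1" layer

# Assumption: the Normalization layer stores mean, variance and a count for its single feature.
normalization_state = 3

trainable = hidden + output
total = trainable + normalization_state

print(hidden, output, trainable, total)
```

The four printed values correspond to the `dense` row, the `dense_1` row, "Trainable params", and "Total params" in the summary.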

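The neural network layers are shared between the two models. Now that the neural network has been trained, the decision forest model will be fit on the trained output of those neural network layers.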
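
The two-stage idea can be illustrated with a toy in plain Python. This is not the TF-DF API and not the code used in this article: `SharedLayer`, `train_nn`, and `fit_stump` are hypothetical names, a single linear unit stands in for the shared neural network layers, and a one-split decision stump stands in for the decision forest. The sketch only shows the ordering: the shared layer is trained first through a neural head, and the tree model is then fit on the layer's output without changing it.

```python
import math


class SharedLayer:
    """A single linear unit standing in for the shared neural network layers."""

    def __init__(self):
        self.w, self.b = 0.0, 0.0

    def __call__(self, x):
        return self.w * x + self.b


def train_nn(layer, xs, ys, epochs=200, lr=0.5):
    """Stage 1: train the shared layer through a logistic head with gradient descent."""
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-layer(x)))
            grad_w += (p - y) * x
            grad_b += p - y
        layer.w -= lr * grad_w / len(xs)
        layer.b -= lr * grad_b / len(xs)


def fit_stump(layer, xs, ys):
    """Stage 2: fit a one-split tree on the layer output; the layer itself is not updated."""
    zs = [layer(x) for x in xs]
    best_threshold, best_accuracy = None, -1.0
    for t in sorted(set(zs)):
        accuracy = sum((z >= t) == bool(y) for z, y in zip(zs, ys)) / len(ys)
        if accuracy > best_accuracy:
            best_threshold, best_accuracy = t, accuracy
    return best_threshold, best_accuracy


# Toy data: negative inputs belong to class 0, positive inputs to class 1.
xs = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]
ys = [0, 0, 0, 1, 1, 1]

layer = SharedLayer()
train_nn(layer, xs, ys)                          # first train the neural part
threshold, accuracy = fit_stump(layer, xs, ys)   # then fit the tree on its output
print(f"w={layer.w:.3f}, threshold={threshold:.3f}, accuracy={accuracy:.2f}")
```

Because the stump only reads the layer's output, the order matters: fitting it before stage 1 would give it a constant feature (the layer starts at `w = 0`), which mirrors why the decision forest here is trained after the neural network.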

# Fit df_and_nn_model on the training dataset train_ds
df_and_nn_model.fit(x=train_ds)
Reading training dataset...
Training dataset read in 0:00:00.293304. Found 248 examples.
Training model...
Model trained in 0:00:00.045032
Compiling model...
Model compiled.
[INFO 23-11-20 12:33:27.2559 UTC kernel.cc:1233] Loading model from path /tmpfs/tmp/tmpzwv9a980/model/ with prefix 3397b294ee2f42a4
[INFO 23-11-20 12:33:27.2721 UTC decision_forest.cc:660] Model loaded with 300 root(s), 5280 node(s), and 7 input feature(s).
[INFO 23-11-20 12:33:27.2721 UTC kernel.cc:1061] Use fast generic engine
<keras.src.callbacks.History at 0x7fd6a07d5ac0>

Now evaluate the combined model:

# Compile the model so that accuracy is reported
df_and_nn_model.compile(metrics=["accuracy"])
# Print the evaluation results
print("Evaluation:", df_and_nn_model.evaluate(test_ds))
1/1 [==============================] - 0s 162ms/step - loss: 0.0000e+00 - accuracy: 0.9479
Evaluation: [0.0, 0.9479166865348816]

Compare this with the neural network alone:

# Evaluate the neural network model on the test dataset and print the result
print("Evaluation :", nn_model.evaluate(test_ds))
1/1 [==============================] - 0s 13ms/step - loss: 1.2927 - accuracy: 0.0312
Evaluation : [1.2926578521728516, 0.03125]
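
To make the gap between the two results concrete, the accuracies can be converted into counts of correctly classified test examples. The sketch below copies the two accuracy values from the outputs above. The test set size of 96 is an assumption: the source does not state it, but both accuracies are whole multiples of 1/96 and the evaluation ran as a single batch.

```python
# Accuracy values copied from the two evaluation outputs above.
df_and_nn_accuracy = 0.9479166865348816
nn_accuracy = 0.03125

# Assumption: the test set holds 96 examples (inferred, not stated in the source).
n_test = 96

df_and_nn_correct = round(df_and_nn_accuracy * n_test)
nn_correct = round(nn_accuracy * n_test)

print(f"decision forest + neural network: {df_and_nn_correct}/{n_test} correct")
print(f"neural network alone:             {nn_correct}/{n_test} correct")
```

Under that assumption, the combined model classifies 91 of 96 test examples correctly, while the neural network on its own, after only 10 single-step epochs, classifies 3 correctly.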

