Text Summarization with FLAN-T5

Text Summarization with FLAN-T5 — ROCm Blogs (amd.com)

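In this blog, we show how to fine-tune the FLAN-T5 language model for text summarization using Hugging Face on an AMD GPU with ROCm.

Introduction

FLAN-T5 is an open-source large language model released by Google and an enhancement of the earlier T5 model. It is an encoder-decoder model that has been pre-trained on instruction datasets, which means the model can perform specific tasks such as summarization, classification, and translation. For more details on FLAN-T5, refer to the [original paper](https://arxiv.org/pdf/2210.11416.pdf). For the full list of improvements over the earlier T5 model, refer to [this model card](https://huggingface.co/docs/transformers/model_doc/t5v1.1).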

Prerequisites

• ROCm: see "ROCm quick start installation guide for Linux" in the ROCm installation (Linux) documentation
• PyTorch: see "Installing PyTorch for ROCm" in the ROCm installation (Linux) documentation
• A Linux operating system: see "System requirements (Linux)" in the ROCm installation (Linux) documentation
• An AMD GPU: see "System requirements (Linux)" in the ROCm installation (Linux) documentation

Make sure the system recognizes your GPU:

!rocm-smi --showproductname
================= ROCm System Management Interface ================
========================= Product Info ============================
GPU[0] : Card series: Instinct MI210
GPU[0] : Card model: 0x0c34
GPU[0] : Card vendor: Advanced Micro Devices, Inc. [AMD/ATI]
GPU[0] : Card SKU: D67301
===================================================================
===================== End of ROCm SMI Log =========================

Let's check that the correct version of ROCm is installed.

!apt show rocm-libs -a
Package: rocm-libs
Version: 5.7.0.50700-63~22.04
Priority: optional
Section: devel
Maintainer: ROCm Libs Support <rocm-libs.support@amd.com>
Installed-Size: 13.3 kB
Depends: hipblas (= 1.1.0.50700-63~22.04), hipblaslt (= 0.3.0.50700-63~22.04), hipfft (= 1.0.12.50700-63~22.04), hipsolver (= 1.8.1.50700-63~22.04), hipsparse (= 2.3.8.50700-63~22.04), miopen-hip (= 2.20.0.50700-63~22.04), rccl (= 2.17.1.50700-63~22.04), rocalution (= 2.1.11.50700-63~22.04), rocblas (= 3.1.0.50700-63~22.04), rocfft (= 1.0.23.50700-63~22.04), rocrand (= 2.10.17.50700-63~22.04), rocsolver (= 3.23.0.50700-63~22.04), rocsparse (= 2.5.4.50700-63~22.04), rocm-core (= 5.7.0.50700-63~22.04), hipblas-dev (= 1.1.0.50700-63~22.04), hipblaslt-dev (= 0.3.0.50700-63~22.04), hipcub-dev (= 2.13.1.50700-63~22.04), hipfft-dev (= 1.0.12.50700-63~22.04), hipsolver-dev (= 1.8.1.50700-63~22.04), hipsparse-dev (= 2.3.8.50700-63~22.04), miopen-hip-dev (= 2.20.0.50700-63~22.04), rccl-dev (= 2.17.1.50700-63~22.04), rocalution-dev (= 2.1.11.50700-63~22.04), rocblas-dev (= 3.1.0.50700-63~22.04), rocfft-dev (= 1.0.23.50700-63~22.04), rocprim-dev (= 2.13.1.50700-63~22.04), rocrand-dev (= 2.10.17.50700-63~22.04), rocsolver-dev (= 3.23.0.50700-63~22.04), rocsparse-dev (= 2.5.4.50700-63~22.04), rocthrust-dev (= 2.18.0.50700-63~22.04), rocwmma-dev (= 1.2.0.50700-63~22.04)
Homepage: https://github.com/RadeonOpenCompute/ROCm
Download-Size: 1012 B
APT-Manual-Installed: yes
APT-Sources: http://repo.radeon.com/rocm/apt/5.7 jammy/main amd64 Packages
Description: Radeon Open Compute (ROCm) Runtime software stack

Make sure PyTorch also recognizes the GPU:

import torch
print(f"number of GPUs: {torch.cuda.device_count()}")
print([torch.cuda.get_device_name(i) for i in range(torch.cuda.device_count())])
number of GPUs: 1
['AMD Radeon Graphics']

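Before you begin, make sure all the required libraries are installed. The `evaluate`, `rouge_score`, and `nltk` packages are included here because the metric code later in this blog imports them:

!pip install -q transformers accelerate einops datasets evaluate rouge_score nltk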
!pip install --upgrade SQLAlchemy==1.4.46
!pip install -q alembic==1.4.1 numpy==1.23.4 grpcio-status==1.33.2 protobuf==3.19.6 
WARNING: Running pip as the 'root' user can result in broken permissions and conflicting behaviour with the system package manager. It is recommended to use a virtual environment instead: https://pip.pypa.io/warnings/venv

Next, import the modules used in this blog:

import time
import torch
from transformers import AutoModelForSeq2SeqLM, AutoTokenizer, Seq2SeqTrainingArguments, Seq2SeqTrainer, DataCollatorForSeq2Seq

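Loading the model

Let's load the model and its tokenizer. FLAN-T5 comes in several sizes, from `small` to `xxl`. We first run some inference with the `xxl` variant, then show how to fine-tune FLAN-T5 on a text summarization task using the `small` variant.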

start_time = time.time()
model_checkpoint = "google/flan-t5-xxl"
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
print(f"Loaded in {time.time() - start_time: .2f} seconds")
print(model)
Loading checkpoint shards: 100%|██████████| 5/5 [01:23<00:00, 16.69s/it]
Loaded in  85.46 seconds
T5ForConditionalGeneration((shared): Embedding(32128, 4096)(encoder): T5Stack((embed_tokens): Embedding(32128, 4096)(block): ModuleList((0): T5Block((layer): ModuleList((0): T5LayerSelfAttention((SelfAttention): T5Attention((q): Linear(in_features=4096, out_features=4096, bias=False)(k): Linear(in_features=4096, out_features=4096, bias=False)(v): Linear(in_features=4096, out_features=4096, bias=False)(o): Linear(in_features=4096, out_features=4096, bias=False)(relative_attention_bias): Embedding(32, 64))(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False))(1): T5LayerFF((DenseReluDense): T5DenseGatedActDense((wi_0): Linear(in_features=4096, out_features=10240, bias=False)(wi_1): Linear(in_features=4096, out_features=10240, bias=False)(wo): Linear(in_features=10240, out_features=4096, bias=False)(dropout): Dropout(p=0.1, inplace=False)(act): NewGELUActivation())(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False))))(1-23): 23 x T5Block((layer): ModuleList((0): T5LayerSelfAttention((SelfAttention): T5Attention((q): Linear(in_features=4096, out_features=4096, bias=False)(k): Linear(in_features=4096, out_features=4096, bias=False)(v): Linear(in_features=4096, out_features=4096, bias=False)(o): Linear(in_features=4096, out_features=4096, bias=False))(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False))(1): T5LayerFF((DenseReluDense): T5DenseGatedActDense((wi_0): Linear(in_features=4096, out_features=10240, bias=False)(wi_1): Linear(in_features=4096, out_features=10240, bias=False)(wo): Linear(in_features=10240, out_features=4096, bias=False)(dropout): Dropout(p=0.1, inplace=False)(act): NewGELUActivation())(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False)))))(final_layer_norm): 
FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False))(decoder): T5Stack((embed_tokens): Embedding(32128, 4096)(block): ModuleList((0): T5Block((layer): ModuleList((0): T5LayerSelfAttention((SelfAttention): T5Attention((q): Linear(in_features=4096, out_features=4096, bias=False)(k): Linear(in_features=4096, out_features=4096, bias=False)(v): Linear(in_features=4096, out_features=4096, bias=False)(o): Linear(in_features=4096, out_features=4096, bias=False)(relative_attention_bias): Embedding(32, 64))(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False))(1): T5LayerCrossAttention((EncDecAttention): T5Attention((q): Linear(in_features=4096, out_features=4096, bias=False)(k): Linear(in_features=4096, out_features=4096, bias=False)(v): Linear(in_features=4096, out_features=4096, bias=False)(o): Linear(in_features=4096, out_features=4096, bias=False))(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False))(2): T5LayerFF((DenseReluDense): T5DenseGatedActDense((wi_0): Linear(in_features=4096, out_features=10240, bias=False)(wi_1): Linear(in_features=4096, out_features=10240, bias=False)(wo): Linear(in_features=10240, out_features=4096, bias=False)(dropout): Dropout(p=0.1, inplace=False)(act): NewGELUActivation())(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False))))(1-23): 23 x T5Block((layer): ModuleList((0): T5LayerSelfAttention((SelfAttention): T5Attention((q): Linear(in_features=4096, out_features=4096, bias=False)(k): Linear(in_features=4096, out_features=4096, bias=False)(v): Linear(in_features=4096, out_features=4096, bias=False)(o): Linear(in_features=4096, out_features=4096, bias=False))(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, 
inplace=False))(1): T5LayerCrossAttention((EncDecAttention): T5Attention((q): Linear(in_features=4096, out_features=4096, bias=False)(k): Linear(in_features=4096, out_features=4096, bias=False)(v): Linear(in_features=4096, out_features=4096, bias=False)(o): Linear(in_features=4096, out_features=4096, bias=False))(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False))(2): T5LayerFF((DenseReluDense): T5DenseGatedActDense((wi_0): Linear(in_features=4096, out_features=10240, bias=False)(wi_1): Linear(in_features=4096, out_features=10240, bias=False)(wo): Linear(in_features=10240, out_features=4096, bias=False)(dropout): Dropout(p=0.1, inplace=False)(act): NewGELUActivation())(layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False)))))(final_layer_norm): FusedRMSNorm(torch.Size([4096]), eps=1e-06, elementwise_affine=True)(dropout): Dropout(p=0.1, inplace=False))(lm_head): Linear(in_features=4096, out_features=32128, bias=False)
)

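Running inference

Notably, we can use the FLAN-T5 model directly, without any fine-tuning. Let's start with some simple inference.

For example, we can ask the model to answer a simple question: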

inputs = tokenizer("How to make milk coffee", return_tensors="pt")
outputs = model.generate(**inputs, max_new_tokens=100)
print(tokenizer.batch_decode(outputs, skip_special_tokens=True))
['Pour a cup of coffee into a mug. Add a tablespoon of milk. Add a pinch of sugar.']

Or we can ask it to summarize a piece of text:

text = """ summarize: 
Amy: Hey Mark, have you heard about the new movie coming out this weekend?
Mark: Oh, no, I haven't. What's it called?
Amy: It's called "Stellar Odyssey." It's a sci-fi thriller with amazing special effects.
Mark: Sounds interesting. Who's in it?
Amy: The main lead is Emily Stone, and she's fantastic in the trailer. The plot revolves around a journey to a distant galaxy.
Mark: Nice! I'm definitely up for a good sci-fi flick. Want to catch it together on Saturday?
Amy: Sure, that sounds great! Let's meet at the theater around 7 pm.
"""
inputs = tokenizer(text, return_tensors="pt").input_ids
outputs = model.generate(inputs, max_new_tokens=100, do_sample=False)
tokenizer.decode(outputs[0], skip_special_tokens=True)
'Amy and Mark are going to see "Stellar Odyssey" on Saturday at 7 pm.'

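Fine-tuning

In this section, we fine-tune the model for the summarization task. We use the code from this tutorial as our guide. As mentioned, we use the `small` variant of the model for fine-tuning: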

start_time = time.time()
model_checkpoint = "google/flan-t5-small"
model = AutoModelForSeq2SeqLM.from_pretrained(model_checkpoint)
tokenizer = AutoTokenizer.from_pretrained(model_checkpoint)
print(f"Loaded in {time.time() - start_time: .2f} seconds")

Loading the dataset

Our example dataset is the samsum dataset, which contains about 16K Messenger-like conversations with their summaries.

from datasets import load_dataset
from evaluate import load

raw_datasets = load_dataset("samsum")

Here is a sample from our dataset:

print('Dialogue: ')
print(raw_datasets['train']['dialogue'][100])
print() 
print('Summary: ', raw_datasets['train']['summary'][100])
Dialogue: 
Gabby: How is you? Settling into the new house OK?
Sandra: Good. The kids and the rest of the menagerie are doing fine. The dogs absolutely love the new garden. Plenty of room to dig and run around.
Gabby: What about the hubby?
Sandra: Well, apart from being his usual grumpy self I guess he's doing OK.
Gabby: :-D yeah sounds about right for Jim.
Sandra: He's a man of few words. No surprises there. Give him a backyard shed and that's the last you'll see of him for months.
Gabby: LOL that describes most men I know.
Sandra: Ain't that the truth! 
Gabby: Sure is. :-) My one might as well move into the garage. Always tinkering and building something in there.
Sandra: Ever wondered what he's doing in there?
Gabby: All the time. But he keeps the place locked.
Sandra: Prolly building a portable teleporter or something. ;-)
Gabby: Or a time machine... LOL
Sandra: Or a new greatly improved Rabbit :-P
Gabby: I wish... Lmfao!

Summary:  Sandra is setting into the new house; her family is happy with it. Then Sandra and Gabby discuss the nature of their men and laugh about their habit of spending time in the garage or a shed.

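Setting up the metric

Next, we load the metric for this task. Summarization is typically evaluated with the ROUGE (Recall-Oriented Understudy for Gisting Evaluation) metrics, which quantify how similar a generated summary is to a reference summary. More specifically, these metrics measure the overlap of n-grams (sequences of n consecutive words) between the system summaries and the reference summaries. For more details on this metric, see the link.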
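To make the n-gram overlap idea concrete, here is a minimal sketch of a ROUGE-N style score in plain Python. It is only an illustration: the helper names `ngrams` and `rouge_n` are hypothetical, the tokenization is a simple whitespace split, and no stemming is applied, so its numbers will not match the `rouge` metric from the `evaluate` library exactly.

```python
from collections import Counter

def ngrams(tokens, n):
    """Return a multiset (Counter) of the n-grams in a list of tokens."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))

def rouge_n(prediction, reference, n=1):
    """Toy ROUGE-N: precision, recall and F1 from clipped n-gram overlap."""
    pred = ngrams(prediction.lower().split(), n)
    ref = ngrams(reference.lower().split(), n)
    # Counter intersection keeps the minimum count of each shared n-gram.
    overlap = sum((pred & ref).values())
    precision = overlap / max(sum(pred.values()), 1)
    recall = overlap / max(sum(ref.values()), 1)
    f1 = 0.0 if overlap == 0 else 2 * precision * recall / (precision + recall)
    return {"precision": precision, "recall": recall, "f1": f1}

scores = rouge_n(
    "amy and mark will see a movie",
    "amy and mark are going to see a movie",
    n=1,
)
print(scores)
```

In this example six of the seven predicted unigrams also appear in the nine-word reference, so precision is 6/7, recall is 6/9, and the F1 score is 0.75.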

from evaluate import load
metric = load("rouge")
print(metric)
EvaluationModule(name: "rouge", module_type: "metric", features: [{'predictions': Value(dtype='string', id='sequence'), 'references': Sequence(feature=Value(dtype='string', id='sequence'), length=-1, id=None)}, {'predictions': Value(dtype='string', id='sequence'), 'references': Value(dtype='string', id='sequence')}], usage: """
Calculates average rouge scores for a list of hypotheses and references
Args:
    predictions: list of predictions to score. Each prediction
        should be a string with tokens separated by spaces.
    references: list of reference for each prediction. Each
        reference should be a string with tokens separated by spaces.
    rouge_types: A list of rouge types to calculate.
        Valid names:
        `"rouge{n}"` (e.g. `"rouge1"`, `"rouge2"`) where: {n} is the n-gram based scoring,
        `"rougeL"`: Longest common subsequence based scoring.
        `"rougeLsum"`: rougeLsum splits text using `"
"`.
        See details in https://github.com/huggingface/datasets/issues/617
    use_stemmer: Bool indicating whether Porter stemmer should be used to strip word suffixes.
    use_aggregator: Return aggregates if this is set to True
Returns:
    rouge1: rouge_1 (f1),
    rouge2: rouge_2 (f1),
    rougeL: rouge_l (f1),
    rougeLsum: rouge_lsum (f1)
Examples:
    >>> rouge = evaluate.load('rouge')
    >>> predictions = ["hello there", "general kenobi"]
    >>> references = ["hello there", "general kenobi"]
    >>> results = rouge.compute(predictions=predictions, references=references)
    >>> print(results)
    {'rouge1': 1.0, 'rouge2': 1.0, 'rougeL': 1.0, 'rougeLsum': 1.0}
""", stored examples: 0)

We need a function to compute the ROUGE metrics:

import nltk
nltk.download('punkt')
import numpy as np

def compute_metrics(eval_pred):
    predictions, labels = eval_pred
    decoded_preds = tokenizer.batch_decode(predictions, skip_special_tokens=True)

    # We need to replace -100 in the labels since we can't decode it
    labels = np.where(labels != -100, labels, tokenizer.pad_token_id)
    decoded_labels = tokenizer.batch_decode(labels, skip_special_tokens=True)

    # Add a newline after each sentence, as expected by the ROUGE metric
    decoded_preds = ["\n".join(nltk.sent_tokenize(pred.strip())) for pred in decoded_preds]
    decoded_labels = ["\n".join(nltk.sent_tokenize(label.strip())) for label in decoded_labels]

    # Compute the metrics
    result = metric.compute(predictions=decoded_preds, references=decoded_labels, use_stemmer=True, use_aggregator=True)

    # Scale the scores to percentages
    result = {key: value * 100 for key, value in result.items()}

    # Compute the average length of the generated text
    prediction_lens = [np.count_nonzero(pred != tokenizer.pad_token_id) for pred in predictions]
    result["gen_len"] = np.mean(prediction_lens)

    return {k: round(v, 4) for k, v in result.items()}

Processing the data

Let's create a function to process the data, which tokenizes the input and the output of each sample. We also set length thresholds to truncate the inputs and outputs.

prefix = "summarize: "

max_input_length = 1024
max_target_length = 128

def preprocess_function(examples):
    inputs = [prefix + doc for doc in examples["dialogue"]]
    model_inputs = tokenizer(inputs, max_length=max_input_length, truncation=True)

    # Set up the tokenizer for targets: the labels are the reference summaries
    labels = tokenizer(text_target=examples["summary"], max_length=max_target_length, truncation=True)

    model_inputs["labels"] = labels["input_ids"]
    return model_inputs

tokenized_datasets = raw_datasets.map(preprocess_function, batched=True)

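Training the model

To train our model, we need a few things:

1. A data collator, which dynamically pads the sentences to the longest length in each batch during collation, instead of padding the whole dataset to the maximum length.
2. A `TrainingArguments` class to customize how the model is trained.
3. A Trainer class, which is an API for training in PyTorch.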
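The dynamic padding in item 1 can be sketched in a few lines of plain Python. This is only an illustration of the idea, not the library's implementation: `pad_batch` is a hypothetical helper and `pad_id=0` is an assumed padding token id. The real `DataCollatorForSeq2Seq` additionally pads the labels, by default with -100, which is why `compute_metrics` replaces -100 before decoding.

```python
def pad_batch(sequences, pad_id=0):
    """Pad each list of token ids to the longest length in this batch only."""
    longest = max(len(seq) for seq in sequences)
    input_ids = [seq + [pad_id] * (longest - len(seq)) for seq in sequences]
    # 1 marks a real token, 0 marks padding.
    attention_mask = [[1] * len(seq) + [0] * (longest - len(seq)) for seq in sequences]
    return {"input_ids": input_ids, "attention_mask": attention_mask}

batch = pad_batch([[5, 6, 7], [8, 9], [10]])
print(batch["input_ids"])       # [[5, 6, 7], [8, 9, 0], [10, 0, 0]]
print(batch["attention_mask"])  # [[1, 1, 1], [1, 1, 0], [1, 0, 0]]
```

Because the padded width depends only on the longest sequence in the current batch, batches of short dialogues stay small instead of being stretched to the 1024-token input limit.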

First, we create the data collator:

data_collator = DataCollatorForSeq2Seq(tokenizer, model=model)

Next, let's set up our training arguments with `Seq2SeqTrainingArguments`:

batch_size = 16
model_name = model_checkpoint.split("/")[-1]
args = Seq2SeqTrainingArguments(
    f"{model_name}-finetuned-samsum",
    evaluation_strategy="epoch",
    learning_rate=2e-5,
    per_device_train_batch_size=batch_size,
    per_device_eval_batch_size=batch_size,
    weight_decay=0.01,
    save_total_limit=3,
    num_train_epochs=2,
    predict_with_generate=True,
    fp16=False,
    push_to_hub=False,
)

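Note: Because the model was pre-trained on Google TPUs rather than on GPUs, we found that we need to set `fp16=False` or `bf16=True`. Otherwise we run into overflow issues, which produce NaN values in our loss. This is likely due to the differences between the half-precision floating-point formats `fp16` and `bf16`.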
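One difference between the two formats that may explain the overflow is their range: `fp16` has 5 exponent bits and a largest finite value of 65504, while `bf16` keeps the 8 exponent bits of `fp32` and gives up mantissa precision instead. The sketch below illustrates this with the standard library only. The helpers `to_fp16` and `to_bf16` are hypothetical, and `to_bf16` emulates the format by truncating a float32 to its top 16 bits, which is a simplification (real hardware usually rounds to nearest).

```python
import struct

def to_fp16(x):
    """Round-trip x through IEEE half precision; overflow becomes +/-inf."""
    try:
        return struct.unpack("<e", struct.pack("<e", x))[0]
    except (OverflowError, struct.error):
        return float("inf") if x > 0 else float("-inf")

def to_bf16(x):
    """Emulate bfloat16 by keeping only the top 16 bits of a float32."""
    bits = struct.unpack("<I", struct.pack("<f", x))[0]
    return struct.unpack("<f", struct.pack("<I", bits & 0xFFFF0000))[0]

value = 70000.0  # an activation slightly above the fp16 maximum of 65504
print(to_fp16(value))  # inf
print(to_bf16(value))  # 69632.0
```

The same value overflows to infinity in `fp16` but stays finite, with reduced precision, in `bf16`. Once an infinity appears in the forward pass, later operations can turn it into NaN.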

Finally, we need to set up the trainer:

trainer = Seq2SeqTrainer(
    model,
    args,
    train_dataset=tokenized_datasets["train"],
    eval_dataset=tokenized_datasets["validation"],
    data_collator=data_collator,
    tokenizer=tokenizer,
    compute_metrics=compute_metrics,
)

With that, we are ready to train our model!

trainer.train()
 [1842/1842 05:37, Epoch 2/2]
Epoch  Training Loss  Validation Loss  Rouge1     Rouge2     Rougel     Rougelsum  Gen Len
1      1.865700       1.693366         43.551000  20.046200  36.170400  40.096200  16.926700
2      1.816700       1.685862         43.506000  19.934800  36.278300  40.156700  16.837400

Running the trainer above should create a local folder, `flan-t5-small-finetuned-samsum`, that stores our model checkpoints.
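If you are not sure which checkpoint the run left behind, a small helper can pick the most recent one. This is a minimal sketch that assumes the checkpoint subfolders are named `checkpoint-<step>`, as in the `checkpoint-1500` path used in this blog. The function `latest_checkpoint` is a hypothetical helper, and the demo runs against a temporary directory rather than a real training output.

```python
import re
import tempfile
from pathlib import Path

def latest_checkpoint(output_dir):
    """Return the checkpoint-<step> subfolder with the highest step, or None."""
    pattern = re.compile(r"^checkpoint-(\d+)$")
    found = []
    for child in Path(output_dir).iterdir():
        match = pattern.match(child.name)
        if child.is_dir() and match:
            found.append((int(match.group(1)), child))
    if not found:
        return None
    return max(found, key=lambda item: item[0])[1]

# Demo on a throwaway directory that mimics the trainer's output layout.
with tempfile.TemporaryDirectory() as tmp:
    for step in (500, 1000, 1500):
        (Path(tmp) / f"checkpoint-{step}").mkdir()
    print(latest_checkpoint(tmp).name)  # checkpoint-1500
```

Comparing the step numbers as integers matters here, since a plain string sort would place `checkpoint-500` after `checkpoint-1500`.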

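Inference

Once we have the fine-tuned model, we can use it for inference! Let's first reload the tokenizer and the fine-tuned model from our local checkpoint.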

model = AutoModelForSeq2SeqLM.from_pretrained("flan-t5-small-finetuned-samsum/checkpoint-1500")
tokenizer = AutoTokenizer.from_pretrained("flan-t5-small-finetuned-samsum/checkpoint-1500")

Next, we provide some text to summarize. It is important to add the prefix, as shown below:

text = """ summarize: 
Hannah: Hey, Mark, have you decided on your New Year's resolution yet?
Mark: Yeah, I'm thinking of finally hitting the gym regularly. What about you?
Hannah: I'm planning to read more books this year, at least one per month.
Mark: That sounds like a great goal. Any particular genre you're interested in?
Hannah: I want to explore more classic literature. Maybe start with some Dickens or Austen.
Mark: Nice choice. I'll hold you to it. We can discuss our progress over coffee.
Hannah: Deal! Accountability partners it is.
"""

Finally, we encode the input and generate the summary:

inputs = tokenizer(text, return_tensors="pt").input_ids
outputs = model.generate(inputs, max_new_tokens=100, do_sample=False)
tokenizer.decode(outputs[0], skip_special_tokens=True)
'Hannah is planning to read more books this year. Mark will hold Hannah to it.'
