HuggingFace Transformer

Introduction to NLP

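Introduction to HuggingFace

Hugging Face is best known for its work in NLP, and most of the models it provides are based on the Transformer architecture. For ease of use, Hugging Face also offers the following projects:

  • Transformers (GitHub, official documentation): Transformers provides thousands of pretrained models for different tasks across the text, audio, and computer-vision domains. It is the core HuggingFace project; learning HuggingFace largely means learning how to use this library.
  • Datasets (GitHub, official documentation): a lightweight dataset framework with two main features: (1) download and preprocess common public datasets in one line of code; (2) a fast, easy-to-use library for data preprocessing.
  • Accelerate (GitHub, official documentation): helps PyTorch users run on multi-GPU/TPU/fp16 setups with little effort.
  • Spaces: hosts many fun deep learning applications that you can try out.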

Transformers


1. Pipeline

A pipeline chains data preprocessing (the tokenizer), model invocation (the model), and result post-processing into a single workflow.

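How a Pipeline works

Calling pipeline(data, model, tokenizer, device) runs three stages in order: the tokenizer preprocesses the input, the model runs a forward pass, and the raw output is post-processed into the final result.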
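The three stages can be sketched in plain Python. This is a toy illustration, not the library's implementation: `TOY_VOCAB`, `ID2LABEL`, and `toy_forward` are made-up stand-ins for a real tokenizer and model, and device handling is left out.

```python
import math

# Hypothetical stand-ins for a real tokenizer vocabulary and label set.
TOY_VOCAB = {"[UNK]": 0, "good": 1, "bad": 2, "movie": 3}
ID2LABEL = {0: "negative", 1: "positive"}


def preprocess(text):
    """Stage 1: turn raw text into model inputs (here, a list of ids)."""
    words = text.lower().split()
    return [TOY_VOCAB.get(word, TOY_VOCAB["[UNK]"]) for word in words]


def toy_forward(input_ids):
    """Stage 2: a fake 'model' that returns one logit per label."""
    negative = float(input_ids.count(TOY_VOCAB["bad"]))
    positive = float(input_ids.count(TOY_VOCAB["good"]))
    return [negative, positive]


def postprocess(logits):
    """Stage 3: softmax over the logits, then report the best label."""
    peak = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return {"label": ID2LABEL[best], "score": probs[best]}


def toy_pipeline(text):
    """Chain the three stages, mirroring the structure of a pipeline call."""
    return postprocess(toy_forward(preprocess(text)))


print(toy_pipeline("good movie"))
```

A real pipeline swaps each stand-in for the actual tokenizer, model, and task-specific post-processing, but the order of the stages is the same.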

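Using a Pipeline

The most common approach is to build the model and the tokenizer separately, then pass both to pipeline together with the task type.
(For details on how each kind of pipeline is used, see the source code of the corresponding Pipeline class.)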
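A minimal sketch of that approach follows. It assumes the `uer/roberta-base-finetuned-dianping-chinese` checkpoint used later in this post is suitable for the `text-classification` task; the helper name `build_classifier` is made up for this example, and the import is done lazily only so the function can be defined without the library being loaded.

```python
def build_classifier(model_name="uer/roberta-base-finetuned-dianping-chinese"):
    """Build a text-classification pipeline from an explicit model and tokenizer."""
    # Imported inside the function so defining it does not require transformers.
    from transformers import (
        AutoModelForSequenceClassification,
        AutoTokenizer,
        pipeline,
    )

    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModelForSequenceClassification.from_pretrained(model_name)
    # The task string tells pipeline which pre/post-processing to wrap around the model.
    return pipeline("text-classification", model=model, tokenizer=tokenizer)


# Usage (downloads the checkpoint on first call):
# classifier = build_classifier()
# classifier("弱小的我也有大梦想!")
```

Passing the model and tokenizer explicitly makes it clear which checkpoint is in use, instead of relying on the task's default model.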

Pipeline task types

  • audio-classification {'impl': <class 'transformers.pipelines.audio_classification.AudioClassificationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForAudioClassification'>,), 'default': {'model': {'pt': ('superb/wav2vec2-base-superb-ks', '372e048')}}, 'type': 'audio'}
  • automatic-speech-recognition {'impl': <class 'transformers.pipelines.automatic_speech_recognition.AutomaticSpeechRecognitionPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForCTC'>, <class 'transformers.models.auto.modeling_auto.AutoModelForSpeechSeq2Seq'>), 'default': {'model': {'pt': ('facebook/wav2vec2-base-960h', '55bb623')}}, 'type': 'multimodal'}
  • feature-extraction {'impl': <class 'transformers.pipelines.feature_extraction.FeatureExtractionPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModel'>,), 'default': {'model': {'pt': ('distilbert-base-cased', '935ac13'), 'tf': ('distilbert-base-cased', '935ac13')}}, 'type': 'multimodal'}
  • text-classification {'impl': <class 'transformers.pipelines.text_classification.TextClassificationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForSequenceClassification'>,), 'default': {'model': {'pt': ('distilbert-base-uncased-finetuned-sst-2-english', 'af0f99b'), 'tf': ('distilbert-base-uncased-finetuned-sst-2-english', 'af0f99b')}}, 'type': 'text'}
  • token-classification {'impl': <class 'transformers.pipelines.token_classification.TokenClassificationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForTokenClassification'>,), 'default': {'model': {'pt': ('dbmdz/bert-large-cased-finetuned-conll03-english', 'f2482bf'), 'tf': ('dbmdz/bert-large-cased-finetuned-conll03-english', 'f2482bf')}}, 'type': 'text'}
  • question-answering {'impl': <class 'transformers.pipelines.question_answering.QuestionAnsweringPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForQuestionAnswering'>,), 'default': {'model': {'pt': ('distilbert-base-cased-distilled-squad', '626af31'), 'tf': ('distilbert-base-cased-distilled-squad', '626af31')}}, 'type': 'text'}
  • table-question-answering {'impl': <class 'transformers.pipelines.table_question_answering.TableQuestionAnsweringPipeline'>, 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForTableQuestionAnswering'>,), 'tf': (), 'default': {'model': {'pt': ('google/tapas-base-finetuned-wtq', '69ceee2'), 'tf': ('google/tapas-base-finetuned-wtq', '69ceee2')}}, 'type': 'text'}
  • visual-question-answering {'impl': <class 'transformers.pipelines.visual_question_answering.VisualQuestionAnsweringPipeline'>, 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForVisualQuestionAnswering'>,), 'tf': (), 'default': {'model': {'pt': ('dandelin/vilt-b32-finetuned-vqa', '4355f59')}}, 'type': 'multimodal'}
  • document-question-answering {'impl': <class 'transformers.pipelines.document_question_answering.DocumentQuestionAnsweringPipeline'>, 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForDocumentQuestionAnswering'>,), 'tf': (), 'default': {'model': {'pt': ('impira/layoutlm-document-qa', '52e01b3')}}, 'type': 'multimodal'}
  • fill-mask {'impl': <class 'transformers.pipelines.fill_mask.FillMaskPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForMaskedLM'>,), 'default': {'model': {'pt': ('distilroberta-base', 'ec58a5b'), 'tf': ('distilroberta-base', 'ec58a5b')}}, 'type': 'text'}
  • summarization {'impl': <class 'transformers.pipelines.text2text_generation.SummarizationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>,), 'default': {'model': {'pt': ('sshleifer/distilbart-cnn-12-6', 'a4f8f3e'), 'tf': ('t5-small', 'd769bba')}}, 'type': 'text'}
  • translation {'impl': <class 'transformers.pipelines.text2text_generation.TranslationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>,), 'default': {('en', 'fr'): {'model': {'pt': ('t5-base', '686f1db'), 'tf': ('t5-base', '686f1db')}}, ('en', 'de'): {'model': {'pt': ('t5-base', '686f1db'), 'tf': ('t5-base', '686f1db')}}, ('en', 'ro'): {'model': {'pt': ('t5-base', '686f1db'), 'tf': ('t5-base', '686f1db')}}}, 'type': 'text'}
  • text2text-generation {'impl': <class 'transformers.pipelines.text2text_generation.Text2TextGenerationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>,), 'default': {'model': {'pt': ('t5-base', '686f1db'), 'tf': ('t5-base', '686f1db')}}, 'type': 'text'}
  • text-generation {'impl': <class 'transformers.pipelines.text_generation.TextGenerationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>,), 'default': {'model': {'pt': ('gpt2', '6c0e608'), 'tf': ('gpt2', '6c0e608')}}, 'type': 'text'}
  • zero-shot-classification {'impl': <class 'transformers.pipelines.zero_shot_classification.ZeroShotClassificationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForSequenceClassification'>,), 'default': {'model': {'pt': ('facebook/bart-large-mnli', 'c626438'), 'tf': ('roberta-large-mnli', '130fb28')}, 'config': {'pt': ('facebook/bart-large-mnli', 'c626438'), 'tf': ('roberta-large-mnli', '130fb28')}}, 'type': 'text'}
  • zero-shot-image-classification {'impl': <class 'transformers.pipelines.zero_shot_image_classification.ZeroShotImageClassificationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForZeroShotImageClassification'>,), 'default': {'model': {'pt': ('openai/clip-vit-base-patch32', 'f4881ba'), 'tf': ('openai/clip-vit-base-patch32', 'f4881ba')}}, 'type': 'multimodal'}
  • zero-shot-audio-classification {'impl': <class 'transformers.pipelines.zero_shot_audio_classification.ZeroShotAudioClassificationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModel'>,), 'default': {'model': {'pt': ('laion/clap-htsat-fused', '973b6e5')}}, 'type': 'multimodal'}
  • conversational {'impl': <class 'transformers.pipelines.conversational.ConversationalPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForSeq2SeqLM'>, <class 'transformers.models.auto.modeling_auto.AutoModelForCausalLM'>), 'default': {'model': {'pt': ('microsoft/DialoGPT-medium', '8bada3b'), 'tf': ('microsoft/DialoGPT-medium', '8bada3b')}}, 'type': 'text'}
  • image-classification {'impl': <class 'transformers.pipelines.image_classification.ImageClassificationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForImageClassification'>,), 'default': {'model': {'pt': ('google/vit-base-patch16-224', '5dca96d'), 'tf': ('google/vit-base-patch16-224', '5dca96d')}}, 'type': 'image'}
  • image-segmentation {'impl': <class 'transformers.pipelines.image_segmentation.ImageSegmentationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForImageSegmentation'>, <class 'transformers.models.auto.modeling_auto.AutoModelForSemanticSegmentation'>), 'default': {'model': {'pt': ('facebook/detr-resnet-50-panoptic', 'fc15262')}}, 'type': 'multimodal'}
  • image-to-text {'impl': <class 'transformers.pipelines.image_to_text.ImageToTextPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForVision2Seq'>,), 'default': {'model': {'pt': ('ydshieh/vit-gpt2-coco-en', '65636df'), 'tf': ('ydshieh/vit-gpt2-coco-en', '65636df')}}, 'type': 'multimodal'}
  • object-detection {'impl': <class 'transformers.pipelines.object_detection.ObjectDetectionPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForObjectDetection'>,), 'default': {'model': {'pt': ('facebook/detr-resnet-50', '2729413')}}, 'type': 'multimodal'}
  • zero-shot-object-detection {'impl': <class 'transformers.pipelines.zero_shot_object_detection.ZeroShotObjectDetectionPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForZeroShotObjectDetection'>,), 'default': {'model': {'pt': ('google/owlvit-base-patch32', '17740e1')}}, 'type': 'multimodal'}
  • depth-estimation {'impl': <class 'transformers.pipelines.depth_estimation.DepthEstimationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForDepthEstimation'>,), 'default': {'model': {'pt': ('Intel/dpt-large', 'e93beec')}}, 'type': 'image'}
  • video-classification {'impl': <class 'transformers.pipelines.video_classification.VideoClassificationPipeline'>, 'tf': (), 'pt': (<class 'transformers.models.auto.modeling_auto.AutoModelForVideoClassification'>,), 'default': {'model': {'pt': ('MCG-NJU/videomae-base-finetuned-kinetics', '4800870')}}, 'type': 'video'}
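The entries above look like a dump of the task registry in `transformers.pipelines`. Here is a small sketch for condensing such a registry into something readable. `summarize_tasks` is a helper made up for this example; it only assumes each entry has an `impl` class and, optionally, a `default` → `model` → `pt` tuple, as in the dump above.

```python
def summarize_tasks(supported_tasks):
    """Map each task name to (pipeline class name, default PyTorch model or None)."""
    summary = {}
    for task, spec in supported_tasks.items():
        # Entries such as 'translation' key their defaults by language pair,
        # so they have no top-level 'model' entry and end up with None here.
        default_pt = spec.get("default", {}).get("model", {}).get("pt")
        model_name = default_pt[0] if default_pt else None
        summary[task] = (spec["impl"].__name__, model_name)
    return summary


# Usage, assuming a transformers 4.x install where this registry is importable:
# from transformers.pipelines import SUPPORTED_TASKS
# for task, (impl, model) in summarize_tasks(SUPPORTED_TASKS).items():
#     print(task, impl, model)
```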

2. Tokenizer

The Tokenizer simplifies what used to be a tedious text-to-token process.

2.1 Using the Tokenizer

Step1 Load and save

from transformers import AutoTokenizer
# Load from HuggingFace: pass a model name to load the matching tokenizer
tokenizer = AutoTokenizer.from_pretrained("uer/roberta-base-finetuned-dianping-chinese")
"""
BertTokenizerFast(name_or_path='uer/roberta-base-finetuned-dianping-chinese', vocab_size=21128, model_max_length=1000000000000000019884624838656, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}, clean_up_tokenization_spaces=True)
"""
# Save the tokenizer to a local folder
tokenizer.save_pretrained("./roberta_tokenizer")
''' Files written to the folder
('./roberta_tokenizer\\tokenizer_config.json','./roberta_tokenizer\\special_tokens_map.json','./roberta_tokenizer\\vocab.txt','./roberta_tokenizer\\added_tokens.json','./roberta_tokenizer\\tokenizer.json')
'''
# Load the tokenizer from the local folder
tokenizer = AutoTokenizer.from_pretrained("./roberta_tokenizer")
"""
BertTokenizerFast(name_or_path='uer/roberta-base-finetuned-dianping-chinese', vocab_size=21128, model_max_length=1000000000000000019884624838656, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}, clean_up_tokenization_spaces=True)
"""

Step2 Tokenize a sentence

sen = "弱小的我也有大梦想!"
tokens = tokenizer.tokenize(sen)
# ['弱', '小', '的', '我', '也', '有', '大', '梦', '想', '!']

Step3 Inspect the vocabulary

tokenizer.vocab
"""
{'湾': 3968,'訴': 6260,'##轶': 19824,'洞': 3822,' ̄': 8100,'##劾': 14288,'##care': 11014,'asia': 8339,'##嗑': 14679,'##鹘': 20965,'washington': 12262,'##匕': 14321,'##樟': 16619,'癮': 4628,'day3': 11649,'##宵': 15213,'##弧': 15536,'##do': 8828,'詭': 6279,'3500': 9252,'124': 9377,'##価': 13957,'##玄': 17428,'##積': 18005,'##肝': 18555,
...'##维': 18392,'與': 5645,'##mark': 9882,'偽': 984,...}
"""
tokenizer.vocab_size
# 21128
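The `##` prefix in the vocabulary marks a piece that continues a word rather than starting one. As a rough sketch of how WordPiece-style tokenizers pick such pieces, here is the greedy longest-match-first lookup on a tiny made-up vocabulary. `TOY_VOCAB` and the function name are assumptions for this example, not the real vocabulary or the library's code.

```python
# A made-up vocabulary; real vocabularies hold tens of thousands of entries.
TOY_VOCAB = {"dream", "##ing", "##s", "big"}


def wordpiece_split(word, vocab, unk_token="[UNK]"):
    """Split one word into pieces by greedy longest-match-first lookup."""
    pieces = []
    start = 0
    while start < len(word):
        end = len(word)
        match = None
        # Try the longest remaining substring first, then shrink from the right.
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry the prefix
            if candidate in vocab:
                match = candidate
                break
            end -= 1
        if match is None:
            return [unk_token]  # no piece fits: the whole word becomes unknown
        pieces.append(match)
        start = end
    return pieces


print(wordpiece_split("dreaming", TOY_VOCAB))
print(wordpiece_split("xyz", TOY_VOCAB))
```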

Step4 Convert between tokens and ids

# Convert the token sequence to an id sequence
ids = tokenizer.convert_tokens_to_ids(tokens)
ids
# [2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 106]
# Convert the id sequence back to a token sequence
tokens = tokenizer.convert_ids_to_tokens(ids)
tokens
# ['弱', '小', '的', '我', '也', '有', '大', '梦', '想', '!']
# Convert the token sequence to a string
str_sen = tokenizer.convert_tokens_to_string(tokens)
str_sen
# '弱 小 的 我 也 有 大 梦 想!'

Summary: a more convenient way

# Convert a string to an id sequence, also called encoding
ids = tokenizer.encode(sen, add_special_tokens=True)
ids
# [101, 2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 106, 102]
# Convert an id sequence back to a string, also called decoding
str_sen = tokenizer.decode(ids, skip_special_tokens=False)
str_sen
# '[CLS] 弱 小 的 我 也 有 大 梦 想! [SEP]'
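Conceptually, encoding is a vocabulary lookup plus the special tokens. Below is a toy sketch that uses only the token/id pairs visible in the outputs above. `encode` and `decode_tokens` are simplified stand-ins, not the library methods, and string clean-up is left out.

```python
# Token/id pairs copied from the outputs shown above (a tiny slice of the vocabulary).
VOCAB = {
    "[PAD]": 0, "[CLS]": 101, "[SEP]": 102, "!": 106, "也": 738,
    "大": 1920, "小": 2207, "弱": 2483, "想": 2682, "我": 2769,
    "有": 3300, "梦": 3457, "的": 4638,
}
ID_TO_TOKEN = {idx: token for token, idx in VOCAB.items()}
SPECIAL_TOKENS = {"[PAD]", "[CLS]", "[SEP]"}


def encode(tokens, add_special_tokens=True):
    """Look up each token's id and optionally wrap the result in [CLS] ... [SEP]."""
    ids = [VOCAB[token] for token in tokens]
    if add_special_tokens:
        ids = [VOCAB["[CLS]"]] + ids + [VOCAB["[SEP]"]]
    return ids


def decode_tokens(ids, skip_special_tokens=False):
    """Map ids back to tokens, optionally dropping the special tokens."""
    tokens = [ID_TO_TOKEN[idx] for idx in ids]
    if skip_special_tokens:
        tokens = [token for token in tokens if token not in SPECIAL_TOKENS]
    return tokens


ids = encode(list("弱小的我也有大梦想!"))
print(ids)
print(decode_tokens(ids, skip_special_tokens=True))
```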

Step5 Padding and truncation

# Padding
ids = tokenizer.encode(sen, padding="max_length", max_length=15)
ids
# [101, 2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 106, 102, 0, 0, 0]
# Truncation
ids = tokenizer.encode(sen, max_length=5, truncation=True)
ids
# [101, 2483, 2207, 4638, 102]
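Both outputs keep `[CLS]` (101) at the front and `[SEP]` (102) at the end, so truncation cuts the ordinary tokens, not the special ones. A minimal sketch of that bookkeeping follows. `build_input_ids` is a made-up helper that merges both behaviours into one function; in the library, padding and truncation are separate options.

```python
CLS_ID, SEP_ID, PAD_ID = 101, 102, 0  # ids seen in the outputs above


def build_input_ids(token_ids, max_length, pad=True):
    """Wrap ids in [CLS]/[SEP], truncating or right-padding to max_length."""
    if max_length < 2:
        raise ValueError("max_length must leave room for [CLS] and [SEP]")
    body = token_ids[: max_length - 2]  # reserve two slots for the special tokens
    ids = [CLS_ID] + body + [SEP_ID]
    if pad:
        ids += [PAD_ID] * (max_length - len(ids))  # pad on the right
    return ids


token_ids = [2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 106]
print(build_input_ids(token_ids, max_length=15))
print(build_input_ids(token_ids, max_length=5))
```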

Step6 Other model inputs

ids = tokenizer.encode(sen, padding="max_length", max_length=15)
ids
# [101, 2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 106, 102, 0, 0, 0]
attention_mask = [1 if idx != 0 else 0 for idx in ids]
token_type_ids = [0] * len(ids)
ids, attention_mask, token_type_ids
"""
([101, 2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 106, 102, 0, 0, 0],[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0],[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0])
"""
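With a single sentence, `token_type_ids` is all zeros, which hides what it is for. In BERT-style models it marks which segment each token belongs to when two sentences are encoded together. A small sketch of that layout, with the helper name `build_pair_inputs` made up for this example:

```python
CLS_ID, SEP_ID = 101, 102  # ids seen in the outputs above


def build_pair_inputs(ids_a, ids_b):
    """Lay out a sentence pair as [CLS] A [SEP] B [SEP] with segment ids."""
    input_ids = [CLS_ID] + ids_a + [SEP_ID] + ids_b + [SEP_ID]
    # Segment 0 covers [CLS], sentence A, and the first [SEP];
    # segment 1 covers sentence B and the final [SEP].
    token_type_ids = [0] * (len(ids_a) + 2) + [1] * (len(ids_b) + 1)
    attention_mask = [1] * len(input_ids)  # nothing is padding here
    return {
        "input_ids": input_ids,
        "token_type_ids": token_type_ids,
        "attention_mask": attention_mask,
    }


pair = build_pair_inputs([2483, 2207], [3457, 2682])
print(pair)
```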

2.2 Calling the Tokenizer directly

tokenizer.encode_plus() and tokenizer() produce the same result.

inputs = tokenizer.encode_plus(sen, padding="max_length", max_length=15)
inputs
"""
{'input_ids': [101, 2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 106, 102, 0, 0, 0], 
'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]}
"""
inputs = tokenizer(sen, padding="max_length", max_length=15)
inputs
"""
{'input_ids': [101, 2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 106, 102, 0, 0, 0], 
'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 
'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 0, 0, 0]}
"""

2.3 Processing batch data

sens = ["弱小的我也有大梦想","有梦想谁都了不起","追逐梦想的心,比梦想本身,更可贵"]
res = tokenizer(sens)
res
"""
{'input_ids': [[101, 2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 102], [101, 3300, 3457, 2682, 6443, 6963, 749, 679, 6629, 102], [101, 6841, 6852, 3457, 2682, 4638, 2552, 8024, 3683, 3457, 2682, 3315, 6716, 8024, 3291, 1377, 6586, 102]], 
'token_type_ids': [[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]], 
'attention_mask': [[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1], [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]]}
"""
%%time
# Process the sentences one at a time in a loop
for i in range(1000):
    tokenizer(sen)
# CPU times: total: 15.6 ms
# Wall time: 32.5 ms

%%time
# Process the same data as a single batch
sen_list = [sen] * 1000
res = tokenizer(sen_list)
# CPU times: total: 0 ns
# Wall time: 6 ms
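Note that the batch output for `sens` above is ragged: the three rows have 11, 10, and 18 ids, so they cannot be stacked into one tensor as they are. Passing `padding=True` to the tokenizer pads every row to the longest one in the batch. A plain-Python sketch of the idea, with `pad_batch` as a made-up helper:

```python
def pad_batch(batch_ids, pad_id=0):
    """Right-pad every row to the longest row and build matching attention masks."""
    longest = max(len(ids) for ids in batch_ids)
    input_ids = [ids + [pad_id] * (longest - len(ids)) for ids in batch_ids]
    # 1 marks a real token, 0 marks padding the model should ignore.
    attention_mask = [
        [1] * len(ids) + [0] * (longest - len(ids)) for ids in batch_ids
    ]
    return {"input_ids": input_ids, "attention_mask": attention_mask}


# The first two rows of the batch output shown above.
rows = [
    [101, 2483, 2207, 4638, 2769, 738, 3300, 1920, 3457, 2682, 102],
    [101, 3300, 3457, 2682, 6443, 6963, 749, 679, 6629, 102],
]
padded = pad_batch(rows)
print(padded["input_ids"][1])
print(padded["attention_mask"][1])
```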

2.4 Fast / Slow Tokenizer


sen = "弱小的我也有大Dreaming!"
fast_tokenizer = AutoTokenizer.from_pretrained("uer/roberta-base-finetuned-dianping-chinese")
fast_tokenizer
# BertTokenizerFast(name_or_path='uer/roberta-base-finetuned-dianping-chinese', vocab_size=21128, model_max_length=1000000000000000019884624838656, is_fast=True, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}, clean_up_tokenization_spaces=True)
inputs = fast_tokenizer(sen, return_offsets_mapping=True)
inputs
# {'input_ids': [101, 2483, 2207, 4638, 2769, 738, 3300, 1920, 10252, 8221, 106, 102], 'token_type_ids': [0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0], 'attention_mask': [1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1], 'offset_mapping': [(0, 0), (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6), (6, 7), (7, 12), (12, 15), (15, 16), (0, 0)]}
inputs.word_ids()
# [None, 0, 1, 2, 3, 4, 5, 6, 7, 7, 8, None]
slow_tokenizer = AutoTokenizer.from_pretrained("uer/roberta-base-finetuned-dianping-chinese", use_fast=False)
slow_tokenizer
# BertTokenizer(name_or_path='uer/roberta-base-finetuned-dianping-chinese', vocab_size=21128, model_max_length=1000000000000000019884624838656, is_fast=False, padding_side='right', truncation_side='right', special_tokens={'unk_token': '[UNK]', 'sep_token': '[SEP]', 'pad_token': '[PAD]', 'cls_token': '[CLS]', 'mask_token': '[MASK]'}, clean_up_tokenization_spaces=True)
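The extra fields returned by the fast tokenizer are useful for mapping tokens back to the original text. The sketch below uses the `offset_mapping` and `word_ids` values copied from the output above to rebuild the words, showing that the two sub-tokens with word id 7 together cover "Dreaming". `words_from_offsets` is a made-up helper for this example.

```python
sen = "弱小的我也有大Dreaming!"
# Values copied from the fast-tokenizer output shown above.
offset_mapping = [(0, 0), (0, 1), (1, 2), (2, 3), (3, 4), (4, 5), (5, 6),
                  (6, 7), (7, 12), (12, 15), (15, 16), (0, 0)]
word_ids = [None, 0, 1, 2, 3, 4, 5, 6, 7, 7, 8, None]


def words_from_offsets(text, offsets, word_ids):
    """Merge the character spans of tokens that share a word id, then slice the text."""
    spans = {}
    for (start, end), word_id in zip(offsets, word_ids):
        if word_id is None:  # special tokens such as [CLS] and [SEP]
            continue
        first, last = spans.get(word_id, (start, end))
        spans[word_id] = (min(first, start), max(last, end))
    return [text[start:end] for _, (start, end) in sorted(spans.items())]


print(words_from_offsets(sen, offset_mapping, word_ids))
```

This kind of alignment matters for token-level tasks such as named entity recognition, where per-token predictions have to be mapped back onto whole words.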

3. Model

3.1 Loading and saving models

Online download: this may run into HTTP connection timeouts.

from transformers import AutoConfig, AutoModel, AutoTokenizer
model = AutoModel.from_pretrained("hfl/rbt3", force_download=True)

Offline download: download the files yourself (a proxy may be needed to reach the site) into a folder created locally.

!git clone "https://huggingface.co/hfl/rbt3"
!git lfs clone "https://huggingface.co/hfl/rbt3" --include="*.bin"

Loading from a local folder

model = AutoModel.from_pretrained("./rbt3/")

Model loading parameters

model = AutoModel.from_pretrained("./rbt3/")
model.config
"""
BertConfig {"_name_or_path": "rbt3","architectures": ["BertForMaskedLM"],"attention_probs_dropout_prob": 0.1,"classifier_dropout": null,"directionality": "bidi","hidden_act": "gelu","hidden_dropout_prob": 0.1,"hidden_size": 768,"initializer_range": 0.02,"intermediate_size": 3072,"layer_norm_eps": 1e-12,"max_position_embeddings": 512,"model_type": "bert","num_attention_heads": 12,"num_hidden_layers": 3,"output_past": true,"pad_token_id": 0,"pooler_fc_size": 768,"pooler_num_attention_heads": 12,"pooler_num_fc_layers": 3,"pooler_size_per_head": 128,"pooler_type": "first_token_transform",
..."transformers_version": "4.28.1","type_vocab_size": 2,"use_cache": true,"vocab_size": 21128
}
"""
config = AutoConfig.from_pretrained("./rbt3/")
config
"""
BertConfig {"_name_or_path": "rbt3","architectures": ["BertForMaskedLM"],"attention_probs_dropout_prob": 0.1,"classifier_dropout": null,"directionality": "bidi","hidden_act": "gelu","hidden_dropout_prob": 0.1,"hidden_size": 768,"initializer_range": 0.02,"intermediate_size": 3072,"layer_norm_eps": 1e-12,"max_position_embeddings": 512,"model_type": "bert","num_attention_heads": 12,"num_hidden_layers": 3,"output_past": true,"pad_token_id": 0,"pooler_fc_size": 768,"pooler_num_attention_heads": 12,"pooler_num_fc_layers": 3,"pooler_size_per_head": 128,"pooler_type": "first_token_transform",
..."transformers_version": "4.28.1","type_vocab_size": 2,"use_cache": true,"vocab_size": 21128
}
"""

3.2 Calling the model

sen = "弱小的我也有大梦想!"
tokenizer = AutoTokenizer.from_pretrained("rbt3")
inputs = tokenizer(sen, return_tensors="pt")
inputs
"""
{'input_ids': tensor([[ 101, 2483, 2207, 4638, 2769,  738, 3300, 1920, 3457, 2682, 8013,  102]]), 'token_type_ids': tensor([[0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0]]), 'attention_mask': tensor([[1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1, 1]])}
"""

Calling a model without a model head

model = AutoModel.from_pretrained("rbt3", output_attentions=True)
output = model(**inputs)
output
"""
BaseModelOutputWithPoolingAndCrossAttentions(last_hidden_state=tensor([[[ 0.6804,  0.6664,  0.7170,  ..., -0.4102,  0.7839, -0.0262],[-0.7378, -0.2748,  0.5034,  ..., -0.1359, -0.4331, -0.5874],[-0.0212,  0.5642,  0.1032,  ..., -0.3617,  0.4646, -0.4747],...,[ 0.0853,  0.6679, -0.1757,  ..., -0.0942,  0.4664,  0.2925],[ 0.3336,  0.3224, -0.3355,  ..., -0.3262,  0.2532, -0.2507],[ 0.6761,  0.6688,  0.7154,  ..., -0.4083,  0.7824, -0.0224]]],grad_fn=<NativeLayerNormBackward0>), pooler_output=tensor([[-1.2646e-01, -9.8619e-01, -1.0000e+00, -9.8325e-01,  8.0238e-01,-6.6268e-02,  6.6919e-02,  1.4784e-01,  9.9451e-01,  9.9995e-01,-8.3051e-02, -1.0000e+00, -9.8865e-02,  9.9980e-01, -1.0000e+00,9.9993e-01,  9.8291e-01,  9.5363e-01, -9.9948e-01, -1.3219e-01,-9.9733e-01, -7.7934e-01,  1.0720e-01,  9.8040e-01,  9.9953e-01,-9.9939e-01, -9.9997e-01,  1.4967e-01, -8.7627e-01, -9.9996e-01,-9.9821e-01, -9.9999e-01,  1.9396e-01, -1.1277e-01,  9.9359e-01,-9.9153e-01,  4.4752e-02, -9.8731e-01, -9.9942e-01, -9.9982e-01,2.9360e-02,  9.9847e-01, -9.2014e-03,  9.9999e-01,  1.7111e-01,4.5071e-03,  9.9998e-01,  9.9467e-01,  4.9726e-03, -9.0707e-01,6.9056e-02, -1.8141e-01, -9.8831e-01,  9.9668e-01,  4.9800e-01,1.2997e-01,  9.9895e-01, -1.0000e+00, -9.9990e-01,  9.9478e-01,-9.9989e-01,  9.9906e-01,  9.9820e-01,  9.9990e-01, -6.8953e-01,9.9990e-01,  9.9987e-01,  9.4563e-01, -3.7660e-01, -1.0000e+00,1.3151e-01, -9.7371e-01, -9.9997e-01, -1.3228e-02, -2.9801e-01,-9.9985e-01,  9.9662e-01, -2.0004e-01,  9.9997e-01,  3.6876e-01,-9.9997e-01,  1.5462e-01,  1.9265e-01,  8.9871e-02,  9.9996e-01,9.9998e-01,  1.5184e-01, -8.9714e-01, -2.1646e-01, -9.9922e-01,
...1.7911e-02, 4.8672e-01],[4.0732e-01, 3.8137e-02, 9.6832e-03,  ..., 4.4490e-02,2.2997e-02, 4.0793e-01],[1.7047e-01, 3.6989e-02, 2.3646e-02,  ..., 4.6833e-02,2.5233e-01, 1.6721e-01]]]], grad_fn=<SoftmaxBackward0>)), cross_attentions=None)
"""
output.last_hidden_state.size()
# torch.Size([1, 12, 768])
len(inputs["input_ids"][0])
# 12

Calling a model with a model head

from transformers import AutoModelForSequenceClassification, BertForSequenceClassification
clz_model = AutoModelForSequenceClassification.from_pretrained("rbt3", num_labels=10)
clz_model(**inputs)
# SequenceClassifierOutput(loss=None, logits=tensor([[-0.1776,  0.2208, -0.5060, -0.3938, -0.5837,  1.0171, -0.2616,  0.0495, 0.1728,  0.3047]], grad_fn=<AddmmBackward0>), hidden_states=None, attentions=None)
clz_model.config.num_labels
# 10
