Problem
Loading a tokenizer with HuggingFace Transformers raises an error. The code:

tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')
> load tokenizer model distilbert
Traceback (most recent call last):
  File "E:\PythonProjects\Sentiment_Analysis_Imdb\main.py", line 171, in <module>
    cls = Classification(args, logger)
  File "E:\PythonProjects\Sentiment_Analysis_Imdb\main.py", line 26, in __init__
    self.tokenizer = AutoTokenizer.from_pretrained('distilbert-base-uncased')
  File "C:\Users\Hansdas\anaconda3\envs\textcls\lib\site-packages\transformers\models\auto\tokenization_auto.py", line 626, in from_pretrained
    return tokenizer_class_fast.from_pretrained(pretrained_model_name_or_path, *inputs, **kwargs)
  File "C:\Users\Hansdas\anaconda3\envs\textcls\lib\site-packages\transformers\tokenization_utils_base.py", line 1748, in from_pretrained
    commit_hash = extract_commit_hash(resolved_vocab_files[file_id], commit_hash)
  File "C:\Users\Hansdas\anaconda3\envs\textcls\lib\site-packages\transformers\utils\hub.py", line 225, in extract_commit_hash
    search = re.search(r"snapshots/([^/]+)/", resolved_file)
  File "C:\Users\Hansdas\anaconda3\envs\textcls\lib\re.py", line 201, in search
    return _compile(pattern, flags).search(string)
TypeError: expected string or bytes-like object
Solution
The network connection had dropped; once the machine was back online, the tokenizer loaded without errors. The TypeError is opaque because the failed download does not surface as a connection error: judging from the traceback, the resolved vocab file path handed to extract_commit_hash is None rather than a string, and re.search cannot search a None value.
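To make the script tolerate a flaky connection, one option is to fall back to the locally cached files when the online load fails. A minimal sketch, assuming the model has been downloaded at least once before; the exception types caught here are inferred from the traceback above, while local_files_only is a standard from_pretrained argument:

from transformers import AutoTokenizer

MODEL_NAME = 'distilbert-base-uncased'

try:
    # Normal path: resolve files from the local cache, downloading from
    # the Hugging Face Hub if needed (requires a working connection).
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
except (TypeError, OSError) as err:
    # A dropped connection can surface as the opaque TypeError above
    # instead of a clean network error; retry against the local cache only.
    print(f'online load failed ({err}), retrying with local cache only')
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME, local_files_only=True)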