Byte Pair Encoding (BPE): Algorithm and Code Notes

The Byte Pair Encoding (BPE) Algorithm

BPE is the method used to build the vocabulary in Transformer-based models such as GPT-2. It proceeds roughly in the following steps:

  1. Split the text in the corpus into individual characters
  2. Count the frequencies of co-occurring bigrams (pairs of adjacent symbols)
  3. Merge the most frequent bigram into one symbol and add it to the vocabulary
  4. Repeat steps 2 and 3 until the vocabulary reaches a preset size, or there are no more bigrams to merge (a toy sketch of these steps follows)
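
To make the four steps concrete, here is a minimal training sketch (my own toy illustration rather than the GPT-2 code; the corpus string and merge count below are made up):

from collections import Counter

def train_bpe(corpus, num_merges):
    # step 1: split each word of the corpus into characters
    words = [tuple(w) for w in corpus.split()]
    merges = []
    for _ in range(num_merges):
        # step 2: count co-occurring bigrams across all words
        counts = Counter(pair for word in words for pair in zip(word, word[1:]))
        if not counts:
            break # step 4: no bigram left to merge
        # step 3: merge the most frequent bigram and record it in the vocabulary
        first, second = max(counts, key=counts.get)
        merges.append((first, second))
        new_words = []
        for word in words:
            new_word, i = [], 0
            while i < len(word):
                if i < len(word) - 1 and (word[i], word[i+1]) == (first, second):
                    new_word.append(first + second) # merge the pair into one symbol
                    i += 2
                else:
                    new_word.append(word[i])
                    i += 1
            new_words.append(tuple(new_word))
        words = new_words
    return merges

print(train_bpe("low lower lowest low low", num_merges=3))
# [('l', 'o'), ('lo', 'w'), ('low', 'e')] on this toy corpus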

The notes below walk through the BPE-related code of GPT-2, using the minGPT re-implementation of OpenAI's encoder.py as the running example.

The complete code is shown below:

"""
BPE算法:字节对编码算法,将任意UTF-8字符串转换为整数索引序列,方便后续的神经网络运算。bpe is short for Byte Pair Encoder. It translates arbitrary utf-8 strings into
sequences of integers, where each integer represents small chunks of commonly
occuring characters. This implementation is based on openai's gpt2 encoder.py:
https://github.com/openai/gpt-2/blob/master/src/encoder.py
but was mildly modified because the original implementation is a bit confusing.
I also tried to add as many comments as possible, my own understanding of what's
going on.
"""import os
import json
import regex as re
import requestsimport torch# -----------------------------------------------------------------------------def bytes_to_unicode():"""将字节(8bit->2**8->256个)转换为unicode表示的字符。有些字节表示的字符太"丑"了,比如chr(0)为'\x00',OpenAI选择进行额外的转换。Every possible byte (really an integer 0..255) gets mapped by OpenAI to a unicodecharacter that represents it visually. Some bytes have their appearance preservedbecause they don't cause any trouble. These are defined in list bs. For example:chr(33) returns "!", so in the returned dictionary we simply have d[33] -> "!".However, chr(0), for example, is '\x00', which looks ugly. So OpenAI maps thesebytes, into new characters in a range where chr() returns a single nice character.So in the final dictionary we have d[0] -> 'Ā' instead, which is just chr(0 + 2**8).In particular, the space character is 32, which we can see by ord(' '). Instead,this function will shift space (32) by 256 to 288, so d[32] -> 'Ġ'.So this is just a simple one-to-one mapping of bytes 0..255 into unicode charactersthat "look nice", either in their original form, or a funny shifted characterlike 'Ā', or 'Ġ', etc."""# the 188 integers that render fine in their original form and need no shiftingbs = list(range(ord("!"), ord("~")+1))+list(range(ord("¡"), ord("¬")+1))+list(range(ord("®"), ord("ÿ")+1))cs = bs[:] # all integers b in bs will simply map to chr(b) in the output dict# now get the representations of the other 68 integers that do need shifting# each will get mapped chr(256 + n), where n will grow from 0...67 in the loopn = 0for b in range(2**8):if b not in bs:# if this byte is "ugly" then map it to the next available "nice" characterbs.append(b)cs.append(2**8+n)n += 1cs = [chr(n) for n in cs]d = dict(zip(bs, cs))return ddef get_pairs(word):"""获取一个单词中所有可能的字符二元组Return all bigrams as a set of tuples, of consecutive elements in the iterable word."""pairs = set()prev_char = word[0]for char in word[1:]:pairs.add((prev_char, char))prev_char = charreturn pairsclass Encoder:def __init__(self, encoder, bpe_merges):# byte encoder/decoderself.byte_encoder = bytes_to_unicode()self.byte_decoder = {v:k for k, v in self.byte_encoder.items()}# bpe token encoder/decoderself.encoder = encoder  # 将字符串转换为整数索引self.decoder = {v:k for k,v in self.encoder.items()}  # 将整数索引转换为字符串# bpe merge list that defines the bpe "tree", of tuples (a,b) that are to merge to token abself.bpe_ranks = dict(zip(bpe_merges, range(len(bpe_merges))))# the splitting pattern used for pre-tokenization# Should haved added re.IGNORECASE so BPE merges can happen for capitalized versions of contractions <-- original openai comment"""ok so what is this regex looking for, exactly?python re reference: https://docs.python.org/3/library/re.html- the vertical bars | is OR, so re.findall will chunkate text as the pieces match, from left to right- '\'s' would split up things like Andrej's -> (Andrej, 's)- ' ?\p{L}': optional space followed by 1+ unicode code points in the category "letter"- ' ?\p{N}': optional space followed by 1+ unicode code points in the category "number"- ' ?[^\s\p{L}\p{N}]+': optional space, then 1+ things that are NOT a whitespace, letter or number- '\s+(?!\S)': 1+ whitespace characters (e.g. space or tab or etc) UNLESS they are followed by non-whitespaceso this will consume whitespace characters in a sequence but exclude the last whitespace inthat sequence. that last whitespace has the opportunity to then match the optional ' ?' 
inearlier patterns.- '\s+': 1+ whitespace characters, intended probably to catch a full trailing sequence of whitespaces at end of stringSo TLDR:- we are special casing a few common apostrophe constructs ('s, 't, 're, ...) and making those into separate tokens- we then separate out strings into consecutive chunks of 1) letters, 2) numbers, 3) non-letter-numbers, 4) whitespaces"""self.pat = re.compile(r"""'s|'t|'re|'ve|'m|'ll|'d| ?\p{L}+| ?\p{N}+| ?[^\s\p{L}\p{N}]+|\s+(?!\S)|\s+""")  # 预先使用一些正则表达式提前将字符串切分,例如将字符串划分为连续的字母、数字、空格和其他字符。包括一些英文的规则。self.cache = {}def bpe(self, token):"""对每个预先切分出来的token进行进一步的bpe切分,切分主要依赖于预先统计的bpe_ranks;bpe_ranks: 从大规模语料中统计的bi-gram共现频率this function uses self.bpe_ranks to iteratively merge all the possible bpe tokensup the tree. token is a string of one individual 'word' (after regex tokenization)and after byte encoding, e.g. 'Ġthere'."""# token is a string of one individual 'word', after byte encoding, e.g. 'Ġthere'# memoization, for efficiencyif token in self.cache:  # cache缓存加速bpe算法return self.cache[token]word = tuple(token) # individual characters that make up the token, in a tuplepairs = get_pairs(word) # get all bigramsif not pairs:return tokenwhile True:# find the next lowest rank bigram that can be mergedbigram = min(pairs, key = lambda pair: self.bpe_ranks.get(pair, float('inf')))  # 优先合并共现频率高的二元组if bigram not in self.bpe_ranks:  # 如果剩下的二元组共现频率过低break # no more bigrams are eligible to be mergedfirst, second = bigram# we will now replace all occurences of (first, second) in the list of current# words into one merged token first_second, in the output list new_wordsnew_word = []i = 0while i < len(word):  # 合并二元组(考虑多次出现的情况)# find the next occurence of first in the sequence of current wordstry:j = word.index(first, i)new_word.extend(word[i:j])i = jexcept:new_word.extend(word[i:])break# if this occurence is also followed by second, then merge them into oneif word[i] == first and i < len(word)-1 and word[i+1] == second:new_word.append(first+second)i += 2else:new_word.append(word[i])i += 1# all occurences of (first, second) have been merged to first_secondnew_word = tuple(new_word)word = new_wordif len(word) == 1:breakelse:pairs = get_pairs(word)# concat all words into a string, and use ' ' as the separator. 
Note that# by now all characters have been byte encoded, guaranteeing that ' ' is# not used in the actual data and is a 'special' delimiter characterword = ' '.join(word)# cache the result and returnself.cache[token] = wordreturn worddef encode(self, text):""" 字符串序列转整数索引序列string goes in, list of integers comes out"""bpe_idx = []# pre-tokenize the input text into string tokens (words, roughly speaking)tokens = re.findall(self.pat, text)  # 预先使用正则表达式粗糙切分# process each token into BPE integersfor token in tokens:  # 每个token内部使用bpe不断合并二元组# encode the token as a bytes (b'') objecttoken_bytes = token.encode('utf-8')# translate all bytes to their unicode string representation and flattentoken_translated = ''.join(self.byte_encoder[b] for b in token_bytes)# perform all the applicable bpe merges according to self.bpe_rankstoken_merged = self.bpe(token_translated).split(' ')# translate all bpe tokens to integerstoken_ix = [self.encoder[bpe_token] for bpe_token in token_merged]# extend our running list of all output integersbpe_idx.extend(token_ix)return bpe_idxdef encode_and_show_work(self, text):""" debugging function, same as encode but returns all intermediate work """bpe_idx = []parts = []tokens = re.findall(self.pat, text)for token in tokens:token_bytes = token.encode('utf-8')token_translated = ''.join(self.byte_encoder[b] for b in token_bytes)token_merged = self.bpe(token_translated).split(' ')token_ix = [self.encoder[bpe_token] for bpe_token in token_merged]bpe_idx.extend(token_ix)parts.append({'token': token,'token_bytes': token_bytes,'token_translated': token_translated,'token_merged': token_merged,'token_ix': token_ix,})out = {'bpe_idx': bpe_idx, # the actual output sequence'tokens': tokens, # result of pre-tokenization'parts': parts, # intermediates for each token part}return outdef decode(self, bpe_idx):""" 整数索引序列恢复成字符串序列list of integers comes in, string comes out """# inverse map the integers to get the tokenstokens_merged = [self.decoder[token] for token in bpe_idx]# inverse the byte encoder, e.g. recovering 'Ġ' -> ' ', and get the bytestokens_flat = ''.join(tokens_merged)tokens_bytes = bytearray([self.byte_decoder[c] for c in tokens_flat])# recover the full utf-8 stringtext = tokens_bytes.decode('utf-8', errors='replace')return textdef get_file(local_file, remote_file):""" downloads remote_file to local_file if necessary """if not os.path.isfile(local_file):print(f"downloading {remote_file} to {local_file}")response = requests.get(remote_file)open(local_file, "wb").write(response.content)def get_encoder():"""从OpenAI官方的GPT-2分词器cache文件初始化Returns an instance of the GPT BPE Encoder/Decoderand handles caching of "database" files."""home_dir = os.path.expanduser('~')cache_dir = os.path.join(home_dir, '.cache', 'mingpt')os.makedirs(cache_dir, exist_ok=True)# load encoder.json that has the raw mappings from token -> bpe indexencoder_local_file = os.path.join(cache_dir, 'encoder.json')encoder_remote_file = 'https://openaipublic.blob.core.windows.net/gpt-2/models/124M/encoder.json'get_file(encoder_local_file, encoder_remote_file)with open(encoder_local_file, 'r') as f:encoder = json.load(f)assert len(encoder) == 50257 # 256 individual byte tokens, 50,000 merged tokens, and 1 special <|endoftext|> token# load vocab.bpe that contains the bpe merges, i.e. 
the bpe tree structure# in the form tuples (a, b), that indicate that (a, b) is to be merged to one token abvocab_local_file = os.path.join(cache_dir, 'vocab.bpe')vocab_remote_file = 'https://openaipublic.blob.core.windows.net/gpt-2/models/124M/vocab.bpe'get_file(vocab_local_file, vocab_remote_file)with open(vocab_local_file, 'r', encoding="utf-8") as f:bpe_data = f.read()# light postprocessing: strip the version on first line and the last line is a blankbpe_merges = [tuple(merge_str.split()) for merge_str in bpe_data.split('\n')[1:-1]]assert len(bpe_merges) == 50000 # 50,000 merged tokens# construct the Encoder object and returnenc = Encoder(encoder, bpe_merges)return enc# -----------------------------------------------------------------------------class BPETokenizer:""" PyTorch-aware class that wraps the Encoder above """def __init__(self):self.encoder = get_encoder()def __call__(self, text, return_tensors='pt'):# PyTorch only; here because we want to match huggingface/transformers interfaceassert return_tensors == 'pt'# single string input for now, in the future potentially a list of stringsassert isinstance(text, str)# encode and create a "batch dimension" of 1idx = [self.encoder.encode(text)]# wrap into PyTorch tensorout = torch.tensor(idx, dtype=torch.long)return outdef decode(self, idx):# ensure a simple 1D tensor for nowassert idx.ndim == 1# decode indices to texttext = self.encoder.decode(idx.tolist())return text
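
Assuming the two vocabulary files download successfully, the wrapper can be exercised roughly as follows (a usage sketch; the shapes and the pre-tokenization shown in the comments are what I expect, not captured output):

tokenizer = BPETokenizer()
idx = tokenizer("Hello!! I'm Andrej Karpathy.")
print(idx.shape)                # torch.Size([1, N]): batch dimension of 1, N BPE indices
print(tokenizer.decode(idx[0])) # should round-trip back to the original string

# the debugging variant exposes every intermediate step
enc = get_encoder()
out = enc.encode_and_show_work("Hello!! world")
print(out['tokens'])            # pre-tokenization, e.g. ['Hello', '!!', ' world']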

Starting from the bpe method of the Encoder class, we can trace the whole BPE process. The bpe method is reproduced below:

def bpe(self, token):
    if token in self.cache: # the cache speeds up repeated bpe calls
        return self.cache[token]

    word = tuple(token) # individual characters that make up the token, in a tuple
    pairs = get_pairs(word) # get all bigrams

    if not pairs:
        return token

    while True:
        # find the next lowest rank bigram that can be merged
        bigram = min(pairs, key = lambda pair: self.bpe_ranks.get(pair, float('inf'))) # merge the most frequent bigram first
        if bigram not in self.bpe_ranks: # the remaining bigrams are too rare
            break # no more bigrams are eligible to be merged
        first, second = bigram

        # we will now replace all occurrences of (first, second) in the list of current
        # words into one merged token first_second, in the output list new_words
        new_word = []
        i = 0
        while i < len(word): # merge the bigram (handling multiple occurrences)
            # find the next occurrence of first in the sequence of current words
            try:
                j = word.index(first, i)
                new_word.extend(word[i:j])
                i = j
            except:
                new_word.extend(word[i:])
                break
            # if this occurrence is also followed by second, then merge them into one
            if word[i] == first and i < len(word)-1 and word[i+1] == second:
                new_word.append(first+second)
                i += 2
            else:
                new_word.append(word[i])
                i += 1

        # all occurrences of (first, second) have been merged to first_second
        new_word = tuple(new_word)
        word = new_word
        if len(word) == 1:
            break
        else:
            pairs = get_pairs(word)

    # concat all words into a string, and use ' ' as the separator. Note that
    # by now all characters have been byte encoded, guaranteeing that ' ' is
    # not used in the actual data and is a 'special' delimiter character
    word = ' '.join(word)

    # cache the result and return
    self.cache[token] = word
    return word

Below, the bpe method is read through block by block:

"""
在Encoder类中初始化一个缓存空间,在每次对token进行bpe操作时先验证缓存空间中是否包含,若有包含则直接结束。
"""
# cache缓存加速bpe算法
if token in self.cache:  return self.cache[token]
"""
将输入bpe方法的token进行切分,此时输入的token是一个已将文本切分后的单词,使用tuple对单词中所有字符进行拆分形成一个包含token中所有字符的元组。
"""
word = tuple(token) # individual characters that make up the token, in a tuple
"""
使用get_pairs函数通过对已经拆分好的token字符元组获取所有可能的字符二元组
"""
pairs = get_pairs(word) # get all bigrams
"""
输入的word是token中所有字符的有序元组,从元组中的第一个字符开始,每两个相邻的字符组成一个二元组
"""
def get_pairs(word):pairs = set()prev_char = word[0]for char in word[1:]:pairs.add((prev_char, char))prev_char = charreturn pairs
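
For example, for the byte-encoded token 'Ġthere' (pairs is a set, so duplicates collapse and the ordering is arbitrary):

word = tuple('Ġthere')   # ('Ġ', 't', 'h', 'e', 'r', 'e')
print(get_pairs(word))
# {('Ġ', 't'), ('t', 'h'), ('h', 'e'), ('e', 'r'), ('r', 'e')}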
"""
判断输入的token是否产生了二元组,若没有产生二元组则结束
"""
if not pairs:return token
"""
找到生成的二元组中共现频率最高的,其中使用bpe_ranks获得二元组频率排名,通过排名找到排名最小也就是频率最高的二元组
"""
# find the next lowest rank bigram that can be merged
bigram = min(pairs, key = lambda pair: self.bpe_ranks.get(pair, float('inf')))  # 优先合并共现频率高的二元组       
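
A toy illustration of this selection with made-up ranks: a pair missing from bpe_ranks falls back to rank float('inf'), so it can never win the min.

bpe_ranks = {('l', 'o'): 0, ('o', 'w'): 1}   # hypothetical ranks; lower rank = more frequent
pairs = {('l', 'o'), ('o', 'w'), ('w', 'x')}
print(min(pairs, key=lambda pair: bpe_ranks.get(pair, float('inf'))))
# ('l', 'o'); the unknown pair ('w', 'x') gets rank inf and is never selected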
"""
形成二元组对应共现频率的字典,其中bpe_merges是从已经统计好的文件中读取二元组频率数据
"""
self.bpe_ranks = dict(zip(bpe_merges, range(len(bpe_merges))))
"""
读取的文件中每行是一个二元组,行号即为频率,行号越小频率越高
"""
vocab_local_file = os.path.join(cache_dir, 'vocab.bpe')
vocab_remote_file = 'https://openaipublic.blob.core.windows.net/gpt-2/models/124M/vocab.bpe'
get_file(vocab_local_file, vocab_remote_file)
with open(vocab_local_file, 'r', encoding="utf-8") as f:bpe_data = f.read()
# light postprocessing: strip the version on first line and the last line is a blank
bpe_merges = [tuple(merge_str.split()) for merge_str in bpe_data.split('\n')[1:-1]]
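
The file layout and the resulting rank table can be sketched like this (the contents below are illustrative of the format, not verbatim lines from vocab.bpe):

bpe_data = "#version: 0.2\nĠ t\nĠ a\nh e\n"   # illustrative file contents
bpe_merges = [tuple(s.split()) for s in bpe_data.split('\n')[1:-1]]
bpe_ranks = dict(zip(bpe_merges, range(len(bpe_merges))))
# {('Ġ', 't'): 0, ('Ġ', 'a'): 1, ('h', 'e'): 2}: earlier line, lower rank, higher frequency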
"""
bpe_ranks中不存在的频率过低的二元组直接跳过,first代表二元组中的第一个字符,second代表二元组中第二个字符
"""
if bigram not in self.bpe_ranks:  # 如果剩下的二元组共现频率过低break # no more bigrams are eligible to be merged
first, second = bigram
"""
此部分代码是将token中所有的字符和最高频率二元组加入到new_word列表中
"""
# we will now replace all occurences of (first, second) in the list of current
# words into one merged token first_second, in the output list new_words
new_word = []
i = 0
while i < len(word):  # 合并二元组(考虑多次出现的情况)# find the next occurence of first in the sequence of current wordstry:j = word.index(first, i)new_word.extend(word[i:j])i = jexcept:new_word.extend(word[i:])break# if this occurence is also followed by second, then merge them into oneif word[i] == first and i < len(word)-1 and word[i+1] == second:new_word.append(first+second)i += 2else:new_word.append(word[i])i += 1
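
To see what one pass of this loop does, the same logic can be pulled out into a standalone helper (merge_once is a name introduced here purely for illustration; it does not exist in the original code):

def merge_once(word, first, second):
    # identical logic to the inner loop above, extracted as a function
    new_word, i = [], 0
    while i < len(word):
        try:
            j = word.index(first, i)
            new_word.extend(word[i:j])
            i = j
        except ValueError:
            new_word.extend(word[i:])
            break
        if word[i] == first and i < len(word) - 1 and word[i+1] == second:
            new_word.append(first + second)
            i += 2
        else:
            new_word.append(word[i])
            i += 1
    return tuple(new_word)

print(merge_once(('h', 'e', 'l', 'l', 'o', 'l', 'l'), 'l', 'l'))
# ('h', 'e', 'll', 'o', 'll'): both occurrences are merged in a single pass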
"""
如果新生成的字符只有一个则直接退出,如果有多个则获得新的字符对继续执行
"""
# all occurences of (first, second) have been merged to first_second
new_word = tuple(new_word)
word = new_word
if len(word) == 1:break
else:pairs = get_pairs(word)
"""
最后将字符通过空格连接为一个字符串,并存入缓存中
"""
word = ' '.join(word)# cache the result and return
self.cache[token] = word
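
Putting it together, the whole method can be probed directly. The outputs below assume the downloaded GPT-2 merge table; ' there' is common enough that I expect it to merge all the way into a single token:

enc = get_encoder()
print(enc.bpe('Ġthere'))   # expected: 'Ġthere', fully merged with no space left
print(enc.bpe('Ġ'))        # 'Ġ': a single character yields no pairs and is returned as-is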

These notes used the BPE code in GPT-2 as a running example, focusing mainly on reading through the bpe method of the Encoder class.
