Router有什么用?
在RAG应用中,Router可以帮助我们基于用户的查询意图来决定使用何种数据类型或数据源,比如是否需要进行语义检索、是否需要进行text2sql查询,是否需要用function call来进行API调用。
Router也可以根据用户的查询意图来决定使用何种组件类型,比如是否可以用agent来解决用户的问题,是否需要用到向量数据库来进行上下文信息补充,是否直接用LLM来回答用户的问题。
Route也可以让我们根据问题意图来使用不同的prompt 模板。
总之,我们使用Router来帮助我们决定query如何被处理和响应。(注:上面的图片都来自于参考资料1)
Router如何实现
基于LLM的Router
鉴于LLM强大的语义理解能力,让LLM作为Router来决定一个query的意图是什么。
- LLM comletion router,在prompt中给LLM定义一些选项,让LLM选择与问题最相关的选项。
Llamaindex里的prompt如下:
# single select, LLMSingleSelector类默认使用这个prompt
DEFAULT_SINGLE_SELECT_PROMPT_TMPL = ("Some choices are given below. It is provided in a numbered list ""(1 to {num_choices}), ""where each item in the list corresponds to a summary.\n""---------------------\n""{context_list}""\n---------------------\n""Using only the choices above and not prior knowledge, return ""the choice that is most relevant to the question: '{query_str}'\n"
)# multiple select, LLMMultiSelector类默认使用这个prompt
DEFAULT_MULTI_SELECT_PROMPT_TMPL = ("Some choices are given below. It is provided in a numbered ""list (1 to {num_choices}), ""where each item in the list corresponds to a summary.\n""---------------------\n""{context_list}""\n---------------------\n""Using only the choices above and not prior knowledge, return the top choices ""(no more than {max_outputs}, but only select what is needed) that ""are most relevant to the question: '{query_str}'\n"
)
- LLM Function router: 利用LLM的function call能力,也就是定义一些处理不同意图的function,让LLM将用户query输出function调用参数。LlamaIndex的pandatic Router 就是基于此来实现的。
Semantic Router
Semantic router: 基于语义相似度的Router。其主要思路是将每一类问题意图给出一些示例问法后并用embedding编码,在routing时编码问题后用语义相似度来决定与问题最相关的意图。
semantic-router是一个开源的Semantic router(安装:pip install -qU semantic-router
)。其使用示例如下:
import os
from getpass import getpass
from semantic_router.encoders import CohereEncoder, OpenAIEncoder
from semantic_router import Route
from semantic_router.layer import RouteLayeros.environ["OPENAI_API_KEY"] = os.getenv("OPENAI_API_KEY") or getpass("Enter OpenAI API Key: "
)## 定义名为politics的Route,以及其对应的一些描述
politics = Route(name="politics",utterances=["isn't politics the best thing ever","why don't you tell me about your political opinions","don't you just love the president","don't you just hate the president","they're going to destroy this country!","they will save the country!",],
)
## 定义名为politics的Route,以及其对应的一些描述
chitchat = Route(name="chitchat",utterances=["how's the weather today?","how are things going?","lovely weather today","the weather is horrendous","let's go to the chippy",],
)# 当前route汇总
routes = [politics, chitchat]
# embedding编码器
encoder = OpenAIEncoder()## 定义Router层,在这一步会将routes中对应的描述(utterances)都向量编码并存储下来
rl = RouteLayer(encoder=encoder, routes=routes)## 对查询做判断属于哪一个Route.
## 实现逻辑: 用query与定义好的问句进行相似度计算,返回与query最相似的top 5个预先定义的问法,并对其根据Route进行分组,取分组分数之和最大(组内分数的聚合策略可选:sum/max/mean)的Route作为候选Route,如果这些匹配的问法里存在大于指定阈值的问法,则将该Route作为匹配Route返回,否则返回空。
rl("don't you love politics?")
# RouteChoice(name='politics', function_call=None, similarity_score=None)## 对查询做判断属于哪些Route.
## 实现逻辑: 用query与定义好的问句进行相似度计算,返回与query最相似的top 5个预先定义的问法,并对其根据Route进行分组,返回全部分数大于指定阈值的组,否则返回空。
rl.retrieve_multiple_routes("Hi! How are you doing in politics??")
#[RouteChoice(name='politics', function_call=None, similarity_score=0.8595844842560181),
# RouteChoice(name='chitchat', function_call=None, similarity_score=0.8356704527362284)]#######
######## dynamic route 可以定义 function_schema, 利用LLM的function call 能力进行函数解析 #######
from datetime import datetime
from zoneinfo import ZoneInfodef get_time(timezone: str) -> str:"""Finds the current time in a specific timezone.:param timezone: The timezone to find the current time in, shouldbe a valid timezone from the IANA Time Zone Database like"America/New_York" or "Europe/London". Do NOT put the placename itself like "rome", or "new york", you must providethe IANA format.:type timezone: str:return: The current time in the specified timezone."""now = datetime.now(ZoneInfo(timezone))return now.strftime("%H:%M")
from semantic_router.llms.openai import get_schemas_openaischemas = get_schemas_openai([get_time])time_route = Route(name="get_time",utterances=["what is the time in new york city?","what is the time in london?","I live in Rome, what time is it?",],function_schemas=schemas,
)
# 添加新定义的route
rl.add(time_route)
# 因为定义的get_time包含function_schemas,所以会在通过相似度计算得到匹配的route之后,用LLM的function call功能来返回待调用的函数参数
response = rl("what is the time in new york city?")
response
# RouteChoice(name='get_time', function_call=[{'function_name': 'get_time', 'arguments': {'timezone': 'America/New_York'}}], similarity_score=None)### 除了向量比较相似性外,还可以与关键词比较一起进行混合匹配,将两者的分数用alpha分配后相加
import os
from semantic_router.encoders import CohereEncoder, BM25Encoder, TfidfEncoder
from getpass import getpassdense_encoder = CohereEncoder()
# sparse_encoder = BM25Encoder()
sparse_encoder = TfidfEncoder()
from semantic_router.hybrid_layer import HybridRouteLayerdl = HybridRouteLayer(encoder=dense_encoder, sparse_encoder=sparse_encoder, routes=routes
)
dl("don't you love politics?")## RouteLayer的evaluate函数计算将问题正确分类的准确度
## RouteLayer的fit函数输入少量的标注数据,在max_iter次数内,每次随机为Route选取一个阈值,将准确率最高的阈值作为拟合出的阈值。
基于文本分类的Router
既然Router是在做类似意图分类的工作,我们可以使用文本分类模型来当Router。如果没有足够的语料来训练的话,可以先选Zero-shot text classfication模型来当做Router, haystack(source code)实现了用zero-shot分类模型来作为Router。
参考资料
- blog:Routing in RAG-Driven Applications
- LLamaindex router文档
- Langchain router文档
- Haystack router文档