A Simple RAG Implementation with LlamaIndex

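1 Introduction

LlamaIndex is an open-source data framework designed for large language models (LLMs). It aims to simplify and optimize how an LLM queries external data sources, and it is well suited to building RAG on top of indexed data.

Reference links

# Official documentation
https://docs.llamaindex.ai/en/stable/
# Module guides
https://docs.llamaindex.ai/en/stable/module_guides/
# GitHub repository
https://github.com/run-llama/llama_index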

Components used

# OpenAILike
https://docs.llamaindex.ai/en/stable/api_reference/llms/openai_like/
# OpenLLM (I have not tested this component; it inherits from OpenAILike)
https://docs.llamaindex.ai/en/stable/api_reference/llms/openllm/
# Custom embedding model
https://docs.llamaindex.ai/en/stable/module_guides/models/embeddings/#custom-embedding-model
# Custom LLM
https://docs.llamaindex.ai/en/stable/module_guides/models/llms/usage_custom/
# ChromaVectorStore vector store
https://docs.llamaindex.ai/en/stable/api_reference/storage/vector_store/chroma/#llama_index.vector_stores.chroma.ChromaVectorStore
# Chroma database documentation
https://docs.trychroma.com/docs/overview/introduction

Packages to install

⚠️ The llama-index version I used: 0.12.26

pip install llama-index
pip install llama-index-llms-openai-like
pip install llama-index-vector-stores-chroma
pip install chromadb

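2 Building RAG with the official example

⚠️ The official example depends on OpenAI models by default, which are largely inaccessible from mainland China, so it cannot be used there as-is. You need to customize it for your own setup.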

from llama_index.core import VectorStoreIndex, SimpleDirectoryReader

# Load the data
documents = SimpleDirectoryReader("E:/data").load_data()
# Convert the documents into an index
index = VectorStoreIndex.from_documents(documents)
# Build the query engine
query_engine = index.as_query_engine()
# Run a query
response = query_engine.query("Some question about the data should go here")
print(response)

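3 Building a custom RAG

3.1 Approach

The diagram below shows the steps for building RAG with LlamaIndex. The embedding model and the LLM component both need to be customized.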

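graph TD
A["(1) Build the list of Document objects by reading the data"] --> B
B["(2) Build the list of Node objects by splitting each Document with a splitter such as SentenceSplitter or TextSplitter"] --> C
C["(3) Embed and store: use the custom embedding model and save to a store such as SimpleVectorStore or ChromaVectorStore"] --> D
D["(4) Build the vector index with VectorStoreIndex"] --> E
E["(5) Build the retriever, which finds the nodes relevant to the user's prompt"] --> F
F["(6) Build the response synthesizer, which uses the custom LLM to answer the user's prompt"] --> G
G["(7) Build the query engine by combining the retriever and the response synthesizer"] --> H
H["(8) Query with a prompt and generate the answer"]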
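
To make the eight steps concrete, here is a toy sketch of the same flow that uses only the Python standard library. It is not LlamaIndex code: the bag-of-words `embed`, the plain list used as an index, and `fake_llm` are stand-ins I made up for illustration, and every function name in it is hypothetical.

```python
import math
import re
from collections import Counter


def split_into_nodes(documents, max_sentences=1):
    """Steps (1)-(2): split each document into chunks of whole sentences."""
    nodes = []
    for doc in documents:
        sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", doc) if s.strip()]
        for i in range(0, len(sentences), max_sentences):
            nodes.append(" ".join(sentences[i:i + max_sentences]))
    return nodes


def embed(text):
    """Step (3): a toy 'embedding' that is just a bag of lowercase words."""
    return Counter(re.findall(r"\w+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two bag-of-words vectors."""
    dot = sum(a[w] * b[w] for w in a)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def retrieve(index, query, top_k=1):
    """Step (5): rank the stored nodes by similarity to the query."""
    q = embed(query)
    ranked = sorted(index, key=lambda item: cosine(q, item[1]), reverse=True)
    return [text for text, _ in ranked[:top_k]]


def fake_llm(prompt):
    """Step (6): placeholder for a real model call."""
    return f"[answer based on {len(prompt)} prompt characters]"


def query_engine(index, question, top_k=1):
    """Steps (7)-(8): retrieval followed by generation."""
    context = "\n".join(retrieve(index, question, top_k))
    prompt = f"Context:\n{context}\n\nQuestion: {question}\nAnswer:"
    return fake_llm(prompt), context


documents = [
    "A man is eating food. A man is eating a piece of bread.",
    "A monkey is playing drums. A cheetah is running behind its prey.",
]
nodes = split_into_nodes(documents, max_sentences=1)
index = [(node, embed(node)) for node in nodes]  # step (4): the "index"
answer, context = query_engine(index, "Who is playing drums?")
print(context)
print(answer)
```

The real pipeline in section 3.2 has the same shape; only the splitter, the embedding model, the store, and the LLM are swapped for real components.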


3.2 Custom RAG

(1) my_document_custorm_engine.py

import chromadb
from llama_index.core import Document, VectorStoreIndex, get_response_synthesizer
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.response_synthesizers import ResponseMode
from llama_index.vector_stores.chroma import ChromaVectorStore

from my_custom_rag.custom_like_openai import CustomLikeOpenAI

# Required packages
"""
pip install llama-index
pip install llama-index-llms-openai-like
pip install llama-index-vector-stores-chroma
"""
from my_custom_rag.custom_embedding import CustomEmbeddings

# Custom embedding model
my_embedding = CustomEmbeddings()

# Build the list of document texts; adapt it to your own needs
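document_text = [
    "Henan University, abbreviated 'HENU', is located in Zhengzhou and Kaifeng, Henan Province, China. It is a public university jointly supported by the People's Government of Henan Province and the Ministry of Education of the People's Republic of China [392] and a national 'Double First-Class' university [69]. It has been selected for the national '111 Plan' [2], the Basic Capacity Building Project for Universities in Central and Western China [313], and the list of institutions that admit international students under the Chinese Government Scholarship [388].",
    "Henan University was founded in 1912 as the Henan Preparatory School for Further Study in Europe and America. It later went through stages as Zhongzhou University, National Kaifeng Zhongshan University, and Provincial Henan University, and was upgraded to National Henan University in 1942 [153]. In the 1952 reorganization of colleges and departments, the main campus was renamed Henan Normal College. After further stages as Kaifeng Normal College and Henan Normal University, the name Henan University was restored in 1984 [153]. In June 2000, the former Henan University, Kaifeng Medical Junior College, and Kaifeng Normal Junior College merged to form the new Henan University [154]. In 2012, Henan University was selected as a pilot university in the first batch of the Excellent Doctor Education and Training Program [130]; as a national base for the education and training of outstanding legal talent [390]; and as a pilot university in the first batch of the national Excellent Doctor Education and Training Program [391].",
    "As of June 2024, the university has 40 schools, 93 undergraduate programs open for admission, 47 first-level disciplines authorized to confer master's degrees, 39 categories of professional master's degrees, 2 categories of professional doctoral degrees, 24 first-level disciplines authorized to confer doctoral degrees, 20 postdoctoral research stations, and 13 disciplines ranked in the top 1% of the ESI world rankings. It has more than 50,000 full-time students and more than 4,700 faculty and staff. Its teachers include 6 academicians and academic division members; 26 national-level leading talents such as Changjiang Scholars, National Distinguished Young Scholars, and 'Ten Thousand Talents Program' leading talents; and 15 national-level young talents. The university has 3 national key laboratories, 1 national field scientific observation and research station, 3 national-local joint engineering research centers, 4 Henan provincial laboratories, and 5 key laboratories of the Ministry of Education and the Ministry of Agriculture [448].",
    "The Henan University Software College is one of the earlier software colleges established in China and one of the earliest in Henan Province; it is located in Kaifeng, Henan Province, China. The college is a member of the National Demonstrative Software College Alliance and one of the first members of the Information Technology New Engineering Industry-University-Research Alliance. In 2020 it was approved as a Henan Province characteristic demonstrative software college, and it is a construction university of the Henan Kunpeng Industry College. The college has a Department of Software Engineering, a Department of Network Engineering, and a Public Computer Teaching Center. It hosts provincial research and teaching platforms such as the 'Henan Engineering Research Center of Intelligent Data Processing', the 'Henan Experimental Teaching Demonstration Center of Modern Network Technology', and the 'Henan International Joint Laboratory of Intelligent Network Theory and Key Technologies', and it holds the titles 'Henan Higher Education Discipline (Software Engineering) Talent Introduction Base', 'Henan Off-Campus Practice Education Base for Undergraduate Students', and 'Henan Outstanding Grassroots Teaching Organization in Higher Education'. The college independently runs two undergraduate programs, Software Engineering and Network Engineering; Software Engineering is a national first-class undergraduate program and Network Engineering is a Henan Province first-class undergraduate program. It also has a second-level doctoral degree authorization point in 'Software Engineering Technology', the right to confer master's degrees in Electronic Information, and a postdoctoral research station in Software Engineering.",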
]

# Build the list of LlamaIndex Document objects
documents = list()
for text in document_text:
    documents.append(Document(text=text))
# print(documents)

# Split the LlamaIndex Documents into Nodes
parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(documents)

# Embed each node with the embedding model
for node in nodes:
    node.embedding = my_embedding.get_text_embedding(node.get_content())

# Create the Chroma client
client = chromadb.Client()

# Create the collection
collection = client.create_collection("my-documents")
chroma_vector_store = ChromaVectorStore(chroma_collection=collection)

# Store the nodes in the vector store
chroma_vector_store.add(nodes)

# Build the index object
# Note: VectorStoreIndex.from_vector_store needs a vector store that also stores
# the node text (Chroma does); otherwise it raises the error below
# Cannot initialize from a vector store that does not store text.
vector_index = VectorStoreIndex.from_vector_store(chroma_vector_store, embed_model=my_embedding)

# LlamaIndex's OpenAILike did not work for me with Qwen or Kimi (it kept raising
# parameter errors), so I fell back to a custom LLM
llm = CustomLikeOpenAI(
    model="qwen2.5-14b-instruct",
    api_base="https://dashscope.aliyuncs.com/compatible-mode/v1",
    api_key="sk-XXX"
)

# Below is the LlamaIndex OpenAILike version
# llm = OpenAILike(
#     model="qwen2.5-14b-instruct",
#     base_url="https://dashscope.aliyuncs.com/compatible-mode/v1",
#     api_key = "sk-XXX"
# )

"""
# Explicitly build the query engine = retriever + response synthesizer

# Create the retriever; similarity_top_k sets how many results are returned
query_retriever = vector_index.as_retriever(similarity_top_k=1)

# Response modes include compact (the default), refine, simple_summarize, and others
response_synthesizer = get_response_synthesizer(
    llm=llm,
    response_mode=ResponseMode.COMPACT,
    # streaming=True
)
query_engine = RetrieverQueryEngine(
    retriever=query_retriever,
    response_synthesizer=response_synthesizer
)
"""
# Implicitly build the query engine; the two steps above collapse into a single call
query_engine = vector_index.as_query_engine(
    llm=llm,
    # streaming=True
)

# Without streaming, just print the result
query_data = query_engine.query("Introduce Henan University in about 30 words")
print(query_data)

# # With streaming output
# for text in query_data.response_gen:
#     print(text)
#     pass
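
The script mentions the compact and refine response modes without showing how they differ. As I understand the LlamaIndex documentation, refine calls the LLM once per retrieved chunk and passes the previous answer along, while compact first concatenates chunks so that fewer calls are needed. The sketch below illustrates only that idea in plain Python. It is an assumption-laden toy: the real implementation budgets by tokens and prompt templates rather than characters, and `pack_chunks`, `refine_answer`, and `counting_llm` are hypothetical names of my own.

```python
def pack_chunks(chunks, max_chars):
    """Greedily merge consecutive chunks so each packed block stays within max_chars."""
    packed, current = [], ""
    for chunk in chunks:
        candidate = f"{current}\n{chunk}" if current else chunk
        if len(candidate) <= max_chars or not current:
            # Either it fits, or it is a single oversized chunk that must stand alone
            current = candidate
        else:
            packed.append(current)
            current = chunk
    if current:
        packed.append(current)
    return packed


def refine_answer(blocks, question, llm):
    """Call the model once per block, handing it the previous answer to refine."""
    answer = ""
    for block in blocks:
        answer = llm(question=question, context=block, previous=answer)
    return answer


calls = []


def counting_llm(question, context, previous):
    # Hypothetical stand-in for a real model: records each call it receives
    calls.append(context)
    return f"answer after {len(calls)} call(s)"


chunks = ["aaaa", "bbbb", "cccc", "dddd"]
blocks = pack_chunks(chunks, max_chars=9)
final_answer = refine_answer(blocks, "question?", counting_llm)
print(blocks)
print(final_answer)
```

With four chunks and a budget of nine characters the model is called twice instead of four times, which is the saving that packing is meant to give.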

(2) custom_embedding.py

Custom embedding model

from llama_index.core.base.embeddings.base import Embedding
from llama_index.core.embeddings import BaseEmbedding
from sentence_transformers import SentenceTransformer

# Build the embedding model; depending on your needs this could call a hosted model or a local one
embedder = SentenceTransformer("E:/model/sentencetransformers/distiluse-base-multilingual-cased-v1")


class CustomEmbeddings(BaseEmbedding):
    def _get_query_embedding(self, query: str) -> Embedding:
        # Return the query embedding as a list of floats
        return embedder.encode(query).tolist()

    async def _aget_query_embedding(self, query: str) -> Embedding:
        # Async variant; it simply runs the same synchronous encode call
        return embedder.encode(query).tolist()

    def _get_text_embedding(self, text: str) -> Embedding:
        # Return the text embedding as a list of floats
        return embedder.encode(text).tolist()
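
If you want to exercise the rest of the pipeline without downloading a model, a deterministic stand-in with the same three hooks can help. The class below is a plain-Python sketch, not a `BaseEmbedding` subclass; `HashEmbeddings` and its `dim` parameter are hypothetical names I chose, and its hashed word counts carry no real semantics. It only shows the contract the hooks share: each returns a plain list of floats, and queries and texts land in the same vector space.

```python
import asyncio
import hashlib
import math


class HashEmbeddings:
    """Offline stand-in exposing the same three hooks as CustomEmbeddings."""

    def __init__(self, dim=16):
        self.dim = dim

    def _embed(self, text):
        # Bucket each word by a hash byte, then L2-normalize the counts
        vec = [0.0] * self.dim
        for word in text.lower().split():
            digest = hashlib.sha256(word.encode("utf-8")).digest()
            vec[digest[0] % self.dim] += 1.0
        norm = math.sqrt(sum(v * v for v in vec))
        return [v / norm for v in vec] if norm else vec

    def _get_query_embedding(self, query):
        return self._embed(query)

    async def _aget_query_embedding(self, query):
        return self._embed(query)

    def _get_text_embedding(self, text):
        return self._embed(text)


emb = HashEmbeddings(dim=16)
text_vec = emb._get_text_embedding("A man is eating food.")
query_vec = emb._get_query_embedding("A man is eating food.")
async_vec = asyncio.run(emb._aget_query_embedding("A man is eating food."))
print(len(text_vec), text_vec == query_vec == async_vec)
```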

(3) custom_like_openai.py

Custom LLM

from typing import Any

from llama_index.core.base.llms.types import CompletionResponseGen, LLMMetadata, CompletionResponse
from llama_index.core.llms import CustomLLM
from openai import OpenAI
from pydantic import Field


class CustomLikeOpenAI(CustomLLM):
    model: str = Field(description="Custom model name")
    api_key: str = Field(description="Custom API key")
    api_base: str = Field(description="Custom API base URL")
    context_window: int = Field(default=32768, description="Context window size")
    temperature: float = Field(ge=0, le=1, default=0.3, description="Sampling temperature; must be in [0, 1]")
    num_output: int = Field(default=8192, description="Value passed as max_tokens")

    def __init__(self, **data):
        # The parent initializer must be called
        super().__init__(**data)
        # Create the OpenAI-compatible client
        self._client = OpenAI(api_key=self.api_key, base_url=self.api_base)

    @property
    def metadata(self) -> LLMMetadata:
        """Get LLM metadata."""
        return LLMMetadata(
            context_window=self.context_window,
            num_output=self.num_output,
            model_name=self.model
        )

    def complete(self, prompt: str, **kwargs: Any) -> CompletionResponse:
        """
        Generate text
        :param prompt: the prompt
        :param kwargs: other related parameters
        :return: CompletionResponse
        """
        # Request a completion
        completion = self._client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            temperature=self.temperature,
            max_tokens=self.num_output
        )
        # Return the generated text
        return CompletionResponse(text=completion.choices[0].message.content)

    def stream_complete(self, prompt: str, **kwargs: Any) -> CompletionResponseGen:
        """
        Generate streamed text
        :param prompt: the prompt
        :param kwargs: other parameters
        :return: CompletionResponseGen iterator
        """
        # Implementing this is optional; if you do not need it, use the line below instead
        # raise NotImplementedError("Streaming not supported")
        # Open the stream
        stream = self._client.chat.completions.create(
            model=self.model,
            messages=[{"role": "user", "content": prompt}],
            temperature=self.temperature,
            max_tokens=self.num_output,
            stream=True
        )
        # Iterate over the stream; text holds everything received so far,
        # delta holds only the newly received piece
        text = ""
        for chunk in stream:
            # Get the new text
            delta = chunk.choices[0].delta
            # Skip chunks that carry no content
            if delta.content:
                text += delta.content
                yield CompletionResponse(text=text, delta=delta.content)
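
A note on the two fields yielded while streaming: as far as I can tell from the LlamaIndex custom-LLM example, `text` is meant to carry the response accumulated so far and `delta` only the new piece, and a consumer such as `response_gen` reads the deltas. The toy below reproduces that contract with no network or LlamaIndex dependency. `Chunk` and `fake_stream` are hypothetical stand-ins I wrote for illustration, not library types.

```python
from dataclasses import dataclass
from typing import Iterator, Optional


@dataclass
class Chunk:
    """Hypothetical stand-in for one streamed response item."""
    text: str             # everything generated so far
    delta: Optional[str]  # only the newly generated piece


def fake_stream(pieces) -> Iterator[Chunk]:
    """Yield one Chunk per non-empty piece, mirroring the check on delta.content."""
    so_far = ""
    for piece in pieces:
        if not piece:
            continue
        so_far += piece
        yield Chunk(text=so_far, delta=piece)


received = []
for chunk in fake_stream(["Henan ", "", "University ", "is in Kaifeng."]):
    received.append(chunk.delta)
    print(chunk.delta, end="", flush=True)
print()

final_text = "".join(received)
```

Joining the deltas gives the same string as the `text` of the last chunk, so a consumer can rely on either field.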

3.3 Other notes

(1) Storing vectors with SimpleVectorStore

from llama_index.core import Document
from llama_index.core.node_parser import SentenceSplitter
from llama_index.core.vector_stores import SimpleVectorStore, VectorStoreQuery
from sentence_transformers import SentenceTransformer

# Build the embedding model
embedder = SentenceTransformer("E:/model/sentencetransformers/distiluse-base-multilingual-cased-v1")

# Build the list of document texts; adapt it to your own needs
document_text = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
    "A man is riding a horse.",
    "A woman is playing violin.",
    "Two men pushed carts through the woods.",
    "A man is riding a white horse on an enclosed ground.",
    "A monkey is playing drums.",
    "A cheetah is running behind its prey.",
]

# Build the list of LlamaIndex Document objects
documents = list()
for text in document_text:
    documents.append(Document(text=text))
# print(documents)

# Split the LlamaIndex Documents into Nodes
parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(documents)

# Embed each node; .tolist() turns the NumPy array into a plain list of floats,
# which is what node.embedding expects and what can be persisted as JSON
for node in nodes:
    node.embedding = embedder.encode(node.get_content()).tolist()
# print(nodes)

# The store below is usually not used in production, where a vector database is the norm
# Keep the vectors in memory
simple_vector_store = SimpleVectorStore()
simple_vector_store.add(nodes)

# Persist the nodes; the default location is "./storage"
# simple_vector_store.persist()
# Load the persisted data
# simple_vector_store = SimpleVectorStore.from_persist_path("./storage/vector_store.json")

# Embed the query text
query_embed = embedder.encode("A man is eating pasta").tolist()
# Retrieve the closest matches
target_data_embed = simple_vector_store.query(VectorStoreQuery(query_embedding=query_embed, similarity_top_k=2))
print(target_data_embed)
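
To see what an in-memory store does at query time, and why embeddings are easier to persist as plain lists, here is a standard-library toy with the same four operations used above: add, persist, load, and query. `ToyVectorStore` is a hypothetical class of my own, and its JSON layout is an assumption for illustration, not the on-disk format that SimpleVectorStore writes.

```python
import json
import math
import os
import tempfile


class ToyVectorStore:
    """Toy in-memory store mapping a node id to its embedding (a list of floats)."""

    def __init__(self, embeddings=None):
        self.embeddings = dict(embeddings or {})

    def add(self, node_id, embedding):
        self.embeddings[node_id] = list(embedding)

    def query(self, query_embedding, top_k=2):
        """Return (ids, similarities) for the top_k most similar stored vectors."""
        def cosine(a, b):
            dot = sum(x * y for x, y in zip(a, b))
            norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
            return dot / norm if norm else 0.0

        scored = sorted(
            ((cosine(query_embedding, emb), node_id) for node_id, emb in self.embeddings.items()),
            reverse=True,
        )[:top_k]
        return [node_id for _, node_id in scored], [score for score, _ in scored]

    def persist(self, path):
        # Plain lists of floats serialize to JSON directly
        with open(path, "w", encoding="utf-8") as f:
            json.dump(self.embeddings, f)

    @classmethod
    def from_persist_path(cls, path):
        with open(path, encoding="utf-8") as f:
            return cls(json.load(f))


store = ToyVectorStore()
store.add("eating", [1.0, 0.0])
store.add("riding", [0.0, 1.0])
store.add("both", [1.0, 1.0])

path = os.path.join(tempfile.mkdtemp(), "vector_store.json")
store.persist(path)
restored = ToyVectorStore.from_persist_path(path)
ids, scores = restored.query([0.9, 0.1], top_k=2)
print(ids, scores)
```

The query vector points mostly along the first axis, so "eating" ranks first and "both" second, while "riding" is cut off by `top_k=2`.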

(2) Using the VectorStoreIndex index

import chromadb
from llama_index.core import Document, VectorStoreIndex
from llama_index.core.node_parser import SentenceSplitter
from llama_index.vector_stores.chroma import ChromaVectorStore

from my_custom_rag.custom_embedding import CustomEmbeddings

# Custom embedding model
my_embedding = CustomEmbeddings()

# Build the list of document texts; adapt it to your own needs
document_text = [
    "A man is eating food.",
    "A man is eating a piece of bread.",
    "The girl is carrying a baby.",
    "A man is riding a horse.",
    "A woman is playing violin.",
    "Two men pushed carts through the woods.",
    "A man is riding a white horse on an enclosed ground.",
    "A monkey is playing drums.",
    "A cheetah is running behind its prey.",
]

# Build the list of LlamaIndex Document objects
documents = list()
for text in document_text:
    documents.append(Document(text=text))
# print(documents)

# Split the LlamaIndex Documents into Nodes
parser = SentenceSplitter()
nodes = parser.get_nodes_from_documents(documents)

# Embed each node with the embedding model
for node in nodes:
    node.embedding = my_embedding.get_text_embedding(node.get_content())

# Create the Chroma client
client = chromadb.Client()

# Create the collection
collection = client.create_collection("my-documents")
chroma_vector_store = ChromaVectorStore(chroma_collection=collection)

# Store the nodes in the vector store
chroma_vector_store.add(nodes)

# Build the index object
# Note: VectorStoreIndex.from_vector_store needs a vector store that also stores
# the node text (Chroma does); otherwise it raises the error below
# Cannot initialize from a vector store that does not store text.
vector_index = VectorStoreIndex.from_vector_store(chroma_vector_store, embed_model=my_embedding)

# Create the retriever; similarity_top_k sets how many results are returned
query_retriever = vector_index.as_retriever(similarity_top_k=1)

# Run the retrieval
query_text = "A man is eating pasta"
retrieve_data = query_retriever.retrieve(query_text)
print(retrieve_data)
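
One thing the snippets above hide is the splitting step. They call SentenceSplitter() with its defaults (a chunk size of 1024 tokens with an overlap of 200, as far as I know), so each one-sentence document most likely ends up as a single node. For longer documents the chunk size and overlap start to matter. The toy below shows the general idea with words instead of tokens; `chunk_sentences` and its parameters are hypothetical, and this is not how SentenceSplitter is actually implemented.

```python
import re


def split_sentences(text):
    """Split on whitespace that follows sentence-ending punctuation."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def chunk_sentences(text, chunk_size=12, overlap_sentences=1):
    """Pack whole sentences into chunks of at most chunk_size words, repeating
    the last overlap_sentences sentences at the start of the next chunk."""
    def count(sentences):
        return sum(len(s.split()) for s in sentences)

    chunks, current = [], []
    for sentence in split_sentences(text):
        n = len(sentence.split())
        if current and count(current) + n > chunk_size:
            chunks.append(" ".join(current))
            current = current[-overlap_sentences:] if overlap_sentences > 0 else []
            if count(current) + n > chunk_size:
                current = []  # the overlap would not fit, so start fresh
        current.append(sentence)
    if current:
        chunks.append(" ".join(current))
    return chunks


text = "A man is eating food. A man is riding a horse. A monkey is playing drums."
chunks = chunk_sentences(text, chunk_size=12, overlap_sentences=1)
for chunk in chunks:
    print(chunk)
```

The middle sentence appears in both chunks, so a query about the horse can match either one. That repetition is the point of overlap: context that straddles a chunk boundary is not lost.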