Implementing SQL-based question answering with LangChain

1. Data preparation

import requests

url = "https://storage.googleapis.com/benchmarks-artifacts/chinook/Chinook.db"
response = requests.get(url)

if response.status_code == 200:
    # Open a local file in binary write mode
    with open("Chinook.db", "wb") as file:
        # Write the content of the response (the file) to the local file
        file.write(response.content)
    print("File downloaded and saved as Chinook.db")
else:
    print(f"Failed to download the file. Status code: {response.status_code}")

Output:

File downloaded and saved as Chinook.db
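A failed or interrupted download can leave a file that is not a valid database. As a quick sanity check before connecting, the sketch below tests whether a file starts with the 16-byte SQLite header. The helper name `is_sqlite_file` is hypothetical and not part of the original walkthrough.

```python
from pathlib import Path

# Every SQLite 3 database file begins with this 16-byte magic string.
SQLITE_HEADER = b"SQLite format 3\x00"


def is_sqlite_file(path):
    """Return True if the file at `path` starts with the SQLite 3 header."""
    p = Path(path)
    if not p.is_file():
        return False
    with p.open("rb") as fh:
        return fh.read(len(SQLITE_HEADER)) == SQLITE_HEADER
```

Calling `is_sqlite_file("Chinook.db")` after the download only confirms that the header is present; it does not verify the rest of the file.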

2. Database connection

from langchain_community.utilities import SQLDatabase

db = SQLDatabase.from_uri("sqlite:///Chinook.db")
# Inspect the database structure
print(db.dialect)
print(db.get_usable_table_names())
db.run("SELECT * FROM Artist LIMIT 10;")

Output:

sqlite
['Album', 'Artist', 'Customer', 'Employee', 'Genre', 'Invoice', 'InvoiceLine', 'MediaType', 'Playlist', 'PlaylistTrack', 'Track']
"[(1, 'AC/DC'), (2, 'Accept'), (3, 'Aerosmith'), (4, 'Alanis Morissette'), (5, 'Alice In Chains'), (6, 'Antônio Carlos Jobim'), (7, 'Apocalyptica'), (8, 'Audioslave'), (9, 'BackBeat'), (10, 'Billy Cobham')]"
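In the output above, `db.run` returned the rows as a single string (note the surrounding double quotes), not as a Python list. If the rows are needed as Python objects, one option is `ast.literal_eval`, sketched below on a shortened copy of that string. This assumes the result contains only literals such as numbers and strings; the variable names are illustrative.

```python
import ast

# A shortened copy of the string shown in the output above.
raw = "[(1, 'AC/DC'), (2, 'Accept'), (3, 'Aerosmith')]"

# literal_eval parses Python literals without executing arbitrary code.
rows = ast.literal_eval(raw)

for artist_id, name in rows:
    print(artist_id, name)
```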

3. Prompt template

from langchain import hub
from langgraph.prebuilt import create_react_agent
from langchain_openai import ChatOpenAI
from langchain_community.agent_toolkits import SQLDatabaseToolkit

# Pull prompt (or define your own)
prompt_template = hub.pull("langchain-ai/sql-agent-system-prompt")
system_message = prompt_template.format(dialect="SQLite", top_k=5)
print(prompt_template)
print(system_message)

Output:

d:\soft\anaconda\envs\langchain\Lib\site-packages\langsmith\client.py:354: LangSmithMissingAPIKeyWarning: API key must be provided when using hosted LangSmith API
  warnings.warn(
input_variables=['dialect', 'top_k'] input_types={} partial_variables={} metadata={'lc_hub_owner': 'langchain-ai', 'lc_hub_repo': 'sql-agent-system-prompt', 'lc_hub_commit_hash': '31156d5fe3945188ee172151b086712d22b8c70f8f1c0505f5457594424ed352'} messages=[SystemMessagePromptTemplate(prompt=PromptTemplate(input_variables=['dialect', 'top_k'], input_types={}, partial_variables={}, template='You are an agent designed to interact with a SQL database.\nGiven an input question, create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.\nUnless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.\nYou can order the results by a relevant column to return the most interesting examples in the database.\nNever query for all the columns from a specific table, only ask for the relevant columns given the question.\nYou have access to tools for interacting with the database.\nOnly use the below tools. Only use the information returned by the below tools to construct your final answer.\nYou MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.\n\nDO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.\n\nTo start you should ALWAYS look at the tables in the database to see what you can query.\nDo NOT skip this step.\nThen you should query the schema of the most relevant tables.'), additional_kwargs={})]
System: You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct SQLite query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most 5 results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the below tools. Only use the information returned by the below tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

To start you should ALWAYS look at the tables in the database to see what you can query.
Do NOT skip this step.
Then you should query the schema of the most relevant tables.
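The comment in the code above mentions defining your own prompt instead of pulling one. A minimal sketch of that option follows, using a plain Python string that copies the template text printed above, so no hub download is involved. `SYSTEM_PROMPT` is a name chosen for this sketch. Unlike `prompt_template.format`, plain `str.format` does not add the `System: ` prefix.

```python
# Local copy of the system prompt text printed above.
# SYSTEM_PROMPT is a hypothetical name used only in this sketch.
SYSTEM_PROMPT = """You are an agent designed to interact with a SQL database.
Given an input question, create a syntactically correct {dialect} query to run, then look at the results of the query and return the answer.
Unless the user specifies a specific number of examples they wish to obtain, always limit your query to at most {top_k} results.
You can order the results by a relevant column to return the most interesting examples in the database.
Never query for all the columns from a specific table, only ask for the relevant columns given the question.
You have access to tools for interacting with the database.
Only use the below tools. Only use the information returned by the below tools to construct your final answer.
You MUST double check your query before executing it. If you get an error while executing a query, rewrite the query and try again.

DO NOT make any DML statements (INSERT, UPDATE, DELETE, DROP etc.) to the database.

To start you should ALWAYS look at the tables in the database to see what you can query.
Do NOT skip this step.
Then you should query the schema of the most relevant tables."""

# Fill in the two placeholders the template expects.
system_message = SYSTEM_PROMPT.format(dialect="SQLite", top_k=5)
print(system_message)
```

The resulting string can be used wherever the walkthrough uses `system_message`, assuming the agent constructor accepts a plain string for its system prompt.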

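4. Database toolkit

# Note: `llm` is the ChatOpenAI instance created in section 6;
# it must be defined before this line runs.
toolkit = SQLDatabaseToolkit(db=db, llm=llm)
tools = toolkit.get_tools()
tools

Output: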

[QuerySQLDataBaseTool(description="Input to this tool is a detailed and correct SQL query, output is a result from the database. If the query is not correct, an error message will be returned. If an error is returned, rewrite the query, check the query, and try again. If you encounter an issue with Unknown column 'xxxx' in 'field list', use sql_db_schema to query the correct table fields.", db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x00000182B25C3770>),
 InfoSQLDatabaseTool(description='Input to this tool is a comma-separated list of tables, output is the schema and sample rows for those tables. Be sure that the tables actually exist by calling sql_db_list_tables first! Example Input: table1, table2, table3', db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x00000182B25C3770>),
 ListSQLDatabaseTool(db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x00000182B25C3770>),
 QuerySQLCheckerTool(description='Use this tool to double check if your query is correct before executing it. Always use this tool before executing a query with sql_db_query!', db=<langchain_community.utilities.sql_database.SQLDatabase object at 0x00000182B25C3770>, llm=ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x00000182B778E750>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x00000182B77B0140>, root_client=<openai.OpenAI object at 0x00000182B778C620>, root_async_client=<openai.AsyncOpenAI object at 0x00000182B778E7B0>, model_name='GLM-4-Plus', temperature=0.0, model_kwargs={}, openai_api_key=SecretStr('**********'), openai_api_base='https://open.bigmodel.cn/api/paas/v4/'), llm_chain=LLMChain(verbose=False, prompt=PromptTemplate(input_variables=['dialect', 'query'], input_types={}, partial_variables={}, template='\n{query}\nDouble check the {dialect} query above for common mistakes, including:\n- Using NOT IN with NULL values\n- Using UNION when UNION ALL should have been used\n- Using BETWEEN for exclusive ranges\n- Data type mismatch in predicates\n- Properly quoting identifiers\n- Using the correct number of arguments for functions\n- Casting to the correct data type\n- Using the proper columns for joins\n\nIf there are any of the above mistakes, rewrite the query. If there are no mistakes, just reproduce the original query.\n\nOutput the final SQL query only.\n\nSQL Query: '), llm=ChatOpenAI(client=<openai.resources.chat.completions.Completions object at 0x00000182B778E750>, async_client=<openai.resources.chat.completions.AsyncCompletions object at 0x00000182B77B0140>, root_client=<openai.OpenAI object at 0x00000182B778C620>, root_async_client=<openai.AsyncOpenAI object at 0x00000182B778E7B0>, model_name='GLM-4-Plus', temperature=0.0, model_kwargs={}, openai_api_key=SecretStr('**********'), openai_api_base='https://open.bigmodel.cn/api/paas/v4/'), output_parser=StrOutputParser(), llm_kwargs={}))]

6. Flow diagram

llm = ChatOpenAI(
    temperature=0,
    model="GLM-4-Plus",
    openai_api_key="your api key",
    openai_api_base="https://open.bigmodel.cn/api/paas/v4/",
)

# Create agent
# Depending on the installed langgraph version, the system prompt argument
# may be named `prompt` rather than `state_modifier`.
agent_executor = create_react_agent(llm, tools, state_modifier=system_message)

from IPython.display import Image, display

try:
    display(Image(agent_executor.get_graph(xray=True).draw_mermaid_png()))
except Exception:
    # This requires some extra dependencies and is optional
    pass

7. Example

example_query = "Which sales agent made the most in sales in 2009?"

events = agent_executor.stream(
    {"messages": [("user", example_query)]},
    stream_mode="values",
)
for event in events:
    event["messages"][-1].pretty_print()

Output:

================================ Human Message =================================

Which sales agent made the most in sales in 2009?
================================== Ai Message ==================================
Tool Calls:
  sql_db_list_tables (call_-9197044058595346666)
 Call ID: call_-9197044058595346666
  Args:
================================= Tool Message =================================
Name: sql_db_list_tables

Album, Artist, Customer, Employee, Genre, Invoice, InvoiceLine, MediaType, Playlist, PlaylistTrack, Track
================================== Ai Message ==================================
Tool Calls:
  sql_db_schema (call_-9197044092955150896)
 Call ID: call_-9197044092955150896
  Args:
    table_names: Employee,Invoice
================================= Tool Message =================================
Name: sql_db_schema

CREATE TABLE "Employee" ("EmployeeId" INTEGER NOT NULL, "LastName" NVARCHAR(20) NOT NULL, "FirstName" NVARCHAR(20) NOT NULL, "Title" NVARCHAR(30), "ReportsTo" INTEGER, "BirthDate" DATETIME, "HireDate" DATETIME, "Address" NVARCHAR(70), "City" NVARCHAR(40), "State" NVARCHAR(40), "Country" NVARCHAR(40), "PostalCode" NVARCHAR(10), "Phone" NVARCHAR(24), "Fax" NVARCHAR(24), "Email" NVARCHAR(60), PRIMARY KEY ("EmployeeId"), FOREIGN KEY("ReportsTo") REFERENCES "Employee" ("EmployeeId")
)

/*
3 rows from Employee table:
EmployeeId	LastName	FirstName	Title	ReportsTo	BirthDate	HireDate	Address	City	State	Country	PostalCode	Phone	Fax	Email
1	Adams	Andrew	General Manager	None	1962-02-18 00:00:00	2002-08-14 00:00:00	11120 Jasper Ave NW	Edmonton	AB	Canada	T5K 2N1	+1 (780) 428-9482	+1 (780) 428-3457	andrew@chinookcorp.com
2	Edwards	Nancy	Sales Manager	1	1958-12-08 00:00:00	2002-05-01 00:00:00	825 8 Ave SW	Calgary	AB	Canada	T2P 2T3	+1 (403) 262-3443	+1 (403) 262-3322	nancy@chinookcorp.com
3	Peacock	Jane	Sales Support Agent	2	1973-08-29 00:00:00	2002-04-01 00:00:00	1111 6 Ave SW	Calgary	AB	Canada	T2P 5M5	+1 (403) 262-3443	+1 (403) 262-6712	jane@chinookcorp.com
*/

CREATE TABLE "Invoice" ("InvoiceId" INTEGER NOT NULL, "CustomerId" INTEGER NOT NULL, "InvoiceDate" DATETIME NOT NULL, "BillingAddress" NVARCHAR(70), "BillingCity" NVARCHAR(40), "BillingState" NVARCHAR(40), "BillingCountry" NVARCHAR(40), "BillingPostalCode" NVARCHAR(10), "Total" NUMERIC(10, 2) NOT NULL, PRIMARY KEY ("InvoiceId"), FOREIGN KEY("CustomerId") REFERENCES "Customer" ("CustomerId")
)

/*
3 rows from Invoice table:
InvoiceId	CustomerId	InvoiceDate	BillingAddress	BillingCity	BillingState	BillingCountry	BillingPostalCode	Total
1	2	2009-01-01 00:00:00	Theodor-Heuss-Straße 34	Stuttgart	None	Germany	70174	1.98
2	4	2009-01-02 00:00:00	Ullevålsveien 14	Oslo	None	Norway	0171	3.96
3	8	2009-01-03 00:00:00	Grétrystraat 63	Brussels	None	Belgium	1000	5.94
*/
================================== Ai Message ==================================
Tool Calls:
  sql_db_query_checker (call_-9197047906887043597)
 Call ID: call_-9197047906887043597
  Args:
    query: SELECT e.FirstName, e.LastName, SUM(i.Total) as TotalSales FROM Employee e JOIN Invoice i ON e.EmployeeId = i.CustomerId WHERE strftime('%Y', i.InvoiceDate) = '2009' GROUP BY e.EmployeeId ORDER BY TotalSales DESC LIMIT 1
================================= Tool Message =================================
Name: sql_db_query_checker

```sql
SELECT e.FirstName, e.LastName, SUM(i.Total) as TotalSales 
FROM Employee e 
JOIN Invoice i ON e.EmployeeId = i.CustomerId 
WHERE strftime('%Y', i.InvoiceDate) = '2009' 
GROUP BY e.EmployeeId 
ORDER BY TotalSales DESC 
LIMIT 1
```
================================== Ai Message ==================================
Tool Calls:
  sql_db_query (call_-9197041344174917718)
 Call ID: call_-9197041344174917718
  Args:
    query: SELECT e.FirstName, e.LastName, SUM(i.Total) as TotalSales FROM Employee e JOIN Invoice i ON e.EmployeeId = i.CustomerId WHERE strftime('%Y', i.InvoiceDate) = '2009' GROUP BY e.EmployeeId ORDER BY TotalSales DESC LIMIT 1
================================= Tool Message =================================
Name: sql_db_query

[('Nancy', 'Edwards', 24.75)]
================================== Ai Message ==================================

The sales agent who made the most in sales in 2009 is Nancy Edwards, with a total of $24.75 in sales.
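The query the agent generated joins `Employee.EmployeeId` to `Invoice.CustomerId`. That pairs each employee with the customer who happens to share the same ID rather than with the customers that employee supports, so the $24.75 figure is not a real sales total. In the Chinook schema, an invoice belongs to a customer, and the customer's `SupportRepId` column points to the employee. The sketch below shows that join path on a tiny in-memory database with made-up rows; it is not the agent's output, and the names and numbers are illustrative only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Employee (EmployeeId INTEGER PRIMARY KEY, FirstName TEXT, LastName TEXT);
CREATE TABLE Customer (CustomerId INTEGER PRIMARY KEY, SupportRepId INTEGER);
CREATE TABLE Invoice (InvoiceId INTEGER PRIMARY KEY, CustomerId INTEGER,
                      InvoiceDate TEXT, Total NUMERIC);
-- Made-up rows for illustration only
INSERT INTO Employee VALUES (1, 'Alice', 'Example'), (2, 'Bob', 'Sample');
INSERT INTO Customer VALUES (1, 2), (2, 2), (3, 1);
INSERT INTO Invoice VALUES
    (1, 1, '2009-01-01 00:00:00', 10.0),
    (2, 2, '2009-06-01 00:00:00', 5.0),
    (3, 3, '2009-03-01 00:00:00', 12.0),
    (4, 3, '2010-01-01 00:00:00', 99.0);
""")

# Employee -> Customer (via SupportRepId) -> Invoice (via CustomerId)
query = """
SELECT e.FirstName, e.LastName, SUM(i.Total) AS TotalSales
FROM Employee e
JOIN Customer c ON c.SupportRepId = e.EmployeeId
JOIN Invoice i ON i.CustomerId = c.CustomerId
WHERE strftime('%Y', i.InvoiceDate) = '2009'
GROUP BY e.EmployeeId
ORDER BY TotalSales DESC
LIMIT 1
"""
top = conn.execute(query).fetchall()
print(top)
```

The same query text could be run against the real file with `db.run(query)` to compare its result with the agent's answer.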

Reference: https://langchain-ai.github.io/langgraph/tutorials/sql-agent/
If you have any questions, feel free to ask in the comments.
