LangChain Deployment Component: LangServe

Source: 🦜️🏓 LangServe | 🦜️🔗 LangChain

LangServe 

🚩 We will be releasing a hosted version of LangServe for one-click deployments of LangChain applications. Sign up here to get on the waitlist.

Overview

LangServe helps developers deploy LangChain runnables and chains as a REST API.

This library is integrated with FastAPI and uses pydantic for data validation.

In addition, it provides a client that can be used to call into runnables deployed on a server. A JavaScript client is available in LangChainJS.

Features

  • Input and Output schemas automatically inferred from your LangChain object, and enforced on every API call, with rich error messages
  • API docs page with JSONSchema and Swagger (insert example link)
  • Efficient /invoke, /batch and /stream endpoints with support for many concurrent requests on a single server
  • /stream_log endpoint for streaming all (or some) intermediate steps from your chain/agent
  • Playground page at /playground with streaming output and intermediate steps
  • Built-in (optional) tracing to LangSmith, just add your API key (see instructions)
  • All built with battle-tested open-source Python libraries like FastAPI, Pydantic, uvloop and asyncio
  • Use the client SDK to call a LangServe server as if it was a Runnable running locally (or call the HTTP API directly)
  • LangServe Hub

Limitations

  • Client callbacks are not yet supported for events that originate on the server
  • OpenAPI docs will not be generated when using Pydantic V2. FastAPI does not support mixing pydantic v1 and v2 namespaces. See the section below for more details.

Hosted LangServe

We will be releasing a hosted version of LangServe for one-click deployments of LangChain applications. Sign up here to get on the waitlist.

Security

  • Vulnerability in versions 0.0.13 - 0.0.15: the playground endpoint allowed access to arbitrary files on the server. Resolved in 0.0.16.

Installation

For both client and server:

pip install "langserve[all]"

or pip install "langserve[client]" for client code, and pip install "langserve[server]" for server code.

LangChain CLI 🛠️

Use the LangChain CLI to bootstrap a LangServe project quickly.

To use the langchain CLI, make sure that you have a recent version of langchain-cli installed. You can install it with pip install -U langchain-cli.

langchain app new ../path/to/directory

Examples

Get your LangServe instance started quickly with LangChain Templates.

For more examples, see the templates index or the examples directory.

Server

Here's a server that deploys an OpenAI chat model, an Anthropic chat model, and a chain that uses the Anthropic model to tell a joke about a topic.

#!/usr/bin/env python
from fastapi import FastAPI
from langchain.prompts import ChatPromptTemplate
from langchain.chat_models import ChatAnthropic, ChatOpenAI
from langserve import add_routes

app = FastAPI(
    title="LangChain Server",
    version="1.0",
    description="A simple api server using Langchain's Runnable interfaces",
)

add_routes(
    app,
    ChatOpenAI(),
    path="/openai",
)

add_routes(
    app,
    ChatAnthropic(),
    path="/anthropic",
)

model = ChatAnthropic()
prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")
add_routes(
    app,
    prompt | model,
    path="/joke",
)

if __name__ == "__main__":
    import uvicorn

    uvicorn.run(app, host="localhost", port=8000)

Docs

If you've deployed the server above, you can view the generated OpenAPI docs using:

⚠️ If using pydantic v2, docs will not be generated for invoke/batch/stream/stream_log. See Pydantic section below for more details.

curl localhost:8000/docs

Make sure to add the /docs suffix.

⚠️ Index page / is not defined by design, so curl localhost:8000 or visiting the URL will return a 404. If you want content at /, define an endpoint with @app.get("/").
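
For instance, a minimal sketch (redirecting to /docs is just one common choice, an assumption rather than something the original prescribes):

from fastapi import FastAPI
from fastapi.responses import RedirectResponse

app = FastAPI()


@app.get("/")
async def redirect_root_to_docs() -> RedirectResponse:
    """Serve something at /: here, redirect bare hits to the generated docs."""
    return RedirectResponse("/docs")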

Client

Python SDK

from langchain.schema import SystemMessage, HumanMessage
from langchain.prompts import ChatPromptTemplate
from langchain.schema.runnable import RunnableMap
from langserve import RemoteRunnable

openai = RemoteRunnable("http://localhost:8000/openai/")
anthropic = RemoteRunnable("http://localhost:8000/anthropic/")
joke_chain = RemoteRunnable("http://localhost:8000/joke/")

joke_chain.invoke({"topic": "parrots"})

# or async
await joke_chain.ainvoke({"topic": "parrots"})

prompt = [
    SystemMessage(content='Act like either a cat or a parrot.'),
    HumanMessage(content='Hello!')
]

# Supports astream
async for msg in anthropic.astream(prompt):
    print(msg, end="", flush=True)

prompt = ChatPromptTemplate.from_messages(
    [("system", "Tell me a long story about {topic}")]
)

# Can define custom chains
chain = prompt | RunnableMap({
    "openai": openai,
    "anthropic": anthropic,
})

chain.batch([{"topic": "parrots"}, {"topic": "cats"}])

In TypeScript (requires LangChain.js version 0.0.166 or later):

import { RemoteRunnable } from "langchain/runnables/remote";

const chain = new RemoteRunnable({
  url: `http://localhost:8000/joke/`,
});
const result = await chain.invoke({
  topic: "cats",
});

Python using requests:

import requests

response = requests.post(
    "http://localhost:8000/joke/invoke/",
    json={'input': {'topic': 'cats'}}
)
response.json()

You can also use curl:

curl --location --request POST 'http://localhost:8000/joke/invoke/' \
    --header 'Content-Type: application/json' \
    --data-raw '{"input": {"topic": "cats"}}'

Endpoints

The following code:

...
add_routes(
    app,
    runnable,
    path="/my_runnable",
)

adds these endpoints to the server:

  • POST /my_runnable/invoke - invoke the runnable on a single input
  • POST /my_runnable/batch - invoke the runnable on a batch of inputs
  • POST /my_runnable/stream - invoke on a single input and stream the output
  • POST /my_runnable/stream_log - invoke on a single input and stream the output, including output of intermediate steps as it's generated
  • GET /my_runnable/input_schema - json schema for input to the runnable
  • GET /my_runnable/output_schema - json schema for output of the runnable
  • GET /my_runnable/config_schema - json schema for config of the runnable

These endpoints match the LangChain Expression Language interface -- please reference this documentation for more details.
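
As an illustration, here is how the invoke/batch and schema endpoints above might be exercised from Python; the /my_runnable path and the {"topic": ...} payload shape are assumptions carried over from the earlier /joke example:

import requests

# POST endpoints wrap the payload: "input" for invoke, "inputs" for batch.
requests.post(
    "http://localhost:8000/my_runnable/invoke",
    json={"input": {"topic": "cats"}},
).json()

requests.post(
    "http://localhost:8000/my_runnable/batch",
    json={"inputs": [{"topic": "cats"}, {"topic": "parrots"}]},
).json()

# Schema endpoints are plain GETs that return JSON Schema documents.
requests.get("http://localhost:8000/my_runnable/input_schema").json()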

Playground

You can find a playground page for your runnable at /my_runnable/playground. This exposes a simple UI to configure and invoke your runnable with streaming output and intermediate steps.

Widgets

The playground supports widgets and can be used to test your runnable with different inputs.

In addition, for configurable runnables, the playground will allow you to configure the runnable and share a link with the configuration.


Legacy Chains

LangServe works with both Runnables (constructed via LangChain Expression Language) and legacy chains (inheriting from Chain). However, some of the input schemas for legacy chains may be incomplete/incorrect, leading to errors. This can be fixed by updating the input_schema property of those chains in LangChain. If you encounter any errors, please open an issue on THIS repo, and we will work to address it.

Deployment

Deploy to GCP

You can deploy to GCP Cloud Run using the following command:

gcloud run deploy [your-service-name] --source . --port 8001 --allow-unauthenticated --region us-central1 --set-env-vars=OPENAI_API_KEY=your_key

Pydantic

LangServe provides support for Pydantic 2 with some limitations.

  1. OpenAPI docs will not be generated for invoke/batch/stream/stream_log when using Pydantic V2. FastAPI does not support mixing pydantic v1 and v2 namespaces.
  2. LangChain uses the v1 namespace in Pydantic v2. Please read the following guidelines to ensure compatibility with LangChain.

Except for these limitations, we expect the API endpoints, the playground and any other features to work as expected.
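
As a minimal sketch of guideline 2 (the model name and fields are hypothetical), code that must interoperate with LangChain's v1 namespace can import pydantic defensively:

try:
    # Pydantic 2 installed: use the bundled v1 compatibility namespace
    from pydantic.v1 import BaseModel, Field
except ImportError:
    # Pydantic 1 installed: the top-level package is already v1
    from pydantic import BaseModel, Field


class JokeInput(BaseModel):
    """Hypothetical input model kept on the v1 API that LangChain expects."""

    topic: str = Field(..., description="Subject of the joke")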

Advanced

Handling Authentication

If you need to add authentication to your server, please reference FastAPI's security documentation and middleware documentation.
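
For example, here is a minimal sketch of one approach: a FastAPI dependency checking an API-key header. The header name and the comparison are placeholders, not LangServe machinery:

from fastapi import Depends, FastAPI, HTTPException
from fastapi.security import APIKeyHeader

api_key_header = APIKeyHeader(name="X-Api-Key")


def verify_api_key(api_key: str = Depends(api_key_header)) -> None:
    # Placeholder check; compare against a real credential store instead.
    if api_key != "expected-secret":
        raise HTTPException(status_code=403, detail="Invalid API key")


# App-wide dependencies also protect the routes added via add_routes.
app = FastAPI(dependencies=[Depends(verify_api_key)])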

Files

LLM applications often deal with files. There are several possible architectures for implementing file processing; at a high level:

  1. The file may be uploaded to the server via a dedicated endpoint and processed using a separate endpoint
  2. The file may be uploaded by either value (bytes of file) or reference (e.g., s3 url to file content)
  3. The processing endpoint may be blocking or non-blocking
  4. If significant processing is required, the processing may be offloaded to a dedicated process pool

You should determine which architecture is appropriate for your application.

Currently, to upload files by value to a runnable, use base64 encoding for the file (multipart/form-data is not supported yet).

Here's an example that shows how to use base64 encoding to send a file to a remote runnable.

Remember, you can always upload files by reference (e.g., s3 url) or upload them as multipart/form-data to a dedicated endpoint.
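
As a sketch of the by-value variant (the /pdf path and the field names mirror the FileProcessingRequest type shown later, but are otherwise assumptions):

import base64

from langserve import RemoteRunnable

runnable = RemoteRunnable("http://localhost:8000/pdf/")

# Read the file and ship its bytes as a base64 string inside the JSON input.
with open("example.pdf", "rb") as f:
    encoded = base64.b64encode(f.read()).decode("utf-8")

response = runnable.invoke({"file": encoded, "num_chars": 100})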

Custom Input and Output Types

Input and Output types are defined on all runnables.

You can access them via the input_schema and output_schema properties.

LangServe uses these types for validation and documentation.
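
For instance, a quick way to inspect what would be enforced for the prompt in the /joke chain from the server example (a sketch; the .schema() call is standard pydantic v1 behavior):

from langchain.prompts import ChatPromptTemplate

prompt = ChatPromptTemplate.from_template("tell me a joke about {topic}")

# Each runnable exposes pydantic models describing its input and output.
print(prompt.input_schema.schema())   # e.g. requires a "topic" string
print(prompt.output_schema.schema())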

If you want to override the default inferred types, you can use the with_types method.

Here's a toy example to illustrate the idea:

from typing import Any

from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes

app = FastAPI()


def func(x: Any) -> int:
    """Mistyped function that should accept an int but accepts anything."""
    return x + 1


runnable = RunnableLambda(func).with_types(
    input_type=int,
)

add_routes(app, runnable)

Custom User Types

Inherit from CustomUserType if you want the data to deserialize into a pydantic model rather than the equivalent dict representation.

At the moment, this type only works server side and is used to specify desired decoding behavior. If you inherit from this type, the server will keep the decoded type as a pydantic model instead of converting it into a dict.

from fastapi import FastAPI
from langchain.schema.runnable import RunnableLambda

from langserve import add_routes
from langserve.schema import CustomUserType

app = FastAPI()


class Foo(CustomUserType):
    bar: int


def func(foo: Foo) -> int:
    """Sample function that expects a Foo type which is a pydantic model"""
    assert isinstance(foo, Foo)
    return foo.bar


# Note that the input and output type are automatically inferred!
# You do not need to specify them.
# runnable = RunnableLambda(func).with_types( # <-- Not needed in this case
#     input_type=Foo,
#     output_type=int,
# )

add_routes(app, RunnableLambda(func), path="/foo")

Playground Widgets

The playground allows you to define custom widgets for your runnable from the backend.

  • A widget is specified at the field level and shipped as part of the JSON schema of the input type
  • A widget must contain a key called type whose value is one of a well-known list of widgets
  • Other widget keys will be associated with values that describe paths in a JSON object

General schema:

type JsonPath = number | string | (number | string)[];
type NameSpacedPath = { title: string; path: JsonPath }; // Using title to mimic json schema, but can use namespace
type OneOfPath = { oneOf: JsonPath[] };

type Widget = {
  type: string; // Some well known type (e.g., base64file, chat etc.)
  [key: string]: JsonPath | NameSpacedPath | OneOfPath;
};

File Upload Widget

Allows creation of a file upload input in the UI playground for files that are uploaded as base64 encoded strings. Here's the full example.

Snippet:

try:
    from pydantic.v1 import Field
except ImportError:
    from pydantic import Field

from langserve import CustomUserType


# ATTENTION: Inherit from CustomUserType instead of BaseModel otherwise
#            the server will decode it into a dict instead of a pydantic model.
class FileProcessingRequest(CustomUserType):
    """Request including a base64 encoded file."""

    # The extra field is used to specify a widget for the playground UI.
    file: str = Field(..., extra={"widget": {"type": "base64file"}})
    num_chars: int = 100

