Tianchi Competition: Detailed Deployment and Debugging of the Official Higress Plugin Demo
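Motivation
The competition task is to use the Higress AI gateway to cut the cost of AI calls: cache answers and recall similar questions by vector search, so that fewer requests reach the LLM API. In other words, develop a gateway plugin that acts as a QA cache. In the previous post I reproduced the hello-world plugin. This time I take the official AI-Cache plugin, modify it by hand, add some comments, and submit it to the Tianchi competition for scoring. Setting up the environment has quite a few pitfalls, so I am writing them down.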
Preparation
Every variable in this post is written in the form ${your_qwen_token}; replace each one with your own value.

    # Prepare a Docker registry (not covered in detail). You will need the access
    # credential set on this page later for docker login.
    # This registry is the key to the plugin's debug CI/CD loop.
    https://cr.console.aliyun.com/cn-hangzhou/instance/repositories

    # Apply for a Qwen token
    https://help.aliyun.com/zh/dashscope/opening-service
    # Save it as ${your_qwen_token}

    # Upload the file to Qwen; see the figure below for where to download it
    # URL: https://tianchi.aliyun.com/competition/entrance/532192/informatio
    # Download and unzip it to get doc.md
    curl --location --request POST 'https://dashscope.aliyuncs.com/compatible-mode/v1/files' \
      --header 'Authorization: ${your_qwen_token}' \
      --form 'file=@./doc.md' \
      --form 'purpose=file-extract'

    # The result looks like this
    {"id":"${your_file_id}","object":"file","bytes":79439,"created_at":1719468299,"filename":"doc.md","purpose":"file-extract","status":"processed"}
    # Save the id as ${your_file_id}
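
If you would rather script the upload than run curl by hand, the same request can be sketched in Go with only the standard library. This is a rough sketch, not an official client: buildUploadRequest and parseFileID are names I made up for illustration, the Authorization value is passed through exactly as you supply it, and the response shape is assumed to match the JSON shown above.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"mime/multipart"
	"net/http"
)

// filesURL is the upload endpoint used by the curl command above.
const filesURL = "https://dashscope.aliyuncs.com/compatible-mode/v1/files"

// buildUploadRequest assembles the same multipart form as the curl command:
// a "file" part plus a "purpose" field. authHeader is sent verbatim, so pass
// whatever value you would put in place of ${your_qwen_token}.
func buildUploadRequest(authHeader, filename string, content []byte) (*http.Request, error) {
	var body bytes.Buffer
	w := multipart.NewWriter(&body)

	part, err := w.CreateFormFile("file", filename)
	if err != nil {
		return nil, err
	}
	if _, err := part.Write(content); err != nil {
		return nil, err
	}
	if err := w.WriteField("purpose", "file-extract"); err != nil {
		return nil, err
	}
	// Close writes the final multipart boundary; the body is incomplete without it.
	if err := w.Close(); err != nil {
		return nil, err
	}

	req, err := http.NewRequest(http.MethodPost, filesURL, &body)
	if err != nil {
		return nil, err
	}
	req.Header.Set("Authorization", authHeader)
	req.Header.Set("Content-Type", w.FormDataContentType())
	return req, nil
}

// parseFileID pulls the "id" field out of the upload response.
func parseFileID(resp []byte) (string, error) {
	var info struct {
		ID string `json:"id"`
	}
	if err := json.Unmarshal(resp, &info); err != nil {
		return "", err
	}
	if info.ID == "" {
		return "", errors.New("response has no id field")
	}
	return info.ID, nil
}

func main() {
	// Placeholder content; in practice read doc.md with os.ReadFile.
	req, err := buildUploadRequest("${your_qwen_token}", "doc.md", []byte("placeholder"))
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// To actually send it: resp, err := http.DefaultClient.Do(req)
}
```

The sketch only builds the request and does not send it, so it is safe to run without a token.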
Local Setup and Debugging
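Running Higress in Docker

    # Create a local docker-compose.yml as follows.
    # We only need the gateway; other containers such as httpbin are not needed.
    version: '3.9'
    services:
      higress:
        # This image bundles Redis and includes the ai-proxy plugin
        image: registry.cn-hangzhou.aliyuncs.com/ztygw/aio-redis:1.4.0-rc.1
        environment:
          # Enable log output
          - GATEWAY_COMPONENT_LOG_LEVEL=misc:error,wasm:debug
        ports:
          # Gateway port (LLM requests go here)
          - "8080:8080/tcp"
          # Higress console (management page) port
          - "8001:8001/tcp"
          # Redis port
          - "6379:6379/tcp"
        restart: always

Then start it directly:

    docker compose up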
Configuring the Higress Console
The container is now running. Open http://localhost:8001 to reach the Higress console; set whatever password you like.
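Creating the service sources
First create the DNS-type service described in the official documentation: the domain is dashscope.aliyuncs.com and the port is 443.
Then create a fixed-address service source for Redis: the service address is 127.0.0.1:6379, and the name is simply redis.
Your service sources should end up looking like this: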
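Route configuration
Create a route that prefix-matches /, forward it to the DashScope service created above, and attach these annotations: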
higress.io/backend-protocol: HTTPS
higress.io/proxy-ssl-name: dashscope.aliyuncs.com
higress.io/proxy-ssl-server-name: on
Just fill it in as shown in the figure below.
Configuring the AI proxy plugin
Enable the plugin here and fill in ${your_qwen_token} and ${your_file_id}.
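Verifying LLM access
With the configuration above in place, your LLM is ready to use.

    # Test request
    # Note: this goes to port 8080
    # (the question asks for the GitHub address of the main Higress repository)
    curl 'http://localhost:8080/api/openai/v1/chat/completions' \
      -H 'Accept: application/json, text/event-stream' \
      -H 'Content-Type: application/json' \
      --data-raw '{"model":"qwen-long","frequency_penalty":0,"max_tokens":800,"stream":false,"messages":[{"role":"user","content":"higress项目主仓库的github地址是什么"}],"presence_penalty":0,"temperature":0.7,"top_p":0.95}'

    # A response in the following format means it worked.
    # Note: the "from-cache" id and "gpt-4o" model in this sample match the ai-cache
    # response template configured later in this post, so this particular sample was
    # served from the cache; a direct answer from the model carries different values.
    {
      "id": "from-cache",
      "choices": [
        {
          "index": 0,
          "message": {
            "role": "assistant",
            "content": "Higress项目的GitHub主仓库地址为: https://github.com/higress-group/higress-group.github.io"
          },
          "finish_reason": "stop"
        }
      ],
      "model": "gpt-4o",
      "object": "chat.completion",
      "usage": {"prompt_tokens": 0, "completion_tokens": 0, "total_tokens": 0}
    }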
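
The same check can be sketched in Go, which is handy if you want to assert on the answer instead of eyeballing the JSON. Treat this as a minimal sketch: newChatRequest and firstAnswer are hypothetical helper names, only the fields needed here are modelled, and the gateway address is assumed to be the http://localhost:8080 used in the curl command.

```go
package main

import (
	"bytes"
	"encoding/json"
	"errors"
	"fmt"
	"net/http"
)

type message struct {
	Role    string `json:"role"`
	Content string `json:"content"`
}

// chatRequest models only the fields this sketch sends.
type chatRequest struct {
	Model    string    `json:"model"`
	Stream   bool      `json:"stream"`
	Messages []message `json:"messages"`
}

// chatResponse models only the fields this sketch reads.
type chatResponse struct {
	ID      string `json:"id"`
	Choices []struct {
		Message message `json:"message"`
	} `json:"choices"`
}

// newChatRequest builds a non-streaming chat completion request for the gateway.
func newChatRequest(gatewayURL, model, question string) (*http.Request, error) {
	payload, err := json.Marshal(chatRequest{
		Model:    model,
		Messages: []message{{Role: "user", Content: question}},
	})
	if err != nil {
		return nil, err
	}
	url := gatewayURL + "/api/openai/v1/chat/completions"
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(payload))
	if err != nil {
		return nil, err
	}
	req.Header.Set("Content-Type", "application/json")
	req.Header.Set("Accept", "application/json, text/event-stream")
	return req, nil
}

// firstAnswer returns choices[0].message.content from a response body.
func firstAnswer(body []byte) (string, error) {
	var resp chatResponse
	if err := json.Unmarshal(body, &resp); err != nil {
		return "", err
	}
	if len(resp.Choices) == 0 {
		return "", errors.New("response contains no choices")
	}
	return resp.Choices[0].Message.Content, nil
}

func main() {
	// Offline demo on a shortened sample body; send a real request with
	// http.DefaultClient.Do(req) once the gateway is running.
	sample := []byte(`{"id":"from-cache","choices":[{"index":0,"message":{"role":"assistant","content":"hello"},"finish_reason":"stop"}]}`)
	answer, err := firstAnswer(sample)
	if err != nil {
		panic(err)
	}
	fmt.Println(answer)
}
```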
Adding the Official Demo
Adding some logs
We have already cloned the project. Find the parseConfig function in the official ai-cache demo and add some logging there; later we will watch the log to see whether the plugin takes effect.

    func parseConfig(json gjson.Result, c *PluginConfig, log wrapper.Log) error {
        log.Info("Reading configuration...")

        // Read the basic Redis settings
        c.RedisInfo.ServiceName = json.Get("redis.serviceName").String()
        if c.RedisInfo.ServiceName == "" {
            log.Error("Redis service name must not be empty")
            return errors.New("redis service name must not be empty")
        }
        log.Infof("Redis service name: %s", c.RedisInfo.ServiceName)

        c.RedisInfo.ServicePort = int(json.Get("redis.servicePort").Int())
        if c.RedisInfo.ServicePort == 0 {
            if strings.HasSuffix(c.RedisInfo.ServiceName, ".static") {
                // use default logic port which is 80 for static service
                c.RedisInfo.ServicePort = 80
            } else {
                c.RedisInfo.ServicePort = 6379
            }
        }
        log.Infof("Redis service port: %d", c.RedisInfo.ServicePort)

        c.RedisInfo.Username = json.Get("redis.username").String()
        log.Infof("Redis username: %s", c.RedisInfo.Username)

        // Log only that the password was read, never its value
        c.RedisInfo.Password = json.Get("redis.password").String()
        log.Info("Redis password read")

        c.RedisInfo.Timeout = int(json.Get("redis.timeout").Int())
        if c.RedisInfo.Timeout == 0 {
            c.RedisInfo.Timeout = 1000
        }
        log.Infof("Redis timeout: %d ms", c.RedisInfo.Timeout)

        c.CacheKeyFrom.RequestBody = json.Get("cacheKeyFrom.requestBody").String()
        if c.CacheKeyFrom.RequestBody == "" {
            c.CacheKeyFrom.RequestBody = "messages.@reverse.0.content"
        }
        log.Infof("Cache Key From RequestBody: %s", c.CacheKeyFrom.RequestBody)

        c.CacheValueFrom.ResponseBody = json.Get("cacheValueFrom.responseBody").String()
        if c.CacheValueFrom.ResponseBody == "" {
            c.CacheValueFrom.ResponseBody = "choices.0.message.content"
        }
        log.Infof("Cache Value From ResponseBody: %s", c.CacheValueFrom.ResponseBody)

        c.CacheStreamValueFrom.ResponseBody = json.Get("cacheStreamValueFrom.responseBody").String()
        if c.CacheStreamValueFrom.ResponseBody == "" {
            c.CacheStreamValueFrom.ResponseBody = "choices.0.delta.content"
        }
        log.Infof("Cache Stream Value From ResponseBody: %s", c.CacheStreamValueFrom.ResponseBody)

        c.ReturnResponseTemplate = json.Get("returnResponseTemplate").String()
        if c.ReturnResponseTemplate == "" {
            c.ReturnResponseTemplate = `{"id":"from-cache","choices":[{"index":0,"message":{"role":"assistant","content":"%s"},"finish_reason":"stop"}],"model":"gpt-4o","object":"chat.completion","usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}`
        }
        log.Info("Return Response Template read")

        c.ReturnStreamResponseTemplate = json.Get("returnStreamResponseTemplate").String()
        if c.ReturnStreamResponseTemplate == "" {
            c.ReturnStreamResponseTemplate = `data:{"id":"from-cache","choices":[{"index":0,"delta":{"role":"assistant","content":"%s"},"finish_reason":"stop"}],"model":"gpt-4o","object":"chat.completion","usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}` + "\n\ndata:[DONE]\n\n"
        }
        log.Info("Return Stream Response Template read")

        c.CacheKeyPrefix = json.Get("cacheKeyPrefix").String()
        if c.CacheKeyPrefix == "" {
            c.CacheKeyPrefix = DefaultCacheKeyPrefix
        }
        log.Infof("Cache Key Prefix: %s", c.CacheKeyPrefix)

        c.redisClient = wrapper.NewRedisClusterClient(wrapper.FQDNCluster{
            FQDN: c.RedisInfo.ServiceName,
            Port: int64(c.RedisInfo.ServicePort),
        })
        log.Info("Redis client instance created")

        err := c.redisClient.Init(c.RedisInfo.Username, c.RedisInfo.Password, int64(c.RedisInfo.Timeout))
        if err != nil {
            log.Errorf("Redis client initialization failed: %v", err)
            return err
        }
        log.Info("Redis client initialized")
        log.Info("Configuration initialized")
        return nil
    }
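
The fallback rules buried in parseConfig are easier to see in isolation. The sketch below pulls the port and timeout defaults out into two small functions so they can be run on their own; redisPort and redisTimeout are names I chose for illustration and are not part of the plugin, and "redis.dns" is a made-up service name.

```go
package main

import (
	"fmt"
	"strings"
)

// redisPort mirrors the port fallback in parseConfig: an explicit port wins,
// a service name ending in ".static" falls back to the logical port 80,
// and anything else falls back to Redis's usual 6379.
func redisPort(serviceName string, configured int) int {
	if configured != 0 {
		return configured
	}
	if strings.HasSuffix(serviceName, ".static") {
		return 80
	}
	return 6379
}

// redisTimeout mirrors the timeout fallback: 0 means "not set",
// in which case 1000 ms is used.
func redisTimeout(configuredMs int) int {
	if configuredMs == 0 {
		return 1000
	}
	return configuredMs
}

func main() {
	fmt.Println(redisPort("redis.static", 0), redisTimeout(2000)) // 80 2000
	fmt.Println(redisPort("redis.dns", 0), redisTimeout(0))       // 6379 1000
	fmt.Println(redisPort("redis.static", 6380))                  // 6380
}
```

This is why a configuration with serviceName set to redis.static and no servicePort still works: the plugin talks to the static service on logical port 80 rather than on 6379.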
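One more issue, in the onHttpRequestHeaders function: it ends with return types.HeaderStopIteration.
It is best to change this to return types.ActionContinue for now. I do not fully understand what HeaderStopIteration means; the name suggests that it pauses the filter chain at the header stage until something resumes it, which would explain why my requests hung. When I was stuck, changing it to ActionContinue fixed the problem.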
Building and pushing the plugin

    # Go to the ai-cache directory
    cd ~/higress/plugins/wasm-go/extensions/ai-cache

    # Build with tinygo
    tinygo build -o main.wasm -scheduler=none -target=wasi -gc=custom -tags="custommalloc nottinygc_finalizer" ./
    # Check that main.wasm was generated locally.
    # I verified that the build does not work on macOS + ARM.

    # Create a Dockerfile in the current directory
    # (the name must match the -f Dockerfile argument below)
    vim Dockerfile
    # with this content:
    FROM scratch
    COPY main.wasm plugin.wasm

    # Log in to the Alibaba Cloud registry
    docker login --username=${your_docker_username} registry.cn-hangzhou.aliyuncs.com
    # Enter the password ${your_docker_psw}

    # Build the image; note that my version here is 1.0.0
    docker build -t registry.cn-hangzhou.aliyuncs.com/${your_docker_namespace}/${your_docker_repository}:1.0.0 -f Dockerfile .

    # Push it to the remote registry
    docker push registry.cn-hangzhou.aliyuncs.com/${your_docker_namespace}/${your_docker_repository}:1.0.0

    # This is now your plugin address
    registry.cn-hangzhou.aliyuncs.com/${your_docker_namespace}/${your_docker_repository}:1.0.0
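Adding the ai-cache plugin
Go back to the Higress console at http://localhost:8001 and add a new plugin:
Plugin name: ai-cache
Image URL: the address you pushed above. The oci:// prefix can be omitted; the console adds it automatically.
Execution phase: authentication
Priority: 99
The plugin is not enabled yet; it still needs more configuration.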
Viewing the logs

    # Enter the Higress container; for example, my local CONTAINER ID is ac11f4f3588a
    docker exec -it ${your_container_id} bash

    # Follow the log. The wasm debug output appears because we set
    # GATEWAY_COMPONENT_LOG_LEVEL=misc:error,wasm:debug earlier.
    tail -f /var/log/higress/gateway.log
Configuring and enabling the plugin
Paste the configuration in first, then enable the plugin. The configuration is:

    cacheKeyFrom:
      requestBody: "messages.@reverse.0.content"
    cacheStreamValueFrom:
      responseBody: "choices.0.delta.content"
    cacheValueFrom:
      responseBody: "choices.0.message.content"
    redis:
      serviceName: "redis.static"
      timeout: 2000
    returnResponseTemplate: |
      {"id":"from-cache","choices":[{"index":0,"message":{"role":"assistant","content":"%s"},"finish_reason":"stop"}],"model":"gpt-4o","object":"chat.completion","usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}
    returnStreamResponseTemplate: |-
      data:{"id":"from-cache","choices":[{"index":0,"delta":{"role":"assistant","content":"%s"},"finish_reason":"stop"}],"model":"gpt-4o","object":"chat.completion","usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}

      data:[DONE]
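
The %s in returnResponseTemplate is where the cached answer ends up. Here is a small sketch of that substitution, assuming the template is filled Printf-style as the placeholder suggests; I have not confirmed how the plugin itself does it, or whether the value it stores in Redis is already escaped. renderCachedResponse, escapeForJSON and isValidJSON are hypothetical helper names.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// returnResponseTemplate is copied from the plugin configuration above.
const returnResponseTemplate = `{"id":"from-cache","choices":[{"index":0,"message":{"role":"assistant","content":"%s"},"finish_reason":"stop"}],"model":"gpt-4o","object":"chat.completion","usage":{"prompt_tokens":0,"completion_tokens":0,"total_tokens":0}}`

// escapeForJSON escapes s so it can sit between double quotes in a JSON
// document. It marshals the string and strips the surrounding quotes.
func escapeForJSON(s string) string {
	b, err := json.Marshal(s)
	if err != nil {
		// Marshalling a plain string is not expected to fail.
		panic(err)
	}
	return string(b[1 : len(b)-1])
}

// renderCachedResponse substitutes the escaped answer for the template's %s.
// Assumption: the answer passed in is a raw, unescaped string.
func renderCachedResponse(template, cachedAnswer string) string {
	return fmt.Sprintf(template, escapeForJSON(cachedAnswer))
}

// isValidJSON reports whether s parses as JSON.
func isValidJSON(s string) bool {
	return json.Valid([]byte(s))
}

func main() {
	answer := "first line\nsecond line with \"quotes\""
	out := renderCachedResponse(returnResponseTemplate, answer)
	fmt.Println(out)
	fmt.Println("valid JSON:", isValidJSON(out))
}
```

The escaping step matters because an answer containing a double quote or a newline would otherwise break the JSON that the template produces.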
Now look at the log we started following earlier; if the following lines appear, everything is fine.
Verifying ai-cache

    # Test request
    # Note: this goes to port 8080
    curl 'http://localhost:8080/api/openai/v1/chat/completions' \
      -H 'Accept: application/json, text/event-stream' \
      -H 'Content-Type: application/json' \
      --data-raw '{"model":"qwen-long","frequency_penalty":0,"max_tokens":800,"stream":false,"messages":[{"role":"user","content":"higress项目主仓库的github地址是什么"}],"presence_penalty":0,"temperature":0.7,"top_p":0.95}'

    # Send the request twice in a row; if the second response comes back very
    # quickly, the cache is working.
    # We mapped the Redis port out earlier, so you can also connect with a Redis
    # client and look at the keys (not covered here).
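
Timing is a rough signal, so it helps to know what to look for in the response and in Redis. The sketch below mimics the configured path messages.@reverse.0.content (the content of the last message) with the standard library, and checks a response for the "from-cache" id set by the response template. It assumes the Redis key is the key prefix followed by that content, which I have not verified against the plugin; the prefix is a parameter because the value of DefaultCacheKeyPrefix is not shown in this post, and all function names are made up.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
)

// lastMessageContent returns the content of the last entry in "messages",
// which is what the path "messages.@reverse.0.content" selects.
func lastMessageContent(requestBody []byte) (string, error) {
	var body struct {
		Messages []struct {
			Content string `json:"content"`
		} `json:"messages"`
	}
	if err := json.Unmarshal(requestBody, &body); err != nil {
		return "", err
	}
	if len(body.Messages) == 0 {
		return "", errors.New("request has no messages")
	}
	return body.Messages[len(body.Messages)-1].Content, nil
}

// cacheKey builds the key to look for in Redis.
// Assumption: key = prefix + last message content.
func cacheKey(prefix string, requestBody []byte) (string, error) {
	question, err := lastMessageContent(requestBody)
	if err != nil {
		return "", err
	}
	return prefix + question, nil
}

// servedFromCache reports whether a response carries the "from-cache" id
// that the configured response template sets.
func servedFromCache(responseBody []byte) bool {
	var resp struct {
		ID string `json:"id"`
	}
	if err := json.Unmarshal(responseBody, &resp); err != nil {
		return false
	}
	return resp.ID == "from-cache"
}

func main() {
	req := []byte(`{"model":"qwen-long","messages":[{"role":"system","content":"be brief"},{"role":"user","content":"what is higress"}]}`)
	// "demo-prefix:" is a placeholder, not the plugin's real default prefix.
	key, err := cacheKey("demo-prefix:", req)
	if err != nil {
		panic(err)
	}
	fmt.Println(key)
	fmt.Println(servedFromCache([]byte(`{"id":"from-cache"}`)))
}
```

Because the key is derived from the exact text of the last message, only a question that repeats word for word can hit this cache.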
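Iterating
For later versions, just change the ai-cache image URL in the Higress console.
So every code update means: build the code, build the image, push it to the Docker registry, and update the plugin's image URL.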
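
Since this loop repeats on every change, it is worth scripting. The following is a minimal sketch that bumps the patch version and prints the commands from this post for the new tag. It assumes you tag each iteration with a fresh MAJOR.MINOR.PATCH version, which is my own convention and not a requirement stated anywhere; nextPatchTag, imageRef and releaseCommands are made-up helper names.

```go
package main

import (
	"fmt"
	"strconv"
	"strings"
)

// nextPatchTag turns "1.0.0" into "1.0.1".
func nextPatchTag(tag string) (string, error) {
	parts := strings.Split(tag, ".")
	if len(parts) != 3 {
		return "", fmt.Errorf("expected MAJOR.MINOR.PATCH, got %q", tag)
	}
	patch, err := strconv.Atoi(parts[2])
	if err != nil {
		return "", fmt.Errorf("invalid patch number in %q: %w", tag, err)
	}
	parts[2] = strconv.Itoa(patch + 1)
	return strings.Join(parts, "."), nil
}

// imageRef builds the image address in the registry used in this post.
func imageRef(namespace, repository, tag string) string {
	return fmt.Sprintf("registry.cn-hangzhou.aliyuncs.com/%s/%s:%s", namespace, repository, tag)
}

// releaseCommands lists the build and push commands for one iteration.
// Updating the image URL in the console remains a manual step.
func releaseCommands(namespace, repository, tag string) []string {
	ref := imageRef(namespace, repository, tag)
	return []string{
		`tinygo build -o main.wasm -scheduler=none -target=wasi -gc=custom -tags="custommalloc nottinygc_finalizer" ./`,
		"docker build -t " + ref + " -f Dockerfile .",
		"docker push " + ref,
	}
}

func main() {
	tag, err := nextPatchTag("1.0.0")
	if err != nil {
		panic(err)
	}
	for _, cmd := range releaseCommands("${your_docker_namespace}", "${your_docker_repository}", tag) {
		fmt.Println(cmd)
	}
	fmt.Println("# then set the plugin image URL in the console to:")
	fmt.Println(imageRef("${your_docker_namespace}", "${your_docker_repository}", tag))
}
```

The program only prints the commands, so you can review them before running anything.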