bge-large-zh-v1.5
Download the model to a local directory:
modelscope download --model BAAI/bge-large-zh-v1.5 --local_dir ./bge-large-zh-v1.5
Define the custom embedding model in custom-bge-large-zh-v1.5.json:
{
  "model_name": "custom-bge-large-zh-v1.5",
  "dimensions": 1024,
  "max_tokens": 512,
  "language": ["zh"],
  "model_id": "BAAI/bge-large-zh-v1.5",
  "model_uri": "/path/to/bge-large-zh-v1.5"
}
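The spec file can also be generated from a script, which makes it easier to catch a missing field before registering. The following is a minimal sketch: the field names and values are copied from the JSON above, while `check_spec`, `write_spec`, and `EXPECTED_KEYS` are hypothetical helpers written for this note, not part of Xinference, and the checks reflect only the fields used here rather than an official schema.

```python
import json
from pathlib import Path

# Values copied from custom-bge-large-zh-v1.5.json in this note.
EMBEDDING_SPEC = {
    "model_name": "custom-bge-large-zh-v1.5",
    "dimensions": 1024,
    "max_tokens": 512,
    "language": ["zh"],
    "model_id": "BAAI/bge-large-zh-v1.5",
    "model_uri": "/path/to/bge-large-zh-v1.5",  # placeholder path from the note
}

# Assumption: the keys this note uses, not an authoritative Xinference schema.
EXPECTED_KEYS = {
    "model_name", "dimensions", "max_tokens",
    "language", "model_id", "model_uri",
}


def check_spec(spec):
    """Return a list of problems found in the spec; an empty list means none."""
    problems = []
    for key in sorted(EXPECTED_KEYS - set(spec)):
        problems.append(f"missing key: {key}")
    for key in ("dimensions", "max_tokens"):
        if key in spec and not isinstance(spec[key], int):
            problems.append(f"{key} must be an integer")
    if "language" in spec and not isinstance(spec["language"], list):
        problems.append("language must be a list of language codes")
    return problems


def write_spec(spec, path):
    """Write the spec as indented UTF-8 JSON to the given path."""
    text = json.dumps(spec, indent=2, ensure_ascii=False) + "\n"
    Path(path).write_text(text, encoding="utf-8")
```

Calling `write_spec(EMBEDDING_SPEC, "custom-bge-large-zh-v1.5.json")` after `check_spec(EMBEDDING_SPEC)` returns an empty list produces the file that the register command expects.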
Register the custom model:
xinference register --model-type embedding --file custom-bge-large-zh-v1.5.json --persist
Launch the custom model:
xinference launch --model-name custom-bge-large-zh-v1.5 --model-type embedding
bge-reranker-v2-m3
Download the model to a local directory:
modelscope download --model AI-ModelScope/bge-reranker-v2-m3 --local_dir ./bge-reranker-v2-m3
Define the custom rerank model in custom-bge-reranker-v2-m3.json:
{
  "model_name": "custom-bge-reranker-v2-m3",
  "type": "normal",
  "language": ["en", "zh", "multilingual"],
  "model_id": "BAAI/bge-reranker-v2-m3",
  "model_uri": "/path/to/bge-reranker-v2-m3"
}
Register the custom model:
xinference register --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist
This fails with the following error:
Traceback (most recent call last):
  File "//env/bin/xinference", line 8, in <module>
    sys.exit(cli())
  File "//env/lib/python3.10/site-packages/click/core.py", line 1161, in __call__
    return self.main(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1082, in main
    rv = self.invoke(ctx)
  File "//env/lib/python3.10/site-packages/click/core.py", line 1697, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "//env/lib/python3.10/site-packages/click/core.py", line 1443, in invoke
    return ctx.invoke(self.callback, **ctx.params)
  File "//env/lib/python3.10/site-packages/click/core.py", line 788, in invoke
    return __callback(*args, **kwargs)
  File "//env/lib/python3.10/site-packages/xinference/deploy/cmdline.py", line 407, in register_model
    client.register_model(
  File "//env/lib/python3.10/site-packages/xinference/client/restful/restful_client.py", line 1188, in register_model
    raise RuntimeError(
RuntimeError: Failed to register model, detail: Not Found
The same command succeeds once --endpoint is given, because Xinference is deployed on port 9999:
xinference register --endpoint http://localhost:9999 --model-type rerank --file ./custom-bge-reranker-v2-m3.json --persist
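The Not Found error is consistent with the CLI talking to an address where the Xinference API is not served. One way to check an endpoint before registering is to request its /v1/models route, the same route queried with curl later in this note. The sketch below uses only the Python standard library; `models_url` and `is_xinference_endpoint` are hypothetical helper names, and the test for a JSON object with a "data" list is an assumption based on the response shown in this note.

```python
import json
import urllib.error
import urllib.request


def models_url(endpoint):
    """Build the /v1/models URL for an endpoint such as http://localhost:9999."""
    return endpoint.rstrip("/") + "/v1/models"


def is_xinference_endpoint(endpoint, timeout=3.0):
    """Return True if GET <endpoint>/v1/models yields a JSON object with a "data" list.

    Any connection failure, HTTP error, timeout, or non-JSON body counts as False.
    """
    try:
        with urllib.request.urlopen(models_url(endpoint), timeout=timeout) as resp:
            body = json.load(resp)
    except (urllib.error.URLError, OSError, ValueError):
        return False
    return isinstance(body, dict) and isinstance(body.get("data"), list)
```

If `is_xinference_endpoint("http://localhost:9999")` returns True, the same URL is the one to pass to --endpoint.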
Launch the custom model:
xinference launch --model-type rerank --model-name custom-bge-reranker-v2-m3 --endpoint http://localhost:9999
Verify that the models loaded successfully; the output lists the loaded models:
curl http://localhost:9999/v1/models
{"object":"list","data":[{"id":"custom-bge-large-zh-v1.5","object":"model","created":0,"owned_by":"xinference","model_type":"embedding","address":"0.0.0.0:39987","accelerators":[],"model_name":"custom-bge-large-zh-v1.5","dimensions":1024,"max_tokens":512,"language":["zh"],"model_revision":null,"replica":1},{"id":"custom-bge-reranker-v2-m3","object":"model","created":0,"owned_by":"xinference","model_type":"rerank","address":"0.0.0.0:44611","accelerators":[],"type":"normal","model_name":"custom-bge-reranker-v2-m3","language":["en","zh","multilingual"],"model_revision":null,"replica":1}]}
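Reading the single-line response by eye is error-prone, so the check can be scripted. The sketch below parses a /v1/models response body and maps each model id to its model_type; `loaded_models` is a hypothetical helper, and `SAMPLE` is a trimmed copy of the output above that keeps only the fields the helper reads.

```python
import json

# Trimmed from the curl output in this note; other fields are omitted.
SAMPLE = (
    '{"object":"list","data":['
    '{"id":"custom-bge-large-zh-v1.5","model_type":"embedding"},'
    '{"id":"custom-bge-reranker-v2-m3","model_type":"rerank"}]}'
)


def loaded_models(models_response):
    """Map each model id to its model_type from a /v1/models JSON body."""
    payload = json.loads(models_response)
    return {m["id"]: m.get("model_type") for m in payload.get("data", [])}


def missing_models(models_response, expected_ids):
    """Return the expected model ids that are absent from the response, sorted."""
    return sorted(set(expected_ids) - set(loaded_models(models_response)))
```

With the output above, `missing_models` returns an empty list for both custom model names, which indicates that the embedding and rerank models are loaded.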