Before we start
- Operating system: Windows 11
- ollama version: 0.5.4
- VS Code version: 1.96.2
- Continue extension version: 0.8.66
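Installing ollama
- Go to the official website, download the installer, and run it
- By default it installs to the C: drive, which is annoying, but you can pass an install path:
OllamaSetup.exe /DIR="d:\some\location"
- After installing, confirm that the installation succeeded
D:\ollama>ollama -v
ollama version is 0.5.4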
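The version check above can also be scripted, for example as part of an unattended setup. This is a minimal sketch, assuming `ollama` is on PATH and that `ollama -v` keeps the `ollama version is X.Y.Z` shape shown above; the helper names are hypothetical.

```python
import re
import subprocess


def parse_ollama_version(output: str) -> tuple[int, ...]:
    """Extract the dotted version number from `ollama -v` output."""
    match = re.search(r"(\d+(?:\.\d+)+)", output)
    if match is None:
        raise ValueError(f"no version number found in: {output!r}")
    return tuple(int(part) for part in match.group(1).split("."))


def installed_ollama_version() -> tuple[int, ...]:
    """Run `ollama -v` and parse the result (requires ollama on PATH)."""
    result = subprocess.run(
        ["ollama", "-v"], capture_output=True, text=True, check=True
    )
    return parse_ollama_version(result.stdout)
```

Calling `installed_ollama_version()` on the machine from this post would be expected to give `(0, 5, 4)`; comparing tuples makes it easy to require a minimum version.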
Installing qwen2.5-coder
- Search for qwen2.5-coder on the official website
- Copy the install command
D:\ollama>ollama run qwen2.5-coder
pulling manifest
pulling 60e05f210007... 2% ▕ ▏ 79 MB/4.7 GB 9.9 MB/s 7m43s
- Once the download finishes, you can start chatting
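>>> Write me a Fibonacci sequence in golang
Of course! Here is sample code in Go that generates the Fibonacci sequence:

```go
package main

import (
	"fmt"
)

// Fibonacci computes the n-th number of the Fibonacci sequence
func Fibonacci(n int) int {
	if n <= 1 {
		return n
	}
	a, b := 0, 1
	for i := 2; i <= n; i++ {
		a, b = b, a+b
	}
	return b
}

// PrintFibonacci prints the first n numbers of the Fibonacci sequence
func PrintFibonacci(n int) {
	for i := 0; i < n; i++ {
		fmt.Println(Fibonacci(i))
	}
}

func main() {
	n := 10 // change this value to print a Fibonacci sequence of a different length
	PrintFibonacci(n)
}
```

This program defines two functions:
1. `Fibonacci` computes the n-th number of the Fibonacci sequence.
2. `PrintFibonacci` prints the first n numbers of the Fibonacci sequence.

In the `main` function, we call `PrintFibonacci(n)` to print the first 10 Fibonacci numbers. You can change the value of `n` as needed to print a Fibonacci sequence of a different length.

>>> Send a message (/? for help)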
- Keep this window open once the model is running
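Besides the interactive prompt, the running model can be reached over Ollama's local HTTP API. The sketch below sends one user message to the `/api/chat` endpoint. It assumes Ollama is listening on its default address `http://localhost:11434`; the function names are hypothetical.

```python
import json
import urllib.request

OLLAMA_URL = "http://localhost:11434"  # Ollama's default address; adjust if yours differs


def build_chat_request(model: str, prompt: str) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/chat endpoint."""
    payload = {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "stream": False,  # ask for one JSON object instead of a stream of chunks
    }
    return urllib.request.Request(
        f"{OLLAMA_URL}/api/chat",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_reply(response_body: bytes) -> str:
    """Pull the assistant's text out of a non-streaming /api/chat response."""
    return json.loads(response_body)["message"]["content"]


def chat(model: str, prompt: str) -> str:
    """Send the prompt to the local Ollama server and return the reply text."""
    with urllib.request.urlopen(build_chat_request(model, prompt)) as resp:
        return extract_reply(resp.read())
```

With the server running, `chat("qwen2.5-coder", "Write a Fibonacci function in Go")` should return the same kind of answer as the terminal session above.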
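Installing the Continue extension
- Search for continue in the VS Code extensions panel
- Click Install
- Click the extension's button in the left sidebar and choose Add Chat Model
- Choose Ollama and click Connect; you should then see Autodetect - qwen2.5-coder
- Now you can start chatting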
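If no model shows up after connecting, it helps to check which models the local Ollama server actually reports. Ollama lists them at `/api/tags`. The sketch below reads that list, assuming the default address `http://localhost:11434`. Whether Continue's autodetection uses exactly this endpoint is an assumption, and the function names are hypothetical.

```python
import json
import urllib.request


def model_names(tags_body: bytes) -> list[str]:
    """Return the model names found in an /api/tags response body."""
    data = json.loads(tags_body)
    return [entry["name"] for entry in data.get("models", [])]


def list_local_models(base_url: str = "http://localhost:11434") -> list[str]:
    """Ask the local Ollama server which models it has available."""
    with urllib.request.urlopen(f"{base_url}/api/tags") as resp:
        return model_names(resp.read())
```

After the pull above, `list_local_models()` would be expected to include an entry such as `qwen2.5-coder:latest`.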
Configuring Continue autocomplete
- Click the Continue button in the bottom-right corner of VS Code
- Select Configure autocomplete options
- Change the config file to the following:
{
  "models": [
    {
      "model": "AUTODETECT",
      "title": "Autodetect",
      "provider": "ollama"
    }
  ],
  "tabAutocompleteModel": {
    "title": "qwen",
    "provider": "ollama",
    "model": "qwen2.5-coder:latest",
    "apiKey": ""
  },
  "contextProviders": [
    {"name": "code", "params": {}},
    {"name": "docs", "params": {}},
    {"name": "diff", "params": {}},
    {"name": "terminal", "params": {}},
    {"name": "problems", "params": {}},
    {"name": "folder", "params": {}},
    {"name": "codebase", "params": {}}
  ],
  "slashCommands": [
    {"name": "share", "description": "Export the current chat session to markdown"},
    {"name": "cmd", "description": "Generate a shell command"},
    {"name": "commit", "description": "Generate a git commit message"}
  ]
}
- Now you can enjoy autocomplete
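Autocomplete is a fill-in-the-middle task: the model sees the code before and after the cursor and produces what goes in between. Ollama's `/api/generate` endpoint accepts a `suffix` field for this purpose. The sketch below builds such a request by hand, which is useful for testing the model outside the editor. The exact request Continue sends is an assumption. The function names and the `max_tokens` parameter are hypothetical, and the address is Ollama's default.

```python
import json
import urllib.request


def build_fim_payload(
    model: str, before_cursor: str, after_cursor: str, max_tokens: int = 64
) -> dict:
    """Build an /api/generate payload asking the model to fill the gap."""
    return {
        "model": model,
        "prompt": before_cursor,   # code before the cursor
        "suffix": after_cursor,    # code after the cursor
        "stream": False,
        "options": {"num_predict": max_tokens},  # cap the completion length
    }


def complete(payload: dict, base_url: str = "http://localhost:11434") -> str:
    """POST the payload to the local Ollama server and return the completion."""
    request = urllib.request.Request(
        f"{base_url}/api/generate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return json.loads(resp.read())["response"]
```

For example, `complete(build_fim_payload("qwen2.5-coder:latest", "def add(a, b):\n    return ", "\n"))` asks the model for the missing expression.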
Deploying the model in an offline environment
- Everything above assumes network access. On an intranet the process is similar, and installing ollama and the Continue extension is simple enough that I won't repeat it
- The part worth covering is deploying the model
- ollama currently supports two ways of importing a model:
  - Build from a Safetensors model
    The model directory should contain the Safetensors weights for a supported architecture.
    FROM <model directory>
    Currently supported model architectures:
    - Llama (including Llama 2, Llama 3, Llama 3.1, and Llama 3.2)
    - Mistral (including Mistral 1, Mistral 2, and Mixtral)
    - Gemma (including Gemma 1 and Gemma 2)
    - Phi3
  - Build from a GGUF file
    The GGUF file location should be specified as an absolute path or relative to the Modelfile location.
    FROM ./ollama-model.gguf
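- According to this documentation, qwen2.5-coder can only be imported from a GGUF file, since Qwen is not among the architectures listed for Safetensors import
- Download the model from Hugging Face or ModelScope
- Here I chose the 7B model, Qwen2.5-Coder-7B-Instruct-GGUF
- Open the model's file list and download the file marked with the red box in the screenshot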
- Then run the /show modelfile command in the ollama window
- Save the output as a file named Modelfile and change the value after FROM, for example:
# Modelfile generated by "ollama show"
# To build a new Modelfile based on this, replace FROM with:
# FROM qwen2.5-coder:latest

FROM ./qwen2.5-coder-7b-instruct-q6_k.gguf
TEMPLATE """{{- if .Suffix }}<|fim_prefix|>{{ .Prompt }}<|fim_suffix|>{{ .Suffix }}<|fim_middle|>
{{- else if .Messages }}
{{- if or .System .Tools }}<|im_start|>system
{{- if .System }}
{{ .System }}
{{- end }}
{{- if .Tools }}

# Tools

You may call one or more functions to assist with the user query.

You are provided with function signatures within <tools></tools> XML tags:
<tools>
{{- range .Tools }}
{"type": "function", "function": {{ .Function }}}
{{- end }}
</tools>

For each function call, return a json object with function name and arguments within <tool_call></tool_call> XML tags:
<tool_call>
{"name": <function-name>, "arguments": <args-json-object>}
</tool_call>
{{- end }}<|im_end|>
{{ end }}
{{- range $i, $_ := .Messages }}
{{- $last := eq (len (slice $.Messages $i)) 1 -}}
{{- if eq .Role "user" }}<|im_start|>user
{{ .Content }}<|im_end|>
{{ else if eq .Role "assistant" }}<|im_start|>assistant
{{ if .Content }}{{ .Content }}
{{- else if .ToolCalls }}<tool_call>
{{ range .ToolCalls }}{"name": "{{ .Function.Name }}", "arguments": {{ .Function.Arguments }}}
{{ end }}</tool_call>
{{- end }}{{ if not $last }}<|im_end|>
{{ end }}
{{- else if eq .Role "tool" }}<|im_start|>user
<tool_response>
{{ .Content }}
</tool_response><|im_end|>
{{ end }}
{{- if and (ne .Role "assistant") $last }}<|im_start|>assistant
{{ end }}
{{- end }}
{{- else }}
{{- if .System }}<|im_start|>system
{{ .System }}<|im_end|>
{{ end }}{{ if .Prompt }}<|im_start|>user
{{ .Prompt }}<|im_end|>
{{ end }}<|im_start|>assistant
{{ end }}{{ .Response }}{{ if .Response }}<|im_end|>{{ end }}"""
SYSTEM You are Qwen, created by Alibaba Cloud. You are a helpful assistant.
- Put this file in the same folder as the downloaded GGUF file
- Then run the ollama create command:
ollama create qwen2.5-coder -f ./Modelfile
- The model is now installed
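When files are carried onto an intranet machine by hand, a quick pre-flight check before `ollama create` can catch a wrong path or a damaged download. The sketch below reads the FROM line of a Modelfile, resolves a relative path against the Modelfile's folder (as the import documentation quoted above describes), and verifies that the target starts with the ASCII bytes `GGUF`, the format's magic number. The function names are hypothetical, and the parsing only covers a simple FROM line like the one in this post.

```python
from pathlib import Path

GGUF_MAGIC = b"GGUF"  # the first four bytes of every GGUF file


def from_target(modelfile_text: str) -> str:
    """Return the value of the first non-comment FROM line in a Modelfile."""
    for line in modelfile_text.splitlines():
        stripped = line.strip()
        if stripped.upper().startswith("FROM "):
            return stripped[5:].strip()
    raise ValueError("Modelfile has no FROM line")


def check_modelfile(modelfile: Path) -> Path:
    """Resolve the FROM path and confirm it points at a GGUF file."""
    gguf = Path(from_target(modelfile.read_text(encoding="utf-8")))
    if not gguf.is_absolute():
        # Relative paths are interpreted relative to the Modelfile location.
        gguf = modelfile.parent / gguf
    if not gguf.is_file():
        raise FileNotFoundError(gguf)
    with gguf.open("rb") as handle:
        if handle.read(4) != GGUF_MAGIC:
            raise ValueError(f"{gguf} does not start with the GGUF magic bytes")
    return gguf
```

Running `check_modelfile(Path("Modelfile"))` in the folder prepared above should return the path of the `.gguf` file, or raise an error that says what is wrong.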
Problem
- After installing the model offline, Continue reports the following error:
Error calling Ollama /api/generate endpoint: Error: Error: HTTP 404 Not Found
After a long search, replacing "model": "AUTODETECT", in the config file with the actual model name fixed it. This is the part of the config file concerned:
{
  "models": [
    {
      "model": "AUTODETECT",
      "title": "Autodetect",
      "provider": "ollama"
    }
  ],
  "tabAutocompleteModel": {
    "title": "qwen",
    "provider": "ollama",
    "model": "qwen2.5-coder:latest",
    "apiKey": ""
  },
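The change described above can be sketched as a small transformation of the config. This is only an illustration: the function name is hypothetical, and the replacement name `qwen2.5-coder:latest` is an assumption taken from the tabAutocompleteModel entry, so use whatever name your ollama installation actually lists.

```python
import copy


def pin_model(config: dict, model_name: str) -> dict:
    """Return a copy of the config with AUTODETECT replaced by a fixed model name."""
    fixed = copy.deepcopy(config)  # leave the caller's config untouched
    for entry in fixed.get("models", []):
        if entry.get("model") == "AUTODETECT":
            entry["model"] = model_name
    return fixed


# Example with the "models" section from this post; the model name is an assumption.
original = {
    "models": [
        {"model": "AUTODETECT", "title": "Autodetect", "provider": "ollama"}
    ]
}
pinned = pin_model(original, "qwen2.5-coder:latest")
```

The sketch leaves the `title` field as it is. It is only a display label, so rename it if "Autodetect" would be misleading.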