Playing with Large Models on Windows, Day 2: Streaming Output and Role-Play (Prompts) (Full Code and Detailed Deployment Steps)

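Previous post: Playing with Large Models on Windows, Day 1: Deploying a Large Model Locally and Calling Its API for Direct Engineering Use (Full Code and Detailed Deployment Steps)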

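Streaming output means that the model does not return the whole result at once but emits it step by step, continuously, as it is generated. This is similar to how people speak or write: thinking and expressing at the same time. Its advantages are greater flexibility and interactivity. During long text generation the content can be adjusted on the fly, and the user can give feedback or change direction while generation is still in progress.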

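A prompt is the input text given to the model to steer the direction and style of its output. Put simply, the prompt tells the model what information or answer the user wants. How the prompt is written has a decisive effect on the result: a carefully designed prompt can guide the model to produce more accurate, more on-target answers or content within a specific domain. Prompts are not limited to text. Depending on the model's capabilities and the use case, they can also be images, audio, and other forms.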

I. Streaming Output

import requests
import json

# Send the API request to get a reply
url = "http://localhost:11434/api/generate"
# The prompt says: "Answer in Chinese; write a 500-character diary entry"
payload = {"model": "llama3", "prompt": "中文回答，写个500字日记"}
response = requests.post(url, json=payload, stream=True)  # make sure stream=True is set

# Parse the streamed JSON response
try:
    # The stream contains multiple JSON objects, one per line
    for line in response.iter_lines():
        if line:  # make sure the line is not empty
            try:
                # Parse a single JSON object
                response_json = json.loads(line.decode('utf-8'))
                # Print the new part of the reply without a line break
                if 'response' in response_json:
                    print(response_json['response'], end='', flush=True)
                # Check whether this is the last fragment
                if 'done' in response_json and response_json['done']:
                    print("\nStream ended.")
                    break
            except json.JSONDecodeError as e:
                print("\nJSON decode error:", e)
                continue
except json.JSONDecodeError:
    print("Initial JSON decode error: potentially non-stream response")

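This is much more comfortable to use, and it feels a lot faster too, because text shows up as soon as it is generated instead of only after the whole reply is finished.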
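The parsing part of the loop can be pulled out of the network call, which makes it easier to reuse and to try without a running server. The sketch below assumes, as in the code above, that every non-empty line is one JSON object with an optional `response` string and a `done` flag. The helper name `iter_fragments` and the sample lines are made up for illustration.

```python
import json
from typing import Iterable, Iterator


def iter_fragments(lines: Iterable[bytes]) -> Iterator[str]:
    """Yield the text fragments from a stream of JSON lines.

    Hypothetical helper: `lines` can be `response.iter_lines()` from
    requests, or any iterable of bytes for offline experiments.
    """
    for line in lines:
        if not line:  # skip empty lines
            continue
        try:
            obj = json.loads(line.decode("utf-8"))
        except json.JSONDecodeError:
            continue  # ignore a malformed line and keep reading
        if "response" in obj:
            yield obj["response"]
        if obj.get("done"):
            break  # final object reached, stop reading


# Made-up sample lines standing in for a real stream
sample = [
    b'{"response": "Hello", "done": false}',
    b"",
    b"not json",
    b'{"response": ", world", "done": false}',
    b'{"response": "", "done": true}',
    b'{"response": "ignored", "done": false}',
]
reply = "".join(iter_fragments(sample))
print(reply)
```

With a live server, the same helper would be driven by `iter_fragments(response.iter_lines())`, printing each piece with `end=""` and `flush=True` as before.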

II. Role-Play (Prompts)

Right now the model calls itself LLaMa. How do we make it take on a different role?

1. Create a new txt file:

FROM llama3
# set the temperature to 1 [higher is more creative, lower is more coherent]
PARAMETER temperature 1
# set the system message
SYSTEM """
You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.
"""

2. Rename it to Modelfile and remove the file extension.

3. Then run:

ollama create mario -f ./Modelfile
ollama list

4. Then run the following, and you can start role-playing.

ollama run mario

If you no longer need it, you can delete the model:

ollama rm mario
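Steps 1 and 2 above can also be done from Python. This avoids the rename entirely, which helps on Windows, where hidden file extensions make it easy to end up with Modelfile.txt by accident. This is a minimal sketch: the function names `render_modelfile` and `write_modelfile` and their parameters are hypothetical, and the generated text simply mirrors the Modelfile from step 1 without the comment lines.

```python
from pathlib import Path


def render_modelfile(base_model: str, temperature: float, system: str) -> str:
    """Return Modelfile text with FROM, PARAMETER temperature, and SYSTEM."""
    return (
        f"FROM {base_model}\n"
        f"PARAMETER temperature {temperature}\n"
        f'SYSTEM """\n{system}\n"""\n'
    )


def write_modelfile(directory: str, text: str) -> Path:
    """Write `text` to a file named exactly "Modelfile" (no extension)."""
    path = Path(directory) / "Modelfile"
    path.write_text(text, encoding="utf-8")
    return path


text = render_modelfile(
    "llama3",
    1,
    "You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.",
)
print(text)

if __name__ == "__main__":
    # Writes ./Modelfile, ready for: ollama create mario -f ./Modelfile
    print("Wrote", write_modelfile(".", text))
```

After running it in the working directory, continue with step 3 as before.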

To call it through the API from code, only the payload needs to change: payload = {"model": "mario", "prompt": "你是谁，请使用中文回答"}

import requests
import json

# Send the API request to get a reply
url = "http://localhost:11434/api/generate"
#payload = {"model": "llama3", "prompt": "中文回答，你是谁"}
# The prompt says: "Who are you? Please answer in Chinese"
payload = {"model": "mario", "prompt": "你是谁，请使用中文回答"}
response = requests.post(url, json=payload, stream=True)  # make sure stream=True is set

# Parse the streamed JSON response
try:
    # The stream contains multiple JSON objects, one per line
    for line in response.iter_lines():
        if line:  # make sure the line is not empty
            try:
                # Parse a single JSON object
                response_json = json.loads(line.decode('utf-8'))
                # Print the new part of the reply without a line break
                if 'response' in response_json:
                    print(response_json['response'], end='', flush=True)
                # Check whether this is the last fragment
                if 'done' in response_json and response_json['done']:
                    print("\nStream ended.")
                    break
            except json.JSONDecodeError as e:
                print("\nJSON decode error:", e)
                continue
except json.JSONDecodeError:
    print("Initial JSON decode error: potentially non-stream response")
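For quick experiments, the role can also be supplied per request instead of being baked into a named model. As far as I know, Ollama's `/api/generate` endpoint accepts an optional `system` field that overrides the system message from the Modelfile. Treat that as an assumption and check it against the API documentation for your installed version. The helper name `build_payload` is hypothetical.

```python
import json
from typing import Optional


def build_payload(model: str, prompt: str, system: Optional[str] = None) -> dict:
    """Build the JSON body for a generate request (hypothetical helper)."""
    payload = {"model": model, "prompt": prompt}
    if system is not None:
        # Assumption: the server accepts a "system" field that replaces
        # the system message defined in the Modelfile.
        payload["system"] = system
    return payload


payload = build_payload(
    "llama3",
    "你是谁，请使用中文回答",  # "Who are you? Please answer in Chinese"
    system="You are Mario from Super Mario Bros. Answer as Mario, the assistant, only.",
)
print(json.dumps(payload, ensure_ascii=False, indent=2))
```

The resulting dictionary would be passed to `requests.post(url, json=payload, stream=True)` exactly as in the code above. The Modelfile route is still the better fit when the same role is reused often, since the role then lives under its own model name.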