Deploying and fine-tuning chatglm3-6b

modelscope: https://modelscope.cn/models/ZhipuAI/chatglm3-6b/files
github: https://github.com/THUDM/ChatGLM3
Image: ubuntu20.04-cuda11.8.0-py38-torch2.0.1-tf2.13.0-1.9.4
GPU: a single V100 with 16 GB of VRAM

Installation

Software dependencies

pip install --upgrade pip
pip install deepspeed -U
pip install 'modelscope>=1.9.0'
pip install protobuf 'transformers>=4.30.2' cpm_kernels 'torch>=2.0' gradio mdtex2html sentencepiece accelerate
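After the installs finish, it can help to confirm that the minimum versions named in the pip commands above were actually met. The following is a minimal sketch using only the standard library; the helper names (parse_version, meets_minimum, check_environment) are made up for this illustration, and the version comparison is deliberately simple (it looks only at the leading numeric components).

```python
import re
from importlib import metadata

# Minimum versions taken from the pip commands above.
MINIMUMS = {
    "modelscope": "1.9.0",
    "transformers": "4.30.2",
    "torch": "2.0",
}


def parse_version(text):
    """Return the leading numeric components, e.g. '2.0.1+cu118' -> (2, 0, 1)."""
    match = re.match(r"\d+(?:\.\d+)*", text)
    if match is None:
        return ()
    return tuple(int(part) for part in match.group(0).split("."))


def meets_minimum(installed, minimum):
    """Compare two version strings component by component, padding with zeros."""
    a, b = parse_version(installed), parse_version(minimum)
    width = max(len(a), len(b))
    a += (0,) * (width - len(a))
    b += (0,) * (width - len(b))
    return a >= b


def check_environment(minimums=MINIMUMS):
    """Map each package name to 'ok', 'too old (<version>)', or 'missing'."""
    report = {}
    for name, minimum in minimums.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            report[name] = "missing"
            continue
        if meets_minimum(installed, minimum):
            report[name] = "ok"
        else:
            report[name] = f"too old ({installed})"
    return report
```

Calling check_environment() in the target environment returns one status per package, so a "missing" or "too old" entry points at the pip command that needs to be rerun.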

Download and inference

from modelscope import AutoTokenizer, AutoModel, snapshot_download

# Download the model files (revision v1.0.2) and get the local directory
model_dir = snapshot_download("ZhipuAI/chatglm3-6b", revision="v1.0.2")
tokenizer = AutoTokenizer.from_pretrained(model_dir, trust_remote_code=True)
# Load the model in half precision and move it to the GPU
model = AutoModel.from_pretrained(model_dir, trust_remote_code=True).half().cuda()
model = model.eval()
# First turn: "Hello"
response, history = model.chat(tokenizer, "你好", history=[])
print(response)
# Second turn, reusing the history: "What should I do if I can't sleep at night?"
response, history = model.chat(tokenizer, "晚上睡不着应该怎么办", history=history)
print(response)
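The two calls above pass the returned history back into the next call, which is what keeps the conversation multi-turn. A minimal sketch of that pattern as a loop is shown here. run_dialogue and EchoModel are hypothetical names introduced for this illustration; EchoModel is only a stand-in so the loop can be tried without a GPU, and its history format is not meant to match the real model's.

```python
def run_dialogue(model, tokenizer, prompts):
    """Send prompts in order, threading the history through each call.

    Relies only on the call shape used above:
    model.chat(tokenizer, query, history=history) -> (response, history)
    """
    history = []
    responses = []
    for prompt in prompts:
        response, history = model.chat(tokenizer, prompt, history=history)
        responses.append(response)
    return responses, history


class EchoModel:
    """Hypothetical stand-in for the real model (assumption, not ChatGLM3)."""

    def chat(self, tokenizer, query, history=None):
        # Copy the history so the caller's list is never mutated.
        history = list(history or [])
        response = f"echo: {query}"
        history.append((query, response))
        return response, history


if __name__ == "__main__":
    replies, final_history = run_dialogue(EchoModel(), None, ["hello", "and then?"])
    print(replies)
    print(len(final_history))
```

With the real model and tokenizer loaded as above, the same function could be called as run_dialogue(model, tokenizer, [...]) in place of the stand-in.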

Fine-tuning

Dataset: https://modelscope.cn/datasets/damo/MSAgent-Bench/summary
Project: https://github.com/modelscope/swift

Downloading the project

mkdir py
git clone https://github.com/modelscope/swift.git
cd swift
# Virtual environment setup (optional)
# python -m venv swift
# source swift/bin/activate

Install the dependencies:

# Skip if already installed
pip install ms-swift
# Skip if already installed
pip install 'modelscope>=1.9.0'
# Set the global pip mirror and install the required Python packages
pip config set global.index-url https://mirrors.aliyun.com/pypi/simple/
git clone https://github.com/modelscope/swift.git
cd swift
pip install '.[llm]'
# The scripts below must be run from this directory
cd examples/pytorch/llm
# If you want to use deepspeed
pip install deepspeed -U
# If you want to use auto_gptq-based QLoRA training (recommended; works better than bnb)
# Models that use auto_gptq: qwen-7b-chat-int4, qwen-14b-chat-int4, qwen-7b-chat-int8, qwen-14b-chat-int8
pip install auto_gptq optimum -U
# If you want to use bnb-based QLoRA training
pip install bitsandbytes -U

The sft.sh script

Place the script in the swift/examples/pytorch/llm/scripts/chatglm3_6b/lora_ddp_ds directory.

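Single GPU: CUDA_VISIBLE_DEVICES=0
Model ID: model_id_or_path ZhipuAI/chatglm3-6b
Model revision: model_revision v1.0.2
dtype: older GPUs such as the V100 do not support bf16, so set this to fp16
Template type: template_type chatglm3
Dataset: dataset damo-agent-mini-zh (DAMO Academy's agent dataset is used here)
lora_rank and lora_alpha: set lora_alpha to twice lora_rank for the best quality (a common rule of thumb; the script below uses 8 and 16)
hub_token: your ModelScope platform token; it only takes effect when push_to_hub is set to True
gradient_accumulation_steps: adjust to your server's performance; on weaker hardware use a relatively small value (the V100 script below uses 16)
Leave all other parameters at their defaults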
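A few of these settings come down to simple arithmetic, which the sketch below spells out. The function names are hypothetical and introduced only for illustration. It assumes the standard LoRA formulation, in which the low-rank update is scaled by lora_alpha / lora_rank, and it treats bf16 as requiring compute capability 8.0 or newer (the V100 is 7.0).

```python
def pick_dtype(compute_capability_major):
    """bf16 needs compute capability 8.0 or newer; older GPUs fall back to fp16."""
    return "bf16" if compute_capability_major >= 8 else "fp16"


def lora_scaling(lora_rank, lora_alpha):
    """In standard LoRA the low-rank update is multiplied by lora_alpha / lora_rank."""
    return lora_alpha / lora_rank


def effective_batch_size(batch_size, gradient_accumulation_steps, nproc_per_node):
    """Number of samples that contribute to one optimizer step across all processes."""
    return batch_size * gradient_accumulation_steps * nproc_per_node


if __name__ == "__main__":
    # Values from the V100 script in this article.
    print(pick_dtype(7))
    print(lora_scaling(lora_rank=8, lora_alpha=16))
    print(effective_batch_size(batch_size=1, gradient_accumulation_steps=16, nproc_per_node=1))
```

On a machine with torch installed, the major number can be read with torch.cuda.get_device_capability(0)[0]. With the script's values, the LoRA scale is 2.0 and each optimizer step sees 16 samples, which is why lowering gradient_accumulation_steps also changes the effective batch size, not just the speed.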

# V100 16 GB, single GPU
nproc_per_node=1

PYTHONPATH=../../.. \
CUDA_VISIBLE_DEVICES=0 \
torchrun \
    --nproc_per_node=$nproc_per_node \
    --master_port 29500 \
    llm_sft.py \
    --model_id_or_path ZhipuAI/chatglm3-6b \
    --model_revision v1.0.2 \
    --sft_type lora \
    --tuner_backend swift \
    --template_type chatglm3 \
    --dtype fp16 \
    --output_dir output \
    --dataset damo-agent-mini-zh \
    --train_dataset_sample -1 \
    --num_train_epochs 1 \
    --max_length 4096 \
    --lora_rank 8 \
    --lora_alpha 16 \
    --lora_dropout_p 0.05 \
    --lora_target_modules AUTO \
    --gradient_checkpointing true \
    --batch_size 1 \
    --weight_decay 0. \
    --learning_rate 1e-4 \
    --gradient_accumulation_steps 16 \
    --max_grad_norm 0.5 \
    --warmup_ratio 0.03 \
    --eval_steps 100 \
    --save_steps 100 \
    --save_total_limit 2 \
    --logging_steps 10 \
    --push_to_hub false \
    --hub_model_id chatglm3-6b-lora \
    --hub_private_repo true \
    --hub_token 'token' \
    --deepspeed_config_path 'ds_config/zero2.json' \
    --only_save_model true
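For readers who prefer to drive the same launch from Python, the command can be assembled from a dictionary. This is only a sketch: build_command, build_env, and launch are hypothetical helpers, the flag names and values are copied from the script above, and the dictionary lists only a subset of the flags, so the remaining ones should be added before relying on it.

```python
import os
import subprocess

# Subset of the flags from sft.sh; add the remaining ones as needed.
SFT_ARGS = {
    "model_id_or_path": "ZhipuAI/chatglm3-6b",
    "model_revision": "v1.0.2",
    "sft_type": "lora",
    "template_type": "chatglm3",
    "dtype": "fp16",
    "dataset": "damo-agent-mini-zh",
    "lora_rank": 8,
    "lora_alpha": 16,
    "batch_size": 1,
    "gradient_accumulation_steps": 16,
}


def build_command(args, nproc_per_node=1, master_port=29500):
    """Turn the dictionary into the torchrun argument list used by the script."""
    cmd = [
        "torchrun",
        f"--nproc_per_node={nproc_per_node}",
        "--master_port", str(master_port),
        "llm_sft.py",
    ]
    for key, value in args.items():
        cmd += [f"--{key}", str(value)]
    return cmd


def build_env(gpu_ids="0", pythonpath="../../.."):
    """Copy the current environment and add the two variables the script sets."""
    env = dict(os.environ)
    env["CUDA_VISIBLE_DEVICES"] = gpu_ids
    env["PYTHONPATH"] = pythonpath
    return env


def launch(args=SFT_ARGS):
    """Start training; meant to be called from swift/examples/pytorch/llm."""
    return subprocess.run(build_command(args), env=build_env(), check=True)
```

Printing " ".join(build_command(SFT_ARGS)) before calling launch() is a cheap way to compare the generated command against the shell script.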

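Running the script

Note: run this from the swift/examples/pytorch/llm directory, and remember to grant execute permission first: chmod +x llm/*.py (the sft.sh script itself also has to be executable, e.g. chmod +x scripts/chatglm3_6b/lora_ddp_ds/sft.sh)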

./scripts/chatglm3_6b/lora_ddp_ds/sft.sh

Common problems

1. GPU driver

RuntimeError: The NVIDIA driver on your system is too old (found version 11080). Please update your GPU driver by downloading and installing a new version from the URL: http://www.nvidia.com/Download/index.aspx Alternatively, go to: https://pytorch.org to install a PyTorch version that has been compiled with your version of the CUDA driver.

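Solution

The error message says the GPU driver is too old, but the real cause may be a torch version that is too new. We use 2.0.1, so check whether your installed version is 2.0.1.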

# Check the torch version
python -c "import torch; print(torch.__version__)"
# Check the CUDA version
nvidia-smi
# Uninstall the version that is too new
pip uninstall torch
# See the official list of matching versions: https://pytorch.org/get-started/previous-versions/
# Example for CUDA 11.8 with PyTorch 2.0.1:
conda install pytorch==2.0.1 torchvision==0.15.2 torchaudio==2.0.2 pytorch-cuda=11.8 -c pytorch -c nvidia
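The number in the error message can be read directly: CUDA encodes versions as 1000 * major + 10 * minor, so 11080 means the driver supports up to CUDA 11.8, and a torch build compiled for a newer CUDA will refuse to run on it. The sketch below decodes that number and compares the installed torch against the 2.0.1 used in this article. The helper names are hypothetical; describe_torch uses torch.__version__, torch.version.cuda, and torch.cuda.is_available(), and needs torch to be importable.

```python
def decode_cuda_version(code):
    """CUDA encodes versions as 1000 * major + 10 * minor, e.g. 11080 -> (11, 8)."""
    return code // 1000, (code % 1000) // 10


def base_version(version):
    """Drop a local build suffix, e.g. '2.0.1+cu118' -> '2.0.1'."""
    return version.split("+", 1)[0]


def torch_matches(installed, expected="2.0.1"):
    """True when the installed torch is the version this article was written for."""
    return base_version(installed) == expected


def describe_torch():
    """Collect version facts from the installed torch (import happens on call)."""
    import torch

    return {
        "torch": torch.__version__,
        "built_for_cuda": torch.version.cuda,
        "cuda_available": torch.cuda.is_available(),
        "matches_2.0.1": torch_matches(torch.__version__),
    }
```

If describe_torch() reports a built_for_cuda value higher than the version decoded from the error message, reinstalling torch 2.0.1 for CUDA 11.8 as shown above is the likely fix.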