LLaMA2-SFT
LLaMA2-SFT: fine-tuning Llama-2-7B (transformers) / LoRA (peft) / inference
GitHub repository
https://github.com/yongzhuo/Llama2-SFT
Prompt
text_1 = f"".join(["[INST] <<SYS>>\n ""You are a helpful, respectful and honest assistant. ""Always answer as helpfully as possible, while being safe."" Your answers should not include any harmful, unethical, racist, sexist, toxic, dangerous, ""or illegal content. Please ensure that your responses are socially unbiased and positive in nature.\n\n"" If a question does not make any sense, or is not factually coherent, ""explain why instead of answering something not correct. ""If you don't know the answer to a question, please don't share false information.\n""<</SYS>>\n\n{0} [/INST] "]).format(data_point.get('instruction', '').strip() +"\t"+ data_point.get('input', '').strip())我们缩短后为
text_1 = f"[INST] <<SYS>>\n You are a helpful, respectful and honest assistant.<</SYS>>" \f"\n\n{0} [/INST] ".format(data_point.get('instruction', '').strip() + "\t" + data_point.get('input', '').strip())
Pitfalls (as of 2023-07-27)
1. The LLaMA2 weights must not be loaded in fp16 (i.e., they must be fp32 or bf16), otherwise values overflow;
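As a guard against this, a minimal loading sketch using the standard transformers API (the model id is an assumption; the repo's actual loading code lives under llama2_sft/ft_llama2):

```python
import torch
from transformers import AutoModelForCausalLM

# Load the weights in bf16 (or leave them fp32); fp16 overflows, per the pitfall above.
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # assumed HF model id; substitute your local path
    torch_dtype=torch.bfloat16,   # NOT torch.float16
)
```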
Fine-tuning example
Code: llama2_sft/ft_llama2
Config: llama2_sft/ft_llama2/config.py (a LoRA setup sketch follows below)
Training: python train.py
Inference: python predict.py
Evaluation: python evaluation.py
API server: python post_api.py
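For reference, a minimal LoRA setup sketch with peft; the hyperparameters and target modules here are assumptions for illustration, and the repo's actual values live in llama2_sft/ft_llama2/config.py:

```python
import torch
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf",   # assumed model id
    torch_dtype=torch.bfloat16,   # see the fp16 pitfall above
)
lora_config = LoraConfig(
    r=8,                                   # matches the r=8 used in the logs below
    lora_alpha=16,                         # assumed value
    lora_dropout=0.05,                     # assumed value
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(base, lora_config)
model.print_trainable_parameters()  # only the LoRA adapter weights are trainable
```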
Experiment logs
Fine-tuning logs (ADVGEN)
Inference example (LoRA, r=8)
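A minimal inference sketch for a trained adapter (the adapter path, sample prompt, and generation settings are assumptions):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-2-7b-hf", torch_dtype=torch.bfloat16)   # assumed model id
model = PeftModel.from_pretrained(base, "./lora_checkpoint")  # assumed adapter dir
tokenizer = AutoTokenizer.from_pretrained("meta-llama/Llama-2-7b-hf")

# Shortened prompt template from the Prompt section above.
prompt = ("[INST] <<SYS>>\n You are a helpful, respectful and honest assistant.<</SYS>>"
          "\n\nWrite an ad for a lightweight down jacket. [/INST] ")
inputs = tokenizer(prompt, return_tensors="pt")
out = model.generate(**inputs, max_new_tokens=128)
print(tokenizer.decode(out[0], skip_special_tokens=True))
```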
References / Acknowledgements
- https://github.com/facebookresearch/llama
- https://github.com/huggingface/peft
- https://github.com/Instruction-Tuning-with-GPT-4/GPT-4-LLM
- https://github.com/THUDM/ChatGLM-6B
- math23k
Disclaimer
The resources in this project are intended for academic research only. When using any part that involves third-party code, strictly follow the corresponding open-source license. Content generated by the model is affected by factors such as model computation, randomness, and precision loss from quantization; this project makes no guarantee of its accuracy. This project assumes no legal liability for any model output, nor for any loss that may arise from using these resources and their outputs.
- For the detailed license governing the model weights, see [facebookresearch/llama](https://github.com/facebookresearch/llama)