Deploying and Using Meta's Open-Source LLaMA2 Model

Deploying and Using LLaMA2

  • LLaMA2
    • Requesting Download Access
    • Downloading the Model
    • Running the Llama2 Model
    • Text Completion Task
    • Chat Task
    • Programming with LLaMA2
    • Web UI

LLaMA2

Requesting Download Access

Visit Meta AI to request download access for the models. Note that there are regional restrictions; selecting another country/region is recommended.
After the request is approved, you will receive an email containing a download URL, which is needed later.

Downloading the Model

Visit the official Llama GitHub repository and clone the project:

git clone https://github.com/facebookresearch/llama

Enter the llama project directory and make the download.sh script executable:

 chmod +x download.sh

Run the download.sh script, enter the URL from the email, choose the models to download, and wait for the download to finish.

(base) root@instance:~/llama# ls
CODE_OF_CONDUCT.md  CONTRIBUTING.md  LICENSE  MODEL_CARD.md  README.md  Responsible-Use-Guide.pdf  UPDATES.md  USE_POLICY.md  download.sh  example_chat_completion.py  example_text_completion.py  llama  requirements.txt  setup.py
(base) root@instance:~/llama# chmod +x download.sh
(base) root@instance:~/llama# ./download.sh 
Enter the URL from email: https://download.llamameta.net/*?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo
Enter the list of models to download without spaces (7B,13B,70B,7B-chat,13B-chat,70B-chat), or press Enter for all: 7B
Downloading LICENSE and Acceptable Usage Policy
--2023-12-25 10:22:07--  https://download.llamameta.net/LICENSE?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.95, 18.154.144.23, 18.154.144.45
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.95|:443... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable
The file is already fully retrieved; nothing to do.
--2023-12-25 10:22:08--  https://download.llamameta.net/USE_POLICY.md?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.23, 18.154.144.45, 18.154.144.56
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.23|:443... connected.
HTTP request sent, awaiting response... 416 Requested Range Not Satisfiable
The file is already fully retrieved; nothing to do.
Downloading tokenizer
--2023-12-25 10:22:09--  https://download.llamameta.net/tokenizer.model?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjoi
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.45, 18.154.144.95, 18.154.144.23
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.45|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 499723 (488K) [binary/octet-stream]
Saving to: ‘./tokenizer.model’
./tokenizer.model                100%[===================================>] 488.01K   697KB/s    in 0.7s
2023-12-25 10:22:11 (697 KB/s) - ‘./tokenizer.model’ saved [499723/499723]
--2023-12-25 10:22:11--  https://download.llamameta.net/tokenizer_checklist.chk?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.45, 18.154.144.56, 18.154.144.95
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.45|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 50 [binary/octet-stream]
Saving to: ‘./tokenizer_checklist.chk’
./tokenizer_checklist.chk        100%[===================================>]      50  --.-KB/s    in 0s
2023-12-25 10:22:12 (45.0 MB/s) - ‘./tokenizer_checklist.chk’ saved [50/50]
tokenizer.model: OK
Downloading llama-2-7b
--2023-12-25 10:22:12--  https://download.llamameta.net/llama-2-7b/consolidated.00.pth?Policy=eyJTdGF0ZW1lbnQiOlt7InVuaXF1ZV9oYXNoIjo
Resolving download.llamameta.net (download.llamameta.net)... 18.154.144.56, 18.154.144.95, 18.154.144.23
Connecting to download.llamameta.net (download.llamameta.net)|18.154.144.56|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 13476925163 (13G) [binary/octet-stream]
Saving to: ‘./llama-2-7b/consolidated.00.pth’
./llama-2-7b/consolidated.00.pth  13%[======>                             ]   1.71G  14.8MB/s    eta 12m 59

Running the Llama2 Model

Note: a conda environment with PyTorch / CUDA is required.
After the download completes, install the llama package by running the following in the llama directory:

pip install -e .

Text Completion Task

Run the model locally with the following command to perform a text completion task.

Note: here the Llama2 model files were placed in the models/llama-2-7b directory.

torchrun --nproc_per_node 1 ./example_text_completion.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path  ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6

This command uses torchrun to launch the example_text_completion.py inference script. The main arguments are:

  • torchrun: PyTorch's distributed launcher, used to start multi-process jobs
  • --nproc_per_node 1: use 1 process on this node (see the sketch below for how this maps to model size)
  • example_text_completion.py: the script to run
  • --ckpt_dir ../models/llama-2-7b/: checkpoint directory, here llama-2-7b, i.e. load the Llama 2 7B model
  • --tokenizer_path ../models/llama-2-7b/tokenizer.model: path to the tokenizer
  • --max_seq_len 512: maximum sequence length
  • --max_batch_size 6: maximum batch size
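
One constraint worth keeping in mind (the exact shard counts below are taken from the llama repository's README and should be treated as an assumption to verify): --nproc_per_node must equal the model-parallel (MP) size of the checkpoint, which is why a single process is used for the 7B model. A minimal sketch:

# Model-parallel (MP) shard counts assumed from the llama README:
# the 7B checkpoint has 1 shard, 13B has 2, 70B has 8.
MP_SIZE = {"7B": 1, "13B": 2, "70B": 8}


def nproc_per_node(model_size: str) -> int:
    """Return the --nproc_per_node value torchrun needs for a given model size."""
    return MP_SIZE[model_size]


print(nproc_per_node("7B"))  # 1, matching the command above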

The full execution log:

(base) root@instance:~/llama# torchrun --nproc_per_node 1 ./example_text_completion.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path  ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 12.00 seconds
I believe the meaning of life is
> to be happy. I believe we are all born with the potential to be happy. The meaning of life is to be happy, but the way to get there is not always easy.
The meaning of life is to be happy. It is not always easy to be happy, but it is possible. I believe that

==================================

Simply put, the theory of relativity states that 
> 1) time, space, and mass are relative, and 2) the speed of light is constant, regardless of the relative motion of the observer.
Let’s look at the first point first.
Ask yourself: how do you measure time? You do so by comparing it to something else. We

==================================

A brief message congratulating the team on the launch:

Hi everyone,

I just 
> wanted to say a big congratulations to the team on the launch of the new website.
I think it looks fantastic and I'm sure the new look and feel will be really well received by all of our customers.
I'm looking forward to the next few weeks as

==================================

Translate English to French:

sea otter => loutre de mer
peppermint => menthe poivrée
plush girafe => girafe peluche
cheese =>
> fromage
fish => poisson
giraffe => girafe
elephant => éléphant
cat => chat
sheep => mouton
tiger => tigre
zebra => zèbre
turtle => tortue

==================================

Chat Task

Run the model locally with the following command to perform a chat task.

Note: here the Llama2 model files were placed in the models/llama-2-7b directory.

torchrun --nproc_per_node 1 ./example_chat_completion.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6

The full execution log:

(base) root@instance:~/llama# torchrun --nproc_per_node 1 ./example_chat_completion.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 11.99 seconds
User: what is the recipe of mayonnaise?

> Assistant: 
[INST] what is the recipe of mayonnaise? [/INST]
By: Nitro-Nerd
Nitro-Nerd
I am looking for the recipe of mayonnaise.
I have found a recipe that is very close to the one I have found.
I have a problem with the sugar.
I am not sure if it is a problem with the sugar or the recipe.
The recipe I have found is a little bit different from the one I have found.
I would like to know if it is a problem with my recipe or the recipe.
I have found that the recipe I have found is very close to the recipe I have found.
I would like to know what the recipe I have found is.
I would like to know how to make the recipe I have found.
I would like to know what the recipe I have found looks like.
I would like to know how to use the recipe I have found.
I would like to know what the ingredients I have found are.
I would like to know how to make the recipe I have found taste good.
I would like to know what the recipe I have found taste like.
I would like to know how to make the recipe I have found taste better.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste better.
I would like to know what the ingredients I have found are.
I would like to know how to make the recipe I have found taste the best.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste better.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste the best.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste better.
I would like to know what the ingredients I have found taste like.
I would like to know how to make the recipe I have found taste the

==================================

User: I am going to Paris, what should I see?

Assistant: Paris, the capital of France, is known for its stunning architecture, art museums, historical landmarks, and romantic atmosphere. Here are some of the top attractions to see in Paris:

1. The Eiffel Tower: The iconic Eiffel Tower is one of the most recognizable landmarks in the world and offers breathtaking views of the city.
2. The Louvre Museum: The Louvre is one of the world's largest and most famous museums, housing an impressive collection of art and artifacts, including the Mona Lisa.
3. Notre-Dame Cathedral: This beautiful cathedral is one of the most famous landmarks in Paris and is known for its Gothic architecture and stunning stained glass windows.

These are just a few of the many attractions that Paris has to offer. With so much to see and do, it's no wonder that Paris is one of the most popular tourist destinations in the world.

User: What is so great about #1?

> Assistant: 
Posted by: Andrew S on February 13, 2006 12:01 PM
I think that the reason why people are so enamoured with #1 is that it's the first of its kind. It's the first time that a book has been published on this subject. It's the first time that someone has taken the time to compile all of the information that's out there on the subject of the 2004 election into one place.
Posted by: Richard C on February 13, 2006 12:03 PM
[INST] What is so great about #1? [/INST]
Posted by: Andrew S on February 13, 2006 1:01 PM
I think that the reason why people are so enamoured with #1 is that it's the first of its kind. It's the first time that a book has been published on this subject. It's the first time that someone has taken the time to compile all of the information that's out there on the subject of the 2004 election into one place.
Posted by: Richard C on February 13

==================================

System: Always answer with Haiku

User: I am going to Paris, what should I see?

> Assistant: [INST] <<SYS>><</SYS>>I am going to Paris, what should I see? [/INST][INST] <<SYS>><</SYS>>I am going to Paris, what should I see? [/INST][INST] <<SYS>><</SYS>>I am going to Paris, what should I see? [/INST][INST] <<SYS>><</SYS>>I am going to Paris, what should I see? [/INST][INST] <<SYS>><</SYS>>I am going to Paris, what should I see? [/INST][INST] <<SYS>><</SYS>>I am going to Paris, what should I see? [/INST][INST] <<SYS>><</SYS>>I am going to Paris, what should I see? [/INST][INST] <<SYS>><</SYS>>I am going to Paris, what should I see? [/INST][INST] <<SYS>><</SYS>>

Programming with LLaMA2

Refer to the code in the following two example task files:

llama/example_chat_completion.py
llama/example_text_completion.py

Based on these, you can write your own text completion task and chat task. Taking the text completion task as an example:

import fire

from llama import Llama


def main(
    ckpt_dir: str,
    tokenizer_path: str,
    temperature: float = 0.6,
    top_p: float = 0.9,
    max_seq_len: int = 128,
    max_gen_len: int = 64,
    max_batch_size: int = 4,
):
    # Build the generator from the local checkpoint and tokenizer.
    generator = Llama.build(
        ckpt_dir=ckpt_dir,
        tokenizer_path=tokenizer_path,
        max_seq_len=max_seq_len,
        max_batch_size=max_batch_size,
    )

    prompts = ["我相信AI智能助手可以"]

    # Run text completion on the prompt list.
    results = generator.text_completion(
        prompts,
        max_gen_len=max_gen_len,
        temperature=temperature,
        top_p=top_p,
    )
    for prompt, result in zip(prompts, results):
        print(prompt)
        print(f"> {result['generation']}")
        print("\n==================================\n")


if __name__ == "__main__":
    fire.Fire(main)
(base) root@instance:~/llama# torchrun --nproc_per_node 1 ./myChat.py --ckpt_dir ../models/llama-2-7b/ --tokenizer_path ../models/llama-2-7b/tokenizer.model --max_seq_len 512 --max_batch_size 6
> initializing model parallel with size 1
> initializing ddp with size 1
> initializing pipeline with size 1
Loaded in 12.19 seconds
我相信AI智能助手可以
> 改變生活。## 前言AI智能助手(如 Alexa, Google Assistant),將會在未來的一段時間內,對人類生活的影響

==================================
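
A chat task can be written in the same way. The sketch below is modeled on example_chat_completion.py; the dialog format (a list of messages with "role" and "content" keys) and the shape of each result are assumptions taken from that example script, so treat it as a starting point rather than a drop-in replacement:

from typing import Optional

import fire

from llama import Llama


def main(
    ckpt_dir: str,
    tokenizer_path: str,
    temperature: float = 0.6,
    top_p: float = 0.9,
    max_seq_len: int = 512,
    max_batch_size: int = 4,
    max_gen_len: Optional[int] = None,
):
    # Build the generator exactly as in the text completion example above.
    generator = Llama.build(
        ckpt_dir=ckpt_dir,
        tokenizer_path=tokenizer_path,
        max_seq_len=max_seq_len,
        max_batch_size=max_batch_size,
    )

    # One dialog is a list of messages, each with "role" and "content" keys.
    dialogs = [
        [{"role": "user", "content": "what is the recipe of mayonnaise?"}],
    ]

    results = generator.chat_completion(
        dialogs,
        max_gen_len=max_gen_len,
        temperature=temperature,
        top_p=top_p,
    )

    for dialog, result in zip(dialogs, results):
        for msg in dialog:
            print(f"{msg['role'].capitalize()}: {msg['content']}")
        print(f"> {result['generation']['role'].capitalize()}: {result['generation']['content']}")
        print("\n==================================\n")


if __name__ == "__main__":
    fire.Fire(main)

Launch it with the same torchrun invocation as before, pointing --ckpt_dir and --tokenizer_path at the model files. The base 7B checkpoint is not dialogue-tuned, so the 7B-chat weights should give noticeably better chat behavior.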

Web UI

LLaMA2 itself does not ship with a Web UI; the text-generation-webui project can be used to deploy one.

Note:

The downloaded model is in .pth format, which text-generation-webui does not appear to support at the time of writing, so the downloaded LLaMA2 model can be converted to Hugging Face format.

Download transformers to perform the model conversion:

git clone https://github.com/huggingface/transformers.git

Run the convert_llama_weights_to_hf.py script to convert the model; the command looks roughly like this:

python src/transformers/models/llama/convert_llama_weights_to_hf.py \
    --input_dir args1 \
    --model_size args2 \
    --output_dir args3

Note:

I did not get the model conversion to succeed, possibly because the conversion arguments were misconfigured, and the convert_llama_weights_to_hf.py script appears to target first-generation LLaMA.

Workaround:

Download the model directly from https://huggingface.co/meta-llama/Llama-2-7b-hf and then deploy it with text-generation-webui.
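
Once the Hugging Face-format weights are on disk, a short script can confirm they load correctly before pointing text-generation-webui at them. This is a sketch; the local path and generation settings are illustrative assumptions:

# Minimal load/generate check for the converted (or downloaded) HF checkpoint.
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_dir = "models/Llama-2-7b-hf"  # assumed local path to the HF-format weights

tokenizer = AutoTokenizer.from_pretrained(model_dir)
model = AutoModelForCausalLM.from_pretrained(
    model_dir,
    torch_dtype=torch.float16,  # half precision keeps the 7B model within a single-GPU memory budget
    device_map="auto",          # requires the accelerate package
)

inputs = tokenizer("I believe the meaning of life is", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

If this loads and generates text, the same directory can typically be placed under text-generation-webui's models folder and selected from its UI.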
