LLM (10) | Tiny-Vicuna-1B: The Top One of the Lightweight Tiny Models Series


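However, nano models (such as the newly released Gemini Nano), tiny models (such as the TinyLlama family), and Microsoft's Phi-1 and Phi-2 take a different path: they aim to make smaller large language models deployable in production, which benefits more companies and individuals.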


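Tiny Vicuna is a Llama model (Vicuna is a large model obtained by fine-tuning Llama on user-shared conversations collected from ShareGPT). This small model is part of the TinyLlama project, which aims to pretrain a 1.1B Llama model on 3 trillion tokens with appropriate optimizations. Because Tiny Vicuna 1B is TinyLlama 1.1B fine-tuned on the WizardVicuna dataset, it is called Tiny Vicuna. Running the quantized version of Tiny Vicuna 1B needs no more than 700 MB of RAM.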


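We will use the quantized GGUF model file of Jiayi-Pan's Tiny-Vicuna-1B from Hugging Face, because even though it is a model with only 1.1 billion parameters, running it at full precision on a CPU requires nearly 10 GB of RAM.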


    mkdir TinyVicuna
    cd TinyVicuna
    python3.10 -m venv venv     # I am using Python 3.10
    python -m venv venv         # if you are on Windows
    # to activate the virtual environment
    source venv/bin/activate    # for Mac
    venv\Scripts\activate       # for Windows users


    pip install llama-cpp-python
    pip install gradio
    pip install psutil
    pip install plotly
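Before moving on, it can help to confirm that the four packages are importable from the active virtual environment. The sketch below uses only the standard library. The helper name `check_dependencies` is hypothetical, and the import names (`llama_cpp`, `gradio`, `psutil`, `plotly`) are assumed to be the module names these pip packages expose.

```python
from importlib.util import find_spec


def check_dependencies(modules):
    """Return {module_name: bool} telling whether each top-level module can be found."""
    return {name: find_spec(name) is not None for name in modules}


if __name__ == "__main__":
    # Assumed import names for the packages installed above.
    required = ["llama_cpp", "gradio", "psutil", "plotly"]
    for name, ok in check_dependencies(required).items():
        print(f"{name}: {'installed' if ok else 'MISSING'}")
```

Any package reported as missing can be reinstalled with the matching `pip install` command before running the rest of the walkthrough.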




    from llama_cpp import Llama

    modelfile = "./tiny-vicuna-1b.q5_k_m.gguf"
    contextlength = 2048
    stoptoken = '<s>'

    ################ LOADING THE MODEL ################
    # Set n_gpu_layers to the number of layers to offload to GPU.
    # Set to 0 if no GPU acceleration is available on your system.
    ###################################################
    llm = Llama(
        model_path=modelfile,   # Download the model file first
        n_ctx=contextlength,    # The max sequence length to use - note that longer sequence lengths require much more resources
        # n_threads=2,          # The number of CPU threads to use, tailor to your system and the resulting performance
    )

    ################ INFERENCE ################
    # prompt, max_new_tokens, temperature, repeat_penalty and top_p
    # must be defined before this call.
    response = llm(
        prompt,
        max_tokens=max_new_tokens,
        stop=['Q:', stoptoken],
        temperature=temperature,
        repeat_penalty=repeat_penalty,
        top_p=top_p,
        echo=False,
    )
    print(response)


Note 1: If you run this on Windows, modelfile does not need the ./ prefix; it simplifies to modelfile = "tiny-vicuna-1b.q5_k_m.gguf"
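The `llm(...)` call returns an OpenAI-style completion dictionary rather than a plain string, so printing `response` shows the whole structure. Below is a minimal sketch of pulling out only the generated text, assuming the usual `choices[0]["text"]` layout. The helper name `extract_text` is hypothetical, and `sample_response` is a hand-written stand-in, not real model output.

```python
def extract_text(response):
    """Return the generated text from a completion-style response dict."""
    choices = response.get("choices", [])
    if not choices:
        return ""
    return choices[0].get("text", "").strip()


# Hand-written stand-in with the same shape as a completion response (illustrative only).
sample_response = {
    "choices": [
        {
            "text": " Science is the systematic study of the natural world.",
            "finish_reason": "stop",
        }
    ]
}

print(extract_text(sample_response))
```

With a real model loaded, `extract_text(response)` could replace `print(response)` when only the answer text is wanted.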





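We look up the Vicuna 1.5 version of the prompt template, because the original model card states that this model is a fine-tuned version of TinyLlama on the WizardVicuna dataset. It should be fully compatible with the Vicuna-v1.5 series.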



A chat between a curious user and an artificial intelligence assistant. The assistant gives helpful, detailed, and polite answers to the user's questions. USER: {prompt} ASSISTANT:


prompt = f"USER: What is Science? ASSISTANT:"
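The template can be filled by a small helper so that every question is wrapped the same way. This is a minimal sketch: the function name `build_vicuna_prompt` and the `system_prompt` parameter are hypothetical, while the system text and the `USER:` / `ASSISTANT:` markers are copied from the template above.

```python
SYSTEM_PROMPT = (
    "A chat between a curious user and an artificial intelligence assistant. "
    "The assistant gives helpful, detailed, and polite answers to the user's questions."
)


def build_vicuna_prompt(user_message, system_prompt=SYSTEM_PROMPT):
    """Fill the Vicuna v1.5 template; pass system_prompt='' for the short form."""
    prefix = f"{system_prompt} " if system_prompt else ""
    return f"{prefix}USER: {user_message} ASSISTANT:"


# Short form, identical to the hand-written prompt in the text.
print(build_vicuna_prompt("What is Science?", system_prompt=""))
```

Keeping the system text as a default argument makes it easy to compare the model's answers with and without it.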



  • General questions;
  • Extracting information from a context (RAG);
  • Formatted output.

3.1 General questions





Morning Routine:
- Wake up at around 6am, shower and get dressed.
- Have breakfast (breakfast can be anything from eggs, bacon or pancakes) and coffee/tea.
- Check emails and work on important tasks for the day.
- Take a quick walk outside to clear his mind before starting work.
- Start working on his daily routine:
   - Opening email and answering any urgent messages.
   - Creating to-do lists for the day's activities.
   - Checking social media accounts for updates or breaking news.
   - Reviewing important documents/files.
   - Making phone calls if necessary.
   - Attending meetings or appointments as needed.
   - Taking breaks and stretching to avoid burnout.
- Lunch:
   - Eating a light lunch, such as sandwiches or salads with hummus or guacamole.
   - Drinking water or tea/coffee.
- Afternoon Routine:
   - Working on any important tasks from the morning's work.
   - Checking emails and answering urgent messages.
   - Taking breaks to stretch, walk around, or do some light exercise.
   - Spending time with family/friends if possible.
- Evening Routine:
   - Watching TV or Netflix before going to bed.
   - Reading a book or watching an episode of his favorite show.
   - Taking a warm bath or reading a book while relaxing in bed.
- Bedtime Routine:
   - Winding down by taking a hot bath, reading a book, and listening to soothing music.
   - Going back to sleep with the lights off.

3.2 Extracting information from a context (RAG)



Reply to the question only using the provided context.
[context]
WHAT IS HIERARCHY 4.0?
whether you own build manage maintain or operate an oil plant inevitably issues arise that require immediate action and resolution. With big data flowing in constantly from all sectors making sense of everything while troubleshooting issues without wasting time can be a huge challenge. So what's the solution?
introducing hierarchy 4.0 and Innovative software solution for control Safety Systems . Hierarchy 4.0 presents an interactive diagram of the entire plant revealing cause and effect Behavior with readings provided in a hierarchical view allowing for a deep understanding of the system's strategy
All data is collected from multiple sources visualized as a diagram and optimized through a customized dashboard allowing users to run a logic simulation from live data or pick a moment from their history. Your simulation is based on actual safety Logics not just on a math model
Now every users can prepare an RCA report 90 percent faster in just a few minutes. Hierarchy can be used for any project phase starting from engineering to commissioning and up to operation and maintenance while supporting hazop Hazard analysis by reducing human error and avoiding incorrect documentation. Hierarchy 4.0 supports plant operators in decision making taking into account both the safety and the operability of their assets. Hierarchy 4.0 Embraces a block log approach: it automatically calculates all Logics affected by an override and gives a full understanding of constraints. Not convinced let's look at the data!
During its first project hierarchy 4.0 prevented a revenue loss of 45 million dollars.
Plants that utilize hierarchy 4.0 save up to 95 of their time and deliver a return on investment up to five times in value from day one and experience a Personnel utilization and plant efficiency increase by 20 percent per year.
Try our demo and make the move to hierarchy 4.0 today
[end of context]
Question: what is Hierarchy 4.0?


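Answer: Hierarchy 4.0 is an innovative software solution for control safety systems that gives a deep understanding of the system's strategy through an interactive diagram and an optimized dashboard. It allows users to run a logic simulation from live data or pick a moment from their history, which reduces human error and avoids incorrect documentation. RCA reports can be prepared in a few minutes using Hierarchy 4.0's block log approach. This supports plant operators in decision making, taking into account both the safety and the operability of their assets.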
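The context-grounded prompt used in this test can be assembled programmatically. The sketch below is one possible way to do it: the function name `build_rag_prompt` is hypothetical, the instruction wording and the `[context]` / `[end of context]` markers are taken from the prompt above, and wrapping the result in `USER: ... ASSISTANT:` is an assumption based on the Vicuna template.

```python
def build_rag_prompt(context, question):
    """Build a context-grounded prompt wrapped in the Vicuna USER/ASSISTANT markers."""
    instruction = (
        "Reply to the question only using the provided context.\n"
        f"[context]\n{context.strip()}\n[end of context]\n"
        f"Question: {question.strip()}"
    )
    return f"USER: {instruction} ASSISTANT:"


# Shortened, illustrative context; the full text from the article would go here.
demo_context = "Hierarchy 4.0 is an innovative software solution for control safety systems."
print(build_rag_prompt(demo_context, "what is Hierarchy 4.0?"))
```

Keeping the context between explicit markers makes it easier to swap in retrieved passages without touching the instruction.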

3.3 Formatted output



[text]
... insert here your text...
... the move to hierarchy 4.0 today
[end of text]
- SUMMARY:



Write the main points of the following text.
[text]
WHAT IS HIERARCHY 4.0? whether you ... the move to hierarchy 4.0 today
[end of text]
format the output as a markdown list.


- Hierarchy 4.0 is an innovative software solution for control safety systems that provides a deep understanding of the system's strategy through interactive diagrams, optimized with a customized dashboard and supported by RCA reports.
- All data is collected from multiple sources visualized as a hierarchical view allowing users to run a logic simulation from live data or pick a moment from their history.
- Your simulation is based on actual safety logics not just on a math model, making it faster and more accurate for any project phase starting from engineering to commissioning and up to operation and maintenance.
- Hierarchy can be used for any project phase starting from engineering to commissioning and up to operation and maintenance while supporting hazop Hazard analysis by reducing human error and avoiding incorrect documentation.
- Hierarchy 4.0 supports plant operators in decision making taking into account both the safety and the operability of their assets, with a block log approach that calculates all Logics affected by an override automatically.
- Not convinced let's look at the data! During its first project hierarchy 4.0 prevented a revenue loss of 45 million dollars, saving up to 95 of their time and delivering a return on investment up to five times in value from day one and experiencing a Personnel utilization and plant efficiency increase by 20 percent per year.
- Try our demo and make the move to hierarchy 4.0 today!
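When the model is asked for a markdown list, the reply can be turned into a Python list for further processing. The sketch below shows one way to build the prompt and parse the reply. Both function names are hypothetical, the prompt wording is copied from the example above, and `sample_output` is a hand-written stand-in, not real model output.

```python
def build_markdown_summary_prompt(text):
    """Ask for the main points of `text`, formatted as a markdown list."""
    return (
        "Write the main points of the following text.\n"
        f"[text]\n{text.strip()}\n[end of text]\n"
        "format the output as a markdown list."
    )


def parse_markdown_list(output):
    """Collect the lines that start with '- ' and return them without the marker."""
    items = []
    for line in output.splitlines():
        stripped = line.strip()
        if stripped.startswith("- "):
            items.append(stripped[2:].strip())
    return items


# Hand-written stand-in for a model reply (illustrative only).
sample_output = "- First point.\n- Second point.\nSome trailing remark."
print(parse_markdown_list(sample_output))
```

Lines that do not start with a list marker are ignored, which gives some tolerance for small models that add extra remarks around the list.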




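  • Prepare a virtual environment and activate it (optional);
  • Install the dependencies as described in Step 2 of Section 2 above;
  • Download the q5 GGUF model file;
  • Download the Python file;
  • Specify the model file on line 39 of the code:

    # only "tiny-vicuna-1b.q5_k_m.gguf" if you are on Windows
    39|  modelfile = "./tiny-vicuna-1b.q5_k_m.gguf"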
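A missing or misnamed model file is an easy mistake at this step, so it can be worth checking the path before the model is loaded. The sketch below uses `pathlib`, which also sidesteps the `./` difference between Windows and other systems. The helper name `resolve_model_file` and the `base_dir` parameter are hypothetical and are not part of the downloaded script.

```python
from pathlib import Path


def resolve_model_file(filename, base_dir="."):
    """Return the model path as a string, or raise if the file does not exist."""
    path = Path(base_dir) / filename
    if not path.is_file():
        raise FileNotFoundError(f"Model file not found: {path.resolve()}")
    return str(path)


if __name__ == "__main__":
    try:
        modelfile = resolve_model_file("tiny-vicuna-1b.q5_k_m.gguf")
        print(f"Using model file: {modelfile}")
    except FileNotFoundError as err:
        print(err)
```

Failing early with the absolute path in the message makes it clear where the script looked for the GGUF file.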


python 40-vicuna1B_PG_MEM.py



  • Close the browser tab;
  • Press ^C (Ctrl+C) in the terminal window.


[1] https://blog.stackademic.com/tiny-vicuna-1b-is-the-lightweight-champion-of-the-tiny-models-f05e459b8018

[2] https://github.com/fabiomatricardi/KingOfTheTiny




