Parsing the model.safetensors.index.json File of the Gemma2 2B Model

When working with the Gemma2 2B model, or any other large pretrained model, the model.safetensors.index.json file serves as an index: it describes the model's structure, how its parameters are stored across files, and how to load specific weights. This post walks through the file's contents and purpose.


1. Overview of the File Structure

The model.safetensors.index.json file has two key parts:

  1. Metadata: contains the total size of the model.
  2. Weight map: maps each model parameter to the file that actually stores it.

Example contents:

{
  "metadata": {
    "total_size": 10457367552
  },
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors"
  }
}
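As a quick sketch, the index can be inspected with Python's standard json module. The snippet below inlines the example entries above as a string for illustration; with a real checkpoint you would `json.load` the file itself.

```python
import json

# A tiny index excerpt, inlined here so the example is self-contained
index_text = '''{
  "metadata": {"total_size": 10457367552},
  "weight_map": {
    "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors"
  }
}'''

index = json.loads(index_text)
total_size = index["metadata"]["total_size"]
weight_map = index["weight_map"]

print(round(total_size / 1e9, 2))        # total size in GB
print(len(weight_map))                   # number of mapped tensors
print(sorted(set(weight_map.values())))  # the shard files referenced
```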

2. Parsing the Metadata

total_size

  • Purpose: the total size, in bytes, of all of the model's parameter files.
  • Example: 10457367552 bytes is approximately 10.46 GB.
  • Significance:
    1. Helps users estimate storage requirements.
    2. Allows checking that a download is complete by comparing against the expected size.

3. Parsing the Weight Map

weight_map

  • Purpose:
    Maps each layer's parameters to a specific .safetensors file.
  • Format:
    • Key: the parameter's name, identifying its position in the model.
    • Value: the .safetensors file that stores that weight.
  • Example entries:
    • model.embed_tokens.weight: the embedding layer's weights are stored in model-00001-of-00003.safetensors.
    • model.layers.0.mlp.up_proj.weight: the up-projection matrix of layer 0's MLP lives in model-00001-of-00003.safetensors.
    • model.layers.10.mlp.down_proj.weight: the down-projection matrix of layer 10's MLP lives in model-00002-of-00003.safetensors.

Uses

  1. Distributed storage: a large model is split into several smaller files that are easier to manage and load.
  2. Incremental updates: parts of the model can be updated without rewriting the whole checkpoint.
  3. Dynamic loading: only the parts of the model that are needed have to be loaded.
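As an illustration of how the mapping can be used, the weight map is easily inverted to list which parameters live in each shard. The map below is a hypothetical excerpt of the real one:

```python
from collections import defaultdict

# Hypothetical excerpt of a weight_map
weight_map = {
    "model.embed_tokens.weight": "model-00001-of-00003.safetensors",
    "model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
}

# Invert the map: which parameters live in which shard?
shard_contents = defaultdict(list)
for param_name, shard_file in weight_map.items():
    shard_contents[shard_file].append(param_name)

for shard, params in sorted(shard_contents.items()):
    print(shard, len(params))
```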

4. The Sharding Mechanism

Why shard at all?

  1. Storage limits: a single huge file may exceed file-system limits.
  2. Loading efficiency: shards can be loaded on demand, improving memory utilization.
  3. Distributed training: multiple GPUs or nodes can process different parameter shards in parallel.

How are shards located?

  • File naming convention: model-<index>-of-<total>.safetensors
    • model-00001-of-00003.safetensors is the first of 3 shards.
  • The index file guarantees a one-to-one mapping between parameter names and shard files.
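A small sketch of parsing that naming convention with a regular expression; the helper name here is ours, not part of any library:

```python
import re

# The conventional shard naming scheme: model-<index>-of-<total>.safetensors
SHARD_RE = re.compile(r"model-(\d{5})-of-(\d{5})\.safetensors")

def parse_shard_name(filename: str) -> tuple[int, int]:
    """Return (shard index, total shard count) parsed from a shard filename."""
    m = SHARD_RE.fullmatch(filename)
    if m is None:
        raise ValueError(f"not a shard filename: {filename}")
    return int(m.group(1)), int(m.group(2))

print(parse_shard_name("model-00001-of-00003.safetensors"))  # (1, 3)
```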

5. The Safetensors Format in Brief

Advantages

  1. Safety: prevents malicious code injection, so weight files load safely.
  2. Efficiency: a binary storage format that supports fast reads and writes.
  3. Cross-platform compatibility: works in both CPU and GPU environments.

Loading example

from safetensors.torch import load_file

# Load a specific shard
weights = load_file("model-00001-of-00003.safetensors")
print(weights.keys())

6. Practical Applications

1. Model loading

  1. Read the shard information from model.safetensors.index.json.
  2. Load only the required shards onto the GPU to reduce memory usage.
  3. Merge the loaded parameters to reconstruct the full model.
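The merging step above can be sketched as follows. The shard loader is injected as a parameter so the same logic would work with safetensors.torch.load_file on real files; here a stub with made-up data stands in so the sketch is self-contained:

```python
def merge_shards(index: dict, load_shard) -> dict:
    """Load every shard referenced by the index and merge into one state dict."""
    merged = {}
    for shard_file in sorted(set(index["weight_map"].values())):
        merged.update(load_shard(shard_file))  # each shard contributes its tensors
    return merged

# With real files one would pass safetensors.torch.load_file as load_shard.
# Here a stub with fabricated values stands in for illustration:
index = {"weight_map": {"a.weight": "model-00001-of-00002.safetensors",
                        "b.weight": "model-00002-of-00002.safetensors"}}
fake_shards = {
    "model-00001-of-00002.safetensors": {"a.weight": [1.0]},
    "model-00002-of-00002.safetensors": {"b.weight": [2.0]},
}
state = merge_shards(index, fake_shards.__getitem__)
print(sorted(state))  # all parameter names, now in one dict
```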

2. File consistency checks

  • Use total_size to verify that the combined size of the downloaded files is correct, ensuring data integrity.
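A minimal sketch of such a check. It assumes the shard sizes have already been collected into a dict (with real files they could come from os.path.getsize on each file in the weight map); the numbers below are made up:

```python
def verify_total_size(index: dict, shard_sizes: dict) -> bool:
    """Check that the summed shard sizes match metadata.total_size.

    shard_sizes maps shard filename -> size in bytes; with real files it
    could be built with os.path.getsize on each file in the weight map.
    """
    expected = index["metadata"]["total_size"]
    actual = sum(shard_sizes[s] for s in set(index["weight_map"].values()))
    return actual == expected

# Illustration with fabricated sizes:
index = {"metadata": {"total_size": 300},
         "weight_map": {"w1": "s1", "w2": "s1", "w3": "s2"}}
print(verify_total_size(index, {"s1": 200, "s2": 100}))  # complete download
print(verify_total_size(index, {"s1": 200, "s2": 99}))   # truncated shard
```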

3. Fine-tuning specific parameters

  • Users can load only the weights of particular layers for fine-tuning, avoiding unnecessary parameters.
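One sketch of that idea: given the weight map, compute the minimal set of shard files that must be opened to obtain a chosen subset of parameters. The map excerpt and the prefix are illustrative:

```python
def shards_for_params(weight_map: dict, wanted_prefixes: tuple) -> set:
    """Return the minimal set of shard files covering the wanted parameters."""
    return {shard for name, shard in weight_map.items()
            if name.startswith(wanted_prefixes)}

# Hypothetical excerpt of a weight_map
weight_map = {
    "model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors",
    "model.layers.10.mlp.down_proj.weight": "model-00002-of-00003.safetensors",
    "model.norm.weight": "model-00003-of-00003.safetensors",
}

# Only shard 1 needs to be read to fine-tune layer 0
print(shards_for_params(weight_map, ("model.layers.0.",)))
```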

7. Summary

The model.safetensors.index.json file is an essential tool for managing the weights of large models, and it is especially relevant to multi-layer networks like Gemma2 2B. Parsing it reveals the model's storage layout, its parameter-sharding strategy, and how to load and manage the weights efficiently.

Key points

  1. The metadata section gives the total size, useful for storage planning and integrity checks.
  2. The weight map records exactly which file stores each parameter, enabling flexible loading.
  3. The safetensors format improves loading speed and safety, making it well suited to distributed deployment of large models.

I hope this post helps you better understand the role and inner workings of the model.safetensors.index.json file and supports your model development and deployment work!

Postscript

Written in Shanghai at 13:45 on December 30, 2024, with the assistance of the GPT-4o large model.

Appendix

Below is the complete model.safetensors.index.json file of the Gemma2 2B model:

{"metadata": {"total_size": 10457367552},"weight_map": {"model.embed_tokens.weight": "model-00001-of-00003.safetensors","model.layers.0.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.0.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.0.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.0.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.0.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.0.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.1.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.1.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.1.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.1.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.1.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.10.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.10.mlp.down_proj.weight": 
"model-00002-of-00003.safetensors","model.layers.10.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.10.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.10.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.10.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.10.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.11.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.11.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.11.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.11.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.11.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.12.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.post_attention_layernorm.weight": 
"model-00002-of-00003.safetensors","model.layers.12.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.12.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.12.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.12.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.13.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.13.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.13.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.13.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.13.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.14.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.14.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.14.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.14.self_attn.k_proj.weight": 
"model-00002-of-00003.safetensors","model.layers.14.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.14.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.15.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.15.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.15.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.15.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.15.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.16.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.16.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.16.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.16.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.16.self_attn.v_proj.weight": 
"model-00002-of-00003.safetensors","model.layers.17.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.17.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.17.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.17.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.17.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.17.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.18.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.18.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.18.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.18.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.18.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.19.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.mlp.gate_proj.weight": 
"model-00002-of-00003.safetensors","model.layers.19.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.19.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.19.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.19.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.19.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.2.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.2.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.2.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.2.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.2.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.2.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.20.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.20.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.20.post_feedforward_layernorm.weight": 
"model-00002-of-00003.safetensors","model.layers.20.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.20.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.20.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.21.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.21.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.21.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.21.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.21.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.22.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.22.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.22.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.22.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.self_attn.o_proj.weight": 
"model-00002-of-00003.safetensors","model.layers.22.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.22.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.23.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.23.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.23.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.23.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.23.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.input_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.24.mlp.down_proj.weight": "model-00003-of-00003.safetensors","model.layers.24.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.mlp.up_proj.weight": "model-00003-of-00003.safetensors","model.layers.24.post_attention_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.24.post_feedforward_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.24.pre_feedforward_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.24.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.24.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.layers.25.input_layernorm.weight": 
"model-00003-of-00003.safetensors","model.layers.25.mlp.down_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.mlp.gate_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.mlp.up_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.post_attention_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.25.post_feedforward_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.25.pre_feedforward_layernorm.weight": "model-00003-of-00003.safetensors","model.layers.25.self_attn.k_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.self_attn.o_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.self_attn.q_proj.weight": "model-00003-of-00003.safetensors","model.layers.25.self_attn.v_proj.weight": "model-00003-of-00003.safetensors","model.layers.3.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.3.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.3.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.3.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.3.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.3.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.4.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.mlp.up_proj.weight": 
"model-00001-of-00003.safetensors","model.layers.4.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.4.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.4.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.4.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.4.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.5.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.5.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.5.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.5.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.5.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.6.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.6.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.6.pre_feedforward_layernorm.weight": 
"model-00001-of-00003.safetensors","model.layers.6.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.6.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.input_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.7.mlp.down_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.mlp.up_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.post_attention_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.7.post_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.7.pre_feedforward_layernorm.weight": "model-00001-of-00003.safetensors","model.layers.7.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.self_attn.q_proj.weight": "model-00001-of-00003.safetensors","model.layers.7.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.8.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.8.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.8.mlp.gate_proj.weight": "model-00001-of-00003.safetensors","model.layers.8.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.8.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.8.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.8.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.8.self_attn.k_proj.weight": "model-00001-of-00003.safetensors","model.layers.8.self_attn.o_proj.weight": "model-00001-of-00003.safetensors","model.layers.8.self_attn.q_proj.weight": 
"model-00001-of-00003.safetensors","model.layers.8.self_attn.v_proj.weight": "model-00001-of-00003.safetensors","model.layers.9.input_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.9.mlp.down_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.mlp.gate_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.mlp.up_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.post_attention_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.9.post_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.9.pre_feedforward_layernorm.weight": "model-00002-of-00003.safetensors","model.layers.9.self_attn.k_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.self_attn.o_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.self_attn.q_proj.weight": "model-00002-of-00003.safetensors","model.layers.9.self_attn.v_proj.weight": "model-00002-of-00003.safetensors","model.norm.weight": "model-00003-of-00003.safetensors"}
}
