Model Deployment - Exporting and Analyzing ONNX - (1) - Exporting ONNX from PyTorch

Exporting and Analyzing ONNX

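1. Exporting ONNX from PyTorch
- 1.1 A simple example: converting a linear model to ONNX
- 1.2 Exporting a model with multiple output heads
- 1.3 Exporting a model with dynamic dimensions
2. What to do when the PyTorch ONNX export fails
- 2.1 Change the opset version
- 2.2 Replace operator combinations in PyTorch
- 2.3 Register ONNX operators in PyTorch
  - 2.3.1 Registration method one
  - 2.3.2 Registration method two
- 2.4 Modify the ONNX graph directly and create a plugin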

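1. Exporting ONNX from PyTorch

1.1 A simple example: converting a linear model to ONNX

First we define a linear model in PyTorch. An nn.Linear layer computes y = x * W^T + b, where x is the input, W is the weight and b is the bias. In other words, it is a matrix multiplication plus an optional bias (the bias is disabled by default in this example).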
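To make the formula concrete, here is a minimal pure-Python sketch (no PyTorch required) of what a bias-free linear layer computes. The weight matrix is the one used in the export example later in this section. The helper name linear_forward and the all-ones sample input are illustrative assumptions, not part of the original code.

```python
def linear_forward(x, weight, bias=None):
    """Compute y = x * W^T + b for a single input vector x.

    weight has shape (out_features, in_features), the same layout nn.Linear uses.
    """
    y = [sum(xi * wi for xi, wi in zip(x, row)) for row in weight]
    if bias is not None:
        y = [yi + bi for yi, bi in zip(y, bias)]
    return y


weights = [
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 3.0, 4.0, 5.0],
    [3.0, 4.0, 5.0, 6.0],
]

print(linear_forward([1.0, 1.0, 1.0, 1.0], weights))  # [10.0, 14.0, 18.0]
print(linear_forward([0.0, 0.0, 0.0, 0.0], weights))  # [0.0, 0.0, 0.0]
```

The second call mirrors the all-zeros dummy input used for the export: its values only serve to trace the graph, so a zero output is expected there.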

class Model(torch.nn.Module):
    def __init__(self, in_features, out_features, weights, bias=False):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias)
        with torch.no_grad():
            self.linear.weight.copy_(weights)

    def forward(self, x):
        x = self.linear(x)
        return x

Next we define a function that exports the model to ONNX:

def export_onnx():
    input   = torch.zeros(1, 1, 1, 4)
    weights = torch.tensor([
        [1, 2, 3, 4],
        [2, 3, 4, 5],
        [3, 4, 5, 6]
    ], dtype=torch.float32)
    model   = Model(4, 3, weights)
    # eval() switches the model to inference mode (affects layers such as
    # Dropout and BatchNorm); it does not by itself freeze the weights
    model.eval()

    torch.onnx.export(
        model         = model,
        args          = (input,),
        f             = "model.onnx",
        input_names   = ["input0"],
        output_names  = ["output0"],
        opset_version = 12)
    print("Finished onnx export")

As you can see, the key is the function torch.onnx.export(), which is the basic way to export ONNX from PyTorch. It takes many parameters, but a few basic ones are enough to export a model:

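model (torch.nn.Module): the PyTorch model to export
args (tuple or Tensor): the input tensor(s) passed to the model, usually wrapped in a tuple
f (str): the file path where the exported model is saved
input_names (list of str): the names of the input nodes
output_names (list of str): the names of the output nodes
opset_version (int): the ONNX operator set (opset) version used for the export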

Finally, here is the complete code to run:

import torch
import torch.nn as nn
import torch.onnx

class Model(torch.nn.Module):
    def __init__(self, in_features, out_features, weights, bias=False):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias)
        with torch.no_grad():
            self.linear.weight.copy_(weights)

    def forward(self, x):
        x = self.linear(x)
        return x

def export_onnx():
    input   = torch.zeros(1, 1, 1, 4)
    weights = torch.tensor([
        [1, 2, 3, 4],
        [2, 3, 4, 5],
        [3, 4, 5, 6]
    ], dtype=torch.float32)
    model   = Model(4, 3, weights)
    # eval() switches the model to inference mode (affects layers such as
    # Dropout and BatchNorm); it does not by itself freeze the weights
    model.eval()

    torch.onnx.export(
        model         = model,
        args          = (input,),
        f             = "model.onnx",
        input_names   = ["input0"],
        output_names  = ["output0"],
        opset_version = 12)
    print("Finished onnx export")

if __name__ == "__main__":
    export_onnx()

After exporting, we can inspect the model with netron. In a terminal, run:

netron model.onnx


1.2 Exporting a model with multiple output heads

Step 1: define a model with multiple outputs:

class Model(torch.nn.Module):
    def __init__(self, in_features, out_features, weights1, weights2, bias=False):
        super().__init__()
        self.linear1 = nn.Linear(in_features, out_features, bias)
        self.linear2 = nn.Linear(in_features, out_features, bias)
        with torch.no_grad():
            self.linear1.weight.copy_(weights1)
            self.linear2.weight.copy_(weights2)

    def forward(self, x):
        x1 = self.linear1(x)
        x2 = self.linear2(x)
        return x1, x2

Step 2: write the function that exports to ONNX:

def export_onnx():
    input    = torch.zeros(1, 1, 1, 4)
    weights1 = torch.tensor([
        [1, 2, 3, 4],
        [2, 3, 4, 5],
        [3, 4, 5, 6]
    ], dtype=torch.float32)
    weights2 = torch.tensor([
        [2, 3, 4, 5],
        [3, 4, 5, 6],
        [4, 5, 6, 7]
    ], dtype=torch.float32)
    model   = Model(4, 3, weights1, weights2)
    # eval() switches the model to inference mode (affects layers such as
    # Dropout and BatchNorm); it does not by itself freeze the weights
    model.eval()

    torch.onnx.export(
        model         = model,
        args          = (input,),
        f             = "model.onnx",
        input_names   = ["input0"],
        output_names  = ["output0", "output1"],
        opset_version = 12)
    print("Finished onnx export")

Compared with example 1.1, the difference in the export call is the output_names argument of torch.onnx.export:
Example 1.1: output_names = ["output0"]
Example 1.2: output_names = ["output0", "output1"]

The complete code to run:

import torch
import torch.nn as nn
import torch.onnx

class Model(torch.nn.Module):
    def __init__(self, in_features, out_features, weights1, weights2, bias=False):
        super().__init__()
        self.linear1 = nn.Linear(in_features, out_features, bias)
        self.linear2 = nn.Linear(in_features, out_features, bias)
        with torch.no_grad():
            self.linear1.weight.copy_(weights1)
            self.linear2.weight.copy_(weights2)

    def forward(self, x):
        x1 = self.linear1(x)
        x2 = self.linear2(x)
        return x1, x2

def export_onnx():
    input    = torch.zeros(1, 1, 1, 4)
    weights1 = torch.tensor([
        [1, 2, 3, 4],
        [2, 3, 4, 5],
        [3, 4, 5, 6]
    ], dtype=torch.float32)
    weights2 = torch.tensor([
        [2, 3, 4, 5],
        [3, 4, 5, 6],
        [4, 5, 6, 7]
    ], dtype=torch.float32)
    model   = Model(4, 3, weights1, weights2)
    # eval() switches the model to inference mode (affects layers such as
    # Dropout and BatchNorm); it does not by itself freeze the weights
    model.eval()

    torch.onnx.export(
        model         = model,
        args          = (input,),
        f             = "model.onnx",
        input_names   = ["input0"],
        output_names  = ["output0", "output1"],
        opset_version = 12)
    print("Finished onnx export")

if __name__ == "__main__":
    export_onnx()

Viewing the model in netron shows that it now has one additional output.

1.3 Exporting a model with dynamic dimensions

The complete code is as follows:

import torch
import torch.nn as nn
import torch.onnx

class Model(torch.nn.Module):
    def __init__(self, in_features, out_features, weights, bias=False):
        super().__init__()
        self.linear = nn.Linear(in_features, out_features, bias)
        with torch.no_grad():
            self.linear.weight.copy_(weights)

    def forward(self, x):
        x = self.linear(x)
        return x

def export_onnx():
    input   = torch.zeros(1, 1, 1, 4)
    weights = torch.tensor([
        [1, 2, 3, 4],
        [2, 3, 4, 5],
        [3, 4, 5, 6]
    ], dtype=torch.float32)
    model   = Model(4, 3, weights)
    # eval() switches the model to inference mode (affects layers such as
    # Dropout and BatchNorm); it does not by itself freeze the weights
    model.eval()

    torch.onnx.export(
        model         = model,
        args          = (input,),
        f             = "model.onnx",
        input_names   = ["input0"],
        output_names  = ["output0"],
        dynamic_axes  = {
            'input0':  {0: 'batch'},
            'output0': {0: 'batch'}
        },
        opset_version = 12)
    print("Finished onnx export")

if __name__ == "__main__":
    export_onnx()

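Compared with example 1.1, the torch.onnx.export call has one extra argument, dynamic_axes, which specifies the dynamic dimensions. In 'input0': {0: 'batch'}, the 0 means that dimension 0 has a dynamic size, and that dimension is given the name 'batch'.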
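As a rough illustration of how to read a dynamic_axes specification, the sketch below applies it to a concrete shape and replaces each dynamic axis with its symbolic name, which is roughly what a viewer such as netron displays. The helper apply_dynamic_axes is a hypothetical name introduced here. It only mimics the resulting shape annotation and is not how the exporter is implemented.

```python
def apply_dynamic_axes(shape, axes):
    """Return the shape with each dynamic axis replaced by its symbolic name."""
    return [axes.get(i, dim) for i, dim in enumerate(shape)]


# Same specification as in the export call above.
dynamic_axes = {"input0": {0: "batch"}, "output0": {0: "batch"}}

input_shape = (1, 1, 1, 4)   # shape of the dummy input used for the export
output_shape = (1, 1, 1, 3)  # Linear(4, 3) maps the last dimension from 4 to 3

print(apply_dynamic_axes(input_shape, dynamic_axes["input0"]))    # ['batch', 1, 1, 4]
print(apply_dynamic_axes(output_shape, dynamic_axes["output0"]))  # ['batch', 1, 1, 3]
```

Axes that are not listed keep the fixed size taken from the dummy input, so only dimension 0 may vary at inference time.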

Viewing the model in netron shows that, compared with example 1.1, dimension 0 is now dynamic and is named 'batch'.

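2. What to do when the PyTorch ONNX export fails

The examples above export directly because the PyTorch and ONNX versions involved both support every operator used. Sometimes, however, the export does not go smoothly. The sections below record some common ways to deal with that.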

2.1 Change the opset version

This is the first thing to try: the opset version may simply be too low to include the operator, so consider raising it.

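For example, if the export fails with an error saying that the current ONNX opset version does not support an operator, search the official operator documentation to check whether a higher version supports it.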

Official operator documentation: https://github.com/onnx/onnx/blob/main/docs/Operators.md

As another example, the Acosh operator is only supported since version 9, so exporting with opset 7 will not work. Upgrading the opset version fixes it.
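The lookup described above can be sketched as a small table check. The dictionary below is a hand-written excerpt, not something read from the onnx package. It contains only the two operators this article mentions (Acosh and Asinh, both introduced in version 9), and the helper names are assumptions made for illustration.

```python
# Hand-written excerpt: the opset version in which each operator first appeared.
SINCE_VERSION = {
    "Acosh": 9,
    "Asinh": 9,
}


def is_supported(op, opset_version):
    """True if the operator exists in the given opset version."""
    return opset_version >= SINCE_VERSION[op]


def min_opset_for(ops):
    """Smallest opset version that contains every operator in ops."""
    return max(SINCE_VERSION[op] for op in ops)


print(is_supported("Acosh", 7))           # False
print(is_supported("Acosh", 9))           # True
print(min_opset_for(["Acosh", "Asinh"]))  # 9
```

In practice the since-version values come from the operator documentation linked above. The point is simply that opset_version must be at least the largest since-version among the operators the model uses.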

2.2 Replace operator combinations in PyTorch

Sometimes an operation in PyTorch has no counterpart in ONNX. In that case we can rewrite it using operators that ONNX does support.
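The article gives no concrete case here, so the following is a hypothetical illustration. Suppose asinh could not be exported. It can be rewritten with the identity asinh(x) = ln(x + sqrt(x^2 + 1)), which needs only elementwise multiply, add, square root and logarithm. The sketch checks the identity with Python's math module. In a model's forward method the same expression would be written with torch.log and torch.sqrt. The function name asinh_replacement is an assumption.

```python
import math


def asinh_replacement(x):
    """asinh(x) written with only multiply, add, sqrt and log."""
    return math.log(x + math.sqrt(x * x + 1.0))


for x in [-2.0, -0.5, 0.0, 0.5, 2.0]:
    reference = math.asinh(x)
    replaced = asinh_replacement(x)
    # The two agree up to floating-point rounding.
    assert math.isclose(reference, replaced, rel_tol=1e-9, abs_tol=1e-12)
    print(f"x={x:5.1f}  asinh={reference:.6f}  replacement={replaced:.6f}")
```

One caveat with this particular rewrite: for large negative x the term x + sqrt(x^2 + 1) suffers from cancellation, so a replacement should always be checked numerically over the input range the model actually sees.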

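2.3 Register ONNX operators in PyTorch

Some operators exist in ONNX but have not been registered in PyTorch, so they need to be registered manually.
For example, suppose we want to export a model that uses the asinh operator: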

import torch
import torch.onnx

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        x = torch.asinh(x)
        return x

def export_norm_onnx():
    input   = torch.rand(1, 5)
    model   = Model()
    model.eval()

    file    = "asinh.onnx"
    torch.onnx.export(
        model         = model,
        args          = (input,),
        f             = file,
        input_names   = ["input0"],
        output_names  = ["output0"],
        opset_version = 9)
    print("Finished normal onnx export")

if __name__ == "__main__":
    export_norm_onnx()

However, the export fails with an error saying that opset_version = 9 does not support this operator.

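Yet a search of the official operator documentation shows that Asinh is supported from version 9.
The problem is that no mapping (no bridge) has been set up between PyTorch's asinh and the ONNX operator, so we write some registration code to register the operator manually.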

2.3.1 Registration method one

The complete code is as follows:

import torch
import torch.onnx
import onnxruntime
from torch.onnx import register_custom_op_symbolic

def asinh_symbolic(g, input, *, out=None):
    return g.op("Asinh", input)

register_custom_op_symbolic('aten::asinh', asinh_symbolic, 12)

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        x = torch.asinh(x)
        return x

def validate_onnx():
    input = torch.rand(1, 5)

    # Inference with PyTorch
    model = Model()
    x     = model(input)
    print("result from Pytorch is :", x)

    # Inference with onnxruntime
    sess  = onnxruntime.InferenceSession('asinh.onnx')
    x     = sess.run(None, {'input0': input.numpy()})
    print("result from onnx is:    ", x)

def export_norm_onnx():
    input   = torch.rand(1, 5)
    model   = Model()
    model.eval()

    file    = "asinh.onnx"
    torch.onnx.export(
        model         = model,
        args          = (input,),
        f             = file,
        input_names   = ["input0"],
        output_names  = ["output0"],
        opset_version = 12)
    print("Finished normal onnx export")

if __name__ == "__main__":
    export_norm_onnx()

    # After defining a custom ONNX operator, always validate the result
    validate_onnx()

The key to this code is the operator registration:

1. Define the asinh_symbolic function

def asinh_symbolic(g, input, *, out=None):
    return g.op("Asinh", input)

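The name asinh_symbolic follows the naming convention for symbolic functions; because the function object is passed explicitly to register_custom_op_symbolic, the binding does not depend on the name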
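g: the graph, i.e. the computation graph to which the ONNX operator is added
input: the parameters of the symbolic function must line up with the parameters of PyTorch's asinh interface
(def asinh(input: Tensor, *, out: Optional[Tensor]=None) -> Tensor: … )
Inside the symbolic function, g.op is called to add the Asinh operator to the ONNX computation graph
The first argument of g.op is the name of the ONNX operator: Asinh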

2. Call the register_custom_op_symbolic function

register_custom_op_symbolic('aten::asinh', asinh_symbolic, 12)

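aten is short for "A Tensor Library", a C++ library that implements tensor operations
asinh is implemented in a C++ namespace named aten
The call binds the symbolic function asinh_symbolic to PyTorch's asinh operator
The first argument of register_custom_op_symbolic is the name of the PyTorch operator: aten::asinh
The last argument is the opset version from which the operator is supported (you can choose it yourself)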

3. After defining a custom ONNX operator, always validate the result, for example with onnxruntime
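The validation code in this article prints both results and leaves the comparison to the reader. A numerical check makes it explicit. The sketch below is a dependency-free assumption of how such a check could look: the helper names flatten and outputs_match, the tolerances, and the sample numbers are all made up for illustration. With real outputs one would pass something like the PyTorch result converted with .tolist() and the first onnxruntime output converted the same way.

```python
import math


def flatten(values):
    """Flatten arbitrarily nested lists/tuples of numbers into a flat list."""
    if isinstance(values, (list, tuple)):
        flat = []
        for v in values:
            flat.extend(flatten(v))
        return flat
    return [values]


def outputs_match(expected, actual, rel_tol=1e-5, abs_tol=1e-6):
    """True if both outputs have the same size and agree element-wise."""
    a, b = flatten(expected), flatten(actual)
    return len(a) == len(b) and all(
        math.isclose(x, y, rel_tol=rel_tol, abs_tol=abs_tol) for x, y in zip(a, b)
    )


# Made-up stand-in values, not real model outputs.
torch_like = [[0.0998, 0.4812, 0.8814]]
onnx_like = [[0.0998001, 0.4812001, 0.8814001]]

print(outputs_match(torch_like, onnx_like))              # True
print(outputs_match(torch_like, [[0.0998, 0.5, 0.9]]))   # False
```

A tolerance is needed because the two runtimes may round floating-point results slightly differently. An exact equality check would report false mismatches.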

2.3.2 Registration method two

import torch
import torch.onnx
import onnxruntime
import functools
# torch.onnx._internal is a private module, so this import may change between PyTorch versions
from torch.onnx._internal import registration

_onnx_symbolic = functools.partial(registration.onnx_symbolic, opset=9)

@_onnx_symbolic('aten::asinh')
def asinh_symbolic(g, input, *, out=None):
    return g.op("Asinh", input)

class Model(torch.nn.Module):
    def __init__(self):
        super().__init__()

    def forward(self, x):
        x = torch.asinh(x)
        return x

def validate_onnx():
    input = torch.rand(1, 5)

    # Inference with PyTorch
    model = Model()
    x     = model(input)
    print("result from Pytorch is :", x)

    # Inference with onnxruntime
    sess  = onnxruntime.InferenceSession('asinh2.onnx')
    x     = sess.run(None, {'input0': input.numpy()})
    print("result from onnx is:    ", x)

def export_norm_onnx():
    input   = torch.rand(1, 5)
    model   = Model()
    model.eval()

    file    = "asinh2.onnx"
    torch.onnx.export(
        model         = model,
        args          = (input,),
        f             = file,
        input_names   = ["input0"],
        output_names  = ["output0"],
        opset_version = 12)
    print("Finished normal onnx export")

if __name__ == "__main__":
    export_norm_onnx()

    # After defining a custom ONNX operator, always validate the result
    validate_onnx()

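Unlike the previous example, this registration style is the same one used in PyTorch's own source files (torch/onnx/symbolic_opset*.py inside the virtual environment).

The operator is registered through registration in torch.onnx._internal, which binds the symbolic function to aten::asinh, the operator implemented in C++ underneath.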

_onnx_symbolic = functools.partial(registration.onnx_symbolic, opset=9)

@_onnx_symbolic('aten::asinh')
def asinh_symbolic(g, input, *, out=None):
    return g.op("Asinh", input)