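Descriptions of the ncnn operators. For the full reference, see
ncnn/docs/developer-guide/ at master · Tencent/ncnn · GitHub
The operators are listed below. (For some of them a PyTorch example is included for comparison; corrections are welcome.)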
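1.AbsVal: computes the absolute value of each element of the input tensor
2.ArgMax: finds the maximum element of the input tensor and returns its index
3.BatchNorm: normalizes the input per channel using stored mean and variance
4.Bias: adds a per-channel bias term to the input
5.BinaryOp: binary (two-operand) operation
6.BNLL: applies the BNLL activation function to the input
7.Cast: type conversion
8.CELU: applies the CELU activation function
9.Clip: limits the elements of the input tensor to a given range
10.Concat: concatenates several input tensors along a given axis
11.Convolution: convolution
12.Convolution1D: 1D convolution
13.Convolution3D: 3D convolution
14.ConvolutionDepthWise: depthwise (grouped) convolution
15.ConvolutionDepthWise1D: depthwise convolution on 1D data
16.ConvolutionDepthWise3D: depthwise convolution on 3D data
17.CopyTo: copies the input data to a given position
18.Crop: cropping
19.CumulativeSum: computes the cumulative sum of the input data
20.Deconvolution: deconvolution (transposed convolution)
21.Deconvolution1D: 1D deconvolution
22.Deconvolution3D: 3D deconvolution
23.DeconvolutionDepthWise: depthwise deconvolution
24.DeconvolutionDepthWise1D: depthwise deconvolution on 1D data
25.DeconvolutionDepthWise3D: 3D depthwise deconvolution
26.DeformableConv2D: deformable convolution, which lets the kernel's sampling positions shift spatially
27.Dequantize: converts quantized data back to floating point
28.Diag: creates a diagonal matrix
29.Dropout: dropout (random deactivation)
30.Eltwise: element-wise operation
31.ELU: applies the exponential linear unit (ELU) activation function
32.Embed: maps input indices to dense vectors (embedding lookup)
33.Exp: computes the exponential of the input data
34.Flatten: flattens the input into one dimension
35.Fold: fold operation
36.GELU: applies the Gaussian error linear unit (GELU) activation function
37.GLU: applies the gated linear unit (GLU) activation function
38.Gemm: performs matrix multiplication
39.GridSample: samples the input at the positions given by a grid
40.GroupNorm: applies group normalization to feature maps
41.GRU: gated recurrent unit (GRU) layer
42.HardSigmoid: applies the hard sigmoid activation function
43.HardSwish: applies the hard swish activation function
44.InnerProduct: fully connected layer
45.Input: input layer of the network
46.InstanceNorm: instance normalization
47.Interp: interpolation
48.LayerNorm: layer normalization
49.Log: computes the natural logarithm of the input data
50.LRN: local response normalization layer
51.LSTM: long short-term memory (LSTM) layer
52.MemoryData: holds constant data stored in the model and outputs it
53.Mish: applies the Mish activation function
54.MultiHeadAttention: multi-head attention
55.MVN: mean-variance normalization
56.Noop: no operation
57.Normalize: normalization
58.Packing: packing operation
59.Padding: padding operation
60.Permute: permutes the dimensions
61.PixelShuffle: pixel shuffle (rearranges channels into spatial positions)
62.Pooling: pooling
63.Pooling1D: 1D pooling
64.Pooling3D: 3D pooling
65.Power: power operation
66.PReLU: parametric rectified linear unit
67.Quantize: quantization
68.Reduction: reduces a tensor along its dimensions
69.ReLU: applies the rectified linear unit (ReLU) activation function
70.Reorg: channel rearrangement operation
71.Requantize: requantization
72.Reshape: reshapes the input
73.RNN: recurrent neural network (RNN) layer
74.Scale: scaling operation
75.SELU: applies the scaled exponential linear unit (SELU), a self-normalizing activation function
76.Shrink: applies a shrink operation to the input data
77.ShuffleChannel: channel shuffle
78.Sigmoid: applies the Sigmoid activation function
79.Slice: splits the input into several parts
80.Softmax: applies the Softmax function, typically used for classification
81.Softplus: applies the Softplus activation function
82.Split: passes the input blob to several outputs (y0, y1, ... = x)
83.Swish: Swish activation function
84.TanH: TanH activation function
85.Threshold: threshold operation
86.Tile: repeats (tiles) the input
87.UnaryOp: applies a unary operation to the input
88.Unfold: unfold operation on the input data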
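1.AbsVal: computes the absolute value of each element of the input tensor.
y = abs(x)
- one_blob_only: takes exactly one input blob
- support_inplace: can overwrite the input blob in place, i.e. x = abs(x)
import torch
input_tensor = torch.tensor([-1, 2, -3, 4, -5])
output_tensor = torch.abs(input_tensor)
print(output_tensor)
# tensor([1, 2, 3, 4, 5])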
2.ArgMax: finds the maximum element of the input tensor and returns its index.
y = argmax(x, out_max_val, topk)
- one_blob_only: takes exactly one input blob
param id | name | type | default | description |
0 | out_max_val | int | 0 | |
1 | topk | int | 1 |
import torch
input_tensor = torch.tensor([10, 5, 8, 20, 15])
output_index = torch.argmax(input_tensor)
print(output_index)
# tensor(3)
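3.BatchNorm: normalizes the input per channel using stored mean and variance.
y = (x - mean) / sqrt(var + eps) * slope + bias
- one_blob_only: takes exactly one input blob
- support_inplace: can overwrite the input blob in place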
param id | name | type | default | description |
0 | channels | int | 0 | |
1 | eps | float | 0.f |
weight | type | shape |
slope_data | float | [channels] |
mean_data | float | [channels] |
var_data | float | [channels] |
bias_data | float | [channels] |
import torch
import torch.nn as nn
batch_norm_layer = nn.BatchNorm1d(3)
input_tensor = torch.randn(2, 3, 4) # batch size 2, 3 channels, sequence length 4
output_tensor = batch_norm_layer(input_tensor)
print(output_tensor)
# tensor([[[-0.5624, 0.9015, -0.9183, 0.3030],
# [ 0.4668, 1.0430, -2.0182, 0.7149],
# [-1.5960, 0.5437, 0.8771, -0.1269]],
# [[-0.1101, -1.4983, 1.9178, -0.0333],
# [-0.1873, -1.1687, 0.7301, 0.4194],
# [ 1.2667, 0.7976, -1.4188, -0.3434]]],
# grad_fn=<NativeBatchNormBackward0>)
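The formula y = (x - mean) / sqrt(var + eps) * slope + bias can also be written out by hand. The following is only a minimal sketch in plain Python, assuming each channel is a flat list of floats; the function name batch_norm and the list layout are illustrative and not part of ncnn.

```python
import math

def batch_norm(x, mean, var, slope, bias, eps=0.0):
    """Apply y = (x - mean) / sqrt(var + eps) * slope + bias channel by channel.

    x is a list of channels; each channel is a flat list of floats (assumed layout).
    """
    out = []
    for c, channel in enumerate(x):
        inv_std = 1.0 / math.sqrt(var[c] + eps)
        out.append([(v - mean[c]) * inv_std * slope[c] + bias[c] for v in channel])
    return out

x = [[1.0, 3.0], [10.0, 20.0]]
y = batch_norm(x, mean=[2.0, 15.0], var=[1.0, 25.0], slope=[1.0, 2.0], bias=[0.0, 1.0])
print(y)  # [[-1.0, 1.0], [-1.0, 3.0]]
```

Unlike nn.BatchNorm1d in training mode, which computes statistics from the batch, this sketch uses fixed per-channel statistics, which is what the mean_data and var_data weights suggest.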
4.Bias: adds a per-channel bias term to the input.
y = x + bias
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | bias_data_size | int | 0 |
weight | type | shape |
bias_data | float | [channels] |
import torch
input_tensor = torch.randn(3, 4)
bias = torch.randn(4)
output_tensor = input_tensor + bias
print('output_tensor:',output_tensor,'\nshape:',output_tensor.shape)
# output_tensor: tensor([[-0.1874, 1.2358, 1.9006, 0.4483],
# [-1.1005, 1.6844, -0.3991, -0.4538],
# [ 0.4519, 2.2752, 1.6041, -1.2463]])
# shape: torch.Size([3, 4])
5.BinaryOp: binary (two-operand) operation
This operation performs an element-wise computation on two operands; how their shapes are matched depends on the broadcasting rule.
C = binaryop(A, B)
if with_scalar = 1:
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | op_type | int | 0 | Operation type as follows |
1 | with_scalar | int | 0 | with_scalar=0 B is a matrix, with_scalar=1 B is a scalar |
2 | b | float | 0.f | When B is a scalar, B = b |
Operation type:
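- 0 = ADD (addition)
- 1 = SUB (subtraction)
- 2 = MUL (multiplication)
- 3 = DIV (division)
- 4 = MAX (maximum)
- 5 = MIN (minimum)
- 6 = POW (power)
- 7 = RSUB (reverse subtraction: B - A)
- 8 = RDIV (reverse division: B / A)
- 9 = RPOW (reverse power: B raised to the power A)
- 10 = ATAN2 (two-argument arctangent: atan2(A, B))
- 11 = RATAN2 (arctangent with the operands swapped: atan2(B, A))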
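The operation types can be illustrated for the with_scalar = 1 case, where B is the scalar b. This is only a sketch in plain Python: the table BINARY_OPS and the helper binary_op_scalar are hypothetical names, and the reverse variants are assumed to simply swap the two operands.

```python
import math

# Operation table indexed by op_type, following the list above.
BINARY_OPS = {
    0: lambda a, b: a + b,              # ADD
    1: lambda a, b: a - b,              # SUB
    2: lambda a, b: a * b,              # MUL
    3: lambda a, b: a / b,              # DIV
    4: max,                             # MAX
    5: min,                             # MIN
    6: lambda a, b: a ** b,             # POW
    7: lambda a, b: b - a,              # RSUB (operands swapped)
    8: lambda a, b: b / a,              # RDIV (operands swapped)
    9: lambda a, b: b ** a,             # RPOW (operands swapped)
    10: math.atan2,                     # ATAN2
    11: lambda a, b: math.atan2(b, a),  # RATAN2 (operands swapped)
}

def binary_op_scalar(values, op_type, b):
    """with_scalar=1 case: combine every element of A with the scalar b."""
    op = BINARY_OPS[op_type]
    return [op(a, b) for a in values]

A = [1.0, 2.0, 4.0]
print(binary_op_scalar(A, 1, 1.0))  # SUB  -> [0.0, 1.0, 3.0]
print(binary_op_scalar(A, 7, 1.0))  # RSUB -> [0.0, -1.0, -3.0]
print(binary_op_scalar(A, 9, 2.0))  # RPOW -> [2.0, 4.0, 16.0]
```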
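6.BNLL: applies the BNLL activation function to the input
BNLL (binomial normal log likelihood) is a smooth activation, the same function as softplus
f(x)=log(1 + exp(x))
It is evaluated piecewise for numerical stability:
y = x + log(1 + e^(-x)), x > 0
y = log(1 + e^x), x <= 0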
- one_blob_only
- support_inplace
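f(x) = log(1 + exp(x)) can be evaluated without overflow by using the identity log(1 + e^x) = x + log(1 + e^(-x)) for positive x. The sketch below shows this in plain Python; the function name bnll is illustrative only.

```python
import math

def bnll(x):
    """Numerically stable log(1 + exp(x))."""
    if x > 0:
        # exp(-x) stays small for large x, so this branch cannot overflow.
        return x + math.log(1.0 + math.exp(-x))
    return math.log(1.0 + math.exp(x))

print(bnll(0.0))     # log(2)
print(bnll(1000.0))  # 1000.0, whereas math.exp(1000.0) alone would overflow
```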
7.Cast: type conversion
y = cast(x)
- one_blob_only
- support_packing
param id | name | type | default | description |
0 | type_from | int | 0 | |
1 | type_to | int | 0 |
Element type:
- 0 = auto
- 1 = float32
- 2 = float16
- 3 = int8
- 4 = bfloat16
import torch
input_tensor = torch.tensor([1.5, 2.3, 3.7])
output_tensor = input_tensor.type(torch.int32)
print(output_tensor)
# tensor([1, 2, 3], dtype=torch.int32)
8.CELU: applies the CELU activation function.
if x < 0 y = (exp(x / alpha) - 1.f) * alpha
else y = x
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | alpha | float | 1.f |
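import torch
import torch.nn.functional as F
input_tensor = torch.randn(3, 4)
# F.celu implements CELU; with alpha=1.0 it gives the same values as F.elu
output_tensor = F.celu(input_tensor, alpha=1.0)
print('output_tensor:',output_tensor,'\nshape:',output_tensor.shape)
# output_tensor: tensor([[-0.5924, 0.7810, 1.1752, 0.8274],
# [-0.6871, 0.0466, 0.9411, -0.7082],
# [-0.8632, -0.1801, -0.8730, 0.9515]])
# shape: torch.Size([3, 4])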
9.Clip: limits the elements of the input tensor to a given range.
y = clamp(x, min, max)
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | min | float | -FLT_MAX | |
1 | max | float | FLT_MAX |
import torch
input_tensor = torch.randn(2, 3)
output_tensor = torch.clamp(input_tensor, min=-0.5, max=0.5)
print(output_tensor)
# tensor([[-0.5000, -0.5000, -0.5000],
# [ 0.5000, -0.4091, -0.5000]])
10.Concat: concatenates several input tensors along a given axis.
y = concat(x0, x1, x2, ...) by axis
param id | name | type | default | description |
0 | axis | int | 0 |
import torch
input_tensor1 = torch.randn(2, 3)
input_tensor2 = torch.randn(2, 3)
output_tensor = torch.cat((input_tensor1, input_tensor2), dim=1)
print('output_tensor:',output_tensor,'\nshape:',output_tensor.shape)
# output_tensor: tensor([[-2.4431, -0.6428, 0.4434, 1.2216, -1.1874, -1.1327],
# [-0.8082, -0.3552, 0.9945, -0.7679, 0.6547, -1.0401]])
# shape: torch.Size([2, 6])
11.Convolution: convolution
x2 = pad(x, pads, pad_value)
x3 = conv(x2, weight, kernel, stride, dilation) + bias
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
8 | int8_scale_term | int | 0 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top | |
18 | pad_value | float | 0.f | |
19 | dynamic_weight | int | 0 |
weight | type | shape |
weight_data | float/fp16/int8 | [kernel_w, kernel_h, num_input, num_output] |
bias_data | float | [num_output] |
weight_data_int8_scales | float | [num_output] |
bottom_blob_int8_scales | float | [1] |
top_blob_int8_scales | float | [1] |
import torch
import torch.nn as nn
conv_layer = nn.Conv2d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
input_tensor = torch.randn(1, 3, 32, 32)
output_tensor = conv_layer(input_tensor)
print(output_tensor.shape)
# torch.Size([1, 16, 32, 32])
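The output size along one axis follows from the kernel, stride, dilation and padding parameters. The sketch below uses the standard convolution output-size formula and assumes explicit, non-negative pads; the function name conv_output_size is hypothetical, and any special padding modes ncnn may support are not covered.

```python
def conv_output_size(in_size, kernel, stride=1, dilation=1, pad_before=0, pad_after=0):
    """Output length along one axis of a padded, strided, dilated convolution."""
    # A dilated kernel covers dilation * (kernel - 1) + 1 input positions.
    effective_kernel = dilation * (kernel - 1) + 1
    return (in_size + pad_before + pad_after - effective_kernel) // stride + 1

print(conv_output_size(32, kernel=3, stride=1, pad_before=1, pad_after=1))  # 32
print(conv_output_size(32, kernel=3))                                       # 30
```

The first call matches the 32x32 output of the padded nn.Conv2d example above; the second shows that the same kernel without padding shrinks 32 to 30.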
12.Convolution1D: 1D convolution
x2 = pad(x, pads, pad_value)
x3 = conv1d(x2, weight, kernel, stride, dilation) + bias
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
15 | pad_right | int | pad_left | |
18 | pad_value | float | 0.f | |
19 | dynamic_weight | int | 0 |
weight | type | shape |
weight_data | float/fp16/int8 | [kernel_w, num_input, num_output] |
bias_data | float | [num_output] |
import torch
import torch.nn as nn
conv_layer = nn.Conv1d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
input_tensor = torch.randn(1, 3, 32)
output_tensor = conv_layer(input_tensor)
print(output_tensor.shape)
# torch.Size([1, 16, 32])
13.Convolution3D: 3D convolution
x2 = pad(x, pads, pad_value)
x3 = conv3d(x2, weight, kernel, stride, dilation) + bias
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top | |
17 | pad_behind | int | pad_front | |
18 | pad_value | float | 0.f | |
21 | kernel_d | int | kernel_w | |
22 | dilation_d | int | dilation_w | |
23 | stride_d | int | stride_w | |
24 | pad_front | int | pad_left |
weight | type | shape |
weight_data | float/fp16/int8 | [kernel_w, kernel_h, kernel_d, num_input, num_output] |
bias_data | float | [num_output] |
import torch
import torch.nn as nn
conv_layer = nn.Conv3d(in_channels=3, out_channels=16, kernel_size=3, stride=1, padding=1)
input_tensor = torch.randn(1, 3, 32, 32, 32)
output_tensor = conv_layer(input_tensor)
print(output_tensor.shape)
# torch.Size([1, 16, 32, 32, 32])
14.ConvolutionDepthWise: depthwise (grouped) convolution
x2 = pad(x, pads, pad_value)
x3 = conv(x2, weight, kernel, stride, dilation, group) + bias
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
7 | group | int | 1 | |
8 | int8_scale_term | int | 0 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top | |
18 | pad_value | float | 0.f | |
19 | dynamic_weight | int | 0 |
weight | type | shape |
weight_data | float/fp16/int8 | [kernel_w, kernel_h, num_input / group, num_output / group, group] |
bias_data | float | [num_output] |
weight_data_int8_scales | float | [group] |
bottom_blob_int8_scales | float | [1] |
top_blob_int8_scales | float | [1] |
import torch
import torch.nn as nn
conv_dw_layer = nn.Conv2d(in_channels=3, out_channels=3, kernel_size=3, groups=3)
input_tensor = torch.randn(1, 3, 32, 32)
output_tensor = conv_dw_layer(input_tensor)
print(output_tensor.shape)
# torch.Size([1, 3, 30, 30])
15.ConvolutionDepthWise1D: depthwise convolution on 1D data.
x2 = pad(x, pads, pad_value)
x3 = conv1d(x2, weight, kernel, stride, dilation, group) + bias
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
7 | group | int | 1 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
15 | pad_right | int | pad_left | |
18 | pad_value | float | 0.f | |
19 | dynamic_weight | int | 0 |
weight | type | shape |
weight_data | float/fp16/int8 | [kernel_w, num_input / group, num_output / group, group] |
bias_data | float | [num_output] |
import torch
import torch.nn as nn
# Define a 1D depthwise convolution layer (groups equals in_channels)
conv_dw_layer = nn.Conv1d(in_channels=3, out_channels=3, kernel_size=3, groups=3)
# Create a random input tensor of shape (batch_size, channels, sequence_length)
input_tensor = torch.randn(1, 3, 10)
# Pass the input tensor through the depthwise convolution layer
output_tensor = conv_dw_layer(input_tensor)
print(output_tensor.shape)
# torch.Size([1, 3, 8])
16.ConvolutionDepthWise3D: depthwise convolution on 3D data.
x2 = pad(x, pads, pad_value)
x3 = conv3d(x2, weight, kernel, stride, dilation, group) + bias
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
7 | group | int | 1 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
15 | pad_right | int | pad_left | |
18 | pad_value | float | 0.f | |
19 | dynamic_weight | int | 0 |
weight | type | shape |
weight_data | float/fp16/int8 | [kernel_w, num_input / group, num_output / group, group] |
bias_data | float | [num_output] |
17.CopyTo: copies the input data to a given position
self[offset] = src
- one_blob_only
param id | name | type | default | description |
0 | woffset | int | 0 | |
1 | hoffset | int | 0 | |
13 | doffset | int | 0 | |
2 | coffset | int | 0 | |
9 | starts | array | [ ] | |
11 | axes | array | [ ] |
18.Crop: cropping
y = crop(x)
- one_blob_only
param id | name | type | default | description |
0 | woffset | int | 0 | |
1 | hoffset | int | 0 | |
13 | doffset | int | 0 | |
2 | coffset | int | 0 | |
3 | outw | int | 0 | |
4 | outh | int | 0 | |
14 | outd | int | 0 | |
5 | outc | int | 0 | |
6 | woffset2 | int | 0 | |
7 | hoffset2 | int | 0 | |
15 | doffset2 | int | 0 | |
8 | coffset2 | int | 0 | |
9 | starts | array | [ ] | |
10 | ends | array | [ ] | |
11 | axes | array | [ ] |
import torch
# Create a 3x3 tensor
tensor = torch.tensor([[1, 2, 3],[4, 5, 6],[7, 8, 9]])
# Crop it by selecting a sub-region
cropped_tensor = tensor[1:, 1:]
print(cropped_tensor)
# tensor([[5, 6],
# [8, 9]])
19.CumulativeSum: computes the cumulative sum of the input data.
If axis < 0, we use axis = x.dims + axis
It implements torch.cumsum (see the PyTorch documentation).
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | axis | int | 0 |
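The axis rule can be sketched for a 2D input in plain Python. The helper cumulative_sum_2d is a hypothetical name, and the axis numbering is assumed to follow the torch.cumsum convention (axis 0 runs down the rows, axis 1 along each row).

```python
from itertools import accumulate

def cumulative_sum_2d(x, axis=0):
    """Cumulative sum of a 2D list along the given axis."""
    dims = 2
    if axis < 0:
        axis = dims + axis  # e.g. axis=-1 becomes axis=1
    if axis == 1:
        return [list(accumulate(row)) for row in x]
    # axis == 0: accumulate down each column, then transpose back
    cols = [list(accumulate(col)) for col in zip(*x)]
    return [list(row) for row in zip(*cols)]

x = [[1, 2, 3], [4, 5, 6]]
print(cumulative_sum_2d(x, axis=-1))  # [[1, 3, 6], [4, 9, 15]]
print(cumulative_sum_2d(x, axis=0))   # [[1, 2, 3], [5, 7, 9]]
```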
20.Deconvolution: deconvolution (transposed convolution)
x2 = deconv(x, weight, kernel, stride, dilation) + bias
x3 = depad(x2, pads, pad_value)
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top | |
18 | output_pad_right | int | 0 | |
19 | output_pad_bottom | int | output_pad_right | |
20 | output_w | int | 0 | |
21 | output_h | int | output_w | |
28 | dynamic_weight | int | 0 |
weight | type | shape |
weight_data | float/fp16 | [kernel_w, kernel_h, num_input, num_output] |
bias_data | float | [num_output] |
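The output size of a deconvolution along one axis can be estimated from the same parameters. The sketch below mirrors the output-size formula documented for PyTorch's ConvTranspose layers; that ncnn follows it in every case (for example when output_w is set explicitly) is an assumption, and the function name deconv_output_size is hypothetical.

```python
def deconv_output_size(in_size, kernel, stride=1, dilation=1,
                       pad_before=0, pad_after=0, output_pad=0):
    """Output length along one axis of a transposed convolution."""
    # Size before "depad": each input step moves by stride, plus one dilated kernel.
    full = (in_size - 1) * stride + dilation * (kernel - 1) + 1
    # Remove the pads, then add the extra output padding.
    return full - pad_before - pad_after + output_pad

print(deconv_output_size(16, kernel=4, stride=2, pad_before=1, pad_after=1))  # 32
print(deconv_output_size(5, kernel=3))                                        # 7
```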
21.Deconvolution1D: 1D deconvolution
x2 = deconv1d(x, weight, kernel, stride, dilation) + bias
x3 = depad(x2, pads, pad_value)
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
15 | pad_right | int | pad_left | |
18 | output_pad_right | int | 0 | |
20 | output_w | int | 0 | |
28 | dynamic_weight | int | 0 |
weight | type | shape |
weight_data | float/fp16 | [kernel_w, num_input, num_output] |
bias_data | float | [num_output] |
22.Deconvolution3D: 3D deconvolution
x2 = deconv3d(x, weight, kernel, stride, dilation) + bias
x3 = depad(x2, pads, pad_value)
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top | |
17 | pad_behind | int | pad_front | |
18 | output_pad_right | int | 0 | |
19 | output_pad_bottom | int | output_pad_right | |
20 | output_pad_behind | int | output_pad_right | |
21 | kernel_d | int | kernel_w | |
22 | dilation_d | int | dilation_w | |
23 | stride_d | int | stride_w | |
24 | pad_front | int | pad_left | |
25 | output_w | int | 0 | |
26 | output_h | int | output_w | |
27 | output_d | int | output_w |
weight | type | shape |
weight_data | float/fp16 | [kernel_w, kernel_h, kernel_d, num_input, num_output] |
bias_data | float | [num_output] |
23.DeconvolutionDepthWise: depthwise deconvolution.
x2 = deconv(x, weight, kernel, stride, dilation, group) + bias
x3 = depad(x2, pads, pad_value)
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
7 | group | int | 1 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top | |
18 | output_pad_right | int | 0 | |
19 | output_pad_bottom | int | output_pad_right | |
20 | output_w | int | 0 | |
21 | output_h | int | output_w | |
28 | dynamic_weight | int | 0 |
weight | type | shape |
weight_data | float/fp16 | [kernel_w, kernel_h, num_input / group, num_output / group, group] |
bias_data | float | [num_output] |
24.DeconvolutionDepthWise1D: depthwise deconvolution on 1D data.
x2 = deconv1d(x, weight, kernel, stride, dilation, group) + bias
x3 = depad(x2, pads, pad_value)
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
7 | group | int | 1 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
15 | pad_right | int | pad_left | |
18 | output_pad_right | int | 0 | |
20 | output_w | int | 0 | |
28 | dynamic_weight | int | 0 |
weight | type | shape |
weight_data | float/fp16 | [kernel_w, num_input / group, num_output / group, group] |
bias_data | float | [num_output] |
25.DeconvolutionDepthWise3D: 3D depthwise deconvolution
x2 = deconv3d(x, weight, kernel, stride, dilation, group) + bias
x3 = depad(x2, pads, pad_value)
y = activation(x3, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
7 | group | int | 1 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top | |
17 | pad_behind | int | pad_front | |
18 | output_pad_right | int | 0 | |
19 | output_pad_bottom | int | output_pad_right | |
20 | output_pad_behind | int | output_pad_right | |
21 | kernel_d | int | kernel_w | |
22 | dilation_d | int | dilation_w | |
23 | stride_d | int | stride_w | |
24 | pad_front | int | pad_left | |
25 | output_w | int | 0 | |
26 | output_h | int | output_w | |
27 | output_d | int | output_w |
weight | type | shape |
weight_data | float/fp16 | [kernel_w, kernel_h, kernel_d, num_input / group, num_output / group, group] |
bias_data | float | [num_output] |
26.DeformableConv2D: deformable convolution, which lets the kernel's sampling positions shift spatially.
x2 = deformableconv2d(x, offset, mask, weight, kernel, stride, dilation) + bias
y = activation(x2, act_type, act_params)
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
5 | bias_term | int | 0 | |
6 | weight_data_size | int | 0 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top |
weight | type | shape |
weight_data | float/fp16/int8 | [kernel_w, kernel_h, num_input, num_output] |
bias_data | float | [num_output] |
27.Dequantize: converts quantized data back to floating point.
y = x * scale + bias
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | scale_data_size | int | 1 | |
1 | bias_data_size | int | 0 |
weight | type | shape |
scale_data | float | [scale_data_size] |
bias_data | float | [bias_data_size] |
import torch
# Assume quantized_tensor holds data quantized as 8-bit unsigned integers
quantized_tensor = torch.tensor([0, 1, 2, 3], dtype=torch.uint8)
# Dequantize: convert the quantized integer data to floating point
dequantized_tensor = quantized_tensor.float()
print(dequantized_tensor)
# tensor([0., 1., 2., 3.])
import torch
# Assume quantized_weights holds weights quantized as 8-bit signed integers
quantized_weights = torch.tensor([-1, 0, 1, 2], dtype=torch.int8)
# Dequantize: multiply the integer data by the scale factor
scale = 0.01 # quantization scale
dequantized_weights = quantized_weights.float() * scale
print(dequantized_weights)
# tensor([-0.0100, 0.0000, 0.0100, 0.0200])
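The full formula y = x * scale + bias, including the bias term and per-channel scales, can be sketched in plain Python. It is assumed here that a scale_data or bias_data of size 1 is shared by all channels and that otherwise there is one entry per channel; the function name dequantize and the list layout are illustrative only.

```python
def dequantize(x, scale_data, bias_data=None):
    """y = x * scale + bias for a list of channels of quantized integers."""
    out = []
    for c, channel in enumerate(x):
        # Assumption: a single entry is broadcast to every channel.
        scale = scale_data[0] if len(scale_data) == 1 else scale_data[c]
        if not bias_data:
            bias = 0.0
        else:
            bias = bias_data[0] if len(bias_data) == 1 else bias_data[c]
        out.append([v * scale + bias for v in channel])
    return out

q = [[-2, 0, 4], [1, 2, 3]]
print(dequantize(q, scale_data=[0.5]))
# [[-1.0, 0.0, 2.0], [0.5, 1.0, 1.5]]
print(dequantize(q, scale_data=[0.5, 0.25], bias_data=[0.0, 1.0]))
# [[-1.0, 0.0, 2.0], [1.25, 1.5, 1.75]]
```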
28.Diag: creates a diagonal matrix.
y = diag(x, diagonal)
- one_blob_only
param id | name | type | default | description |
0 | diagonal | int | 0 |
import torch
# Create a diagonal matrix whose diagonal elements are [1, 2, 3]
diagonal_elements = torch.tensor([1, 2, 3])
diagonal_matrix = torch.diag(diagonal_elements)
print(diagonal_matrix)
# tensor([[1, 0, 0],
# [0, 2, 0],
# [0, 0, 3]])
29.Dropout: dropout (random deactivation)
y = x * scale
- one_blob_only
param id | name | type | default | description |
0 | scale | float | 1.f |
import torch
import torch.nn as nn
# A small network with two fully connected layers and one Dropout layer
class MyModel(nn.Module):
    def __init__(self):
        super(MyModel, self).__init__()
        self.fc1 = nn.Linear(10, 5)
        self.dropout = nn.Dropout(p=0.5) # Dropout layer with drop probability 0.5
        self.fc2 = nn.Linear(5, 2)

    def forward(self, x):
        x = self.fc1(x)
        x = self.dropout(x) # apply Dropout to the output of fc1
        x = torch.relu(x)
        x = self.fc2(x)
        return x
# Create a model instance
model = MyModel()
# During training, model.train() enables Dropout
model.train()
# Example input
input_data = torch.randn(1, 10) # a tensor of shape (1, 10)
# Forward pass
output = model(input_data)
print(output)
# tensor([[0.7759, 0.4466]], grad_fn=<AddmmBackward0>)
30.Eltwise: element-wise operation
y = elementwise_op(x0, x1, ...)
param id | name | type | default | description |
0 | op_type | int | 0 | |
1 | coeffs | array | [ ] |
Operation type:
- 0 = PROD
- 1 = SUM
- 2 = MAX
import torch
# Create two tensors
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])
# 0 = PROD: element-wise product
prod_result = torch.mul(a, b)
print("Elementwise product result:", prod_result)
# Elementwise product result: tensor([ 4, 10, 18])
# 1 = SUM: element-wise sum
sum_result = torch.add(a, b)
print("Elementwise sum result:", sum_result)
# Elementwise sum result: tensor([5, 7, 9])
# 2 = MAX: element-wise maximum
max_result = torch.maximum(a, b)
print("Elementwise max result:", max_result)
# Elementwise max result: tensor([4, 5, 6])
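The coeffs parameter is not covered by the example above. A plausible reading, stated here as an assumption rather than confirmed behaviour, is that coeffs weights each input of the SUM operation and that an empty array means a plain sum. A minimal sketch in plain Python, with the hypothetical helper eltwise_sum:

```python
def eltwise_sum(inputs, coeffs=None):
    """Element-wise sum of several equally sized lists, optionally weighted.

    Assumption: coeffs[i] scales inputs[i]; an empty coeffs means all weights are 1.
    """
    if not coeffs:
        coeffs = [1.0] * len(inputs)
    return [sum(c * v for c, v in zip(coeffs, vals)) for vals in zip(*inputs)]

a = [1.0, 2.0, 3.0]
b = [4.0, 5.0, 6.0]
print(eltwise_sum([a, b]))               # [5.0, 7.0, 9.0]
print(eltwise_sum([a, b], [1.0, -1.0]))  # [-3.0, -3.0, -3.0]
```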
31.ELU: applies the exponential linear unit (ELU) activation function.
if x < 0 y = (exp(x) - 1) * alpha
else y = x
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | alpha | float | 0.1f |
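The ELU formula differs from the CELU formula in section 8 only in where alpha enters the exponent. The sketch below writes both out in plain Python for comparison; the function names elu and celu are illustrative, and the default alpha values are taken from the two parameter tables.

```python
import math

def elu(x, alpha=0.1):
    """ELU: (exp(x) - 1) * alpha for x < 0, otherwise x."""
    return (math.exp(x) - 1.0) * alpha if x < 0 else x

def celu(x, alpha=1.0):
    """CELU: (exp(x / alpha) - 1) * alpha for x < 0, otherwise x."""
    return (math.exp(x / alpha) - 1.0) * alpha if x < 0 else x

print(elu(2.0), celu(2.0))               # positive inputs pass through: 2.0 2.0
print(elu(-1.0, alpha=1.0) == celu(-1.0, alpha=1.0))  # True: identical when alpha is 1
print(elu(-1.0, alpha=2.0), celu(-1.0, alpha=2.0))    # different once alpha is not 1
```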
32.Embed: maps input indices to dense vectors (embedding lookup).
y = embedding(x)
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | input_dim | int | 0 | |
2 | bias_term | int | 0 | |
3 | weight_data_size | int | 0 |
weight | type | shape |
weight_data | float | [weight_data_size] |
bias_term | float | [num_output] |
import torch
import torch.nn as nn
# Suppose there are 10 distinct words, each mapped to a 5-dimensional dense vector
vocab_size = 10
embedding_dim = 5
# Create an Embedding layer
embedding = nn.Embedding(num_embeddings=vocab_size, embedding_dim=embedding_dim)
# Define an input: look up the vectors of the words with IDs 3 and 7
input_ids = torch.LongTensor([3, 7])
# Get the vector representation of each word from the Embedding layer
output = embedding(input_ids)
print(output)
# tensor([[-0.4583, 2.2385, 1.1503, 0.4575, -0.5081],
# [ 2.1852, -1.2893, 0.6631, 0.1552, 1.6735]],
# grad_fn=<EmbeddingBackward0>)
33.Exp: computes the exponential of the input data.
if base == -1 y = exp(shift + x * scale)
else y = pow(base, (shift + x * scale))
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | base | float | -1.f | |
1 | scale | float | 1.f | |
2 | shift | float | 0.f |
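The two branches of the formula can be written out directly. A minimal sketch in plain Python, where the function name exp_layer is illustrative and the defaults are taken from the parameter table:

```python
import math

def exp_layer(x, base=-1.0, scale=1.0, shift=0.0):
    """base == -1 selects the natural exponential; any other base uses pow."""
    if base == -1.0:
        return math.exp(shift + x * scale)
    return math.pow(base, shift + x * scale)

print(exp_layer(0.0))                                   # 1.0
print(exp_layer(3.0, base=2.0))                         # 8.0
print(exp_layer(1.0, base=10.0, scale=2.0, shift=1.0))  # 1000.0
```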
34.Flatten: flattens the input into one dimension.
Reshape blob to 1 dimension
- one_blob_only
import torch
# Create a 3D tensor of shape (2, 3, 4)
input_tensor = torch.randn(2, 3, 4)
# Flatten the tensor with torch.flatten()
output_tensor1 = torch.flatten(input_tensor, start_dim=0)
# Flatten the tensor with view()
output_tensor2 = input_tensor.view(2*3*4)
print("Input Tensor shape:", input_tensor.shape)
print("Flattened Tensor shape:", output_tensor1.shape)
print("view Tensor shape:", output_tensor2.shape)
# Input Tensor shape: torch.Size([2, 3, 4])
# Flattened Tensor shape: torch.Size([24])
# view Tensor shape: torch.Size([24])
35.Fold: fold operation
y = fold(x)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top | |
20 | output_w | int | 0 | |
21 | output_h | int | output_w |
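import torch
# Create a 4x4 tensor
x = torch.arange(1, 17).view(4, 4)
print("Original tensor:")
print(x)
# Original tensor:
# tensor([[ 1, 2, 3, 4],
# [ 5, 6, 7, 8],
# [ 9, 10, 11, 12],
# [13, 14, 15, 16]])
# Rearrange the 4x4 = 16 elements into another shape, e.g. 2x8, 8x2, 1x16 or 2x2x2x2.
# Note: view() only reshapes and serves here as a rough illustration; a true fold (as in torch.nn.Fold) sums overlapping sliding blocks back into an image.
folded_tensor = x.view(2,2,2,2)
print("Folded tensor:")
print(folded_tensor)
# Folded tensor:
# tensor([[[[ 1, 2],
# [ 3, 4]],
# [[ 5, 6],
# [ 7, 8]]],
# [[[ 9, 10],
# [11, 12]],
# [[13, 14],
# [15, 16]]]])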
36.GELU: applies the Gaussian error linear unit (GELU) activation function.
if fast_gelu == 1 y = 0.5 * x * (1 + tanh(0.79788452 * (x + 0.044715 * x * x * x)));
else y = 0.5 * x * erfc(-0.70710678 * x)
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | fast_gelu | int | 0 | use approximation |
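Both branches of the formula can be evaluated with the standard library, which makes it easy to see how close the tanh approximation is. A minimal sketch, with gelu as an illustrative function name:

```python
import math

def gelu(x, fast_gelu=0):
    """GELU: exact form via erfc, or the tanh approximation when fast_gelu == 1."""
    if fast_gelu == 1:
        return 0.5 * x * (1.0 + math.tanh(0.79788452 * (x + 0.044715 * x * x * x)))
    return 0.5 * x * math.erfc(-0.70710678 * x)

for v in (-2.0, -0.5, 0.0, 1.0, 3.0):
    # Print the exact value, the approximation, and their difference.
    exact, fast = gelu(v), gelu(v, fast_gelu=1)
    print(v, exact, fast, abs(exact - fast))
```

The constants 0.79788452 and 0.70710678 are sqrt(2 / pi) and 1 / sqrt(2), rounded to eight digits.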
37.GLU: applies the gated linear unit (GLU) activation function.
If axis < 0, we use axis = x.dims + axis
y = a * sigmoid(b)
where a is the first half of the input matrix and b is the second half.
axis specifies the dimension to split the input
- one_blob_only
param id | name | type | default | description |
0 | axis | int | 0 |
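For a 1D input the split is unambiguous, so GLU can be sketched in a few lines. This assumes the usual GLU definition, in which the first half a is multiplied element-wise by sigmoid of the second half b; the helper names sigmoid and glu_1d are illustrative.

```python
import math

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def glu_1d(x):
    """Split x into halves a and b, then return a * sigmoid(b) element-wise."""
    half = len(x) // 2
    a, b = x[:half], x[half:]
    return [ai * sigmoid(bi) for ai, bi in zip(a, b)]

print(glu_1d([1.0, 2.0, 0.0, 0.0]))  # [0.5, 1.0], since sigmoid(0) is 0.5
```

The output has half as many elements as the input along the split axis.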
38.Gemm: performs matrix multiplication.
a = transA ? transpose(x0) : x0
b = transB ? transpose(x1) : x1
c = x2
y = (gemm(a, b) + c * beta) * alpha
param id | name | type | default | description |
0 | alpha | float | 1.f | |
1 | beta | float | 1.f | |
2 | transA | int | 0 | |
3 | transB | int | 0 | |
4 | constantA | int | 0 | |
5 | constantB | int | 0 | |
6 | constantC | int | 0 | |
7 | constantM | int | 0 | |
8 | constantN | int | 0 | |
9 | constantK | int | 0 | |
10 | constant_broadcast_type_C | int | 0 | |
11 | output_N1M | int | 0 | |
12 | output_elempack | int | 0 | |
13 | output_elemtype | int | 0 | |
14 | output_transpose | int | 0 | |
20 | constant_TILE_M | int | 0 | |
21 | constant_TILE_N | int | 0 | |
22 | constant_TILE_K | int | 0 |
weight | type | shape |
A_data | float | [M, K] or [K, M] |
B_data | float | [N, K] or [K, N] |
C_data | float | [1], [M] or [N] or [1, M] or [N,1] or [N, M] |
import torch
# Create two matrices
A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])
# Perform the matrix multiplication
C = torch.matmul(A, B)
print("Matrix A:")
print(A)
print("Matrix B:")
print(B)
print("Result of Matrix Multiplication:")
print(C)
# Matrix A:
# tensor([[1, 2],
# [3, 4]])
# Matrix B:
# tensor([[5, 6],
# [7, 8]])
# Result of Matrix Multiplication:
# tensor([[19, 22],
# [43, 50]])
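The matmul example covers only the product itself. The complete rule y = (gemm(a, b) + c * beta) * alpha, including the optional transposes, can be sketched in plain Python. The helper names are hypothetical, c is assumed to have the same shape as the product (no broadcasting), and trans_a / trans_b stand in for the two transpose flags.

```python
def transpose(m):
    return [list(row) for row in zip(*m)]

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gemm(x0, x1, x2, alpha=1.0, beta=1.0, trans_a=0, trans_b=0):
    """y = (a @ b + c * beta) * alpha with optional transposes of the inputs."""
    a = transpose(x0) if trans_a else x0
    b = transpose(x1) if trans_b else x1
    ab = matmul(a, b)
    return [[(ab[i][j] + x2[i][j] * beta) * alpha for j in range(len(ab[0]))]
            for i in range(len(ab))]

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
ones = [[1, 1], [1, 1]]
zeros = [[0, 0], [0, 0]]
print(gemm(A, B, ones, alpha=2.0, beta=0.5))  # [[39.0, 45.0], [87.0, 101.0]]
print(gemm(A, B, zeros, trans_b=1))           # [[17.0, 23.0], [39.0, 53.0]]
```

Note that alpha scales the sum of both terms, so c is effectively multiplied by alpha * beta.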
39.GridSample: samples the input at the positions given by a grid.
Samples the input tensor at the coordinates specified by a sampling grid and outputs the interpolated result.
Given an input and a flow-field grid, computes the output using input values and pixel locations from grid. For each output location output[:, h2, w2], the size-2 vector grid[h2, w2, 2] specifies input pixel[:, h1, w1] locations x and y, which are used to interpolate the output value output[:, h2, w2]. This function is often used in conjunction with affine_grid() to build Spatial Transformer Networks.
param id | name | type | default | description |
0 | sample_type | int | 1 | |
1 | padding_mode | int | 1 | |
2 | align_corner | int | 0 | |
3 | permute_fusion | int | 0 | fuse with permute |
Sample type:
- 1 = Nearest
- 2 = Bilinear
- 3 = Bicubic
Padding mode:
- 1 = zeros
- 2 = border
- 3 = reflection
# Quoted example
import torch
from torch.nn import functional as F
inp = torch.randint(10, 20, (1, 1, 20, 20)).float()
print('inp.shape:', inp.shape)
# inp is a tensor of height and width 20
out_h = 40
out_w = 40
# Generate the grid points
grid_h = torch.linspace(-1, 1, out_h).view(1, -1, 1).expand(1, out_h, out_w)
grid_w = torch.linspace(-1, 1, out_w).view(1, 1, -1).expand(1, out_h, out_w)
# grid has shape [1, 40, 40, 2]. F.grid_sample reads grid[..., 0] as x (width) and
# grid[..., 1] as y (height), so stacking (grid_h, grid_w) in this order samples the transposed image.
grid = torch.stack((grid_h, grid_w), dim=3)
outp = F.grid_sample(inp, grid=grid, mode='bilinear')
print(outp.shape)
print("Input tensor:")
print(inp)
print("Output tensor after grid sampling:")
print(outp)
# inp.shape: torch.Size([1, 1, 20, 20])
# torch.Size([1, 1, 40, 40])
# Input tensor:
# tensor([[[[16., 17., 16., 10., 16., 11., 13., 17., 16., 15., 10., 10., 13., 17.,
# 11., 19., 12., 11., 10., 12.],
# [12., 15., 17., 16., 13., 13., 16., 19., 18., 10., 11., 13., 19., 14.,
# 14., 18., 14., 11., 10., 15.],
# [12., 11., 18., 10., 15., 15., 17., 10., 10., 14., 18., 15., 12., 16.,
# 10., 18., 16., 16., 10., 16.],
# [17., 17., 12., 11., 16., 16., 10., 16., 17., 16., 13., 10., 18., 18.,
# 17., 17., 17., 10., 16., 19.],
# [14., 15., 16., 19., 12., 12., 11., 10., 16., 12., 16., 10., 17., 10.,
# 12., 18., 19., 13., 13., 16.],
# [15., 19., 17., 18., 15., 16., 15., 10., 19., 15., 11., 16., 18., 14.,
# 19., 10., 13., 16., 18., 19.],
# [13., 13., 14., 11., 15., 13., 18., 14., 10., 13., 13., 11., 17., 13.,
# 17., 13., 10., 12., 14., 10.],
# [12., 10., 17., 16., 17., 10., 18., 15., 14., 13., 13., 10., 17., 16.,
# 19., 13., 14., 10., 17., 12.],
# [12., 14., 18., 15., 16., 14., 13., 14., 13., 13., 17., 11., 15., 18.,
# 19., 14., 12., 14., 12., 14.],
# [12., 13., 17., 14., 18., 16., 14., 16., 14., 15., 19., 13., 19., 17.,
# 12., 18., 15., 12., 16., 11.],
# [10., 19., 12., 13., 12., 17., 14., 13., 19., 19., 12., 13., 17., 17.,
# 14., 17., 11., 14., 18., 12.],
# [10., 19., 19., 11., 16., 16., 15., 17., 10., 13., 16., 10., 17., 10.,
# 15., 11., 11., 17., 15., 17.],
# [13., 12., 10., 11., 11., 16., 16., 16., 10., 10., 13., 19., 14., 13.,
# 18., 15., 12., 19., 14., 16.],
# [16., 13., 11., 11., 12., 16., 12., 16., 10., 16., 11., 19., 19., 12.,
# 11., 15., 11., 15., 12., 17.],
# [17., 12., 17., 10., 15., 12., 13., 16., 14., 15., 19., 17., 17., 12.,
# 10., 18., 19., 12., 15., 13.],
# [10., 15., 16., 10., 13., 19., 17., 19., 18., 18., 12., 14., 13., 12.,
# 18., 17., 12., 17., 14., 17.],
# [13., 10., 15., 19., 19., 14., 11., 14., 11., 13., 19., 10., 10., 13.,
# 16., 11., 15., 13., 18., 15.],
# [19., 10., 15., 15., 13., 13., 15., 13., 15., 18., 13., 10., 14., 10.,
# 13., 14., 16., 12., 17., 12.],
# [12., 10., 17., 15., 19., 12., 19., 11., 14., 19., 16., 11., 17., 14.,
# 15., 12., 12., 14., 18., 15.],
# [12., 15., 14., 18., 19., 19., 17., 11., 11., 12., 13., 19., 17., 19.,
# 10., 17., 15., 18., 14., 10.]]]])
# Output tensor after grid sampling:
# tensor([[[[ 4.0000, 7.9744, 6.9487, ..., 6.0000, 6.0000, 3.0000],
# [ 8.0064, 15.9619, 13.9237, ..., 12.0048, 12.0376, 6.0192],
# [ 8.2628, 16.4878, 14.9757, ..., 12.1954, 13.5432, 6.7885],
# ...,
# [ 5.4744, 10.9670, 11.6967, ..., 14.4545, 12.1599, 6.0513],
# [ 5.9872, 12.0123, 13.5311, ..., 12.6727, 10.1152, 5.0256],
# [ 3.0000, 6.0192, 6.7885, ..., 6.3141, 5.0321, 2.5000]]]])
40.GroupNorm: applies group normalization to feature maps.
split x along channel axis into group x0, x1 ...
normalize each group x0, x1 ... by its mean and variance
y = x * gamma + beta
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | group | int | 1 | |
1 | channels | int | 0 | |
2 | eps | float | 0.001f | x = x / sqrt(var + eps) |
3 | affine | int | 1 |
weight | type | shape |
gamma_data | float | [channels] |
beta_data | float | [channels] |
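The three steps above can be sketched in plain Python. This is a minimal illustration, not ncnn's implementation: the function name `group_norm` and the data layout (a list of channels, each a flat list of values) are assumptions made for the example.

```python
import math

def group_norm(x, group, gamma, beta, eps=0.001):
    """x: list of channels, each a flat list of floats (hypothetical layout)."""
    channels = len(x)
    per_group = channels // group
    out = []
    for g in range(group):
        chans = x[g * per_group:(g + 1) * per_group]
        # Statistics are shared by all channels of the group
        values = [v for c in chans for v in c]
        mean = sum(values) / len(values)
        var = sum((v - mean) ** 2 for v in values) / len(values)
        inv_std = 1.0 / math.sqrt(var + eps)
        for i, c in enumerate(chans):
            q = g * per_group + i  # absolute channel index for gamma/beta
            out.append([(v - mean) * inv_std * gamma[q] + beta[q] for v in c])
    return out

x = [[1.0, 2.0], [3.0, 4.0], [10.0, 20.0], [30.0, 40.0]]
y = group_norm(x, group=2, gamma=[1.0] * 4, beta=[0.0] * 4)
print(y)
```

With gamma = 1 and beta = 0, the values of each group sum to approximately zero after normalization.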
import torch
import torch.nn as nn

# Define an input tensor
input_tensor = torch.randn(1, 6, 4, 4)  # (batch_size, num_channels, height, width)
# Apply GroupNorm with the 6 channels split into 2 groups
num_groups = 2
group_norm = nn.GroupNorm(num_groups, 6)  # (number of groups, number of input channels)
output = group_norm(input_tensor)
# Print the input and output shapes
print("Input shape:", input_tensor.shape)
print("Output shape after GroupNorm:", output.shape)
# Input shape: torch.Size([1, 6, 4, 4])
# Output shape after GroupNorm: torch.Size([1, 6, 4, 4])
41.GRU: gated recurrent unit (GRU) layer.
Apply a single-layer GRU to a feature sequence of T
timesteps. The input blob shape is [w=input_size, h=T]
and the output blob shape is [w=num_output, h=T]
y = gru(x)
y0, hidden y1 = gru(x0, hidden x1)
- one_blob_only if bidirectional
param id | name | type | default | description |
0 | num_output | int | 0 | hidden size of output |
1 | weight_data_size | int | 0 | total size of weight matrix |
2 | direction | int | 0 | 0=forward, 1=reverse, 2=bidirectional |
weight | type | shape |
weight_xc_data | float/fp16/int8 | [input_size, num_output * 3, num_directions] |
bias_c_data | float/fp16/int8 | [num_output, 4, num_directions] |
weight_hc_data | float/fp16/int8 | [num_output, num_output * 3, num_directions] |
Direction flag:
- 0 = forward only
- 1 = reverse only
- 2 = bidirectional
import torch
import torch.nn as nn

# Input size 3, hidden size 4
input_size = 3
hidden_size = 4
# Define a GRU layer (num_layers defaults to 1, i.e. a single layer)
gru = nn.GRU(input_size, hidden_size)
# Input sequence of length 2 with batch size 1
input_seq = torch.randn(2, 1, 3)  # (seq_len, batch_size, input_size)
# Initial hidden state
hidden = torch.zeros(1, 1, 4)  # (num_layers, batch_size, hidden_size)
# Feed the sequence through the GRU layer
output, hidden = gru(input_seq, hidden)
# Print the shapes of the output and the hidden state
print("Output shape:", output.shape)  # (seq_len, batch_size, num_directions * hidden_size)
print("Hidden state shape:", hidden.shape)  # (num_layers * num_directions, batch_size, hidden_size)
# Output shape: torch.Size([2, 1, 4])
# Hidden state shape: torch.Size([1, 1, 4])
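42.HardSigmoid: applies the hard sigmoid activation function.
It is commonly used in neural networks to limit the activation range of neurons. Compared with the standard Sigmoid function, HardSigmoid is a piecewise-linear approximation that is cheaper to compute.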
y = clamp(x * alpha + beta, 0, 1)
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | alpha | float | 0.2f | |
1 | beta | float | 0.5f |
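A plain-Python sketch of the formula with its two parameters. Note that PyTorch's `F.hardsigmoid` computes clip(x / 6 + 0.5, 0, 1), which corresponds to alpha = 1/6 rather than the default alpha = 0.2 listed in the table. The helper name `hard_sigmoid` is hypothetical.

```python
def hard_sigmoid(x, alpha=0.2, beta=0.5):
    """y = clamp(x * alpha + beta, 0, 1) for a single float."""
    return min(max(x * alpha + beta, 0.0), 1.0)

samples = (-3.0, 0.0, 1.0, 3.0)
# Defaults from the parameter table (alpha = 0.2, beta = 0.5)
ncnn_default = [hard_sigmoid(v) for v in samples]
# alpha = 1/6 reproduces the PyTorch definition relu6(x + 3) / 6
torch_style = [hard_sigmoid(v, alpha=1.0 / 6.0) for v in samples]
print(ncnn_default)
print(torch_style)
```

The two parameterizations agree at x = 0 and in the saturated regions but differ in slope, so the alpha value matters when comparing ncnn output with PyTorch output.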
import torch
import torch.nn.functional as F

# Define the input tensor (3x4)
input_tensor = torch.randn(3, 4)
# Apply HardSigmoid. PyTorch computes clip(x / 6 + 0.5, 0, 1), i.e. alpha = 1/6, not the ncnn default alpha = 0.2
output = F.hardsigmoid(input_tensor)
# Print the input and output tensors
print("Input tensor:")
print(input_tensor)
# Input tensor:
# tensor([[ 0.5026,  0.6612, -0.0961,  1.9332],
#         [-0.8780, -0.4930, -0.2804, -0.0440],
#         [ 1.2866, -1.9575,  0.7738, -0.8340]])
print("\nOutput tensor after HardSigmoid:")
print(output)
# Output tensor after HardSigmoid:
# tensor([[0.5838, 0.6102, 0.4840, 0.8222],
#         [0.3537, 0.4178, 0.4533, 0.4927],
#         [0.7144, 0.1738, 0.6290, 0.3610]])
43.HardSwish: applies the hard swish activation function.
y = x * clamp(x * alpha + beta, 0, 1)
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | alpha | float | 0.2f | |
1 | beta | float | 0.5f |
import torch
import torch.nn.functional as F

# Define HardSwish as x * hardsigmoid(x), where hardsigmoid(x) = clip(x / 6 + 0.5, 0, 1)
def hardswish(x):
    return x * F.hardsigmoid(x)

# Create an input tensor
input_tensor = torch.tensor([-2.0, -1.0, 0.0, 1.0, 2.0])
# Apply the HardSwish activation
output = hardswish(input_tensor)
# Print the input and output tensors
print("Input tensor:")
print(input_tensor)
print("\nOutput tensor after HardSwish:")
print(output)
# Input tensor:
# tensor([-2., -1.,  0.,  1.,  2.])
# Output tensor after HardSwish:
# tensor([-0.3333, -0.3333,  0.0000,  0.6667,  1.6667])
44.InnerProduct: fully connected (inner product) layer.
x2 = innerproduct(x, weight) + bias
y = activation(x2, act_type, act_params)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | bias_term | int | 0 | |
2 | weight_data_size | int | 0 | |
8 | int8_scale_term | int | 0 | |
9 | activation_type | int | 0 | |
10 | activation_params | array | [ ] |
weight | type | shape |
weight_data | float/fp16/int8 | [num_input, num_output] |
bias_data | float | [num_output] |
weight_data_int8_scales | float | [num_output] |
bottom_blob_int8_scales | float | [1] |
import torch
import torch.nn as nn

class InnerProduct(nn.Module):
    def __init__(self, in_features, out_features):
        super(InnerProduct, self).__init__()
        self.fc = nn.Linear(in_features, out_features)

    def forward(self, x):
        return self.fc(x)

# Create an InnerProduct layer with 100 input features and 200 output features
inner_product_layer = InnerProduct(100, 200)
# Define the input: 1 sample with 100 features
input_data = torch.randn(1, 100)
# Run the InnerProduct layer
output = inner_product_layer(input_data)
print(output.shape)  # shape of the output features
# torch.Size([1, 200])
45.Input: input layer of the network.
y = input
- support_inplace
param id | name | type | default | description |
0 | w | int | 0 | |
1 | h | int | 0 | |
11 | d | int | 0 | |
2 | c | int | 0 |
46.InstanceNorm: applies instance normalization.
split x along channel axis into instance x0, x1 ...
normalize each channel instance x0, x1 ... by its mean and variance
y = x * gamma + beta
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | channels | int | 0 | |
1 | eps | float | 0.001f | x = x / sqrt(var + eps) |
2 | affine | int | 1 |
weight | type | shape |
gamma_data | float | [channels] |
beta_data | float | [channels] |
import torch
import torch.nn as nn

# Create an instance normalization layer for 3 channels
instance_norm_layer = nn.InstanceNorm2d(3)
# Random feature maps as input: batch of 1, 3 channels, 224x224
input_data = torch.randn(1, 3, 224, 224)
# Run the instance normalization layer
output = instance_norm_layer(input_data)
print(output.shape)  # shape of the output features
# torch.Size([1, 3, 224, 224])
47.Interp: interpolation (resize).
if dynamic_target_size == 0 y = resize(x) by fixed size or scale
else y = resize(x0, size(x1))
- one_blob_only if dynamic_target_size == 0
param id | name | type | default | description |
0 | resize_type | int | 0 | |
1 | height_scale | float | 1.f | |
2 | width_scale | float | 1.f | |
3 | output_height | int | 0 | |
4 | output_width | int | 0 | |
5 | dynamic_target_size | int | 0 | |
6 | align_corner | int | 0 |
Resize type:
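- 1 = Nearest (nearest-neighbor interpolation): each output pixel takes the value of the closest input pixel. It is a simple pixel-level mapping, but it can produce jagged edges.
- 2 = Bilinear (bilinear interpolation): each output pixel is linearly interpolated from the four nearest input pixels, which gives smoother results than nearest-neighbor interpolation.
- 3 = Bicubic (bicubic interpolation): each output pixel is a weighted average of the 16 surrounding input pixels. It preserves more image detail at a higher computational cost.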
import torch
import torch.nn.functional as F

# Random feature map as input: batch of 1, 3 channels, 224x224
input_data = torch.randn(1, 3, 224, 224)
# Resize to 300x300 with bilinear interpolation
output = F.interpolate(input_data, size=(300, 300), mode='bilinear', align_corners=False)
print(output.shape)  # shape of the output features
# torch.Size([1, 3, 300, 300])
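48.LayerNorm: applies layer normalization.
Layer normalization is a normalization technique for neural networks. Unlike Batch Normalization, it normalizes the features of each individual sample rather than across the whole batch. It helps reduce internal covariate shift, which can speed up training and improve generalization.
split x along outmost axis into part x0, x1 ...
normalize each part x0, x1 ... by its mean and variance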
y = x * gamma + beta by elementwise
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | affine_size | int | 0 | |
1 | eps | float | 0.001f | x = x / sqrt(var + eps) |
2 | affine | int | 1 |
weight | type | shape |
gamma_data | float | [affine_size] |
beta_data | float | [affine_size] |
import torch
import torch.nn as nn

# Create a layer normalization module for features of size 256
layer_norm = nn.LayerNorm(256)
# Random input: 4 samples, each with 256 features
input_data = torch.randn(4, 256)
# Run the layer normalization module
output = layer_norm(input_data)
print(output.shape)  # shape of the output features
# torch.Size([4, 256])
49.Log: computes the logarithm of the input (natural logarithm by default).
if base == -1 y = log(shift + x * scale)
else y = log(shift + x * scale) / log(base)
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | base | float | -1.f | |
1 | scale | float | 1.f | |
2 | shift | float | 0.f |
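The two branches of the formula can be written as a small Python function. This is a sketch for a single value; the name `log_op` is hypothetical, and the defaults are copied from the parameter table.

```python
import math

def log_op(x, base=-1.0, scale=1.0, shift=0.0):
    """base == -1 selects the natural logarithm, otherwise log to the given base."""
    value = math.log(shift + x * scale)
    if base == -1.0:
        return value
    return value / math.log(base)

natural = log_op(math.e)            # natural logarithm
base_two = log_op(8.0, base=2.0)    # logarithm to base 2
shifted = log_op(3.0, scale=2.0, shift=4.0, base=10.0)  # log10(4 + 3 * 2)
print(natural, base_two, shifted)
```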
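50.LRN: local response normalization layer.
A local normalization method used in some deep learning models to imitate the lateral inhibition found in biological neurons. LRN is mainly intended to improve generalization and reduce overfitting.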
if region_type == ACROSS_CHANNELS square_sum = sum of channel window of local_size
if region_type == WITHIN_CHANNEL square_sum = sum of spatial window of local_size
y = x * pow(bias + alpha * square_sum / (local_size * local_size), -beta)
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | region_type | int | 0 | |
1 | local_size | int | 5 | |
2 | alpha | float | 1.f | |
3 | beta | float | 0.75f | |
4 | bias | float | 1.f |
Region type:
- 0 = ACROSS_CHANNELS
- 1 = WITHIN_CHANNEL
import torch
import torch.nn as nn

class LRN(nn.Module):
    # Within-channel variant: squared activations are averaged over a size x size spatial window
    def __init__(self, size=5, alpha=1e-4, beta=0.75, k=1.0):
        super(LRN, self).__init__()
        self.size = size
        self.alpha = alpha
        self.beta = beta
        self.k = k

    def forward(self, x):
        squared = x.pow(2)
        # avg_pool2d yields square_sum / (size * size); the padding keeps the spatial size for odd sizes
        pool = nn.functional.avg_pool2d(squared, self.size, stride=1, padding=self.size // 2)
        denom = self.k + self.alpha * pool
        output = x / denom.pow(self.beta)
        return output

# Create an LRN module instance
lrn = LRN(size=3, alpha=1e-4, beta=0.75, k=1.0)
# Random input: batch of 1, 3 channels, 224x224
input_data = torch.randn(1, 3, 224, 224)
# Run the LRN module
output = lrn(input_data)
print(output.shape)  # shape of the output features
# torch.Size([1, 3, 224, 224])
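51.LSTM: long short-term memory (LSTM) layer.
LSTM is a widely used recurrent neural network (RNN) variant designed to address the long-term dependency problem of plain RNNs. It captures long-range dependencies in a sequence better, which suits tasks such as time-series processing and natural language processing.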
Apply a single-layer LSTM to a feature sequence of T
timesteps. The input blob shape is [w=input_size, h=T]
and the output blob shape is [w=num_output, h=T]
y = lstm(x)
y0, hidden y1, cell y2 = lstm(x0, hidden x1, cell x2)
- one_blob_only if bidirectional
param id | name | type | default | description |
0 | num_output | int | 0 | output size of output |
1 | weight_data_size | int | 0 | total size of IFOG weight matrix |
2 | direction | int | 0 | 0=forward, 1=reverse, 2=bidirectional |
3 | hidden_size | int | num_output | hidden size |
weight | type | shape |
weight_xc_data | float/fp16/int8 | [input_size, hidden_size * 4, num_directions] |
bias_c_data | float/fp16/int8 | [hidden_size, 4, num_directions] |
weight_hc_data | float/fp16/int8 | [num_output, hidden_size * 4, num_directions] |
weight_hr_data | float/fp16/int8 | [hidden_size, num_output, num_directions] |
Direction flag:
- 0 = forward only
- 1 = reverse only
- 2 = bidirectional
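52.MemoryData: holds a constant data blob stored in the model.
It defines a fixed-size block of data inside the model. A MemoryData layer is typically used to store fixed parameters or constant intermediate data that are consumed during forward inference.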
y = data
param id | name | type | default | description |
0 | w | int | 0 | |
1 | h | int | 0 | |
11 | d | int | 0 | |
2 | c | int | 0 | |
21 | load_type | int | 1 | 1=fp32 |
weight | type | shape |
data | float | [w, h, d, c] |
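53.Mish: applies the Mish activation function.
Mish has a fairly simple form. Because it combines the hyperbolic tangent with the softplus function, it can, to some extent, mitigate problems of common activation functions such as vanishing and exploding gradients.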
y = x * tanh(log(exp(x) + 1))
- one_blob_only
- support_inplace
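The formula y = x * tanh(log(exp(x) + 1)) translates directly into Python. This is a minimal scalar sketch; the helper names `softplus` and `mish` are chosen for the example.

```python
import math

def softplus(x):
    """log(exp(x) + 1), the inner term of Mish."""
    return math.log(math.exp(x) + 1.0)

def mish(x):
    """y = x * tanh(softplus(x)) for a single float."""
    return x * math.tanh(softplus(x))

outputs = [mish(v) for v in (-1.0, 0.0, 1.0)]
print(outputs)
```

For large positive inputs tanh(softplus(x)) approaches 1, so Mish behaves like the identity there, while negative inputs are damped toward zero.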
54.MultiHeadAttention: multi-head attention.
split q k v into num_head part q0, k0, v0, q1, k1, v1 ...
for each num_head part
    xq = affine(q) / (embed_dim / num_head)
    xk = affine(k)
    xv = affine(v)
    xqk = xq * xk
    xqk = xqk + attn_mask if attn_mask exists
    softmax_inplace(xqk)
    xqkv = xqk * xv
    merge xqkv to out
y = affine(out)
param id | name | type | default | description |
0 | embed_dim | int | 0 | |
1 | num_heads | int | 1 | |
2 | weight_data_size | int | 0 | |
3 | kdim | int | embed_dim | |
4 | vdim | int | embed_dim | |
5 | attn_mask | int | 0 |
weight | type | shape |
q_weight_data | float/fp16/int8 | [weight_data_size] |
q_bias_data | float | [embed_dim] |
k_weight_data | float/fp16/int8 | [embed_dim * kdim] |
k_bias_data | float | [embed_dim] |
v_weight_data | float/fp16/int8 | [embed_dim * vdim] |
v_bias_data | float | [embed_dim] |
out_weight_data | float/fp16/int8 | [weight_data_size] |
out_bias_data | float | [embed_dim] |
55.MVN: mean-variance normalization.
if normalize_variance == 1 && across_channels == 1 y = (x - mean) / (sqrt(var) + eps) of whole blob
if normalize_variance == 1 && across_channels == 0 y = (x - mean) / (sqrt(var) + eps) of each channel
if normalize_variance == 0 && across_channels == 1 y = x - mean of whole blob
if normalize_variance == 0 && across_channels == 0 y = x - mean of each channel
- one_blob_only
param id | name | type | default | description |
0 | normalize_variance | int | 0 | |
1 | across_channels | int | 0 | |
2 | eps | float | 0.0001f | x = x / (sqrt(var) + eps) |
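The four cases differ only in which elements share the statistics and in whether the variance is normalized. A minimal sketch over one flat list of values, which stands for either one channel or the whole blob (the function name `mvn` is hypothetical):

```python
import math

def mvn(values, normalize_variance=0, eps=0.0001):
    """Subtract the mean; optionally divide by (sqrt(var) + eps)."""
    mean = sum(values) / len(values)
    centered = [v - mean for v in values]
    if not normalize_variance:
        return centered
    var = sum(c * c for c in centered) / len(values)
    return [c / (math.sqrt(var) + eps) for c in centered]

mean_only = mvn([1.0, 2.0, 3.0])
mean_and_var = mvn([1.0, 2.0, 3.0], normalize_variance=1)
print(mean_only, mean_and_var)
```

Note that eps is added to the standard deviation here, not to the variance, which matches the formula in this section.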
56.Noop: no operation.
y = x
57.Normalize: normalization followed by scaling.
if across_spatial == 1 && across_channel == 1 x2 = normalize(x) of whole blob
if across_spatial == 1 && across_channel == 0 x2 = normalize(x) of each channel
if across_spatial == 0 && across_channel == 1 x2 = normalize(x) of each position
y = x2 * scale
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | across_spatial | int | 0 | |
1 | channel_shared | int | 0 | |
2 | eps | float | 0.0001f | see eps mode |
3 | scale_data_size | int | 0 | |
4 | across_channel | int | 0 | |
9 | eps_mode | int | 0 |
weight | type | shape |
scale_data | float | [scale_data_size] |
Eps Mode:
- 0 = caffe/mxnet x = x / sqrt(var + eps)
- 1 = pytorch x = x / max(sqrt(var), eps)
- 2 = tensorflow x = x / sqrt(max(var, eps))
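The three eps modes only change how the denominator is guarded against zero. The sketch below assumes that "var" in the formulas stands for the sum of squares over the normalized elements (an L2-style normalization); that reading, the function name `normalize` and the single shared `scale` are assumptions for illustration.

```python
import math

def normalize(values, eps=0.0001, eps_mode=0, scale=1.0):
    ssum = sum(v * v for v in values)  # assumed meaning of "var" in the list above
    if eps_mode == 0:      # caffe/mxnet
        denom = math.sqrt(ssum + eps)
    elif eps_mode == 1:    # pytorch
        denom = max(math.sqrt(ssum), eps)
    else:                  # tensorflow
        denom = math.sqrt(max(ssum, eps))
    return [v / denom * scale for v in values]

by_mode = {mode: normalize([3.0, 4.0], eps_mode=mode) for mode in (0, 1, 2)}
zeros = normalize([0.0, 0.0], eps_mode=1)  # eps keeps the division well defined
print(by_mode, zeros)
```

For inputs far from zero the three modes give nearly identical results; they differ mainly when the norm is close to eps.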
58.Packing: packing (changes the element packing layout).
y = wrap_packing(x)
- one_blob_only
param id | name | type | default | description |
0 | out_elempack | int | 1 | |
1 | use_padding | int | 0 | |
2 | cast_type_from | int | 0 | |
3 | cast_type_to | int | 0 | |
4 | storage_type_from | int | 0 | |
5 | storage_type_to | int | 0 |
59.Padding: padding.
y = pad(x, pads)
param id | name | type | default | description |
0 | top | int | 0 | |
1 | bottom | int | 0 | |
2 | left | int | 0 | |
3 | right | int | 0 | |
4 | type | int | 0 | |
5 | value | float | 0 | |
6 | per_channel_pad_data_size | int | 0 | |
7 | front | int | 0 | |
8 | behind | int | front |
weight | type | shape |
per_channel_pad_data | float | [per_channel_pad_data_size] |
Padding type:
- 0 = CONSTANT
- 1 = REPLICATE
- 2 = REFLECT
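To illustrate y = pad(x, pads), here is a one-dimensional sketch of three common padding behaviours. The numbering 0 = constant, 1 = replicate, 2 = reflect and the helper name `pad_1d` are assumptions for this example; reflect padding here mirrors around the edge value without repeating it.

```python
def pad_1d(row, left, right, pad_type=0, value=0.0):
    n = len(row)
    if pad_type == 0:  # constant: fill with `value`
        return [value] * left + row + [value] * right
    if pad_type == 1:  # replicate: repeat the edge element
        return [row[0]] * left + row + [row[-1]] * right
    # reflect: mirror around the edge element (requires left, right < n)
    left_pad = [row[i] for i in range(left, 0, -1)]
    right_pad = [row[n - 2 - i] for i in range(right)]
    return left_pad + row + right_pad

row = [1, 2, 3]
constant = pad_1d(row, 2, 2, pad_type=0)
replicate = pad_1d(row, 2, 2, pad_type=1)
reflect = pad_1d(row, 2, 2, pad_type=2)
print(constant, replicate, reflect)
```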
60.Permute: permutes (reorders) the dimensions.
y = reorder(x)
param id | name | type | default | description |
0 | order_type | int | 0 |
Order Type (W = width, H = height, C = channels, D = depth):
- 0 = WH WHC WHDC
- 1 = HW HWC HWDC
- 2 = WCH WDHC
- 3 = CWH DWHC
- 4 = HCW HDWC
- 5 = CHW DHWC
- 6 = WHCD
- 7 = HWCD
- 8 = WCHD
- 9 = CWHD
- 10 = HCWD
- 11 = CHWD
- 12 = WDCH
- 13 = DWCH
- 14 = WCDH
- 15 = CWDH
- 16 = DCWH
- 17 = CDWH
- 18 = HDCW
- 19 = DHCW
- 20 = HCDW
- 21 = CHDW
- 22 = DCHW
- 23 = CDHW
61.PixelShuffle: pixel shuffle.
if mode == 0 y = depth_to_space(x) where x channel order is sw-sh-outc
if mode == 1 y = depth_to_space(x) where x channel order is outc-sw-sh
- one_blob_only
param id | name | type | default | description |
0 | upscale_factor | int | 1 | |
1 | mode | int | 0 |
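PixelShuffle splits the input channels into groups and rearranges the values of each group spatially, which increases the resolution: every upscale_factor * upscale_factor low-resolution channels are merged into one high-resolution channel.
Its main advantage is that it increases resolution without introducing extra parameters, which makes it well suited to tasks such as image super-resolution.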
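The rearrangement can be sketched on nested Python lists. The example follows the channel ordering of `torch.nn.PixelShuffle` (output channel outermost); which ncnn `mode` value this corresponds to is not asserted here, and the function name `pixel_shuffle` is hypothetical.

```python
def pixel_shuffle(x, r):
    """x: nested list [C * r * r][H][W] -> nested list [C][H * r][W * r]."""
    c_in, h, w = len(x), len(x[0]), len(x[0][0])
    c_out = c_in // (r * r)
    out = [[[0] * (w * r) for _ in range(h * r)] for _ in range(c_out)]
    for c in range(c_out):
        for i in range(r):          # row offset inside each r x r output block
            for j in range(r):      # column offset inside each r x r output block
                src = x[c * r * r + i * r + j]
                for row in range(h):
                    for col in range(w):
                        out[c][row * r + i][col * r + j] = src[row][col]
    return out

# Four 1x1 channels become one 2x2 channel
x = [[[0]], [[1]], [[2]], [[3]]]
y = pixel_shuffle(x, 2)
print(y)
```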
62.Pooling: pooling.
x2 = pad(x, pads)
x3 = pooling(x2, kernel, stride)
param id | name | type | default | description |
0 | pooling_type | int | 0 | |
1 | kernel_w | int | 0 | |
2 | stride_w | int | 1 | |
3 | pad_left | int | 0 | |
4 | global_pooling | int | 0 | |
5 | pad_mode | int | 0 | |
6 | avgpool_count_include_pad | int | 0 | |
7 | adaptive_pooling | int | 0 | |
8 | out_w | int | 0 | |
11 | kernel_h | int | kernel_w | |
12 | stride_h | int | stride_w | |
13 | pad_top | int | pad_left | |
14 | pad_right | int | pad_left | |
15 | pad_bottom | int | pad_top | |
18 | out_h | int | out_w |
Pooling type:
- 0 = MAX
- 1 = AVG
Pad mode:
- 0 = full padding
- 1 = valid padding
- 2 = tensorflow padding=SAME or onnx padding=SAME_UPPER
- 3 = onnx padding=SAME_LOWER
63.Pooling1D: 1D pooling.
x2 = pad(x, pads)
x3 = pooling1d(x2, kernel, stride)
param id | name | type | default | description |
0 | pooling_type | int | 0 | |
1 | kernel_w | int | 0 | |
2 | stride_w | int | 1 | |
3 | pad_left | int | 0 | |
4 | global_pooling | int | 0 | |
5 | pad_mode | int | 0 | |
6 | avgpool_count_include_pad | int | 0 | |
7 | adaptive_pooling | int | 0 | |
8 | out_w | int | 0 | |
14 | pad_right | int | pad_left |
Pooling type:
- 0 = MAX
- 1 = AVG
Pad mode:
- 0 = full padding
- 1 = valid padding
- 2 = tensorflow padding=SAME or onnx padding=SAME_UPPER
- 3 = onnx padding=SAME_LOWER
64.Pooling3D: 3D pooling.
x2 = pad(x, pads)
x3 = pooling3d(x2, kernel, stride)
param id | name | type | default | description |
0 | pooling_type | int | 0 | |
1 | kernel_w | int | 0 | |
2 | stride_w | int | 1 | |
3 | pad_left | int | 0 | |
4 | global_pooling | int | 0 | |
5 | pad_mode | int | 0 | |
6 | avgpool_count_include_pad | int | 0 | |
7 | adaptive_pooling | int | 0 | |
8 | out_w | int | 0 | |
11 | kernel_h | int | kernel_w | |
12 | stride_h | int | stride_w | |
13 | pad_top | int | pad_left | |
14 | pad_right | int | pad_left | |
15 | pad_bottom | int | pad_top | |
16 | pad_behind | int | pad_front | |
18 | out_h | int | out_w | |
21 | kernel_d | int | kernel_w | |
22 | stride_d | int | stride_w | |
23 | pad_front | int | pad_left | |
28 | out_d | int | out_w |
Pooling type:
- 0 = MAX
- 1 = AVG
Pad mode:
- 0 = full padding
- 1 = valid padding
- 2 = tensorflow padding=SAME or onnx padding=SAME_UPPER
- 3 = onnx padding=SAME_LOWER
65.Power: power operation.
y = pow((shift + x * scale), power)
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | power | float | 1.f | |
1 | scale | float | 1.f | |
2 | shift | float | 0.f |
66.PReLU: parametric rectified linear unit.
if x < 0 y = x * slope
else y = x
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | num_slope | int | 0 |
weight | type | shape |
slope_data | float | [num_slope] |
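A sketch of the per-channel rule in plain Python. It assumes that a single slope (num_slope = 1) is shared by all channels and that otherwise channel q uses slope_data[q]; the function name `prelu` and the list-of-channels layout are hypothetical.

```python
def prelu(x, slope_data):
    """x: list of channels, each a flat list of floats."""
    out = []
    for q, channel in enumerate(x):
        # Assumption: one slope is shared across channels when only one is given
        slope = slope_data[q] if len(slope_data) > 1 else slope_data[0]
        out.append([v * slope if v < 0 else v for v in channel])
    return out

x = [[-1.0, 2.0], [-4.0, 0.5]]
per_channel = prelu(x, [0.1, 0.25])
shared = prelu(x, [0.5])
print(per_channel, shared)
```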
67.Quantize: quantization.
y = float2int8(x * scale)
- one_blob_only
param id | name | type | default | description |
0 | scale_data_size | int | 1 |
weight | type | shape |
scale_data | float | [scale_data_size] |
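A minimal sketch of y = float2int8(x * scale). The saturation range [-127, 127] and the helper names are assumptions for illustration, and Python's `round` rounds halves to even, which may differ from the rounding used by ncnn.

```python
def float2int8(v):
    """Round to the nearest integer and saturate (assumed range: [-127, 127])."""
    q = int(round(v))
    return max(-127, min(127, q))

def quantize(values, scale):
    return [float2int8(v * scale) for v in values]

quantized = quantize([0.5, -1.0, 10.0, -10.0], scale=50.0)
print(quantized)
```

Values whose scaled magnitude exceeds the int8 range saturate, which is why the scale is usually chosen from the observed range of the data.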
68.Reduction: reduces a tensor along the given axes.
y = reduce_op(x * coeff)
- one_blob_only
param id | name | type | default | description |
0 | operation | int | 0 | |
1 | reduce_all | int | 1 | |
2 | coeff | float | 1.f | |
3 | axes | array | [ ] | |
4 | keepdims | int | 0 | |
5 | fixbug0 | int | 0 | hack for bug fix, should be 1 |
Operation type (each operation reduces over the selected axes; with reduce_all = 1 the result is a single scalar):
- 0 = SUM: sum of the elements.
- 1 = ASUM: sum of the absolute values of the elements.
- 2 = SUMSQ: sum of the squares of the elements.
- 3 = MEAN: mean of the elements.
- 4 = MAX: maximum element.
- 5 = MIN: minimum element.
- 6 = PROD: product of the elements.
- 7 = L1: L1 norm (sum of absolute values).
- 8 = L2: L2 norm (square root of the sum of squares).
- 9 = LogSum: logarithm of the sum of the elements.
- 10 = LogSumExp: logarithm of the sum of the exponentials of the elements.
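The operation codes can be mirrored by a small lookup table. This sketch covers only the reduce_all case on a flat list and applies coeff as written in the formula y = reduce_op(x * coeff); the names `REDUCE_OPS` and `reduce_all` are hypothetical.

```python
import math

REDUCE_OPS = {
    0: lambda v: sum(v),                                   # SUM
    1: lambda v: sum(abs(e) for e in v),                   # ASUM
    2: lambda v: sum(e * e for e in v),                    # SUMSQ
    3: lambda v: sum(v) / len(v),                          # MEAN
    4: max,                                                # MAX
    5: min,                                                # MIN
    6: math.prod,                                          # PROD
    7: lambda v: sum(abs(e) for e in v),                   # L1
    8: lambda v: math.sqrt(sum(e * e for e in v)),         # L2
    9: lambda v: math.log(sum(v)),                         # LogSum
    10: lambda v: math.log(sum(math.exp(e) for e in v)),   # LogSumExp
}

def reduce_all(values, operation, coeff=1.0):
    scaled = [e * coeff for e in values]
    return REDUCE_OPS[operation](scaled)

data = [1.0, -2.0, 3.0]
results = {op: reduce_all(data, op) for op in REDUCE_OPS}
print(results)
```

LogSum takes the logarithm of the sum, and LogSumExp the logarithm of the sum of exponentials; neither sums individual logarithms.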
69.ReLU: applies the rectified linear unit (ReLU) activation function.
if x < 0 y = x * slope
else y = x
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | slope | float | 0.f |
70.Reorg: rearranges spatial blocks into channels (space-to-depth).
if mode == 0 y = space_to_depth(x) where x channel order is sw-sh-outc
if mode == 1 y = space_to_depth(x) where x channel order is outc-sw-sh
- one_blob_only
param id | name | type | default | description |
0 | stride | int | 1 | |
1 | mode | int | 0 |
71.Requantize: requantization.
Requantize quantizes already-quantized data again: Quantize typically converts fp32 to int8, while Requantize converts int32 to int8.
x2 = x * scale_in + bias
x3 = activation(x2)
y = float2int8(x3 * scale_out)
- one_blob_only
param id | name | type | default | description |
0 | scale_in_data_size | int | 1 | |
1 | scale_out_data_size | int | 1 | |
2 | bias_data_size | int | 0 | |
3 | activation_type | int | 0 | |
4 | activation_params | array | [ ] |
weight | type | shape |
scale_in_data | float | [scale_in_data_size] |
scale_out_data | float | [scale_out_data_size] |
bias_data | float | [bias_data_size] |
72.Reshape: reshape.
if permute == 1 y = hwc2chw(reshape(chw2hwc(x)))
else y = reshape(x)
- one_blob_only
param id | name | type | default | description |
0 | w | int | -233 | |
1 | h | int | -233 | |
11 | d | int | -233 | |
2 | c | int | -233 | |
3 | permute | int | 0 |
Reshape flag:
- 0 = copy from bottom (keep the size of the corresponding dimension of the input blob)
- -1 = remaining (infer this dimension from the total number of elements and the other specified dimensions)
- -233 = drop this dim (default) (the dimension is omitted, so the output has fewer dimensions)
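How the three special values interact can be sketched with a small shape resolver. This is a simplified illustration, not ncnn's logic: the name `resolve_reshape` is hypothetical, and it assumes the input shape and the target are listed in the same dimension order.

```python
def resolve_reshape(in_shape, target):
    """Resolve 0 (copy), -1 (infer) and -233 (drop) into a concrete shape."""
    total = 1
    for d in in_shape:
        total *= d
    dims = []
    for i, t in enumerate(target):
        if t == -233:      # drop this dimension
            continue
        dims.append(in_shape[i] if t == 0 else t)  # 0 copies the input size
    if -1 in dims:         # infer the remaining dimension from the element count
        known = 1
        for d in dims:
            if d != -1:
                known *= d
        dims[dims.index(-1)] = total // known
    return tuple(dims)

flat = resolve_reshape((4, 3, 2), (-1, -233, -233))
two_d = resolve_reshape((4, 3, 2), (0, -1, -233))
three_d = resolve_reshape((4, 3, 2), (2, 0, -1))
print(flat, two_d, three_d)
```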
73.RNN: recurrent neural network (RNN) layer.
Apply a single-layer RNN to a feature sequence of T
timesteps. The input blob shape is [w=input_size, h=T]
and the output blob shape is [w=num_output, h=T]
y = rnn(x)
y0, hidden y1 = rnn(x0, hidden x1)
- one_blob_only if bidirectional
param id | name | type | default | description |
0 | num_output | int | 0 | hidden size of output |
1 | weight_data_size | int | 0 | total size of weight matrix |
2 | direction | int | 0 | 0=forward, 1=reverse, 2=bidirectional |
weight | type | shape |
weight_xc_data | float/fp16/int8 | [input_size, num_output, num_directions] |
bias_c_data | float/fp16/int8 | [num_output, 1, num_directions] |
weight_hc_data | float/fp16/int8 | [num_output, num_output, num_directions] |
Direction flag:
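- 0 = forward only (process the sequence from the first timestep to the last)
- 1 = reverse only (process the sequence from the last timestep to the first)
- 2 = bidirectional (process the sequence in both directions)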
74.Scale: scaling.
if scale_data_size == -233 y = x0 * x1
else y = x * scale + bias
- one_blob_only if scale_data_size != -233
- support_inplace
param id | name | type | default | description |
0 | scale_data_size | int | 0 | |
1 | bias_term | int | 0 |
weight | type | shape |
scale_data | float | [scale_data_size] |
bias_data | float | [scale_data_size] |
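75.SELU: applies the scaled exponential linear unit (SELU), a self-normalizing activation function.
The default constants are lambda = 1.0507 and alpha = 1.67326.
Self-normalizing property: under certain conditions, a network using SELU keeps its activations normalized on its own, which helps mitigate vanishing or exploding gradients and stabilizes training.
Non-linearity: SELU introduces non-linearity, which helps the network learn complex patterns and features.
Stability and robustness: SELU responds relatively stably to changes in its input, which improves robustness to some extent.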
if x < 0 y = (exp(x) - 1.f) * alpha * lambda
else y = x * lambda
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | alpha | float | 1.67326324f | |
1 | lambda | float | 1.050700987f |
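The two branches with the default constants from the table, as a scalar Python sketch. The parameter is named `lam` because `lambda` is a reserved word in Python.

```python
import math

def selu(x, alpha=1.67326324, lam=1.050700987):
    """Negative branch: (exp(x) - 1) * alpha * lambda; otherwise x * lambda."""
    if x < 0:
        return (math.exp(x) - 1.0) * alpha * lam
    return x * lam

outputs = [selu(v) for v in (-1.0, 0.0, 1.0)]
print(outputs)
```

For very negative inputs the output saturates near -alpha * lambda, roughly -1.758.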
76.Shrink: applies a shrink operation to the input.
if x < -lambd y = x + bias
if x > lambd y = x - bias
else y = x
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | bias | float | 0.0f | |
1 | lambd | float | 0.5f |
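77.ShuffleChannel: channel shuffle.
- Split the input channels into groups according to a fixed rule.
- Rearrange the channels across these groups.
- Reassemble the rearranged channels into the output tensor.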
if reverse == 0 y = shufflechannel(x) by group
if reverse == 1 y = shufflechannel(x) by channel / group
- one_blob_only
param id | name | type | default | description |
0 | group | int | 1 | |
1 | reverse | int | 0 |
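The shuffle is easiest to see on channel indices: view the channels as a [group][channels / group] table, transpose it, and flatten. A minimal sketch (the function name `shuffle_channel` is hypothetical):

```python
def shuffle_channel(channels, group):
    """Interleave channels across `group` groups."""
    per_group = len(channels) // group
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(group)]

original = [0, 1, 2, 3, 4, 5]
shuffled = shuffle_channel(original, group=2)
# Shuffling again by channels / group undoes the first shuffle
restored = shuffle_channel(shuffled, group=len(original) // 2)
print(shuffled, restored)
```

The second call illustrates the reverse == 1 case described in this section, where the shuffle is done by channel / group.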
78.Sigmoid: applies the sigmoid activation function.
It maps any real number to a value between 0 and 1.
y = 1 / (1 + exp(-x))
- one_blob_only
- support_inplace
79.Slice: splits a blob into several slices.
split x along axis into slices, each part slice size is based on slices array
param id | name | type | default | description |
0 | slices | array | [ ] | slice sizes |
1 | axis | int | 0 | axis to split along |
2 | indices | array | [ ] |
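A one-dimensional sketch of splitting by a slices array. It assumes that an entry of -233 means "share what is left evenly among the remaining slices"; that convention and the function name `slice_1d` are assumptions for illustration.

```python
def slice_1d(values, slices):
    out, start = [], 0
    for i, size in enumerate(slices):
        if size == -233:
            # Assumed convention: split the remainder evenly over the remaining slices
            size = (len(values) - start) // (len(slices) - i)
        out.append(values[start:start + size])
        start += size
    return out

parts = slice_1d(list(range(10)), [2, -233, -233])
fixed = slice_1d(list(range(6)), [1, 2, 3])
print(parts, fixed)
```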
80.Softmax: applies the softmax function, commonly used for classification.
softmax(x, axis)
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | axis | int | 0 | |
1 | fixbug0 | int | 0 | hack for bug fix, should be 1 |
import torch
import torch.nn.functional as F

# Example raw scores (logits)
logits = torch.tensor([2.0, 1.0, 0.1])
# Apply softmax along dim 0
probabilities = F.softmax(logits, dim=0)
# Print the resulting probability distribution
print("Softmax output probabilities:")
print(probabilities)
# Softmax output probabilities:
# tensor([0.6590, 0.2424, 0.0986])
81.Softplus: applies the softplus activation function.
Softplus approaches 0 for negative inputs and keeps growing for positive inputs. Like ReLU, it is non-linear, which increases the expressive power of the network.
y = log(exp(x) + 1)
- one_blob_only
- support_inplace
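Evaluating log(exp(x) + 1) literally overflows for large x in floating point. A common rewrite, shown here as a scalar sketch, is max(x, 0) + log1p(exp(-|x|)), which is mathematically identical; whether ncnn uses this form internally is not claimed.

```python
import math

def softplus(x):
    """Numerically stable form of log(exp(x) + 1)."""
    return max(x, 0.0) + math.log1p(math.exp(-abs(x)))

values = [softplus(v) for v in (-2.0, 0.0, 2.0, 1000.0)]
print(values)
```

The last input would raise an overflow error with the literal formula, while the stable form returns 1000.0.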
import torch
import torch.nn.functional as F

# Define an example input tensor
x = torch.tensor([-2.0, 0.0, 2.0])
# Apply softplus
output = F.softplus(x)
# Print the softplus output
print("Softplus output:")
print(output)
# Softplus output:
# tensor([0.1269, 0.6931, 2.1269])
82.Split: copies the input blob to multiple outputs.
y0, y1 ... = x
83.Swish: applies the Swish activation function.
y = x / (1 + exp(-x))
- one_blob_only
- support_inplace
84.TanH: applies the tanh activation function.
y = tanh(x)
- one_blob_only
- support_inplace
85.Threshold: thresholding.
if x > threshold y = 1
else y = 0
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | threshold | float | 0.f |
86.Tile: repeats the input along an axis.
y = repeat tiles along axis for x
- one_blob_only
param id | name | type | default | description |
0 | axis | int | 0 | axis to repeat along |
1 | tiles | int | 1 | number of repetitions |
2 | repeats | array | [ ] |
import torch

# Create an example tensor
x = torch.tensor([[1, 2], [3, 4]])
# Define the parameters
params = {"axis": 0, "tiles": 2, "repeats": [2, 1]}
# Read the parameter values (tiles is not used below because repeats is given)
axis = params["axis"]
tiles = params["tiles"]
repeats = params["repeats"]
# Repeat the tensor contents along the selected axis
y = x.repeat(repeats[0] if axis == 0 else 1, repeats[1] if axis == 1 else 1)
# Print the result
print(y)
# tensor([[1, 2],
#         [3, 4],
#         [1, 2],
#         [3, 4]])
87.UnaryOp: applies an element-wise unary operation to the input.
y = unaryop(x)
- one_blob_only
- support_inplace
param id | name | type | default | description |
0 | op_type | int | 0 | Operation type as follows |
Operation type:
- 0 = ABS: absolute value.
- 1 = NEG: negation.
- 2 = FLOOR: largest integer not greater than the input.
- 3 = CEIL: smallest integer not less than the input.
- 4 = SQUARE: square.
- 5 = SQRT: square root.
- 6 = RSQ: reciprocal of the square root.
- 7 = EXP: exponential with base e.
- 8 = LOG: natural logarithm.
- 9 = SIN: sine.
- 10 = COS: cosine.
- 11 = TAN: tangent.
- 12 = ASIN: arcsine.
- 13 = ACOS: arccosine.
- 14 = ATAN: arctangent.
- 15 = RECIPROCAL: reciprocal.
- 16 = TANH: hyperbolic tangent.
- 17 = LOG10: base-10 logarithm.
- 18 = ROUND: rounds to the nearest integer.
- 19 = TRUNC: integer part of the input (rounds toward zero).
88.Unfold: extracts sliding local blocks from the input (unfold).
y = unfold(x)
- one_blob_only
param id | name | type | default | description |
0 | num_output | int | 0 | |
1 | kernel_w | int | 0 | |
2 | dilation_w | int | 1 | |
3 | stride_w | int | 1 | |
4 | pad_left | int | 0 | |
11 | kernel_h | int | kernel_w | |
12 | dilation_h | int | dilation_w | |
13 | stride_h | int | stride_w | |
14 | pad_top | int | pad_left | |
15 | pad_right | int | pad_left | |
16 | pad_bottom | int | pad_top |
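import torch

# NOTE: Tensor.unfold slides a window along a single dimension. The kernel/dilation/stride/pad
# parameters listed above are closer to torch.nn.Unfold (im2col), so treat this as a simplified illustration.
# Create a 3x3 tensor as example input
input_tensor = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
# Unfold along dimension 0 with window size 2 and step 1
unfolded_tensor = input_tensor.unfold(0, 2, 1)
print('Input Tensor:\n', input_tensor)
# tensor([[1, 2, 3],
#         [4, 5, 6],
#         [7, 8, 9]])
print('Unfolded Tensor:\n', unfolded_tensor, "\nshape:", unfolded_tensor.shape)
# tensor([[[1, 4],
#          [2, 5],
#          [3, 6]],
#         [[4, 7],
#          [5, 8],
#          [6, 9]]])
# shape: torch.Size([2, 3, 2])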