The Convolution Computation Process

flyfish
Covers a manual calculation, a visualization, and an implementation using torch.nn.Conv2d.

Example

import torch
import torch.nn as nn

# Define the input image
input_image = torch.tensor([
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 4],
    [2, 3, 0, 1, 2],
    [1, 2, 3, 4, 0],
    [0, 1, 2, 3, 4]
], dtype=torch.float32).unsqueeze(0).unsqueeze(0)  # add batch and channel dimensions
print(input_image.shape)

# Define the convolution kernel
conv_kernel = torch.tensor([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1]
], dtype=torch.float32).unsqueeze(0).unsqueeze(0)  # add output-channel and input-channel dimensions
print(conv_kernel.shape)

# Create the convolution layer
conv_layer = nn.Conv2d(in_channels=1, out_channels=1, kernel_size=3, stride=1, padding=0, bias=False)

# Set the layer's weights to the custom kernel
with torch.no_grad():
    conv_layer.weight = nn.Parameter(conv_kernel)

# Run the convolution
output_tensor = conv_layer(input_image)

# Print the input image
print("Input image:")
print(input_image.squeeze().numpy())

# Print the convolution kernel
print("Convolution kernel:")
print(conv_kernel.squeeze().numpy())

# Print the output
print("Output:")
print(output_tensor.squeeze().detach().numpy())

torch.Size([1, 1, 5, 5])
torch.Size([1, 1, 3, 3])
Input image:
[[1. 2. 3. 0. 1.]
 [0. 1. 2. 3. 4.]
 [2. 3. 0. 1. 2.]
 [1. 2. 3. 4. 0.]
 [0. 1. 2. 3. 4.]]
Convolution kernel:
[[ 1.  0. -1.]
 [ 1.  0. -1.]
 [ 1.  0. -1.]]
Output:
[[-2.  2. -2.]
 [-2. -2. -1.]
 [-2. -2. -1.]]
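One detail worth checking is that nn.Conv2d slides the kernel over the input without flipping it, which is cross-correlation in the strict mathematical sense. The sketch below is not part of the original example: it assumes the functional form torch.nn.functional.conv2d is an acceptable stand-in for the layer, and the names cross_corr and true_conv are chosen here for illustration. It reproduces the layer's output and then shows what a flipped-kernel (textbook) convolution would give for the same data.

```python
import torch
import torch.nn.functional as F

# Same 5x5 input and 3x3 kernel as the example, shaped (N, C, H, W)
input_image = torch.tensor([
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 4],
    [2, 3, 0, 1, 2],
    [1, 2, 3, 4, 0],
    [0, 1, 2, 3, 4],
], dtype=torch.float32).reshape(1, 1, 5, 5)
conv_kernel = torch.tensor([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
], dtype=torch.float32).reshape(1, 1, 3, 3)

# What nn.Conv2d computes: a sliding-window sum with the kernel as given
cross_corr = F.conv2d(input_image, conv_kernel, stride=1, padding=0)

# Textbook convolution: flip the kernel along both spatial axes first
flipped_kernel = torch.flip(conv_kernel, dims=[2, 3])
true_conv = F.conv2d(input_image, flipped_kernel, stride=1, padding=0)

cross_corr_list = cross_corr.squeeze().tolist()
true_conv_list = true_conv.squeeze().tolist()
print(cross_corr_list)
print(true_conv_list)
```

For this particular kernel the flip only reverses the column order, so the second result is the negation of the first. For a symmetric kernel the two results would be identical.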

Input Image and Convolution Kernel

Input image $I$:
$\begin{bmatrix} 1 & 2 & 3 & 0 & 1 \\ 0 & 1 & 2 & 3 & 4 \\ 2 & 3 & 0 & 1 & 2 \\ 1 & 2 & 3 & 4 & 0 \\ 0 & 1 & 2 & 3 & 4 \end{bmatrix}$
Convolution kernel $K$:
$\begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix}$
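
As a compact statement of the operation applied to $I$ and $K$, the following is a sketch that assumes zero-based indices, stride 1, and no padding, matching the example. The symbol $O$ for the output is introduced here for illustration and does not appear in the original. Each output element is the sum of element-wise products over one $3 \times 3$ window:

$O(i, j) = \sum_{m=0}^{2} \sum_{n=0}^{2} I(i + m, j + n) \cdot K(m, n), \quad 0 \le i, j \le 2$

With a $5 \times 5$ input and a $3 \times 3$ kernel, the output has $5 - 3 + 1 = 3$ rows and $3$ columns.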

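Manual Convolution Calculation

We compute the convolution result at each output position in turn. Here $\odot$ denotes element-wise multiplication of the $3 \times 3$ window with the kernel, followed by summing the nine products: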

Position (0, 0): $\begin{bmatrix} 1 & 2 & 3 \\ 0 & 1 & 2 \\ 2 & 3 & 0 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = (1 \cdot 1 + 2 \cdot 0 + 3 \cdot (-1)) + (0 \cdot 1 + 1 \cdot 0 + 2 \cdot (-1)) + (2 \cdot 1 + 3 \cdot 0 + 0 \cdot (-1)) = (1 - 3) + (-2) + 2 = -2$
Position (0, 1): $\begin{bmatrix} 2 & 3 & 0 \\ 1 & 2 & 3 \\ 3 & 0 & 1 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = (2 \cdot 1 + 3 \cdot 0 + 0 \cdot (-1)) + (1 \cdot 1 + 2 \cdot 0 + 3 \cdot (-1)) + (3 \cdot 1 + 0 \cdot 0 + 1 \cdot (-1)) = 2 + (1 - 3) + (3 - 1) = 2$
Position (0, 2): $\begin{bmatrix} 3 & 0 & 1 \\ 2 & 3 & 4 \\ 0 & 1 & 2 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = (3 \cdot 1 + 0 \cdot 0 + 1 \cdot (-1)) + (2 \cdot 1 + 3 \cdot 0 + 4 \cdot (-1)) + (0 \cdot 1 + 1 \cdot 0 + 2 \cdot (-1)) = (3 - 1) + (2 - 4) + (-2) = -2$
Position (1, 0): $\begin{bmatrix} 0 & 1 & 2 \\ 2 & 3 & 0 \\ 1 & 2 & 3 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = (0 \cdot 1 + 1 \cdot 0 + 2 \cdot (-1)) + (2 \cdot 1 + 3 \cdot 0 + 0 \cdot (-1)) + (1 \cdot 1 + 2 \cdot 0 + 3 \cdot (-1)) = (-2) + 2 + (1 - 3) = -2$
Position (1, 1): $\begin{bmatrix} 1 & 2 & 3 \\ 3 & 0 & 1 \\ 2 & 3 & 4 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = (1 \cdot 1 + 2 \cdot 0 + 3 \cdot (-1)) + (3 \cdot 1 + 0 \cdot 0 + 1 \cdot (-1)) + (2 \cdot 1 + 3 \cdot 0 + 4 \cdot (-1)) = (1 - 3) + (3 - 1) + (2 - 4) = -2$
Position (1, 2): $\begin{bmatrix} 2 & 3 & 4 \\ 0 & 1 & 2 \\ 3 & 4 & 0 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = (2 \cdot 1 + 3 \cdot 0 + 4 \cdot (-1)) + (0 \cdot 1 + 1 \cdot 0 + 2 \cdot (-1)) + (3 \cdot 1 + 4 \cdot 0 + 0 \cdot (-1)) = (2 - 4) + (-2) + 3 = -1$
Position (2, 0): $\begin{bmatrix} 2 & 3 & 0 \\ 1 & 2 & 3 \\ 0 & 1 & 2 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = (2 \cdot 1 + 3 \cdot 0 + 0 \cdot (-1)) + (1 \cdot 1 + 2 \cdot 0 + 3 \cdot (-1)) + (0 \cdot 1 + 1 \cdot 0 + 2 \cdot (-1)) = 2 + (1 - 3) + (-2) = -2$
Position (2, 1): $\begin{bmatrix} 3 & 0 & 1 \\ 2 & 3 & 4 \\ 1 & 2 & 3 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = (3 \cdot 1 + 0 \cdot 0 + 1 \cdot (-1)) + (2 \cdot 1 + 3 \cdot 0 + 4 \cdot (-1)) + (1 \cdot 1 + 2 \cdot 0 + 3 \cdot (-1)) = (3 - 1) + (2 - 4) + (1 - 3) = -2$
Position (2, 2): $\begin{bmatrix} 0 & 1 & 2 \\ 3 & 4 & 0 \\ 2 & 3 & 4 \end{bmatrix} \odot \begin{bmatrix} 1 & 0 & -1 \\ 1 & 0 & -1 \\ 1 & 0 & -1 \end{bmatrix} = (0 \cdot 1 + 1 \cdot 0 + 2 \cdot (-1)) + (3 \cdot 1 + 4 \cdot 0 + 0 \cdot (-1)) + (2 \cdot 1 + 3 \cdot 0 + 4 \cdot (-1)) = (-2) + 3 + (2 - 4) = -1$
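
The nine hand calculations can be cross-checked with a short NumPy loop. This is a minimal sketch assuming stride 1 and no padding, as in the example. The variable names I, K, and output are chosen here to mirror the notation and are not taken from the original code.

```python
import numpy as np

# Input image I and kernel K from the example
I = np.array([
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 4],
    [2, 3, 0, 1, 2],
    [1, 2, 3, 4, 0],
    [0, 1, 2, 3, 4],
])
K = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1],
])

k = K.shape[0]
out_size = I.shape[0] - k + 1  # 5 - 3 + 1 = 3
output = np.zeros((out_size, out_size))

for i in range(out_size):
    for j in range(out_size):
        window = I[i:i + k, j:j + k]         # 3x3 window whose top-left corner is (i, j)
        output[i, j] = np.sum(window * K)    # element-wise product, then sum

print(output)
```

Each entry output[i, j] corresponds to the position (i, j) worked out by hand above.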

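Parameter Explanation

conv_layer = nn.Conv2d(
    in_channels=3,         # number of input channels
    out_channels=16,       # number of output channels
    kernel_size=3,         # kernel size
    stride=1,              # stride
    padding=1,             # padding
    padding_mode='zeros',  # padding mode
    dilation=1,            # dilation (atrous convolution)
    groups=1,              # grouped convolution
    bias=True              # whether to use a bias
)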

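in_channels (int): Number of input channels. For an RGB image, for example, in_channels should be 3.
out_channels (int): Number of output channels, which is also the number of kernels (filters).
kernel_size (int or tuple): Size of the kernel. An integer means the height and width are equal; a tuple gives (height, width).
stride (int or tuple, optional): Step by which the window slides during the convolution. An integer applies the same stride to height and width; a tuple gives (height stride, width stride). Default: 1.
padding (int or tuple, optional): Amount of padding added to each side of the input. An integer applies the same padding to height and width; a tuple gives (height padding, width padding). Default: 0. The fill values are determined by padding_mode.
padding_mode (str, optional): Padding mode, one of 'zeros', 'reflect', 'replicate', or 'circular'. Default: 'zeros'.
dilation (int or tuple, optional): Spacing between kernel elements. An integer applies the same spacing to height and width; a tuple gives (height spacing, width spacing). Default: 1.
groups (int, optional): Number of blocked connections from input channels to output channels. Default: 1. Both in_channels and out_channels must be divisible by groups. Setting groups equal to in_channels gives a depthwise convolution, the building block of depthwise separable convolutions.
bias (bool, optional): If True, adds a learnable bias to the output. Default: True.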
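
To make the effect of these parameters concrete, the sketch below computes the spatial output size and the number of learnable parameters in plain Python. The helper names _pair, conv2d_output_size, and conv2d_param_count are hypothetical and are not PyTorch functions. The size formula is the one given in the nn.Conv2d documentation, and the helper assumes integer or tuple arguments only.

```python
def _pair(value):
    """Expand an int into a (height, width) tuple; pass tuples through."""
    return (value, value) if isinstance(value, int) else tuple(value)


def conv2d_output_size(input_size, kernel_size, stride=1, padding=0, dilation=1):
    """Spatial output size: floor((n + 2p - d*(k - 1) - 1) / s) + 1 per axis."""
    dims = zip(_pair(input_size), _pair(kernel_size), _pair(stride),
               _pair(padding), _pair(dilation))
    return tuple((n + 2 * p - d * (k - 1) - 1) // s + 1 for n, k, s, p, d in dims)


def conv2d_param_count(in_channels, out_channels, kernel_size, groups=1, bias=True):
    """Weight shape is (out_channels, in_channels / groups, kH, kW), plus one bias per output channel."""
    kh, kw = _pair(kernel_size)
    weights = out_channels * (in_channels // groups) * kh * kw
    return weights + (out_channels if bias else 0)


# The 5x5 example above: kernel 3, stride 1, no padding
print(conv2d_output_size(5, 3))                        # (3, 3)
# padding=1 with a 3x3 kernel keeps the spatial size
print(conv2d_output_size(32, 3, padding=1))            # (32, 32)
# stride=2 roughly halves it
print(conv2d_output_size(32, 3, stride=2, padding=1))  # (16, 16)
# dilation=2 makes a 3x3 kernel cover a 5x5 area
print(conv2d_output_size(7, 3, dilation=2))            # (3, 3)

# The layer shown above: 16 * 3 * 3 * 3 weights + 16 biases
print(conv2d_param_count(3, 16, 3))                    # 448
# Depthwise case (groups == in_channels): 16 * 1 * 3 * 3 weights + 16 biases
print(conv2d_param_count(16, 16, 3, groups=16))        # 160
```

The depthwise case shows why groups matters: each output channel only sees in_channels / groups input channels, so the weight count shrinks by the same factor.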

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from matplotlib.animation import FuncAnimation, PillowWriter

# Define the input image and convolution kernel
input_image = np.array([
    [1, 2, 3, 0, 1],
    [0, 1, 2, 3, 4],
    [2, 3, 0, 1, 2],
    [1, 2, 3, 4, 0],
    [0, 1, 2, 3, 4]
])
conv_kernel = np.array([
    [1, 0, -1],
    [1, 0, -1],
    [1, 0, -1]
])

# Sizes of the input image, kernel, and output
input_size = input_image.shape[0]
kernel_size = conv_kernel.shape[0]
output_size = input_size - kernel_size + 1

# Create the figure and axes
fig, ax = plt.subplots(figsize=(6, 6))

# Show the input image
im = ax.imshow(input_image, cmap='viridis')

# Initialize the rectangle and text. imshow centers each pixel on integer
# coordinates, so the window's top-left corner sits at (-0.5, -0.5).
rect = patches.Rectangle((-0.5, -0.5), kernel_size, kernel_size, linewidth=2, edgecolor='r', facecolor='none')
ax.add_patch(rect)
text = ax.text(0, 0, '', ha='center', va='center', color='white', fontsize=12)

# Animation update function
def update(frame):
    i, j = divmod(frame, output_size)
    sub_matrix = input_image[i:i+kernel_size, j:j+kernel_size]
    conv_result = np.sum(sub_matrix * conv_kernel)
    # Move the rectangle so it frames the current window
    rect.set_xy((j - 0.5, i - 0.5))
    # Place the text at the center of the window and update its content
    center = (kernel_size - 1) / 2
    text.set_position((j + center, i + center))
    text.set_text(f'{conv_result:.2f}')
    return im, rect, text

# Create the animation
ani = FuncAnimation(fig, update, frames=output_size * output_size, blit=True, repeat=False)

# Save the animation as a GIF file
ani.save('convolution_animation.gif', writer=PillowWriter(fps=1))
plt.show()

Convolution result

[[-2.  2. -2.]
 [-2. -2. -1.]
 [-2. -2. -1.]]
