Visualizing Convolutional Neural Networks with PyTorch


Filter and feature map (image by the author)

When dealing with images and image data, CNNs are the go-to architecture. Convolutional neural networks have provided many state-of-the-art solutions in deep learning and computer vision. Image recognition, object detection, and self-driving cars would not be possible without them.

But when it comes down to how CNNs see and recognize images the way they do, things get trickier.

  • How does a CNN decide whether an image is a cat or a dog?

  • What makes a CNN more powerful than other models on image classification problems?

  • How, and what, do they see in an image?

These were some of the questions I had when I first learned about CNNs, and the questions only grow as you dive deeper.

Back then I had heard the terms filters and feature maps, but did not know what they were or what they did. Later I knew what they were, but not what they looked like. Now I know. When dealing with deep convolutional networks, filters and feature maps matter: filters are what create the feature maps, and the feature maps are what the model sees.

What are Filters and Feature Maps in a CNN?

Filters are sets of weights which are learned using the backpropagation algorithm. If you do a lot of practical deep learning coding, you may know them as kernels. Filter sizes are typically 3×3, 5×5, or even 7×7.

Filters in a CNN layer learn to detect abstract concepts such as the boundary of a face or the edges of a building. By stacking more and more CNN layers on top of each other, we can extract more abstract and in-depth information from a CNN.

7×7 and 3×3 filters

Feature maps are the results we get after applying a filter to the pixel values of an image. This is what the model sees in an image, and the process is called the convolution operation. We visualize feature maps to gain a deeper understanding of how a CNN works.

Feature map
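
To make the convolution operation concrete, here is a minimal sketch (my own illustration, not code from the original post) that applies a single hand-written 3×3 filter to a small random image tensor with torch.nn.functional.conv2d; the output is one feature map:

import torch
import torch.nn.functional as F

img = torch.rand(1, 1, 8, 8)  # a toy grayscale "image": batch 1, 1 channel, 8x8 pixels

# one hand-written 3x3 filter (shape: out_channels, in_channels, height, width);
# this particular one acts as a simple horizontal-edge detector
kernel = torch.tensor([[[[-1., -1., -1.],
                         [ 0.,  0.,  0.],
                         [ 1.,  1.,  1.]]]])

feature_map = F.conv2d(img, kernel, padding=1)  # the convolution operation
print(feature_map.shape)  # torch.Size([1, 1, 8, 8]): one feature map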

Selecting the model

We will use the ResNet-50 neural network model for visualizing filters and feature maps. ResNet-50 is not an ideal choice for this, because resnet models in general are a bit complex, and traversing their inner convolutional layers can become quite difficult. That is also the benefit: you will learn how to access the inner convolutional layers of a difficult architecture, and in the future you will feel much more comfortable working with similar or more complex architectures.

The image I used is a photo from Pexels; it is one of the images I collected to train my face-detection classifier.

Photo from Pexels

Model Structure

At first glance the model structure can be intimidating, but it is really easy to get what we want. Once you know how to extract the layers of this model, you will be able to extract the layers of more complex models too. Below is the model structure.

ResNet(
  (conv1): Conv2d(3, 64, kernel_size=(7, 7), stride=(2, 2), padding=(3, 3), bias=False)
  (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
  (relu): ReLU(inplace=True)
  (maxpool): MaxPool2d(kernel_size=3, stride=2, padding=1, dilation=1, ceil_mode=False)
  (layer1): Sequential(
    (0): Bottleneck(
      (conv1): Conv2d(64, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
      (downsample): Sequential(
        (0): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
        (1): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      )
    )
    (1): Bottleneck(
      (conv1): Conv2d(256, 64, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(64, 64, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(64, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(64, 256, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(256, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
    ...
    (2): Bottleneck(
      (conv1): Conv2d(2048, 512, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn1): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv2): Conv2d(512, 512, kernel_size=(3, 3), stride=(1, 1), padding=(1, 1), bias=False)
      (bn2): BatchNorm2d(512, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (conv3): Conv2d(512, 2048, kernel_size=(1, 1), stride=(1, 1), bias=False)
      (bn3): BatchNorm2d(2048, eps=1e-05, momentum=0.1, affine=True, track_running_stats=True)
      (relu): ReLU(inplace=True)
    )
  )
  (avgpool): AdaptiveAvgPool2d(output_size=(1, 1))
  (fc): Linear(in_features=2048, out_features=1000, bias=True)
)

Extracting the CNN layers
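
The bullet points below refer to line numbers in the extraction code. A minimal sketch of that code, reconstructed from the description (so the exact lines may differ from the author's script), looks roughly like this:

import torch.nn as nn
from torchvision import models

model = models.resnet50(pretrained=True)

model_weights = []  # weights of every conv layer found
conv_layers = []    # the conv layers themselves
counter = 0         # "line 4": counts the convolutional layers

for child in model.children():                  # "line 6": top-level modules
    if isinstance(child, nn.Conv2d):            # "line 7": direct Conv2d child
        counter += 1
        model_weights.append(child.weight)
        conv_layers.append(child)
    elif isinstance(child, nn.Sequential):      # "line 10": Bottlenecks inside Sequential blocks
        for bottleneck in child:
            for layer in bottleneck.children():
                if isinstance(layer, nn.Conv2d):
                    counter += 1
                    model_weights.append(layer.weight)
                    conv_layers.append(layer)

print(f"Total convolutional layers: {counter}")  # 49 for ResNet-50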

  • First, at line 4, we initialize a counter variable to keep track of the number of convolutional layers.

  • Starting from line 6, we go through all the layers of the ResNet-50 model.

  • Specifically, we check for convolutional layers at three levels of nesting:

  • Line 7 checks whether any direct child of the model is a convolutional layer.

  • Then, from line 10, we check whether any Bottleneck layer inside the Sequential blocks contains convolutional layers.

  • If either of the above two conditions is satisfied, we append that child node and its weights to conv_layers and model_weights respectively.
The above code is simple and self-explanatory, but it is limited to pre-existing models such as the other resnet variants: resnet-18, 34, 101, and 152. For a custom model, things will be different. Say there is a Sequential layer inside another Sequential layer: a CNN layer nested there will be missed by the program. This is where the extractor.py module I wrote can be useful.

Extractor class

The Extractor class can find every CNN layer (except down-sample layers), including their weights, in any resnet model, and in almost any custom resnet or vgg model. It is not limited to CNN layers: it can find Linear layers, and if the name of the down-sampling layer is given, it can find those too. It can also report useful information such as the number of CNN, Linear, and Sequential layers in a model.

How to use

In the Extractor class, the model parameter takes in a model, and the DS_layer_name parameter is optional. The DS_layer_name parameter is used to find the down-sampling layer; in resnet layers the name is normally 'downsample', so that is kept as the default.

extractor = Extractor(model = resnet, DS_layer_name = 'downsample')

Calling extractor.activate() runs the extraction.

You can get the relevant details in a dictionary by calling extractor.info().

{'Down-sample layers name': 'downsample', 'Total CNN Layers': 49, 'Total Sequential Layers': 4, 'Total Downsampling Layers': 4, 'Total Linear Layers': 1, 'Total number of Bottleneck and Basicblock': 16, 'Total Execution time': '0.00137 sec'}

Accessing the weights and the layers

extractor.CNN_layers -----> Gives all the CNN layers in a model
extractor.Linear_layers --> Gives all the Linear layers in a model
extractor.DS_layers ------> Gives all the Down-sample layers in a model if there are any
extractor.CNN_weights ----> Gives all the CNN layer's weights in a model
extractor.Linear_weights -> Gives all the Linear layer's weights in a model

Without writing any code of your own, you can get the CNN and Linear layers and their weights from almost every resnet model. Below is roughly what the class methods look like; there is more, so do go through the entire script.
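
Since extractor.py is the author's own module, the following is only a sketch of what such a class might look like; the attribute names follow the listing above, but the real implementation surely differs in detail:

import time
import torch.nn as nn

class Extractor:
    """Recursively collect CNN and Linear layers (and their weights) from a model."""

    def __init__(self, model, DS_layer_name='downsample'):
        self.model = model
        self.DS_layer_name = DS_layer_name
        self.CNN_layers, self.CNN_weights = [], []
        self.Linear_layers, self.Linear_weights = [], []
        self.DS_layers = []

    def _extract(self, module):
        for name, child in module.named_children():
            if name == self.DS_layer_name:
                self.DS_layers.append(child)            # keep down-sample blocks separate
            elif isinstance(child, nn.Conv2d):
                self.CNN_layers.append(child)
                self.CNN_weights.append(child.weight)
            elif isinstance(child, nn.Linear):
                self.Linear_layers.append(child)
                self.Linear_weights.append(child.weight)
            else:
                self._extract(child)                    # recurse into nested containers

    def activate(self):
        start = time.time()
        self._extract(self.model)
        self._elapsed = time.time() - start

    def info(self):
        return {
            'Down-sample layers name': self.DS_layer_name,
            'Total CNN Layers': len(self.CNN_layers),
            'Total Downsampling Layers': len(self.DS_layers),
            'Total Linear Layers': len(self.Linear_layers),
            'Total Execution time': f'{self._elapsed:.5f} sec',
        }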

Visualizing

Convolutional Layer Filters

Here we will visualize the convolutional layer filters. For simplicity, we will only visualize the filters of the first convolutional layer.

We loop through the model weights of the first layer. For the first layer the filter size is 7×7 and there are 64 channels (hidden layers).
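
The plotting code is not shown here, so this is a minimal sketch, assuming the model_weights list built earlier and matplotlib for drawing the 64 first-layer filters in an 8×8 grid:

import matplotlib.pyplot as plt

# model_weights[0] has shape [64, 3, 7, 7]: 64 filters of size 7x7 over 3 channels
plt.figure(figsize=(10, 10))
for i, filt in enumerate(model_weights[0]):
    plt.subplot(8, 8, i + 1)                         # an 8x8 grid for the 64 filters
    plt.imshow(filt[0, :, :].detach(), cmap='gray')  # show the first channel only
    plt.axis('off')
plt.show()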

7×7 filter
7×7 filters from the trained ResNet-50 model

The pixel values in each small box range from 0 to 255, with 0 being completely black and 255 being white. The range can differ, for example 0 to 1, or -1 to 1 with a mean of 0.

The Feature Maps

Transforming

To visualize the feature maps, the image first needs to be converted to a tensor. Using the transforms from torchvision, the image can be normalized and transformed to a tensor.
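
A sketch of such a transform pipeline follows; the file name, image size, and normalization statistics are assumptions for illustration (the ImageNet mean and std are a common choice):

import cv2
from torchvision import transforms

# 'face.jpg' is a placeholder path; OpenCV loads BGR, so convert to RGB first
img = cv2.cvtColor(cv2.imread('face.jpg'), cv2.COLOR_BGR2RGB)

transform = transforms.Compose([
    transforms.ToPILImage(),
    transforms.Resize((128, 128)),
    transforms.ToTensor(),
    transforms.Normalize(mean=[0.485, 0.456, 0.406],   # ImageNet statistics (assumed)
                         std=[0.229, 0.224, 0.225]),
])

img = transform(img)    # the last line: apply the transforms to the image
img = img.unsqueeze(0)  # add the batch dimension: [3, 128, 128] -> [1, 3, 128, 128]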

The last line after the transforms applies them to the image. You can create a new variable and apply them to that instead, but make sure to change the variable name. The .unsqueeze(0) adds an extra dimension to the tensor img. Adding the batch dimension is an important step: the size of the image is now [1, 3, 128, 128] instead of [3, 128, 128], indicating that there is only one image in the batch.

Passing the Input Image Through Each Convolutional Layer

The code below passes the image through each convolutional layer. We first give the image as input to the first convolutional layer. After that, we use a for loop to pass each layer's output to the next layer, until we reach the last convolutional layer.
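
Reconstructed from the description in the list below (so only a sketch), the loop might look like this; conv_layers and img reuse the names from the earlier snippets:

# "line 1": give the image as input to the first convolutional layer
featuremaps = [conv_layers[0](img)]
for layer in conv_layers[1:]:                    # from the second to the last conv layer
    featuremaps.append(layer(featuremaps[-1]))   # previous layer's output as next input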

  • At line 1, we give the image as input to the first convolutional layer.

  • Then we iterate from the second to the last convolutional layer using a for loop.

  • We give the previous layer's output as the input to the next convolutional layer (featuremaps[-1]).

  • We also append each layer's output to the featuremaps list.

Visualizing the Feature Maps

This is the final step: we write the code to visualize the feature maps. Notice that the final CNN layers have many feature maps, in the range of 512 to 2048, but we will only visualize 64 feature maps from each layer, as any more than that would make the output really cluttered.
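
A sketch of that plotting loop, matching the line references in the list below (the figure size and output file names are assumptions):

import matplotlib.pyplot as plt

for x in range(len(featuremaps)):                 # "line 2": iterate through the feature maps
    layers = featuremaps[x][0, :, :, :].detach()  # drop the batch dimension
    plt.figure(figsize=(30, 30))
    for i, fmap in enumerate(layers):             # "line 5": iterate through the channels
        if i == 64:                               # stop at the 64th feature map
            break
        plt.subplot(8, 8, i + 1)
        plt.imshow(fmap, cmap='gray')
        plt.axis('off')
    plt.savefig(f'featuremaps_layer_{x}.png')     # save if necessary
    plt.close()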

  • Starting from line 2, we iterate through the featuremaps.

  • Then we get each layer's maps as featuremaps[x][0, :, :, :].detach().

  • Starting from line 5, we iterate through the filters in each layer. We break out of the loop after the 64th feature map.

  • After that we plot each feature map, and save them if necessary.

Results

Feature maps from the first convolutional layer of the ResNet-50 model

You can see that different filters focus on different aspects while creating the feature map of an image.

Some feature maps focus on the background of the image. Others create an outline of the image. A few filters create feature maps where the background is dark but the face is bright. This is due to the corresponding weights of the filters. It is very clear from the image above that in these early layers, the neural network gets to see very detailed feature maps of the input image.

Let’s take a look at a few other feature maps.

Feature maps from the 20th and 10th convolutional layers of the ResNet-50 model
Feature maps from the 40th and 30th convolutional layers of the ResNet-50 model

You can observe that as the image progresses through the layers, the details slowly disappear. The maps look like noise, but surely there is a pattern in those feature maps that human eyes cannot detect but a neural network can.

By the time the image reaches the last convolutional layer, it is impossible for a human being to tell what it is. These last-layer outputs are really important for the fully connected neurons that form the classification layers of a convolutional neural network.

Conclusions

A big thanks to @sovitrath5, author of the machine learning blog DebuggerCafe, for the content.

Original article: https://medium.com/swlh/visualizing-filters-and-feature-maps-in-convolutional-neural-networks-using-pytorch-110d4c1cfdeb

