U-Net for Image Segmentation

1. U-Net for Image Segmentation

Notes based on: 使用Pytorch搭建U-Net网络并基于DRIVE数据集训练(语义分割) (building a U-Net with PyTorch and training it on the DRIVE dataset for semantic segmentation)

1.1 DoubleConv (Conv2d+BatchNorm2d+ReLU)

import torch
import torch.nn as nn
import torch.nn.functional as F

# nn.Sequential runs its modules in the order they are defined, so no forward() is needed
class DoubleConv(nn.Sequential):  # class Subclass(ParentClass)
    # __init__() is the class constructor: it runs automatically when an object is created
    # and is where the object's attributes are initialized. A bare __init__(self) would let
    # you create an empty object and assign attributes later; taking parameters, as here,
    # means they must be supplied at instantiation.
    def __init__(self, in_channels, out_channels, mid_channels=None):
        # in_channels / out_channels: channels entering / leaving DoubleConv;
        # mid_channels: channels produced by the first conv inside DoubleConv
        if mid_channels is None:  # default the intermediate width to the output width
            mid_channels = out_channels
        super(DoubleConv, self).__init__(  # call the parent-class initializer
            nn.Conv2d(in_channels, mid_channels, kernel_size=3, padding=1, bias=False),
            # BatchNorm keeps activations in a stable range before ReLU, so large values
            # do not destabilize the network
            nn.BatchNorm2d(mid_channels),
            nn.ReLU(inplace=True),  # in-place: overwrite the input tensor to save memory
            nn.Conv2d(mid_channels, out_channels, kernel_size=3, padding=1, bias=False),
            nn.BatchNorm2d(out_channels),
            nn.ReLU(inplace=True)
        )
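As a quick sanity check (my own sketch, not part of the original notes), the shapes below show that DoubleConv changes only the channel count: both convolutions use kernel_size=3 with padding=1, so height and width are preserved.

import torch

x = torch.randn(1, 3, 64, 64)   # [N, C, H, W]; the sizes here are arbitrary
block = DoubleConv(3, 64)       # mid_channels defaults to out_channels (64)
print(block(x).shape)           # torch.Size([1, 64, 64, 64]) -- H and W unchanged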

1.1.1 Conv2d

An intuitive look at kernel size, stride, padding, and bias.
Convolving the original image with a kernel extracts specified features from it; the kernel acts as a "filter" that emphasizes the features it is interested in.

Figure from: Expression recognition based on residual rectification convolution neural network
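To make these terms concrete, here is a minimal sketch (my own illustration with made-up sizes, not from the cited paper). For nn.Conv2d the output spatial size is floor((H + 2*padding - kernel_size) / stride) + 1:

import torch
import torch.nn as nn

x = torch.randn(1, 1, 28, 28)  # one single-channel image, [N, C, H, W]

# kernel_size=3, stride=1, padding=1 keeps the spatial size: (28 + 2 - 3)/1 + 1 = 28
same = nn.Conv2d(1, 8, kernel_size=3, stride=1, padding=1, bias=True)
print(same(x).shape)  # torch.Size([1, 8, 28, 28])

# stride=2 halves the spatial size: floor((28 + 2 - 3)/2) + 1 = 14
half = nn.Conv2d(1, 8, kernel_size=3, stride=2, padding=1, bias=False)
print(half(x).shape)  # torch.Size([1, 8, 14, 14])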

1.1.2 BatchNorm2d

To understand what happens without normalization, let’s look at an example with just two features that are on drastically different scales. Since the network output is a linear combination of each feature vector, this means that the network learns weights for each feature that are also on different scales. Otherwise, the large feature will simply drown out the small feature.
Then during gradient descent, in order to “move the needle” for the Loss, the network would have to make a large update to one weight compared to the other weight. This can cause the gradient descent trajectory to oscillate back and forth along one dimension, thus taking more steps to reach the minimum.—Batch Norm Explained Visually — How it works, and why neural networks need it

If the features are on the same scale, the loss landscape is more uniform, like a bowl. Gradient descent can then proceed smoothly down to the minimum. —Batch Norm Explained Visually — How it works, and why neural networks need it

Batch Norm is just another network layer that gets inserted between a hidden layer and the next hidden layer. Its job is to take the outputs from the first hidden layer and normalize them before passing them on as the input of the next hidden layer.

Figures: a network without batch norm vs. with batch norm included
Figure from: BatchNorm2d原理、作用及其pytorch中BatchNorm2d函数的参数讲解 (a Chinese article on the principle and role of BatchNorm2d and its PyTorch parameters)
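As a small sketch of what BatchNorm2d actually computes (my own example, with deliberately large made-up values): at training time each channel is normalized to roughly zero mean and unit variance over the batch, then scaled and shifted by the learnable per-channel weight (gamma) and bias (beta):

import torch
import torch.nn as nn

bn = nn.BatchNorm2d(2)                 # one (gamma, beta) pair per channel
x = torch.randn(4, 2, 8, 8) * 50 + 10  # features on a deliberately large scale
y = bn(x)

# per-channel statistics after normalization (gamma=1, beta=0 at initialization)
print(y.mean(dim=(0, 2, 3)))  # ~0 for each channel
print(y.std(dim=(0, 2, 3)))   # ~1 for each channel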

1.1.3 ReLU

How does an activation function work?

We know the neural network has neurons that work in correspondence with weight, bias, and their respective activation function. In a neural network, we update the weights and biases of the neurons on the basis of the error at the output. This process is known as back-propagation. Activation functions make back-propagation possible, since the gradients are supplied along with the error to update the weights and biases. —Role of Activation functions in Neural Networks

Why do we need a non-linear activation function?

Activation functions introduce non-linearity into the model, allowing it to learn and perform complex tasks. Without them, no matter how many layers we stack in the network, it would still behave as a single-layer perceptron because the composition of linear functions is a linear function. —Convolutional Neural Network — Lesson 9: Activation Functions in CNNs

No matter how many hidden layers we attach to a neural net, all the layers will behave the same way, because the composition of two linear functions is itself a linear function. A neuron cannot learn with just a linear function attached to it; a non-linear activation function lets it learn according to the error. Hence, we need an activation function. —Role of Activation functions in Neural Networks

Figure from: Activation Functions 101: Sigmoid, Tanh, ReLU, Softmax and more
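The quoted point, that stacked linear layers collapse into a single linear layer, can be checked numerically. A minimal sketch (my own, with arbitrary sizes):

import torch
import torch.nn as nn

x = torch.randn(5, 10)
f = nn.Linear(10, 20, bias=False)
g = nn.Linear(20, 3, bias=False)

# two stacked linear layers equal one linear layer whose weight is W_g @ W_f ...
combined = g.weight @ f.weight  # shape [3, 10]
print(torch.allclose(g(f(x)), x @ combined.T, atol=1e-5))  # True

# ... whereas inserting a ReLU between them breaks the equivalence
relu = nn.ReLU()
print(torch.allclose(g(relu(f(x))), x @ combined.T, atol=1e-5))  # False (in general)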

1.2 Down (MaxPool+DoubleConv)

class Down(nn.Sequential):
    def __init__(self, in_channels, out_channels):
        super(Down, self).__init__(
            nn.MaxPool2d(2, stride=2),  # kernel_size (window size) and stride of the window
            DoubleConv(in_channels, out_channels)
        )

1.2.1 MaxPool

Two reasons for applying max pooling:

  1. Downscaling the image by extracting its most important features
  2. Making the representation invariant to shifts, rotations, and scale

Figure from: Max Pooling, Why use it and its advantages.

Max pooling is advantageous because it adds translation invariance. It provides the following types of invariance (a small shape check follows the list):

  1. Shift invariance (invariance to position)
  2. Rotational invariance (invariance to rotation)
  3. Scale invariance (invariance to scale, small or big)
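A small shape-and-invariance sketch (my own illustration): nn.MaxPool2d(2, stride=2) halves height and width, and the maximum taken over each window is unchanged by a small shift of the strongest activation within that window:

import torch
import torch.nn as nn

pool = nn.MaxPool2d(2, stride=2)

x = torch.randn(1, 1, 8, 8)
print(pool(x).shape)  # torch.Size([1, 1, 4, 4]) -- spatial size halved

# a one-pixel shift inside a pooling window does not change the pooled output
a = torch.zeros(1, 1, 4, 4); a[0, 0, 0, 0] = 1.0  # peak at (0, 0)
b = torch.zeros(1, 1, 4, 4); b[0, 0, 1, 1] = 1.0  # peak shifted to (1, 1), same window
print(torch.equal(pool(a), pool(b)))  # True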

1.3 Up (Upsample/ConvTranspose+DoubleConv)

Upsampling via bilinear interpolation

Upsampling via transposed convolution

class Up(nn.Module):
    # __init__() is the constructor, called automatically when the object is created
    def __init__(self, in_channels, out_channels, bilinear=True):  # default: bilinear upsampling
        super(Up, self).__init__()
        if bilinear:  # upsample via bilinear interpolation
            # scale_factor: how many times larger the output is; mode: interpolation
            # algorithm; align_corners: align the corner pixels of input and output
            self.up = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
            self.conv = DoubleConv(in_channels, out_channels, in_channels // 2)
        else:  # upsample via transposed convolution
            self.up = nn.ConvTranspose2d(in_channels, in_channels // 2, kernel_size=2, stride=2)
            self.conv = DoubleConv(in_channels, out_channels)  # mid_channels defaults to out_channels

    # forward() implements the module's computation and wires the layers together
    def forward(self, x1, x2):  # upsample x1, then concatenate it with x2
        x1 = self.up(x1)
        # pad the upsampled x1 so its height and width match x2
        # [N, C, H, W]: number of samples, channels, image height, image width
        diff_y = x2.size()[2] - x1.size()[2]  # height difference between x2 and the upsampled x1
        diff_x = x2.size()[3] - x1.size()[3]  # width difference between x2 and the upsampled x1
        # padding order: left, right, top, bottom
        x1 = F.pad(x1, [diff_x // 2, diff_x - diff_x // 2,
                        diff_y // 2, diff_y - diff_y // 2])
        # concatenate x2 with the upsampled x1 along the channel dimension
        x = torch.cat([x2, x1], dim=1)
        # run the concatenated tensor through DoubleConv
        x = self.conv(x)
        return x
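The four-element list passed to F.pad above is easy to misread, so here is a minimal sketch of its semantics (my own example): for a 4-D tensor, the numbers pad the last two dimensions in the order left, right, top, bottom:

import torch
import torch.nn.functional as F

x1 = torch.randn(1, 1, 5, 5)  # e.g. an upsampled feature map
diff_y, diff_x = 3, 3         # target is 8x8, so add 3 rows and 3 columns in total

# [left, right, top, bottom]: each difference is split as evenly as possible
x1 = F.pad(x1, [diff_x // 2, diff_x - diff_x // 2,
                diff_y // 2, diff_y - diff_y // 2])
print(x1.shape)  # torch.Size([1, 1, 8, 8])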

1.3.1 Upsample

Upsampling rate: scale_factor

Bilinear interpolation: bilinear

Corner-pixel alignment: align_corners

Figures from: [PyTorch]Upsample
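A small sketch (my own values) of how scale_factor and align_corners affect nn.Upsample with bilinear interpolation:

import torch
import torch.nn as nn

x = torch.arange(4, dtype=torch.float32).reshape(1, 1, 2, 2)  # [[0, 1], [2, 3]]

up_aligned = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=True)
up_default = nn.Upsample(scale_factor=2, mode='bilinear', align_corners=False)

print(up_aligned(x).shape)  # torch.Size([1, 1, 4, 4]) -- height and width doubled

# align_corners=True makes the corner pixels of input and output coincide exactly;
# align_corners=False treats pixels as cell centers, so the interpolation grid shifts
# (compare the interior values of the two outputs)
print(up_aligned(x)[0, 0])  # first row runs 0, 1/3, 2/3, 1
print(up_default(x)[0, 0])  # first row runs 0, 0.25, 0.75, 1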

1.4 OutConv

class OutConv(nn.Sequential):
    def __init__(self, in_channels, num_classes):
        super(OutConv, self).__init__(
            # 1x1 convolution mapping the features to num_classes output channels
            nn.Conv2d(in_channels, num_classes, kernel_size=1)
        )
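Because the kernel is 1x1, OutConv only remaps channels to per-class logits and leaves the spatial size untouched. A quick sketch with illustrative shapes (my own values):

import torch

x = torch.randn(1, 64, 480, 480)  # a base_c=64 feature map at full resolution
head = OutConv(64, 2)             # e.g. 2 classes: vessel vs. background on DRIVE
print(head(x).shape)              # torch.Size([1, 2, 480, 480]) -- one logit map per class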

1.5 UNet

class UNet(nn.Module):
    def __init__(self,
                 in_channels: int = 1,   # number of channels of the input image
                 num_classes: int = 2,   # number of output classes
                 bilinear: bool = True,  # use bilinear interpolation for upsampling
                 base_c: int = 64):      # base channel count; every other layer's width is a multiple of it
        super(UNet, self).__init__()
        self.in_channels = in_channels
        self.num_classes = num_classes
        self.bilinear = bilinear

        self.in_conv = DoubleConv(in_channels, base_c)  # in_channels, out_channels, mid_channels=None
        self.down1 = Down(base_c, base_c * 2)           # in_channels, out_channels
        self.down2 = Down(base_c * 2, base_c * 4)
        self.down3 = Down(base_c * 4, base_c * 8)
        # factor = 2 when upsampling with bilinear interpolation, 1 with transposed convolution
        factor = 2 if bilinear else 1
        self.down4 = Down(base_c * 8, base_c * 16 // factor)
        self.up1 = Up(base_c * 16, base_c * 8 // factor, bilinear)  # in_channels, out_channels, bilinear
        self.up2 = Up(base_c * 8, base_c * 4 // factor, bilinear)
        self.up3 = Up(base_c * 4, base_c * 2 // factor, bilinear)
        self.up4 = Up(base_c * 2, base_c, bilinear)
        self.out_conv = OutConv(base_c, num_classes)    # in_channels, num_classes

    def forward(self, x):
        x1 = self.in_conv(x)
        x2 = self.down1(x1)
        x3 = self.down2(x2)
        x4 = self.down3(x3)
        x5 = self.down4(x4)
        x = self.up1(x5, x4)  # upsample x5 and concatenate it with x4
        x = self.up2(x, x3)
        x = self.up3(x, x2)
        x = self.up4(x, x1)
        logits = self.out_conv(x)
        return {"out": logits}  # returned as a dictionary

Why can the module be used without explicitly calling its forward method?
nn.Module is the base class of all neural network modules; subclasses are expected to override __init__ and forward.

self.up1 = Up(base_c*16, base_c*8//factor, bilinear)  # instantiate Up
...
x = self.up1(x5, x4)  # invokes forward(self, x1, x2) defined in Up

Calling up1 invokes the __call__ method inherited from the parent class nn.Module, and __call__ in turn calls the forward method defined in Up.
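Putting it all together, a minimal forward-pass sketch (my own, using an arbitrary input size divisible by 16, since the network downsamples four times):

import torch

model = UNet(in_channels=1, num_classes=2, bilinear=True, base_c=64)
x = torch.randn(1, 1, 480, 480)  # one single-channel image
out = model(x)["out"]            # forward() is triggered via nn.Module.__call__
print(out.shape)                 # torch.Size([1, 2, 480, 480])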
