Preface
These notes only record my learning process; questions and discussion are welcome.
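Fine-tuning
In practice, datasets are rarely large enough, so few people train a network from scratch. The common approach is to take a pretrained network (for example, a 1000-class classifier trained on ImageNet) and either fine-tune it or use it as a feature extractor.
- Put simply, transfer learning takes a convolutional neural network that has already been trained on one dataset and, with a few simple adjustments, quickly adapts it to another dataset.
- As models grow deeper and more complex, their error rate tends to fall. Training a complex convolutional network, however, needs a very large amount of labeled data and can take days or even weeks. Transfer learning addresses both the labeled-data problem and the training-time problem.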
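Common scenarios:
- 1 Using the convolutional network as a feature extractor. Take a network pretrained on ImageNet, remove the last fully connected layer, and use the rest as a feature extractor (AlexNet, for example, produces a 4096-dimensional feature vector just before its final classifier). Features extracted this way are called CNN codes. Once you have them, a linear classifier (linear SVM, softmax, etc.) can classify the images.
- 2 Fine-tuning the convolutional network. Swap the network's input data for the new dataset (the final classifier is usually replaced as well so that it matches the new classes) and continue training. You can fine-tune all layers or only some of them. The earlier layers usually extract generic image features (such as edge or color detectors) that are useful for many tasks, while the later layers extract features tied to specific classes, so fine-tuning often only needs to touch the later layers.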
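The two scenarios above can be sketched in a few lines of PyTorch. This is only a minimal illustration: `build_backbone` is a hypothetical stand-in for a real pretrained network (it has random weights here), and `prepare_for_finetuning` is a helper name made up for this note.

```python
import torch
import torch.nn as nn


def build_backbone() -> nn.Sequential:
    # Hypothetical stand-in for a pretrained network: a tiny conv feature
    # extractor followed by a 1000-class head (as on ImageNet).
    return nn.Sequential(
        nn.Conv2d(3, 8, kernel_size=3, padding=1),
        nn.ReLU(),
        nn.AdaptiveAvgPool2d((1, 1)),
        nn.Flatten(),
        nn.Linear(8, 1000),
    )


def prepare_for_finetuning(model: nn.Sequential, num_classes: int,
                           train_backbone: bool = False) -> nn.Sequential:
    # Scenario 1 (feature extractor): freeze every existing layer.
    # Scenario 2 (fine-tuning): pass train_backbone=True to keep them trainable.
    for param in model.parameters():
        param.requires_grad = train_backbone
    # Replace the final classifier so it matches the new label set.
    # A freshly created layer is trainable by default.
    in_features = model[-1].in_features
    model[-1] = nn.Linear(in_features, num_classes)
    return model


model = prepare_for_finetuning(build_backbone(), num_classes=10)
trainable = [name for name, p in model.named_parameters() if p.requires_grad]
logits = model(torch.randn(2, 3, 32, 32))
print(trainable)
print(logits.shape)
```

With `train_backbone=False` only the new head receives gradient updates, so an optimizer can be built from just the parameters whose `requires_grad` is true.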
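Inception network:
Problems with earlier networks:
- 1. Because the location and size of the salient information vary greatly from image to image, choosing the right kernel size for a convolution is difficult. Images whose information is distributed more globally favor larger kernels; images whose information is more local favor smaller kernels.
- 2. Very deep networks overfit more easily, and propagating gradient updates through the whole network is hard.
- 3. Naively stacking large convolutional layers is computationally expensive.
Solution:
- Why not run filters of several sizes at the same level? The network becomes a bit "wider" rather than "deeper". The authors designed the Inception module on this idea: it convolves the input with 1x1, 3x3 and 5x5 filters and also performs max pooling. The outputs of all branches are concatenated and passed to the next Inception module. (Placing a 1x1 convolution before the 3x3 and 5x5 convolutions reduces the computational cost.)
- This widens the network on the one hand and makes it more adaptable to different scales on the other.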
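To see why a 1x1 convolution in front of a 5x5 convolution is cheaper, it helps to count weights. The channel sizes below (192 in, 16 in the bottleneck, 32 out) are illustrative assumptions, not values taken from these notes.

```python
import torch
import torch.nn as nn


def count_params(module: nn.Module) -> int:
    # Total number of learnable weights in a module.
    return sum(p.numel() for p in module.parameters())


# A 5x5 convolution applied directly to 192 input channels.
direct = nn.Conv2d(192, 32, kernel_size=5, padding=2, bias=False)

# The same 5x5 convolution, but after a 1x1 convolution has reduced
# the input to 16 channels.
bottleneck = nn.Sequential(
    nn.Conv2d(192, 16, kernel_size=1, bias=False),
    nn.Conv2d(16, 32, kernel_size=5, padding=2, bias=False),
)

x = torch.randn(1, 192, 28, 28)
print(direct(x).shape, bottleneck(x).shape)              # same output shape
print(count_params(direct), count_params(bottleneck))    # 153600 vs 15872
```

Both paths produce a 32-channel map of the same spatial size, but the bottleneck version has roughly a tenth of the weights, and the multiply-accumulate count shrinks by the same factor.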
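Inception V1:
- The network adds two auxiliary softmax classifiers at intermediate layers; their losses inject extra gradient during backpropagation, which helps against vanishing gradients.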
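A rough sketch of how auxiliary classifiers enter training: their losses are simply added, down-weighted, to the main loss. The helper name `loss_with_aux` is made up for this note, and the weight 0.3 is an assumption (it is the value commonly cited for GoogLeNet); the random logits stand in for the outputs of the three classifier heads.

```python
import torch
import torch.nn.functional as F


def loss_with_aux(main_logits, aux_logits_list, targets, aux_weight=0.3):
    # Loss of the main classifier at the end of the network.
    loss = F.cross_entropy(main_logits, targets)
    # Each auxiliary head adds a down-weighted loss, so layers in the
    # middle of the network receive gradient directly from a classifier.
    for aux_logits in aux_logits_list:
        loss = loss + aux_weight * F.cross_entropy(aux_logits, targets)
    return loss


torch.manual_seed(0)
targets = torch.tensor([1, 0, 3, 2])
main_logits = torch.randn(4, 5, requires_grad=True)
aux1 = torch.randn(4, 5, requires_grad=True)
aux2 = torch.randn(4, 5, requires_grad=True)

total = loss_with_aux(main_logits, [aux1, aux2], targets)
total.backward()
print(total.item())
print(aux1.grad is not None, aux2.grad is not None)
```

The auxiliary heads are only needed during training; at inference time only the main classifier's output is used.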
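Inception V2:
- Adds batch normalization to layer inputs (before the ReLU/softmax activation), normalizing them to roughly zero mean and unit variance rather than to the range 0-1;
- Replaces each 5x5 convolution with two stacked 3x3 convolutions (this reduces the parameter count, though correspondingly more information may be lost).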
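Inception V3:
- Factorizing into smaller convolutions is very effective (replacing an nxn kernel with a 1xn kernel followed by an nx1 kernel). It reduces the parameter count, mitigates overfitting, and increases the network's nonlinear expressive power.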
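The factorizations described for V2 and V3 can be checked by counting weights. This is a small sketch; the channel count of 64 is an arbitrary choice for illustration.

```python
import torch
import torch.nn as nn


def count_params(module: nn.Module) -> int:
    return sum(p.numel() for p in module.parameters())


c = 64  # illustrative channel count

# V2 idea: one 5x5 convolution vs. two stacked 3x3 convolutions.
conv5x5 = nn.Conv2d(c, c, kernel_size=5, padding=2, bias=False)
two_3x3 = nn.Sequential(
    nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False),
    nn.ReLU(),  # the extra nonlinearity comes for free
    nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False),
)

# V3 idea: one 3x3 convolution vs. a 1x3 followed by a 3x1 convolution.
conv3x3 = nn.Conv2d(c, c, kernel_size=3, padding=1, bias=False)
asym = nn.Sequential(
    nn.Conv2d(c, c, kernel_size=(1, 3), padding=(0, 1), bias=False),
    nn.ReLU(),
    nn.Conv2d(c, c, kernel_size=(3, 1), padding=(1, 0), bias=False),
)

x = torch.randn(1, c, 32, 32)
for name, layer in [("5x5", conv5x5), ("two 3x3", two_3x3),
                    ("3x3", conv3x3), ("1x3 + 3x1", asym)]:
    print(name, tuple(layer(x).shape), count_params(layer))
```

All four variants keep the 32x32 spatial size; the factorized versions need fewer weights (73728 vs 102400, and 24576 vs 36864) while adding one more nonlinearity.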
Inception V4:
- Combines the Inception design with ResNet-style residual connections, which lets the network become wider and deeper.
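A minimal sketch of what combining Inception branches with a residual connection might look like. The class name `InceptionResidualBlock` and all branch widths are hypothetical choices for this note and do not reproduce the published architecture.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class InceptionResidualBlock(nn.Module):
    def __init__(self, channels: int):
        super().__init__()
        # Two parallel branches with different receptive fields.
        self.branch1x1 = nn.Conv2d(channels, 32, kernel_size=1)
        self.branch3x3 = nn.Sequential(
            nn.Conv2d(channels, 32, kernel_size=1),
            nn.ReLU(),
            nn.Conv2d(32, 32, kernel_size=3, padding=1),
        )
        # 1x1 projection back to the input width so the sum is well defined.
        self.project = nn.Conv2d(64, channels, kernel_size=1)

    def forward(self, x):
        branches = torch.cat([self.branch1x1(x), self.branch3x3(x)], dim=1)
        # Residual connection: add the block input to the branch output.
        return F.relu(x + self.project(F.relu(branches)))


block = InceptionResidualBlock(64)
y = block(torch.randn(2, 64, 16, 16))
print(y.shape)
```

Because the block preserves its input shape, many such blocks can be stacked, and the identity path gives gradients a short route back to earlier layers.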
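From input to output, a convolutional network should gradually shrink the spatial size of the feature maps while increasing the number of output channels, that is, simplify the spatial structure and convert spatial information into higher-order abstract features.
The Inception module's idea of using several branches to extract high-level features at different levels of abstraction is very effective and enriches the network's expressive power.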
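Advantages:
- Uses 1x1 kernels, which are very cost-effective: with little computation they add one more layer of feature transformation and nonlinearity.
- Introduces Batch Normalization, which pulls the input distribution of each layer's neurons toward a standardized distribution with mean 0 and variance 1, so that the inputs fall into the sensitive region of the activation function; this avoids vanishing gradients and speeds up convergence.
- Introduces the Inception module, a structure that combines four branches.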
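The effect of Batch Normalization on the input statistics is easy to check. A small sketch, assuming a batch whose values are deliberately shifted and scaled away from mean 0 and variance 1:

```python
import torch
import torch.nn as nn

torch.manual_seed(0)

# A batch of feature maps with mean around 3 and standard deviation around 5.
x = torch.randn(16, 4, 8, 8) * 5.0 + 3.0

bn = nn.BatchNorm2d(4)
bn.train()  # in training mode BN normalizes with the statistics of the batch
y = bn(x)

# Per-channel statistics, computed over batch, height and width.
mean = y.mean(dim=(0, 2, 3))
var = y.var(dim=(0, 2, 3), unbiased=False)
print(x.mean().item(), x.var().item())
print(mean, var)
```

After the layer, each channel has mean close to 0 and variance close to 1. The learnable scale and shift of `nn.BatchNorm2d` start at 1 and 0, so they do not change this at initialization.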
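MobileNet
A lightweight deep neural network whose core idea is the depthwise separable convolution.
(First convolve each input channel with its own single-channel 3x3 filter, so a 3-channel input yields 3 channels; then apply n 1x1 convolutions across those channels to obtain n output channels.)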
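The two steps in the parentheses above map directly onto two `nn.Conv2d` layers. A small sketch for a 3-channel input, with n = 16 chosen arbitrarily for illustration:

```python
import torch
import torch.nn as nn

n = 16                          # hypothetical number of output channels
x = torch.randn(1, 3, 32, 32)   # a 3-channel input

# Step 1 (depthwise): groups=3 gives each input channel its own 3x3 filter.
depthwise = nn.Conv2d(3, 3, kernel_size=3, padding=1, groups=3, bias=False)
# Step 2 (pointwise): n 1x1 filters mix the 3 channels into n channels.
pointwise = nn.Conv2d(3, n, kernel_size=1, bias=False)
# A standard 3x3 convolution with the same input and output sizes.
standard = nn.Conv2d(3, n, kernel_size=3, padding=1, bias=False)

mid = depthwise(x)
out = pointwise(mid)
separable_params = depthwise.weight.numel() + pointwise.weight.numel()

print(tuple(depthwise.weight.shape))   # one single-channel 3x3 filter per channel
print(tuple(mid.shape), tuple(out.shape))
print(separable_params, standard.weight.numel())
```

Here the separable version uses 27 + 48 = 75 weights against 432 for the standard convolution; the gap widens as the channel counts grow.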
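Neural network design tips:
1. Use simple but powerful models (VGG, ResNet)
2. Increase depth
3. Increase symmetry
4. Use regularization and dropout to improve generalization
5. Use data augmentation
6. Use batch normalization
7. Use pretrained models
8. Use cyclical learning rates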
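As an example of tip 8, PyTorch ships a cyclical schedule in `torch.optim.lr_scheduler.CyclicLR`. The sketch below uses a toy model and made-up values for `base_lr`, `max_lr` and `step_size_up`, and only records the learning rate instead of really training.

```python
import torch
import torch.nn as nn

model = nn.Linear(10, 2)  # toy model, only here to give the optimizer parameters
optimizer = torch.optim.SGD(model.parameters(), lr=0.001, momentum=0.9)

# Triangular cycle: the rate climbs from base_lr to max_lr over 4 steps,
# then falls back over the next 4.
scheduler = torch.optim.lr_scheduler.CyclicLR(
    optimizer, base_lr=0.001, max_lr=0.01, step_size_up=4, mode="triangular"
)

lrs = []
for step in range(8):
    # A real loop would run forward, loss.backward() here.
    optimizer.step()
    lrs.append(scheduler.get_last_lr()[0])
    scheduler.step()  # CyclicLR is stepped once per batch, not once per epoch

print([round(lr, 5) for lr in lrs])
```

The recorded rates rise linearly to `max_lr` at the fifth step and then decrease again, which is the triangular pattern the schedule is named for.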
Implementing InceptionV3
"""
1. Implement InceptionV3 (a simplified version, tested on MNIST)
"""
import torch
import torch.nn as nn
import torch.nn.functional as F
import torchvision
import torchvision.transforms as transforms


# Custom convolution layer: Conv2d followed by BatchNorm and ReLU
class conv2d_bn(nn.Module):
    def __init__(self, in_channels, out_channels, kernel_size, stride=1, padding=0, bias=False):
        super(conv2d_bn, self).__init__()
        # Accept 'valid' / 'same' strings as well as explicit integers
        if isinstance(padding, str):
            if padding == 'valid':
                padding = 0
            elif padding == 'same':
                padding = (kernel_size - 1) // 2 if isinstance(kernel_size, int) \
                    else tuple((k - 1) // 2 for k in kernel_size)
        self.conv = nn.Conv2d(in_channels, out_channels, kernel_size, stride, padding, bias=bias)
        self.bn = nn.BatchNorm2d(out_channels, eps=1e-3)
        self.relu = nn.ReLU(inplace=True)

    def forward(self, x):
        x = self.conv(x)
        x = self.bn(x)
        x = self.relu(x)
        return x


class inceptionA(nn.Module):
    def __init__(self, in_channels, pool_features):
        super(inceptionA, self).__init__()
        # 1x1 branch
        self.branch1x1 = conv2d_bn(in_channels, 64, kernel_size=1)
        # 5x5 branch (1x1 reduction first)
        self.branch5x5_1 = conv2d_bn(in_channels, 48, kernel_size=1)
        self.branch5x5_2 = conv2d_bn(48, 64, kernel_size=5, padding=2)
        # Double 3x3 branch (1x1 reduction first)
        self.branch3x3dbl_1 = conv2d_bn(in_channels, 64, kernel_size=1)
        self.branch3x3dbl_2 = conv2d_bn(64, 96, kernel_size=3, padding=1)
        self.branch3x3dbl_3 = conv2d_bn(96, 96, kernel_size=3, padding=1)
        # 1x1 convolution applied after average pooling
        self.branch_pool = conv2d_bn(in_channels, pool_features, kernel_size=1)

    def forward(self, x):
        branch1x1 = self.branch1x1(x)

        branch5x5 = self.branch5x5_1(x)
        branch5x5 = self.branch5x5_2(branch5x5)

        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool(branch_pool)

        # Collect the outputs of the four branches of the Inception module
        outputs = (branch1x1, branch5x5, branch3x3dbl, branch_pool)
        # Concatenate along dimension 1 (channels): 64 + 64 + 96 + pool_features
        return torch.cat(outputs, 1)


# Defined for reference; not used by the InceptionV3 class below
class inceptionB(nn.Module):
    def __init__(self, in_channels):
        super(inceptionB, self).__init__()
        self.branch3x3 = conv2d_bn(in_channels, 384, kernel_size=3, padding=1)
        self.branch3x3dbl_1 = conv2d_bn(in_channels, 64, kernel_size=1)
        self.branch3x3dbl_2 = conv2d_bn(64, 96, kernel_size=3, padding=1)
        self.branch3x3dbl_3 = conv2d_bn(96, 96, kernel_size=3, padding=1)
        # Added: 1x1 convolution for the pooling branch
        self.branch_pool_conv = conv2d_bn(in_channels, 32, kernel_size=1)

    def forward(self, x):
        branch3x3 = self.branch3x3(x)

        branch3x3dbl = self.branch3x3dbl_1(x)
        branch3x3dbl = self.branch3x3dbl_2(branch3x3dbl)
        branch3x3dbl = self.branch3x3dbl_3(branch3x3dbl)

        branch_pool = F.avg_pool2d(x, kernel_size=3, stride=1, padding=1)
        branch_pool = self.branch_pool_conv(branch_pool)

        outputs = (branch3x3, branch3x3dbl, branch_pool)
        return torch.cat(outputs, 1)


class InceptionV3(nn.Module):
    def __init__(self, input_shape=[299, 299, 3], classes=100):
        super(InceptionV3, self).__init__()
        self.input_shape = input_shape
        # Stem convolutions (1 input channel because MNIST is grayscale)
        self.conv1 = conv2d_bn(1, 32, kernel_size=3, padding=1)
        self.conv2 = conv2d_bn(32, 32, kernel_size=3, padding=1)
        self.conv3 = conv2d_bn(32, 64, kernel_size=3)
        self.max_pool1 = nn.MaxPool2d(kernel_size=3, stride=(2, 2))
        self.conv4 = conv2d_bn(64, 80, kernel_size=1, padding=0)
        self.conv5 = conv2d_bn(80, 192, kernel_size=3, padding=0)
        self.max_pool2 = nn.MaxPool2d(kernel_size=3, stride=(2, 2))
        # Inception modules
        self.inception3a = inceptionA(192, pool_features=32)  # outputs 256 channels
        self.inception3b = inceptionA(256, pool_features=64)  # outputs 288 channels
        # Final classifier
        self.avg_pool = nn.AdaptiveAvgPool2d((1, 1))
        # 288 = 64 + 64 + 96 + 64, the channel count produced by inception3b;
        # recompute it if the modules above change
        self.fc = nn.Linear(288, classes)

    def forward(self, x):
        x = self.conv1(x)
        x = self.conv2(x)
        x = self.conv3(x)
        x = self.max_pool1(x)
        x = self.conv4(x)
        x = self.conv5(x)
        x = self.max_pool2(x)
        x = self.inception3a(x)
        x = self.inception3b(x)
        x = self.avg_pool(x)
        x = torch.flatten(x, 1)
        x = self.fc(x)
        return x


def test_model(model, test_loader, device):
    device = torch.device(device)
    model.to(device)  # keep the model and the data on the same device
    model.eval()
    correct = 0
    total = 0
    with torch.no_grad():
        for data in test_loader:
            images, labels = data[0].to(device), data[1].to(device)
            outputs = model(images)
            _, predicted = torch.max(outputs, 1)
            total += labels.size(0)
            correct += (predicted == labels).sum().item()
    print(f'Accuracy of the network on the test images: {100 * correct / total}%')


# Load the data; returns the training and test DataLoaders
def load_data_set():
    transform = transforms.Compose([
        transforms.Resize((299, 299)),
        transforms.ToTensor(),
        transforms.Normalize((0.1307,), (0.3081,))
    ])
    train_set = torchvision.datasets.MNIST(root='./data', train=True,
                                           download=True, transform=transform)
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=32,
                                               shuffle=True, num_workers=2)
    test_set = torchvision.datasets.MNIST(root='./data', train=False,
                                          download=True, transform=transform)
    test_loader = torch.utils.data.DataLoader(test_set, batch_size=32,
                                              shuffle=False, num_workers=2)
    return train_loader, test_loader


if __name__ == '__main__':
    model = InceptionV3(classes=10)
    # # Data preprocessing for 3-channel images
    # transform = transforms.Compose([
    #     transforms.Resize((299, 299)),
    #     transforms.ToTensor(),
    #     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))
    # ])
    train_loader, test_loader = load_data_set()
    # Test the model (it is untrained here, so expect roughly chance accuracy)
    test_model(model, test_loader, "cpu")
Implementing MobileNet
"""
Implement MobileNet
Depthwise separable convolution
"""
import torch
import torch.nn as nn
import torch.nn.functional as F


# Depthwise separable convolution block
class DepthwiseSeparableConv(nn.Module):
    def __init__(self, in_channels, out_channels, stride=1):
        super(DepthwiseSeparableConv, self).__init__()
        # Depthwise: one 3x3 filter per input channel (groups=in_channels)
        self.depthwise = nn.Conv2d(in_channels, in_channels, kernel_size=3,
                                   stride=stride, padding=1, groups=in_channels)
        # Pointwise: 1x1 convolution that mixes channels
        self.pointwise = nn.Conv2d(in_channels, out_channels, kernel_size=1)

    def forward(self, x):
        x = self.depthwise(x)
        x = self.pointwise(x)
        return x


class MobileNet(nn.Module):
    def __init__(self, num_classes=1000):
        super(MobileNet, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, kernel_size=3, stride=2, padding=1)
        self.dwconv1 = DepthwiseSeparableConv(32, 64, stride=1)
        self.dwconv2 = DepthwiseSeparableConv(64, 128, stride=2)
        self.dwconv3 = DepthwiseSeparableConv(128, 128, stride=1)
        self.dwconv4 = DepthwiseSeparableConv(128, 256, stride=2)
        self.dwconv5 = DepthwiseSeparableConv(256, 256, stride=1)
        self.dwconv6 = DepthwiseSeparableConv(256, 512, stride=2)
        # Five consecutive depthwise separable convolutions with stride 1
        self.dwconv7 = DepthwiseSeparableConv(512, 512, stride=1)
        self.dwconv8 = DepthwiseSeparableConv(512, 512, stride=1)
        self.dwconv9 = DepthwiseSeparableConv(512, 512, stride=1)
        self.dwconv10 = DepthwiseSeparableConv(512, 512, stride=1)
        self.dwconv11 = DepthwiseSeparableConv(512, 512, stride=1)
        self.dwconv12 = DepthwiseSeparableConv(512, 1024, stride=2)
        self.dwconv13 = DepthwiseSeparableConv(1024, 1024, stride=1)
        self.avgpool = nn.AdaptiveAvgPool2d((1, 1))
        self.fc = nn.Linear(1024, num_classes)

    def forward(self, x):
        x = F.relu(self.conv1(x))
        x = F.relu(self.dwconv1(x))
        x = F.relu(self.dwconv2(x))
        x = F.relu(self.dwconv3(x))
        x = F.relu(self.dwconv4(x))
        x = F.relu(self.dwconv5(x))
        x = F.relu(self.dwconv6(x))
        x = F.relu(self.dwconv7(x))
        x = F.relu(self.dwconv8(x))
        x = F.relu(self.dwconv9(x))
        x = F.relu(self.dwconv10(x))
        x = F.relu(self.dwconv11(x))
        x = F.relu(self.dwconv12(x))
        x = F.relu(self.dwconv13(x))
        x = self.avgpool(x)
        x = x.view(-1, 1024)  # flatten to (batch, 1024)
        x = self.fc(x)
        return x


# Test the model
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = MobileNet(num_classes = 10).to(device)
input_tensor = torch.randn(1, 3, 224, 224).to(device)
output = model(input_tensor)
print(output.shape)