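[Deep Learning Starter Project] Handwritten Digit Recognition with a Multilayer Perceptron (MLP)

Handwritten Digit Recognition with a Multilayer Perceptron (MLP)

    • Import the required packages
    • Get package version information
  • Download and visualize the data
    • View one batch of data
    • View an image in detail
    • Set the random seed
  • Define the model architecture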
      • Build model_1
      • Build model_2
    • Train the Network (30 marks)
      • Train model_1
        • Train model_1
        • Visualize the training process of the model_1
          • Plot the change of the loss of model_1 during training
          • Plot the change of the accuracy of model_1 during training
          • Plot the change of the AUC Score of model_1 during training
    • Test the Trained Network (20 marks)
      • Test model_1

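In this task, we use PyTorch to train two multilayer perceptrons (MLPs) to classify images of handwritten digits from the MNIST database.

The process is broken down into the following steps:

  • Load and visualize the data.
  • Define the neural networks.
  • Train the models.
  • Evaluate the trained models on the test dataset.
  • Analyze the results.

Import the required packages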

import torch
from torch import nn
import numpy as np
import logging
import sys

# set up logging
logging.basicConfig(level=logging.INFO,
                    format='%(asctime)s %(levelname)s: %(message)s',
                    datefmt='%Y-%m-%d %H:%M:%S')

Get package version information

logging.info('The version information:')
logging.info(f'Python: {sys.version}')
logging.info(f'PyTorch: {torch.__version__}')
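# the rest of this notebook moves the model and data to the GPU, so CUDA is required
assert torch.cuda.is_available(), 'Please finish setting up your GPU development environment'

Download and visualize the data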

from torchvision import datasets
import torchvision.transforms as transforms
from torch.utils.data.dataset import Dataset

# number of subprocesses to use for data loading
num_workers = 0
# how many samples per batch to load
batch_size = 20

# convert data to torch.FloatTensor
transform = transforms.ToTensor()

# choose the training and test datasets
train_data = datasets.MNIST(root='data', train=True, download=True, transform=transform)
test_data = datasets.MNIST(root='data', train=False, download=True, transform=transform)


def classify_label(dataset, num_classes):
    # group the sample indices by class label
    list_index = [[] for _ in range(num_classes)]
    for idx, datum in enumerate(dataset):
        list_index[datum[1]].append(idx)
    return list_index


def partition_train(list_label2indices: list, num_per_class: int):
    # shuffle each class with a fixed seed and keep the first num_per_class indices
    random_state = np.random.RandomState(0)
    list_label2indices_train = []
    for indices in list_label2indices:
        random_state.shuffle(indices)
        list_label2indices_train.extend(indices[:num_per_class])
    return list_label2indices_train


class Indices2Dataset(Dataset):
    def __init__(self, dataset):
        self.dataset = dataset
        self.indices = None

    def load(self, indices: list):
        self.indices = indices

    def __getitem__(self, idx):
        idx = self.indices[idx]
        image, label = self.dataset[idx]
        return image, label

    def __len__(self):
        return len(self.indices)


# sort train data by label
list_label2indices = classify_label(dataset=train_data, num_classes=10)

# how many samples per class to train
list_train = partition_train(list_label2indices, 500)

# prepare data loaders
indices2data = Indices2Dataset(train_data)
indices2data.load(list_train)
train_loader = torch.utils.data.DataLoader(indices2data, batch_size=batch_size, num_workers=num_workers, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, num_workers=num_workers, shuffle=True)
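
The subsetting logic above keeps 500 images per digit class, which gives 10 × 500 = 5,000 training images, or 250 batches of 20. As a rough, dataset-free illustration of the same idea, the sketch below applies it to a toy label list. `balanced_indices` and `toy_labels` are hypothetical names, and the standard-library `random` module stands in for `np.random.RandomState`, so the selected indices will not match those of the code above.

```python
import random
from collections import Counter


def balanced_indices(labels, num_classes, num_per_class, seed=0):
    """Return sample indices with at most num_per_class entries per class."""
    rng = random.Random(seed)
    # group the sample indices by class label
    by_class = [[] for _ in range(num_classes)]
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    # shuffle each group and keep the first num_per_class indices
    chosen = []
    for indices in by_class:
        rng.shuffle(indices)
        chosen.extend(indices[:num_per_class])
    return chosen


toy_labels = [i % 3 for i in range(30)]  # 10 samples each of classes 0, 1, 2
subset = balanced_indices(toy_labels, num_classes=3, num_per_class=4)
counts = Counter(toy_labels[i] for i in subset)
print(len(subset), dict(counts))
```

Every class contributes the same number of samples, so the resulting training subset is class-balanced regardless of the class frequencies in the full dataset.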

View one batch of data

import matplotlib.pyplot as plt
%matplotlib inline

# obtain one batch of training images
dataiter = iter(train_loader)
# use the built-in next(); the .next() method is not available on DataLoader iterators in recent PyTorch releases
images, labels = next(dataiter)
images = images.numpy()

# plot the images in the batch, along with the corresponding labels
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(20):
    ax = fig.add_subplot(2, 20//2, idx+1, xticks=[], yticks=[])
    ax.imshow(np.squeeze(images[idx]), cmap='gray')
    # print out the correct label for each image
    # .item() gets the value contained in a Tensor
    ax.set_title(str(labels[idx].item()))

View an image in detail

img = np.squeeze(images[1])

fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(111)
ax.imshow(img, cmap='gray')
width, height = img.shape
thresh = img.max() / 2.5
# annotate every pixel with its value, using a text color that contrasts with the pixel
for x in range(width):
    for y in range(height):
        val = round(img[x][y], 2) if img[x][y] != 0 else 0
        ax.annotate(str(val), xy=(y, x),
                    horizontalalignment='center',
                    verticalalignment='center',
                    color='white' if img[x][y] < thresh else 'black')

Set the random seed

The random seed is used to make the results reproducible.

import random
import os

## give the number you like such as 2023
seed_value = 2023

np.random.seed(seed_value)
random.seed(seed_value)
os.environ['PYTHONHASHSEED'] = str(seed_value)

torch.manual_seed(seed_value)
torch.cuda.manual_seed(seed_value)
torch.cuda.manual_seed_all(seed_value)
torch.backends.cudnn.deterministic = True
logging.info(f"the value of the random seed: {seed_value}")
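
Seeding works because a pseudo-random generator restarted from the same seed replays the same sequence. The minimal sketch below shows this with Python's built-in generator only; `draw` is a hypothetical helper, and the NumPy and PyTorch generators are separate and have to be seeded on their own, as the code above does.

```python
import random


def draw(seed, n=5):
    """Reseed the built-in generator and return the next n random floats."""
    random.seed(seed)
    return [random.random() for _ in range(n)]


first = draw(2023)
second = draw(2023)   # same seed: identical sequence
other = draw(2024)    # different seed: different sequence

print(first == second)
print(first == other)
```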

Define the model architecture

  • Input: a 784-dim Tensor of pixel values for each image.
  • Output: a 10-dim Tensor of class scores for an input image, one score per digit class.

You need to implement the following three parts:

  1. a vanilla multi-layer perceptron. (10 marks)
  2. a multi-layer perceptron with regularization (dropout or L2 or both). (10 marks)
  3. the corresponding loss functions and optimizers. (10 marks)
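
For the loss function, `torch.nn.CrossEntropyLoss` expects raw class scores (logits) and applies log-softmax internally, which is why the models below end in a plain linear layer. The sketch below reproduces the per-sample computation in plain Python as an illustration only; `cross_entropy` is a hypothetical helper, not a PyTorch function.

```python
import math


def cross_entropy(logits, target):
    """Negative log-softmax of the target class, computed stably."""
    m = max(logits)  # subtract the maximum to avoid overflow in exp
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


logits = [2.0, 1.0, 0.1]
loss_correct = cross_entropy(logits, target=0)  # highest score on the true class
loss_wrong = cross_entropy(logits, target=2)    # lowest score on the true class
loss_uniform = cross_entropy([0.0] * 10, target=3)

print(round(loss_correct, 4), round(loss_wrong, 4), round(loss_uniform, 4))
```

With ten classes and uninformative scores, the loss is ln(10) ≈ 2.303, so an untrained model would be expected to start near that value.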

Build model_1

## Define the MLP architecture
class VanillaMLP(nn.Module):
    def __init__(self):
        super(VanillaMLP, self).__init__()
        # implement your codes here
        self.net = nn.Sequential(nn.Linear(784, 256), nn.ReLU(),
                                 nn.Linear(256, 256), nn.ReLU(),
                                 nn.Linear(256, 10))

    def forward(self, x):
        # flatten image input
        x = x.view(-1, 28 * 28)
        # implement your codes here
        x = self.net(x)
        return x


# initialize the MLP
model_1 = VanillaMLP()

# specify loss function
# implement your codes here
loss_model_1 = torch.nn.CrossEntropyLoss()

# specify your optimizer
# implement your codes here
optimizer_model_1 = torch.optim.Adam(model_1.parameters(), lr=1e-4)
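
To get a feel for the size of this network, the number of trainable parameters can be worked out by hand: each fully connected layer has `fan_in * fan_out` weights plus `fan_out` biases. The sketch below does the arithmetic for the 784-256-256-10 layout; `mlp_param_count` is a hypothetical helper.

```python
def mlp_param_count(layer_sizes):
    """Count the weights and biases of a fully connected network."""
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


sizes = [784, 256, 256, 10]
total_params = mlp_param_count(sizes)
print(total_params)  # 200960 + 65792 + 2570 = 269322
```

Roughly three quarters of the parameters sit in the first layer, because it connects all 784 input pixels to 256 hidden units.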

Build model_2

## Define the MLP architecture
class RegularizedMLP(nn.Module):
    def __init__(self):
        super(RegularizedMLP, self).__init__()
        # implement your codes here
        self.net = nn.Sequential(nn.Linear(784, 256), nn.Dropout(0.3), nn.ReLU(),
                                 nn.Linear(256, 256), nn.Dropout(0.5), nn.ReLU(),
                                 nn.Linear(256, 10))

    def forward(self, x):
        # flatten image input
        x = x.view(-1, 28 * 28)
        # implement your codes here
        x = self.net(x)
        return x


# initialize the MLP
model_2 = RegularizedMLP()

# specify loss function
# implement your codes here
loss_model_2 = torch.nn.CrossEntropyLoss()

# specify your optimizer
# implement your codes here
optimizer_model_2 = torch.optim.Adam(model_2.parameters(), lr=1e-4)  # weight_decay=1e-5
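
During training, `nn.Dropout(p)` zeroes each activation with probability `p` and scales the surviving ones by `1 / (1 - p)`; in evaluation mode it passes its input through unchanged. The plain-Python sketch below mimics that behavior to show why the expected activation stays the same; the `dropout` function here is a hypothetical stand-in, not the PyTorch implementation.

```python
import random


def dropout(values, p, training, rng):
    """Inverted dropout: drop with probability p, rescale survivors by 1/(1-p)."""
    if not training or p == 0.0:
        return list(values)  # identity in evaluation mode
    scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * scale for v in values]


rng = random.Random(0)
activations = [1.0] * 10000

train_out = dropout(activations, p=0.3, training=True, rng=rng)
eval_out = dropout(activations, p=0.3, training=False, rng=rng)

kept_fraction = sum(1 for v in train_out if v != 0.0) / len(train_out)
mean_train = sum(train_out) / len(train_out)
print(kept_fraction, mean_train)
```

This is why the model has to be switched with `model.train()` and `model.eval()`: the same layer behaves differently in the two modes.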

Train the Network (30 marks)

Train your models in the following two cells.

The following loop trains for 20 epochs; feel free to change this number. For now, we suggest somewhere between 20 and 50 epochs. As you train, watch how the training loss decreases over time. We want it to decrease while also avoiding overfitting the training data.

We will introduce some metrics for classification tasks, and you will learn how to implement them with scikit-learn.

A reference is provided for further reading: evaluation_metrics_spring2020.

During training, we will use accuracy, area under the ROC curve (AUC), and top-k accuracy.
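
Top-k accuracy counts a prediction as correct when the true label is among the k highest-scoring classes, so top-1 accuracy is the ordinary accuracy. The sketch below spells this out on a small hand-made example; `top_k_accuracy` is a hypothetical helper written for illustration, and its tie handling may differ from scikit-learn's `top_k_accuracy_score`.

```python
def top_k_accuracy(labels, scores, k):
    """Fraction of samples whose true label is among the k largest scores."""
    hits = 0
    for label, row in zip(labels, scores):
        ranked = sorted(range(len(row)), key=lambda c: row[c], reverse=True)
        hits += label in ranked[:k]
    return hits / len(labels)


labels = [0, 1, 2, 2]
scores = [
    [2.0, 1.0, 0.1],  # true class 0 is ranked 1st
    [1.5, 1.0, 0.2],  # true class 1 is ranked 2nd
    [0.9, 0.3, 0.5],  # true class 2 is ranked 2nd
    [1.2, 0.8, 0.1],  # true class 2 is ranked 3rd
]

top1 = top_k_accuracy(labels, scores, k=1)
top2 = top_k_accuracy(labels, scores, k=2)
print(top1, top2)  # 0.25 0.75
```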

The key parts in the training process are left for you to implement.

Train model_1

Train model_1
# import scikit-learn packages
# please use the functions imported from scikit-learn to compute the metrics during training
from sklearn.metrics import accuracy_score, roc_auc_score, top_k_accuracy_score

# number of epochs to train the model
n_epochs = 20  # suggest training between 20-50 epochs

model_1.train()  # prep model for training

train_loss_list = []
train_acc_list = []
train_auc_list = []
train_top_k_acc_list = []

# GPU check
logging.info(f'GPU is available: {torch.cuda.is_available()}')
if torch.cuda.is_available():
    gpu_num = torch.cuda.device_count()
    logging.info(f"Train model on {gpu_num} GPUs:")
    for i in range(gpu_num):
        print('\t GPU {}.: {}'.format(i, torch.cuda.get_device_name(i)))
    model_1 = model_1.cuda()

for epoch in range(n_epochs):
    # monitor training loss
    train_loss = 0.0
    pred_array = None
    label_array = None
    one_hot_label_matrix = None
    pred_matrix = None
    for data, label in train_loader:
        data = data.cuda()
        label = label.cuda()
        # implement your code here
        optimizer_model_1.zero_grad()
        pred = model_1(data)
        loss = loss_model_1(pred, label)
        loss.backward()
        optimizer_model_1.step()
        # the loss is a batch mean, so weight it by the batch size;
        # .item() stores a plain float instead of a tensor tied to the graph
        train_loss += loss.item() * data.size(0)
        # finish the computation of the variables used by the metrics
        # implement your codes here
        if pred_matrix is None:
            pred_matrix = pred.cpu().detach().numpy()
        else:
            pred_matrix = np.concatenate((pred_matrix, pred.cpu().detach().numpy()))
        if one_hot_label_matrix is None:
            one_hot_label_matrix = nn.functional.one_hot(label, num_classes=10).cpu().detach().numpy()
        else:
            one_hot_label_matrix = np.concatenate((one_hot_label_matrix, nn.functional.one_hot(label, num_classes=10).cpu().detach().numpy()))
        pred = torch.argmax(pred, dim=1)
        if pred_array is None:
            pred_array = pred.cpu().detach().numpy()
        else:
            pred_array = np.concatenate((pred_array, pred.cpu().detach().numpy()))
        if label_array is None:
            label_array = label.cpu().detach().numpy()
        else:
            label_array = np.concatenate((label_array, label.cpu().detach().numpy()))

    # print training statistics
    # read the API document at https://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics to finish your code
    # don't craft your own code
    # calculate average loss and accuracy over an epoch
    top_k = 3
    train_loss = train_loss / len(train_loader.dataset)
    train_acc = 100. * accuracy_score(label_array, pred_array)
    train_auc = roc_auc_score(one_hot_label_matrix, pred_matrix, multi_class='ovo')
    # scaled to a percentage to match the '%' in the log message below
    top_k_acc = 100. * top_k_accuracy_score(label_array, pred_matrix, k=top_k)
    # append the value of the metric to the list
    train_loss_list.append(train_loss)
    train_acc_list.append(train_acc)
    train_auc_list.append(train_auc)
    train_top_k_acc_list.append(top_k_acc)
    logging.info('Epoch: {} \tTraining Loss: {:.6f} \tTraining Acc: {:.2f}% \t top {} Acc: {:.2f}% \t AUC Score: {:.4f}'.format(
        epoch+1, train_loss, train_acc, top_k, top_k_acc, train_auc))

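
One detail worth checking when reporting an epoch loss: `nn.CrossEntropyLoss` returns the mean over the batch by default, so a per-sample average over the epoch needs each batch mean weighted by its batch size before dividing by the dataset size. The toy numbers below are made up purely to illustrate the arithmetic, including a smaller final batch.

```python
# hypothetical per-sample losses for a "dataset" of five samples
per_sample_losses = [0.2, 0.4, 0.6, 0.8, 1.0]
batches = [per_sample_losses[0:2], per_sample_losses[2:4], per_sample_losses[4:5]]

running = 0.0
for batch in batches:
    batch_mean = sum(batch) / len(batch)  # what a mean-reduced loss would return
    running += batch_mean * len(batch)    # convert back to a sum over the batch

epoch_loss = running / len(per_sample_losses)
true_mean = sum(per_sample_losses) / len(per_sample_losses)
print(epoch_loss, true_mean)
```

Summing the batch means without the weighting and then dividing by the dataset size would shrink the reported value by roughly the batch size.
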
Visualize the training process of the model_1

Please read the document to finish the training process visualization.
For more information, please refer to the document

Plot the change of the loss of model_1 during training
epochs_list = list(range(1,n_epochs+1))
plt.figure(figsize=(20, 8))
plt.plot(epochs_list, train_loss_list)
plt.title('Model_1 loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train'], loc='upper right')
plt.show()

Plot the change of the accuracy of model_1 during training
plt.figure(figsize=(20, 8))
plt.plot(epochs_list, train_acc_list)
plt.title('Model_1 accuracy')
plt.ylabel('accuracy')
plt.xlabel('Epoch')
plt.legend(['Train'], loc='upper right')
plt.show()

Plot the change of the AUC Score of model_1 during training
plt.figure(figsize=(20, 8))
plt.plot(epochs_list, train_auc_list)
plt.title('Model_1 auc')
plt.ylabel('Auc')
plt.xlabel('Epoch')
plt.legend(['Train'], loc='upper right')
plt.show()
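
For a single class treated one-vs-rest, the AUC plotted here can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counting half. The sketch below computes it by brute force on four made-up samples; `binary_auc` is a hypothetical helper, and scikit-learn arrives at the same quantity far more efficiently.

```python
def binary_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a correct ranking
    return wins / (len(pos) * len(neg))


labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
auc = binary_auc(labels, scores)
print(auc)  # 3 of the 4 positive/negative pairs are ordered correctly: 0.75
```

An AUC of 0.5 corresponds to random ranking and 1.0 to a perfect ranking, which is why the curve is expected to approach 1.0 as training progresses.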

Test the Trained Network (20 marks)

Test the performance of the trained models on the test data. In addition to the overall test accuracy, you should calculate the accuracy for each class.

For testing, we will use accuracy, top-k accuracy, precision, recall, F1-score, and the confusion matrix.

We will also visualize the confusion matrix.

Last but not least, we will compare your own implementation of the accuracy function with the scikit-learn implementation.

## define your implementation of function to compute accuracy
def accuracy_score_manual(label_array, pred_array):
    # implement your codes here
    accuracy = np.sum(label_array == pred_array) / float(len(label_array))
    return accuracy

Test model_1

from sklearn.metrics import classification_report, ConfusionMatrixDisplay

# initialize variables to monitor test loss and accuracy
test_loss = 0.0
pred_array = None
label_array = None
one_hot_label_matrix = None
pred_matrix = None

model_1.eval()  # prep model for *evaluation*

with torch.no_grad():  # gradients are not needed for evaluation
    for data, label in test_loader:
        data = data.cuda()
        label = label.cuda()
        # implement your code here
        pred = model_1(data)
        # accumulate the loss over all batches instead of keeping only the last one
        test_loss += loss_model_1(pred, label).item() * data.size(0)
        if pred_matrix is None:
            pred_matrix = pred.cpu().detach().numpy()
        else:
            pred_matrix = np.concatenate((pred_matrix, pred.cpu().detach().numpy()))
        if one_hot_label_matrix is None:
            one_hot_label_matrix = nn.functional.one_hot(label, num_classes=10).cpu().detach().numpy()
        else:
            one_hot_label_matrix = np.concatenate((one_hot_label_matrix, nn.functional.one_hot(label, num_classes=10).cpu().detach().numpy()))
        pred = torch.argmax(pred, dim=1)
        if pred_array is None:
            pred_array = pred.cpu().detach().numpy()
        else:
            pred_array = np.concatenate((pred_array, pred.cpu().detach().numpy()))
        if label_array is None:
            label_array = label.cpu().detach().numpy()
        else:
            label_array = np.concatenate((label_array, label.cpu().detach().numpy()))

# calculate and print avg test loss
test_loss = test_loss / len(test_loader.dataset)
test_acc = accuracy_score(label_array, pred_array)
test_auc = roc_auc_score(one_hot_label_matrix, pred_matrix, multi_class='ovo')
test_top_k3_acc = top_k_accuracy_score(label_array, pred_matrix, k=3)
test_top_k5_acc = top_k_accuracy_score(label_array, pred_matrix, k=5)

logging.info('Test Loss: {:.6f}'.format(test_loss))
logging.info('Test Accuracy: {:.6f}'.format(test_acc))
logging.info('Test top 3 Accuracy: {:.6f}'.format(test_top_k3_acc ))
logging.info('Test top 5 Accuracy: {:.6f}'.format(test_top_k5_acc ))
logging.info('The classification report of test for model_1')
print(classification_report(label_array, pred_array))

ConfusionMatrixDisplay.from_predictions(label_array,pred_array)
plt.show()
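
The per-class accuracy asked for in this section can be read off the confusion matrix: for each true class, divide the diagonal entry by the row total. This is the same number the classification report lists as recall for that class. The sketch below shows the calculation on a made-up three-class example; `confusion_matrix` and `per_class_accuracy` here are hypothetical plain-Python helpers, not the scikit-learn functions.

```python
def confusion_matrix(labels, preds, num_classes):
    """Rows are true classes, columns are predicted classes."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for true, pred in zip(labels, preds):
        matrix[true][pred] += 1
    return matrix


def per_class_accuracy(matrix):
    """Correct predictions of each class divided by that class's sample count."""
    return [row[i] / sum(row) if sum(row) else float('nan')
            for i, row in enumerate(matrix)]


labels = [0, 0, 1, 1, 1, 2, 2, 2, 2]
preds = [0, 1, 1, 1, 0, 2, 2, 2, 1]

cm = confusion_matrix(labels, preds, num_classes=3)
class_acc = per_class_accuracy(cm)
overall_acc = sum(cm[i][i] for i in range(3)) / len(labels)
print(cm)
print(class_acc, overall_acc)
```

The overall accuracy is the sample-weighted average of the per-class values, so a class with few samples can perform poorly without moving the overall number much.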
