[Deep Learning Starter Project] Handwritten Digit Recognition with a Multi-Layer Perceptron (MLP)

Handwritten Digit Recognition with a Multi-Layer Perceptron (MLP)

    • Import the necessary packages
    • Get package version information
  • Download and visualize the data
    • Inspect one batch of data
    • Inspect image details
    • Set the random seed
  • Define the model architecture
    • Build model_1
    • Build model_2
  • Train the Network (30 marks)
    • Train model_1
      • Visualize the training process of model_1
        • Plot the change of the loss of model_1 during training
        • Plot the change of the accuracy of model_1 during training
        • Plot the change of the AUC Score of model_1 during training
  • Test the Trained Network (20 marks)
    • Test model_1

In this task, we use PyTorch to train two multi-layer perceptrons (MLPs) to classify handwritten digit images from the MNIST database.

The process breaks down into the following steps:

  • Load and visualize the data.
  • Define the neural network.
  • Train the model.
  • Evaluate the performance of the trained model on the test dataset.
  • Analyze the results.

Import the necessary packages

import torch
from torch import nn
import numpy as np
import logging
import sys

# set up logging
logging.basicConfig(
    level=logging.INFO,
    format='%(asctime)s %(levelname)s: %(message)s',
    datefmt='%Y-%m-%d %H:%M:%S',
)

Get package version information

logging.info('The version information:')
logging.info(f'Python: {sys.version}')
logging.info(f'PyTorch: {torch.__version__}')
assert torch.cuda.is_available(), 'Please set up your GPU development environment'

Download and visualize the data

from torchvision import datasets
import torchvision.transforms as transforms
from torch.utils.data.dataset import Dataset

# number of subprocesses to use for data loading
num_workers = 0
# how many samples per batch to load
batch_size = 20
# convert data to torch.FloatTensor
transform = transforms.ToTensor()

# choose the training and test datasets
train_data = datasets.MNIST(root='data', train=True, download=True, transform=transform)
test_data = datasets.MNIST(root='data', train=False, download=True, transform=transform)

# group the indices of the training samples by label
def classify_label(dataset, num_classes):
    list_index = [[] for _ in range(num_classes)]
    for idx, datum in enumerate(dataset):
        list_index[datum[1]].append(idx)
    return list_index

# keep num_per_class randomly chosen indices per class
def partition_train(list_label2indices: list, num_per_class: int):
    random_state = np.random.RandomState(0)
    list_label2indices_train = []
    for indices in list_label2indices:
        random_state.shuffle(indices)
        list_label2indices_train.extend(indices[:num_per_class])
    return list_label2indices_train

# dataset wrapper that serves a subset of another dataset via an index list
class Indices2Dataset(Dataset):
    def __init__(self, dataset):
        self.dataset = dataset
        self.indices = None

    def load(self, indices: list):
        self.indices = indices

    def __getitem__(self, idx):
        idx = self.indices[idx]
        image, label = self.dataset[idx]
        return image, label

    def __len__(self):
        return len(self.indices)

# sort train data by label
list_label2indices = classify_label(dataset=train_data, num_classes=10)
# how many samples per class to train
list_train = partition_train(list_label2indices, 500)

# prepare data loaders
indices2data = Indices2Dataset(train_data)
indices2data.load(list_train)
train_loader = torch.utils.data.DataLoader(indices2data, batch_size=batch_size, num_workers=num_workers, shuffle=True)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=batch_size, num_workers=num_workers, shuffle=True)
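
As a quick sanity check (an illustrative sketch, not part of the original assignment): partition_train keeps 500 indices for each of the 10 classes, so the training subset should contain 5,000 images.

# illustrative sanity check: 500 samples per class x 10 classes = 5000 images
assert len(indices2data) == 10 * 500, 'unexpected training subset size'
logging.info(f'training subset: {len(indices2data)} images, test set: {len(test_data)} images')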

Inspect one batch of data

import matplotlib.pyplot as plt
%matplotlib inline

# obtain one batch of training images
dataiter = iter(train_loader)
images, labels = next(dataiter)  # dataiter.next() was removed in recent PyTorch versions
images = images.numpy()

# plot the images in the batch, along with the corresponding labels
fig = plt.figure(figsize=(25, 4))
for idx in np.arange(20):
    ax = fig.add_subplot(2, 20 // 2, idx + 1, xticks=[], yticks=[])
    ax.imshow(np.squeeze(images[idx]), cmap='gray')
    # print out the correct label for each image
    # .item() gets the value contained in a Tensor
    ax.set_title(str(labels[idx].item()))

[Figure: one batch of 20 training images with their ground-truth labels]

Inspect image details

img = np.squeeze(images[1])

fig = plt.figure(figsize=(12, 12))
ax = fig.add_subplot(111)
ax.imshow(img, cmap='gray')
width, height = img.shape
thresh = img.max() / 2.5
# annotate every pixel with its (rounded) value, switching the text color
# so it stays readable on both dark and light pixels
for x in range(width):
    for y in range(height):
        val = round(img[x][y], 2) if img[x][y] != 0 else 0
        ax.annotate(str(val), xy=(y, x),
                    horizontalalignment='center',
                    verticalalignment='center',
                    color='white' if img[x][y] < thresh else 'black')

[Figure: a single digit image enlarged, with each pixel annotated with its value]

Set the random seed

A fixed random seed ensures that the results are reproducible.

import random
import os

## pick any number you like, such as 2023
seed_value = 2023

np.random.seed(seed_value)
random.seed(seed_value)
os.environ['PYTHONHASHSEED'] = str(seed_value)
torch.manual_seed(seed_value)
torch.cuda.manual_seed(seed_value)
torch.cuda.manual_seed_all(seed_value)
torch.backends.cudnn.deterministic = True
logging.info(f"the value of the random seed: {seed_value}")

Define the model architecture

  • Input: a 784-dim Tensor of pixel values for each image.
  • Output: a 10-dim Tensor that indicates the class scores for an input image.

You need to implement three components:

  1. a vanilla multi-layer perceptron. (10 marks)
  2. a multi-layer perceptron with regularization (dropout or L2 or both). (10 marks)
  3. the corresponding loss functions and optimizers. (10 marks)

Build model_1

## Define the MLP architecture
class VanillaMLP(nn.Module):
    def __init__(self):
        super(VanillaMLP, self).__init__()
        # implement your codes here
        self.net = nn.Sequential(
            nn.Linear(784, 256), nn.ReLU(),
            nn.Linear(256, 256), nn.ReLU(),
            nn.Linear(256, 10),
        )

    def forward(self, x):
        # flatten image input
        x = x.view(-1, 28 * 28)
        # implement your codes here
        x = self.net(x)
        return x

# initialize the MLP
model_1 = VanillaMLP()

# specify loss function
# implement your codes here
loss_model_1 = torch.nn.CrossEntropyLoss()

# specify your optimizer
# implement your codes here
optimizer_model_1 = torch.optim.Adam(model_1.parameters(), lr=1e-4)
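
Before training, a quick shape check (an illustrative sketch, not required by the assignment) confirms the 784-to-10 contract stated above: a dummy batch of two 1x28x28 images should come out as a (2, 10) tensor of class scores.

# illustrative shape check for the input/output contract stated above
dummy = torch.zeros(2, 1, 28, 28)  # two fake MNIST-sized images
assert model_1(dummy).shape == (2, 10)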

Build model_2

## Define the MLP architecture
class RegularizedMLP(nn.Module):
    def __init__(self):
        super(RegularizedMLP, self).__init__()
        # implement your codes here
        self.net = nn.Sequential(
            nn.Linear(784, 256), nn.Dropout(0.3), nn.ReLU(),
            nn.Linear(256, 256), nn.Dropout(0.5), nn.ReLU(),
            nn.Linear(256, 10),
        )

    def forward(self, x):
        # flatten image input
        x = x.view(-1, 28 * 28)
        # implement your codes here
        x = self.net(x)
        return x

# initialize the MLP
model_2 = RegularizedMLP()

# specify loss function
# implement your codes here
loss_model_2 = torch.nn.CrossEntropyLoss()

# specify your optimizer
# implement your codes here
optimizer_model_2 = torch.optim.Adam(model_2.parameters(), lr=1e-4)  # weight_decay=1e-5
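
The commented-out weight_decay points at the L2 option mentioned in the task: in PyTorch, Adam's weight_decay argument adds an L2 penalty on the weights. A sketch of that variant follows; the name optimizer_model_2_l2 is illustrative, and the value 1e-5 comes from the hint above and should be treated as a tunable assumption.

# L2-regularized variant: weight_decay adds an L2 penalty inside Adam
optimizer_model_2_l2 = torch.optim.Adam(model_2.parameters(), lr=1e-4, weight_decay=1e-5)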

Train the Network (30 marks)

Train your models in the following two cells.

The following loop trains for a fixed number of epochs (20 in the code below); feel free to change this number, though we suggest somewhere between 20 and 50. As you train, watch how the training loss changes over time: we want it to decrease while also avoiding overfitting the training data.

We will introduce some metrics for classification tasks, and you will learn how to implement them with scikit-learn.

Here is a reference you can study: evaluation_metrics_spring2020.

During training we will use accuracy, area under the ROC curve (AUC), and top-k accuracy; a standalone toy sketch of these metrics appears below.

The key parts in the training process are left for you to implement.
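
Before writing the full loop, here is a minimal standalone sketch of the three scikit-learn metrics on made-up scores (the arrays y_true and scores below are illustrative toy data, not model output):

# toy example of the three training metrics used in the loop below
import numpy as np
from sklearn.metrics import accuracy_score, roc_auc_score, top_k_accuracy_score

y_true = np.repeat(np.arange(10), 10)           # 10 samples per class, all classes present
rng = np.random.RandomState(0)
scores = rng.rand(100, 10)                      # fake per-class scores
scores /= scores.sum(axis=1, keepdims=True)     # rows sum to 1, as 'ovo' AUC expects
y_pred = scores.argmax(axis=1)                  # hard predictions

print('accuracy :', accuracy_score(y_true, y_pred))
print('AUC (ovo):', roc_auc_score(y_true, scores, multi_class='ovo'))
print('top-3 acc:', top_k_accuracy_score(y_true, scores, k=3))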

Train model_1

# import scikit-learn packages
# please use the functions imported from scikit-learn to measure the training process
from sklearn.metrics import accuracy_score, roc_auc_score, top_k_accuracy_score

# number of epochs to train the model
n_epochs = 20  # suggest training between 20-50 epochs

model_1.train()  # prep model for training

train_loss_list = []
train_acc_list = []
train_auc_list = []
train_top_k_acc_list = []

# GPU check
logging.info(f'GPU is available: {torch.cuda.is_available()}')
if torch.cuda.is_available():
    gpu_num = torch.cuda.device_count()
    logging.info(f"Train model on {gpu_num} GPUs:")
    for i in range(gpu_num):
        print('\t GPU {}.: {}'.format(i, torch.cuda.get_device_name(i)))
    model_1 = model_1.cuda()

for epoch in range(n_epochs):
    # monitor training loss
    train_loss = 0.0
    pred_array = None
    label_array = None
    one_hot_label_matrix = None
    pred_matrix = None
    for data, label in train_loader:
        data = data.cuda()
        label = label.cuda()
        # implement your code here
        optimizer_model_1.zero_grad()
        pred = model_1(data)
        loss = loss_model_1(pred, label)
        loss.backward()
        optimizer_model_1.step()
        train_loss += loss.item()  # .item() keeps a float instead of the graph

        # finish the computation of the metric variables
        # implement your codes here
        if pred_matrix is None:
            pred_matrix = pred.cpu().detach().numpy()
        else:
            pred_matrix = np.concatenate((pred_matrix, pred.cpu().detach().numpy()))
        one_hot = nn.functional.one_hot(label, num_classes=10).cpu().detach().numpy()
        if one_hot_label_matrix is None:
            one_hot_label_matrix = one_hot
        else:
            one_hot_label_matrix = np.concatenate((one_hot_label_matrix, one_hot))
        pred = torch.argmax(pred, axis=1)
        if pred_array is None:
            pred_array = pred.cpu().detach().numpy()
        else:
            pred_array = np.concatenate((pred_array, pred.cpu().detach().numpy()))
        if label_array is None:
            label_array = label.cpu().detach().numpy()
        else:
            label_array = np.concatenate((label_array, label.cpu().detach().numpy()))

    # print training statistics
    # read the API document at https://scikit-learn.org/stable/modules/classes.html#module-sklearn.metrics to finish your code
    # don't craft your own code
    # calculate average loss and accuracy over an epoch
    top_k = 3
    train_loss = train_loss / len(train_loader.dataset)
    train_acc = 100. * accuracy_score(label_array, pred_array)
    train_auc = roc_auc_score(one_hot_label_matrix, pred_matrix, multi_class='ovo')
    top_k_acc = top_k_accuracy_score(label_array, pred_matrix, k=top_k)

    # append the value of each metric to its list
    train_loss_list.append(train_loss)
    train_acc_list.append(train_acc)
    train_auc_list.append(train_auc)
    train_top_k_acc_list.append(top_k_acc)
    logging.info('Epoch: {} \tTraining Loss: {:.6f} \tTraining Acc: {:.2f}% \t top {} Acc: {:.2f}% \t AUC Score: {:.4f}'.format(
        epoch + 1, train_loss, train_acc, top_k, top_k_acc, train_auc))

Visualize the training process of model_1

Please read the matplotlib documentation to complete the visualization of the training process.

Plot the change of the loss of model_1 during training
epochs_list = list(range(1,n_epochs+1))
plt.figure(figsize=(20, 8))
plt.plot(epochs_list, train_loss_list)
plt.title('Model_1 loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train'], loc='upper right')
plt.show()

[Figure: training loss of model_1 per epoch]

Plot the change of the accuracy of model_1 during training
plt.figure(figsize=(20, 8))
plt.plot(epochs_list, train_acc_list)
plt.title('Model_1 accuracy')
plt.ylabel('accuracy')
plt.xlabel('Epoch')
plt.legend(['Train'], loc='upper right')
plt.show()

[Figure: training accuracy of model_1 per epoch]

Plot the change of the AUC Score of model_1 during training
plt.figure(figsize=(20, 8))
plt.plot(epochs_list, train_auc_list)
plt.title('Model_1 AUC')
plt.ylabel('AUC')
plt.xlabel('Epoch')
plt.legend(['Train'], loc='upper right')
plt.show()

[Figure: training AUC score of model_1 per epoch]

Test the Trained Network (20 marks)

Test the performance of the trained models on the test data. In addition to the overall test accuracy, you should calculate the accuracy for each class (a per-class sketch follows the confusion matrix below).

For metrics, the test stage uses accuracy, top-k accuracy, precision, recall, F1-score, and the confusion matrix.

Besides these, we will visualize the confusion matrix.

Last but not least, we will compare your own implementation of the accuracy computation with scikit-learn's.

## define your own function to compute accuracy
def accuracy_score_manual(label_array, pred_array):
    # implement your codes here
    accuracy = np.sum(label_array == pred_array) / float(len(label_array))
    return accuracy
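
A minimal check (illustrative, on random labels) that the manual function agrees with scikit-learn's implementation:

# compare the manual accuracy against scikit-learn on random data
from sklearn.metrics import accuracy_score

rng = np.random.RandomState(0)
labels = rng.randint(0, 10, size=1000)
preds = rng.randint(0, 10, size=1000)
assert np.isclose(accuracy_score_manual(labels, preds), accuracy_score(labels, preds))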

Test model_1

from sklearn.metrics import classification_report, ConfusionMatrixDisplay

# initialize the variables that monitor test loss and accuracy
test_loss = 0.0
pred_array = None
label_array = None
one_hot_label_matrix = None
pred_matrix = None

model_1.eval()  # prep model for *evaluation*
with torch.no_grad():  # gradients are not needed at test time
    for data, label in test_loader:
        data = data.cuda()
        label = label.cuda()
        # implement your code here
        pred = model_1(data)
        test_loss += loss_model_1(pred, label).item()  # accumulate instead of overwriting
        if pred_matrix is None:
            pred_matrix = pred.cpu().numpy()
        else:
            pred_matrix = np.concatenate((pred_matrix, pred.cpu().numpy()))
        one_hot = nn.functional.one_hot(label, num_classes=10).cpu().numpy()
        if one_hot_label_matrix is None:
            one_hot_label_matrix = one_hot
        else:
            one_hot_label_matrix = np.concatenate((one_hot_label_matrix, one_hot))
        pred = torch.argmax(pred, axis=1)
        if pred_array is None:
            pred_array = pred.cpu().numpy()
        else:
            pred_array = np.concatenate((pred_array, pred.cpu().numpy()))
        if label_array is None:
            label_array = label.cpu().numpy()
        else:
            label_array = np.concatenate((label_array, label.cpu().numpy()))

# calculate and print avg test loss
test_loss = test_loss / len(test_loader.dataset)
test_acc = accuracy_score(label_array, pred_array)
test_auc = roc_auc_score(one_hot_label_matrix, pred_matrix, multi_class='ovo')
test_top_k3_acc = top_k_accuracy_score(label_array, pred_matrix, k=3)
test_top_k5_acc = top_k_accuracy_score(label_array, pred_matrix, k=5)

logging.info('Test Loss: {:.6f}'.format(test_loss))
logging.info('Test Accuracy: {:.6f}'.format(test_acc))
logging.info('Test top 3 Accuracy: {:.6f}'.format(test_top_k3_acc ))
logging.info('Test top 5 Accuracy: {:.6f}'.format(test_top_k5_acc ))
logging.info('The classification report of test for model_1')
print(classification_report(label_array, pred_array))

[Output: classification report for model_1 on the test set]

ConfusionMatrixDisplay.from_predictions(label_array,pred_array)
plt.show()
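
The task also asks for per-class accuracy. One way to get it (a sketch; per-class accuracy here means per-class recall) is from the confusion matrix: the diagonal holds the correctly classified counts, and each row sums to the number of true samples of that class.

# per-class accuracy (per-class recall) from the confusion matrix
from sklearn.metrics import confusion_matrix

cm = confusion_matrix(label_array, pred_array)
per_class_acc = cm.diagonal() / cm.sum(axis=1)
for cls, acc in enumerate(per_class_acc):
    logging.info('class {}: accuracy {:.4f}'.format(cls, acc))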

[Figure: confusion matrix for model_1 on the test set]
