[Kaggle] Digit Recognizer: Handwritten Digit Recognition (Convolutional Neural Networks)

Table of Contents

    • 1. Prediction with LeNet
      • 1.1 Import packages
      • 1.2 Build the LeNet model
      • 1.3 Load the data
      • 1.4 Define the model
      • 1.5 Training
      • 1.6 Plot the training curves
      • 1.7 Predict and submit
    • 2. Transfer learning with VGG16
      • 2.1 Import packages
      • 2.2 Define the model
      • 2.3 Data preprocessing
      • 2.4 Configure the model and train
      • 2.5 Predict and submit

Digit Recognizer competition page

Related posts:
[Hands On ML] 3. Classification (MNIST handwritten digit prediction)
[Kaggle] Digit Recognizer handwritten digit recognition
[Kaggle] Digit Recognizer handwritten digit recognition (simple neural network)
04. Convolutional Neural Networks, W1: Convolutional Neural Networks

The simple neural network in the previous post flattened each 28*28 image into a vector, so the spatial relationships between pixels were ignored and the spatial information was lost.
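A minimal sketch (my addition, not from the original post) of what flattening throws away: two vertically adjacent pixels end up a full row width apart in the flattened vector, so a dense layer gets no hint that they were neighbors, whereas a convolution operates on the 2D neighborhood directly.

import numpy as np

img = np.arange(28 * 28).reshape(28, 28)   # stand-in for a 28x28 image
flat = img.flatten()

# pixels (5, 3) and (6, 3) are vertical neighbors in the 2D image,
# but in the flattened vector their indices differ by a full row width (28)
print(5 * 28 + 3, 6 * 28 + 3)                            # 143 171
print(flat[143] == img[5, 3], flat[171] == img[6, 3])    # True True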

1. Prediction with LeNet

LeNet neural network: reference post

1.1 Import packages

from keras import backend as K  # backend-agnostic code
from keras.models import Sequential
from keras.layers.convolutional import Conv2D
from keras.layers.convolutional import MaxPooling2D
from keras.layers.core import Activation
from keras.layers.core import Dense
from keras.layers.core import Flatten
from keras.utils import np_utils
from keras.optimizers import SGD, Adam, RMSprop
import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd

1.2 Build the LeNet model

# Image data format
# K.image_data_format() == 'channels_last'
#   channels last is the default:  K.set_image_dim_ordering("tf")
# K.image_data_format() == 'channels_first'
#   channels first:  K.set_image_dim_ordering("th")

class LeNet:
    @staticmethod
    def build(input_shape, classes):
        model = Sequential()
        # conv block 1: 20 filters of 5x5, then 2x2 max pooling
        model.add(Conv2D(20, kernel_size=5, padding='same',
                         input_shape=input_shape, activation='relu'))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
        # conv block 2: 50 filters of 5x5, then 2x2 max pooling
        model.add(Conv2D(50, kernel_size=5, padding='same', activation='relu'))
        model.add(MaxPooling2D(pool_size=(2, 2), strides=(2, 2)))
        # classifier head: flatten, one hidden dense layer, softmax output
        model.add(Flatten())
        model.add(Dense(500, activation='relu'))
        model.add(Dense(classes, activation='softmax'))
        return model

1.3 Load the data

train = pd.read_csv('train.csv')
y_train_full = train['label']
X_train_full = train.drop(['label'], axis=1)
X_test_full = pd.read_csv('test.csv')
X_train_full.shape

Output:

(42000, 784)
  • Convert the data format and add a channel dimension
X_train = np.array(X_train_full).reshape(-1,28,28) / 255.0
X_test = np.array(X_test_full).reshape(-1,28,28)/255.0
y_train = np_utils.to_categorical(y_train_full, 10)  # convert labels to one-hot encoding

X_train = X_train[:, :, :, np.newaxis]   # m,28,28 --> m,28,28,1 (single channel)
X_test = X_test[:, :, :, np.newaxis]

1.4 Define the model

model = LeNet.build(input_shape=(28, 28, 1), classes=10)
  • Define the optimizer and compile the model
opt = Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, decay=0.01)
model.compile(loss="categorical_crossentropy",optimizer=opt, metrics=["accuracy"])

Note: if the labels are not one-hot encoded, use loss="sparse_categorical_crossentropy" here.
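A minimal sketch of that variant, assuming the integer label Series y_train_full loaded above is used directly (my addition, not the author's code):

# hedged sketch: skip np_utils.to_categorical and train on integer labels 0-9
model_sparse = LeNet.build(input_shape=(28, 28, 1), classes=10)
model_sparse.compile(loss="sparse_categorical_crossentropy",
                     optimizer=Adam(learning_rate=0.001),
                     metrics=["accuracy"])
model_sparse.fit(X_train, np.array(y_train_full), epochs=1,
                 batch_size=128, validation_split=0.2)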

1.5 Training

history = model.fit(X_train, y_train, epochs=20, batch_size=128,validation_split=0.2)
Epoch 1/20
263/263 [==============================] - 26s 98ms/step - 
loss: 0.2554 - accuracy: 0.9235 - 
val_loss: 0.0983 - val_accuracy: 0.9699
Epoch 2/20
263/263 [==============================] - 27s 103ms/step - 
loss: 0.0806 - accuracy: 0.9761 - 
val_loss: 0.0664 - val_accuracy: 0.9787
...
...
Epoch 20/20
263/263 [==============================] - 25s 97ms/step - 
loss: 0.0182 - accuracy: 0.9953 - 
val_loss: 0.0405 - val_accuracy: 0.9868

By the end of the 2nd epoch the training accuracy is already 97.6%, considerably better than the earlier simple neural network.

  • Model summary
model.summary()
Model: "sequential"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
conv2d (Conv2D)              (None, 28, 28, 20)        520       
_________________________________________________________________
max_pooling2d (MaxPooling2D) (None, 14, 14, 20)        0         
_________________________________________________________________
conv2d_1 (Conv2D)            (None, 14, 14, 50)        25050     
_________________________________________________________________
max_pooling2d_1 (MaxPooling2 (None, 7, 7, 50)          0         
_________________________________________________________________
flatten (Flatten)            (None, 2450)              0         
_________________________________________________________________
dense (Dense)                (None, 500)               1225500   
_________________________________________________________________
dense_1 (Dense)              (None, 10)                5010      
=================================================================
Total params: 1,256,080
Trainable params: 1,256,080
Non-trainable params: 0
_________________________________________________________________
  • Plot the model architecture
from keras.utils import plot_model
plot_model(model, './model.png', show_shapes=True)

Model architecture diagram

1.6 Plot the training curves

pd.DataFrame(history.history).plot(figsize=(8, 5))
plt.grid(True)
plt.gca().set_ylim(0, 1) # set the vertical range to [0-1]
plt.show()

1.7 Predict and submit

y_pred = model.predict(X_test)
pred = y_pred.argmax(axis=1).reshape(-1)
print(pred.shape)

image_id = pd.Series(range(1, len(pred)+1))
output = pd.DataFrame({'ImageId':image_id, 'Label':pred})
output.to_csv("submission_NN.csv",  index=False)


The LeNet model scores 0.98607, an improvement of 1.061 percentage points over the simple NN model from the previous post (0.97546).

2. Transfer learning with VGG16

VGG16 help documentation:

Help on function VGG16 in module tensorflow.python.keras.applications.vgg16:

VGG16(include_top=True, weights='imagenet', input_tensor=None, input_shape=None, pooling=None, classes=1000, classifier_activation='softmax')
    Instantiates the VGG16 model.

    Reference paper:
    - [Very Deep Convolutional Networks for Large-Scale Image Recognition](https://arxiv.org/abs/1409.1556) (ICLR 2015)

    By default, it loads weights pre-trained on ImageNet. Check 'weights' for
    other options.

    This model can be built both with 'channels_first' data format
    (channels, height, width) or 'channels_last' data format
    (height, width, channels).

    The default input size for this model is 224x224.

    Caution: Be sure to properly pre-process your inputs to the application.
    Please see `applications.vgg16.preprocess_input` for an example.

    Arguments:
      include_top: whether to include the 3 fully-connected
        layers at the top of the network.
      weights: one of `None` (random initialization),
        'imagenet' (pre-training on ImageNet),
        or the path to the weights file to be loaded.
      input_tensor: optional Keras tensor
        (i.e. output of `layers.Input()`)
        to use as image input for the model.
      input_shape: optional shape tuple, only to be specified
        if `include_top` is False (otherwise the input shape
        has to be `(224, 224, 3)` (with `channels_last` data format)
        or `(3, 224, 224)` (with `channels_first` data format).
        It should have exactly 3 input channels,
        and width and height should be no smaller than 32.
        E.g. `(200, 200, 3)` would be one valid value.
      pooling: Optional pooling mode for feature extraction
        when `include_top` is `False`.
        - `None` means that the output of the model will be
          the 4D tensor output of the last convolutional block.
        - `avg` means that global average pooling
          will be applied to the output of the
          last convolutional block, and thus
          the output of the model will be a 2D tensor.
        - `max` means that global max pooling will be applied.
      classes: optional number of classes to classify images
        into, only to be specified if `include_top` is True, and
        if no `weights` argument is specified.
      classifier_activation: A `str` or callable. The activation function to use
        on the "top" layer. Ignored unless `include_top=True`. Set
        `classifier_activation=None` to return the logits of the "top" layer.

    Returns:
      A `keras.Model` instance.

    Raises:
      ValueError: in case of invalid argument for `weights`,
        or invalid input shape.
      ValueError: if `classifier_activation` is not `softmax` or `None` when
        using a pretrained top layer.

2.1 Import packages

import numpy as np
%matplotlib inline
import matplotlib.pyplot as plt
import pandas as pd
import cv2
from keras.optimizers import Adam
from keras.models import Model
from keras.utils import np_utils
from keras.models import Sequential
from keras.layers import Flatten
from keras.layers import Dense
from keras.layers import Input
from keras.layers import Dropout
from keras.applications.vgg16 import VGG16

2.2 Define the model

vgg16 = VGG16(weights='imagenet', include_top=False, input_shape=(32, 32, 3))
# with include_top=False the input size can be customized: at least 32x32, and it must have 3 channels

mylayer = vgg16.output
mylayer = Flatten()(mylayer)
mylayer = Dense(128, activation='relu')(mylayer)
mylayer = Dropout(0.3)(mylayer)
mylayer = Dense(10, activation='softmax')(mylayer)

model = Model(inputs=vgg16.inputs, outputs=mylayer)

for layer in vgg16.layers:
    layer.trainable = False  # freeze all VGG16 layers (they are not trained)

2.3 Data preprocessing

train = pd.read_csv('train.csv')
y_train_full = train['label']
X_train_full = train.drop(['label'], axis=1)
X_test_full = pd.read_csv('test.csv')
  • Replicate the single-channel data into 3 channels (VGG16 requires 3-channel input), then resize to 32*32, the minimum resolution VGG16 accepts
def process(data):
    data = np.array(data).reshape(-1, 28, 28)
    output = np.zeros((data.shape[0], 32, 32, 3))
    for i in range(data.shape[0]):
        img = data[i]
        # copy the grayscale image into all three RGB channels
        rgb_array = np.zeros((img.shape[0], img.shape[1], 3), "uint8")
        rgb_array[:, :, 0], rgb_array[:, :, 1], rgb_array[:, :, 2] = img, img, img
        # upscale from 28x28 to 32x32
        pic = cv2.resize(rgb_array, (32, 32), interpolation=cv2.INTER_LINEAR)
        output[i] = pic
    output = output.astype('float32') / 255.0
    return output
y_train = np_utils.to_categorical(y_train_full, 10)
X_train = process(X_train_full)
X_test = process(X_test_full)

print(X_train.shape)
print(X_test.shape)

Output:

(42000, 32, 32, 3)
(28000, 32, 32, 3)
  • Take a look at one of the processed images
img = X_train[0]
plt.imshow(img)
np.set_printoptions(threshold=np.inf)  # print the full matrix
# print(X_train[0])

Image resized to 32x32 pixels
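The VGG16 help text above cautions that inputs should be pre-processed with applications.vgg16.preprocess_input; the post simply scales pixels to [0, 1] instead. A hedged sketch of that alternative (my addition, assuming the process() function defined above):

from keras.applications.vgg16 import preprocess_input

# preprocess_input expects pixel values in [0, 255] and applies the ImageNet
# channel-mean subtraction that the pretrained VGG16 weights were trained with
X_train_pp = preprocess_input(process(X_train_full) * 255.0)
X_test_pp = preprocess_input(process(X_test_full) * 255.0)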

2.4 Configure the model and train

opt = Adam(learning_rate=0.001, beta_1=0.9, beta_2=0.999, decay=0.01)
model.compile(loss="categorical_crossentropy",optimizer=opt, metrics=["accuracy"])
history = model.fit(X_train, y_train, epochs=50, batch_size=128,validation_split=0.2)

Output:

Epoch 1/50
263/263 [==============================] - 101s 384ms/step - 
loss: 0.9543 - accuracy: 0.7212 - 
val_loss: 0.5429 - val_accuracy: 0.8601
...
Epoch 10/50
263/263 [==============================] - 110s 417ms/step - 
loss: 0.3284 - accuracy: 0.9063 - 
val_loss: 0.2698 - val_accuracy: 0.9263
...
Epoch 40/50
263/263 [==============================] - 114s 433ms/step - 
loss: 0.2556 - accuracy: 0.9254 - 
val_loss: 0.2121 - val_accuracy: 0.9389
...
Epoch 50/50
263/263 [==============================] - 110s 420ms/step - 
loss: 0.2466 - accuracy: 0.9272 - 
val_loss: 0.2058 - val_accuracy: 0.9406

Training curves

model.summary()

Output:

Model: "functional_15"
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
=================================================================
input_23 (InputLayer)        [(None, 32, 32, 3)]       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 32, 32, 64)        1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 32, 32, 64)        36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 16, 16, 64)        0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 16, 16, 128)       73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 16, 16, 128)       147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 8, 8, 128)         0         
_________________________________________________________________
block3_conv1 (Conv2D)        (None, 8, 8, 256)         295168    
_________________________________________________________________
block3_conv2 (Conv2D)        (None, 8, 8, 256)         590080    
_________________________________________________________________
block3_conv3 (Conv2D)        (None, 8, 8, 256)         590080    
_________________________________________________________________
block3_pool (MaxPooling2D)   (None, 4, 4, 256)         0         
_________________________________________________________________
block4_conv1 (Conv2D)        (None, 4, 4, 512)         1180160   
_________________________________________________________________
block4_conv2 (Conv2D)        (None, 4, 4, 512)         2359808   
_________________________________________________________________
block4_conv3 (Conv2D)        (None, 4, 4, 512)         2359808   
_________________________________________________________________
block4_pool (MaxPooling2D)   (None, 2, 2, 512)         0         
_________________________________________________________________
block5_conv1 (Conv2D)        (None, 2, 2, 512)         2359808   
_________________________________________________________________
block5_conv2 (Conv2D)        (None, 2, 2, 512)         2359808   
_________________________________________________________________
block5_conv3 (Conv2D)        (None, 2, 2, 512)         2359808   
_________________________________________________________________
block5_pool (MaxPooling2D)   (None, 1, 1, 512)         0         
_________________________________________________________________
flatten_19 (Flatten)         (None, 512)               0         
_________________________________________________________________
dense_28 (Dense)             (None, 128)               65664     
_________________________________________________________________
dropout_9 (Dropout)          (None, 128)               0         
_________________________________________________________________
dense_29 (Dense)             (None, 10)                1290      
=================================================================
Total params: 14,781,642
Trainable params: 66,954
Non-trainable params: 14,714,688
_________________________________________________________________
  • Plot the model architecture
from keras.utils import plot_model
plot_model(model, './model.png', show_shapes=True)

2.5 Predict and submit

y_pred = model.predict(X_test)
pred = y_pred.argmax(axis=1).reshape(-1)
print(pred.shape)
print(pred)
image_id = pd.Series(range(1,len(pred)+1))
output = pd.DataFrame({'ImageId':image_id, 'Label':pred})
output.to_csv("submission_NN.csv",  index=False)


Prediction score: 0.93696

This is probably because the VGG16 weights were pre-trained on 224*224 images, whereas ours are 28*28 (upscaled to 32*32), so the pretrained weights may not transfer particularly well.
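A natural follow-up to try (my assumption, not something the post does) is to unfreeze the last convolutional block and fine-tune it at a small learning rate, letting the pretrained filters adapt to the low-resolution digits:

# hedged sketch: unfreeze only block5 of VGG16 and continue training
for layer in vgg16.layers:
    layer.trainable = layer.name.startswith('block5')

model.compile(loss="categorical_crossentropy",
              optimizer=Adam(learning_rate=1e-4),
              metrics=["accuracy"])
model.fit(X_train, y_train, epochs=5, batch_size=128, validation_split=0.2)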


My CSDN blog: https://michael.blog.csdn.net/

Long-press or scan the QR code to follow my WeChat official account (Michael阿明). Let's keep learning and improving together!
Michael阿明
