一 项目展示
下面可以看到验证集可以到了0.9823了,实际上,在下面的另外一个训练,可以得到0.9839,我保守的写了0.982
二 项目参数展示
我们先来看看LeNet 5 的结构与参数,参数有61,706个。
这个是我用keras写的,可以看到参数只有886个。
项目代码
我们来看一下模型的代码,模型由传统卷积改成了可分离卷积;
这里padding 主要是为了后面降维方便而设置
这里设置了5层卷积,是考虑到如果层数比较少的话,感受野比较小,增加卷积层,从而增加感受野(我觉得感受野,这时候比模型的复杂度重要)
因为mnist的字体大概占图像面积70-85之间
代码的注释也写得比较多,可以自行研究。
# 我用的是GPU训练,如果训练报错的话,可以把SperableConv2D改成Conv2D
# 先训练1个epoch,再改成如下的sperableconv2d
def my_network(input_shape=(32, 32, 1), classes=10):X_input = Input(input_shape)X = X_input# 这里padding 主要是为了后面降维方便而设置# 这里设置了5层卷积,是考虑到如果层数比较少的话,感受野比较小,增加卷积层,从而增加感受野# 因为mnist的字体大概占图像面积70-85之间X = ZeroPadding2D((2, 2))(X_input)X = Conv2D(4, (3, 3), strides=(1, 1), padding='valid', activation='elu', name='conv1')(X)X = Conv2D(6, (3, 3), strides=(2, 2), padding='valid', activation='elu', name='conv2')(X)X = SeparableConv2D(8, (3, 3), strides=(2, 2), padding='valid', activation='elu', name='conv3')(X)X = SeparableConv2D(10, (3, 3), strides=(2, 2), padding='same', activation='elu', name='conv4')(X)X = SeparableConv2D(12, (3, 3), strides=(2, 2), padding='valid', activation='elu', name='conv5')(X)# 利用浮点卷积做为输出,注意激活函数是softmaxX = Conv2D(classes, (1, 1), strides=(1, 1), padding='same', activation='softmax')(X)X = keras.layers.Reshape((classes,))(X)model = Model(inputs=X_input, outputs=X, name='my_network')return modelmodel_con = my_network(input_shape=(28, 28, 1), classes=10)# 这里的lr,如果用了学习衰减的话,就可以不设置了
optimizer = keras.optimizers.Adam(lr=0.001)
model_con.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])
model_con.summary()
训练过程
训练的过程,用的是Adam优化器,利用了学习率衰减的策略,先由大的学习,加快训练,再调整比较小的学习率来训练,慢慢训练得出的结果比较稳定。
下面是100个epochs的全部,识别的输出,其它lr初始值为0.01,而不是0.001。
Epoch 1/100
lr change to 0.01
235/235 [==============================] - 2s 7ms/step - loss: 0.1065 - acc: 0.9673 - val_loss: 0.0792 - val_acc: 0.9755
Epoch 2/100
lr change to 0.01
235/235 [==============================] - 1s 6ms/step - loss: 0.0886 - acc: 0.9726 - val_loss: 0.1013 - val_acc: 0.9686
Epoch 3/100
lr change to 0.01
235/235 [==============================] - 1s 6ms/step - loss: 0.0884 - acc: 0.9732 - val_loss: 0.0744 - val_acc: 0.9768
Epoch 4/100
lr change to 0.01
235/235 [==============================] - 2s 6ms/step - loss: 0.0873 - acc: 0.9730 - val_loss: 0.0815 - val_acc: 0.9740
Epoch 5/100
lr change to 0.01
235/235 [==============================] - 1s 6ms/step - loss: 0.0851 - acc: 0.9733 - val_loss: 0.0724 - val_acc: 0.9768
Epoch 6/100
lr change to 0.01
235/235 [==============================] - 2s 6ms/step - loss: 0.0817 - acc: 0.9752 - val_loss: 0.0826 - val_acc: 0.9745
Epoch 7/100
lr change to 0.01
235/235 [==============================] - 1s 6ms/step - loss: 0.0855 - acc: 0.9741 - val_loss: 0.0830 - val_acc: 0.9740
Epoch 8/100
lr change to 0.01
235/235 [==============================] - 2s 6ms/step - loss: 0.0870 - acc: 0.9728 - val_loss: 0.0847 - val_acc: 0.9729
Epoch 9/100
lr change to 0.01
235/235 [==============================] - 2s 7ms/step - loss: 0.0813 - acc: 0.9752 - val_loss: 0.0701 - val_acc: 0.9787
Epoch 10/100
lr change to 0.01
235/235 [==============================] - 2s 7ms/step - loss: 0.0815 - acc: 0.9748 - val_loss: 0.0742 - val_acc: 0.9765
Epoch 11/100
lr change to 0.01
235/235 [==============================] - 2s 7ms/step - loss: 0.0809 - acc: 0.9749 - val_loss: 0.0721 - val_acc: 0.9770
Epoch 12/100
lr change to 0.005
235/235 [==============================] - 1s 6ms/step - loss: 0.0672 - acc: 0.9796 - val_loss: 0.0635 - val_acc: 0.9788
Epoch 13/100
lr change to 0.005
235/235 [==============================] - 2s 6ms/step - loss: 0.0662 - acc: 0.9800 - val_loss: 0.0575 - val_acc: 0.9808
Epoch 14/100
lr change to 0.005
235/235 [==============================] - 2s 6ms/step - loss: 0.0659 - acc: 0.9796 - val_loss: 0.0624 - val_acc: 0.9803
Epoch 15/100
lr change to 0.005
235/235 [==============================] - 1s 6ms/step - loss: 0.0662 - acc: 0.9793 - val_loss: 0.0631 - val_acc: 0.9791
Epoch 16/100
lr change to 0.005
235/235 [==============================] - 2s 6ms/step - loss: 0.0655 - acc: 0.9802 - val_loss: 0.0561 - val_acc: 0.9816
Epoch 17/100
lr change to 0.005
235/235 [==============================] - 2s 7ms/step - loss: 0.0659 - acc: 0.9799 - val_loss: 0.0653 - val_acc: 0.9808
Epoch 18/100
lr change to 0.005
235/235 [==============================] - 2s 6ms/step - loss: 0.0653 - acc: 0.9798 - val_loss: 0.0765 - val_acc: 0.9748
Epoch 19/100
lr change to 0.005
235/235 [==============================] - 2s 7ms/step - loss: 0.0656 - acc: 0.9792 - val_loss: 0.0565 - val_acc: 0.9811
Epoch 20/100
lr change to 0.005
235/235 [==============================] - 2s 7ms/step - loss: 0.0645 - acc: 0.9800 - val_loss: 0.0569 - val_acc: 0.9814
Epoch 21/100
lr change to 0.005
235/235 [==============================] - 2s 7ms/step - loss: 0.0639 - acc: 0.9804 - val_loss: 0.0566 - val_acc: 0.9814
Epoch 22/100
lr change to 0.005
235/235 [==============================] - 2s 7ms/step - loss: 0.0637 - acc: 0.9801 - val_loss: 0.0672 - val_acc: 0.9778
Epoch 23/100
lr change to 0.005
235/235 [==============================] - 2s 7ms/step - loss: 0.0631 - acc: 0.9807 - val_loss: 0.0607 - val_acc: 0.9804
Epoch 24/100
lr change to 0.0025
235/235 [==============================] - 2s 6ms/step - loss: 0.0562 - acc: 0.9825 - val_loss: 0.0587 - val_acc: 0.9804
Epoch 25/100
lr change to 0.0025
235/235 [==============================] - 2s 6ms/step - loss: 0.0550 - acc: 0.9831 - val_loss: 0.0542 - val_acc: 0.9819
Epoch 26/100
lr change to 0.0025
235/235 [==============================] - 2s 7ms/step - loss: 0.0541 - acc: 0.9833 - val_loss: 0.0541 - val_acc: 0.9818
Epoch 27/100
lr change to 0.0025
235/235 [==============================] - 2s 7ms/step - loss: 0.0544 - acc: 0.9829 - val_loss: 0.0545 - val_acc: 0.9825
Epoch 28/100
lr change to 0.0025
235/235 [==============================] - 2s 7ms/step - loss: 0.0548 - acc: 0.9829 - val_loss: 0.0514 - val_acc: 0.9827
Epoch 29/100
lr change to 0.0025
235/235 [==============================] - 2s 7ms/step - loss: 0.0536 - acc: 0.9833 - val_loss: 0.0587 - val_acc: 0.9805
Epoch 30/100
lr change to 0.0025
235/235 [==============================] - 2s 7ms/step - loss: 0.0534 - acc: 0.9831 - val_loss: 0.0683 - val_acc: 0.9783
Epoch 31/100
lr change to 0.0025
235/235 [==============================] - 2s 7ms/step - loss: 0.0535 - acc: 0.9839 - val_loss: 0.0517 - val_acc: 0.9831
Epoch 32/100
lr change to 0.0025
235/235 [==============================] - 2s 7ms/step - loss: 0.0537 - acc: 0.9831 - val_loss: 0.0524 - val_acc: 0.9821
Epoch 33/100
lr change to 0.0025
235/235 [==============================] - 2s 8ms/step - loss: 0.0533 - acc: 0.9833 - val_loss: 0.0543 - val_acc: 0.9820
Epoch 34/100
lr change to 0.0025
235/235 [==============================] - 2s 8ms/step - loss: 0.0527 - acc: 0.9836 - val_loss: 0.0529 - val_acc: 0.9814
Epoch 35/100
lr change to 0.0025
235/235 [==============================] - 2s 7ms/step - loss: 0.0534 - acc: 0.9835 - val_loss: 0.0558 - val_acc: 0.9814
Epoch 36/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0487 - acc: 0.9852 - val_loss: 0.0523 - val_acc: 0.9823
Epoch 37/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0478 - acc: 0.9854 - val_loss: 0.0500 - val_acc: 0.9837
Epoch 38/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0482 - acc: 0.9851 - val_loss: 0.0532 - val_acc: 0.9823
Epoch 39/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0480 - acc: 0.9849 - val_loss: 0.0509 - val_acc: 0.9832
Epoch 40/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0477 - acc: 0.9850 - val_loss: 0.0509 - val_acc: 0.9824
Epoch 41/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0481 - acc: 0.9850 - val_loss: 0.0509 - val_acc: 0.9831
Epoch 42/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0482 - acc: 0.9847 - val_loss: 0.0511 - val_acc: 0.9827
Epoch 43/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0480 - acc: 0.9851 - val_loss: 0.0513 - val_acc: 0.9838
Epoch 44/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0474 - acc: 0.9850 - val_loss: 0.0530 - val_acc: 0.9831
Epoch 45/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0476 - acc: 0.9852 - val_loss: 0.0513 - val_acc: 0.9831
Epoch 46/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0470 - acc: 0.9855 - val_loss: 0.0554 - val_acc: 0.9817
Epoch 47/100
lr change to 0.00125
235/235 [==============================] - 2s 7ms/step - loss: 0.0471 - acc: 0.9850 - val_loss: 0.0527 - val_acc: 0.9817
Epoch 48/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0451 - acc: 0.9857 - val_loss: 0.0506 - val_acc: 0.9828
Epoch 49/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0448 - acc: 0.9861 - val_loss: 0.0501 - val_acc: 0.9838
Epoch 50/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0448 - acc: 0.9859 - val_loss: 0.0490 - val_acc: 0.9829
Epoch 51/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0447 - acc: 0.9865 - val_loss: 0.0501 - val_acc: 0.9834
Epoch 52/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0444 - acc: 0.9864 - val_loss: 0.0488 - val_acc: 0.9826
Epoch 53/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0446 - acc: 0.9860 - val_loss: 0.0507 - val_acc: 0.9826
Epoch 54/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0446 - acc: 0.9856 - val_loss: 0.0511 - val_acc: 0.9827
Epoch 55/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0446 - acc: 0.9862 - val_loss: 0.0506 - val_acc: 0.9830
Epoch 56/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0446 - acc: 0.9858 - val_loss: 0.0500 - val_acc: 0.9830
Epoch 57/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0443 - acc: 0.9861 - val_loss: 0.0517 - val_acc: 0.9831
Epoch 58/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0442 - acc: 0.9860 - val_loss: 0.0507 - val_acc: 0.9828
Epoch 59/100
lr change to 0.000625
235/235 [==============================] - 2s 7ms/step - loss: 0.0445 - acc: 0.9864 - val_loss: 0.0504 - val_acc: 0.9833
Epoch 60/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0430 - acc: 0.9868 - val_loss: 0.0499 - val_acc: 0.9825
Epoch 61/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0429 - acc: 0.9865 - val_loss: 0.0507 - val_acc: 0.9834
Epoch 62/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0430 - acc: 0.9866 - val_loss: 0.0494 - val_acc: 0.9830
Epoch 63/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0432 - acc: 0.9865 - val_loss: 0.0501 - val_acc: 0.9836
Epoch 64/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0426 - acc: 0.9866 - val_loss: 0.0518 - val_acc: 0.9830
Epoch 65/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0428 - acc: 0.9865 - val_loss: 0.0500 - val_acc: 0.9824
Epoch 66/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0427 - acc: 0.9866 - val_loss: 0.0506 - val_acc: 0.9824
Epoch 67/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0430 - acc: 0.9866 - val_loss: 0.0504 - val_acc: 0.9826
Epoch 68/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0427 - acc: 0.9868 - val_loss: 0.0501 - val_acc: 0.9832
Epoch 69/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0425 - acc: 0.9866 - val_loss: 0.0505 - val_acc: 0.9823
Epoch 70/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0427 - acc: 0.9869 - val_loss: 0.0505 - val_acc: 0.9839
Epoch 71/100
lr change to 0.0003125
235/235 [==============================] - 2s 7ms/step - loss: 0.0432 - acc: 0.9868 - val_loss: 0.0494 - val_acc: 0.9838
Epoch 72/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0419 - acc: 0.9870 - val_loss: 0.0493 - val_acc: 0.9833
Epoch 73/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0419 - acc: 0.9871 - val_loss: 0.0498 - val_acc: 0.9835
Epoch 74/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0419 - acc: 0.9871 - val_loss: 0.0496 - val_acc: 0.9831
Epoch 75/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0419 - acc: 0.9868 - val_loss: 0.0499 - val_acc: 0.9834
Epoch 76/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0419 - acc: 0.9868 - val_loss: 0.0496 - val_acc: 0.9835
Epoch 77/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0420 - acc: 0.9870 - val_loss: 0.0500 - val_acc: 0.9834
Epoch 78/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0419 - acc: 0.9871 - val_loss: 0.0496 - val_acc: 0.9833
Epoch 79/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0419 - acc: 0.9868 - val_loss: 0.0497 - val_acc: 0.9835
Epoch 80/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0419 - acc: 0.9869 - val_loss: 0.0499 - val_acc: 0.9835
Epoch 81/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0420 - acc: 0.9866 - val_loss: 0.0499 - val_acc: 0.9830
Epoch 82/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0418 - acc: 0.9870 - val_loss: 0.0503 - val_acc: 0.9839
Epoch 83/100
lr change to 0.00015625
235/235 [==============================] - 2s 7ms/step - loss: 0.0420 - acc: 0.9868 - val_loss: 0.0494 - val_acc: 0.9832
Epoch 84/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0414 - acc: 0.9873 - val_loss: 0.0499 - val_acc: 0.9837
Epoch 85/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0416 - acc: 0.9871 - val_loss: 0.0496 - val_acc: 0.9833
Epoch 86/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0414 - acc: 0.9872 - val_loss: 0.0501 - val_acc: 0.9837
Epoch 87/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0415 - acc: 0.9871 - val_loss: 0.0502 - val_acc: 0.9836
Epoch 88/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0415 - acc: 0.9871 - val_loss: 0.0501 - val_acc: 0.9834
Epoch 89/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0414 - acc: 0.9870 - val_loss: 0.0496 - val_acc: 0.9835
Epoch 90/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0414 - acc: 0.9872 - val_loss: 0.0498 - val_acc: 0.9837
Epoch 91/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0416 - acc: 0.9870 - val_loss: 0.0502 - val_acc: 0.9839
Epoch 92/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0414 - acc: 0.9872 - val_loss: 0.0503 - val_acc: 0.9834
Epoch 93/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0415 - acc: 0.9870 - val_loss: 0.0500 - val_acc: 0.9836
Epoch 94/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0415 - acc: 0.9869 - val_loss: 0.0499 - val_acc: 0.9832
Epoch 95/100
lr change to 7.8125e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0415 - acc: 0.9871 - val_loss: 0.0502 - val_acc: 0.9836
Epoch 96/100
lr change to 6.25e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0413 - acc: 0.9871 - val_loss: 0.0501 - val_acc: 0.9838
Epoch 97/100
lr change to 6.25e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0413 - acc: 0.9872 - val_loss: 0.0501 - val_acc: 0.9837
Epoch 98/100
lr change to 6.25e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0413 - acc: 0.9872 - val_loss: 0.0503 - val_acc: 0.9834
Epoch 99/100
lr change to 6.25e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0413 - acc: 0.9870 - val_loss: 0.0499 - val_acc: 0.9833
Epoch 100/100
lr change to 6.25e-05
235/235 [==============================] - 2s 7ms/step - loss: 0.0417 - acc: 0.9870 - val_loss: 0.0500 - val_acc: 0.9834
总结
通过此次,认识到了感受野有时候比模型的复杂度更重要,还有学习率衰减策略也对模型有比较大的影响。
就是你用CPU训练,这么少的参数,也会很快的。
下面附上本项目的代码的Jupyter notebook。
链接:https://pan.baidu.com/s/1EEzfRDD_PAgSeS999Eq9-Q 提取码:3131
如果你觉得好,请给我个star,如有问题,也可以与我联系。