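This post introduces an application of deep learning in autonomous driving: traffic sign recognition.
It is adapted from an article by the blog expert AdamShan, to whom I am grateful.
Many courses already cover image classification with deep learning (for example, Andrew Ng's Deep Learning Specialization), so this post skips the theory and focuses on practice. The deep learning framework used is Keras. For learning Keras, I previously reposted an article by Dr. Huang Haiguang, which is a good learning resource.
Original post: Deep Learning Primer: Chinese-annotated source code and e-book for "Deep Learning with Python"
(Added 2021-03-03) Download link for the dataset used in this post: training and test sets for "Hands-On Autonomous Driving (1): Traffic Sign Recognition"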
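Contents
1. Importing the libraries
2. Data processing (key section)
2.1 Reading the dataset
2.2 Displaying the images
2.3 Resizing the images
2.4 Converting the images to NumPy arrays
2.5 Converting the images to grayscale
2.6 Data augmentation
3. Training the neural network
4. Testing and validation
5. Using the model
6. Summary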
For an introduction to image classification, see the following video (CS231n: Convolutional Neural Networks for Visual Recognition (Fall 2017), 1080p, on bilibili).
1. Importing the libraries
The first step is to load the libraries used in this post:
import os
import random
import scipy
import skimage
import skimage.io          # image loading
import skimage.transform   # image resizing
import matplotlib
import matplotlib.pyplot as plt
import numpy as np
from scipy import ndimage
from skimage import color

import keras
from keras.models import Sequential
from keras.layers import Dense, Flatten
from keras.optimizers import SGD
from keras.layers import Dropout
from keras.optimizers import RMSprop

# Allow image embedding in the notebook
%matplotlib inline
Note: when the code above is run in Jupyter Notebook, the scipy and skimage libraries may conflict. The fix is to install scipy through pip, as follows:
sudo pip3 install scipy
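2. Data processing (key section)
The traffic sign dataset used in this post is BelgiumTS. Extract it into the same directory as the notebook, so that the directory contains:
- data/Training/
- data/Testing/
Each of the two folders contains 62 classes, numbered from 00000 to 00061.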
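Before loading any images, it can be worth confirming that the extracted folders match the layout described above. The following is a minimal sketch and not part of the original article: the helper names `list_class_dirs` and `check_layout` are hypothetical, and the only assumption is the structure stated here (62 subfolders with purely numeric names).

```python
import os


def list_class_dirs(data_dir):
    """Return the sorted integer class IDs found as subfolders of data_dir.

    Hypothetical helper: assumes every class folder has a purely numeric
    name such as "00000" ... "00061", as described in the text.
    """
    class_ids = []
    for name in os.listdir(data_dir):
        path = os.path.join(data_dir, name)
        # Skip stray files and any folder that is not a numeric class ID.
        if os.path.isdir(path) and name.isdigit():
            class_ids.append(int(name))
    return sorted(class_ids)


def check_layout(data_dir, expected_classes=62):
    """Return True when data_dir holds exactly classes 0..expected_classes-1."""
    return list_class_dirs(data_dir) == list(range(expected_classes))
```

Calling `check_layout("data/Training")` and `check_layout("data/Testing")` should both return `True` if the archive was extracted as described.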
2.1 Reading the dataset
import skimage.io  # skimage.data.imread is gone from recent scikit-image releases; skimage.io.imread is used instead


def load_data(data_dir):
    """Load a dataset and return two lists.

    images: a list of NumPy arrays, each representing an image.
    labels: a list of numbers that represent the image labels.
    """
    # Get all subdirectories of data_dir. Each represents a label.
    # os.listdir(path) returns the names of the files and directories in path.
    # os.path.isdir() checks whether a path is a directory.
    # os.path.join() combines several path components into one path.
    directories = [d for d in os.listdir(data_dir)
                   if os.path.isdir(os.path.join(data_dir, d))]
    # print(directories)
    # Loop through the label directories and collect the data in
    # two lists, labels and images.
    labels = []
    images = []
    for d in directories:
        label_dir = os.path.join(data_dir, d)
        file_names = [os.path.join(label_dir, f)
                      for f in os.listdir(label_dir)
                      if f.endswith(".ppm")]
        # print(file_names)
        # For each label, load its images and add them to the images list,
        # and add the label number (i.e. the directory name) to the labels list.
        for f in file_names:
            images.append(skimage.io.imread(f))
            labels.append(int(d))
    return images, labels


# Set the training and testing paths, then load the training set.
ROOT_PATH = "data"
train_data_dir = os.path.join(ROOT_PATH, "Training")
test_data_dir = os.path.join(ROOT_PATH, "Testing")

images, labels = load_data(train_data_dir)
2.2 Displaying the images
Once the dataset is loaded, display the first image of each traffic sign class.
def display_images_and_labels(images, labels):
    """Display the first image of each label."""
    unique_labels = set(labels)
    plt.figure(figsize=(15, 15))
    i = 1
    for label in unique_labels:
        # Pick the first image for each label.
        image = images[labels.index(label)]
        plt.subplot(8, 8, i)  # A grid of 8 rows x 8 columns
        plt.axis('off')
        # The title shows the label and the number of images with that label.
        plt.title("Label {0} ({1})".format(label, labels.count(label)))
        i += 1
        _ = plt.imshow(image)
    plt.show()


display_images_and_labels(images, labels)
The result is shown below: there are 62 classes of traffic signs in total.
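2.3 Resizing the images
The images in the dataset do not all have the same size. To train the network, every image must be brought to a common size; here they are resized to (32, 32).
# Resize the images to (32, 32)
images32 = [skimage.transform.resize(image, (32, 32)) for image in images]
display_images_and_labels(images32, labels)
The resized images are shown in the figure below; they now all have the same size.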
Print the shape and value range of the first few resized images:
for image in images32[:5]:
    print("shape: {0}, min: {1}, max: {2}".format(image.shape, image.min(), image.max()))
shape: (32, 32, 3), min: 0.03529411764705882, max: 0.996078431372549
shape: (32, 32, 3), min: 0.03395373774509821, max: 0.996078431372549
shape: (32, 32, 3), min: 0.03694182751225482, max: 0.996078431372549
shape: (32, 32, 3), min: 0.06460056678921586, max: 0.9191425398284314
shape: (32, 32, 3), min: 0.060355392156862725, max: 0.9028492647058823
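The minima and maxima above lie between 0 and 1 because `skimage.transform.resize` returns floating-point images and, by default, rescales 8-bit inputs from [0, 255] to [0, 1]. A small sketch of that arithmetic, assuming 8-bit source pixels (the helper `to_unit_range` is hypothetical and only illustrates the rescaling, it is not a scikit-image function):

```python
def to_unit_range(pixel_value):
    """Map an 8-bit intensity (0..255) to a float in [0, 1]."""
    if not 0 <= pixel_value <= 255:
        raise ValueError("expected an 8-bit intensity")
    return pixel_value / 255.0


# The first printed maximum (0.996078431372549) is consistent with an
# original intensity of 254, and the first minimum with an intensity of 9.
print(to_unit_range(254))
print(to_unit_range(9))
```

Because the pixel values are already in [0, 1] after resizing, the pipeline in this post feeds them to the network without a separate normalization step.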
2.4 Converting the images to NumPy arrays
labels_a = np.array(labels)
images_a = np.array(images32)
print("labels: ", labels_a.shape, "\nimages: ", images_a.shape)
2.5 Converting the images to grayscale
As a preprocessing step, the original three-channel RGB images are converted to grayscale.
images_a = color.rgb2gray(images_a)
display_images_and_labels(images_a, labels)
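For readers curious about what the conversion does numerically: `skimage.color.rgb2gray` is documented to compute a weighted sum of the channels, Y = 0.2125 R + 0.7154 G + 0.0721 B. The sketch below reproduces that weighted sum with plain NumPy; the function name `rgb_to_gray` is hypothetical and the code is an illustration, not the library's implementation.

```python
import numpy as np

# Luminance weights documented for skimage.color.rgb2gray.
GRAY_WEIGHTS = np.array([0.2125, 0.7154, 0.0721])


def rgb_to_gray(images):
    """Collapse the trailing RGB axis of a float array into one gray channel.

    Accepts shape (..., 3), e.g. (N, 32, 32, 3), and returns shape (...).
    """
    images = np.asarray(images, dtype=np.float64)
    if images.shape[-1] != 3:
        raise ValueError("last axis must hold the 3 RGB channels")
    # Matrix-vector product over the last axis: one weighted sum per pixel.
    return images @ GRAY_WEIGHTS
```

Applied to an array of shape (N, 32, 32, 3), this yields shape (N, 32, 32), which is the input shape the `Flatten` layer expects later on.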
2.6 Data augmentation
Here the dataset is expanded to 5 times its original size (each original image plus four randomly shifted copies), and a few of the classes are displayed.
def expend_training_data(train_x, train_y):
    """Augment the training data to 5 times its original size."""
    expanded_images = np.zeros([train_x.shape[0] * 5, train_x.shape[1], train_x.shape[2]])
    expanded_labels = np.zeros([train_x.shape[0] * 5])
    counter = 0
    for x, y in zip(train_x, train_y):
        # register original data
        expanded_images[counter, :, :] = x
        expanded_labels[counter] = y
        counter = counter + 1
        # get a value for the background
        # zero is the expected value, but median() is used to estimate background's value
        bg_value = np.median(x)  # this is regarded as background's value
        for i in range(4):
            # rotate the image with random degree (the rotation itself is commented out)
            angle = np.random.randint(-15, 15, 1)
            # new_img = ndimage.rotate(x, angle, reshape=False, cval=bg_value)
            # shift the image with random distance
            shift = np.random.randint(-2, 2, 2)
            new_img_ = ndimage.shift(x, shift, cval=bg_value)
            # register new training data
            expanded_images[counter, :, :] = new_img_
            expanded_labels[counter] = y
            counter = counter + 1
    return expanded_images, expanded_labels


images_a, labels_a = expend_training_data(images_a, labels_a)
print(images_a.shape, labels_a.shape)
labels = labels_a.tolist()
print(len(labels))


def plot_agument(images, labels):
    plt.figure(figsize=(16, 9))
    unique_labels = set(labels)
    i = 1
    for label in unique_labels:
        # Show only the first three labels.
        if i > 3:
            break
        # Index of the first (original) image for this label; its four
        # augmented copies are stored directly after it.
        img_index = labels.index(label)
        for j in range(5):
            image = images[img_index + j]
            plt.subplot(3, 5, (i - 1) * 5 + j + 1)  # A grid of 3 rows x 5 columns
            plt.axis('off')
            plt.title("Label {0} ({1})".format(label, labels.count(label)))
            _ = plt.imshow(image, cmap='gray')
        i += 1


plot_agument(images_a, labels)
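To make the shifting step more concrete, the sketch below translates an image by whole pixels and fills the exposed border with a constant, which is roughly what `ndimage.shift(x, shift, cval=bg_value)` does for integer offsets. It is a simplified NumPy illustration with a hypothetical name (`shift_with_fill`), not the SciPy implementation, which additionally uses spline interpolation. One detail worth knowing: `np.random.randint(-2, 2, 2)` excludes the upper bound, so the offsets are drawn from {-2, -1, 0, 1}, and a (0, 0) draw yields an unchanged copy.

```python
import numpy as np


def shift_with_fill(image, dy, dx, fill_value):
    """Translate a 2-D image by whole pixels, padding exposed borders.

    Positive dy moves the content down, positive dx moves it right.
    """
    height, width = image.shape
    shifted = np.full_like(image, fill_value)
    # Source and destination windows that still overlap after the move.
    src_rows = slice(max(0, -dy), min(height, height - dy))
    dst_rows = slice(max(0, dy), min(height, height + dy))
    src_cols = slice(max(0, -dx), min(width, width - dx))
    dst_cols = slice(max(0, dx), min(width, width + dx))
    shifted[dst_rows, dst_cols] = image[src_rows, src_cols]
    return shifted
```

Using the median of the image as `fill_value`, as the augmentation code does, keeps the padded border close to the typical background intensity instead of introducing a hard black edge.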
3. Training the neural network
The network is built with Keras, using the Sequential model.
from sklearn.utils import shuffle

# Shuffle the samples and their labels with the same permutation.
indx = np.arange(0, len(labels_a))
indx = shuffle(indx)
images_a = images_a[indx]
labels_a = labels_a[indx]

# The total number of samples is 22875: 20000 for training, the rest for validation.
train_x, val_x = images_a[:20000], images_a[20000:]
train_y, val_y = labels_a[:20000], labels_a[20000:]

# One-hot encode the labels for the 62 classes.
train_y = keras.utils.to_categorical(train_y, 62)
val_y = keras.utils.to_categorical(val_y, 62)

# Build the neural network
model = Sequential()
model.add(Flatten(input_shape=(32, 32)))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(512, activation='relu'))
model.add(Dropout(0.5))
model.add(Dense(62, activation='softmax'))

model.summary()

model.compile(loss='categorical_crossentropy',
              optimizer=RMSprop(),
              metrics=['accuracy'])

history = model.fit(train_x, train_y,
                    batch_size=128,
                    epochs=20,
                    verbose=1,
                    validation_data=(val_x, val_y))

# Print the keys contained in the history object
# print(history.history.keys())
# model.save() stores the whole model (architecture and weights) rather than JSON,
# so an .h5 file name is used here instead of 'model.json'.
model.save('model.h5')
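The parameter counts that `model.summary()` reports for this network can be checked by hand: a fully connected layer has one weight per input-output pair plus one bias per unit, while `Flatten` and `Dropout` add no trainable parameters. A short sketch of that arithmetic (the helper `dense_params` is hypothetical, written only for this check):

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a fully connected layer: weights plus biases."""
    return n_inputs * n_units + n_units


# Layer widths taken from the model above: a flattened 32x32 input,
# two hidden layers of 512 units and a 62-way softmax output.
layer_sizes = [32 * 32, 512, 512, 62]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer, total_params)
```

Most of the parameters sit in the first layer, which connects all 1024 input pixels to 512 units.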
The training and validation loss of the network are shown in the figure:
def plot_training(history):
    # Plot the training and validation loss for each epoch.
    # The model is compiled with categorical cross-entropy, so the plot is labeled accordingly.
    plt.plot(history.history['loss'])
    plt.plot(history.history['val_loss'])
    plt.title('model categorical cross-entropy loss')
    plt.ylabel('categorical cross-entropy loss')
    plt.xlabel('epoch')
    plt.legend(['training set', 'validation set'], loc='upper right')
    plt.show()


plot_training(history=history)
4. Testing and validation
Load the test set:
# Load the test dataset.
test_images, test_labels = load_data(test_data_dir)
# Transform the images, just like we did with the training set.
test_images32 = [skimage.transform.resize(image, (32, 32)) for image in test_images]

test_images_a = np.array(test_images32)
test_labels_a = np.array(test_labels)

test_images_a = color.rgb2gray(test_images_a)

display_images_and_labels(test_images_a, test_labels)
test_x = test_images_a
test_y = keras.utils.to_categorical(test_labels_a, 62)
score = model.evaluate(test_x, test_y, verbose=0)
print('Test loss:', score[0])
print('Test accuracy:', score[1])
The test accuracy is about 88%.
Test loss: 0.49647544848156117
Test accuracy: 0.8789682388305664
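A single overall accuracy can hide classes that the network handles poorly, which is worth checking with 62 classes that may differ in size. One possible per-class breakdown is sketched below. It assumes integer labels such as `test_labels_a` and predicted class indices (for instance from `np.argmax(model.predict(test_x), axis=1)`); the function name `per_class_accuracy` is hypothetical and not part of the original article.

```python
from collections import defaultdict


def per_class_accuracy(true_labels, predicted_labels):
    """Return {class_id: fraction of that class predicted correctly}."""
    if len(true_labels) != len(predicted_labels):
        raise ValueError("label sequences must have the same length")
    totals = defaultdict(int)
    correct = defaultdict(int)
    for truth, prediction in zip(true_labels, predicted_labels):
        totals[int(truth)] += 1
        if int(truth) == int(prediction):
            correct[int(truth)] += 1
    # Only classes that occur in true_labels appear in the result.
    return {label: correct[label] / totals[label] for label in sorted(totals)}
```

Sorting the resulting dictionary by value would show which sign classes contribute most to the remaining errors.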
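5. Testing with the model
A few sample images are selected and classified. The results are shown in the figure; for 2 of the images the prediction does not match the label.
# `predicted` is not defined in the listing; taking the argmax of the
# softmax output is one way to obtain the predicted class indices.
predicted = np.argmax(model.predict(test_x), axis=1)

# Display the predictions and the ground truth visually.
fig = plt.figure(figsize=(10, 10))
j = 1
for i in range(0, 1000, 100):
    truth = test_labels_a[i]
    prediction = predicted[i]
    plt.subplot(5, 2, j)
    j = j + 1
    plt.axis('off')
    # Named text_color so that it does not shadow the skimage `color` module.
    text_color = 'green' if truth == prediction else 'red'
    plt.text(40, 10, "Truth: {0}\nPrediction: {1}".format(truth, prediction),
             fontsize=12, color=text_color)
    plt.imshow(test_x[i], cmap='gray')
plt.show()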
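6. Summary
This post applied deep learning to traffic sign recognition: the data was processed first, then a neural network was trained, and finally the model was tested, reaching a test accuracy of about 88%. Most of the work lies in processing the image data. To improve the accuracy further, a convolutional neural network could be considered.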