A Generative Adversarial Network-based Deep Learning Method for Low-quality Defect Image Reconstruction and Recognition

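Main idea: low-quality (blurred) images are reconstructed with a GAN to produce clearer images, which are then passed to the recognition network, so the recognition accuracy improves. (GAN + VGG16)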

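Dataset: the NEU dataset is a hot-rolled steel strip surface defect dataset with six defect types: crazing (Cr), inclusion (In), patches (Pa), pitted surface (PS), rolled-in scale (RS) and scratches (Sc).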

0 Abstract

This paper proposes a generative adversarial network (GAN)-based deep learning (DL) method for low-quality defect image recognition. A GAN reconstructs the low-quality defect images, and a VGG16 network recognizes the reconstructed images.

The results on PSNR, SSIM, cosine similarity and mutual information indicate that the quality of the reconstructed images is greatly improved, which is very helpful for defect analysis.
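Three of the image-quality metrics named above can be sketched in a few lines of NumPy. This is a minimal illustration, not the paper's evaluation code: the function names, the assumption that images are float arrays scaled to [0, 1], and the histogram bin count are all choices made here. SSIM is omitted because its standard form needs a sliding window.

```python
import numpy as np

def psnr(ref, test, max_val=1.0):
    """Peak signal-to-noise ratio in dB for images scaled to [0, max_val]."""
    mse = np.mean((ref.astype(np.float64) - test.astype(np.float64)) ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))

def cosine_similarity(ref, test):
    """Cosine of the angle between the two images flattened to vectors."""
    a = ref.astype(np.float64).ravel()
    b = test.astype(np.float64).ravel()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def mutual_information(ref, test, bins=32):
    """Histogram-based mutual information in nats (bins is an assumed setting)."""
    joint, _, _ = np.histogram2d(ref.ravel(), test.ravel(), bins=bins)
    pxy = joint / joint.sum()              # joint distribution of intensity pairs
    px = pxy.sum(axis=1, keepdims=True)    # marginal of ref, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)    # marginal of test, shape (1, bins)
    nz = pxy > 0                           # skip empty cells to avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```

For all three metrics, a higher value means the reconstructed image is closer to the high-quality reference.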

I. INTRODUCTION

Although these methods, especially DL, have achieved good results, most of them are sensitive to the quality of the defect images.

Masks and noise are the two main causes of low-quality defect images.

First, most of these methods require explicit knowledge to design or select the operator. If the operator is not suitable, these methods might not work as well as expected.

Second, most of these methods consider only noisy defect images; masked images, which are equally important, have not been considered yet.

Furthermore, previous research aimed only to avoid the influence on the recognition results; the lost information is not filled in, so defect analysis remains difficult.

An effective DL method for low-quality defect images is therefore urgently needed.

The motivation of this paper is to propose a generic and automatic method that improves the image quality and recognition results while avoiding explicit operator design and replacing manual repair.

To achieve this goal, the proposed method uses a GAN for low-quality image reconstruction and introduces a contextual loss [17] to make the reconstructed image as realistic as possible. A VGG16 network [12] is then built to recognize the reconstructed defect images.

In the first part, the proposed method is compared with conventional reconstruction methods to evaluate its reconstruction performance. In the second part, the proposed method is compared with conventional DL methods and defect recognition methods to show whether it performs better on low-quality defect recognition.

II. PROPOSED GAN-BASED DL METHOD FOR LOW-QUALITY DEFECT IMAGE

A. The Framework of the Proposed Method

The motivation of the proposed method is to use a GAN to learn the textural information from the defect images and to reconstruct a high-quality defect image, improving both the image quality and the recognition results. The proposed method is composed of defect reconstruction and defect recognition; its diagram is presented in Fig. 1.

The details of the proposed GAN are presented in Section II.B. To make the reconstructed images as realistic as the ground-truth high-quality defect images, a contextual loss Lcon is introduced into the proposed GAN.

Since previous work suggests that a pretrained VGG16 performs better [12], the VGG16 network is first pretrained on ImageNet.

In the training phase, the GAN first learns the textural information from the high-quality defect images. After that, the GAN is fixed and used to reconstruct the low-quality defect images, and the reconstructed images are used to train the VGG16 network.

B. The Proposed GAN for Low-quality Defect Reconstruction

The whole GAN can be regarded as a two-player minimax game, and a well-trained GAN can generate fake images that fool the discriminator.
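For reference, the two-player minimax game is usually written in the standard form below, with generator G, discriminator D, data distribution p_data and latent prior p_z. These notes do not record the exact objective used in the paper, so this is the textbook GAN formulation rather than the authors' own equation.

```latex
\min_{G} \max_{D} V(D, G) =
  \mathbb{E}_{x \sim p_{\mathrm{data}}(x)} \left[ \log D(x) \right]
  + \mathbb{E}_{z \sim p_{z}(z)} \left[ \log \left( 1 - D(G(z)) \right) \right]
```

The discriminator maximizes V by scoring real images high and generated images low, while the generator minimizes V by producing images that the discriminator scores as real.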

The architecture of the proposed GAN is shown in Fig. 2.

After training, the discriminator is discarded; only the well-trained generator is retained and fixed to reconstruct the low-quality defect images.

C. Contextual Loss

Traditional regularizations, such as L1 and L2, can produce a rough sketch but fail to reconstruct the fine details.

To overcome this limitation, a contextual loss Lcon [17] is introduced into the proposed GAN.

D. The VGG16 Network for Defect Recognition

The VGG16 network has five blocks. The first two blocks each have two convolutional layers and one max-pooling layer, while the remaining three each have three convolutional layers and one max-pooling layer. After the last block, a global average pooling (GAP) layer is added for vectorization, and a classification layer is connected to the GAP layer.
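The block layout described above can be traced with a small framework-free sketch. The channel widths (64, 128, 256, 512, 512) are those of the standard VGG16; the input size of 200 pixels and the helper name `describe_vgg16_gap` are assumptions made here for illustration, since these notes do not state the input resolution used in the paper.

```python
# (number of convolutional layers, output channels) for each of the five blocks.
# The channel widths follow the standard VGG16 configuration.
VGG16_BLOCKS = [(2, 64), (2, 128), (3, 256), (3, 512), (3, 512)]

def describe_vgg16_gap(input_size=200, num_classes=6):
    """Trace the feature-map size through the five blocks and the GAP head.

    input_size=200 is an assumed resolution; num_classes=6 matches the six
    NEU defect types.
    """
    size = input_size
    blocks = []
    for index, (num_convs, channels) in enumerate(VGG16_BLOCKS, start=1):
        # 3x3 convolutions with padding 1 keep the spatial size unchanged;
        # the 2x2 max-pooling with stride 2 halves it (rounding down).
        size //= 2
        blocks.append((f"block{index}", num_convs, channels, size))
    gap_features = VGG16_BLOCKS[-1][1]  # GAP reduces each channel to one value
    return {
        "blocks": blocks,
        "conv_layers": sum(n for n, _ in VGG16_BLOCKS),
        "gap_features": gap_features,
        "classifier_shape": (gap_features, num_classes),
    }
```

Because GAP averages each of the 512 final channels to a single number, the classification layer needs only 512 inputs regardless of the input resolution, which is far fewer parameters than the fully connected head of the original VGG16.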

Furthermore, the image is normalized to [0, 1] before being fed into the VGG16, and no extra data augmentation is adopted.
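A minimal sketch of the normalization step, assuming the images are stored as 8-bit arrays with values 0 to 255 (the notes do not state the storage format, and the function name is hypothetical):

```python
import numpy as np

def normalize_to_unit_range(image_uint8):
    """Map 8-bit pixel values 0..255 to floats in [0, 1] (assumes uint8 input)."""
    return image_uint8.astype(np.float32) / 255.0
```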

III. EXPERIMENTAL RESULTS OF LOW-QUALITY DEFECT IMAGE RECONSTRUCTION AND RECOGNITION

To evaluate its performance, the proposed method is tested on several types of low-quality defect images with masks and noise. The masked defect images involve the center mask, random mask, multi-mask and Nvidia mask [18], and the noisy defect images involve binary noise with different fractions.

A. Experimental Dataset and Setting-up

In vision-based defect recognition, masks and noise are the two main causes that degrade image quality.

The masked defect images involve the center mask, random mask, multi-mask and Nvidia mask [18]. The center mask blocks the center of the defect image, and the random mask blocks the defect image at random. The multi-mask adds several blocks to the image, and the Nvidia mask is painted randomly.
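The three block-style masks can be sketched as binary arrays (1 = keep, 0 = masked). This is an illustration under assumptions: the notes do not give block sizes or counts, and interpreting the random mask as a single square block at a random position is a guess made here. The irregular Nvidia mask [18] is not reproduced.

```python
import numpy as np

def center_mask(shape, size):
    """Mask with one size x size block of zeros at the image center."""
    h, w = shape
    mask = np.ones(shape, dtype=np.float32)
    top, left = (h - size) // 2, (w - size) // 2
    mask[top:top + size, left:left + size] = 0.0
    return mask

def random_mask(shape, size, rng):
    """Mask with one size x size block at a uniformly random position."""
    h, w = shape
    mask = np.ones(shape, dtype=np.float32)
    top = rng.integers(0, h - size + 1)   # upper bound is exclusive
    left = rng.integers(0, w - size + 1)
    mask[top:top + size, left:left + size] = 0.0
    return mask

def multi_mask(shape, size, count, rng):
    """Mask with several randomly placed blocks (blocks may overlap)."""
    mask = np.ones(shape, dtype=np.float32)
    for _ in range(count):
        mask *= random_mask(shape, size, rng)
    return mask

def apply_mask(image, mask):
    """Zero out the masked pixels of an image with the same shape as the mask."""
    return image * mask
```

A low-quality sample is then `apply_mask(image, center_mask(image.shape, 16))`, with the original image kept as the reconstruction target.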

The NEU dataset is a hot-rolled steel strip surface defect dataset with six defect types. The defects include crazing (Cr), inclusion (In), patches (Pa), pitted surface (PS), rolled-in scale (RS) and scratches (Sc).

In the training phase, masks and noise are added to the high-quality defect images, and the proposed method is trained to reconstruct and recognize these low-quality defect images.
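Adding noise at a given fraction can be sketched as follows. The notes only say "binary noise with different fractions", so reading it as replacing that fraction of pixels with 0 or 1 at equal probability (salt-and-pepper style) is an assumption, as is the function name.

```python
import numpy as np

def add_binary_noise(image, fraction, rng):
    """Replace a given fraction of pixels with 0 or 1, chosen with equal probability.

    Assumes the image is already scaled to [0, 1].
    """
    noisy = image.astype(np.float32).copy()
    num_noisy = int(round(fraction * noisy.size))
    # Pick distinct pixel positions so exactly num_noisy pixels are replaced.
    idx = rng.choice(noisy.size, size=num_noisy, replace=False)
    noisy.flat[idx] = rng.integers(0, 2, size=num_noisy).astype(np.float32)
    return noisy
```

The clean image stays untouched and serves as the ground truth when training the reconstruction network.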

In the application phase, the test set, to which masks and noise have been added, is fed into the well-trained model to evaluate its performance. Examples of low-quality defect images are presented in Fig. 5.

B. The Reconstruction Results of the Masked Defect Images

The GAN can learn useful information from the defect images and, based on the learned information, the proposed GAN can reconstruct high-quality images successfully.

The L1-AE and L2-AE fail to reconstruct the images, and the CAE can only reconstruct a blurry one.

The contextual loss is useful for improving the quality of the reconstructed images.

C. The Reconstruction Results of the Noisy Defect Images

The reconstructed images are presented in Fig. 7.

This result shows that the proposed method outperforms the other methods and that the recognition accuracy on low-quality defect images is greatly improved.

F. Recognition Results with the Reconstructed Defect Images

To evaluate the performance of the proposed method, this section presents the recognition results of the comparison methods on the reconstructed defect images. The comparison methods include GP-CNN, PDDNET, Alexnet and CASAE, all of which are trained and tested on the reconstructed images. The recognition results are shown in TABLE V.

This result suggests that the VGG16 in the proposed method is more suitable for recognizing the reconstructed defect images. This is mainly because the VGG16 has an appropriate network architecture; the pretrained weights are also important.

These results also show that the approach of the proposed method, namely using a GAN to reconstruct the defect images and improve image quality, is effective.

IV. DISCUSSION

A. The Performance of the Proposed Method under the Mixture of Mask and Noise

TABLE I indicates that the multi-mask is more difficult to reconstruct and recognize; this section therefore evaluates the proposed GAN under the multi-mask with different noise fractions. The experimental setup is the same as in Section III, and the evaluation results are shown in TABLE VI and Fig. 8.

Performance analysis under the multi-mask with different noise fractions:

B. The Analysis of the Proposed GAN

The proposed GAN uses a hybrid loss function, including the reconstruction loss Lrec, the contextual loss Lcon and the adversarial loss Ladv, and a warm-up strategy is also used for stable training.
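One common way to combine such terms is a weighted sum in which the adversarial weight is ramped up during the first training steps. The sketch below is an illustration only: the notes give neither the loss weights nor the form of the warm-up, so the linear ramp, the default values of `lambda_rec`, `lambda_con`, `lambda_adv` and `warmup_steps`, and both function names are assumptions made here, not the paper's settings.

```python
def warmup_weight(step, warmup_steps, target):
    """Linearly ramp a loss weight from 0 to target over warmup_steps (assumed schedule)."""
    if warmup_steps <= 0:
        return target
    return target * min(1.0, step / warmup_steps)

def hybrid_loss(l_rec, l_con, l_adv, step,
                lambda_rec=1.0, lambda_con=1.0, lambda_adv=0.01,
                warmup_steps=1000):
    """Weighted sum of Lrec, Lcon and Ladv; all weights are hypothetical values.

    The adversarial term is phased in gradually so that early training is
    driven by the reconstruction and contextual terms.
    """
    w_adv = warmup_weight(step, warmup_steps, lambda_adv)
    return lambda_rec * l_rec + lambda_con * l_con + w_adv * l_adv
```

With these defaults, the adversarial term contributes nothing at step 0 and reaches its full weight at step 1000, after which the weights stay constant.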

Fig. 9 presents the reconstructed samples under the center mask. The reconstructed images show that the GANs with the contextual loss produce a clear reconstruction in the masked area. For the non-masked area, the last GAN also retains some detailed information. This result therefore suggests that the last GAN, which is the one used in the proposed method, is more suitable for reconstructing these low-quality defect images.

C. The Recognition Results of the Other GANs

These comparison methods are trained to reconstruct the low-quality defect images, and the reconstructed images are fed into the VGG16 for recognition.

Although previous work has shown that these GANs have strong image generation abilities, they fail to reconstruct these low-quality defect images or to improve the recognition results.

The main reason for the worse results is that defect recognition requires more detailed textural information, while most of the comparison methods focus on rough sketches.

V. CONCLUSION AND FUTURE WORK

This paper proposes a GAN-based DL method for low-quality defect recognition.

First, this paper proposes a new approach that uses a GAN to reconstruct the low-quality defect images and improve image quality.

The proposed method is generic: it is effective not only on noisy images but also on masked images.

A VGG16 network with global average pooling is built to recognize the reconstructed defect images.

Future work will focus on two directions.

One is to develop a lightweight model that is fast to train. The other is to introduce incremental learning into the proposed method so that it can adapt to newly emerging defect types.