SSD Example

1. Introduction
This post describes crack detection on workpieces, implemented as an SSD object-detection task. A sample crack image is shown below:

2. About the SSD Algorithm


The algorithm itself is not described in detail here; for details, see:
https://blog.csdn.net/u013989576/article/details/73439202
https://blog.csdn.net/xiaohu2022/article/details/79833786
https://www.sohu.com/a/168738025_717210

3. Preparing the Training Data
The training data was annotated with LabelImg. For LabelImg installation and usage, see: https://blog.csdn.net/xunan003/article/details/78720189

A tip on labeling crack data: keep the bounding boxes tight and small; this makes convergence easier.

The data follows the VOC2007 format, with three subfolders under the dataset root.

The Annotations folder holds the XML files that LabelImg generates during annotation.
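For reference, a LabelImg annotation looks roughly like this (the file name, image size and box coordinates here are illustrative, not from the original post):

<annotation>
    <folder>JPEGImages</folder>
    <filename>16.jpg</filename>
    <size>
        <width>512</width>
        <height>512</height>
        <depth>3</depth>
    </size>
    <object>
        <name>neg</name>
        <difficult>0</difficult>
        <bndbox>
            <xmin>120</xmin>
            <ymin>200</ymin>
            <xmax>180</xmax>
            <ymax>240</ymax>
        </bndbox>
    </object>
</annotation>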


JPEGImages holds the original images in .jpg format.

ImageSets contains a Main subfolder, which holds four txt files.

These correspond to the trainval, train, validation and test splits: each file lists image names taken at random from the Annotations folder, divided according to fixed ratios.
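For reference, the expected layout is as follows (the dataset root name is taken from the script below):

liewen_two_class/
├── Annotations/     # LabelImg XML files, one per image
├── JPEGImages/      # original .jpg images
└── ImageSets/
    └── Main/        # trainval.txt, train.txt, val.txt, test.txt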

The following script generates the four txt files in Main from the XML files:

import os
import random

trainval_percent = 0.9  # fraction of all samples used for train+val
train_percent = 0.9     # fraction of trainval used for training

xmlfilepath = 'F:/competition code/ssd_keras-master/ssd_keras-master/data/liewen_two_class/Annotations'
txtsavepath = 'F:/competition code/ssd_keras-master/ssd_keras-master/data/liewen_two_class/ImageSets/Main'
total_xml = os.listdir(xmlfilepath)

num = len(total_xml)
indices = range(num)  # renamed so we don't shadow the built-in `list`
tv = int(num * trainval_percent)
tr = int(tv * train_percent)
trainval = random.sample(indices, tv)
train = random.sample(trainval, tr)

ftrainval = open(txtsavepath + '/trainval.txt', 'w')
ftest = open(txtsavepath + '/test.txt', 'w')
ftrain = open(txtsavepath + '/train.txt', 'w')
fval = open(txtsavepath + '/val.txt', 'w')

for i in indices:
    name = total_xml[i][:-4] + '\n'  # strip the '.xml' extension
    if i in trainval:
        ftrainval.write(name)
        if i in train:
            ftrain.write(name)
        else:
            fval.write(name)
    else:
        ftest.write(name)

ftrainval.close()
ftrain.close()
fval.close()
ftest.close()
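For example, with 100 annotation files and the ratios above, the split works out to 90 trainval files (81 train, 9 val) and 10 test files.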

4. Training
Training is done by train_ssd300.py; as the name suggests, the model input is 300x300, but don't worry: the code resizes internally, so images of any size can be used. The source is as follows:

from keras.optimizers import Adam, SGD
from keras.callbacks import ModelCheckpoint, LearningRateScheduler, TerminateOnNaN, CSVLogger
from keras import backend as K
from keras.models import load_model
from math import ceil
import numpy as np
from matplotlib import pyplot as plt

from models.keras_ssd300 import ssd_300
from keras_loss_function.keras_ssd_loss import SSDLoss
from keras_layers.keras_layer_AnchorBoxes import AnchorBoxes
from keras_layers.keras_layer_DecodeDetections import DecodeDetections
from keras_layers.keras_layer_DecodeDetectionsFast import DecodeDetectionsFast
from keras_layers.keras_layer_L2Normalization import L2Normalization

from ssd_encoder_decoder.ssd_input_encoder import SSDInputEncoder
from ssd_encoder_decoder.ssd_output_decoder import decode_detections, decode_detections_fast

from data_generator.object_detection_2d_data_generator import DataGenerator
from data_generator.object_detection_2d_geometric_ops import Resize
from data_generator.object_detection_2d_photometric_ops import ConvertTo3Channels
from data_generator.data_augmentation_chain_original_ssd import SSDDataAugmentation
from data_generator.object_detection_2d_misc_utils import apply_inverse_transforms
import tensorflow as tf

from focal_loss import focal_loss


img_height = 300 # Height of the model input images
img_width = 300 # Width of the model input images
img_channels = 3 # Number of color channels of the model input images
mean_color = [123, 117, 104] # The per-channel mean of the images in the dataset. Do not change this value if you're using any of the pre-trained weights.
swap_channels = [2, 1, 0] # The color channel order in the original SSD is BGR, so we'll have the model reverse the color channel order of the input images.
n_classes = 1 # Number of positive classes, not counting the background class
scales_pascal = [0.1, 0.2, 0.37, 0.54, 0.71, 0.88, 1.05] # The anchor box scaling factors used in the original SSD300 for the Pascal VOC datasets
# Anchors are sampled at six feature-map scales; the last entry (1.05) is effectively unused.
# Each scale gives the anchor side length as a fraction (0.1, 0.2, 0.37, 0.54, ...) of the input
# image's shorter side; the aspect ratios come from aspect_ratios below, and the number and
# shapes of anchors differ from layer to layer.
scales_coco = [0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05] # The anchor box scaling factors used in the original SSD300 for the MS COCO datasets
scales = scales_pascal
aspect_ratios = [[1.0, 2.0, 0.5],
                 [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                 [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                 [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                 [1.0, 2.0, 0.5],
                 [1.0, 2.0, 0.5]] # The anchor box aspect ratios used in the original SSD300; the order matters
two_boxes_for_ar1 = True
steps = [8, 16, 32, 64, 100, 300] # The space between two adjacent anchor box center points for each predictor layer.
offsets = [0.5, 0.5, 0.5, 0.5, 0.5, 0.5] # The offsets of the first anchor box center points from the top and left borders of the image as a fraction of the step size for each predictor layer.
clip_boxes = False # Whether or not to clip the anchor boxes to lie entirely within the image boundaries
variances = [0.1, 0.1, 0.2, 0.2] # The variances by which the encoded target coordinates are divided as in the original implementation
normalize_coords = True
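# For intuition (explanatory note, not part of the original script): in this ssd_keras
# implementation an anchor with scale s and aspect ratio ar measures roughly
#   width  = s * min(img_height, img_width) * sqrt(ar)
#   height = s * min(img_height, img_width) / sqrt(ar)
# so the first predictor layer (s = 0.1) produces boxes around 30x30 px for ar = 1.0
# on a 300x300 input, which is why small, tight ground-truth boxes match well.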

# Either load a previously saved model or build a new one: choose one of the two.
# 1: Build the Keras model.

K.clear_session() # Clear previous models from memory.

model = ssd_300(image_size=(img_height, img_width, img_channels),
                n_classes=n_classes,
                mode='training',
                l2_regularization=0.0005,
                scales=scales,
                aspect_ratios_per_layer=aspect_ratios,
                two_boxes_for_ar1=two_boxes_for_ar1,
                steps=steps,
                offsets=offsets,
                clip_boxes=clip_boxes,
                variances=variances,
                normalize_coords=normalize_coords,
                subtract_mean=mean_color,
                swap_channels=swap_channels)

# 2: Load some weights into the model.

# TODO: Set the path to the weights you want to load.
weights_path = 'VGG_ILSVRC_16_layers_fc_reduced.h5'
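# These are the reduced-fc VGG-16 ImageNet base weights that the ssd_keras project
# links to; training starts from this pre-trained base rather than from scratch.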

model.load_weights(weights_path, by_name=True)
model.summary()
# 3: Instantiate an optimizer and the SSD loss function and compile the model.
#    If you want to follow the original Caffe implementation, use the preset SGD
#    optimizer, otherwise I'd recommend the commented-out Adam optimizer.

# adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)
sgd = SGD(lr=0.0001, momentum=0.9, decay=0.001, nesterov=False)

ssd_loss = SSDLoss(neg_pos_ratio=3, alpha=1.0)

model.compile(optimizer=sgd, loss=ssd_loss.compute_loss, metrics=['accuracy'])


# model.compile(optimizer=sgd,  loss='categorical_crossentropy', metrics=['accuracy'])
# End of model setup.

# Note: gradient explosion occurred here when the learning rate was set too high.
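# If you hit NaN losses, one common mitigation (a suggestion, not part of the original
# script) is gradient clipping via the optimizer's `clipnorm` argument, e.g.:
#   sgd = SGD(lr=0.0001, momentum=0.9, decay=0.001, nesterov=False, clipnorm=1.0)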

# Load the data.

# 1: Instantiate two `DataGenerator` objects: One for training, one for validation.

# Optional: If you have enough memory, consider loading the images into memory for the reasons explained above.

train_dataset = DataGenerator(load_images_into_memory=False, hdf5_dataset_path=None)
val_dataset = DataGenerator(load_images_into_memory=False, hdf5_dataset_path=None)

# 2: Parse the image and label lists for the training and validation datasets. This can take a while.

# TODO: Set the paths to the datasets here.

# The directories that contain the images.
VOC_2007_images_dir      = 'F:/competition code/ssd_keras-master/ssd_keras-master/data/liewen_expand/JPEGImages/'
# VOC_2012_images_dir      = '../../datasets/VOCdevkit/VOC2012/JPEGImages/'

# The directories that contain the annotations.
VOC_2007_annotations_dir      = 'F:/competition code/ssd_keras-master/ssd_keras-master/data/liewen_expand/Annotations/'
# VOC_2012_annotations_dir      = '../../datasets/VOCdevkit/VOC2012/Annotations/'

# The paths to the image sets.
VOC_2007_train_image_set_filename    = 'F:/competition code/ssd_keras-master/ssd_keras-master/data/liewen_expand/ImageSets/Main/train.txt'
# VOC_2012_train_image_set_filename    = '../../datasets/VOCdevkit/VOC2012/ImageSets/Main/train.txt'
VOC_2007_val_image_set_filename      = 'F:/competition code/ssd_keras-master/ssd_keras-master/data/liewen_expand/ImageSets/Main/val.txt'
# VOC_2012_val_image_set_filename      = '../../datasets/VOCdevkit/VOC2012/ImageSets/Main/val.txt'
VOC_2007_trainval_image_set_filename = 'F:/competition code/ssd_keras-master/ssd_keras-master/data/liewen_expand/ImageSets/Main/trainval.txt'
# VOC_2012_trainval_image_set_filename = '../../datasets/VOCdevkit/VOC2012/ImageSets/Main/trainval.txt'
VOC_2007_test_image_set_filename     = 'F:/competition code/ssd_keras-master/ssd_keras-master/data/liewen_expand/ImageSets/Main/test.txt'

# The XML parser needs to know what object class names to look for and in which order to map them to integers.
# classes = ['background',
#            'aeroplane', 'bicycle', 'bird', 'boat',
#            'bottle', 'bus', 'car', 'cat',
#            'chair', 'cow', 'diningtable', 'dog',
#            'horse', 'motorbike', 'person', 'pottedplant',
#            'sheep', 'sofa', 'train', 'tvmonitor']

classes = ['background', 'neg'] # Class names; 'background' must be included here as class 0

train_dataset.parse_xml(images_dirs=[VOC_2007_images_dir],
                        image_set_filenames=[VOC_2007_trainval_image_set_filename],
                        annotations_dirs=[VOC_2007_annotations_dir],
                        classes=classes,
                        include_classes='all',
                        exclude_truncated=False,
                        exclude_difficult=False,
                        ret=False)

val_dataset.parse_xml(images_dirs=[VOC_2007_images_dir],
                      image_set_filenames=[VOC_2007_test_image_set_filename],
                      annotations_dirs=[VOC_2007_annotations_dir],
                      classes=classes,
                      include_classes='all',
                      exclude_truncated=False,
                      exclude_difficult=True,
                      ret=False)

# Optional: Convert the dataset into an HDF5 dataset. This will require more disk space, but will
# speed up the training. Doing this is not relevant in case you activated the `load_images_into_memory`
# option in the constructor, because in that case the images are in memory already anyway. If you don't
# want to create HDF5 datasets, comment out the subsequent two function calls.

train_dataset.create_hdf5_dataset(file_path='dataset_pascal_voc_07+12_trainval.h5',
                                  resize=False,
                                  variable_image_size=True,
                                  verbose=True)

val_dataset.create_hdf5_dataset(file_path='dataset_pascal_voc_07_test.h5',
                                resize=False,
                                variable_image_size=True,
                                verbose=True)


# 3: Set the batch size.

batch_size = 8 # Change the batch size if you like, or if you run into GPU memory issues.

# 4: Set the image transformations for pre-processing and data augmentation options.

# For the training generator:
ssd_data_augmentation = SSDDataAugmentation(img_height=img_height,
                                            img_width=img_width,
                                            background=mean_color)

# For the validation generator:
convert_to_3_channels = ConvertTo3Channels()
resize = Resize(height=img_height, width=img_width)

# 5: Instantiate an encoder that can encode ground truth labels into the format needed by the SSD loss function.

# The encoder constructor needs the spatial dimensions of the model's predictor layers to create the anchor boxes.
predictor_sizes = [model.get_layer('conv4_3_norm_mbox_conf').output_shape[1:3],
                   model.get_layer('fc7_mbox_conf').output_shape[1:3],
                   model.get_layer('conv6_2_mbox_conf').output_shape[1:3],
                   model.get_layer('conv7_2_mbox_conf').output_shape[1:3],
                   model.get_layer('conv8_2_mbox_conf').output_shape[1:3],
                   model.get_layer('conv9_2_mbox_conf').output_shape[1:3]]

ssd_input_encoder = SSDInputEncoder(img_height=img_height,
                                    img_width=img_width,
                                    n_classes=n_classes,
                                    predictor_sizes=predictor_sizes,
                                    scales=scales,
                                    aspect_ratios_per_layer=aspect_ratios,
                                    two_boxes_for_ar1=two_boxes_for_ar1,
                                    steps=steps,
                                    offsets=offsets,
                                    clip_boxes=clip_boxes,
                                    variances=variances,
                                    matching_type='multi',
                                    pos_iou_threshold=0.5,
                                    neg_iou_limit=0.5,
                                    normalize_coords=normalize_coords)

# 6: Create the generator handles that will be passed to Keras' `fit_generator()` function.

train_generator = train_dataset.generate(batch_size=batch_size,
                                         shuffle=True,
                                         transformations=[ssd_data_augmentation],
                                         label_encoder=ssd_input_encoder,
                                         returns={'processed_images',
                                                  'encoded_labels'},
                                         keep_images_without_gt=False)

val_generator = val_dataset.generate(batch_size=batch_size,
                                     shuffle=False,
                                     transformations=[convert_to_3_channels,
                                                      resize],
                                     label_encoder=ssd_input_encoder,
                                     returns={'processed_images',
                                              'encoded_labels'},
                                     keep_images_without_gt=False)

# Get the number of samples in the training and validations datasets.
train_dataset_size = train_dataset.get_dataset_size()
val_dataset_size   = val_dataset.get_dataset_size()

print("Number of images in the training dataset:\t{:>6}".format(train_dataset_size))
print("Number of images in the validation dataset:\t{:>6}".format(val_dataset_size))
print("cuiwei")

def lr_schedule(epoch): # learning-rate schedule, applied via the LearningRateScheduler callback
    if epoch < 100:  # the original separate `< 80` and `< 100` branches both returned 0.0001
        return 0.0001
    else:
        return 0.00001

# Define model callbacks.

# TODO: Set the filepath under which you want to save the model.
model_checkpoint = ModelCheckpoint(filepath='ssd300_model_liehen_expand.h5', # file under which the best model is saved
                                   monitor='val_loss',
                                   verbose=1,
                                   save_best_only=True,
                                   save_weights_only=False,
                                   mode='auto',
                                   period=1)
#model_checkpoint.best =

csv_logger = CSVLogger(filename='ssd300_pascal_07+12_training_log.csv',
                       separator=',',
                       append=True)

learning_rate_scheduler = LearningRateScheduler(schedule=lr_schedule,
                                                verbose=1)

terminate_on_nan = TerminateOnNaN()

callbacks = [model_checkpoint,
             csv_logger,
             learning_rate_scheduler,
             terminate_on_nan]

# If you're resuming a previous training, set `initial_epoch` and `final_epoch` accordingly.
initial_epoch   = 0
final_epoch     = 20
steps_per_epoch = 80

history = model.fit_generator(generator=train_generator,
                              steps_per_epoch=steps_per_epoch,
                              epochs=final_epoch,
                              callbacks=callbacks,
                              validation_data=val_generator,
                              validation_steps=ceil(val_dataset_size/batch_size),
                              initial_epoch=initial_epoch)
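Note that steps_per_epoch is hard-coded to 80 above, so one "epoch" here is 80 batches rather than a full pass over the training set. If you want a true epoch, a common alternative (an adjustment, not in the original script) is:

steps_per_epoch = ceil(train_dataset_size / batch_size)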


5. Testing
After training completes, the model is tested with test_ssd300.py. The source is as follows:

from keras import backend as K
from keras.models import load_model
from keras.preprocessing import image
from keras.optimizers import Adam
from imageio import imread
import numpy as np
from matplotlib import pyplot as plt

from models.keras_ssd300 import ssd_300
from keras_loss_function.keras_ssd_loss import SSDLoss
from keras_layers.keras_layer_AnchorBoxes import AnchorBoxes
from keras_layers.keras_layer_DecodeDetections import DecodeDetections
from keras_layers.keras_layer_DecodeDetectionsFast import DecodeDetectionsFast
from keras_layers.keras_layer_L2Normalization import L2Normalization

from ssd_encoder_decoder.ssd_output_decoder import decode_detections, decode_detections_fast

from data_generator.object_detection_2d_data_generator import DataGenerator
from data_generator.object_detection_2d_photometric_ops import ConvertTo3Channels
from data_generator.object_detection_2d_geometric_ops import Resize
from data_generator.object_detection_2d_misc_utils import apply_inverse_transforms
import cv2

# Set the image size.
img_height = 300
img_width = 300

# # TODO: Set the path to the `.h5` file of the model to be loaded.
# # model_path = 'ssd300_model.h5'
# model_path = 'VGG_VOC0712Plus_SSD_300x300_iter_240000.h5'
# # We need to create an SSDLoss object in order to pass that to the model loader.
# ssd_loss = SSDLoss(neg_pos_ratio=3, n_neg_min=0, alpha=1.0)
#
# K.clear_session() # Clear previous models from memory.
#
# model = load_model(model_path, custom_objects={'AnchorBoxes': AnchorBoxes,
#                                                'L2Normalization': L2Normalization,
#                                                'DecodeDetections': DecodeDetections,
#                                                'compute_loss': ssd_loss.compute_loss})
K.clear_session() # Clear previous models from memory.

model = ssd_300(image_size=(img_height, img_width, 3),
                n_classes=1,
                mode='inference',
                l2_regularization=0.0005,
                scales=[0.1, 0.2, 0.37, 0.54, 0.71, 0.88, 1.05], # The scales for MS COCO are [0.07, 0.15, 0.33, 0.51, 0.69, 0.87, 1.05]
                aspect_ratios_per_layer=[[1.0, 2.0, 0.5],
                                         [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                         [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                         [1.0, 2.0, 0.5, 3.0, 1.0/3.0],
                                         [1.0, 2.0, 0.5],
                                         [1.0, 2.0, 0.5]],
                two_boxes_for_ar1=True,
                steps=[8, 16, 32, 64, 100, 300],
                offsets=[0.5, 0.5, 0.5, 0.5, 0.5, 0.5],
                clip_boxes=False,
                variances=[0.1, 0.1, 0.2, 0.2],
                normalize_coords=True,
                subtract_mean=[123, 117, 104],
                swap_channels=[2, 1, 0],
                confidence_thresh=0.5,
                iou_threshold=0.45,
                top_k=200,
                nms_max_output_size=400)

# 2: Load the trained weights into the model.

# TODO: Set the path of the trained weights.
# weights_path ='VGG_VOC0712Plus_SSD_300x300_iter_240000.h5'
# weights_path ='ssd300_model_liehen_small.h5'
# weights_path ='ssd300_model_liehen_expand.h5'
weights_path ='ssd300_model_liehen.h5'
model.load_weights(weights_path, by_name=True)

# 3: Compile the model so that Keras won't complain the next time you load it.

adam = Adam(lr=0.001, beta_1=0.9, beta_2=0.999, epsilon=1e-08, decay=0.0)

ssd_loss = SSDLoss(neg_pos_ratio=3, alpha=1.0)

model.compile(optimizer=adam, loss=ssd_loss.compute_loss)
model.summary()

orig_images = [] # Store the images here.
input_images = [] # Store resized versions of the images here.

# We'll only load one image in this example.
# img_path = 'VOC2007/JPEGImages/16.jpg'
img_path='F:/Data/crack image/ChallengeDataset/ChallengeDataset/train/neg/428.jpg'
image_opencv=cv2.imread(img_path)
# img_path='VOCtest_06-Nov-2007/VOCdevkit/VOC2007/JPEGImages/000001.jpg'
orig_images.append(imread(img_path))
img = image.load_img(img_path, target_size=(img_height, img_width))
img = image.img_to_array(img)
input_images.append(img)
input_images = np.array(input_images)

# Run prediction on the new image.
y_pred = model.predict(input_images)
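# In 'inference' mode y_pred has shape (batch, top_k, 6), one row per detection:
# [class_id, confidence, xmin, ymin, xmax, ymax], in 300x300 input coordinates.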
confidence_threshold = 0
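# The model was built with confidence_thresh=0.5, so detections below 0.5 were already
# discarded inside the DecodeDetections layer; this post-filter of 0 simply keeps
# everything the model returns.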

y_pred_thresh = [y_pred[k][y_pred[k,:,1] > confidence_threshold] for k in range(y_pred.shape[0])]

np.set_printoptions(precision=2, suppress=True, linewidth=90)
print("Predicted boxes:\n")
print('   class   conf xmin   ymin   xmax   ymax')
print(y_pred_thresh[0])


# Display the image and draw the predicted boxes onto it.

# Set the colors for the bounding boxes
colors = plt.cm.hsv(np.linspace(0, 1, 21)).tolist()
# classes = ['background',
#            'aeroplane', 'bicycle', 'bird', 'boat',
#            'bottle', 'bus', 'car', 'cat',
#            'chair', 'cow', 'diningtable', 'dog',
#            'horse', 'motorbike', 'person', 'pottedplant',
#            'sheep', 'sofa', 'train', 'tvmonitor']

classes=['background','neg']

plt.figure(figsize=(20,12))
plt.imshow(orig_images[0])

current_axis = plt.gca()

for box in y_pred_thresh[0]:
    # Transform the predicted bounding boxes for the 300x300 image to the original image dimensions.
    xmin = box[2] * orig_images[0].shape[1] / img_width
    ymin = box[3] * orig_images[0].shape[0] / img_height
    xmax = box[4] * orig_images[0].shape[1] / img_width
    ymax = box[5] * orig_images[0].shape[0] / img_height
    color = colors[int(box[0])]
    label = '{}: {:.2f}'.format(classes[int(box[0])], box[1])
    current_axis.add_patch(plt.Rectangle((xmin, ymin), xmax-xmin, ymax-ymin, color=color, fill=False, linewidth=2))
    current_axis.text(xmin, ymin, label, size='x-large', color='white', bbox={'facecolor':color, 'alpha':1.0})
    cv2.putText(image_opencv, label, (int(xmin), int(ymin)-10), cv2.FONT_HERSHEY_COMPLEX, 0.8, (255, 255, 0), 1)
    cv2.rectangle(image_opencv, (int(xmin), int(ymin)), (int(xmax), int(ymax)), (0, 255, 0), 2)

cv2.namedWindow("Canvas",0)
cv2.imshow("Canvas", image_opencv)
cv2.waitKey(0)
Test results:


6. Source Code and Data
SSD crack-detection source code: https://download.csdn.net/download/qq_29462849/10748838
Crack image data: https://download.csdn.net/download/qq_29462849/10748828

7. Pitfalls to Watch For
With this SSD code, the learning rate must be kept small; 0.0001 is used here. Setting it larger causes gradient explosion, which I have experienced first-hand.
When preparing the data, separate foreground from background carefully. For clearly visible cracks, the bounding boxes can be small and should include as little irrelevant background as possible; for faint cracks, the boxes need to be larger, since a faint crack can only be distinguished from the background by its length.
8. An Alternative Approach
If you don't need to localize the crack within the image, but only to decide whether the whole image contains one, here is a useful idea: slice the image, for example into four pieces along the y-axis, as shown in the figure below. Since cracks mostly lie at the same horizontal level, slicing along the y-axis preserves complete cracks as far as possible. Pick out the cracked and crack-free slices, assign them labels, and feed them into a classification network such as DenseNet for training.

Once the network is trained, it can classify scene images; each scene image just has to be sliced the same way, with every slice classified individually. The decision rule: if any slice of a scene image is classified as cracked, the whole scene image is considered cracked.

The slicing direction and the number of slices can be adapted to your own working conditions; it does not have to be four.
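As a rough sketch of this slice-and-vote idea (the file name, slice count and classifier call below are placeholders, not from the original post):

import cv2

def slice_along_y(img, n_slices=4):
    # Cut the image into n horizontal strips (cuts made along the y-axis),
    # so that mostly-horizontal cracks stay intact within a strip.
    h = img.shape[0]
    bounds = [i * h // n_slices for i in range(n_slices + 1)]
    return [img[bounds[i]:bounds[i + 1]] for i in range(n_slices)]

image = cv2.imread('scene.jpg')  # placeholder path
strips = slice_along_y(image, n_slices=4)
# `classify` stands for any trained binary classifier (e.g. a DenseNet);
# the scene is flagged as cracked if any strip is predicted as cracked:
# has_crack = any(classify(strip) == 'crack' for strip in strips)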

If you have any questions, feel free to reach me on QQ at 516999497 to discuss.


————————————————
Copyright notice: this is an original article by CSDN blogger "Oliver Cui", released under the CC 4.0 BY-SA license. Please include the original source link and this notice when reposting.
Original link: https://blog.csdn.net/qq_29462849/article/details/83472430

