Reposted from "Non-Maximum Suppression".
References:
1. Non-Maximum Suppression for Object Detection in Python
2. NMS (Non-Maximum Suppression)
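I have recently been working on a face recognition project, in which the face detection algorithm MTCNN uses NMS to filter the candidate face regions and obtain the best face location.
NMS is very widely used and appears in many popular detection algorithms, including RCNN and SPP-Net, because its main job is to pick the best region out of a pile of candidate regions.
The rough idea is as follows: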
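Suppose we obtain 2000 region proposals from an image. After passing them through RCNN or SPP-net we get a 2000×4096 feature matrix, and N SVMs then score each region for each of the N classes. The SVM weight matrix has size 4096×N, so the result is a 2000×N score matrix (where N is the number of classes).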
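The shape bookkeeping above can be checked with a small sketch. The feature values and weights here are random placeholders, and N = 20 is an arbitrary choice for illustration; only the matrix sizes come from the text.

```python
import numpy as np

# Sizes from the text; num_classes (N) = 20 is an assumed, illustrative value.
num_proposals, feat_dim, num_classes = 2000, 4096, 20

rng = np.random.default_rng(0)
# Placeholder features: one 4096-d vector per region proposal.
features = rng.standard_normal((num_proposals, feat_dim))
# Placeholder SVM weights: one 4096-d column per class.
svm_weights = rng.standard_normal((feat_dim, num_classes))

# (2000 x 4096) @ (4096 x N) -> (2000 x N): one score per region per class.
scores = features @ svm_weights
print(scores.shape)  # (2000, 20)
```

Each column of `scores` holds the scores of all 2000 regions for one class, which is the per-class input that NMS works on.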
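Non-Maximum Suppression uses the score matrix and the region coordinates to find the bounding boxes with relatively high confidence. First, NMS computes the area of every bounding box and sorts the boxes by score, then moves the box with the highest score into the output queue. Next, it computes the IoU between each remaining box and the current highest-scoring box, and discards every box whose IoU exceeds the set threshold. This process repeats until no candidate boxes remain. Overall, bounding-box selection involves two thresholds: one is the IoU threshold, and the other is applied after the process, removing candidate boxes whose score is below a score threshold. Note that Non-Maximum Suppression handles one class at a time: with N classes, it has to be run N times.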
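The procedure described above can be sketched as follows. This is a minimal illustration, not the code of any particular detector: the names `iou`, `nms_by_score` and `nms_per_class` are hypothetical, boxes are assumed to be `(x1, y1, x2, y2)` with positive area, and the area is computed without the `+ 1` pixel convention.

```python
import numpy as np

def iou(box, others):
    """IoU between one box and an array of boxes, all (x1, y1, x2, y2)."""
    xx1 = np.maximum(box[0], others[:, 0])
    yy1 = np.maximum(box[1], others[:, 1])
    xx2 = np.minimum(box[2], others[:, 2])
    yy2 = np.minimum(box[3], others[:, 3])
    inter = np.maximum(0.0, xx2 - xx1) * np.maximum(0.0, yy2 - yy1)
    area_box = (box[2] - box[0]) * (box[3] - box[1])
    area_others = (others[:, 2] - others[:, 0]) * (others[:, 3] - others[:, 1])
    return inter / (area_box + area_others - inter)

def nms_by_score(boxes, scores, iou_thresh=0.5, score_thresh=0.0):
    """Return indices of kept boxes for ONE class, best score first."""
    boxes = np.asarray(boxes, dtype=float)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]      # highest score first
    keep = []
    while order.size > 0:
        best = order[0]
        keep.append(int(best))            # move the best box to the output
        rest = order[1:]
        if rest.size == 0:
            break
        # Drop every remaining box that overlaps the best box too much.
        order = rest[iou(boxes[best], boxes[rest]) <= iou_thresh]
    # Second threshold, applied afterwards: drop low-score survivors.
    return [k for k in keep if scores[k] >= score_thresh]

def nms_per_class(boxes, score_matrix, iou_thresh=0.5, score_thresh=0.0):
    """Run NMS once per class (one column of the score matrix each)."""
    score_matrix = np.asarray(score_matrix, dtype=float)
    return {c: nms_by_score(boxes, score_matrix[:, c], iou_thresh, score_thresh)
            for c in range(score_matrix.shape[1])}

boxes = [[0, 0, 10, 10], [1, 1, 11, 11], [20, 20, 30, 30]]
score_matrix = [[0.9, 0.1], [0.8, 0.2], [0.3, 0.7]]   # 3 boxes x 2 classes
print(nms_per_class(boxes, score_matrix, iou_thresh=0.5, score_thresh=0.25))
# {0: [0, 2], 1: [2]}
```

In the example, boxes 0 and 1 overlap with an IoU of about 0.68, so only the higher-scoring one survives in each class; for class 1 the surviving box 1 is then removed by the score threshold.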
A Python implementation follows (adapted from Non-Maximum Suppression for Object Detection in Python):
# import the necessary packages
import numpy as np
import cv2

# Felzenszwalb et al.
# Note: this version orders boxes by their bottom-right y-coordinate rather
# than by score, and measures overlap relative to the area of box j rather
# than as a true IoU.
def non_max_suppression_slow(boxes, overlapThresh):
    # if there are no boxes, return an empty list
    if len(boxes) == 0:
        return []

    # initialize the list of picked indexes
    pick = []

    # grab the coordinates of the bounding boxes
    x1 = boxes[:, 0]
    y1 = boxes[:, 1]
    x2 = boxes[:, 2]
    y2 = boxes[:, 3]

    # compute the area of the bounding boxes and sort the bounding
    # boxes by the bottom-right y-coordinate of the bounding box
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(y2)

    # keep looping while some indexes still remain in the indexes list
    while len(idxs) > 0:
        # grab the last index in the indexes list, add the index
        # value to the list of picked indexes, then initialize
        # the suppression list (i.e. indexes that will be deleted)
        # using the last index
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)
        suppress = [last]

        # loop over all indexes in the indexes list
        for pos in range(0, last):
            # grab the current index
            j = idxs[pos]

            # find the largest (x, y) coordinates for the start of
            # the bounding box and the smallest (x, y) coordinates
            # for the end of the bounding box
            xx1 = max(x1[i], x1[j])
            yy1 = max(y1[i], y1[j])
            xx2 = min(x2[i], x2[j])
            yy2 = min(y2[i], y2[j])

            # compute the width and height of the overlapping region
            w = max(0, xx2 - xx1 + 1)
            h = max(0, yy2 - yy1 + 1)

            # compute the ratio of overlap between the computed
            # bounding box and the bounding box in the area list
            overlap = float(w * h) / area[j]

            # if there is sufficient overlap, suppress the
            # current bounding box
            if overlap > overlapThresh:
                suppress.append(pos)

        # delete all indexes from the index list that are in the
        # suppression list
        idxs = np.delete(idxs, suppress)

    # return only the bounding boxes that were picked
    return boxes[pick]

# construct a list containing the images that will be examined
# along with their respective bounding boxes
images = [
    ("images/audrey.jpg", np.array([
        (12, 84, 140, 212),
        (24, 84, 152, 212),
        (36, 84, 164, 212),
        (12, 96, 140, 224),
        (24, 96, 152, 224),
        (24, 108, 152, 236)])),
    ("images/bksomels.jpg", np.array([
        (114, 60, 178, 124),
        (120, 60, 184, 124),
        (114, 66, 178, 130)])),
    ("images/gpripe.jpg", np.array([
        (12, 30, 76, 94),
        (12, 36, 76, 100),
        (72, 36, 200, 164),
        (84, 48, 212, 176)]))]

# loop over the images
for (imagePath, boundingBoxes) in images:
    # load the image and clone it
    print("[x] %d initial bounding boxes" % (len(boundingBoxes)))
    image = cv2.imread(imagePath)
    orig = image.copy()

    # loop over the bounding boxes for each image and draw them
    # (coordinates are cast to plain int before being passed to OpenCV)
    for (startX, startY, endX, endY) in boundingBoxes:
        cv2.rectangle(orig, (int(startX), int(startY)),
                      (int(endX), int(endY)), (0, 0, 255), 2)

    # perform non-maximum suppression on the bounding boxes
    pick = non_max_suppression_slow(boundingBoxes, 0.3)
    print("[x] after applying non-maximum, %d bounding boxes" % (len(pick)))

    # loop over the picked bounding boxes and draw them
    for (startX, startY, endX, endY) in pick:
        cv2.rectangle(image, (int(startX), int(startY)),
                      (int(endX), int(endY)), (0, 255, 0), 2)

    # display the images
    cv2.imshow("Original", orig)
    cv2.imshow("After NMS", image)
    cv2.waitKey(0)
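As a side note, the inner Python loop of the function above can be replaced with NumPy array operations. The sketch below is one possible vectorized variant, not part of the original listing; the name `non_max_suppression_fast` is an assumption. It keeps the same ordering (by bottom-right y-coordinate) and the same overlap ratio, and it needs no image files, so it can be run on the box coordinates alone.

```python
import numpy as np

def non_max_suppression_fast(boxes, overlap_thresh):
    """Same selection rule as the loop version, without the inner loop."""
    if len(boxes) == 0:
        return []
    boxes = np.asarray(boxes, dtype=float)
    x1, y1, x2, y2 = boxes[:, 0], boxes[:, 1], boxes[:, 2], boxes[:, 3]
    area = (x2 - x1 + 1) * (y2 - y1 + 1)
    idxs = np.argsort(y2)          # process boxes from the largest y2 down
    pick = []
    while len(idxs) > 0:
        last = len(idxs) - 1
        i = idxs[last]
        pick.append(i)
        rest = idxs[:last]
        # Intersection of box i with every remaining box at once.
        xx1 = np.maximum(x1[i], x1[rest])
        yy1 = np.maximum(y1[i], y1[rest])
        xx2 = np.minimum(x2[i], x2[rest])
        yy2 = np.minimum(y2[i], y2[rest])
        w = np.maximum(0, xx2 - xx1 + 1)
        h = np.maximum(0, yy2 - yy1 + 1)
        # Overlap relative to the area of each remaining box.
        overlap = (w * h) / area[rest]
        # Keep only the boxes that do not overlap box i too much.
        idxs = rest[overlap <= overlap_thresh]
    return boxes[pick].astype(int)

# The six boxes listed for images/audrey.jpg in the script above.
audrey_boxes = np.array([
    (12, 84, 140, 212), (24, 84, 152, 212), (36, 84, 164, 212),
    (12, 96, 140, 224), (24, 96, 152, 224), (24, 108, 152, 236)])
kept = non_max_suppression_fast(audrey_boxes, 0.3)
print(len(kept))  # 1
```

With a threshold of 0.3, all six heavily overlapping boxes collapse to the single box with the largest bottom-right y-coordinate.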
The result of running the demo is shown in the figure below: