目录
- HOG是什么?
- HOG vs SIFT
- HOG步骤
- HOG在检测行人中的方式
- Opencv实现
- HOGDescriptor的构造函数:
- 行人检测HOG+SVM步骤
- 简化版的HOG计算
HOG是什么?
方向梯度直方图( Histogram of Oriented Gradient, HOG )特征是一种在计算机视觉和图像处理中用来进行物体检测的特征描述子。它通过
计算和统计图像局部区域的梯度方向直方图来构成特征。Hog特征结合SVM分类器已经被广泛应用于图像识别中。
HOG vs SIFT
SIFT :对特征点的描述方法
HOG :对一定区域的特征量的描述方法
1、可以表现较大的形状
2、非常适合行人及车辆检测
假设我们在智能驾驶中要检测行人:
正样本:
负样本:
识别的本质是要找到正样本和负样本最本质的区别。例如行人在肩部具有横向边缘、两臂具有竖向边缘。而非行人样本中的边缘是杂乱无章的。因此可以通过构建梯度的直方图来检测形状。由于直方图损失了空间信息,所以HOG将图像分割为一个一个小的区域(联系阈值处理中的分块处理法),对小的区域分别构建直方图,然后拼接得到一个大的直方图。
HOG的缺点: 速度慢,实时性差;难以处理遮挡问题。
HOG特征不具有旋转鲁棒性,以及尺度鲁棒性
HOG步骤
1、Gamma矫正(增强图像的对比度)
2、计算梯度信息
3、以cell(一个像素块)为单位计算梯度直方图
4、以block(几个cell为一个block)为单位,对特征量进行归一化
具体步骤:
一般来说对梯度方向进行九等分量化。
一般以3* 3的像素组成一个cell,这样每个cell就可以得到一个9维的直方图。
每 3*3个cell组成一个block,在每个block进行归一化:
归一化的目的:增强对亮度的鲁棒性。
HOG在检测行人中的方式
通常采用滑窗的方式:
计算滑窗中包含的像素的梯度直方图,然后与行人模板中的直方图进行对比(如利用各种矩),当两者十分相似时,我们就认为这个区域是行人区域。
从而延生出的问题:
由于模板是固定大小的,因此只能检测固定大小的行人。当图像中的行人尺寸发生变化时,如何使用一个单一的模板检测?
Opencv实现
OpenCV实现了两种类型的基于HOG特征的行人检测,分别是SVM和Cascade,OpenCV自带的级联分类器的文件的位置在“XX\opencv\sources\data\hogcascades”(OpenCV4.x版本可用)。
opencv自带的人数检测文件,所在位置在opencv的安装目录下(下面是我的安装位置):
D:\Program Files\opencv\sources\samples\cpp
HOGDescriptor的构造函数:
CV_WRAP HOGDescriptor() : winSize(64,128), blockSize(16,16), blockStride(8,8),cellSize(8,8), nbins(9), derivAperture(1), winSigma(-1),histogramNormType(HOGDescriptor::L2Hys), L2HysThreshold(0.2), gammaCorrection(true),free_coef(-1.f), nlevels(HOGDescriptor::DEFAULT_NLEVELS), signedGradient(false){}
窗口大小 winSize(64,128), 块大小blockSize(16,16), 块滑动增量blockStride(8,8), 胞元大小cellSize(8,8), 梯度方向数nbins(9)。
上面这些都是HOGDescriptor的成员变量,括号里的数值是它们的默认值,它们反应了HOG描述子的参数。
HOGDescriptor中有两种Detector分别是:getDaimlerPeopleDetector、getDefaultPeopleDetector
行人检测HOG+SVM步骤
参考的代码:
#include <opencv2/objdetect.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <iostream>
#include <iomanip>using namespace cv;
using namespace std;class Detector
{//enum Mode { Default, Daimler } m;enum { Default, Daimler };//定义枚举类型int m;HOGDescriptor hog, hog_d;
public:Detector(int a) : m(a), hog(), hog_d(Size(48, 96), Size(16, 16), Size(8, 8), Size(8, 8), 9)//构造函数,初始化对象时自动调用,m,hog,hog_d是数据成员,后跟一个放在圆括号中的初始化形式{hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());hog_d.setSVMDetector(HOGDescriptor::getDaimlerPeopleDetector());}void toggleMode() { m = (m == Default ? Daimler : Default); }string modeName() const { return (m == Default ? "Default" : "Daimler"); }vector<Rect> detect(InputArray img){// Run the detector with default parameters. to get a higher hit-rate// (and more false alarms, respectively), decrease the hitThreshold and// groupThreshold (set groupThreshold to 0 to turn off the grouping completely).vector<Rect> found;if (m == Default)hog.detectMultiScale(img, found, 0, Size(8, 8), Size(32, 32), 1.05, 2, false);else if (m == Daimler)hog_d.detectMultiScale(img, found, 0.5, Size(8, 8), Size(32, 32), 1.05, 2, true);return found;}void adjustRect(Rect& r) const{// The HOG detector returns slightly larger rectangles than the real objects,// so we slightly shrink the rectangles to get a nicer output.r.x += cvRound(r.width * 0.1);r.width = cvRound(r.width * 0.8);r.y += cvRound(r.height * 0.07);r.height = cvRound(r.height * 0.8);}
};
//修改参数区域
static const string keys = "{ help h | | print help message }"
"{ camera c | 0 | capture video from camera (device index starting from 0) }"
"{ video v | D:/opencv/opencv4.0/opencv4.0.0/sources/samples/data/vtest.avi| use video as input }";int main(int argc, char** argv)
{CommandLineParser parser(argc, argv, keys); //keys:描述可接受的命令行参数的字符串parser.about("This sample demonstrates the use ot the HoG descriptor.");//设置相关信息。相关信息会在 printMessage 被调用时显示。if (parser.has("help")){parser.printMessage();return 0;}int camera = parser.get<int>("camera");string file = parser.get<string>("video");if (!parser.check())//检查解析错误。当错误发生时返回true。错误可能是转换错误、丢失参数等。{parser.printErrors();return 1;}VideoCapture cap;if (file.empty())cap.open(camera);elsecap.open(file.c_str());if (!cap.isOpened()){cout << "Can not open video stream: '" << (file.empty() ? "<camera>" : file) << "'" << endl;return 2;}cout << "Press 'q' or <ESC> to quit." << endl;cout << "Press <space> to toggle between Default and Daimler detector" << endl;//Default and Daimler detectorDetector detector(1); //初始化使用Daimler detectorMat frame;for (;;){cap >> frame;if (frame.empty()){cout << "Finished reading: empty frame" << endl;break;}int64 t = getTickCount();vector<Rect> found = detector.detect(frame);t = getTickCount() - t;// show the window{ostringstream buf;buf << "Mode: " << detector.modeName() << " ||| "<< "FPS: " << fixed << setprecision(1) << (getTickFrequency() / (double)t);putText(frame, buf.str(), Point(10, 30), FONT_HERSHEY_PLAIN, 2.0, Scalar(0, 0, 255), 2, LINE_AA);}for (vector<Rect>::iterator i = found.begin(); i != found.end(); ++i){Rect& r = *i;detector.adjustRect(r);rectangle(frame, r.tl(), r.br(), cv::Scalar(0, 255, 0), 2);}imshow("People detector", frame);// interact with userconst char key = (char)waitKey(30);if (key == 27 || key == 'q') // ESC{cout << "Exit requested" << endl;break;}else if (key == ' '){detector.toggleMode();}}return 0;
}
简化后的对单张图片的检测
#include <opencv2/objdetect.hpp>
#include <opencv2/highgui.hpp>
#include <opencv2/imgproc.hpp>
#include <opencv2/videoio.hpp>
#include <iostream>
#include <iomanip>using namespace cv;
using namespace std;class Detector
{//enum Mode { Default, Daimler } m;enum { Default, Daimler };//定义枚举类型int m;HOGDescriptor hog, hog_d;
public:Detector(int a) : m(a), hog(), hog_d(Size(48, 96), Size(16, 16), Size(8, 8), Size(8, 8), 9)//构造函数,初始化对象时自动调用,m,hog,hog_d是数据成员,后跟一个放在圆括号中的初始化形式{hog.setSVMDetector(HOGDescriptor::getDefaultPeopleDetector());hog_d.setSVMDetector(HOGDescriptor::getDaimlerPeopleDetector());}void toggleMode() { m = (m == Default ? Daimler : Default); }string modeName() const { return (m == Default ? "Default" : "Daimler"); }vector<Rect> detect(InputArray img){// Run the detector with default parameters. to get a higher hit-rate// (and more false alarms, respectively), decrease the hitThreshold and// groupThreshold (set groupThreshold to 0 to turn off the grouping completely).vector<Rect> found;if (m == Default)hog.detectMultiScale(img, found, 0, Size(8, 8), Size(32, 32), 1.05, 2, false);else if (m == Daimler)hog_d.detectMultiScale(img, found, 0.5, Size(8, 8), Size(32, 32), 1.05, 2, true);return found;}void adjustRect(Rect& r) const{// The HOG detector returns slightly larger rectangles than the real objects,// so we slightly shrink the rectangles to get a nicer output.r.x += cvRound(r.width * 0.1);r.width = cvRound(r.width * 0.8);r.y += cvRound(r.height * 0.07);r.height = cvRound(r.height * 0.8);}
};int main(int argc, char** argv)
{Detector detector(1); //初始化使用Daimler detectorMat img=imread("D:\\opencv_picture_test\\HOG行人检测\\timg.jpg");vector<Rect> found = detector.detect(img);for (vector<Rect>::iterator i = found.begin(); i != found.end(); ++i){Rect& r = *i;detector.adjustRect(r); rectangle(img, r.tl(), r.br(), cv::Scalar(0, 255, 0), 2);}imshow("People detector", img);waitKey(0);return 0;
}
结果:
简化版的HOG计算
需要用到的知识点:
曼哈顿距离:
由于一般的建立数组的方法在【】中填变量是行不通的,这里我们采用动态建立数组的方法。在程序返回前必须将内存释放
#include <opencv2/opencv.hpp>
#include "opencv2/features2d.hpp"
#include <iostream>
#include "windows.h"
#include <stdio.h>
#include <time.h>
#include <math.h>
//#include "My_ImageProssing_base.h"
#define WINDOW_NAME1 "【程序窗口1】"
#define WINDOW_NAME2 "【程序窗口2】"
using namespace cv;
using namespace std;
RNG g_rng(12345);Mat src_image;
Mat img1;
Mat img2;//*--------------------------手动实现HOG描述子-------------------------------------*/
int angle_lianghua(float angle)
{int result = angle/45;return result;
}
int main()
{//改变控制台字体颜色system("color 02");//读取图像src_image = imread("D:\\opencv_picture_test\\HOG行人检测\\hogTemplate.jpg");img1 = imread("D:\\opencv_picture_test\\HOG行人检测\\img1.jpg");img2 = imread("D:\\opencv_picture_test\\HOG行人检测\\img2.jpg");//出错判断if (!(src_image.data || img1.data || img2.data)){cout << "image load failed!" << endl;return -1;}//【1】计算hogTemplate//所有像素计算梯度和角度方向Mat gx, gy;Mat mag, angle; //幅值和角度Sobel(src_image, gx, CV_32F, 1, 0, 1);Sobel(src_image, gy, CV_32F, 0, 1, 1);cartToPolar(gx, gy, mag, angle, false); //false获得的是角度int cellSize = 16; //每个cell的大小int nx = src_image.cols / cellSize; //每行有几个int ny = src_image.rows / cellSize; //每列有几个int cellnums = nx * ny; //有几个cellint bins = cellnums * 8;float* ref_hist = new float[bins];memset(ref_hist, 0, sizeof(float) * bins);int binnum = 0;//计算一张图for (int j = 0;j < ny;j++){for (int i = 0;i < nx;i++){//计算每个cell的直方图for (int y = j * cellSize;y < (j + 1) * cellSize;y++){for (int x = i * cellSize;x < (i + 1) * cellSize;x++){//对角度进行量化int tempangle1 = 0;float tempangle2 = angle.at<float>(y, x); //当前像素的角度值tempangle1 = angle_lianghua(tempangle2); //当前cell的角度分量float magnitude = mag.at<float>(y, x); //当前像素的幅度值ref_hist[tempangle1 + binnum * 8] += magnitude; //在数组中加上当前的}}binnum++; //cell数目+1}}//【2】计算img1//所有像素计算梯度和角度方向Mat gx_img1, gy_img1;Mat mag_img1, angle_img1; //幅值和角度Sobel(img1, gx_img1, CV_32F, 1, 0, 1);Sobel(img1, gy_img1, CV_32F, 0, 1, 1);cartToPolar(gx_img1, gy_img1, mag_img1, angle_img1, false); //false获得的是角度nx = img1.cols / cellSize; //每行有几个ny = img1.rows / cellSize; //每列有几个cellnums = nx * ny; //有几个cellbins = cellnums * 8;float* ref_hist_img1 = new float[bins];memset(ref_hist_img1, 0, sizeof(float) * bins);binnum = 0;//计算一张图for (int j = 0;j < ny;j++){for (int i = 0;i < nx;i++){//计算每个cell的直方图for (int y = j * cellSize;y < (j + 1) * cellSize;y++){for (int x = i * cellSize;x < (i + 1) * cellSize;x++){//对角度进行量化int tempangle1 = 0;float tempangle2 = angle_img1.at<float>(y, x); //当前像素的角度值tempangle1 = angle_lianghua(tempangle2); //当前cell的角度分量float magnitude = mag_img1.at<float>(y, x); //当前像素的幅度值ref_hist_img1[tempangle1 + binnum * 8] += magnitude; //在数组中加上当前的}}binnum++; //cell数目+1}}//【3】计算img2//所有像素计算梯度和角度方向Mat gx_img2, gy_img2;Mat mag_img2, angle_img2; //幅值和角度Sobel(img2, gx_img2, CV_32F, 1, 0, 1);Sobel(img2, gy_img2, CV_32F, 0, 1, 1);cartToPolar(gx_img2, gy_img2, mag_img2, angle_img2, false); //false获得的是角度nx = img2.cols / cellSize; //每行有几个ny = img2.rows / cellSize; //每列有几个cellnums = nx * ny; //有几个cellbins = cellnums * 8;float* ref_hist_img2 = new float[bins];memset(ref_hist_img2, 0, sizeof(float) * bins);binnum = 0;//计算一张图for (int j = 0;j < ny;j++){for (int i = 0;i < nx;i++){//计算每个cell的直方图for (int y = j * cellSize;y < (j + 1) * cellSize;y++){for (int x = i * cellSize;x < (i + 1) * cellSize;x++){//对角度进行量化int tempangle1 = 0;float tempangle2 = angle_img2.at<float>(y, x); //当前像素的角度值tempangle1 = angle_lianghua(tempangle2); //当前像素的角度分量float magnitude = mag_img2.at<float>(y, x); //当前像素的幅度值ref_hist_img2[tempangle1 + binnum * 8] += magnitude; //在数组中加上当前的}}binnum++; //cell数目+1}}//【4】分别计算ref_hist_img1和ref_hist\ref_hist_img2和ref_hist的矩int result1 = 0;int result2 = 0;for (int i = 0;i < bins;i++){//这里简化运算,不计算平方根,而是计算absresult1 += abs(ref_hist[i]- ref_hist_img1[i]);result2 += abs(ref_hist[i] - ref_hist_img2[i]);}cout << result1 << endl;cout << result2 << endl;if (result1 < result2){cout << "img1更与原图相似" << endl;}elsecout << "img2更与原图相似" << endl;waitKey(0);delete[] ref_hist;delete[] ref_hist_img1;delete[] ref_hist_img2;return 0;
}
结果:
Reference:
OpenCV实战4: HOG+SVM实现行人检测
HOG detectMultiScale 参数分析
CommandLineParser类(命令行解析类)
C++语法:构造函数以及析构函数
《数字图像处理PPT.李竹版》