上一篇博客介绍了无人驾驶中的车辆检测算法(YOLO模型),该检测是基于图像进行的2D目标检测。在无人驾驶环境感知传感器中还有另一种重要的传感器:激光雷达。今天就介绍一篇无人驾驶中基于激光雷达目标检测的3D多目标追踪论文,使用的数据集也是KITTI数据集,使用的目标检测算法是PointRCNN。该3D多目标追踪系统不需要使用GPU且取得了最快的处理速度(214.7FPS)。
论文地址:A Baseline for 3D Multi-Object Tracking
Github地址:https://github.com/xinshuoweng/AB3DMOT
大家有时间可以看看这个视频,这是论文作者此前分享的一个3D多目标追踪报告,视频地址:https://www.bilibili.com/video/BV1DD4y1Q7JG/
3D Multi-Object Tracking: A Baseline and New Evaluation Metrics
文章目录
- 1.概述
- 2.方法
- A. 3D 目标检测
- B. 3D 卡尔曼滤波---状态预测
- C. 数据关联
- D. 3D卡尔曼滤波-状态更新
- E. Birth and Death内存
- 3. 测试结果
1.概述
无论是在自动驾驶或辅助机器人等实时应用中,3D多目标追踪(MOT)都是非常重要的技术环节。然而,在近期的一些3D多目标追踪研究当中更多的都是关注于追踪系统的准确率而忽略了其计算成本和复杂性。
在本文中作者提出了一种简单而又准确的实时3D MOT基线系统。本文使用了现成的3D目标检测器从LiDAR点云数据中获取对应目标3D bounding box。 然后,将3D卡尔曼滤波器和匈牙利算法结合起来进行目标状态估计和数据关联。为了评估提出的基线系统,论文在原有的KITTI官方2D MOT评价指标上进行了扩展,提出了三个新的3D MOT评价指标,本文主要贡献有:
- 设计了一种简单而又准确的实时3D MOT系统;
- 对KITTI官方2D MOT评价指标进行扩展,提出了新的3D MOT评价指标;
- 在KITTI评价工具上获得了最优的3D MOT性能,在2D MOT评价工具上获得了最快的处理速度(如下图所示)。
2.方法
3D MOT的目的是对一段连续数据中的3D 检测目标进行关联。在本系统中,需要使用当前帧中的检测结果以及上一帧中的关联轨迹。系统结构如下图所示,由5个部分组成:
(A)3D检测模块用于对LiDAR点云数据进行目标检测;
(B)3D卡尔曼滤波模块用于预测关联轨迹当前帧的状态;
(C)数据关联模块用于对检测模块输出结果和卡尔曼滤波模块输出的轨迹进行匹配关联;
(D)3D卡尔曼滤波根据检测结果更新匹配轨迹的状态;
(E)birth和death内存模块用于生成新的轨迹和删除未匹配的目标轨迹。
除了3D目标检测块以外,3D MOT系统不需要进行训练,可直接使用。
A. 3D 目标检测
在这里,直接使用在KITTI 3D数据及上训练好的3D 检测模型(即:PointRCNN)作为目标检测模块。
下面是一些术语规定,假设当前为ttt时刻,3D检测模块输出是:Dt={Dt1,Dt2,...,Dtnt}D_{t}=\{{D_{t}^{1}, D_{t}^{2}, ...,D_{t}^{n_{t}}}\}Dt={Dt1,Dt2,...,Dtnt},(ntn_tnt为ttt时刻中检测目标数量)。每一个检测目标向量表示形式为:(x,y,z,θ,l,w,h,s)(x, y, z,\theta, l, w, h, s)(x,y,z,θ,l,w,h,s),即目标中心坐标(x,y,z)(x,y,z)(x,y,z),目标尺寸(l,w,h)(l,w,h)(l,w,h),偏航角θ\thetaθ以及置信度sss
B. 3D 卡尔曼滤波—状态预测
为了预测目标从上一帧到当前帧的轨迹状态,这里选择使用的运动模型为恒速运动模型,忽略了相邻帧内的目标位置移动。
具体地,我们使用11维向量来表示目标轨迹的状态T=(x,y,z,θ,l,w,h,s,vx,vy,vz)T=(x, y, z, \theta, l, w, h, s, v_{x}, v_{y,} v_{z})T=(x,y,z,θ,l,w,h,s,vx,vy,vz),附加的3个变量表示目标在3个维度上的运动速度。
在每一帧,所有t-1帧的关联轨迹为Tt−1={Tt−11,Tt−12,⋯,Tt−1mt−1}T_{t-1}=\left\{T_{t-1}^{1}, T_{t-1}^{2}, \cdots, T_{t-1}^{m_{t-1}}\right\}Tt−1={Tt−11,Tt−12,⋯,Tt−1mt−1}(mt−1m_{t-1}mt−1是t-1时刻的轨迹数量)将会被传播到下一帧ttt,用TestT_{est}Test表示。由恒速运动模型可以推出:
xest=x+vxyest=y+vyzest=z+vzx_{est}=x+v_x \quad y_{est}=y+v_y \quad z_{est}=z+v_z xest=x+vxyest=y+vyzest=z+vz
因此,Tt−1T_{t-1}Tt−1中的每个轨迹Tt−1iT^i_{t-1}Tt−1i传播到t帧中的预测状态将是Testi=(xest,yest,zest,θ,l,w,h,s,vx,vy,vz)T_{\mathrm{est}}^{i}=\left(x_{\mathrm{est}}, y_{\mathrm{est}}, z_{\mathrm{est}}, \theta, l, w, h, s, v_{x}, v_{y}, v_{z}\right)Testi=(xest,yest,zest,θ,l,w,h,s,vx,vy,vz),轨迹数据将会用于之后的关联模块。
3D 卡尔曼滤波的代码如下:
class KalmanBoxTracker(object):"""This class represents the internel state of individual tracked objects observed as bbox."""count = 0def __init__(self, bbox3D, info):"""Initialises a tracker using initial bounding box."""# define constant velocity modelself.kf = KalmanFilter(dim_x=10, dim_z=7) self.kf.F = np.array([[1,0,0,0,0,0,0,1,0,0], # state transition matrix[0,1,0,0,0,0,0,0,1,0],[0,0,1,0,0,0,0,0,0,1],[0,0,0,1,0,0,0,0,0,0], [0,0,0,0,1,0,0,0,0,0],[0,0,0,0,0,1,0,0,0,0],[0,0,0,0,0,0,1,0,0,0],[0,0,0,0,0,0,0,1,0,0],[0,0,0,0,0,0,0,0,1,0],[0,0,0,0,0,0,0,0,0,1]]) self.kf.H = np.array([[1,0,0,0,0,0,0,0,0,0], # measurement function,[0,1,0,0,0,0,0,0,0,0],[0,0,1,0,0,0,0,0,0,0],[0,0,0,1,0,0,0,0,0,0],[0,0,0,0,1,0,0,0,0,0],[0,0,0,0,0,1,0,0,0,0],[0,0,0,0,0,0,1,0,0,0]])# # with angular velocity# self.kf = KalmanFilter(dim_x=11, dim_z=7) # self.kf.F = np.array([[1,0,0,0,0,0,0,1,0,0,0], # state transition matrix# [0,1,0,0,0,0,0,0,1,0,0],# [0,0,1,0,0,0,0,0,0,1,0],# [0,0,0,1,0,0,0,0,0,0,1], # [0,0,0,0,1,0,0,0,0,0,0],# [0,0,0,0,0,1,0,0,0,0,0],# [0,0,0,0,0,0,1,0,0,0,0],# [0,0,0,0,0,0,0,1,0,0,0],# [0,0,0,0,0,0,0,0,1,0,0],# [0,0,0,0,0,0,0,0,0,1,0],# [0,0,0,0,0,0,0,0,0,0,1]]) # self.kf.H = np.array([[1,0,0,0,0,0,0,0,0,0,0], # measurement function,# [0,1,0,0,0,0,0,0,0,0,0],# [0,0,1,0,0,0,0,0,0,0,0],# [0,0,0,1,0,0,0,0,0,0,0],# [0,0,0,0,1,0,0,0,0,0,0],# [0,0,0,0,0,1,0,0,0,0,0],# [0,0,0,0,0,0,1,0,0,0,0]])# self.kf.R[0:,0:] *= 10. # measurement uncertainty# state uncertainty, give high uncertainty to the unobservable initial velocities, covariance matrixself.kf.P[7:, 7:] *= 1000. self.kf.P *= 10.# self.kf.Q[-1,-1] *= 0.01 # process uncertaintyself.kf.Q[7:, 7:] *= 0.01self.kf.x[:7] = bbox3D.reshape((7, 1))self.time_since_update = 0self.id = KalmanBoxTracker.countKalmanBoxTracker.count += 1self.history = []self.hits = 1 # number of total hits including the first detectionself.hit_streak = 1 # number of continuing hit considering the first detectionself.first_continuing_hit = 1self.still_first = Trueself.age = 0self.info = info # other info associateddef update(self, bbox3D, info): """ Updates the state vector with observed bbox."""self.time_since_update = 0self.history = []self.hits += 1self.hit_streak += 1 # number of continuing hitif self.still_first:self.first_continuing_hit += 1 # number of continuing hit in the fist time######################### orientation correctionif self.kf.x[3] >= np.pi: self.kf.x[3] -= np.pi * 2 # make the theta still in the rangeif self.kf.x[3] < -np.pi: self.kf.x[3] += np.pi * 2new_theta = bbox3D[3]if new_theta >= np.pi: new_theta -= np.pi * 2 # make the theta still in the rangeif new_theta < -np.pi: new_theta += np.pi * 2bbox3D[3] = new_thetapredicted_theta = self.kf.x[3]if abs(new_theta - predicted_theta) > np.pi / 2.0 and abs(new_theta - predicted_theta) < np.pi * 3 / 2.0: # if the angle of two theta is not acute angleself.kf.x[3] += np.pi if self.kf.x[3] > np.pi: self.kf.x[3] -= np.pi * 2 # make the theta still in the rangeif self.kf.x[3] < -np.pi: self.kf.x[3] += np.pi * 2# now the angle is acute: < 90 or > 270, convert the case of > 270 to < 90if abs(new_theta - self.kf.x[3]) >= np.pi * 3 / 2.0:if new_theta > 0: self.kf.x[3] += np.pi * 2else: self.kf.x[3] -= np.pi * 2######################### # flipself.kf.update(bbox3D)if self.kf.x[3] >= np.pi: self.kf.x[3] -= np.pi * 2 # make the theta still in the rageif self.kf.x[3] < -np.pi: self.kf.x[3] += np.pi * 2self.info = infodef predict(self): """Advances the state vector and returns the predicted bounding box estimate."""self.kf.predict() if self.kf.x[3] >= np.pi: self.kf.x[3] -= np.pi * 2if self.kf.x[3] < -np.pi: self.kf.x[3] += np.pi * 2self.age += 1if (self.time_since_update > 0):self.hit_streak = 0self.still_first = Falseself.time_since_update += 1self.history.append(self.kf.x)return self.history[-1]def get_state(self):"""Returns the current bounding box estimate."""return self.kf.x[:7].reshape((7, ))
C. 数据关联
为了将检测结果DtD_tDt与预测轨迹TestT_{est}Test进行匹配,我们将使用匈牙利算法。
维度为(mt−1,nt)(m_{t-1}, n_t)(mt−1,nt)的矩阵将用来计算每对DtD_tDt和TestT_{est}Test的3DIoU3D IoU3DIoU或目标中心距离。然后,使用匈牙利算法来处理二分图匹配问题。除此之外,当3DIoU3D IoU3DIoU小于IoUminIoU_{min}IoUmin时或目标中心距离大于distmaxdist_{max}distmax时,将拒绝进行匹配。数据关联模块的输出将分为4类:
Tmatch={Tmatch1,Tmatch2,..,Tmatchwt}Dmatch={Dmatch1,Dmatch2,..,Dmatchwt}Tunmatch={Tunmatch1,Tunmatch2,..,Tunmatchmt−1−wt}Dunmatch={Dunmatch1,Dunmatch2,..,Dunmatchnt−wt}T_{match}=\{{T^1_{match},T^2_{match},..,T^{wt}_{match}}\}\\ D_{match}=\{{D^1_{match},D^2_{match},..,D^{wt}_{match}}\}\\ T_{unmatch}=\{{T^1_{unmatch},T^2_{unmatch},..,T^{m_{t-1}-wt}_{unmatch}}\}\\ D_{unmatch}=\{{D^1_{unmatch},D^2_{unmatch},..,D^{nt-wt}_{unmatch}}\}Tmatch={Tmatch1,Tmatch2,..,Tmatchwt}Dmatch={Dmatch1,Dmatch2,..,Dmatchwt}Tunmatch={Tunmatch1,Tunmatch2,..,Tunmatchmt−1−wt}Dunmatch={Dunmatch1,Dunmatch2,..,Dunmatchnt−wt}
这里wtw_twt是匹配目标数量。
def associate_detections_to_trackers(detections, trackers, iou_threshold=0.01): """Assigns detections to tracked object (both represented as bounding boxes)detections: N x 8 x 3trackers: M x 8 x 3Returns 3 lists of matches, unmatched_detections and unmatched_trackers"""if (len(trackers)==0): return np.empty((0, 2), dtype=int), np.arange(len(detections)), np.empty((0, 8, 3), dtype=int) iou_matrix = np.zeros((len(detections), len(trackers)), dtype=np.float32)for d, det in enumerate(detections):for t, trk in enumerate(trackers):iou_matrix[d, t] = iou3d(det, trk)[0] # det: 8 x 3, trk: 8 x 3# matched_indices = linear_assignment(-iou_matrix) # hougarian algorithm, compatible to linear_assignment in sklearn.utilsrow_ind, col_ind = linear_sum_assignment(-iou_matrix) # hougarian algorithmmatched_indices = np.stack((row_ind, col_ind), axis=1)unmatched_detections = []for d, det in enumerate(detections):if (d not in matched_indices[:, 0]): unmatched_detections.append(d)unmatched_trackers = []for t, trk in enumerate(trackers):if (t not in matched_indices[:, 1]): unmatched_trackers.append(t)#filter out matched with low IOUmatches = []for m in matched_indices:if (iou_matrix[m[0], m[1]] < iou_threshold):unmatched_detections.append(m[0])unmatched_trackers.append(m[1])else: matches.append(m.reshape(1, 2))if (len(matches) == 0): matches = np.empty((0, 2),dtype=int)else: matches = np.concatenate(matches, axis=0)return matches, np.array(unmatched_detections), np.array(unmatched_trackers)
D. 3D卡尔曼滤波-状态更新
由于TmatchT_{match}Tmatch的不确定性,我们根据检测结果DmatchD_{match}Dmatch对TmatchT_{match}Tmatch整个状态空间的每个轨迹进行更新,最后获得ttt帧时刻的关联轨迹Tt={Tt1,Tt2,⋯,Ttwt}T_{t}=\left\{T_{t}^{1}, T_{t}^{2}, \cdots, T_{t}^{w_{t}}\right\}Tt={Tt1,Tt2,⋯,Ttwt}。
根据贝叶斯规则,更新后的每一条轨迹Ttk=(x′,y′,z′,θ′,l′,w′,h′,s′,vx′,vy,′vz′)T_{t}^k=(x', y', z', \theta', l', w', h',s', v_{x}', v_{y,}' v_{z}')Ttk=(x′,y′,z′,θ′,l′,w′,h′,s′,vx′,vy,′vz′)是状态空间TmatchkT_{match}^kTmatchk和DmatchkD_{match}^kDmatchk的加权和,这的权重是由轨迹和检测的不确定性所决定的。
除此之外,我们发现原始的加权和对确定目标方向效果一般。对于匹配目标TtkT_{t}^kTtk,其检测结果上的方向和运动轨迹方向可能完全相反,相差π\piπ个相位。但事实上,目标都是平缓移动,不可能在一帧中改变移动方向。从结果上来看,TtkT_t^kTtk目标的加权轨迹方向应该位于DmatchkD_{match}^kDmatchk和TmatchkT_{match}^kTmatchk之间。这往往将会导致比较低的3DIoU3D IoU3DIoU值。为此,我们提出一个方向修正技术,当两者相位相差超过π/2\pi/2π/2时,我们在TmatchkT_{match}^kTmatchk上添加π\piπ个相位,以便TmatchkT_{match}^kTmatchk和DmatchkD_{match}^kDmatchk方向能大致一致。
E. Birth and Death内存
因为现存的目标可能会消失以及新的目标可能会进入画面,因此需要一个新的模块来管理出现和消失的轨迹。一方面,我们认为所有不匹配的检测可能会进入帧中,为了避免假阳性追踪,当在BirminBir_{min}Birmin中持续检测到DunmatchpD_{unmatch}^pDunmatchp时,将会生成 一条新的轨迹TnewpT_{new}^pTnewp。一旦新的轨迹产生,我们将初始化TnewpT_{new}^pTnewp作为最新的测量DunmatchpD_{unmatch}^pDunmatchp,三个方向的速度初始化为0。
另一方面,我们认为所有不匹配的轨迹可能会离开,为了避免删除真阳性的轨迹,我们在AgemaxAge_{max}Agemax帧中持续追踪不匹配轨迹TunmatchqT_{unmatch}^qTunmatchq直到从关联轨迹删除。
3. 测试结果
在KITTI上的追踪结果如下,这里只给出了可视化结果,详细结果可查阅论文。