TY - JOUR
T1 - On-Board Object Detection
T2 - Multicue, Multimodal, and Multiview Random Forest of Local Experts
AU - González, Alejandro
AU - Vázquez, David
AU - López, Antonio M.
AU - Amores, Jaume
N1 - Publisher Copyright:
© 2013 IEEE.
PY - 2017/11
Y1 - 2017/11
N2 - Despite significant recent advances, object detection remains an extremely challenging problem in real scenarios. To develop a detector that operates successfully under these conditions, it is critical to leverage multiple cues, multiple imaging modalities, and a strong multiview (MV) classifier that accounts for different object views and poses. In this paper, we provide an extensive evaluation that gives insight into how each of these aspects (multicue, multimodality, and strong MV classifier) affects accuracy, both individually and when integrated together. In the multimodality component, we explore the fusion of RGB and depth maps obtained by high-definition light detection and ranging, a modality that is receiving increasing attention. As our analysis reveals, although each of these aspects significantly improves accuracy, the fusion of visible-spectrum and depth information boosts accuracy by a much larger margin. The resulting detector not only ranks among the top performers on the challenging KITTI benchmark, but is also built from very simple blocks that are easy to implement and computationally efficient.
KW - Multicue
KW - multimodal
KW - multiview (MV)
KW - object detection
UR - http://www.scopus.com/inward/record.url?scp=84981314160&partnerID=8YFLogxK
U2 - 10.1109/TCYB.2016.2593940
DO - 10.1109/TCYB.2016.2593940
M3 - Article
C2 - 28708566
AN - SCOPUS:84981314160
SN - 2168-2267
VL - 47
SP - 3980
EP - 3990
JO - IEEE Transactions on Cybernetics
JF - IEEE Transactions on Cybernetics
IS - 11
M1 - 7533479
ER -