Object Detection Algorithms for LiDAR Point Clouds

Junho Koh
Nov 1, 2018

There has been a lot of recent research on object detection from LiDAR point clouds. LiDAR point clouds have the advantage of providing accurate depth information. In this post, I’ll introduce several deep learning approaches to object detection from LiDAR point clouds.

Multi-view 3D Object Detection Network for Autonomous Driving

Computer Vision and Pattern Recognition (CVPR), 2017

This paper proposes Multi-View 3D networks (MV3D), a sensory-fusion framework that takes both LiDAR point clouds and RGB images as input and predicts oriented 3D bounding boxes. The network is composed of two subnetworks: one for 3D object proposal generation and another for multi-view feature fusion. The proposal network generates 3D candidate boxes efficiently from the bird’s eye view representation of the 3D point cloud. The deep fusion scheme combines region-wise features from multiple views and enables interactions between intermediate layers of different paths.

Multi-View 3D object detection network (MV3D)
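
To make the deep fusion idea concrete, here is a minimal NumPy sketch. It assumes the element-wise mean join described in the MV3D paper. The function name `deep_fusion`, the plain matrices with ReLU standing in for each view's layers, and the feature sizes are illustrative choices of mine, not the authors' implementation.

```python
import numpy as np

def deep_fusion(view_features, layer_weights):
    """Fuse region-wise features from several views, layer by layer.

    view_features: list of (d,) arrays, one per view (e.g. BEV, front view, RGB).
    layer_weights: list over layers; each entry is a list of (d, d) matrices,
                   one per view branch (hypothetical stand-ins for real layers).
    """
    # Join the per-view region features once before the first fusion layer.
    fused = np.mean(np.stack(view_features), axis=0)
    for per_view_weights in layer_weights:
        # Every branch transforms the shared fused feature ...
        branch_outputs = [np.maximum(w @ fused, 0.0) for w in per_view_weights]
        # ... and the branches are joined again, so the paths keep interacting.
        fused = np.mean(np.stack(branch_outputs), axis=0)
    return fused

rng = np.random.default_rng(0)
views = [rng.random(8) for _ in range(3)]
weights = [[rng.standard_normal((8, 8)) for _ in range(3)] for _ in range(2)]
print(deep_fusion(views, weights).shape)  # (8,)
```

Joining after every layer, rather than only once at the start or the end, is what lets intermediate features from the different views influence each other.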

Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting with a Single Convolutional Net

Computer Vision and Pattern Recognition (CVPR), 2018

This paper proposes a novel deep neural network that jointly reasons about 3D detection, tracking and motion forecasting given data captured by a 3D sensor. The network uses 3D convolutions to capture temporal information.

Modeling temporal information
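
As a rough illustration of how a convolution can aggregate temporal information, the sketch below stacks several bird's eye view frames along a time axis and slides a kernel over that axis. It is a simplified stand-in for a 3D convolution with a 1x1 spatial footprint; the function name `temporal_conv`, the tensor layout `(T, C, H, W)` and the averaging kernel are assumptions for this example, not details taken from the paper.

```python
import numpy as np

def temporal_conv(frames, kernel):
    """Slide a 1D kernel over the time axis of a stack of BEV frames.

    frames: array of shape (T, C, H, W), one BEV feature map per time step.
    kernel: array of shape (k,), weights applied to k consecutive frames.
    Returns an array of shape (T - k + 1, C, H, W).
    """
    num_frames = frames.shape[0]
    k = kernel.shape[0]
    outputs = [
        # Weighted sum of k consecutive frames, computed independently per cell.
        np.tensordot(kernel, frames[start:start + k], axes=(0, 0))
        for start in range(num_frames - k + 1)
    ]
    return np.stack(outputs)

# Five frames whose values equal their time index, averaged over windows of 3.
frames = np.stack([np.full((1, 2, 2), float(t)) for t in range(5)])
merged = temporal_conv(frames, np.ones(3) / 3.0)
print(merged.shape)  # (3, 1, 2, 2)
```

A real 3D convolution would also span neighbouring spatial cells and learn its weights; the point here is only that the time axis is treated as one more dimension to convolve over.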

PIXOR: Real-time 3D Object Detection from Point Clouds

Computer Vision and Pattern Recognition (CVPR), 2018

This paper addresses the problem of real-time 3D object detection from point clouds in the context of autonomous driving. It utilizes the 3D data more efficiently by representing the scene from the Bird’s Eye View (BEV), and proposes PIXOR, a proposal-free, single-stage detector that outputs oriented 3D object estimates decoded from pixel-wise neural network predictions.

Overview of the proposed 3D object detector from BEV of LIDAR point cloud.
The network architecture of PIXOR
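
The BEV representation can be sketched as a simple occupancy grid: each LiDAR point is dropped into a cell, and the height slices become channels of a 2D image. The function name `points_to_bev`, the ranges and the cell sizes below are hypothetical values chosen for illustration; the actual PIXOR input encoding may differ in its details.

```python
import numpy as np

def points_to_bev(points, x_range, y_range, z_range, resolution, z_resolution):
    """Convert an (N, 3) point cloud into a binary BEV occupancy grid.

    The result has shape (nx, ny, nz): a 2D image over x and y whose channels
    are height slices. Points outside the given ranges are discarded.
    """
    mins = np.array([x_range[0], y_range[0], z_range[0]], dtype=np.float64)
    maxs = np.array([x_range[1], y_range[1], z_range[1]], dtype=np.float64)
    cell = np.array([resolution, resolution, z_resolution], dtype=np.float64)
    shape = np.round((maxs - mins) / cell).astype(np.int64)

    xyz = points[:, :3]
    inside = np.all((xyz >= mins) & (xyz < maxs), axis=1)
    idx = np.floor((xyz[inside] - mins) / cell).astype(np.int64)
    idx = np.minimum(idx, shape - 1)  # guard against floating point round-up

    grid = np.zeros(tuple(shape), dtype=np.float32)
    grid[idx[:, 0], idx[:, 1], idx[:, 2]] = 1.0  # mark occupied cells
    return grid

cloud = np.array([
    [0.5, 0.5, 0.5],
    [0.5, 0.5, 0.6],    # falls into the same cell as the first point
    [100.0, 0.0, 0.0],  # outside the x range, dropped
    [1.5, -0.5, 1.5],
])
bev = points_to_bev(cloud, (0.0, 2.0), (-1.0, 1.0), (0.0, 2.0), 1.0, 1.0)
print(bev.shape, int(bev.sum()))  # (2, 2, 2) 2
```

Because the output is an ordinary multi-channel image, a standard 2D convolutional network can process it, which is part of why BEV detectors can run in real time.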

HDNET: Exploiting HD Maps for 3D Object Detection

2nd Conference on Robot Learning (CoRL), 2018

This paper shows that High-Definition (HD) maps provide strong priors that can boost the performance and robustness of modern 3D object detectors.

BEV LiDAR representation that exploits geometric and semantic HD map information.
Network structures for object detection (left) and online map estimation (right)
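
One way to picture the geometric and semantic priors is shown below: the geometric prior is applied by measuring each point's height relative to the mapped ground, and the semantic prior is applied by appending a road mask as an extra BEV channel. This is a sketch under my own assumptions; the helper names `flatten_ground` and `add_road_channel`, the grid layout and the toy values are hypothetical, and the paper's pipeline is more involved.

```python
import numpy as np

def flatten_ground(points, ground_height, x_min, y_min, resolution):
    """Geometric prior: express each point's z relative to the mapped ground.

    points: (N, 3) array of x, y, z.
    ground_height: (nx, ny) array with the ground elevation of each map cell.
    """
    nx, ny = ground_height.shape
    xi = np.floor((points[:, 0] - x_min) / resolution).astype(np.int64)
    yi = np.floor((points[:, 1] - y_min) / resolution).astype(np.int64)
    # Points outside the map reuse the nearest border cell (a simplification).
    xi = np.clip(xi, 0, nx - 1)
    yi = np.clip(yi, 0, ny - 1)
    flattened = points.astype(np.float64)  # astype returns a copy
    flattened[:, 2] -= ground_height[xi, yi]
    return flattened

def add_road_channel(bev, road_mask):
    """Semantic prior: append a binary road mask as one more BEV channel."""
    return np.concatenate([bev, road_mask[..., None].astype(bev.dtype)], axis=-1)

ground = np.array([[1.0, 2.0], [3.0, 4.0]])
pts = np.array([[0.5, 0.5, 1.5], [1.5, 1.5, 4.5]])
flat = flatten_ground(pts, ground, x_min=0.0, y_min=0.0, resolution=1.0)

bev = np.zeros((2, 2, 3), dtype=np.float32)
road = np.array([[1, 0], [1, 1]])
bev_with_map = add_road_channel(bev, road)
print(flat[:, 2], bev_with_map.shape)
```

After flattening, both points sit 0.5 above the ground even though their raw heights differ by 3.0, which is the kind of variation on sloped roads that the map prior removes.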

Deep Continuous Fusion for Multi-Sensor 3D Object Detection

European Conference on Computer Vision (ECCV), 2018

This paper proposes a novel 3D object detector that exploits both LiDAR and cameras to perform very accurate localization. Towards this goal, the authors design an end-to-end learnable architecture that uses continuous convolutions to fuse image and LiDAR feature maps at different levels of resolution. The proposed continuous fusion layers encode both discrete-state image features and continuous geometric information. This enables the design of a novel, reliable and efficient end-to-end learnable 3D object detector based on multiple sensors.

Architecture of proposed model
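
The sketch below illustrates the kind of input a continuous fusion layer works with: for a target BEV location, it finds the nearest LiDAR points, projects them into the image, picks up the image feature at each projected pixel, and attaches the geometric offset to the target. The function names, the nearest-pixel lookup, the 2D offset and the toy projection matrix are all simplifying assumptions of mine; in the paper a learned network then processes such inputs, which is omitted here.

```python
import numpy as np

def project_to_image(points, projection):
    """Project (N, 3) points to pixel coordinates with a 3x4 projection matrix.

    Assumes all points lie in front of the camera (positive depth).
    """
    homogeneous = np.hstack([points, np.ones((len(points), 1))])
    uvw = homogeneous @ projection.T
    return uvw[:, :2] / uvw[:, 2:3]

def continuous_fusion_input(target_xy, points, image_features, projection, k=1):
    """Gather image features and offsets for the k LiDAR points nearest a BEV cell.

    Returns a (k, C + 2) array: image feature followed by the x, y offset.
    """
    distances = np.linalg.norm(points[:, :2] - target_xy, axis=1)
    nearest = np.argsort(distances)[:k]
    uv = project_to_image(points[nearest], projection)
    height, width, _ = image_features.shape
    cols = np.clip(np.round(uv[:, 0]).astype(np.int64), 0, width - 1)
    rows = np.clip(np.round(uv[:, 1]).astype(np.int64), 0, height - 1)
    features = image_features[rows, cols]          # discrete image features
    offsets = points[nearest, :2] - target_xy      # continuous geometric information
    return np.concatenate([features, offsets], axis=1)

# Toy setup: each image "feature" is just its own (row, col) index.
image = np.stack(
    np.meshgrid(np.arange(4), np.arange(4), indexing="ij"), axis=-1
).astype(np.float64)
projection = np.array([[1.0, 0.0, 0.0, 0.0],
                       [0.0, 1.0, 0.0, 0.0],
                       [0.0, 0.0, 1.0, 0.0]])
lidar = np.array([[2.0, 4.0, 2.0], [9.0, 9.0, 3.0]])
fused_input = continuous_fusion_input(np.array([2.0, 3.0]), lidar, image, projection)
print(fused_input)
```

In this toy setup the nearest point projects to pixel (column 1, row 2), so the output row holds that pixel's feature followed by the offset from the target cell.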

Conclusion

There is a great deal of research on object detection with LiDAR point cloud datasets. This post gives only a brief summary of each model. In future posts, I’ll go into the details of each model.

References

Chen, Xiaozhi, et al. “Multi-View 3D Object Detection Network for Autonomous Driving.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017.

Luo, Wenjie, Bin Yang, and Raquel Urtasun. “Fast and Furious: Real Time End-to-End 3D Detection, Tracking and Motion Forecasting With a Single Convolutional Net.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.

Yang, Bin, Wenjie Luo, and Raquel Urtasun. “PIXOR: Real-Time 3D Object Detection From Point Clouds.” Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018.

Yang, Bin, Ming Liang, and Raquel Urtasun. “HDNET: Exploiting HD Maps for 3D Object Detection.” 2nd Conference on Robot Learning (CoRL). 2018.

Liang, Ming, et al. “Deep Continuous Fusion for Multi-Sensor 3D Object Detection.” Proceedings of the European Conference on Computer Vision (ECCV). 2018.
