Sensor Fusion of LiDAR and Camera — An Overview

Navin Rahim
4 min read · Jul 29, 2018

--

LiDAR and camera are sensors widely found on almost every autonomous robot, including self-driving cars. Recently, I had to dig deep into how LiDAR and camera can be fused, and I read through a few research papers to understand what fusion really means and how it is done. Below is a summary of what I learned.

LiDAR

LiDAR, or light detection and ranging, is a sensor that emits laser pulses and produces points describing the environment around it, based on the light that reflects off the surroundings and returns to the sensor.

Velodyne VLP-16 LiDAR (Source: http://velodynelidar.com/vlp-16.html)

Recent LiDARs provide a 360° horizontal field of view (FOV) and a limited vertical FOV. The main advantage of a LiDAR is that it gives highly accurate depth values. However, its output is sparse, i.e., it does not provide a very high resolution output.

Point Cloud output from a LiDAR (Source: https://www.youtube.com/watch?v=Se5U2ne5eLk)

Camera

A camera is a sensor that gives us images. It is the same sensor we carry in our smartphones these days, and it is very good at capturing rich information about the world, providing high resolution outputs. The disadvantage of a camera is that it has a limited FOV and gives us no depth information, although stereo cameras are available that can provide depth values.

A Point Grey camera (Source: https://www.trossenrobotics.com/fireflyMV)
Image from camera (Source: https://kiriproject.wordpress.com/2015/08/03/vehicle-dataset/)

Sensor Fusion

Combining the outputs of the LiDAR and camera helps in overcoming their individual limitations. The fusion provides more confident results for various applications, be it estimating depth from an image or detecting objects.

Fusion of camera and LiDAR can be done in two ways — fusion of data or fusion of the results.

Data Fusion

Fusion of data is the overlaying of the LiDAR point cloud onto the camera image so that we get depth information for the pixels in the camera image.

LiDAR points over the camera image (Source: https://www.youtube.com/watch?v=8Kc1UHrI_9o)

This fusion requires finding the intersection of the FOVs of the LiDAR and camera and then assigning each remaining point in the point cloud to the corresponding pixel in the image. This output can then be upsampled to obtain depth values for all the pixels in the image.
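
To make this concrete, here is a minimal sketch of projecting LiDAR points into a camera image. The intrinsic matrix K, the LiDAR-to-camera rotation R and translation t, and the image size are made-up placeholder values for illustration; in practice they come from calibrating the two sensors.

```python
import numpy as np

# Hypothetical calibration values for illustration only; real systems obtain
# K (camera intrinsics) and the LiDAR-to-camera extrinsics from calibration.
K = np.array([[720.0,   0.0, 640.0],
              [  0.0, 720.0, 360.0],
              [  0.0,   0.0,   1.0]])   # 3x3 intrinsic matrix
R = np.eye(3)                            # rotation, LiDAR frame -> camera frame
t = np.array([0.0, -0.08, -0.27])        # translation, LiDAR frame -> camera frame (metres)

def project_lidar_to_image(points_lidar, image_w=1280, image_h=720):
    """Project Nx3 LiDAR points into pixel coordinates, keeping only the
    points that lie in front of the camera and inside its FOV."""
    pts_cam = points_lidar @ R.T + t             # transform into the camera frame
    in_front = pts_cam[:, 2] > 0.1               # drop points behind the camera
    pts_cam = pts_cam[in_front]

    pix = (K @ pts_cam.T).T                      # perspective projection
    pix = pix[:, :2] / pix[:, 2:3]               # normalise by depth

    in_fov = (pix[:, 0] >= 0) & (pix[:, 0] < image_w) & \
             (pix[:, 1] >= 0) & (pix[:, 1] < image_h)
    return pix[in_fov], pts_cam[in_fov, 2]       # pixel coordinates and their depths
```

The returned pixel coordinates and depths are exactly the "LiDAR points over the camera image" shown above: a sparse set of image locations that now carry accurate range values.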

Output Fusion

Fusion of results is where, say, we do object detection on the camera image and on the LiDAR point cloud separately, and then fuse the results to increase our confidence.

Result fusion block diagram (Source: Link)

We process the camera image separately, get an output, and verify it against the processed LiDAR output, or vice versa. This is a very useful fusion method since it can help in increasing the reliability of a system.
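
As an illustration (not a specific method from the papers), one very simple form of result fusion is to boost the confidence of camera detections that overlap a LiDAR detection projected into the same image. The box format, detection tuples, and the +0.2 confidence bump below are assumptions made for the sketch.

```python
def iou(box_a, box_b):
    """Intersection-over-union of two axis-aligned boxes (x1, y1, x2, y2)."""
    x1, y1 = max(box_a[0], box_b[0]), max(box_a[1], box_b[1])
    x2, y2 = min(box_a[2], box_b[2]), min(box_a[3], box_b[3])
    inter = max(0.0, x2 - x1) * max(0.0, y2 - y1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)

def fuse_detections(camera_dets, lidar_dets, iou_thresh=0.5):
    """Boost confidence for detections that both sensors agree on.
    Each detection is (box, score); LiDAR boxes are assumed to have been
    projected into image coordinates already."""
    fused = []
    for cam_box, cam_score in camera_dets:
        best = max((iou(cam_box, l_box) for l_box, _ in lidar_dets), default=0.0)
        # Simple confidence bump when both sensors detect the same object.
        score = min(1.0, cam_score + 0.2) if best >= iou_thresh else cam_score
        fused.append((cam_box, score))
    return fused
```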

The major focus area is finding a suitable way to transform from one space to the other, ideally in real time. This allows us to find the points in the LiDAR space that correspond to certain pixels, or vice versa.

There are a lot of sub-problems in the fusion of LiDAR and camera, including the manual and automatic calibration of these sensors, upsampling of the depth images, etc., and there are plenty of research papers on each of these topics.
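
As one illustration of the upsampling sub-problem, the sparse projected depths from the earlier sketch could be densified with a simple nearest-neighbour interpolation. This is only a baseline under the same assumed image size; published methods are considerably more sophisticated (e.g., edge-aware or learned upsampling).

```python
import numpy as np
from scipy.interpolate import griddata

def upsample_depth(pix, depths, image_w=1280, image_h=720):
    """Fill a dense depth map from sparse projected LiDAR depths using
    nearest-neighbour interpolation (one of many possible upsampling schemes)."""
    grid_u, grid_v = np.meshgrid(np.arange(image_w), np.arange(image_h))
    dense = griddata(pix, depths, (grid_u, grid_v), method='nearest')
    return dense.astype(np.float32)              # (image_h, image_w) depth map
```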

The LiDAR and camera market is expected to reach $52.5B in 2032 [Source]. The fusion of these sensors plays a significant role in perceiving the environment in many applications, including the autonomous domain, and reliable fusion is also critical to the safety of these technologies. Many challenges lie ahead, and this remains one of the exciting problems in the industry.
