Camera-Lidar Projection: Navigating between 2D and 3D

Daryl Tan
Published in The Startup
8 min read · Jan 30, 2020


Figure 1. Lidar points on image (source)

Lidars and cameras are two essential sensors for perception and scene understanding. Together they build a representation of the environment and provide a means of detecting and localising other objects, giving robots the rich semantic information required for safe navigation. Many researchers have started exploring multi-modal approaches for precise 3D object detection; an interesting example is PointPainting[1], an algorithm developed by Aptiv.

So why are these two sensors complementary?

The camera outperforms lidar when it comes to capturing a denser and richer representation of the scene. From Fig 2, looking at the sparse point cloud alone, it is relatively difficult to correctly identify the black box as a pedestrian. However, paying attention to the RGB image, even with the person facing away, we can easily tell that the object looks like a pedestrian. Beyond that, other useful visual features that can be extracted from images include traffic lights and road signs, which lidar struggles with.

Figure 2. RGB & point cloud representation with pedestrian detection

In contrast, lidar excels at extracting distance information. It is extremely difficult to measure distance using a camera in the standard perspective view…
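To make the 2D/3D relationship in the title concrete, here is a minimal sketch of projecting lidar points onto an image plane. It assumes a hypothetical 4×4 lidar-to-camera extrinsic matrix `T_cam_lidar` and a 3×3 intrinsic matrix `K` (the exact calibration format depends on your dataset, e.g. KITTI stores these differently):

```python
import numpy as np

def project_lidar_to_image(points, T_cam_lidar, K):
    """Project N x 3 lidar points into pixel coordinates.

    points      : (N, 3) lidar points in the lidar frame
    T_cam_lidar : (4, 4) extrinsic transform, lidar frame -> camera frame
    K           : (3, 3) camera intrinsic matrix
    Returns (uv, depth): pixel coordinates and camera-frame depths
    for the points in front of the camera.
    """
    # Lift to homogeneous coordinates and move into the camera frame
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])
    cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    # Keep only points in front of the camera (positive depth)
    front = cam[:, 2] > 0
    cam = cam[front]

    # Perspective projection: divide by depth after applying intrinsics
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    return uv, cam[:, 2]
```

With an identity extrinsic and focal length 100, a point 10 m straight ahead lands exactly on the principal point, which is a quick sanity check for the calibration convention.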
