Camera-Lidar Projection: Navigating between 2D and 3D

Daryl Tan
Published in The Startup
8 min read · Jan 30, 2020


Figure 1. Lidar points on image (source)

Lidars and cameras are two essential sensors for perception and scene understanding. Together they build a representation of the environment and provide a means of detecting and localising other objects, giving robots the rich semantic information required for safe navigation. Many researchers have started exploring multi-modal approaches for precise 3D object detection; an interesting example is PointPainting[1], an algorithm developed by Aptiv.

So why are these two sensors complementary?

The camera outperforms lidar when it comes to capturing a denser and richer representation of the scene. From Fig 2, looking at the sparse point cloud alone, it is relatively difficult to correctly identify the black box as a pedestrian. However, paying attention to the RGB image, even with the person facing away, we can easily tell that the object looks like a pedestrian. Beyond that, other useful visual features that can be extracted from images include traffic lights and road signs, which lidar struggles with.

Figure 2. RGB & point cloud representation with pedestrian detection

In contrast, lidar excels at extracting distance information. It is extremely difficult to measure distance using a camera in the standard perspective view…
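To make the 2D/3D relationship in the title concrete, here is a minimal sketch of projecting lidar points onto an image plane. It assumes a hypothetical 4×4 lidar-to-camera extrinsic matrix `T_cam_lidar` and a 3×3 intrinsic matrix `K` (the exact calibration format depends on your dataset, e.g. KITTI stores these differently):

```python
import numpy as np

def project_lidar_to_image(points, T_cam_lidar, K):
    """Project N x 3 lidar points into pixel coordinates.

    points      : (N, 3) lidar points in the lidar frame
    T_cam_lidar : (4, 4) extrinsic transform, lidar frame -> camera frame
    K           : (3, 3) camera intrinsic matrix
    Returns (uv, depth): pixel coordinates and camera-frame depths
    for the points in front of the camera.
    """
    # Lift to homogeneous coordinates and move into the camera frame
    pts_h = np.hstack([points, np.ones((points.shape[0], 1))])
    cam = (T_cam_lidar @ pts_h.T).T[:, :3]

    # Keep only points in front of the camera (positive depth)
    front = cam[:, 2] > 0
    cam = cam[front]

    # Perspective projection: divide by depth after applying intrinsics
    uv = (K @ cam.T).T
    uv = uv[:, :2] / uv[:, 2:3]
    return uv, cam[:, 2]
```

With an identity extrinsic and focal length 100, a point 10 m straight ahead lands exactly on the principal point, which is a quick sanity check for the calibration convention.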
