Colorize Lidar point clouds with camera images

Shikhar Gupta · Mindkosh AI · Aug 10, 2024
A colorized point cloud from the Nuscenes dataset

Lidars are powerful sensors that can create a high-resolution, three-dimensional view of the environment. Compared with images, lidar point clouds make it easier to separate objects, because points at different depths do not overlap the way objects do in a 2D image. In addition, the three-dimensional nature of the data provides accurate depth perception. However, a crucial disadvantage of using point clouds for perception is that they carry no color information, which makes identifying objects more difficult than with camera images.

To overcome this disadvantage, lidar point clouds can be colored by mapping lidar points onto synchronized camera images. To do this, the images and point clouds must first be synchronized so that they capture the environment at the same time. Once this time synchronization is complete, we can project points from the lidar point cloud onto the camera images using the extrinsic and intrinsic matrices. This projection is also crucial for fusing camera images and lidar point clouds into a unified representation, a process known as lidar sensor fusion. The projection can be performed as follows:

  1. Transform the x, y, z coordinates from the lidar frame of reference to the camera frame of reference. This involves applying a transformation matrix derived from the position and orientation of the camera with respect to the lidar.
  2. Project the points from the camera frame onto the 2D image plane. Just as a real camera turns rays of light passing through its lens into pixels on a grid of tiny sensors, we can map the 3D points in the camera frame onto a 2D image plane, as sketched below.
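
To make these two steps concrete, here is a minimal numpy sketch of the underlying math, roughly what the helper functions used later in this post are doing. The extrinsic matrix (4x4, lidar to camera) and intrinsic matrix K (3x3) shown here are placeholder values; in practice they come from your dataset's calibration, and the sketch assumes an ideal pinhole camera with no lens distortion.

import numpy as np

# Placeholder calibration values -- in practice these come from your dataset
extrinsic = np.eye(4)               # 4x4 lidar-to-camera transform (rotation + translation)
K = np.array([[1000.0, 0.0, 640.0],
              [0.0, 1000.0, 360.0],
              [0.0, 0.0, 1.0]])     # 3x3 camera intrinsic matrix

def lidar_point_to_pixel(pt):
    """Project a single lidar point (x, y, z) onto the image plane."""
    # Step 1: lidar frame -> camera frame, using homogeneous coordinates
    cam_pt = extrinsic @ np.append(pt, 1.0)

    # Step 2: camera frame -> image plane (ideal pinhole projection)
    uvw = K @ cam_pt[:3]
    u, v = uvw[0] / uvw[2], uvw[1] / uvw[2]

    depth = cam_pt[2]               # distance along the camera's viewing axis
    return int(u), int(v), depth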

Now let’s see how this can be done through some Python magic. We are going to skip reading and saving the files and only focus on the more important aspects of colorizing. If you would like to understand how you can read point cloud files in various formats, have a look at this post.

Projecting lidar points onto images

The complete code can be found in this GitHub repo.

The first step in the colorizing process is to project lidar points onto camera images.

for ind, pt in enumerate(pointcloud_data):
    # Move the point from the lidar frame to the camera frame
    cam_pt = lidarProjectionUtils.world2cam(
        camera_params[device_id]["extrinsic"], pt[:3])

    # Project the point from the camera frame onto the image plane
    pixel = lidarProjectionUtils.cam2image(
        camera_params[device_id], cam_pt, camera_params[device_id]["cameraModel"])

    x = int(pixel[0])
    y = int(pixel[1])
    depth = cam_pt[2]

The depth variable holds the depth of each point as seen from the camera. This will be important later on. Once we know which image pixel a lidar point maps to, we can use that pixel’s RGB value to color the point.

Colorizing the points

if depth > DEPTH_THRESHOLD:
    # Keep the smallest depth seen so far for this point across all cameras
    if point_depth[ind] > depth:
        point_depth[ind] = depth

    already_mapped = pixel_point_map_set[y][x]
    old_ind = pixel_point_map[y][x]

    # Color this point if the pixel is still unmapped, if the new point is closer
    # than the previously mapped one, or if both are at roughly the same depth
    if (not already_mapped) or (point_depth[old_ind] > depth) or abs(point_depth[old_ind] - depth) < 0.5:
        pixel_point_map[y][x] = ind
        pixel_point_map_set[y][x] = True
        point_rgb[ind] = image[y][x]

        # If a point was already mapped to this pixel but the new point is clearly
        # closer, the old point is occluded -- color it white
        if already_mapped and (point_depth[old_ind] > depth) and abs(point_depth[old_ind] - depth) > 0.5:
            point_rgb[old_ind] = [255, 255, 255]

There’s a lot going on here, so let’s try to understand this more clearly.

The first thing to note is that we skip all points whose depth is below DEPTH_THRESHOLD. An image cannot contain points from behind the camera, and a depth value of less than 0 indicates that the point is behind the camera. We also want to ignore any points that are very close to the camera, since they are likely noise.

A key point to note here is that there can be multiple points in the point cloud that map to the same pixel in the image. The other checks ensure that we make the right decision about which point to color with the RGB values of that pixel.

  1. Check whether the pixel has not been mapped to a point cloud point yet. If so, assign the pixel’s RGB value to the point.
  2. If a point has already been mapped to this pixel, but the new point is closer to the camera, assign the pixel’s RGB value to the new point instead. This can happen when one object is in front of another: in the image, the object in front is superimposed over the object behind it. It can also happen when the areas covered by two or more cameras overlap. In either case we want to keep the point that is closer to the camera, as it carries the more authentic color.
  3. In addition, the lidar points belonging to the obstructed object would also map to pixels that belong to the object in front. However, we don’t have any way to color them correctly, since that information is lost in the image. So it is better to color them white, so that the colors of the front object are not inaccurately imposed on the object behind it.

Adjusting the colors

Adjusting the colors of an image before colorizing a point cloud can make the output point cloud look much better. The original colorized point cloud is on the left, and the adjusted one is on the right.

Lidars produce sparse point clouds. As a result, when projecting colors from images onto these sparse point clouds, the colors can appear washed out. To make them appear brighter, you can adjust the contrast, brightness and saturation so that the colors in the image pop out. My suggestion would be to increase the values as high as you can without making objects in the image disappear, and then apply the colorizing process to the edited images.
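
If you use Pillow, this adjustment takes only a few lines. The enhancement factors below are hypothetical starting points; tune them for your own images.

from PIL import Image, ImageEnhance

image = Image.open("camera_frame.jpg")            # hypothetical file name

# Factors greater than 1.0 boost each property; raise them until objects start to vanish
image = ImageEnhance.Contrast(image).enhance(1.4)
image = ImageEnhance.Brightness(image).enhance(1.2)
image = ImageEnhance.Color(image).enhance(1.5)    # "Color" controls saturation in Pillow

image.save("camera_frame_adjusted.jpg")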

Further improvements

While this colorizing process works well for most of the datasets we tried, it is slow. One way to speed it up would be to perform the lidar-to-camera conversion with numpy matrix operations on all points at once. This should be considerably faster than our method of looping over each point and transforming it to the image plane individually.
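
As a rough illustration of what that could look like, the sketch below projects every point in one shot with matrix operations, assuming the same kind of 4x4 extrinsic and 3x3 intrinsic matrices as before and ignoring camera distortion.

import numpy as np

def project_all_points(points, extrinsic, K):
    """Vectorized lidar-to-image projection for an (N, 3) array of points."""
    # Lidar frame -> camera frame for all points at once
    homogeneous = np.hstack([points, np.ones((points.shape[0], 1))])   # (N, 4)
    cam_pts = (extrinsic @ homogeneous.T).T[:, :3]                     # (N, 3)

    # Camera frame -> image plane (ideal pinhole model)
    uvw = (K @ cam_pts.T).T                                            # (N, 3)
    pixels = (uvw[:, :2] / uvw[:, 2:3]).astype(int)
    depths = cam_pts[:, 2]
    return pixels, depths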

Another possible improvement is to handle situations where an object is occluded in the image, but its points still map to pixels in the image because their rays pass through the object in front. I am currently exploring using meshes to solve this problem. The idea is to convert the point cloud into a rough mesh, so that it blocks a ray passing through to an occluded object.
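
As a very rough sketch of that idea, and not the final approach, the snippet below uses Open3D to build a coarse mesh from the point cloud and cast a ray from the camera toward each point; a point is treated as occluded when the ray hits the mesh noticeably before reaching it. The meshing radii and the margin are guesses that would need tuning per dataset.

import numpy as np
import open3d as o3d

def occluded_mask(points, cam_origin, radii=(0.3, 0.6), margin=0.5):
    """Return a boolean mask marking points hidden behind the reconstructed mesh."""
    pcd = o3d.geometry.PointCloud()
    pcd.points = o3d.utility.Vector3dVector(points)
    pcd.estimate_normals()

    # Build a rough surface mesh from the sparse point cloud
    mesh = o3d.geometry.TriangleMesh.create_from_point_cloud_ball_pivoting(
        pcd, o3d.utility.DoubleVector(list(radii)))

    scene = o3d.t.geometry.RaycastingScene()
    scene.add_triangles(o3d.t.geometry.TriangleMesh.from_legacy(mesh))

    # One ray per point, from the camera origin toward the point
    directions = points - cam_origin
    distances = np.linalg.norm(directions, axis=1, keepdims=True)
    rays = np.hstack([np.tile(cam_origin, (len(points), 1)),
                      directions / distances]).astype(np.float32)
    hits = scene.cast_rays(o3d.core.Tensor(rays))["t_hit"].numpy()

    # Occluded if the mesh is hit noticeably closer than the point itself
    return hits < (distances.ravel() - margin)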

If you have any suggestions for either of these, I would be happy to have a chat, so do leave a comment.
