Exploring the KITTI 3D object detection data set

From directory structure to 2D bounding boxes

Apr 10, 2020

Example image from camera 2 in the KITTI 3D object detection data set

The KITTI 3D detection data set was developed for learning 3D object detection in a traffic setting. In upcoming articles I will discuss different aspects of this data set.

Code and notebooks are in this repository: https://github.com/sjdh/kitti-3d-detection

Motivation for 3D detection

Autonomous robots and vehicles track the positions of nearby objects. These can be other traffic participants, obstacles and drivable areas.

For path planning and collision avoidance, merely detecting these objects is not enough. To make informed decisions, the vehicle also needs to know the relative position, relative speed and size of each object.

The 3D detection task

The task of 3D detection consists of several subtasks. Objects need to be detected, classified, and located relative to the camera. Finally, each object has to be enclosed in a tightly fitting bounding box.

Directory structure

The KITTI data set has the following directory structure:

{training,testing}/image_2/id.png
{training,testing}/image_3/id.png
{training,testing}/label_2/id.txt
{training,testing}/velodyne/id.bin
{training,testing}/calib/id.txt

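There are two cameras (image_2 and image_3) and a Velodyne laser scanner (velodyne). The label_2 directory holds the object labels and calib holds the calibration files.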
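Because every file of a frame shares the same id, all paths for one frame can be derived from the listing. The snippet below is a minimal sketch of that idea, not code from the repository: the helper name `frame_paths`, the root directory `data/kitti` and the zero-padded id `"000000"` are assumptions made for illustration.

```python
from pathlib import Path

# Sub-directory name -> file extension, taken from the directory listing.
KITTI_SUBDIRS = {
    "image_2": ".png",
    "image_3": ".png",
    "label_2": ".txt",
    "velodyne": ".bin",
    "calib": ".txt",
}


def frame_paths(root, split, frame_id):
    """Return the expected file path of every sub-directory for one frame.

    Assumption: frame ids are zero-padded strings such as "000000".
    """
    base = Path(root) / split
    return {
        name: base / name / f"{frame_id}{ext}"
        for name, ext in KITTI_SUBDIRS.items()
    }


# "data/kitti" is a placeholder root directory, not a path from the article.
paths = frame_paths("data/kitti", "training", "000000")
for name, path in paths.items():
    print(f"{name:9s} {path.as_posix()}")
```

Keeping the mapping in one dictionary means a missing file can be spotted by checking `path.exists()` for each entry before loading a frame.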

The two cameras can be used for stereo vision. Overlaying the images of the two cameras looks like this:

Labels

The label file looks like this

I wrote a gist for reading it into a pandas DataFrame. Here is the parsed table.
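For readers without the gist at hand, here is a rough sketch of how such a parser could look. It is not the gist itself. It assumes the standard KITTI object label layout of 15 space-separated fields per line; only the `bbox_` prefix of the column names comes from this article, the remaining names are my own choice for illustration.

```python
# Column names for the 15 space-separated fields of a KITTI label line.
# Only the bbox_ prefix appears in the article; the other names are assumptions.
LABEL_COLUMNS = [
    "type", "truncated", "occluded", "alpha",
    "bbox_xmin", "bbox_ymin", "bbox_xmax", "bbox_ymax",
    "dim_height", "dim_width", "dim_length",
    "loc_x", "loc_y", "loc_z", "rotation_y",
]


def parse_label_line(line):
    """Turn one label line into a dict; every field except 'type' is a float."""
    values = line.split()
    if len(values) < len(LABEL_COLUMNS):
        raise ValueError(
            f"expected {len(LABEL_COLUMNS)} fields, got {len(values)}"
        )
    row = dict(zip(LABEL_COLUMNS, values))
    for name in LABEL_COLUMNS[1:]:
        row[name] = float(row[name])
    return row


def read_labels(path):
    """Parse a label file into a list of row dicts, skipping blank lines."""
    with open(path) as handle:
        return [parse_label_line(line) for line in handle if line.strip()]


# Made-up example line, not taken from the data set.
example = "Car 0.00 0 -1.50 100.0 150.0 200.0 250.0 1.5 1.6 3.9 1.0 1.7 20.0 -1.4"
row = parse_label_line(example)
print(row["type"], row["bbox_xmin"], row["bbox_ymax"])
```

The list returned by `read_labels` can be passed directly to `pandas.DataFrame` to get one table row per labeled object.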

2D bounding boxes

The first step in 3D object detection is to locate the objects in the image itself. The corners of the 2D bounding boxes are stored in the columns whose names start with bbox_, such as bbox_xmin.
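The drawing code in this section loops over two variables, `corners` and `boxes`, that are not defined in the article. The sketch below shows one way they might be built from the bbox columns. The column names other than `bbox_xmin`, the helper names, and the exact shape of both variables are assumptions inferred from how the drawing code indexes them. Coordinates are rounded to integers because OpenCV drawing functions expect integer pixel positions.

```python
def box_from_row(row):
    """Return ((xmin, ymin), (xmax, ymax)) as integer pixel coordinates."""
    top_left = (int(round(row["bbox_xmin"])), int(round(row["bbox_ymin"])))
    bottom_right = (int(round(row["bbox_xmax"])), int(round(row["bbox_ymax"])))
    return top_left, bottom_right


def corners_from_box(box):
    """Expand two opposite corners into four corners, clockwise from top-left."""
    (xmin, ymin), (xmax, ymax) = box
    return [(xmin, ymin), (xmax, ymin), (xmax, ymax), (xmin, ymax)]


# Made-up rows standing in for the parsed label table.
rows = [
    {"bbox_xmin": 100.4, "bbox_ymin": 150.0, "bbox_xmax": 200.6, "bbox_ymax": 250.0},
    {"bbox_xmin": 10.0, "bbox_ymin": 20.0, "bbox_xmax": 30.0, "bbox_ymax": 40.0},
]
boxes = [box_from_row(row) for row in rows]
corners = [corners_from_box(box) for box in boxes]
print(boxes[0])
print(corners[0])
```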

Here the corner points are plotted as red dots on the image:

for box in corners:
    for corner in box:
        # cv2.circle(img, center, radius, color, thickness)
        # (255, 0, 0) shows as red when the array is displayed as RGB, e.g. with plt.imshow
        cv2.circle(img, corner, 1, (255, 0, 0), 5)

Getting the bounding boxes is a matter of connecting the dots:

for i, box in enumerate(boxes):
    img = cv2.rectangle(img, box[0], box[1], (0, 255, 0))
    img = cv2.putText(img, str(i), (box[0][0] + 10, box[0][1] - 4),
                      cv2.FONT_HERSHEY_SIMPLEX, 0.5, (0, 255, 0), 1)

plt.imshow(img)

Code

The full code can be found in this repository

https://github.com/sjdh/kitti-3d-detection