Import and Export Your 3D Point Cloud Data in KITTI Format with Xtreme1-SDK Toolkit

Published in Multimodal Data Training. 5 min read. Jun 8, 2023.

3D point clouds are a natural fit for capturing and using LiDAR data in object detection and tracking algorithms. They provide a digital model of the surrounding environment that suits LiDAR-based perception and scene understanding.

1. Navigating the Maze of Data Formats

With dozens of file formats available for 3D point cloud data, working with them can be a daunting task. Various acquisition devices generate different raw data formats, but processing software typically supports only a limited selection of these formats, each with unique export capabilities.

Among them, the PCD (Point Cloud Data) format (*.pcd) stands out as the native file format of the Point Cloud Library (PCL). Multimodal fusion complicates the dataset structure further, because camera images and calibration files have to be kept alongside the point clouds. Consequently, converting point cloud formats, folder structures, camera image formats, and camera calibration file formats for 2D & 3D fusion demands considerable engineering effort.

To tackle these challenges head-on, we introduce the Xtreme1-SDK toolkit.

2. Unifying Data Formats with the SDK

The KITTI dataset is a popular source of experimental data and a common starting point for trying out 3D point cloud annotation and visualization tools, for example by uploading data and displaying annotation boxes.

Established through a collaboration between the Karlsruhe Institute of Technology in Germany and the Toyota Technological Institute at Chicago in the United States, the KITTI dataset is one of the most widely used benchmarks for evaluating computer vision algorithms in autonomous driving scenarios.

2.1 Understanding the KITTI Format

Xtreme1, initiated by the BasicAI team, is designed to work with the standard KITTI format. If your data doesn't match the examples in this section, you'll need to convert it to the standard format first.

The standard KITTI format (here with one camera angle and consistent folder naming) is illustrated by the calibration and annotation examples below:

Camera calibration parameter example:

P0: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 0.000000000000e+00 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P1: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.875744000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 0.000000000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 0.000000000000e+00
P2: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 4.485728000000e+01 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.163791000000e-01 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.745884000000e-03
P3: 7.215377000000e+02 0.000000000000e+00 6.095593000000e+02 -3.395242000000e+02 0.000000000000e+00 7.215377000000e+02 1.728540000000e+02 2.199936000000e+00 0.000000000000e+00 0.000000000000e+00 1.000000000000e+00 2.729905000000e-03
R0_rect: 9.999239000000e-01 9.837760000000e-03 -7.445048000000e-03 -9.869795000000e-03 9.999421000000e-01 -4.278459000000e-03 7.402527000000e-03 4.351614000000e-03 9.999631000000e-01
Tr_velo_to_cam: 7.533745000000e-03 -9.999714000000e-01 -6.166020000000e-04 -4.069766000000e-03 1.480249000000e-02 7.280733000000e-04 -9.998902000000e-01 -7.631618000000e-02 9.998621000000e-01 7.523790000000e-03 1.480755000000e-02 -2.717806000000e-01
Tr_imu_to_velo: 9.999976000000e-01 7.553071000000e-04 -2.035826000000e-03 -8.086759000000e-01 -7.854027000000e-04 9.998898000000e-01 -1.482298000000e-02 3.195559000000e-01 2.024406000000e-03 1.482454000000e-02 9.998881000000e-01 -7.997231000000e-01
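
Each entry in the calibration file is a flat, row-major list of numbers: P0 to P3 are 3x4 camera projection matrices, R0_rect is a 3x3 rectifying rotation, and Tr_velo_to_cam and Tr_imu_to_velo are 3x4 rigid transforms. As a minimal sketch of how these values are typically used, the pure-Python snippet below parses such a file and projects a LiDAR point into an image with the usual KITTI relation x = P2 · R0_rect · Tr_velo_to_cam · X. The function names are illustrative only and are not part of the Xtreme1 SDK.

```python
def parse_calib(text):
    """Parse KITTI calibration text into {name: flat list of floats}."""
    calib = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, values = line.split(":", 1)
        calib[key.strip()] = [float(v) for v in values.split()]
    return calib


def to_matrix(flat, rows, cols):
    """Reshape a flat row-major list into a rows x cols nested list."""
    if len(flat) != rows * cols:
        raise ValueError(f"expected {rows * cols} values, got {len(flat)}")
    return [flat[r * cols:(r + 1) * cols] for r in range(rows)]


def matvec(matrix, vector):
    """Multiply a nested-list matrix by a vector."""
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def project_velo_to_image(point_xyz, calib, camera="P2"):
    """Project a LiDAR point (x, y, z) to (u, v, depth) in the chosen camera.

    Returns None when the point lies behind the camera.
    """
    tr = to_matrix(calib["Tr_velo_to_cam"], 3, 4)
    r0 = to_matrix(calib["R0_rect"], 3, 3)
    p = to_matrix(calib[camera], 3, 4)

    cam = matvec(tr, list(point_xyz) + [1.0])  # LiDAR frame -> camera frame
    rect = matvec(r0, cam)                     # camera frame -> rectified frame
    u, v, w = matvec(p, rect + [1.0])          # rectified frame -> image plane
    if w <= 0:
        return None
    return u / w, v / w, w
```

To try it, pass the contents of a calibration file to parse_calib and call project_velo_to_image with a point from the matching point cloud frame.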

Data annotation result example:

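Fields, in order: class name / truncation / occlusion / alpha / 2D bounding box (left, top, right, bottom) / 3D dimensions (height, width, length) / location (x, y, z) / rotation_y. The trailing value on each line below is a score.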

Car 0.00 0 1.51 896.18 505.20 1041.40 648.19 1.74 1.77 4.13 0.94 0.89 14.01 1.58 1
Car 0.00 0 2.34 388.93 519.28 820.69 712.79 1.54 1.85 4.40 -2.85 0.98 10.24 2.07 1
Car 0.00 0 0.64 1011.21 521.94 1249.42 614.17 1.41 1.97 4.05 4.43 1.10 18.16 0.88 1
None 0.00 0 -2.17 819.25 495.83 935.81 590.23 2.04 0.72 3.73 -0.54 0.79 22.48 -2.19 1
Car 0.00 0 -0.18 335.98 444.21 646.47 572.93 2.31 3.63 5.40 -8.63 0.08 19.88 -0.59 1
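
To make the field order concrete, here is a minimal sketch of a parser for one label line. It assumes the usual KITTI layout of 15 whitespace-separated fields plus an optional 16th score value, as in the lines above. The KittiObject class and parse_label_line function are illustrative names, not part of the Xtreme1 SDK.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class KittiObject:
    type: str
    truncated: float
    occluded: int
    alpha: float
    bbox: Tuple[float, ...]        # left, top, right, bottom (pixels)
    dimensions: Tuple[float, ...]  # height, width, length (meters)
    location: Tuple[float, ...]    # x, y, z in camera coordinates (meters)
    rotation_y: float
    score: Optional[float] = None  # only present when the line has 16 fields


def parse_label_line(line: str) -> KittiObject:
    """Parse one KITTI label line into a KittiObject."""
    parts = line.split()
    if len(parts) not in (15, 16):
        raise ValueError(f"expected 15 or 16 fields, got {len(parts)}")
    nums = [float(x) for x in parts[1:]]
    return KittiObject(
        type=parts[0],
        truncated=nums[0],
        occluded=int(nums[1]),
        alpha=nums[2],
        bbox=tuple(nums[3:7]),
        dimensions=tuple(nums[7:10]),
        location=tuple(nums[10:13]),
        rotation_y=nums[13],
        score=nums[14] if len(nums) == 15 else None,
    )
```

Calling parse_label_line on the first example line above yields an object of type "Car" with location (0.94, 0.89, 14.01) and rotation_y 1.58.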

2.2 Converting KITTI Dataset to a Format Supported by the Xtreme1 Platform

Data conversion and preprocessing:

# Install the Xtreme1 main program, see
# https://docs.xtreme1.io

# Install Xtreme1-sdk, see
# https://github.com/xtreme1-io/xtreme1-sdk
pip install git+https://github.com/xtreme1-io/xtreme1-sdk.git

# Or download the repository and install it:
git clone git@github.com:xtreme1-io/xtreme1-sdk.git
cd xtreme1-sdk
pip install -e .
# (Note: There's a dot after -e)

2.3 Preparing Data

Prepare a dataset in the standard KITTI format (download link: https://www.kaggle.com/datasets/garymk/kitti-3d-object-detection-dataset).
Convert it to a format supported by Xtreme1 so it can be uploaded. The import command below handles point cloud format conversion, camera parameter JSON conversion, folder structure conversion, and pre-annotation result format conversion:

python -m xtreme1.script_ctl --mode import --src "path/to/kitti_dataset_dir" --dst "path/to_save/upload_files" --format kitti

(Replace the quoted paths in the command above with your own data paths.)
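
If you would rather trigger the conversion from a Python script than from a shell, the command can be wrapped with subprocess. This is only a sketch: it uses nothing beyond the flags shown in this article (--mode, --src, --dst, --format), and the helper names build_xtreme1_command and run_conversion are hypothetical, not SDK functions.

```python
import subprocess
import sys


def build_xtreme1_command(mode, src, dst, fmt="kitti"):
    """Build the argument list for `python -m xtreme1.script_ctl`."""
    if mode not in ("import", "export"):
        raise ValueError(f"mode must be 'import' or 'export', got {mode!r}")
    return [
        sys.executable, "-m", "xtreme1.script_ctl",
        "--mode", mode,
        "--src", str(src),
        "--dst", str(dst),
        "--format", fmt,
    ]


def run_conversion(mode, src, dst, fmt="kitti"):
    """Run the conversion; raises CalledProcessError on a nonzero exit code."""
    subprocess.run(build_xtreme1_command(mode, src, dst, fmt), check=True)
```

Passing the arguments as a list avoids shell quoting problems when paths contain spaces. For example, run_conversion("import", "path/to/kitti_dataset_dir", "path/to_save/upload_files") mirrors the command above.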

2.4 Uploading Pre-annotated Data to the Platform

When uploading data, select "Include Results" and specify a result type according to your needs.


Next, add the corresponding tags in the Ontology. If you only want to display 3D boxes without labels, you can skip this step.


After a successful upload, open the annotation interface and select the result source in the upper right corner (the result type you specified earlier) to view the pre-annotation results visualized on the Xtreme1 platform.

2.5 Exporting in KITTI Format

Use the SDK to convert the Xtreme1 format to the KITTI dataset format:

python -m xtreme1.script_ctl --mode export --src "path/to/result_zipfile.zip" --dst "path/to_save/kitti_format_results" --format kitti

Exported results are saved separately for each camera angle. For example, label_0 holds the annotation results for camera 0.
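
A quick sanity check on an export is to count how many objects of each class ended up in a label folder. The sketch below assumes that a folder such as label_0 contains one KITTI-style .txt file per frame, with the class name as the first field of each line. That layout is an assumption here, so adjust the glob pattern if your export differs. count_classes is an illustrative helper, not an SDK function.

```python
from collections import Counter
from pathlib import Path


def count_classes(label_dir):
    """Count objects per class across all .txt label files in label_dir."""
    counts = Counter()
    for path in sorted(Path(label_dir).glob("*.txt")):
        for line in path.read_text().splitlines():
            fields = line.split()
            if fields:  # ignore empty lines
                counts[fields[0]] += 1
    return counts
```

For example, count_classes("path/to_save/kitti_format_results/label_0") would return a Counter keyed by class name, assuming label_0 sits directly under the export destination.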

3. Looking Ahead

nuScenes, Waymo Open, and other open datasets are also widely used, and our SDK toolkit will add support for converting these data formats.
The SDK is currently invoked from the command line. In the next iteration, we'll offer a service-based approach for backend calls, enabling direct imports into the platform.
One more thing: we're also creating an open-source benchmark dataset for autonomous driving that combines cutting-edge technology and equipment from 2023. Stay tuned!
