nuScenes — Using Yonohub to develop, evaluate and benchmark over the cloud

Ahmed Hendawy · Published in YonoHub · Jul 31, 2019
Working with nuScenes Dataset on Yonohub

Introduction

On the 27th of March 2019, Aptiv Autonomous Mobility (formerly nuTonomy) published the full version of a large-scale dataset for the autonomous driving development community, called nuScenes. The dataset is rich in data types coming from various kinds of sensors. It contains 1,000 scenes, which are split into training and testing parts. Six cameras, one lidar, five radars, and one IMU contribute to this state-of-the-art dataset. The layout of the sensors is shown below.

The layout of the Sensors

Recently, the YonoHub team created a new package for the nuScenes dataset in the form of encapsulated blocks that can be used in the YonoArc app. Yonohub developed user-friendly blocks that facilitate interaction with this dataset, during both development and benchmarking, without the need for multiple processes to extract and manipulate the raw data for training or even visualization. The goal is to support users with the different datasets used in the autonomous driving field without requiring any prior knowledge of the dataset structure, and to ease integration with other perception blocks.

Yonohub published four blocks that enable fast development and evaluation against such a complex, highly structured dataset. You can get the blocks for free through YonoStore. The nuScenes package is built on standard ROS message types together with a package of nuScenes custom messages (nuscenes-msgs).

nuScenes — YonoArc Blocks

You do not need to download this huge dataset yourself. YonoHub provides users with various datasets (e.g., KITTI, BDD) through YonoStore. You can get the nuScenes dataset, and it will be downloaded directly into the YonoStoreDatasets folder in your YonoHub Drive. The nuScenes splits available on YonoStore are the mini split and the keyframed trainval split.

In this article, I will demonstrate some use cases that can be built with the nuScenes package in YonoArc. In the next few sections, I will illustrate how to extract the different kinds of raw sensory data as well as the ground-truth 3D bounding boxes. We will also learn how to benchmark against the nuScenes dataset.

Visualizing the Camera Images with the 3D Bounding Boxes

In this demo, I show how to use the “nuScenes Dataset Player” block to extract the images captured by the different cameras as well as the ground-truth 3D bounding boxes. We then draw the bounding boxes on the corresponding camera image.

nuScenes Dataset Player Configuration

I purchase, for free, the “nuScenes Dataset Player” block along with two utility blocks, “nuScenes Frame Transformer” and “nuScenes Draw 3D Boxes”. You can get more information about each block from the description in the Help tab (2) after opening the settings window (1).

The player block publishes 15 different outputs on its output ports. You can extract the data of the 12 sensors alongside the 3D bounding boxes, the frame transformation matrices, and the calibration matrices of the cameras.

The player block provides a property for choosing, via checkboxes (3), which sensor(s) to extract raw data from. Extracting all of this data is computationally expensive, so selecting fewer sensors increases the maximum publishing rate you can choose (4). In fact, the block guarantees the publishing rate you enter in the related field as long as you select few enough sensors.
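
To make this trade-off concrete, here is a minimal sketch of how such a rate-limited playback loop might work. The block's internals are not public; `samples` and `publish_frame` are hypothetical names.

```python
import time

def play(samples, rate_hz, publish_frame):
    """Publish dataset samples at a user-chosen rate (illustrative sketch)."""
    period = 1.0 / rate_hz
    for sample in samples:
        start = time.time()
        publish_frame(sample)  # extract and push the frame to the output ports
        elapsed = time.time() - start
        # Sleep only for the remainder of the period. If extraction takes
        # longer than the period, the loop falls behind the target rate,
        # which is why selecting fewer sensors raises the achievable rate.
        time.sleep(max(0.0, period - elapsed))
```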

Then, you need to specify the directory associated with one of the dataset's compressed folders in the provided “Dataset Path” field (5).

For the next demo, I use the output related to the front camera. The ground truth is published continuously by default. Note that the bounding boxes published by the player block are expressed with respect to a global frame. I therefore use the “nuScenes Frame Transformer” block to transform the reference frame of the 3D bounding boxes to the front camera frame (2).
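
Under the hood, such a frame change is a homogeneous transformation. Here is a minimal numpy sketch, assuming the camera pose in the global frame is available as a 4x4 matrix (the variable names are mine, not the block's):

```python
import numpy as np

def global_to_camera(corners_global, T_global_from_camera):
    """Transform 3D box corners (N x 3, global frame) into the camera frame.

    T_global_from_camera is the 4x4 homogeneous pose of the camera expressed
    in the global frame (a hypothetical name for the transformation matrices
    the player block publishes).
    """
    T_camera_from_global = np.linalg.inv(T_global_from_camera)
    # Append a homogeneous coordinate of 1 to each corner.
    homo = np.hstack([corners_global, np.ones((corners_global.shape[0], 1))])
    corners_camera = (T_camera_from_global @ homo.T).T
    return corners_camera[:, :3]
```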

nuScenes Frame Transformer Configuration

After the transformation, I add the “nuScenes Draw 3D Boxes” block, which outputs an image annotated with the ground-truth 3D bounding boxes published by the player. Additionally, I use the “Video Player” block, which is provided in your default YonoArc packages, to visualize the output. YonoArc supports visualizing the output of the launched pipeline in the YonoDashboard window, which can be opened by clicking on the bottom left.
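
Drawing the boxes boils down to projecting the camera-frame corners onto the image using the camera calibration matrices the player publishes. A minimal pinhole-projection sketch, assuming a 3x3 intrinsic matrix `K` and continuing the sketch above:

```python
import numpy as np

def project_to_image(corners_camera, K):
    """Project 3D points (N x 3, camera frame) onto the image plane with the
    3x3 intrinsic matrix K. Points behind the camera (z <= 0) should be
    filtered out before calling this."""
    pts = (K @ corners_camera.T).T  # (N, 3): [u * z, v * z, z]
    pts[:, :2] /= pts[:, 2:3]       # perspective divide -> pixel coordinates
    return pts[:, :2]
```

The eight projected corners of each box can then be connected with line segments (e.g., with OpenCV) to draw the 3D box on the image.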

Finally, after launching the full pipeline, we can see the output shown below. I visualize the annotated video side by side with the original one. To do so, I implemented a custom block called “Stack” that concatenates the two images and keeps them in sync. You can check the docs for further information about creating custom YonoArc blocks.
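
The “Stack” block's actual code is not shown here, but its core idea can be sketched as follows; the class structure and names are my assumptions, not YonoArc's block API:

```python
import numpy as np

class Stack:
    """Keep the most recent image from each of the two input streams and,
    once both have arrived, emit them concatenated side by side."""

    def __init__(self):
        self.latest = [None, None]

    def on_image(self, port, image):
        # port is 0 (original stream) or 1 (annotated stream)
        self.latest[port] = image
        if all(img is not None for img in self.latest):
            return np.hstack(self.latest)  # one side-by-side frame
        return None  # wait until both streams have produced an image
```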

The launched Pipeline
Original Vs. Annotated Frames

Moreover, I enlarged the pipeline to visualize the full outputs of all six cameras by repeating the above structure six times. Variants of the “Stack” custom block were developed for the different numbers of input images to be stacked. I also implemented an image-mirroring block to arrange the desired visualization as a full scene.

You can check the structure of the pipeline below, as well as the corresponding results.

The Launched Pipeline
Six Cameras Annotated Frames

Visualizing the Lidar/Radar Point Cloud using RViz

Most 3D object detection algorithms are based either on the lidar point cloud alone or on its fusion with the images captured by the cameras. Thus, I created another demo to visualize the point clouds of the lidar and radar sensors.

RViz, which is also supported as a YonoArc block, is used for visualizing the point cloud. You can get the RViz block easily from YonoStore.

I chose the lidar and the front radar to be extracted from the player block, and inserted the RViz block, which listens to all the topics in the running pipeline. The points' reference frame is the sensor frame itself, so you need to specify the fixed frame in RViz accordingly. For more information about the naming of the different frames, please refer to the descriptions of the player and frame transformer blocks.

I launched the pipeline and opened the RViz block (2) as shown,

Launching RViz Block

In the RViz window, you can add the many types of data you may want to visualize. I add a “PointCloud2” display and select the lidar topic to be visualized. The topic names are very informative and mimic the names of the keys used in the nuScenes format. As mentioned, you are required to change the fixed frame to “lidar”, which is the reference frame of the lidar points.
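
For context, a lidar point cloud reaches RViz as a standard sensor_msgs/PointCloud2 message whose frame_id matches that fixed frame. A minimal rospy sketch (the node and topic names are hypothetical; the player block defines its own):

```python
import rospy
from std_msgs.msg import Header
from sensor_msgs import point_cloud2
from sensor_msgs.msg import PointCloud2

def publish_lidar(pub, points_xyz):
    """Wrap an iterable of (x, y, z) tuples in a PointCloud2 message.

    The frame_id must match the fixed frame chosen in RViz ("lidar" here,
    following the player block's frame naming)."""
    header = Header(stamp=rospy.Time.now(), frame_id="lidar")
    pub.publish(point_cloud2.create_cloud_xyz32(header, points_xyz))

if __name__ == "__main__":
    rospy.init_node("lidar_publisher")  # hypothetical node name
    pub = rospy.Publisher("/nuscenes/lidar_top", PointCloud2, queue_size=1)
```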

RViz Configuration

You can see the visualized 3D Point Cloud as shown below,

Lidar Point Cloud Visualization in RViz

Similarly, the point cloud of the front radar is demonstrated in the following screenshot,

Front Radar Point Cloud Visualization in RViz

nuScenes Object Detection Benchmark

Yonohub also provides a plug-and-play benchmark YonoArc block for the nuScenes dataset. Developers in the computer vision community can reuse the same YonoArc pipeline with only a slight change: replacing the algorithm block.

In the following demo, I will demonstrate the usage of the “nuScenes Benchmark Evaluator” block. To illustrate its function, I evaluate the ground truth of the dataset against itself; the expected value of the nuScenes detection score (NDS) is one. Be careful: the reference frames of both inputs must be identical.

The evaluator block benchmarks a developed algorithm against the nuScenes dataset using the performance metrics specified by the nuScenes evaluation criteria of the CVPR 2019 challenge.
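
For reference, the NDS combines the mean average precision with the five true-positive error metrics, which is why evaluating the ground truth against itself should score exactly one. A small sketch of the published formula:

```python
def nds(mAP, tp_errors):
    """nuScenes detection score:
    NDS = (1/10) * (5 * mAP + sum of (1 - min(1, err)) over the five
    true-positive error metrics mATE, mASE, mAOE, mAVE, mAAE)."""
    assert len(tp_errors) == 5
    return (5 * mAP + sum(1 - min(1.0, e) for e in tp_errors)) / 10.0

# Ground truth vs. itself: perfect mAP and zero errors.
print(nds(1.0, [0.0] * 5))  # 1.0
```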

You can generate intermediate results over the accumulated evaluated samples at the click of a button (3), as shown. You just need to specify the directory for the results (2) as well as the name of the report. Besides the report, the block saves the evaluation metrics of all the classes, together with a summary of the results, as JSON files.

nuScenes Benchmark Block Configuration

Also, you can continuously plot the NDS using the “Line Chart Viewer” block. The Y-axis represents the NDS, while the X-axis shows the time (2). You can refer to the Help tab for more information about the block.

Line Chart Block Configuration

You can see the demo's results below, provided as a chart of the NDS as well as the content of the text file,

nuScenes Detection Score Chart
Text File Output

Conclusion

To summarize, I tried to present the experience of developing, evaluating, and benchmarking different computer vision algorithms over the cloud through Yonohub. It's easy to try out Yonohub: new users receive $25 in free credits. Sign up on Yonohub!

My next step is to invest some time developing one of the off-the-shelf algorithms as a YonoArc block to complete the benchmark loop.
