SiaSearch partners with Motional to make the groundbreaking nuScenes dataset fully accessible for researchers

Uri Eldan
SiaSearch (now Scale Nucleus)
5 min read · Oct 6, 2020

SiaSearch is thrilled to announce that nuScenes datasets by Motional, a global leader in driverless technology, are now available in the SiaSearch platform. Our team is proud to support automated driving research and hopes that SiaSearch will encourage more people to get involved with self-driving and robotics.

Explore nuScenes through SiaSearch - register here

SiaSearch provides novel visibility into raw sensor data and efficient access to data at scale by introducing content-based searchability for unstructured sensor data. Searchability comes from an automatically extracted, lightweight metadata catalog, which significantly accelerates data-driven development by reducing the time engineers currently spend on data triage and selection.

SiaSearch’s offerings include:

1. User-friendly and intuitive GUI to analyze, query, and select driving sequences

2. API for programmatic data selection

3. Ability to query based on the nuScenes tags (objects and attributes) in addition to the SiaSearch tags

4. Automatically created semantic attributes (meta tags) that describe each drive and can be used to query and filter the data. These meta tags include:

  • Maneuvers performed by the Ego vehicle (lane changes, turns, cut-ins …)
  • Maneuvers performed by other traffic participants (overtakings, cut-ins …)
  • Critical situations (critical brakings, critical accelerations, violating speed limits, not keeping a safety distance …)
  • Environmental attributes
  • Road infrastructure attributes
SiaSearch GUI

Our goal is to give the research community another way to access data, as our addition of nuScenes by Motional demonstrates. This collaboration makes nuScenes fully queryable using SiaSearch’s comprehensive catalog of over 60 attributes and driving scenarios.

nuScenes is the second dataset we have made available to the research community through SiaSearch (learn more about our collaboration with the KITTI dataset). With 1,000 driving scenes from Boston and Singapore, two cities known for their dense traffic and highly challenging driving situations, the dataset covers a diverse and interesting set of driving maneuvers, traffic situations, and unexpected behaviors. We are proud to help researchers make the most of this data and to help sustain momentum in the industry during these challenging times.

“With SiaSearch, the way engineers interact with driving data changes completely. SiaSearch automatically turns unstructured data into fully structured data and allows us, engineers, to efficiently interact with it and reduce the time spent on data triage and selection”

Holger Caesar, Team Lead Data-Curation @Motional and nuScenes Project Lead

The Challenge with Accessing Data Today

In general, a dataset is considered to be of higher quality if it is bigger, more diverse, and includes more sensor inputs. The problem is that the ease of accessing and interacting with a dataset is inversely correlated with its size and complexity. Finding edge cases and interesting interactions within a large-scale dataset is like finding a needle in a haystack. Making a great dataset such as nuScenes available through SiaSearch therefore makes it even more powerful: the dataset becomes fully accessible, and selecting the relevant driving sequences becomes trivial.

The Importance of Research

The research community plays a critical role in the autonomous vehicle (AV) development ecosystem. Some of the most important advancements in AV development became a reality thanks to curious researchers in labs around the globe. SLAM (Simultaneous Localization and Mapping) and object detection are two prime examples of significant progress made in research environments. That work laid the groundwork for solving fundamental problems that AV engineers in the industry deal with every day.

Today, AV development is very data-driven. Without driving datasets, it would be impossible to train and validate perception, prediction, or motion planning models. However, equipping a test vehicle with a typical sensor setup (cameras, lidars, radars, GPS, IMU) can cost up to a few hundred thousand dollars. Large-scale publicly available datasets (free of charge for non-commercial activities) have therefore played a significant part in pushing AV development forward, and will continue to do so.

Many articles have been written on the importance of public datasets and on what motivates OEMs, Tier 1 suppliers, and full-stack startups to release them to the research community.[1] [2] [3] These publications share similar motivations: encouraging researchers to develop new insights, discuss novel perspectives, and discover solutions to existing, shared problems. It is clear that the industry sees the research community as an integral part of the autonomous vehicle ecosystem and as a facilitator of future developments.

How SiaSearch is Used to Curate Training Datasets

For example, suppose you are an ML engineer working on motion planning, and you know your model struggles with complex driving scenarios such as unprotected left turns with oncoming traffic and pedestrians next to the ego vehicle. You want to curate a training dataset to improve your model. How would you find the relevant data?
Currently, this process is time-consuming and requires a manual review of large amounts of data to select scenes that fit the requirements.

With SiaSearch, this process can be heavily accelerated. The ML engineer can curate a training dataset in just a few minutes (even seconds, but we don’t like to brag) by intuitively querying the entire data lake, finding exactly what they need, and exporting the data to their development environment.

Query Example
Export Example
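
To make the idea concrete, here is a minimal Python sketch of tag-based selection followed by an export. It is not the SiaSearch API: the `Segment` class, the `query` and `export` helpers, the tag names, and the scene IDs are all hypothetical, and a small in-memory list stands in for the automatically extracted metadata catalog.

```python
import json
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Segment:
    """One tagged slice of a drive. All field names are hypothetical."""
    drive_id: str
    start_s: float
    end_s: float
    tags: frozenset = field(default_factory=frozenset)


def query(catalog, required_tags):
    """Return the segments that carry every tag in required_tags."""
    required = frozenset(required_tags)
    return [seg for seg in catalog if required <= seg.tags]


def export(segments):
    """Serialize the selected segments to JSON for a downstream pipeline."""
    rows = [
        {"drive_id": s.drive_id, "start_s": s.start_s, "end_s": s.end_s}
        for s in segments
    ]
    return json.dumps(rows, indent=2)


# Toy catalog standing in for automatically extracted metadata (made-up data).
CATALOG = [
    Segment("scene-A", 0.0, 8.0,
            frozenset({"left_turn", "oncoming_traffic", "pedestrian_near_ego"})),
    Segment("scene-A", 8.0, 20.0, frozenset({"lane_change"})),
    Segment("scene-B", 3.0, 11.0, frozenset({"left_turn", "oncoming_traffic"})),
    Segment("scene-C", 5.0, 12.5,
            frozenset({"left_turn", "oncoming_traffic", "pedestrian_near_ego", "rain"})),
]

# Left turns with oncoming traffic and a pedestrian close to the ego vehicle.
selected = query(CATALOG, {"left_turn", "oncoming_traffic", "pedestrian_near_ego"})
print(export(selected))
```

The point of the sketch is that once every segment carries structured tags, selecting a scenario reduces to a set-containment check instead of a manual review of raw sensor data.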

Explore nuScenes through SiaSearch — register here

If you want to learn more about SiaSearch and how it can help you interact with your raw sensor data, visit our website at www.siasearch.io, reach out to us via hi@siasearch.io, and follow us on LinkedIn and Twitter.

nuScenes is available for commercial use under our commercial license agreement. Please reach out to nuScenes@motional.com to learn more.

About Motional

Motional is a driverless technology company making self-driving vehicles a safe, reliable, and accessible reality.

The Motional team was behind some of the industry’s largest leaps forward, including the first fully-autonomous cross-country drive in the U.S., the launch of the world’s first robotaxi pilot, and operation of the world’s most-established public robotaxi fleet.

Motional is a joint venture between Hyundai Motor Group, one of the world’s largest vehicle manufacturers offering smart mobility solutions, and Aptiv, a global technology leader in advanced safety, electrification, and vehicle connectivity.

Headquartered in Boston, Motional has operations in the U.S. and Asia. For more information, visit www.motional.com, and follow on Twitter, LinkedIn, Facebook, Instagram, and YouTube.

Uri Eldan
SiaSearch (now Scale Nucleus)

VP Business Development @SiaSearch — revolutionizing the way automated driving engineers interact with multimodal sensor data.