By: Vinay Shet, Director, Product Management
The Lyft Level 5 team recently released a self-driving dataset with several tens of thousands of human-labeled 3D annotated frames and a semantic map, along with associated lidar frames and camera imagery. Today, we’re thrilled to launch a Kaggle competition with this dataset on 3D object detection over a semantic map. This competition is being conducted in association with NeurIPS 2019, a premier conference on machine learning, where we’ll be presenting top winners and their solutions at our competition track. The competition officially opens today and will run until November 12, 2019.
The goal of this competition is to advance the state of the art in 3D object detection. We hope to do that by:
- Democratizing the availability of a full modality of data, including advanced data artifacts like 3D semantic maps and fully-calibrated and localized lidar point clouds.
- Focusing research on this problem in the context of autonomous driving, i.e. 3D object detection with the semantic map prior.
We’ll be awarding $25,000 in prizes across three winners. In addition, top performers may be invited to the NeurIPS Conference in December, and may also get the opportunity to interview for a position with the Level 5 team. You can register to participate in this competition by heading over to the Kaggle site for information on data download, rules, submission guidelines, the evaluation metric, and more. You’ll also be able to view the public leaderboards and track progress of the participants.
Vladimir Iglovikov, Lyft’s own Kaggle grandmaster, will be the technical point of contact and host of this competition.
Accelerating the community
Lyft’s mission is to build the world’s best transportation. We envision a world where electric, autonomous vehicles enable us to reimagine our cities built around people, not cars. Self-driving cars are a critical component needed to bring this vision to fruition. We’re working to accelerate the development of self-driving technology to get closer to our vision.
From a technical standpoint, however, self-driving cars pose significant engineering challenges. These challenges are exacerbated by the fact that the bar to unlock technical research and development on higher-level autonomy functions like perception, prediction, and planning is extremely high. Unlocking this technical research and development requires:
- A robust self-driving hardware platform;
- Integration of a variety of hardened, well-calibrated sensors;
- Consistent timing across all subsystems;
- Development of efficient ways to ingest data from the car into the cloud;
- Development of tooling and operations to label ground truth on this data;
- Construction of a high-quality HD map;
- The ability to localize the car robustly across a variety of situations, and, finally,
- Efficient operations to drive these cars for several thousands of miles to collect data.
These factors require significant resources that are usually available only to industrial engineering organizations, large research labs, and very few academic groups. As a result, technical R&D on self-driving cars has traditionally been inaccessible to the broader research community.
Over the last two years, the Level 5 team has been developing a high-quality self-driving technology stack in close collaboration with Magna, one of the largest tier 1 manufacturers in the automotive industry. This has helped accelerate the creation of this large-scale dataset.
This dataset aims to democratize access to such data, and foster innovation in higher-level autonomy functions for everyone, everywhere. By conducting a competition, we hope to encourage the research community to focus on hard problems in this space — namely, 3D object detection over semantic maps.
Such a singular focus helps quickly establish baselines for performance of different algorithms within the community, and explore how far certain algorithms can go on specific subsets of problems. Insights derived from such concerted efforts can be incredibly valuable to the community at large for comprehending the shortfalls of current approaches within the context of real-world data. This then sets the stage for the discovery of novel approaches.
Perceiving the world in 3D
To function correctly, a self-driving car needs to perceive the world around it; maintain a precise geographic context of where in the world it’s located; predict the future state of all dynamic agents; and plan its own trajectory to navigate around these agents and progress towards its destination. Specifically, the car needs to perceive objects and reason about action in three dimensions through the fusion of vision and depth sensors such as lidar.
Each of these stages of autonomy poses daunting research and development challenges due to the stringent performance and robustness requirements around them. The 3D perception stage, in particular, is a critical part of the overall autonomy stack: all subsequent agent predictions and ego-vehicle actions hinge on it. Thus, it must be robust enough to take on a wide range of environmental variations, sensor noise characteristics, and complex agent interactions.
In recent years, R-CNNs have democratized object detection in 2D images¹. This is in large part due to the recent widespread availability of labeled 2D-object annotations and the willingness of the research community to share models and code that deliver high performance. The same has not yet occurred in 3D object detection. Today, there are multiple competing technologies for 3D object detection: some use convolutions², some do not (e.g. PointNet³), and some use continuous convolutions⁴. Each of these approaches comes with different tradeoffs, so no technology has emerged as a clear winner. This may be due, in part, to the lack of high-quality datasets in this domain.
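To make the contrast between these families a little more concrete, here is a toy sketch in Python. It is not taken from any of the cited papers or from the Level 5 stack. The function names `voxelize` and `max_pool_feature` and the sample points are hypothetical, chosen only for illustration. The first function buckets lidar-style points into a regular grid, the kind of representation convolution-based detectors consume. The second summarizes the raw point set with an order-independent pooling step, the core idea behind point-based networks.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Point = Tuple[float, float, float]
VoxelIndex = Tuple[int, int, int]


def voxelize(points: List[Point], voxel_size: float) -> Dict[VoxelIndex, int]:
    """Bucket points into a regular 3D grid and count points per occupied cell.

    Grid-based detectors run convolutions over a structure like this one.
    """
    counts: Dict[VoxelIndex, int] = defaultdict(int)
    for x, y, z in points:
        index = (
            math.floor(x / voxel_size),
            math.floor(y / voxel_size),
            math.floor(z / voxel_size),
        )
        counts[index] += 1
    return dict(counts)


def max_pool_feature(points: List[Point]) -> Point:
    """Order-independent summary of a raw point set: the per-axis maximum.

    Real point-based networks apply a learned per-point transform before
    pooling; here the transform is the identity (an assumption made to keep
    the sketch short).
    """
    xs, ys, zs = zip(*points)
    return (max(xs), max(ys), max(zs))


# Hypothetical sample points, in meters.
points = [(0.1, 0.2, 0.0), (0.4, 0.3, 0.1), (1.2, 0.1, 0.0), (-0.3, 0.9, 0.2)]
grid = voxelize(points, voxel_size=0.5)
pooled = max_pool_feature(points)
```

The grid discards the exact position of each point within a cell but gives convolutions a regular structure to work on. The pooled feature keeps the raw coordinates and does not depend on the order in which the points arrive. This is one simple way to see why the two families trade off differently.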
By launching this dataset and competition, we hope to encourage researchers to explore solutions to these challenges.
Democratizing the development of self-driving doesn’t just happen with datasets and competitions, but by sharing knowledge and insights. To that end, the Level 5 team conducted a tutorial on Perception, Prediction, and Large Scale Data Collection at CVPR 2019, where we shared practical tips for building a Perception & Prediction system for self-driving cars.
Lyft Level 5’s Tutorial at CVPR 2019
We covered the challenges involved in building systems designed to operate without human driver intervention, and how to take state-of-the-art neural network models into production. The tutorial struck a balance between applied research and engineering: we shared insights about the different kinds of labeled data needed for perception and prediction, and how to combine classical robotics and computer vision methods with modern deep learning approaches.
Below is an excerpt from this tutorial that focuses on 3D perception. In this presentation, you’ll hear thoughts from Ashesh Jain, Head of Perception at Level 5.
Moving self-driving forward, together
Self-driving cars have the potential to dramatically redefine the future of transportation. When fully realized, this technology promises to unlock myriad societal, environmental, and economic benefits. With the launch of this self-driving dataset and competition, we hope to empower the research community, stimulate further development, and share insights into future opportunities from the perspective of an advanced industrial self-driving car program.
In the meantime, if you’re interested in helping us build the future of transportation, check out our open positions in Palo Alto, San Francisco, London, and Munich.
1. Shaoqing Ren, Kaiming He, Ross Girshick, Jian Sun. Faster R-CNN: Towards Real-Time Object Detection with Region Proposal Networks. IEEE Transactions on PAMI, 2015. DOI:10.1109/TPAMI.2016.2577031
2. Y. Zhou, O. Tuzel. VoxelNet: End-to-End Learning for Point Cloud Based 3D Object Detection. Proceedings of the IEEE CVPR, 2018.
3. C. R. Qi et al. Frustum PointNets for 3D Object Detection from RGB-D Data. IEEE CVPR, 2018.
4. M. Liang, B. Yang, S. Wang, R. Urtasun. Deep Continuous Fusion for Multi-sensor 3D Object Detection. In: V. Ferrari, M. Hebert, C. Sminchisescu, Y. Weiss (eds), Computer Vision – ECCV 2018.