Google Releases ‘Objectron Dataset’ of Object-Centric Video Clips to Advance 3D Object Understanding

Published in SyncedReview, Nov 10, 2020

Google AI yesterday released its Objectron dataset — a collection of short, object-centric video clips capturing a large set of common objects from different angles. Each video clip is accompanied by AR session metadata that includes both camera poses and sparse point-clouds.

The Google researchers hope the dataset’s release will help the research community push the limits of 3D object-geometry understanding, which has the potential to power a wide range of applications such as augmented reality, robotics, autonomy, and image retrieval.

Earlier this year, Google AI researchers released MediaPipe Objectron, a mobile real-time 3D object detection pipeline that detects everyday objects in 2D images and estimates their poses and sizes using a machine learning (ML) model trained on a newly created 3D dataset.

Understanding objects in 3D remains challenging in large part due to the lack of large, real-world 3D datasets. The Google researchers believe the ML community has a strong need for object-centric video datasets that capture more of the 3D structure of an object while matching the data format used for many vision tasks, and so decided to release the Objectron dataset to aid in the training and benchmarking of ML models.

The Objectron dataset comprises 15,000 annotated video clips supplemented with over 4 million annotated images, collected from a geo-diverse sample covering 10 countries across five continents. Each object is manually annotated with a 3D bounding box that describes its position, orientation, and dimensions.
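To make the idea of a 3D bounding box defined by position, orientation, and dimensions more concrete, the following is a minimal sketch of how such a box can be expanded into its eight corner points. It assumes the box is given as a center point, a 3x3 rotation matrix, and a (width, height, depth) triple; this parameterization and the names `rotation_about_y` and `box_corners` are illustrative assumptions, not the dataset's actual annotation schema or API.

```python
import math
from itertools import product


def rotation_about_y(angle_rad):
    """Hypothetical helper: 3x3 rotation matrix for a turn about the y axis."""
    c, s = math.cos(angle_rad), math.sin(angle_rad)
    return [
        [c, 0.0, s],
        [0.0, 1.0, 0.0],
        [-s, 0.0, c],
    ]


def box_corners(center, rotation, dimensions):
    """Return the 8 corners of an oriented 3D box in world coordinates.

    center:     (x, y, z) position of the box center (assumed convention)
    rotation:   3x3 rotation matrix giving the box orientation
    dimensions: full extent of the box along its own x, y, z axes
    """
    corners = []
    for signs in product((-0.5, 0.5), repeat=3):
        # Corner expressed in the box's own (local) frame.
        local = [sign * dim for sign, dim in zip(signs, dimensions)]
        # Rotate into the world frame, then translate by the center.
        world = tuple(
            sum(rotation[row][col] * local[col] for col in range(3)) + center[row]
            for row in range(3)
        )
        corners.append(world)
    return corners


# Example: a box 2 x 4 x 6 units in size, centered at (1, 2, 3), unrotated.
identity = rotation_about_y(0.0)
axis_aligned = box_corners((1.0, 2.0, 3.0), identity, (2.0, 4.0, 6.0))

# The same box at the origin, turned 90 degrees about the vertical (y) axis.
turned = box_corners((0.0, 0.0, 0.0), rotation_about_y(math.pi / 2), (2.0, 4.0, 6.0))
```

With no rotation, the corners simply span the center plus or minus half of each dimension. After the 90-degree turn, the box's long axis lies along world x rather than z, which is the kind of orientation information a 2D bounding box cannot express.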

*Sample results of 3D object detection solution running on mobile*

The dataset currently includes bikes, books, bottles, cameras, cereal boxes, chairs, cups, laptops, and shoes, and is stored in the Objectron bucket on Google Cloud Storage. An open-source data pipeline is provided to parse the dataset in the TensorFlow, PyTorch, and JAX ML frameworks.

Along with the dataset, the researchers also shared a 3D object detection solution for the shoes, chairs, mugs and cameras categories. The models are trained with the Objectron dataset and have been released in MediaPipe, Google’s open-source framework for cross-platform customizable ML solutions for live and streaming media.

The Objectron dataset is available on GitHub.

Reporter: Yuan Yuan | Editor: Michael Sarazen



Written by Synced