Intro to 3D Deep Learning
3D data representation, vision tasks and learning resources
Written by Margaret Maynard-Reid (ML GDE) and Nived PA
3D deep learning is an interesting area with a wide range of real-world applications: art and design, self-driving cars, sports, agriculture, biology, robotics, virtual reality and augmented reality. This blog post provides an introduction to 3D deep learning: 3D data representations, computer vision tasks and learning resources.
3D Data
Data is super important for training machine learning models. One of the biggest differences between 2D and 3D deep learning is the data representation format.
Regular images are typically represented as 1D or 2D arrays. 3D images, on the other hand, can have different representation formats. Here are a few of the most popular ones: multi-view, volumetric, point cloud, mesh and volumetric display. Let’s take a look at each data representation, illustrated with images.
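To make the difference between these formats a bit more concrete, here is a minimal sketch in plain Python of how one small cube might be stored as a point cloud, a mesh and a volumetric (voxel) grid. The variable names, the choice of triangles for the mesh faces and the 4x4x4 grid resolution are illustrative assumptions, not the layout of any particular library or dataset.

```python
# Point cloud: an unordered collection of (x, y, z) points.
# Here we use just the 8 corners of a unit cube.
point_cloud = [
    (x, y, z)
    for x in (0.0, 1.0)
    for y in (0.0, 1.0)
    for z in (0.0, 1.0)
]

# Mesh: the same vertices plus faces. Each face is a triple of
# indices into the vertex list (two triangles per cube side).
# With the ordering above, vertex index = 4*x + 2*y + z.
vertices = point_cloud
faces = [
    (0, 1, 3), (0, 3, 2),  # side x = 0
    (4, 6, 7), (4, 7, 5),  # side x = 1
    (0, 4, 5), (0, 5, 1),  # side y = 0
    (2, 3, 7), (2, 7, 6),  # side y = 1
    (0, 2, 6), (0, 6, 4),  # side z = 0
    (1, 5, 7), (1, 7, 3),  # side z = 1
]

# Volumetric: a regular 3D grid of occupancy values (voxels).
# Assumed resolution 4x4x4, with a 2x2x2 cube filling the center.
resolution = 4
voxels = [
    [
        [1 if all(1 <= i <= 2 for i in (x, y, z)) else 0
         for z in range(resolution)]
        for y in range(resolution)
    ]
    for x in range(resolution)
]

occupied = sum(v for plane in voxels for row in plane for v in row)
print(len(point_cloud), "points,", len(faces), "faces,", occupied, "occupied voxels")
```

The point cloud carries only positions, the mesh adds connectivity between those positions, and the voxel grid trades exact geometry for a fixed-size array that looks much like a 2D image with one extra dimension.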
Multi-view images
These can be captured by positioning multiple cameras that take photos of the same object or scene from different angles. Here is what a chair looks like in images from ShapeNet, which is a richly annotated, large-scale repository of shapes represented by 3D CAD models…
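As a rough illustration of the multi-view setup, the sketch below computes evenly spaced camera positions on a circle around an object placed at the origin. The function name `ring_camera_positions`, its parameters and the z-up coordinate convention are hypothetical choices made for this example; real capture rigs and renderers may arrange cameras differently.

```python
import math


def ring_camera_positions(num_views, radius, height=0.0):
    """Return (x, y, z) camera positions evenly spaced on a circle.

    Assumes the object sits at the origin and that z points up,
    so every camera is at the same height and the same horizontal
    distance from the object.
    """
    positions = []
    for i in range(num_views):
        # Angle of the i-th camera around the vertical axis.
        azimuth = 2.0 * math.pi * i / num_views
        x = radius * math.cos(azimuth)
        y = radius * math.sin(azimuth)
        positions.append((x, y, height))
    return positions


# Eight viewpoints, 2 units away from the object, slightly above it.
cameras = ring_camera_positions(num_views=8, radius=2.0, height=0.5)
for index, (x, y, z) in enumerate(cameras):
    print(f"view {index}: x={x:.2f}, y={y:.2f}, z={z:.2f}")
```

Each position would correspond to one 2D image of the object, so a multi-view sample is simply a collection of ordinary images, one per viewpoint.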

