Speaking Code: PointNet

Published in Deem.blogs · 5 min read · Jul 11, 2022

Deep Learning on Point Sets for 3D Classification and Segmentation

This work explores applying deep learning architectures to 3D geometric data such as point clouds or meshes. CNNs usually require regular input formats such as image grids or 3D voxels. Point clouds and meshes are irregular, so they were traditionally converted into regular 3D voxel grids or collections of images. Although you could voxelize the input data, that approach is expensive: the number of cells grows cubically with the grid resolution, and most of those cells end up empty (the grid is very sparse). PointNet instead takes the point cloud directly as input and outputs either a class label for the entire input or a label for each individual point, the latter for tasks such as part segmentation.
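To get a feel for how sparse a voxelized point cloud can be, here is a minimal NumPy sketch. The helper `voxel_occupancy`, the 1024-point synthetic sphere, and the 32³ grid resolution are all assumptions chosen for illustration; none of them come from the PointNet paper or its code.

```python
import numpy as np

def voxel_occupancy(points, resolution):
    """Fraction of cells in a resolution^3 grid that hold at least one point.

    Hypothetical helper for illustration only.
    """
    # Rescale every axis to [0, 1], then bin coordinates into integer cell indices.
    mins = points.min(axis=0)
    spans = points.max(axis=0) - mins
    spans[spans == 0] = 1.0  # avoid division by zero for flat axes
    cells = np.floor((points - mins) / spans * resolution).astype(int)
    cells = np.clip(cells, 0, resolution - 1)  # the max coordinate lands in the last cell

    # Mark each cell that received a point.
    grid = np.zeros((resolution,) * 3, dtype=bool)
    grid[cells[:, 0], cells[:, 1], cells[:, 2]] = True
    return grid.mean()

rng = np.random.default_rng(0)

# Assumed toy input: 1024 points sampled on the surface of a unit sphere.
pts = rng.normal(size=(1024, 3))
pts /= np.linalg.norm(pts, axis=1, keepdims=True)

occupancy = voxel_occupancy(pts, resolution=32)
print(f"occupied cells: {occupancy:.2%} of {32 ** 3}")
```

With 1024 points and 32,768 cells, at most about 3% of the grid can be occupied, so a dense voxel representation would spend most of its memory and computation on empty space.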

Architecture

PointNet directly receives unordered point sets as input.

A point cloud is represented as a set of 3D points {Pi | i = 1, …, n}, where each point Pi is a vector holding its (x, y, z) coordinates plus optional extra feature channels such as color or surface normals.
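In code, such a set is commonly stored as an array with one row per point. The sketch below assumes a tiny cloud of 5 points with 3 extra color channels (both numbers are made up) and shows what "unordered" means in practice: shuffling the rows describes the same cloud, so any summary of the whole set should not depend on row order. The channel-wise maximum used here is just one example of an order-insensitive aggregation.

```python
import numpy as np

rng = np.random.default_rng(0)

n_points, n_extra = 5, 3  # assumed sizes for illustration
xyz = rng.uniform(-1.0, 1.0, size=(n_points, 3))      # (x, y, z) coordinates
rgb = rng.uniform(0.0, 1.0, size=(n_points, n_extra))  # extra channels, e.g. color

# One row per point Pi: [x, y, z, r, g, b]
cloud = np.concatenate([xyz, rgb], axis=1)
print(cloud.shape)  # (5, 6)

# Reordering the rows gives a different array but the same point set.
shuffled = cloud[rng.permutation(n_points)]

# A symmetric function such as the channel-wise max ignores row order.
summary = cloud.max(axis=0)
summary_shuffled = shuffled.max(axis=0)
print(np.array_equal(summary, summary_shuffled))  # True
```

A function applied to the flattened array instead (for example a plain fully connected layer over all 30 values) would generally give different results for the two orderings, which is why order invariance has to be designed in.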

The architecture of PointNet is shown in the figure above. Let’s try to understand the architecture. First, we see that the input is…

Written by Ching (Chingis)