New SOTA in Real-Time 3D Human Pose Recognition
A new state-of-the-art neural network recognizes 3D human poses in real time.
Estimating a person’s pose and recognizing their action are related tasks, because both depend on representing and analyzing the person’s body. However, most existing models solve these problems separately. The researchers propose a multi-task framework that jointly estimates 2D and 3D poses from images and classifies actions from video.
A single architecture handles both tasks at the state-of-the-art level, while the inference model processes more than 100 frames per second. The proposed network uses separate sets of parameters for pose estimation and for action classification.
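The idea of one architecture serving two tasks can be illustrated with a toy sketch: a shared backbone produces features that feed two task-specific heads, one regressing 3D joint coordinates and one classifying actions. All names, dimensions, and the single-linear-layer "backbone" below are illustrative assumptions, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (assumed for illustration): a 32x32 single-channel frame,
# 16 body joints in 3D, 10 action classes.
FRAME = 32 * 32
FEAT, JOINTS, ACTIONS = 64, 16, 10

W_shared = rng.standard_normal((FRAME, FEAT)) * 0.01       # shared backbone
W_pose = rng.standard_normal((FEAT, JOINTS * 3)) * 0.01    # pose head
W_act = rng.standard_normal((FEAT, ACTIONS)) * 0.01        # action head

def forward(frame):
    """One shared feature vector feeds two task-specific heads."""
    h = np.maximum(frame.ravel() @ W_shared, 0.0)          # backbone + ReLU
    pose = (h @ W_pose).reshape(JOINTS, 3)                 # per-joint 3D coords
    logits = h @ W_act
    action = np.exp(logits - logits.max())
    action /= action.sum()                                 # softmax over actions
    return pose, action

pose, action = forward(rng.standard_normal((32, 32)))
```

In a real multi-task setup the two heads would be trained with separate losses (e.g. a regression loss on joints and a cross-entropy loss on actions) against the same shared features.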
Approach architecture
The workflow of the model consists of the following steps:
- Feature maps are extracted from the input images;
- Feature maps are input to a sequence of convolutional networks that consist of prediction blocks (PB), upscaling and downscaling modules (UU and DU), and skip connections;
- Each PB block outputs predictions for both pose and action, and these predictions are refined by the subsequent blocks.
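The steps above can be sketched as a chain of prediction blocks, each re-estimating the pose and refining the previous block's output (every intermediate output can also be supervised during training). The shapes, the averaging refinement rule, and the block count below are assumptions for demonstration only, not the paper's actual design.

```python
import numpy as np

rng = np.random.default_rng(1)

FEAT, JOINTS = 64, 16  # assumed feature and joint counts

def prediction_block(features, prev_pose, W):
    """One block: produce a fresh 2D-pose estimate and blend it with the
    previous block's estimate (a crude stand-in for refinement)."""
    new_pose = (features @ W).reshape(JOINTS, 2)
    if prev_pose is None:
        return new_pose
    return 0.5 * (prev_pose + new_pose)

# Extracted feature map, flattened, and one weight matrix per block.
features = rng.standard_normal(FEAT)
weights = [rng.standard_normal((FEAT, JOINTS * 2)) * 0.01 for _ in range(4)]

pose = None
intermediate = []  # each block's output could receive its own loss
for W in weights:
    pose = prediction_block(features, pose, W)
    intermediate.append(pose)
```

This mirrors the intermediate-supervision pattern common in stacked pose-estimation networks: every block emits a usable prediction, and later blocks only need to correct the residual error.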
The model was trained entirely on labeled data.
Model performance evaluation
The researchers tested the model on four datasets: MPII, Human3.6M, Penn Action, and NTU RGB+D. The results below show that, on the Human3.6M dataset, the network surpasses previous approaches in the accuracy of classifying actions from video.