Evaluation of PoseNet for applied AI-Fitness applications

Hadi Golkarieh
Published in Optima AI
4 min read · Mar 15, 2019

Pose estimation uses computer vision techniques to estimate human posture in an image or in segments of a video. PoseNet, an open source human pose estimation model, provides the building blocks for real-time human pose detection. It reports confidence scores for the key body parts it detects in an image, such as joints and facial features.
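To make the confidence scores concrete, here is a minimal Python sketch, not the code used in this study. It assumes a detected pose has already been exported as a plain dictionary shaped like the JSON output of the TensorFlow.js PoseNet model: a list of keypoints, each with a `part`, a `score` and a `position`. The sample values, the `confident_keypoints` helper and the 0.5 threshold are illustrative assumptions.

```python
# Hypothetical sample shaped like a PoseNet pose; all values are made up.
SAMPLE_POSE = {
    "score": 0.82,
    "keypoints": [
        {"part": "nose", "score": 0.31, "position": {"x": 252.0, "y": 98.0}},
        {"part": "leftShoulder", "score": 0.97, "position": {"x": 301.0, "y": 180.0}},
        {"part": "rightShoulder", "score": 0.95, "position": {"x": 205.0, "y": 182.0}},
        {"part": "leftKnee", "score": 0.48, "position": {"x": 290.0, "y": 420.0}},
    ],
}


def confident_keypoints(pose, min_score=0.5):
    """Return {part: (x, y)} for keypoints whose score is at least min_score.

    min_score is an assumed cutoff; a real application would tune it.
    """
    return {
        kp["part"]: (kp["position"]["x"], kp["position"]["y"])
        for kp in pose["keypoints"]
        if kp["score"] >= min_score
    }


if __name__ == "__main__":
    # Only the two shoulders pass the default 0.5 cutoff in this sample.
    print(confident_keypoints(SAMPLE_POSE))
```

Dropping low-confidence keypoints before any further processing is one simple way to avoid acting on body parts the model is unsure about.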

Photo by Olia Gozha on Unsplash

In this study we explored the accuracy of this model on day-to-day human movements as well as on a series of non-conforming athletic poses. We asked how accurately the model operates and whether it can detect non-conforming human poses out of the box. Our goal was to evaluate how suitable the model is for real-time applied AI applications in fitness and physical therapy.

In this study we used the TensorFlow implementation of PoseNet as described here. This implementation provides both single-pose and multi-pose detection methods. In the single-pose method the algorithm identifies a single human body in an image and detects its posture. In the multi-pose method the algorithm first identifies all bodies in an image and then performs pose detection for each of them.

We started the evaluation with the single-pose detection scenario. As illustrated below, the model achieves acceptable results in detecting and marking joint positions such as shoulders, elbows, hips, knees and ankles. However, it scores poorly on facial features such as the nose, ears and eyes. What makes this model fascinating is its ability to track joint movements in real time with a relatively high level of accuracy and fluidity. Overall, we found the model's single-pose performance on day-to-day conforming human poses acceptable. The model may be used for simple joint detection applications involving a single human body.
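One way to quantify the gap between joint and facial feature detection is to average the confidence scores per group. The sketch below is an illustration under assumptions, not the procedure used in this study: it expects keypoints as dictionaries with `part` and `score` fields (the shape of the TensorFlow.js PoseNet output), and the `mean_score_by_group` helper and the face/body split are our own hypothetical choices.

```python
from statistics import mean

# The five facial keypoints among PoseNet's 17 parts; everything else
# (shoulders, elbows, wrists, hips, knees, ankles) is treated as "body".
FACE_PARTS = {"nose", "leftEye", "rightEye", "leftEar", "rightEar"}


def mean_score_by_group(keypoints):
    """Return the mean confidence score for face and body keypoints.

    Groups with no keypoints are omitted from the result.
    """
    groups = {"face": [], "body": []}
    for kp in keypoints:
        group = "face" if kp["part"] in FACE_PARTS else "body"
        groups[group].append(kp["score"])
    return {name: mean(scores) for name, scores in groups.items() if scores}


if __name__ == "__main__":
    # Made-up scores for illustration only.
    example = [
        {"part": "nose", "score": 0.25},
        {"part": "leftEye", "score": 0.75},
        {"part": "leftHip", "score": 0.5},
        {"part": "leftKnee", "score": 1.0},
    ]
    print(mean_score_by_group(example))
```

Averaged over many frames, a comparison like this would turn the visual impression from the illustrations into a number that can be tracked across model versions.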

Next, we evaluated the multi-pose detection method. The model was able to identify all the individuals present in a given picture. Detecting their respective joints and facial features, on the other hand, proved somewhat challenging, depending on how close the individuals appeared to each other in the image. Given enough distance between the individuals, the model detects their features more accurately. However, as the bodies appeared closer to each other, the model started to fail in detecting and locating the body features properly. Understandably, as the complexity and density of the features in the input images grew, the model failed to deliver accurate results. Overall, the multi-pose results were mixed. We believe further enhancements are required before this model can be deployed in real-world multi-pose detection applications.
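How close two detected individuals are can be approximated from the keypoints themselves. The following sketch is a heuristic we introduce for illustration, not something measured in this study: it wraps each pose's keypoints in an axis-aligned bounding box and computes the intersection over union (IoU) of two boxes. The `bounding_box` and `box_iou` helpers are hypothetical names, and keypoints are assumed to carry a `position` with `x` and `y` fields.

```python
def bounding_box(keypoints):
    """Return (x_min, y_min, x_max, y_max) enclosing all keypoint positions."""
    xs = [kp["position"]["x"] for kp in keypoints]
    ys = [kp["position"]["y"] for kp in keypoints]
    return min(xs), min(ys), max(xs), max(ys)


def box_iou(a, b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax0, ay0, ax1, ay1 = a
    bx0, by0, bx1, by1 = b
    # Width and height of the overlap region, clamped at zero when disjoint.
    inter_w = max(0.0, min(ax1, bx1) - max(ax0, bx0))
    inter_h = max(0.0, min(ay1, by1) - max(ay0, by0))
    inter = inter_w * inter_h
    union = (ax1 - ax0) * (ay1 - ay0) + (bx1 - bx0) * (by1 - by0) - inter
    return inter / union if union > 0 else 0.0


if __name__ == "__main__":
    # Two made-up people whose boxes overlap by half of each box's width.
    person_a = [{"position": {"x": 0, "y": 0}}, {"position": {"x": 10, "y": 10}}]
    person_b = [{"position": {"x": 5, "y": 0}}, {"position": {"x": 15, "y": 10}}]
    print(box_iou(bounding_box(person_a), bounding_box(person_b)))
```

An IoU near zero means the individuals are well separated, while higher values indicate the kind of crowding under which we saw detection quality degrade.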

The most challenging of the three problems was detecting non-conforming athletic movements such as yoga poses, squats, deadlifts and lunges. For these scenarios we used the single-pose detection method to simplify the problem. As illustrated below, the model failed to deliver accurate results in identifying and locating various features of the athletes. PoseNet provides a general-purpose pose estimation model. However, it may be fine-tuned and trained for specific athletic and fitness applications, which should improve its accuracy for those applications.

Based on our evaluation, the model may currently be used in simple applications where the exact pixel location of joints matters less than their overall positioning relative to each other. For both multi-pose and athletic posture detection applications, the model needs to be further trained and tuned to achieve optimal results. As a result, the out-of-the-box version of this model is not yet suitable for applied AI fitness and physical therapy applications. However, we strongly believe this model is a great baseline that can be further enhanced to support such applications.
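A joint angle is a typical example of a measure that depends on the relative positioning of joints and not on their exact pixel coordinates. The sketch below is a generic geometric illustration, not part of PoseNet or of this study: the `joint_angle` helper is a hypothetical name, and its inputs are assumed to be `(x, y)` tuples taken from three detected keypoints, for instance hip, knee and ankle.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at point b between segments b->a and b->c.

    Each argument is an (x, y) tuple, e.g. hip, knee and ankle positions.
    """
    v1 = (a[0] - b[0], a[1] - b[1])
    v2 = (c[0] - b[0], c[1] - b[1])
    n1 = math.hypot(v1[0], v1[1])
    n2 = math.hypot(v2[0], v2[1])
    if n1 == 0 or n2 == 0:
        raise ValueError("the middle point must differ from both end points")
    cos_angle = (v1[0] * v2[0] + v1[1] * v2[1]) / (n1 * n2)
    # Clamp to [-1, 1] to guard against floating-point drift before acos.
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))


if __name__ == "__main__":
    # Made-up coordinates: a right angle at the middle point.
    print(joint_angle((0, 1), (0, 0), (1, 0)))
```

Because the angle is unchanged when all three points are shifted or scaled together, a measure like this depends only on how the joints sit relative to each other, although it still inherits any error in how the individual joints are located.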

If you are interested in learning more about PoseNet applications, or in applying the technology in your business, please reach us at www.optima.ai. We believe that in the near future pose detection algorithms will unlock great opportunities in personalized fitness and physical therapy. At Optima AI, our goal is to empower organizations to apply cutting-edge technologies in their products and services.
