Weight Prediction Framework from Gait Data

Apr 6 · 3 min read
“Making complex ideas simple.” - Albert Einstein

Accurate weight reporting may seem like a minor issue, hardly worth the buzz, since physical appearance is an obvious signal. Yet people continue to spend a lot of money on applications for weight prediction and weight management. One reason is psychological: continuous weighing and reporting create additional pressure to adopt healthier lifestyle habits. The second reason is accuracy: studies have shown that people usually under-report their weight, which in the long run can have negative consequences.

One company decided to challenge this paradigm and designed a smart shoe insole that collects the user’s gait (movement) data and, at the end of the day, provides a weight estimate through the app. The user just needs to enter their initial weight and place the insole; the rest is completely automatic and based on the movement data.

Now let’s dive deeper into the technical details of how we accomplish that.

Data is collected continuously during the day from several accelerometers and pressure sensors. The raw data from the sensors is fed into the algorithm, where it undergoes aggregation and feature engineering. For each sensor, we created summary features such as the median, min, and max, as well as signal-processing features such as the number of peaks, peak height, and prominence. In addition, interactions between sensors were taken into account through features such as the magnitude and the weighted average of signal values. The result is a table like the following, with one row per 1-minute interval per user.

The feature set example for one sensor per minute per individual
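To make the aggregation step concrete, here is a minimal sketch of how one minute of raw samples could be turned into a feature row. Everything in it is an assumption for illustration: the function names (`find_peaks`, `sensor_features`, `magnitude`, `minute_row`), the column naming scheme, and the very simple peak rule (a sample strictly greater than both neighbours). It uses only the Python standard library and leaves out prominence; in practice a signal-processing library such as SciPy (`scipy.signal.find_peaks`) would be the more usual choice.

```python
import math
import statistics


def find_peaks(signal):
    """Return indices of samples strictly greater than both neighbours."""
    return [
        i
        for i in range(1, len(signal) - 1)
        if signal[i] > signal[i - 1] and signal[i] > signal[i + 1]
    ]


def sensor_features(signal, prefix):
    """Summary and peak-based features for one sensor over one interval."""
    peaks = find_peaks(signal)
    heights = [signal[i] for i in peaks]
    return {
        f"{prefix}_median": statistics.median(signal),
        f"{prefix}_min": min(signal),
        f"{prefix}_max": max(signal),
        f"{prefix}_n_peaks": len(peaks),
        # 0.0 is an arbitrary placeholder when the interval has no peaks
        f"{prefix}_mean_peak_height": statistics.mean(heights) if heights else 0.0,
    }


def magnitude(ax, ay, az):
    """Combine three accelerometer axes into one magnitude signal."""
    return [math.sqrt(x * x + y * y + z * z) for x, y, z in zip(ax, ay, az)]


def minute_row(user_id, minute, sensors):
    """Build one table row: one user, one 1-minute interval, all sensors."""
    row = {"user_id": user_id, "minute": minute}
    for name, signal in sensors.items():
        row.update(sensor_features(signal, name))
    return row


# Made-up samples, not real insole data
row = minute_row(
    user_id="u1",
    minute=0,
    sensors={
        "pressure": [0, 2, 1, 3, 0, 1, 0],
        "acc_mag": magnitude([3, 0, 0], [4, 0, 1], [0, 0, 0]),
    },
)
```

Each call to `minute_row` produces one row of the table described above; stacking the rows for every minute and every user gives the full feature set.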

This approach had serious limitations, which initially threatened the existence of the whole project:

  • Lack of data: the product was still in the design stage, and the team members were collecting the data themselves. Moreover, the idea was to mimic real-life use, which was very challenging because team members had to walk and label their activities at the same time. Last but not least, the small data set limited the models we could use: advanced models such as LSTMs, or even boosted trees, would overfit the data and fail for new users.
  • Bias: the data was very specific to the people who collected it, which limited how well the approach could generalize to a larger population.

These issues were a serious challenge. We tried every model we could, from Bayesian regression to boosted tree regression, and none of them worked. Some models worked for some people, but none generalized or even came close to the target: weight prediction with a maximum error of 3 kg. Meanwhile, we noticed that most of the models shared the same set of important features. This prompted us to concentrate on the top 10 features and build models with only those. That did not work either, but strangely we again saw a similar set of top 3 important features. Finally, after a lot of experiments (and here I really mean a lot), we arrived at a simple linear equation with only 1 feature. Surprisingly, this approach worked not only on the training and available test data sets but also for completely new users with different weights and physiological characteristics.

Linear regression formula: y = a + b·x + e, where y is the weight, a is the intercept, b is the coefficient, x is the feature value, and e is the random error
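As a rough illustration, a one-feature linear model like this can be fitted with ordinary least squares and then checked against the 3 kg target on users it has never seen. The sketch below is built on assumptions: the article does not say which feature was selected, what the fitted values of a and b were, or how validation was done, so the helper names (`fit_line`, `predict`, `leave_one_user_out_max_error`), the leave-one-user-out scheme, and the numbers are all hypothetical.

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = a + b * x; returns (a, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    b = sxy / sxx
    a = mean_y - b * mean_x
    return a, b


def predict(a, b, x):
    """Predicted weight for a single feature value."""
    return a + b * x


def leave_one_user_out_max_error(data):
    """Fit on all users but one, test on the held-out user, and
    return the largest absolute error (kg) seen across all users.

    data maps user_id -> (feature values, measured weights in kg).
    """
    worst = 0.0
    for held_out in data:
        train_x, train_y = [], []
        for user, (xs, ys) in data.items():
            if user != held_out:
                train_x.extend(xs)
                train_y.extend(ys)
        a, b = fit_line(train_x, train_y)
        test_x, test_y = data[held_out]
        for x, y in zip(test_x, test_y):
            worst = max(worst, abs(predict(a, b, x) - y))
    return worst


# Made-up, perfectly linear data (weight = 10 + 2 * feature), for illustration only
data = {
    "u1": ([25.0, 26.0], [60.0, 62.0]),
    "u2": ([35.0], [80.0]),
    "u3": ([45.0, 44.0], [100.0, 98.0]),
}

max_error_kg = leave_one_user_out_max_error(data)
meets_target = max_error_kg <= 3.0  # the 3 kg maximum-error target from the text
```

Holding out whole users, rather than random rows, is what makes a check like this speak to new users: the model never sees any data from the person it is tested on.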

With just this simple formula, we were able to solve the problem of weight prediction from gait data. Our results emphasize the importance of understanding the data, using it to its full potential, and interpreting the model.

Analytics Vidhya
