How to build a Physical Activity Recommendation Engine?

Oussama Tchita
Ad Scientiam Stories
11 min read · Mar 15, 2022
Photo by Jozsef Hocza on Unsplash

Standing an hour a day keeps the doctor away. This is the conclusion reported in a 2020 study by Jain et al., which demonstrated that simply standing for more than an hour a day, without necessarily exercising, is enough to significantly reduce the risk of premature death [1].

According to recent estimates of the World Health Organization, physical inactivity is directly responsible for approximately 10% of premature deaths worldwide, an impact similar to that of smoking and obesity [2].

Indeed, this percentage is far too large. Numerous studies have clearly shown that high levels of physical inactivity are associated with a significant increase in the risk of cardiovascular disease, type 2 diabetes and certain types of cancer [3].

There is no doubt that prolonged bouts of physical inactivity, such as sitting or lying down for too long, are very bad for health, and it is absolutely necessary to break this bad habit to improve the health of the general population.

To that end, at Ad Scientiam we decided to develop a Software as a Medical Device integrating a solution that helps patients with cardiovascular disease after cardiac rehabilitation, although it can easily be generalized to all kinds of profiles or conditions.

This solution provides each user with a personalized recommendation system tailored to both their preferences and their health status. Our ultimate goal is to improve patients’ arterial pressure following cardiologist recommendations, which would contribute to a net decrease in relapses.

Let’s start with a broad introduction of the different components of our Physical Activity Recommendation Engine (PARE).

The architecture of the Physical Activity Recommendation Engine (PARE). Yellow: the standard input to the system. Red: the engine core components. Green: the information provided by the user. Blue: the system output.

A list of physical activities and the patient’s medical information are fed into the PARE, which provides a customized weekly activity program as output. Between these two elements is the engine core, which is in turn composed of three major components:

  • The Activity Filtering System (AFS): it filters out unsafe activities and highlights the most beneficial ones.
  • The Activity Ranking System (ARS): it predicts activity preference scores, and provides a ranking of the activities that are most likely to be selected by the user.
  • The Intensity Recommendation System (IRS): it provides an intensity recommendation associated with each recommended safe activity.

The recommended activities (with their associated intensities) provided by the engine core are then selected by the user to construct a weekly program. During the week when this program is followed, we collect all kinds of feedback from the user regarding their attitude towards the selected activities and the level of exhaustion after performing them. These different pieces of information will then be used to update and fine-tune the deciding components of our recommendation engine i.e. AFS, ARS and IRS.

Now that we have a general idea of how it works, let’s break down each component separately.

Activity Filtering System

The F within AFS stands for filtering i.e. we require an activity list from which we can filter. We have based our work on the updated version of the Compendium of Physical Activities [4]. This compendium provides a comprehensive list of activities, each with its associated intensity and its Metabolic Equivalent of Task (MET). This MET value is a measure of energy expenditure. We’ll circle back to it later on.

Activities are listed in the Compendium as multiples of the resting MET level, and range from 0.9 (sleeping) to 18 METs (running at 10.9 mph). We have customized this compendium into our own version by selecting only pertinent activities and by reorganizing their intensities. Their pertinence was determined via two factors:

  • Their intensity i.e. discarding highly costly activities along with extremely low and inefficient activities. For instance, even though meditating (1 MET) seems gratifying, we will not be recommending that to our users.
  • Their nature i.e. a bundle of activities is simply not recommended for our population. For instance, contact sports should be avoided while the patient is under dual antiplatelet therapy (potentially the case for our target population) due to the risk of bleeding, but they may be considered afterwards [5].

The final list of activities has been thoroughly examined by key opinion leaders who have been involved throughout the process of conception and development of the PARE.

Here’s a snippet of what the compendium looks like after customization:

Once we’ve obtained our custom compendium of activities, we’ll be moving forward with the filtering system. It is based on two essential elements:

  • The patient’s cardiorespiratory fitness (CRF).
  • The European Society of Cardiology (ESC) recommendations filter for risk level determination.

The idea behind the AFS is quite basic in its functioning. We define a MET range for recommended activities and only select activities whose MET values fall within the aforementioned range. Easy as it may seem, there are a few intricacies to it.
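The range check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `Activity` class, the `filter_by_met_range` function, the catalogue entries and the bounds of 3 and 7 METs are all hypothetical and are not taken from the PARE itself.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Activity:
    name: str
    met: float  # energy cost as a multiple of the resting metabolic rate


def filter_by_met_range(activities, met_min, met_max):
    """Keep only the activities whose MET value lies in [met_min, met_max]."""
    return [a for a in activities if met_min <= a.met <= met_max]


# Hypothetical catalogue entries; the MET values are illustrative only.
catalogue = [
    Activity("stretching", 2.3),
    Activity("brisk walking", 4.3),
    Activity("leisure cycling", 6.8),
    Activity("fast running", 12.0),
]

# Hypothetical patient-specific range of 3 to 7 METs.
safe = filter_by_met_range(catalogue, met_min=3.0, met_max=7.0)
print([a.name for a in safe])  # ['brisk walking', 'leisure cycling']
```

In a real system the two bounds would come from the patient's cardiorespiratory fitness and risk level rather than being hard-coded.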

First, we ought to define the MET range, which basically represents the recommended range of energy expenditure that is directly linked to cardiorespiratory fitness improvement. This range is naturally patient dependent and the measure that we use to compute it is the VO2max.

The VO2max or maximal oxygen uptake is the maximum rate of oxygen consumption measured during incremental exercise, i.e. exercise of increasing intensity.

This measure is computed by the patient’s healthcare professional in a controlled setting that looks more or less like this:

SOLSTOCK/GETTY IMAGES

Activities requiring a cardiorespiratory fitness higher than that of the patient are excluded from program generation, so that the patient is never put at risk by a recommendation for an overly strenuous activity.

VO2max is thus accepted as the criterion measure of CRF and is converted into a MET to determine the appropriate target activity range.
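As a small worked example of that conversion: by convention, 1 MET corresponds to an oxygen uptake of about 3.5 mL of O2 per kilogram of body weight per minute, so a relative VO2max can be divided by 3.5 to obtain a peak MET capacity. The function name and the sample value of 28 mL/kg/min are assumptions made for illustration.

```python
RESTING_VO2_ML_PER_KG_MIN = 3.5  # conventional oxygen cost of 1 MET


def vo2max_to_mets(vo2max_ml_per_kg_min):
    """Convert a relative VO2max (mL O2/kg/min) into a peak MET capacity."""
    if vo2max_ml_per_kg_min <= 0:
        raise ValueError("VO2max must be positive")
    return vo2max_ml_per_kg_min / RESTING_VO2_ML_PER_KG_MIN


# Hypothetical patient with a measured VO2max of 28 mL/kg/min.
peak_mets = vo2max_to_mets(28.0)
print(peak_mets)  # 8.0
```

A patient with this capacity should not be offered activities whose MET value exceeds 8. How far below that ceiling the recommended range sits is a clinical decision that this sketch does not attempt to model.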

Following ESC guidelines along with the patient’s medical information, we establish a MET range within which all recommended activities must fall. The filtering procedure then simply selects the activities whose energy expenditures are within that range.

Activity Ranking System

The aim of the ARS is to identify the top few activities in a large catalog that have the highest probabilities of being selected by the user. We’ve put in place a hybrid approach that will allow us to iteratively propose activity recommendations to the patient, based on their previous choices, their preferences and their habits. It is a combination of a user-based filtering approach for initial activity recommendation and a content-based approach for updating preferences throughout the use of the application.

User-based filtering approach

User-based collaborative filtering is a technique used to predict the items that a user might like, based on the ratings given to those items by other users with similar tastes.

This approach, however, requires user-based data beforehand. To that end, Ad Scientiam has conducted an online survey to collect respondents’ physical activity preferences data along with their corresponding profile information (age, gender, living environment etc.). We’ve collected over 700 different responses. Using that data, we’ve trained a recommender model using TensorRec. This model provides each user with an initialisation of activity preferences based on their profile information. This mapping is further refined using the content-based filtering approach.

Content-based filtering approach

Content-based filtering projects the recommendation problem into an item space which in our case is the space of all possible physical activities to be recommended by the algorithm. It explores the relationship between physical activities instead of the relationship between users. In most real-life applications, the number of items is dwarfed by the number of users and in many cases, these items are static: the set of items changes much less frequently than the set of users. This allows for a decoupling of the fitting stage and the prediction stage of a content-based collaborative filtering model.

The concept of similarity is a critical element of the collaborative filtering framework. For user-based collaborative filtering algorithms, the user-similarity matrix consists of a metric that measures the distance between any pair of user preferences. Likewise, the item-similarity matrix measures the similarity between any pair of items in the content-based framework i.e. physical activities.

Similarity between items such as physical activities may seem abstract, yet we do have multiple means of quantifying it. In our case, we’ve decided to use the Cosine similarity measure:
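For two activities described by tag vectors $a$ and $b$ of length $n$, the standard definition of cosine similarity is the following (the symbols $a$, $b$ and $n$ are notation chosen here for illustration):

```latex
\[
\operatorname{sim}(a, b) = \cos(\theta)
= \frac{a \cdot b}{\lVert a \rVert \, \lVert b \rVert}
= \frac{\sum_{k=1}^{n} a_k b_k}{\sqrt{\sum_{k=1}^{n} a_k^2}\,\sqrt{\sum_{k=1}^{n} b_k^2}}
\]
```

With binary vectors, the numerator simply counts the tags that two activities share, and the result always lies between 0 (no tag in common) and 1 (identical tags).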

This measure relies on the different binary tags associated with each activity found in the main activity list (indoor/outdoor, ball use, water-based, requires gym membership, endurance-based…). Thus, for each activity we obtain a binary vector of tags.

We’ve specified over 50 binary tags per activity that have been added to our custom compendium of physical activity. This allows us to compute the item-similarity matrix following the aforementioned formula. Below is a dummy data example of how that matrix is represented:
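To make the computation concrete, here is a minimal sketch of an item-similarity matrix built from binary tag vectors with cosine similarity. The three activities, the four tags and the function names are hypothetical. The real compendium uses more than 50 tags per activity.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length numeric vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # an activity with no tags is treated as similar to nothing
    return dot / (norm_a * norm_b)


def item_similarity_matrix(tag_vectors):
    """Pairwise cosine similarities, as a dict of dicts keyed by activity name."""
    names = list(tag_vectors)
    return {
        i: {j: cosine_similarity(tag_vectors[i], tag_vectors[j]) for j in names}
        for i in names
    }


# Hypothetical tags, in order: outdoor, ball use, water-based, endurance-based.
tags = {
    "swimming": [0, 0, 1, 1],
    "running": [1, 0, 0, 1],
    "football": [1, 1, 0, 1],
}

similarity = item_similarity_matrix(tags)
print(round(similarity["running"]["football"], 3))  # 0.816
print(round(similarity["swimming"]["running"], 3))  # 0.5
```

Because the matrix only depends on the tags, it can be computed once and reused until the activity list changes.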

If we decide to apply the above-mentioned model as is, we might risk a limited exploration of the activity space i.e. a patient may potentially like swimming but they will never know, unless they decide to try out a ‘similar’ activity autonomously.

In order to prevent this space exploration limitation, we introduce an exploration/exploitation trade-off. It is expressed as the probability with which an activity is selected entirely at random instead of according to its predicted preference score. This probability increases as we go down the ranking of our proposed recommendations.
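One possible way to implement such a rank-dependent trade-off is sketched below. The linear schedule, the probabilities of 0.05 and 0.5 and the function names are assumptions made for illustration. The article does not specify how the probability actually grows along the ranking.

```python
import random


def exploration_probability(rank, n_slots, eps_first=0.05, eps_last=0.5):
    """Probability of a random pick, growing linearly from the first to the last slot."""
    if n_slots == 1:
        return eps_first
    return eps_first + (eps_last - eps_first) * rank / (n_slots - 1)


def recommend(ranked, n_slots, rng):
    """Fill n_slots positions, exploiting the ranking or exploring at random."""
    remaining = list(ranked)  # best predicted preference first
    picks = []
    for rank in range(min(n_slots, len(remaining))):
        if rng.random() < exploration_probability(rank, n_slots):
            choice = rng.choice(remaining)  # explore: any not-yet-picked activity
        else:
            choice = remaining[0]  # exploit: best remaining activity
        remaining.remove(choice)
        picks.append(choice)
    return picks


# Hypothetical ranking produced by the preference model.
ranking = ["walking", "cycling", "swimming", "yoga", "rowing"]
print(recommend(ranking, n_slots=3, rng=random.Random(0)))
```

The top slot almost always shows the best-ranked activity, while lower slots are increasingly likely to surface something the user has not been steered towards yet.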

Intensity Recommendation System

As stated earlier, the AFS provides a safe activity list for each user. This list is fed to the IRS along with the user’s medical information (VO2max, risk factors, etc.) in order to obtain intensity recommendations for each safe activity. The output translates as follows:

  • An amount of activity to be performed during the week.
  • An intensity in terms of either effort (light/moderate/vigorous) or target speed (km/h, wattage, etc.) depending on the activity.

The 2019 ESC guidelines [6] define a target level of 150–300 minutes of moderate-intensity physical activity per week, regardless of the related MET value in a particular activity. This recommendation can also be considered as 500 to 1000 MET-minutes (MET-min) per week which is obtained by multiplying the time (in minutes) spent performing the activity by the MET value of the activity — based on the 2011 compendium [4].
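As a quick illustration of the MET-minute arithmetic, the sketch below adds up a hypothetical week of sessions. The session durations and MET values are made up for the example.

```python
def met_minutes(sessions):
    """Total weekly volume: sum of duration (min) x MET value over all sessions."""
    return sum(minutes * met for minutes, met in sessions)


# Hypothetical week: (duration in minutes, MET value of the activity).
week = [(60, 4.0), (45, 6.0), (30, 3.5)]
total = met_minutes(week)
print(total)  # 615.0
```

Here 60 × 4.0 + 45 × 6.0 + 30 × 3.5 = 615 MET-min, which falls inside the 500 to 1000 MET-min target.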

Each subject thus has a pre-established weekly amount of activity in MET-minutes. This amount is then adjusted weekly depending on the subject’s performance during the previous week: clearly exceeding the target leads to a slight increase in the activity amount, and clearly falling short of it leads to a slight decrease. This amount is of course bounded on both ends following the ESC guidelines.
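A minimal sketch of such a weekly adjustment might look as follows. The 5% step, the 10% tolerance band and the function name are hypothetical choices. Only the 500 and 1000 MET-min bounds come from the text above.

```python
MIN_WEEKLY_MET_MIN = 500.0   # lower bound quoted in the article
MAX_WEEKLY_MET_MIN = 1000.0  # upper bound quoted in the article


def next_weekly_target(current_target, achieved, step=0.05, tolerance=0.10):
    """Nudge the weekly MET-minute target up or down, then clamp it."""
    ratio = achieved / current_target
    if ratio > 1 + tolerance:
        proposed = current_target * (1 + step)  # clear outperformance
    elif ratio < 1 - tolerance:
        proposed = current_target * (1 - step)  # clear underperformance
    else:
        proposed = current_target  # close enough: keep the target as is
    return min(MAX_WEEKLY_MET_MIN, max(MIN_WEEKLY_MET_MIN, proposed))


# Hypothetical subject who was assigned 600 MET-min and achieved 720.
print(next_weekly_target(600.0, 720.0))
```

The clamp at the end is what keeps the target inside the guideline range no matter how many consecutive weeks of over- or underperformance are observed.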

The amount of activity is spread out across the activities selected by the user. The importance of not exceeding the recommended amount of exercise is also highlighted, as overexercising can endanger the subject.
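One simple way to spread the weekly amount is to give every selected activity an equal share of the MET-minute budget and convert each share into minutes. The equal split is an assumption made for illustration, as are the function name and the sample values. The article does not say how the PARE distributes the amount.

```python
def minutes_per_activity(weekly_met_minutes, selected):
    """Split the weekly MET-minute budget equally and convert each share to minutes."""
    if not selected:
        return {}
    share = weekly_met_minutes / len(selected)
    return {name: share / met for name, met in selected.items()}


# Hypothetical selection: activity name -> MET value.
plan = minutes_per_activity(600, {"brisk walking": 4.0, "leisure cycling": 6.0})
print(plan)  # {'brisk walking': 75.0, 'leisure cycling': 50.0}
```

Under this scheme, a more demanding activity is prescribed for fewer minutes, and the total never exceeds the weekly budget.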

Feedback

The feedback loop is a vital part of the PARE as it ensures that the three major components (AFS, ARS and IRS) are updated weekly throughout the use of the application. The collected feedback data is either targeted interaction logs with the application or simply straightforward questionnaires.

Multiple types of information are collected throughout the week; we can divide them up according to their impact vis-à-vis the aforementioned components. Two separate categories of feedback are hereby presented:

1. Preference-based feedback

  • The application collects the information regarding the activities selected by the user for their previous week. This information is concatenated with the list of activities that were actually recommended during said week in order to tweak the activity ranking.
  • An activity feedback questionnaire is also given to the user in order to obtain ratings of the performed activities of the week.

→ The preference-based feedback information updates the activity rankings (ARS), taking into account not only the previous week’s feedback but all earlier feedback as well. The significance of each piece of feedback decreases as newer feedback comes along, meaning that more recent feedback has a larger impact on activity rankings than older feedback.
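A common way to make older feedback count less is exponential decay, sketched below. The decay factor of 0.7, the 1 to 5 rating scale and the function name are assumptions. The article does not state which weighting scheme the ARS uses.

```python
def decayed_score(weekly_ratings, decay=0.7):
    """Weighted mean of ratings where a rating `age` weeks old weighs decay**age."""
    if not weekly_ratings:
        raise ValueError("at least one rating is required")
    newest_first = list(reversed(weekly_ratings))
    weights = [decay ** age for age in range(len(newest_first))]
    weighted_sum = sum(w * r for w, r in zip(weights, newest_first))
    return weighted_sum / sum(weights)


# Hypothetical ratings of one activity, oldest week first.
print(round(decayed_score([1, 5]), 2))  # 3.35
```

With ratings of 1 and then 5, the plain average would be 3, whereas the decayed score is pulled towards the more recent rating.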

2. Performance-related feedback

  • Directly after the conclusion of each activity, we collect its duration as well as the subject’s exhaustion level within the hour. The exhaustion level is expressed on the Borg scale, which represents the rating of perceived exertion (RPE). Using this value and the subject’s VO2max, we can recompute the number of METs spent during said activity and correct it in the compendium.

→ This allows the activity MET values to be even more subject specific in addition to the profile information used in its computation. These new activity METs will directly impact the filtering process in the AFS as well as the intensity recommendation (IRS).

Closing thoughts

Throughout each step of our proposed model, two fundamental questions regarding physical activity recommendation have been answered:

  • Safety: How do we make sure that our recommendations don’t put the user at risk?
  • Adherence: How do we make sure that the user sustainably keeps up their physical training program?

It’s not a mystery to us that physical activity is beneficial. However, we may underestimate how beneficial it can be when performed right. It can also have a much larger impact on certain populations.

Following medical procedures, cardiac patients are often left to their own devices with little or no follow-up when it comes to physical activity training. Our mission here at Ad Scientiam is to make a measurable difference for these patients, to adapt to their needs and to provide them with the care they deserve.

I hope you’ve enjoyed the article. Feel free to share your thoughts and feedback.

References

[1] Purva Jain, MPH, John Bellettiere, MPH, PhD, Nicole Glass, MPH, Michael J LaMonte, MPH, PhD, Chongzhi Di, PhD, Robert A Wild, MD, MPH, PhD, Kelly R Evenson, MS, PhD, Andrea Z LaCroix, MPH, PhD, The Relationship of Accelerometer-Assessed Standing Time With and Without Ambulation and Mortality: The WHI OPACH Study, The Journals of Gerontology: Series A, Volume 76, Issue 1, January 2021, Pages 77–84, https://doi.org/10.1093/gerona/glaa227

[2] Lee IM, Shiroma EJ, Lobelo F, et al. Effect of physical inactivity on major non-communicable diseases worldwide: an analysis of burden of disease and life expectancy. Lancet. 2012;380(9838):219–229. doi:10.1016/S0140-6736(12)61031-9

[3] Biswas A, Oh PI, Faulkner GE, Bajaj RR, Silver MA, Mitchell MS, Alter DA. Sedentary time and its association with risk for disease incidence, mortality, and hospitalization in adults: a systematic review and meta-analysis. Ann Intern Med. 2015 Jan 20;162(2):123–32. doi: 10.7326/M14-1651. Erratum in: Ann Intern Med. 2015 Sep 1;163(5):400. PMID: 25599350.

[4] Ainsworth BE, Haskell WL, Herrmann SD, Meckes N, Bassett DR Jr, Tudor-Locke C, Greer JL, Vezina J, Whitt-Glover MC, Leon AS. 2011 Compendium of Physical Activities: a second update of codes and MET values. Med Sci Sports Exerc. 2011 Aug;43(8):1575–81. doi: 10.1249/MSS.0b013e31821ece12. PMID: 21681120.

[5] Mats Borjesson, Mikael Dellborg, Josef Niebauer, Andre LaGerche, Christian Schmied, Erik E Solberg, Martin Halle, Emilio Adami, Alessandro Biffi, Francois Carré, Stefano Caselli, Michael Papadakis, Axel Pressler, Hanne Rasmusen, Luis Serratosa, Sanjay Sharma, Frank van Buuren, Antonio Pelliccia, Recommendations for participation in leisure time or competitive sports in athletes-patients with coronary artery disease: a position statement from the Sports Cardiology Section of the European Association of Preventive Cardiology (EAPC), European Heart Journal, Volume 40, Issue 1, 01 January 2019, Pages 13–18, https://doi.org/10.1093/eurheartj/ehy408

[6] Stavros V Konstantinides, Guy Meyer, Cecilia Becattini, Héctor Bueno, Geert-Jan Geersing, Veli-Pekka Harjola, Menno V Huisman, Marc Humbert, Catriona Sian Jennings, David Jiménez, Nils Kucher, Irene Marthe Lang, Mareike Lankeit, Roberto Lorusso, Lucia Mazzolai, Nicolas Meneveau, Fionnuala Ní Áinle, Paolo Prandoni, Piotr Pruszczyk, Marc Righini, Adam Torbicki, Eric Van Belle, José Luis Zamorano, ESC Scientific Document Group, 2019 ESC Guidelines for the diagnosis and management of acute pulmonary embolism developed in collaboration with the European Respiratory Society (ERS): The Task Force for the diagnosis and management of acute pulmonary embolism of the European Society of Cardiology (ESC), European Heart Journal, Volume 41, Issue 4, 21 January 2020, Pages 543–603, https://doi.org/10.1093/eurheartj/ehz405
