From Suffer Score to Relative Effort
One of our most popular Premium features is Suffer Score, a way to quantify effort during an activity using heart rate data. Although Suffer Score has become a popular source of insight into an activity, it did not always score similar efforts consistently across sports and athletes. The main problems were:
- The model wasn’t well suited for comparing effort across different sports. Running efforts were scored higher than rides and swims of the same length and intensity, so a similar effort earned a higher Suffer Score as a run than as a ride of similar duration.
- Length of the effort was weighted much more heavily than intensity. Shorter, more intense efforts had lower Suffer Scores than longer, less intense ones, even if the shorter effort had a much greater training effect.
Working with Dr. Marco Altini, we developed a model to generate a new version of Suffer Score, called Relative Effort. This powers new Premium features that provide insight into week-over-week training load. The new model follows the same approach of using time spent in different heart rate zones to score an activity, but differs in a few key ways: it weights intensity of effort more than duration, is better at weighting equivalent efforts across sports, and scores equivalent efforts by different athletes similarly. The result is that Strava athletes are better able to use this metric to compare activity effort across workout types, sport types and even individuals.
Tuning the Model
To calculate Relative Effort, the model takes in a stream of data with an athlete’s heart rate and the corresponding timestamp as inputs. Using the athlete’s max heart rate (either entered by the athlete or estimated), this heart rate data is divided into a number of zones that approximate different levels of cardiovascular intensity. For each heart rate zone, the model applies a coefficient to weight the time spent in that zone. The higher the heart rate zone, the harder the effort, so higher zones get a higher coefficient that weights time spent there more heavily.
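The zone-weighting idea above can be sketched in a few lines of Python. This is only an illustration: the zone boundaries (as fractions of max heart rate), the per-minute coefficients, and the function names are hypothetical placeholders, not the values or code used in the actual model.

```python
from bisect import bisect_right

# ASSUMPTION: upper bounds of zones 1-4 as fractions of max heart rate.
# Anything at or above the last bound falls into zone 5.
ZONE_UPPER_BOUNDS = [0.60, 0.70, 0.80, 0.90]

# ASSUMPTION: one weight per zone, applied per minute spent in that zone.
# Higher zones get larger weights so intensity counts more than duration.
ZONE_COEFFICIENTS = [0.5, 1.0, 2.0, 4.0, 8.0]


def time_in_zones(samples, max_hr):
    """Return seconds spent in each zone.

    samples: list of (timestamp_seconds, heart_rate_bpm), sorted by time.
    Each interval is attributed to the zone of the sample that starts it.
    """
    seconds = [0.0] * len(ZONE_COEFFICIENTS)
    for (t0, hr), (t1, _) in zip(samples, samples[1:]):
        zone = bisect_right(ZONE_UPPER_BOUNDS, hr / max_hr)
        seconds[zone] += t1 - t0
    return seconds


def relative_effort(samples, max_hr):
    """Weighted sum of minutes per zone."""
    seconds = time_in_zones(samples, max_hr)
    return sum(c * s / 60.0 for c, s in zip(ZONE_COEFFICIENTS, seconds))


# One minute each in zones 1, 3 and 5 for an athlete with a max HR of 200.
stream = [(0, 100), (60, 150), (120, 190), (180, 190)]
print(relative_effort(stream, max_hr=200))
```

With these placeholder weights, a minute in the top zone contributes sixteen times as much as a minute in the bottom zone, which is the kind of intensity-over-duration weighting the model is aiming for.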
To achieve our goal of providing similar values for athletes producing similar efforts, we leveraged Strava’s rich data set to find a group of activities from athletes giving a roughly equivalent effort. To tune on a subset of activities from thousands of different athletes, we assumed that Athlete A giving an all-out effort is under the same relative stress as Athlete B giving an all-out effort, even if the athletes have different absolute fitness levels. These two athletes should then have similar Relative Effort values. After testing 5k, 10k, and half marathon distances, we settled on using 10k running race efforts to tune the coefficients applied to each zone. Running removes much of the variability associated with sports like cycling, where drafting and coasting can result in significant changes in effort and heart rate. The 10k distance is short enough that an athlete is consistently in a high heart rate zone, yet long enough for heart rate data to settle into a consistent pattern and avoid the noise often found in shorter events.
With a large subset of 10k running races on Strava, we iterated through thousands of different coefficient combinations for the heart rate zones, with the goal of minimizing the variance of Relative Effort values across our data set. By minimizing the variance on a set of equivalent efforts, we sought to increase the precision of the model and rate equivalent efforts equally. We measured spread with the coefficient of variation (standard deviation / mean) rather than raw variance, because it controls for the way scores simply grow when testing higher coefficients. The new coefficients lowered the coefficient of variation from 0.44 to 0.39, an improvement of about 12%. This tighter spread makes the Relative Effort model more consistent and gives our athletes a better representation of their efforts.
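As a rough sketch of this tuning loop, the snippet below runs an exhaustive search over candidate coefficients and keeps the combination with the lowest coefficient of variation. The data layout (minutes per zone for each race), the candidate grid, and the rule that coefficients must not decrease from one zone to the next are assumptions made for the example, not details from the actual tuning process.

```python
from itertools import product
from statistics import mean, pstdev


def coefficient_of_variation(values):
    """Standard deviation divided by mean; unaffected by uniform scaling."""
    return pstdev(values) / mean(values)


def score(zone_minutes, coefficients):
    """Weighted sum of minutes per zone for one activity."""
    return sum(c * m for c, m in zip(coefficients, zone_minutes))


def tune_coefficients(activities, candidate_values):
    """Try every combination and return (best_cv, best_coefficients).

    activities: list of per-zone minutes, one entry per race effort.
    candidate_values: one list of positive candidate weights per zone.
    """
    best = None
    for coeffs in product(*candidate_values):
        # ASSUMPTION: a harder zone should never weigh less than an easier one.
        if any(a > b for a, b in zip(coeffs, coeffs[1:])):
            continue
        cv = coefficient_of_variation([score(a, coeffs) for a in activities])
        if best is None or cv < best[0]:
            best = (cv, coeffs)
    return best


# Toy data: three "equivalent" efforts described by minutes in two zones.
races = [(10, 30), (30, 20), (20, 25)]
best_cv, best_coeffs = tune_coefficients(races, [[1, 2], [2, 3]])
print(best_cv, best_coeffs)
```

Because the coefficient of variation divides by the mean, doubling every coefficient leaves it unchanged, so the search rewards combinations that bring equivalent efforts closer together rather than ones that just inflate scores.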
Extending the Model to Different Sports
The second objective of rebuilding Relative Effort was to better approximate efforts across activities. To do this, we used activities from Olympic distance triathlons as a way to find equivalent efforts to tune the model across sports. Although athletes don’t give perfectly equal effort in each leg, we think that in aggregate those activities give us a reasonably hard effort for comparison. This was particularly valuable for validating the use of a lower set of heart rate zones for cycling activities and a lower max heart rate for swim activities.
To make Relative Effort scores comparable between run and ride, we compared running and cycling activities from the Olympic distance triathlon dataset to see how the scores line up under the new ride zones. The ride zones are set lower than the run zones, because heart rates in cycling are lower than in weight-bearing activities like running. We found a median score of 211 for run and 190 for ride. Even though the ride median is slightly lower than the run median, we’re comfortable with this because a cycling effort during a triathlon is usually submaximal, to save some energy for the run.
We wanted to make sure we had a good approximation between ride and run before we moved on to swimming. Heart rate based training isn’t as common for swimming, and the literature on swim heart rate zones isn’t as well developed as it is for run and ride. Additionally, swimming max heart rate has been described as 10–12 beats per minute lower than in other activities. Given the non-weight-bearing nature of swimming, we used cycling zones and tested a mix of lower max heart rates and a lower lactate threshold heart rate. We settled on using cycling heart rate zones with a max heart rate 12 beats lower than the recorded value, based on comparing the Relative Effort distributions from swim, ride, and run legs of the triathlon data set.
For Strava’s 20 other supported activity types, there isn’t extensive research on relative heart rate differences between activities, so we assigned each activity either run or ride HR zones. The choice is roughly based on whether the sport is weight-bearing and bipedal (Nordic ski, hike), in which case we assigned the run zones, or not weight-bearing (windsurf, handcycle, kayak) with lower expected heart rates, in which case we applied the cycling zones.
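The per-sport rules described in this section can be summarized as a small lookup. The sketch below is illustrative only: the sport identifiers, the set names, and the function are hypothetical, and only the sports named in the text are listed.

```python
# From the text: swims use cycling zones with max HR lowered by 12 bpm.
SWIM_MAX_HR_OFFSET = 12

# ASSUMPTION: sport identifiers are made up for this example.
WEIGHT_BEARING = {"run", "nordic_ski", "hike"}               # run zones
NON_WEIGHT_BEARING = {"ride", "windsurf", "handcycle", "kayak"}  # ride zones


def zone_profile(sport, max_hr):
    """Return (zone_set, effective_max_hr) for an activity type."""
    if sport == "swim":
        return ("ride", max_hr - SWIM_MAX_HR_OFFSET)
    if sport in WEIGHT_BEARING:
        return ("run", max_hr)
    if sport in NON_WEIGHT_BEARING:
        return ("ride", max_hr)
    raise ValueError(f"no zone profile defined for sport: {sport!r}")


print(zone_profile("swim", 190))
print(zone_profile("hike", 190))
print(zone_profile("kayak", 190))
```

Raising an error for unlisted sports, instead of silently defaulting to one zone set, is a design choice made for the example so that an unclassified activity type is noticed.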
Leveraging Strava’s extensive activity dataset, we were able to tune and extend Relative Effort to better quantify Strava athletes’ efforts, in turn powering a compelling new product for our athletes to help guide their training.
Thanks to Tommy Gaidus, Ethan Hollinshead, Kyle Yugawa, Varun Pemmaraju, and J Evans for their significant contributions and guidance on this project.