Building Local Legends
The Birth of a New Feature at Strava
Most Strava athletes love competition! Pushing hard to beat someone in the final stretch of a race. Achieving a new personal best at your local road race. Or perhaps simply pushing yourself to keep up with your training partner in a workout. Athletes crave these moments, and that craving drives their daily training. Many years ago, Strava created a way for athletes to compete with each other on their neighborhood trails with Strava segments. Since then, segments have become a core part of the Strava experience for many athletes.
A New Way to Compete
More recently, in late 2019, Strava’s product development team began forming ideas about a new way to compete on Strava. Strava segments have always provided a way to recognize athletic achievement through a form of racing. Now, we wanted to also recognize the grit that goes into training. Those athletes who are out on the trail every day, rain or shine, striving for more. Local Legends is the result of these ideas, recognizing the athlete who’s completed a segment the most times in the last 90 days. I was lucky enough to play an important role in building this feature, and in this blog post, I want to tell the story of how we developed Local Legends and explore some of the technical challenges we faced along the way.
Our product team began by exploring many different ideas for features we might develop. We knew we wanted to develop a feature to serve athletes who wanted to compete in new ways on Strava. The idea of competing to see who’s completed a particular segment the most times wasn’t an entirely new idea. In fact, some Strava employees have kept a running tally for years to see who’s done a particular segment near the office the greatest number of times. But as our product team narrowed down some of their brainstorming work and did some user research, the idea for Local Legends held our attention because it was an interesting new form of competition. A level playing field for athletes who might never be the King or Queen of the Mountain with the fastest time, but who are out on the trails more often than anyone else.
Can We Build It?
As soon as our product team had a very rough idea of the feature we wanted to develop, they started consulting with our engineering team. At this point, we knew we wanted to build something that would allow athletes to compete with one another based on the number of times they’ve done a segment rather than their fastest time. But that’s all we knew. We hadn’t yet decided what the timeframe of competition should be (all-time vs yearly vs monthly vs other). We hadn’t yet decided what the leaderboard would look like (would it be a traditional leaderboard, or were there alternatives we should consider?). And most importantly, we didn’t yet know if it would be possible to build this feature in a reasonable amount of time, from an engineering perspective.
Building a leaderboard based on the number of segment efforts an athlete has done sounds easy — and in some ways it is, at least at a small scale. But Strava doesn’t operate at a small scale. We celebrated our 4-billionth activity upload on September 2, 2020! At a small scale, you might be able to simply query the database with something like this.
SELECT athlete_id, COUNT(*)
FROM segment_efforts
WHERE segment_id = ?
GROUP BY athlete_id
ORDER BY COUNT(*) DESC
And that would probably work just fine, up to thousands or even millions of segment efforts. But Strava’s several orders of magnitude beyond that scale. In fact, we have significantly more segment efforts than activity uploads because a single activity often traverses multiple segments. At this scale, the database can’t provide the information we need fast enough when we query it in this way. Even with proper indexing and other optimization techniques, it takes too much time to count the efforts of (possibly) thousands of athletes and sort the results. We simply can’t do that on every request for a leaderboard.
So, as software engineers, what do we do? We don’t want to go back to our product team and say, “Sorry, we can’t build that.” Instead, we want to provide some alternatives that will work. So that’s what we did. There are numerous possibilities for workarounds or alternative solutions. If the fundamental problem is that we can’t calculate the leaderboard fast enough for every request, an obvious solution is to avoid calculating it on every request — for example, by updating it on a schedule. We could, perhaps, recalculate the leaderboard daily. The leaderboard would be frozen in time for, say, 24 hours — at which point it would be recalculated with the most recent data. It’d be a compromise, but from an engineering perspective at least it’s a feasible option. But there are also other solutions we could consider to avoid calculating the leaderboard on every request. We could denormalize the data so it’s stored closer to the format we need. Instead of the database running an expensive query where it has to count and group results (as above), it could run a cheap query like this.
SELECT athlete_id, effort_count
FROM segment_effort_counts
WHERE segment_id = ?
ORDER BY effort_count DESC LIMIT ?
This seems ideal from a read perspective, but doesn’t come without trade-offs. Essentially, this solution moves some of the load from read-time to write-time. The query to read the leaderboard is fast and efficient, but there’s more work to be done each time a new activity is uploaded. So it’s a feasible option, but as always, not without some costs. We considered some other options as well, including some other types of databases, but ultimately we found the same root problem — the need to count and sort segment efforts efficiently.
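To make the read-time/write-time trade concrete, here’s a minimal sketch of the denormalized approach in Python with an in-memory SQLite database. The table and column names (`segment_effort_counts`, `effort_count`) are illustrative, not Strava’s actual schema:

```python
import sqlite3

# Denormalized table: one row per (segment, athlete) with a running count.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE segment_effort_counts (
        segment_id   INTEGER NOT NULL,
        athlete_id   INTEGER NOT NULL,
        effort_count INTEGER NOT NULL DEFAULT 0,
        PRIMARY KEY (segment_id, athlete_id)
    )
""")

def record_effort(segment_id, athlete_id):
    """Write-time work: bump the athlete's stored count for this segment."""
    conn.execute(
        "INSERT OR IGNORE INTO segment_effort_counts (segment_id, athlete_id) "
        "VALUES (?, ?)",
        (segment_id, athlete_id),
    )
    conn.execute(
        "UPDATE segment_effort_counts SET effort_count = effort_count + 1 "
        "WHERE segment_id = ? AND athlete_id = ?",
        (segment_id, athlete_id),
    )

def leaderboard(segment_id, limit=10):
    """Read-time query: counts are precomputed, so no grouping is needed."""
    return conn.execute(
        "SELECT athlete_id, effort_count FROM segment_effort_counts "
        "WHERE segment_id = ? ORDER BY effort_count DESC LIMIT ?",
        (segment_id, limit),
    ).fetchall()

for athlete in (1, 2, 2, 2, 3, 3):
    record_effort(42, athlete)
print(leaderboard(42))  # → [(2, 3), (3, 2), (1, 1)]
```

Every upload now costs a couple of extra small writes, but reads never have to count anything — exactly the shift of load from read-time to write-time described above.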
Iterating on Our Designs
After some brainstorming around these alternative engineering solutions, we presented them to our product team so we could discuss the trade-offs. We presented the daily recalculation solution and we presented the denormalization solution. Keep in mind — at this point, neither of these were firm solutions! They were rough ideas about how we might approach the problem. In our discussions with our product team, we eliminated the daily recalculation solution fairly quickly. When athletes upload activities to Strava, they don’t want to wait up to 24 hours to see their results. It’s important that when an athlete uploads an activity, their new stats are quickly reflected on Strava. After eliminating the daily recalculation solution, the denormalization solution sounded appealing, and we started exploring it in more depth. Denormalizing data in the way we were discussing required a bounded time window, which hinted at yearly or monthly leaderboards since either would give us a natural boundary around which to store our denormalized data.
We considered several pieces of data in deciding which solution to implement. At Strava, we value data in our decision-making process, and we have a data analyst on most teams to provide insight into decisions like this one. Our data analyst used historical data to simulate what it might look like if we developed all-time Local Legend leaderboards. In many ways, we found exactly what you might expect! Strava was founded in 2009, and many athletes who joined the platform in the early days are still active users today. These athletes tended to dominate our simulated all-time Local Legend leaderboard in a way that often made it impossible for other athletes to compete with them.
In a similar way, we simulated annual and monthly leaderboards, and found that these didn’t have the same sort of problem that the all-time leaderboard did. We needed to settle on a time frame — at least for our first iteration of this feature. We decided that annual leaderboards, while interesting, take too long to reset. An athlete who’s new to an area might need to wait 6 months or more to have a realistic shot at becoming the Local Legend. After validating some of our assumptions with more user research, we settled on 3-month (quarterly) windows for leaderboards. But we also felt that a quarterly reset wasn’t ideal from a product perspective. We wanted to avoid the first couple days of each quarter, when someone would become the Local Legend with nearly zero efforts and the title might shift too frequently from one athlete to another. If you’re familiar with the Local Legends feature, you already know that the solution we found for this problem is a rolling time window. So we finally decided that a 90-day rolling window was the best time frame to create the kind of competition we wanted to facilitate with our initial feature release.
We also considered some other trade-offs as we thought about this feature in more detail. Strava values athlete privacy, so we felt strongly about building privacy into the design of the product. Because of this, we considered some alternatives to a complete leaderboard (in other words, the number of times any athlete had traversed any segment), including several visualization options, as ways to provide some insights about the data while protecting athletes’ privacy better than a complete leaderboard would. We experimented with visualizing the data in a couple different ways, and gravitated toward a histogram. Histograms appealed to us because they exposed the distribution of our data visually. But we didn’t like that they tended to minimize the efforts of the Local Legend (since the Local Legend was often in a histogram bin with only a single athlete). We also found that on a histogram, the first few bins (i.e. athletes who had only done the segment once or twice) were disproportionately large and the rest of the bins (athletes who had done the segment more frequently) were much, much smaller. One of our data analysts suggested experimenting with a logarithmic scale (using historical data, as before), and we found that using a logarithmic scale solved both of these problems for us. The logarithmic scale significantly reduces the difference in visual scale between the number of athletes who had done the segment once and the number of athletes who had done the segment many times. Perhaps more importantly, however, the logarithmic scale shows a large visual difference between bins with 1 and 0 athletes. Practically, this means that the bin with the Local Legend (typically a bin with a single athlete at the right of the graph) is visible even when the scale of the y-axis must also accommodate thousands of athletes in the first bin.
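To illustrate the effect of the logarithmic y-axis, here’s a small sketch. Plain log10 would render a one-athlete bin at height log10(1) = 0 — indistinguishable from an empty bin — so this sketch uses log10(count + 1), one common way to handle that edge case (not necessarily the exact transform Strava ships):

```python
import math

def bar_heights(bin_counts):
    """Log-scale bar heights: empty bins stay at 0, while a bin holding a
    single athlete (e.g. the Local Legend) still gets a visible bar."""
    return [math.log10(c + 1) for c in bin_counts]

# Hypothetical bins: thousands of athletes with 1-2 efforts on the left,
# an empty bin, and the Local Legend alone on the right.
counts = [3000, 400, 50, 5, 0, 1]
heights = bar_heights(counts)

# On a linear scale the first bar is 3000x taller than the last; on this
# log scale it's only roughly 11x taller, so the single-athlete bin
# remains visible and clearly distinct from the empty bin next to it.
print(heights)
```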
Engineering a Solution
So, to consider all of this from an engineering perspective again, we needed to find a solution that would allow us to efficiently query for the Local Legend on any segment within a 90-day rolling window. We also needed to be able to efficiently produce a histogram showing the distribution of all athlete effort counts on that segment. As with many software engineering projects, our requirements had evolved slightly over time as more information became available. But we weren’t caught off-guard because we collaborated closely with our product team and our analysts throughout the process.
Near the beginning of this blog post, we discussed two possible approaches for efficiently finding the Local Legend. One was a scheduled job and one was to use some form of denormalization. The 90-day rolling window adds another piece of complexity here, but there’s some good news as well. Because we’re using a histogram instead of a complete leaderboard, we don’t necessarily need to be able to efficiently query for the entire leaderboard — just the Local Legend (i.e. the athlete at the top). An obvious way to handle the rolling window, and one that we thought about, is to use the scheduled job approach and make sure it runs near midnight. While this might work, it’s not a perfect solution. For starters, the job wouldn’t run instantly, so you’d have at least a short period of inaccuracy overnight. And there are other concerns about whether the job can be safely retried if it fails, and how we ensure that the data is in a correct state. Instead of the nightly job, we decided to implement a solution that avoids those problems by doing further denormalization. Essentially, this shifts the work that would’ve been done in the nightly job to work that’s done when new activities are uploaded — which is fine because we can do the work in a background job. To achieve this denormalization, rather than storing only the current local legend and their effort count in a denormalized way, we store the current and upcoming local legend (not always the same person) with their effort count for each of the next 90 days (one row per day).
Let’s examine this solution in a little more detail. We have a table of segment efforts, with one row per segment effort. It’s relatively slow to query for the local legend from this table (because of grouping and sorting many rows), but it isn’t too bad to query for a single athlete’s efforts in the last 90 days if you have reasonable indexes. (This requires counting, but not sorting those counts.) Separate from this, we have our denormalized local legend table. It stores the local legend for every segment, for every day, 90 days into the future, and we keep this table up-to-date with every new activity. When an athlete uploads a new segment effort, we query the efforts table to read all their segment efforts in the last 90 days (including the one they just uploaded). With this information, we calculate their “last 90 days” effort count for today, and tomorrow, and the next day, 90 days into the future. If any of these are more than the stored local legend’s effort count on any upcoming day, we make this athlete the new local legend for that day and write their info into the denormalized local legend table. With this schema, we can query for the current local legend very efficiently (which is important because this query happens often).
SELECT athlete_id, effort_count
FROM local_legends
WHERE segment_id = ? AND date = ?
At midnight (when the daily rollover happens for the rolling window), there’s nothing to do — we simply start querying the next day, which is already correct in the database!
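The write-time side of this scheme — recomputing an athlete’s projected 90-day counts and overwriting the stored legend for any future day they now win — can be sketched in Python. Names here are illustrative, and the real system does this work in a background job against the denormalized table rather than an in-memory dict:

```python
from datetime import date, timedelta

WINDOW = 90  # length of the rolling window, in days

def projected_counts(effort_dates, today):
    """For each of the next WINDOW days, count how many of this athlete's
    efforts will still fall inside the rolling window on that day."""
    counts = {}
    for offset in range(WINDOW):
        day = today + timedelta(days=offset)
        window_start = day - timedelta(days=WINDOW - 1)
        counts[day] = sum(1 for d in effort_dates if window_start <= d <= day)
    return counts

def update_legend(legend_table, segment_id, athlete_id, counts):
    """Overwrite the stored (athlete, count) for any future day on which
    this athlete now beats the stored legend's count."""
    for day, count in counts.items():
        current = legend_table.get((segment_id, day))
        if current is None or count > current[1]:
            legend_table[(segment_id, day)] = (athlete_id, count)

today = date(2020, 7, 1)
efforts = [today - timedelta(days=d) for d in (0, 10, 85)]
counts = projected_counts(efforts, today)
print(counts[today])                       # → 3
print(counts[today + timedelta(days=30)])  # → 2 (the 85-day-old effort aged out)

legend = {}
update_legend(legend, segment_id=7, athlete_id=99, counts=counts)
print(legend[(7, today)])                  # → (99, 3)
```

Because efforts only ever age *out* of the window on future days, an athlete’s projected count never increases over time — which is why a new upload can only improve their standing on days it’s still inside the window.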
So that’s how we query for the Local Legend, but what about the histogram? Calculating the histogram is actually pretty straightforward compared to what we do for the Local Legend. Querying for histogram data for a segment using the efforts table isn’t as inefficient as calculating a full leaderboard because, again, it doesn’t require sorting — just grouping by athlete ID. So we simply query the efforts table for counts of all the 90-day efforts on a segment, grouped by athlete ID, and transform that data into a histogram. This query isn’t blazingly fast, but it’s fast enough. And we can do some caching for the biggest segments where the query might be a little too slow for our needs.
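The transform from grouped query results to histogram bins is simple bucketing. A sketch, with bin edges that are illustrative rather than Strava’s actual bins:

```python
def build_histogram(athlete_counts, bin_edges):
    """Bucket per-athlete 90-day effort counts into histogram bins.

    athlete_counts: {athlete_id: effort_count}, i.e. the result of the
    grouped query over the efforts table.
    bin_edges: inclusive lower bound of each bin, ascending.
    """
    bins = [0] * len(bin_edges)
    for count in athlete_counts.values():
        # Find the last bin whose lower bound this count reaches.
        for i in range(len(bin_edges) - 1, -1, -1):
            if count >= bin_edges[i]:
                bins[i] += 1
                break
    return bins

athlete_counts = {1: 1, 2: 2, 3: 2, 4: 7, 5: 31}
print(build_histogram(athlete_counts, [1, 3, 10, 30]))  # → [3, 1, 0, 1]
```

Note that no sorting of athletes is needed — each grouped row is inspected once, which is why this stays cheap even on popular segments.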
Implementing the Solution
Much of the work described above — the interesting bits of the design phases — took place over many weeks in early 2020. During this time, we were doing a lot of product and user research, and dedicating some engineering time to investigation and research, but engineers weren’t working on the project in a full-time capacity. We quickly ramped up our engineering team around March and began work in earnest. I worked primarily on the backend with two other engineers, and we collaborated closely with the frontend and mobile engineers on our team.
Early in our development process, we agreed on the data structures we would use and we implemented backend code to serve static (mock) data. This allowed us to work on the mobile UI and the backend in parallel, and make small tweaks to our API as needed early in the development process. This also allowed us to get something in front of (internal) users quickly. At Strava, our engineers make use of feature gates to test and roll out new features. With feature gates and our mock data, we were able to release a preview of the feature to our Local Legends team very early in the development process, while the backend (and even the frontend) was still being developed. This allowed us to receive early feedback.
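The mock-data-behind-a-gate pattern might look something like this sketch. The gate check, payload shape, and the employee heuristic are all hypothetical stand-ins, not Strava’s actual gating API:

```python
# Static payload shaped like the final API response, so mobile and web
# clients can build against it before the real backend exists.
MOCK_LOCAL_LEGEND = {"athlete_id": 0, "effort_count": 42, "mock": True}

def feature_enabled(gate, athlete_id):
    """Toy feature gate: only a small allowlisted group sees the feature.
    (A real gate service would look this up dynamically.)"""
    return gate == "local_legends" and athlete_id < 1000

def get_local_legend(segment_id, viewer_id):
    if not feature_enabled("local_legends", viewer_id):
        return None  # feature hidden for this viewer
    # Real query not built yet: gated viewers get static data.
    return MOCK_LOCAL_LEGEND

print(get_local_legend(42, viewer_id=7))       # gated viewer: mock payload
print(get_local_legend(42, viewer_id=123456))  # everyone else: None
```

Later, swapping the mock for the real query changes only the gated branch — the response shape, and therefore the clients, stay untouched.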
Development of the backend itself progressed quickly. Our engineers were wrapping up previous projects, so we didn’t start implementing code for Local Legends until mid-March. By mid-May, we had delivered a working version to Strava employees (in production, but behind a feature gate). Our team developed the core functionality of Local Legends in just 2 months! Through the rest of May and into June, we continued to deliver updates, improvements, and bug fixes to those who had early access as we finished some work around the edges (like notifications). And we gained experience operating and tuning the service we’d developed before it was completely rolled out.
As with most big features at Strava, we’d been using feature gates to control access to local legends, pushing the feature to employees first and then releasing to a wider audience. We had always planned to do this for Local Legends, but the release became more complicated than usual because of our denormalized data store. The service we’d developed needed to ingest the past 3 months of data before it could provide accurate information to users — a process we referred to as “backfilling”. We weren’t surprised by the need to backfill, but it required engineering effort to develop and run the backfill scripts, and it took time to run them — on the scale of weeks.
We first backfilled segments in several US states and rolled out the feature to these areas as a beta test. We gave ourselves the ability to enable the Local Legends feature on a segment-by-segment basis so we could do beta testing before the backfilling process was complete in other geographical areas. Although we fixed some minor bugs and usability issues, the beta testing process went very smoothly and we were feeling great about a much larger rollout the week of July 13.
Unfortunately, the large-scale rollout did not go as smoothly as our beta testing did. A bug had slipped into our backfill script through changes we introduced after backfilling the beta test segments! The bug meant that our backfill did not run correctly on a portion of the geographical areas we were rolling out, leading to big inaccuracies in our Local Legends data. Fortunately, our team realized this quickly and immediately began working on a solution. First, we temporarily disabled the feature — which was very easy to do because of our feature gates. Within an hour or two, we’d identified the root cause and fixed it. Of course, we couldn’t simply release the fixed code and be done with it because the problem was in our backfill script. We needed to do more backfilling on the problematic segments — which would be time consuming and could potentially delay the release more. Our team came up with a plan to backfill segments and enable the feature on those segments as the backfill completed. We rolled out publicly to about 20% of our desired segments within 24 hours of the incident, and finished rolling out to 100% of our desired segments about a week later — a big success for our team in response to a challenging and unforeseen incident. There’s always something that doesn’t go as planned, and I’m proud of the way our team collaborated and overcame the problem.
Local Legends successfully rolled out to many geographical areas in mid July, and we’ll continue rolling out to new segments throughout this year and next! Local Legends is a complex feature, and I’d be remiss if I didn’t mention the contributions of all my teammates. Neil Bezdek contributed a lot of effort to this feature as an analyst, and provided the graphs you see above. Kate Frett, David Wilson, Marcus Saad, Fumba Chibaka, Devin Ridgway, David Lee, Michael Fransen, Sam Odom, Jeff Remer, Conner Peirce, and Jacob Stultz all contributed to the project as software engineers, and far too many people to name contributed in many other ways. From marketing to engineering to product management and user research, nearly every team at Strava contributed to Local Legends in a meaningful way, and it’s the support of all these people that makes Strava such a great place to work!