My journey at Strava began in June 2016 as a summer intern on iOS. I was taking a gap year from school the following year, and ended up staying on the iOS team for the year. I then transferred to the Infrastructure team for my second summer, after which I returned to Stanford for my senior year. You can read about those 14 months at Strava in my previous blog post. This past summer I was back on the Infrastructure team before beginning an MS in Computer Science at Stanford this fall.
This summer I worked on improving segment leaderboard performance.
First, some background on segment leaderboards:
Every time an athlete uploads a GPS activity to Strava, they are matched to segments along their route. An athlete’s fastest effort on a segment is ranked on a leaderboard against other athletes who have traversed that segment. Strava offers granular permutations of leaderboards — for example, you can compare your effort to people of your gender, or from a certain club, or who have ridden the segment in the past week, or even a combination of all three. This support article gives more information about segments generally, and this series of blog posts from my mentor, Jeff Pollard, talks about Strava’s leaderboard infrastructure.
It is untenable for a single database table to produce all of the different leaderboards that we support — keeping indexes on that data is prohibitively expensive and slows down writes. It is equally infeasible to store every permutation of a leaderboard for every segment. We do keep denormalized leaderboards for all age groups, weight classes, and genders, but maintaining a leaderboard per segment for these tens of categories is different from having leaderboards for each of the many thousands of clubs and for the following relationships of millions of athletes: those sets change too often and there are too many permutations. All efforts are kept in a canonical MySQL table. As efforts are added to or removed from that table, we also update “best efforts” tables in Cassandra that model the most popular leaderboard combinations. We do not store leaderboards for athletes you are following or for clubs. Creating leaderboards like these on the fly can be slow, and that’s where my summer project came in.
In the past, club and following leaderboards have been generated by scanning the overall leaderboard in Cassandra and filtering the results to those belonging to the set of athletes in the club or that the user is following. I will refer to the athletes on a club or following leaderboard as “filtered athletes.” In most cases the best efforts leaderboard for a segment has hundreds of efforts and there are tens of filtered athletes, so the filtering approach works just fine. However, in some cases we end up scanning hundreds of thousands of entries to find just a few matching athletes. This is problematic, as scanning a leaderboard becomes ever more time-consuming as its size grows. In these cases, we hypothesized that it would be significantly faster to issue individual requests to Cassandra for the efforts of the filtered athletes. I will call this new method “bulk get”.
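To make the contrast concrete, here is a sketch of the two strategies in Python. The function names, the effort dictionaries, and `get_best_effort` are hypothetical stand-ins for our services and Cassandra reads, not actual Strava code:

```python
def filtered_leaderboard_scan(segment_efforts, filtered_athletes):
    """Old approach: scan every best effort on the segment and keep
    only those belonging to the filtered athletes."""
    athlete_set = set(filtered_athletes)
    return [e for e in segment_efforts if e["athlete_id"] in athlete_set]

def filtered_leaderboard_bulk_get(get_best_effort, filtered_athletes):
    """New approach ("bulk get"): one point lookup per filtered athlete,
    then sort the efforts that exist into a leaderboard."""
    efforts = (get_best_effort(a) for a in filtered_athletes)
    return sorted((e for e in efforts if e is not None),
                  key=lambda e: e["elapsed_time"])
```

The scan's cost grows with the size of the overall leaderboard, while bulk get's cost grows only with the number of filtered athletes, which is why the latter wins on very large segments.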
For the first stage of my project I ran a “dark test” of the bulk get method: for a small percentage of requests for club and following leaderboards, I issued a duplicate background query using the new bulk get method. The new method was significantly faster than the old one on average and, crucially, didn’t experience the same spurious latency spikes caused by a particularly expensive request.
This was a clear improvement. However, while latency for a bulk leaderboard request didn’t seem to increase with the number of filtered athletes, for large filtered leaderboards the parallelism of requesting the best effort of every athlete in the set at once was too much for our downstream database to handle, and the additional stress on the system caused intermittent latency increases and failures across the board. For this reason we only use the new method when the number of filtered athletes is below a certain threshold, so that we don’t overwhelm the database with individual requests. I also discovered that a large share of the filtered athletes had never done an effort on the segment: we were issuing more than four times as many individual requests to the leaderboards system as necessary, since there is no point sending a request for an athlete we know has never traversed the segment.
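Putting the threshold and the bounded parallelism together gives a sketch like the following. The cutoff, the concurrency cap, and the function names are hypothetical placeholders; the real values were tuned against our database:

```python
from concurrent.futures import ThreadPoolExecutor

MAX_BULK_GET_ATHLETES = 500   # hypothetical cutoff for using bulk get
MAX_PARALLEL_REQUESTS = 32    # hypothetical cap on concurrent point reads

def fetch_filtered_efforts(get_best_effort, filtered_athletes):
    """Bulk get with a safety valve: refuse sets that are too large
    (caller falls back to scan-and-filter), and bound the parallelism
    of the individual reads."""
    if len(filtered_athletes) > MAX_BULK_GET_ATHLETES:
        return None  # signal: fall back to the scan-and-filter path
    with ThreadPoolExecutor(max_workers=MAX_PARALLEL_REQUESTS) as pool:
        efforts = list(pool.map(get_best_effort, filtered_athletes))
    return [e for e in efforts if e is not None]
```

Returning `None` for oversized sets is just one way to express the fallback; the point is that the per-athlete reads are capped both in count and in concurrency.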
This meant that we could reduce stress on the system if there were some way to narrow the original set of candidate athletes down to the smaller, more manageable set of athletes actually on the leaderboard. With this enhancement, a larger percentage of requests would also fall below our maximum threshold for the number of get requests to issue, resulting in lower latency, and an improved user experience, for more of our athletes.
We wanted an inexpensive way to check whether an athlete had ever traversed the segment. A bloom filter is a nifty little probabilistic data structure which models object membership in a set. If the filter reports that an object is present, it may or may not actually be in the set; if the filter reports that it is absent, it is definitely not in the set. In other words, there are no false negatives, but there are false positives. The bloom filter can be configured for the desired false positive probability.
The bloom filter is backed by a bit array, which makes it fairly space-efficient. Members are added by hashing the member with multiple hash functions and setting the bits corresponding to the hashed values. If any of the bits corresponding to the hashes of an object is not set in the bit array, then the object is definitely not contained in the set. It is not possible to remove elements from a bloom filter. This works well for our application because false positives don’t interfere with functionality (they just result in extraneous requests to our database) and deletes are relatively rare. We store these bloom filters in Redis, an in-memory key-value store. Although we do a membership check for each filtered athlete, bloom filter membership checks are extremely fast when compared with a round-trip Cassandra query.
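For the curious, here is a minimal, self-contained bloom filter in Python. This is an illustration of the data structure, not Strava’s implementation; the sizing formulas are the standard ones (bits m = −n·ln(p)/ln(2)², hash count k = (m/n)·ln 2), and the double-hashing trick derives k indices from one SHA-256 digest:

```python
import hashlib
import math

class BloomFilter:
    """Minimal bloom filter: bit array + k hash functions, no deletes."""

    def __init__(self, expected_items, false_positive_rate):
        # Standard sizing: m = -n ln(p) / (ln 2)^2, k = (m / n) ln 2.
        self.num_bits = max(1, int(-expected_items * math.log(false_positive_rate)
                                   / math.log(2) ** 2))
        self.num_hashes = max(1, round(self.num_bits / expected_items * math.log(2)))
        self.bits = bytearray((self.num_bits + 7) // 8)

    def _positions(self, item):
        # Double hashing: k indices derived from two halves of one digest.
        digest = hashlib.sha256(str(item).encode()).digest()
        h1 = int.from_bytes(digest[:8], "big")
        h2 = int.from_bytes(digest[8:16], "big")
        return [(h1 + i * h2) % self.num_bits for i in range(self.num_hashes)]

    def add(self, item):
        for pos in self._positions(item):
            self.bits[pos // 8] |= 1 << (pos % 8)

    def __contains__(self, item):
        # All k bits set -> "probably present"; any bit clear -> definitely absent.
        return all(self.bits[pos // 8] & (1 << (pos % 8))
                   for pos in self._positions(item))
```

Note that membership of every added item is guaranteed (no false negatives), while items never added are only rejected with high probability, controlled by `false_positive_rate`.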
Because diagrams are always helpful, here’s a little illustration of what our bloom filter system looks like:
We have bloom filters for each segment stored in Redis.
When an athlete traverses a specific segment for the first time, we add them to the bloom filter corresponding to that segment. We lazily backfill the bloom filters: when there is a new best effort on a segment, if a bloom filter exists for the segment we add the athlete to it, and if not, we create a bloom filter for that segment.
When we go to create the club or following leaderboard for a segment, if a bloom filter exists for the segment then we filter the athlete set to those that are contained in the bloom filter for that segment (i.e. those that have probably traversed the segment).
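The write and read paths above can be sketched against a redis-py-style client (`setbit`, `getbit`, `exists` are real Redis operations). The key scheme, filter size, and hash count here are hypothetical placeholders, not our production values:

```python
import hashlib

NUM_BITS = 2 ** 20   # hypothetical per-segment filter size
NUM_HASHES = 7       # hypothetical number of hash functions

def _bit_positions(athlete_id):
    # Double hashing: derive NUM_HASHES indices from one digest.
    digest = hashlib.sha256(str(athlete_id).encode()).digest()
    h1 = int.from_bytes(digest[:8], "big")
    h2 = int.from_bytes(digest[8:16], "big")
    return [(h1 + i * h2) % NUM_BITS for i in range(NUM_HASHES)]

def segment_key(segment_id):
    return f"segment_bloom:{segment_id}"  # hypothetical key scheme

def add_athlete(redis, segment_id, athlete_id):
    """On a new best effort: set the athlete's bits in the segment's filter.
    SETBIT creates the key if it doesn't exist, which gives us the lazy
    backfill for free."""
    key = segment_key(segment_id)
    for pos in _bit_positions(athlete_id):
        redis.setbit(key, pos, 1)

def filter_candidates(redis, segment_id, athlete_ids):
    """Keep only athletes who have probably traversed the segment.
    If no filter exists yet, keep everyone (we have no information)."""
    key = segment_key(segment_id)
    if not redis.exists(key):
        return list(athlete_ids)
    return [a for a in athlete_ids
            if all(redis.getbit(key, pos) for pos in _bit_positions(a))]
```

The surviving candidates are then the only athletes we issue individual bulk get requests for.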
Coincidentally, my manager Jacob has been itching for a reason to use the bloom filter data structure for about 12 years, so he was excited that it fit our use-case perfectly.
Between the new bulk get method and the bloom filters, this project has resulted in an approximately 5x reduction in latency for these expensive club and following leaderboard requests.
Thank you to my mentor Jeff, my manager Jacob, and all of Strava for a great second internship.