Setting the Stage
If you were sitting down to make a ride-tracking platform for cyclists, you’d probably consider storing certain ride details as part of the design. Perhaps you’d pick out some key fields to store like speed, distance, climb, date, time, and bike model. You’d figure a relational database table would work perfectly for storing this information, so you might create a table named Rides where each field you want to store is a column on the table. Awesome!
Later, you might decide you want to add running to your platform (sounds familiar, huh?). Thankfully the fields you used for cycling fit reasonably well with running, so you decide to rename your relational table from Rides to Activities. There are a few differences in running stats — you want to show pace instead of speed and store shoe models instead of bike models — but you can deal with this by transforming the existing values into the new values when rendering them for a viewer. Still awesome!
But suddenly, triathlons explode in popularity and now you want to add swimming. Also, cross training is important to your core athletes and they’ve been asking you to add support for weight lifting, yoga, rowing machines, you name it! On top of that, device manufacturers keep coming up with cool new sensors and app developers are inventing their own unique metrics, but where do you put average muscle oxygen saturation and Splat Score for the athletes who want to track them? Should you add column after column knowing that most columns will be empty for any given activity? And knowing that the bigger the table gets, the more painful each column addition becomes? Suddenly the Activities table that worked so well for your cycling platform seems frustratingly constrained. That’s definitely not awesome.
This is the position in which Strava found itself in 2017.
All the Data
In early 2018, Strava took on what was known as the “All The Data” project: an effort to enable storage of any arbitrary activity data. By the time the project kicked off, it was abundantly clear that adding more columns to the Activities table in our relational database was not the best way to address the growing diversity of activity data. We needed a more flexible storage system to handle the myriad of activity values.
We decided to first tackle one specific piece of this puzzle: storing any number of arbitrary activity fields for a given activity, where an activity field was a piece of activity-related data that could be represented by a single floating point value. String fields were excluded from this definition, as were activity streams (the time series of distance, instantaneous heart rate, latitude, or longitude measurements over the course of the activity). Examples of activity fields that would be stored in the new system included average cadence, number of ski runs completed, and total calories burned during an activity.
New Service, New Storage
Strava has been moving away from a monolithic app for several years now, so we knew our storage system would live in a new, separate microservice. We’ve standardized on Scala as our language of choice for new services, so it was an easy decision to design this system as a new standalone Scala service.
But what storage would we use to back this new Scala service? The first answer might have been to add a new relational table containing polymorphic keys and values for various data types alongside our Activities table, but we wanted to make sure to explore all options before accepting the first thing that came to mind. As a first step to analyzing storage options, we looked at the data access patterns expected in the new system. Writes were straightforward: the vast majority would come from activity uploads (each upload results in multiple fields written to this new system), with a small fraction of writes coming from activity edits and deletions.
Read access, on the other hand, was a bit more complex. Data from this new system would be featured on the detail page for activities, but we also wanted to show some subset of an activity’s data in the activity feed. These two uses meant we would need to support requests for data for both a single activity id (activity details page), and a set of activity ids (activity feed). Additionally, we wanted to support a previously-impossible¹ read pattern: reading all data for an activity field, such as average heart rate, given an athlete id and a time range.
MySQL was considered as an option, with a tentative idea that we could create a table with columns for activity id, activity field id, and activity field value. However, we were concerned about the write load and table size for this solution — if there was a row for each different field in each different activity, we’d be writing multiple rows per activity created. Each write transaction would be fairly heavyweight thanks to the ACID guarantees of the database and the requirement to write through a single master node. Another concern with the MySQL solution was that we would not be able to support our time range read pattern without significant denormalization.
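The table we had in mind looked roughly like the following sketch (table and column names are illustrative, not an actual Strava schema):

```sql
-- Hypothetical EAV-style MySQL table for arbitrary activity fields.
-- Names are illustrative only.
CREATE TABLE activity_fields (
  activity_id        BIGINT UNSIGNED NOT NULL,
  activity_field_id  INT UNSIGNED    NOT NULL,
  field_value        DOUBLE          NOT NULL,
  PRIMARY KEY (activity_id, activity_field_id)
);
-- Each activity upload writes one row per field, multiplying write volume,
-- and the athlete + time range read pattern has no efficient index here
-- without further denormalization.
```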
Looking beyond traditional relational data stores, our arbitrary scalar field data was also a good fit for a key-value store. We envisioned a model where the key in this store was a combination of activity id and field type, and the value was simply the value of that field for that activity — very similar to the originally considered MySQL approach. However, with the right store, we thought this could achieve better performance and serve all of our desired read access patterns better than MySQL would.
Given the potential advantages of a key-value store, we decided to investigate Cassandra as another storage option. We were already operationally familiar with it, as Strava uses Cassandra for leaderboards, activity grouping, user sessions, and many other systems. We expected Cassandra to outperform MySQL for writes; where MySQL flushes every write to disk, the storage underlying Cassandra is essentially an in-memory structure that is flushed periodically². Less-frequent disk access means potentially higher write throughput. Additionally, Cassandra provides a more relaxed consistency model than MySQL, allowing any node in a partition to accept writes and allowing clients to read potentially stale data. Not being bottlenecked by a single MySQL master for writes would greatly improve our ability to scale up write throughput. Finally, any individual node in the Cassandra cluster can be lost without blocking reads or writes, making for better fault tolerance compared to replicated MySQL. Based on these advantages, we decided to move forward with Cassandra as the underlying storage for this new system.
With our storage system selected, we got down to the business of designing the table schema. Cassandra facilitates scaling by distributing data across nodes via the partition key. If all data for a query exists within a single partition, reading and returning that data is straightforward: the node in the Cassandra cluster that handles the query, also known as the coordinator node, only needs to deal with one partition. If data is distributed across partitions, the coordinator must deal with issuing requests to multiple partitions, handling failures, and combining the results for the client. Note that this simplified description of read queries excludes discussion of replication and consistency level; however, the core principle is the same: as the number of partitions involved in a request increases, so do the potential points of failure and potential overall latency.
We selected athlete id as the partition key for our schema, which guaranteed two of our main read patterns would be fully contained in a partition: 1) requests for all data of a given activity and 2) requests for all data for a particular activity field and a particular athlete over time. This choice of partition key meant that reads for the activity feed would be distributed across multiple partitions. We decided this was an acceptable performance risk because we knew that most feeds contain a small distinct set of athletes and initially very few activities would have data stored in this new system. If the performance of the new system suffered as it was used for a higher percentage of activities, we would consider adding a read cache.
With a partition key selected, we moved on to clustering keys. These keys affect the order of data within a partition. We decided to focus on the time range query case — for example, a request for all average heart rate values for athlete X in the last month. In this case, athlete X already limited the query to a single partition, which was good for performance. Next, we wanted to make sure that all the heart rate values would be adjacent to each other; to accomplish this, the field type (“average heart rate”, “ski runs”, etc.) was selected as the second clustering key. In order to efficiently use time boundaries, activity start time would be the third clustering key. Lastly, activity id was chosen as the fourth clustering key for edge cases where an athlete had multiple activities starting at the same time. With this clustering schema, data in a partition would be organized by field type, and within each field type it would be organized by activity start time and then activity id.
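In CQL terms, this primary key design looks roughly like the following (table and column names are illustrative, not the actual Strava schema):

```sql
-- Illustrative CQL table definition for the schema described above.
CREATE TABLE activity_fields (
  athlete_id   bigint,     -- partition key: all of an athlete's data colocated
  field_type   text,       -- first clustering key: groups values by field
  start_time   timestamp,  -- second clustering key: orders each field by time
  activity_id  bigint,     -- third clustering key: breaks start-time ties
  field_value  double,
  PRIMARY KEY ((athlete_id), field_type, start_time, activity_id)
);
```

Within an athlete's partition, rows for a given field type are stored contiguously and sorted by activity start time, which is exactly the layout the time range query wants.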
While this composite primary key of athlete id, activity field, activity start time, and activity id would make the time range query performant, it did not serve the read pattern for the activity detail page very well. For this, we turned to the concept of a secondary index in Cassandra. A secondary index provides the ability to access data directly by attributes other than the partition key. We added a secondary index on activity id to support queries that would specify activity ids but not specific fields or times. Secondary index access in Cassandra can be a performance problem, especially if multiple partitions must be accessed to serve the query. To mitigate this performance concern, our implementation made sure to always narrow the query by specifying the partition key (athlete id), in addition to the activity id, when using this index for queries.
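Sketched in CQL (again with illustrative names), the index and a partition-scoped lookup might look like:

```sql
-- Secondary index on a non-primary-key access path.
CREATE INDEX ON activity_fields (activity_id);

-- Always narrow by the partition key (athlete_id) alongside the indexed
-- column, so the query touches a single partition rather than scanning
-- the index cluster-wide.
SELECT field_type, field_value
  FROM activity_fields
 WHERE athlete_id = ? AND activity_id = ?;
```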
With the schema in place, it was relatively straightforward to write queries for our supported read/write patterns and expose them via the new service’s API.
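For example, under a schema like the one sketched above, the time range read pattern maps to a single-partition CQL query along these lines (names and the field identifier are illustrative):

```sql
-- All average heart rate values for one athlete over a time window.
-- Valid because field_type is the first clustering key and start_time
-- the second, so equality-then-range restrictions are allowed.
SELECT activity_id, start_time, field_value
  FROM activity_fields
 WHERE athlete_id = ?
   AND field_type = 'average_heart_rate'
   AND start_time >= ?
   AND start_time <  ?;
```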
Integrating and Using All the Data
After the service was designed and built, we needed to integrate it with our existing systems to actually store and display new fields. This project was one of our earliest attempts at taking an entity that existed as a single logical object in our large Ruby on Rails application and splitting it across multiple backend microservices. In this case, we were taking an activity summary that was stored in a single MySQL table and moving the responsibility for storing and serving some of those values to Cassandra fronted by a Scala service. The mechanics of this change and the learnings that followed merit a deeper dive, which we will tackle in the next installment of this blog series.
This project was all about enabling a new core capability in the Strava platform, and from that perspective, it was complete (as complete as software ever is) by mid-2018. However, it took a while longer for the new capabilities to surface in user-facing features. Here are some places you might see it in action today:
Device calories
We have long received feedback from our users that the calories on their recording device should match the calories on their Strava activity. Strava had never stored the calories reported by the device; instead, it calculated its own estimate from other activity data. The new system provided a place to store those values, so we are now able to show athletes the calorie burn estimated by their recording device.
Downhill snow sports
We’ve had some downhill snow sport activity types for quite a while, but we were not displaying much meaningful data. On top of that, there was one big engineering pain point for downhill snow sports: the activities table has a total elevation gain column, but no column for elevation loss (also known as vertical). For some time, we were operating with an embarrassing code hack to store the vertical in the gain column for certain activity types. The new data storage system allowed us to store vertical, properly labeled, for downhill snow sports and ditch this hack. It also allowed us to add some new fields to these activities like number of runs — things we couldn’t have done before.
Building new functionality that is intimately tied to an existing application feature can be a tricky process. Ideally, the original feature backend is extensible and well suited to modification, but in reality the required changes may be unreasonable or even impossible. When Strava decided to store more varied activity data, we found ourselves in the latter case. Some extra work and time went into designing and building a completely new storage system, but we were rewarded with scale and functionality that we would not have achieved otherwise.
For the next installment of this series, we will dive deeper into the process of integrating this new system with the existing concept of an activity in our Ruby on Rails application.
¹: While it is possible to write a query to accomplish this date range request in MySQL, due to the schema and size of the activities table, it is too slow to back widely-used product features.
²: MySQL flushes every write to guarantee durability. Cassandra achieves durability via replication of each write to multiple nodes; if one node goes down and loses unflushed writes, other nodes still have knowledge of those changes.