Building a Feature Education Coordination Platform at Strava

Published in strava-engineering · 8 min read · Jul 1, 2022

Feature education helps users learn, understand and use your product. The goal of the Growth Team at Strava is to connect athletes with the value of the product, and feature education is one of the most important tools to achieve this. Until recently, we didn’t have any systems or platforms that made feature education easy to build. I often saw my teams, and others, struggling to implement the feature education we were hoping to build, or even just avoiding it altogether. Growth Teams are all about moving fast towards solving problems, and the right platforms and capabilities are instrumental in a team’s velocity and effectiveness. Platform capabilities don’t have to be super complex or take months to build. Instead, we look for repeated developer challenges or problems that, if solved, would allow us to move faster and build more comprehensive solutions. Feature education was a clear candidate for a platform capability that we should build.

In 2020, we built a very simple service called Ritmo, which means rhythm in Spanish. Ritmo is a feature education metering and impression capping service. It’s employed if you want users to see something, like a tooltip or a promotion, only a certain number of times. In essence it helps you set a pace or rhythm for feature education, hence the name. We built Ritmo because we needed to take a platform approach to our feature education. We had numerous issues with the client side solutions that had been used for many years. Let’s dive into the problems we wanted to solve and how Ritmo works.

The Problems

Problem 1: Client side flags don’t scale

Strava used local flags on clients to mark when a user saw some piece of UI, like a popup, modal or a coachmark. An example would be a flag called “user_seen_upsell_abc” which has a true or false value. If you are an iOS engineer, you’re familiar with using NSUserDefaults to store flags like these with relative ease. Using client solutions like this is quick and works pretty well, but lacks consistency across platforms, remote observability and remote logic and control. If you wanted to change the logic behind a certain flag, like the number of times a user could see a coachmark or the users who were even eligible for that feature education, you would need to ship a new app version. Business logic should sit on the server, not the client. And if you implement that logic on the client, you’re doing it repeatedly across the different platforms that you support, like iOS, Android and Web.

Problem 2: Client side flags don’t play well with server driven surfaces

The majority of the primary pages in our app are server driven. That means the server decides what UI will be displayed and sends that UI markup, along with the content to display in it, to the clients. The clients are agnostic to what they are displaying. This is really powerful because making server changes is fast and easy compared to making client changes. However, because the client is agnostic (or at least should be) to the content it’s displaying, local flags are awkward to use: the client would need to know which piece of content a flag belongs to. While there are definitely ways around this, it’d be much simpler and more elegant to use server side flags.

Problem 3: We wanted a centralized feature education system

The Strava product is really broad and we have a lot of feature education across many features, targeted at athletes at different moments in their lifecycle. For instance, when we redesigned our navigation patterns on mobile in early 2021, we had a comprehensive suite of coach marks and modals to explain the changes. A lot of the copy referenced the changes that had been made. But imagine you were a new user signing up the week after the navigation changes: to you, this navigation pattern isn’t new. In fact, maybe you shouldn’t see any of the education at all. We wanted to have a system that could make these types of decisions. Additionally, we wanted to be able to add logic between different pieces of feature education, so we could pace them out slowly for a user. Seeing five coach marks at once on a screen is not a great user experience.

Technical bits

First some definitions.

Promotion — The data structure that represents a component that will be metered. So if you have a coachmark you want powered by Ritmo, you would create a “promotion” to represent that coachmark.

Action — The data structure that represents user interaction with a Promotion, currently used to represent impressions (did they see it?) and dismissals.

Promotions are stored in a MySQL table. When the MVP of this service was built, we didn’t store Promotions in a database and instead defined them in a Thrift enum. This worked well for a while, but it meant the service had to be redeployed in order to add a new Promotion. We removed this unnecessary step after we built an admin interface for the service.

Actions are stored in a MySQL database. This allows us to fetch all Actions for a given user efficiently. We store when an Action occurred, to which user, and for which Promotion. If an Action is representing an impression and a user has 10 impressions of some Promotion, then there will be 10 unique rows in the database representing those Actions.
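To make the two data structures concrete, here is a minimal Python sketch. It is not Ritmo’s actual schema: the field names (`promotion_id`, `user_id`, `action_type`, `occurred_at`), the `ActionStore` class and the in-memory list standing in for the MySQL table are all assumptions made for illustration.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from enum import Enum


class ActionType(Enum):
    IMPRESSION = "impression"
    DISMISSAL = "dismissal"


@dataclass(frozen=True)
class Promotion:
    # Hypothetical fields; represents one metered component, e.g. a coachmark.
    promotion_id: int
    name: str


@dataclass(frozen=True)
class Action:
    # When the Action occurred, to which user, and for which Promotion.
    user_id: int
    promotion_id: int
    action_type: ActionType
    occurred_at: datetime


class ActionStore:
    """In-memory stand-in for the Actions table (illustrative only)."""

    def __init__(self) -> None:
        self._rows: list[Action] = []

    def record(self, action: Action) -> None:
        # One row per Action: 10 impressions produce 10 rows.
        self._rows.append(action)

    def for_user(self, user_id: int) -> list[Action]:
        return [a for a in self._rows if a.user_id == user_id]


store = ActionStore()
coachmark = Promotion(promotion_id=1, name="nav_coachmark")
for _ in range(10):
    store.record(
        Action(42, coachmark.promotion_id, ActionType.IMPRESSION,
               datetime.now(timezone.utc))
    )
```

After the loop, `store.for_user(42)` holds ten separate Action rows for the same Promotion, which is the shape the eligibility check works from.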

There are two types of requests to this service:

1) Eligibility — checking whether a user is eligible for a Promotion

2) Recording an Action

Eligibility is calculated at run time by fetching a user’s Actions associated with that Promotion and then applying the desired filters and business logic. The most commonly used filter is a maximum number of Actions. For instance, if we want a user to see a coachmark at most 3 times, the filter will check whether the count of Actions for that Promotion is less than 3. We also support filters like max per day or per week. For instance, we might want to show an upsell to a user up to 3 times per week, with the count evaluated on a rolling 7 day basis. We are currently building out more complex filters that take into account more user characteristics, like registration age (new user or tenured user) or subscription status.
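The two filters described above can be sketched as follows. This is an illustrative sketch, not Ritmo’s code: the function name `is_eligible` and its parameters (`max_total`, `max_in_window`, `window`) are assumptions, and the input is simply the list of timestamps of one user’s Actions for one Promotion.

```python
from datetime import datetime, timedelta, timezone


def is_eligible(action_times, now, max_total=None, max_in_window=None,
                window=timedelta(days=7)):
    """Return True if the user may still be shown the Promotion.

    action_times: timestamps of the user's Actions for a single Promotion.
    max_total: lifetime cap on Actions (None disables the filter).
    max_in_window: cap on Actions inside the rolling window (None disables it).
    """
    # Lifetime cap: eligible only while the count is below the maximum.
    if max_total is not None and len(action_times) >= max_total:
        return False
    # Rolling cap: count only Actions newer than `now - window`.
    if max_in_window is not None:
        cutoff = now - window
        recent = sum(1 for t in action_times if t > cutoff)
        if recent >= max_in_window:
            return False
    return True


now = datetime(2022, 7, 1, tzinfo=timezone.utc)
seen = [now - timedelta(days=d) for d in (1, 2, 9)]  # three impressions

print(is_eligible(seen, now, max_total=3))      # False: lifetime cap reached
print(is_eligible(seen, now, max_in_window=3))  # True: only 2 in the last 7 days
```

Because the window is measured back from `now`, the 9-day-old impression no longer counts against the weekly cap, so there is no fixed weekly reset point.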

We built the Ritmo MVP in 2020 during some projects my team was running on the Feed (a fully server driven page), so we didn’t have to build any mobile clients initially. This allowed us to ship Ritmo in just 2 weeks to production. I want to give all the credit for this rapid work to Dickson Lui and Javier Vegas, two server engineers on the Growth Team at Strava. Our Foundation Engineering team deserves a ton of credit as well. Without them building streaming and service infrastructure that is both easy to use and easily scalable, this project would have taken longer.

Mobile Clients

After our Ritmo V1 was in production and working well, we opted to build a client for our iOS and Android apps, so that we could run mobile experiments. The client libraries are fairly straightforward. On app startup, we bulk fetch all Promotions that have been specified in a manifest list. The Promotions are stored in a local database with a cache in memory. The API that engineers use locally only reads from the local memory, so the requests are always synchronous. The only other necessary API is to record an Action. The iOS engineers on the project took the opportunity to build a developer interface in-app powered by SwiftUI, which was fairly new at the time. This allowed them to learn SwiftUI in production, but avoid creating any user-facing issues.
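The shape of that client library can be sketched in a few lines. The real clients are native iOS and Android code; this is only a language-neutral illustration in Python. The class name `PromotionClient`, the callbacks `fetch_eligibility` and `send_action`, and the promotion names are all hypothetical, and the local database layer is left out so only the in-memory cache is shown.

```python
from typing import Callable, Dict, Iterable, List


class PromotionClient:
    """Sketch of a client: bulk fetch at startup, synchronous reads from memory."""

    def __init__(
        self,
        manifest: Iterable[str],
        fetch_eligibility: Callable[[List[str]], Dict[str, bool]],
        send_action: Callable[[str, str], None],
    ) -> None:
        self._manifest = list(manifest)
        self._fetch = fetch_eligibility   # hypothetical network call
        self._send = send_action          # hypothetical network call
        self._cache: Dict[str, bool] = {}

    def refresh(self) -> None:
        # Called once at app startup: one bulk request for the whole manifest.
        self._cache = dict(self._fetch(self._manifest))

    def is_eligible(self, promotion: str) -> bool:
        # Reads memory only, so the call is synchronous; unknown names are False.
        return self._cache.get(promotion, False)

    def record_action(self, promotion: str, action_type: str) -> None:
        self._send(promotion, action_type)


# Fakes standing in for the server.
sent = []
client = PromotionClient(
    manifest=["nav_coachmark", "upsell_abc"],
    fetch_eligibility=lambda names: {n: n == "nav_coachmark" for n in names},
    send_action=lambda promotion, action_type: sent.append((promotion, action_type)),
)
client.refresh()
if client.is_eligible("nav_coachmark"):
    client.record_action("nav_coachmark", "impression")
```

Whether the local cache is updated immediately after an Action is recorded is not stated above, so this sketch leaves the cache untouched until the next `refresh`.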

Using Analytics Data

One of the biggest challenges we had with Ritmo is that it didn’t work well on server driven pages that extended “below the fold”. When we first used Ritmo for the Feed projects, the feature education we shipped sat at the highest point in the Home Feed, position 0. That means that when the client requested the Feed data via our Feed API, we could be fairly certain that the user would actually see the UI associated with that Ritmo Promotion. So we would record the impression Action on the server side when we decorated the API response. This was not strictly correct, since a response being sent does not guarantee the UI was seen, but it was accurate enough for those initial use cases. However, as soon as we wanted to put some feature education lower down on the Feed, this approach fell apart. We can’t assume that users will scroll and see the Promotion. Promotions can be used to meter or pace any module that we display in our server driven pages, and we have around 40 different modules, so we needed a solution that never required mobile changes and was easy for engineers to implement.

To solve this, we decided to use our analytics events. We already had comprehensive analytics tracking impressions in our server driven pages like the Feed. Our analytics library is built on top of Snowplow. Analytics events from our clients eventually end up in Snowflake, but before that, they pass through AWS’s Kinesis. The machine learning engineer on our team, Jun Sun, built a streaming service called Laser, which allows us to process Snowplow events in real time and stream specific events to our own sinks. We use Laser for a variety of purposes, like using Snowplow data to train machine learning models, but we also use it to record Actions against Ritmo Promotions.

Laser is simple to use and has very low latency. An engineer registers a processor that lists which analytics event it is listening for and which Promotion that event is associated with. When the processor is executed, it calls the Ritmo API and records an Action. I really liked this solution from the team because it didn’t require any additional client code and reused existing data pipelines rather than creating new ones.
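The processor idea can be sketched as a small registry that maps analytics event names to Promotions. Laser’s real interface is not described here, so everything in this Python sketch is hypothetical: the `Processor` and `ProcessorRegistry` names, the event dictionary shape (`name`, `user_id`), the event and promotion names, and the `record_action` callback standing in for the Ritmo API.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class Processor:
    event_name: str  # analytics event this processor listens for
    promotion: str   # Promotion the event is associated with


class ProcessorRegistry:
    def __init__(self, record_action: Callable[[int, str, str], None]) -> None:
        self._record_action = record_action  # stand-in for the Ritmo API call
        self._by_event: Dict[str, List[Processor]] = {}

    def register(self, processor: Processor) -> None:
        self._by_event.setdefault(processor.event_name, []).append(processor)

    def handle(self, event: dict) -> None:
        # Called for each event read from the stream; unmatched events are ignored.
        for processor in self._by_event.get(event["name"], []):
            self._record_action(event["user_id"], processor.promotion, "impression")


recorded = []
registry = ProcessorRegistry(
    record_action=lambda user_id, promotion, action_type:
        recorded.append((user_id, promotion, action_type))
)
registry.register(Processor(event_name="feed_module_impression",
                            promotion="feed_upsell"))

registry.handle({"name": "feed_module_impression", "user_id": 42})
registry.handle({"name": "unrelated_event", "user_id": 42})
```

Only the first event matches a registered processor, so a single impression is recorded; the client emits nothing beyond the analytics event it was already sending.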

Adoption & Next Steps

Since creating the service in 2020, 75 Promotions have been created, meaning 75 different user experiences in the app are powered by Ritmo. Every user facing product team at Strava has used it and has provided valuable feedback on how to improve. Of the three problems I listed earlier in this post, we have solved the first two but have not fully implemented solutions to problem 3. This represents the next evolution of this service. We want to use user characteristics to power decisions around feature education, as well as thematically grouping different pieces of feature education, pacing them according to user needs.

Hope this inspires other Growth Teams to build their version of Ritmo and take a platform approach to Feature Education. And if you made it this far, do yourself a favor and check out a song called Ritmo by the Black Eyed Peas. It’s become the official track of the Ritmo platform and I strongly encourage new internal customers to listen to the track. Who says we can’t have fun while we build internal platforms?!

Written by Jason van der Merwe