How We Design for Growth at Strava

How to Design Lean Experiments to Validate Hypotheses Faster

Paolo Ertreo
Strava Design
7 min read · Jul 20, 2018


At Strava, the Growth team’s purpose is to grow the most engaged community of athletes in the world.

To achieve this, the team is responsible for several stages of the new member experience, from awareness through to first product use. Our intention is to help athletes confirm that Strava is the right community to help them toward their athletic goals, whatever they may be.

We connect our team’s broader goals with specific projects using data-driven insights and research to guide us. Then, we ship experiments to constantly test if our assumptions match our users’ reality, and iterate to close any gaps.

Designing for growth

Just like any product designer, a growth designer must advocate for the user and for the business simultaneously, always mindful of keeping the two in equilibrium. This balance ensures we achieve both usability and strong product performance.

As growth designers, we must test our way into each project, often starting with purposefully bite-sized designs that are aimed at expediting our learning and helping us validate our hypotheses as fast as possible.

A “failed” experiment may never be released to our entire athlete base, so we must be thoughtful about the time and resources we invest in it. When designing with this in mind, we ask ourselves whether a particular element of the experience will have a positive and measurable impact on the outcome of the experiment. If it won’t, we de-prioritize it to a later phase, to be built and released only if the experiment proves successful. This approach ensures we can validate a hypothesis as inexpensively as possible.

We are comfortable talking about data and very aware that our design decisions must have a measurable impact. Our approach is primarily driven by quantitative data, yet we still validate our design decisions with qualitative data, too. On larger projects, this means conducting user interviews with our athletes. On smaller projects, it means running usability tests from the outset to uncover potential areas of weakness in the experimental experiences we are designing. This ensures our designs are always data-informed yet user-centered.

Our Process

1. Defining hypotheses

In a group setting, the growth team defines the hypothesis we seek to validate and the business metrics we aim to influence with our experiment.

These adhere to broader company goals and may be informed either by quantitative data, qualitative research, or both.

Our hypothesis becomes a North Star of sorts that keeps us grounded and focused on our KPIs, ensuring we stay within scope during the design and development phases.
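For illustration, a hypothesis like this can be captured as a small, shared record before any design work starts. The sketch below is a generic Python example; the field names and numbers are hypothetical and are not part of Strava’s actual tooling.

```python
from dataclasses import dataclass

# Illustrative only: a small, shared record of a hypothesis and the KPIs it
# targets, so design, engineering, and analytics stay anchored to the same goal.
@dataclass
class ExperimentHypothesis:
    statement: str                 # the assumption we want to validate
    primary_metric: str            # the business metric the experiment must move
    baseline_rate: float           # current performance of the control experience
    min_detectable_effect: float   # smallest relative lift worth acting on

# Hypothetical example, loosely modeled on the activity tagging experiment below.
tagging = ExperimentHypothesis(
    statement="Athletes will invite friends they worked out with from the activity view",
    primary_metric="new_user_registrations",
    baseline_rate=0.04,            # made-up control conversion rate
    min_detectable_effect=0.10,    # made-up 10% relative lift
)
```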

2. Designing experiments

Each project has its own specific metrics we are attempting to influence. These may be as simple as install or signup rate, or more complex longitudinal measures such as retention or upload activity rate.
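As a rough illustration of the difference, a simple rate can be computed from a single conversion event, while a longitudinal measure has to follow users over time. The sketch below assumes hypothetical event data keyed by user ID; it is not Strava’s metrics pipeline.

```python
from datetime import timedelta

# Illustrative sketch of the two kinds of metrics, using hypothetical event data.

def signup_rate(installs: int, signups: int) -> float:
    """Simple point-in-time rate: share of installs that become signups."""
    return signups / installs if installs else 0.0

def seven_day_upload_retention(signup_times: dict, upload_times: dict) -> float:
    """Longitudinal measure: share of new signups who upload an activity
    within 7 days of signing up. Both arguments map user_id to timestamps."""
    retained = 0
    for user_id, signed_up_at in signup_times.items():
        uploads = upload_times.get(user_id, [])
        if any(signed_up_at <= t <= signed_up_at + timedelta(days=7) for t in uploads):
            retained += 1
    return retained / len(signup_times) if signup_times else 0.0
```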

We approach each experience we design as an experiment that is aimed at expediting our learning and validating/invalidating our assumptions.

An example of this approach is our activity tagging feature (shown below).

Activity tagging in its first phase (left), which allowed athletes to invite friends to Strava from an activity view, and in a later phase (right), which allows them to add other Strava athletes or friends who didn’t record.

When first launching the experiment, we wanted to validate the hypothesis that athletes had a high intent to invite friends they had worked out with but who had not recorded the same activity (or weren’t on Strava), and that they would “add” (invite) them when viewing that activity in the app.

To validate this assumption, we first launched a simplified mobile experience that allowed athletes to invite others to Strava using the native share sheet from the activity detail view.

UI and copy tests designed to increase feature adoption and outbound inviting.

Once that initial hypothesis was validated (and after several copy and UI tests), we shipped a more refined version that allowed the activity owner to share an actual copy of the activity rather than an invite, and prompt the person receiving that copy to save it to their profile and customize it.

We later followed with the latest version, which also allows the activity owner to seamlessly add other Strava athletes in addition to adding friends who aren’t on Strava yet.

3. Releasing experiments

Once an experiment is built, we release it in a controlled manner to a portion of our athletes as a test.

Experiments are structured so a new design is compared to a control group (the original design) in an A/B or multivariate test.

Testing an experience or feature before releasing it to the entire user base allows us to isolate its impact and iterate further if performance differs from our expectations. Additionally, this allows us to test in select languages (such as English) and forgo localization, expediting the testing process further.
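One common way to run this kind of controlled exposure is to deterministically hash each user into a bucket, so assignment stays stable across sessions. The sketch below is a generic illustration of that idea rather than Strava’s experimentation platform; the experiment name and exposure fraction are made up.

```python
import hashlib

# Generic sketch of controlled experiment exposure: hash the athlete ID so
# assignment is stable across sessions, expose only a fraction of users,
# and split that exposed group into variants.
def assign_variant(user_id, experiment, exposure=0.10,
                   variants=("control", "treatment")):
    digest = hashlib.sha256(f"{experiment}:{user_id}".encode()).hexdigest()
    bucket = int(digest[:8], 16) / 0xFFFFFFFF      # uniform value in [0, 1]
    if bucket >= exposure:
        return None                                # user never sees the test
    index = int(bucket / exposure * len(variants))
    return variants[min(index, len(variants) - 1)]

# Example: roughly 10% of athletes enter the test, split evenly between the
# original design (control) and the new one.
print(assign_variant("athlete_12345", "signup_wall_v2"))
```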

An example of an A/B test. Our hypothesis was that surfacing signup options immediately upon app load would increase overall signup rate, especially through Facebook.

There may be instances in which we do not have an existing experience to use as the control group for our experiments, such as when launching a new feature. Activity tagging was an example of this.

Our hypothesis was that inviting a friend who worked out with you but didn’t record would drive more invites than our standard invitations. There was no existing version of this feature, but we were able to compare it to our standard invites, which share the same primary KPI (new user registrations).

During testing, tagging showed a 5x increase in invites sent, compared to standard invitations.

Two features that shared the same KPI (outbound invite volume, new user registration), used for performance comparison.

4. Evaluating experiments

We return to an experiment after it has run for a defined period of time (usually two weeks) or once statistical significance has been achieved, and then we dive into the data to analyze the results as a team.

There are instances when reaching statistical significance could take several weeks, if not months, for example when we’re running tests on experiences with a lower volume of usage. In these cases, we can opt for an A/B test rather than a multivariate one, which expedites learning by concentrating user impressions or traffic on only two variants.
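For conversion-style metrics, the significance check often comes down to comparing two proportions. The sketch below shows a standard two-proportion z-test with made-up counts; it is a generic illustration, not the exact analysis our tooling performs.

```python
from math import sqrt
from statistics import NormalDist

# Illustrative two-proportion z-test for a conversion metric; the counts
# below are made up, not Strava data.
def conversion_p_value(control_conversions, control_n, variant_conversions, variant_n):
    p_control = control_conversions / control_n
    p_variant = variant_conversions / variant_n
    p_pooled = (control_conversions + variant_conversions) / (control_n + variant_n)
    se = sqrt(p_pooled * (1 - p_pooled) * (1 / control_n + 1 / variant_n))
    z = (p_variant - p_control) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))      # two-sided p-value

p = conversion_p_value(control_conversions=480, control_n=12000,
                       variant_conversions=560, variant_n=12000)
print(f"p = {p:.4f}, significant at 0.05: {p < 0.05}")
```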

5. Graduating experiments

Based on the insights we have gathered, we decide whether the experiment should “graduate” into an experience for all our athletes.

If the experiment was successful and our hypothesis is validated, we will roll it out to our entire athlete base.

If it wasn’t successful and our hypothesis was invalidated, we’ll dig deeper into the data with the help of our product analysts. They provide insights on why performance may have differed from our expectations, which informs our design for follow-up tests and uncovers areas of opportunity. In these cases we may also return to our initial research or usability tests in an attempt to find any qualitative insights that may help pinpoint what we’re seeing in the quantitative data and reveal a more comprehensive story.

6. Leveraging qualitative data

Our approach isn’t always purely quantitative and scientific.

We may conduct early user research to validate our assumptions in a qualitative manner and to gather early signals about whether the project at hand has the potential to drive the metrics the growth team cares most about. If it doesn’t, we will consider transferring ownership to another team or moving on, to ensure we concentrate our attention on areas of the product with high growth potential. In this way, even the team’s invalidated hypotheses can still be highly valuable to other teams that operate with different business goals.

For example, the growth team learned from an experiment that a particular change in the way we presented content on a mobile page led to a ~2x increase in outbound invites compared to control. As a result, when the team that focuses on that area worked on a redesign, they leveraged this quantitative data and other qualitative learnings from the growth team to maximize product performance.

Conclusion

A growth-oriented approach to building products will ensure you validate hypotheses in a fast and targeted way. You can do this by starting with a clear hypothesis you wish to validate and a set of metrics you wish to influence with your experiment.

Keep your designs simple and prioritize features and components by asking yourself if these will have a direct impact on your target metrics. Deprioritize them if they don’t, but never compromise usability or clarity of the experience (i.e. avoid dark patterns).

When your experiment is ready, release it to a portion of your users that is large enough for statistical significance to be reached as quickly as possible (try two weeks). If your user base is not very large, opt for an A/B test instead of a multivariate one.
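If you want a rough sense of how large that portion needs to be, a standard power calculation for a two-variant conversion test gives a ballpark. The baseline rate, expected lift, significance level, and power in the sketch below are purely illustrative.

```python
from statistics import NormalDist

# Rough sample-size estimate for a two-variant conversion test; all the
# default values are illustrative, not Strava benchmarks.
def users_per_variant(baseline=0.04, relative_lift=0.10, alpha=0.05, power=0.80):
    p1, p2 = baseline, baseline * (1 + relative_lift)
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)
    z_power = NormalDist().inv_cdf(power)
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return int(variance * (z_alpha + z_power) ** 2 / (p2 - p1) ** 2) + 1

n = users_per_variant()
print(f"~{n} users per variant; compare against the traffic you expect in two weeks.")
```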

Your ultimate goal should be to validate your hypotheses as quickly and cost-effectively as possible, so you can graduate successful experiments to your entire user base and maximize performance gains.

We’re excited to hear how you design your experiments! Let us know by leaving a comment below.

If you are interested in helping grow the most engaged community of athletes in the world, check out our open positions.
