Becoming Data Driven Level 3: Using Data to Make Awesome Decisions

Scout24
Scout24 Engineering
8 min read · Jun 14, 2017

written by Stephen Hardisty

In the second article in this series, we talked about ways you can detect whether what you’re planning to do to your online product is a bad idea. This time we’ll be more forward-looking and talk about forecasting and prioritising work. Again, I went with an exciting title only to disappoint with the reality of the topic. Life is full of disappointing shit like this; get used to it.

It Started with a Hypothesis (I never thought it would come to this)

Say you own a website that sells socks and you’ve noticed you are shipping more cheap socks than premium socks. Based on this, you have a hypothesis: people will buy more if you increase the prominence of cheap socks. You just need to figure out how to test this hypothesis and get rich. Sock rich. You talk to your team and come up with the following experiments:

  • Display cheap socks on the homepage instead of the coolest socks selected by your in-house sock experts.
  • Change the search results to order by price ascending (cheapest first) rather than newest first.

Known Knowns and Known Unknowns

Let’s focus on the second experiment listed above: to order search results by price ascending. You may be thinking about the impact of this in one of two contexts:

  • You’ve done something like this before on search results or another part of the funnel, so you have some idea as to the potential impact of such a change. Ideally you’ve run this as an experiment and have high confidence in the results.
  • You’ve never done this and have no idea what to expect.

In the case where you’ve done similar work before, it’s a bit easier to set a goal: you can use data from previous experience as a baseline and apply some judgement. If the least effective positive change you have made elicited a 0.25 percentage point (referred to as “pp”) improvement in conversion and the most effective a 2pp improvement, then you have a range. You can then apply other factors to refine that range, such as the relative prominence of the change in the funnel, and you are getting close to a realistic goal. It’s easy to overlook, but one of the most important artefacts of an experiment is a sense of what impact you can expect from future changes. The more you experiment and understand the results, the more you learn about your product, your users and what is possible. If you’re storing the results of previous experiments somewhere, then you get an extra gold star. Well done.
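That gold star is cheap to earn: even a tiny log of past results gives you the range described above. Here’s a minimal sketch in Python; the experiment names and numbers are invented for illustration:

```python
from dataclasses import dataclass

@dataclass
class ExperimentResult:
    name: str
    conversion_change_pp: float  # percentage point change vs. control

# A hypothetical log of past experiments on the sock shop.
history = [
    ExperimentResult("bigger add-to-cart button", 0.25),
    ExperimentResult("free shipping banner", 2.0),
    ExperimentResult("autoplaying sock video", -0.4),
]

# Only the positive results inform the range for a realistic goal.
wins = [r.conversion_change_pp for r in history if r.conversion_change_pp > 0]
print(f"Expect roughly {min(wins)}pp to {max(wins)}pp from a similar change")
```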

In the case where you’ve never done such a change to your sock website before and have no idea what to expect, the range for a conversion rate change might be anything above or below zero percentage points. This is where something like funnel analysis can be helpful (see Becoming Data Driven Level 1: Having User Data You Trust). Now close your eyes and imagine you are doing that analysis… Nice one. Here’s what you discovered:

  • You have 1,000,000 sessions and 20,000 sales in total each month. This is a conversion rate of 2%.
  • 250,000 of those 1,000,000 sessions include a search (25%).
  • 100,000 of the 250,000 sessions including a search then look at a sock details page. 10,000 of those sessions result in a purchase (a conversion).
  • Search therefore has a conversion rate of 4% (10,000 being 4% of 250,000).
  • Search represents 50% of all conversions (10,000 sales from a total of 20,000 sales).
Search result funnel
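If you’d rather keep your eyes open, the arithmetic behind those findings is just a few divisions. A minimal sketch using the numbers above:

```python
# Monthly funnel numbers from the analysis above.
total_sessions = 1_000_000
total_sales = 20_000
search_sessions = 250_000
detail_view_sessions = 100_000
search_sales = 10_000

print(f"Overall conversion rate: {total_sales / total_sessions:.0%}")               # 2%
print(f"Sessions including a search: {search_sessions / total_sessions:.0%}")       # 25%
print(f"Search sessions reaching a detail page: {detail_view_sessions / search_sessions:.0%}")  # 40%
print(f"Search conversion rate: {search_sales / search_sessions:.0%}")              # 4%
print(f"Conversions coming via search: {search_sales / total_sales:.0%}")           # 50%
```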

Let’s come up with the hypothesis that changing the order of search results to price ascending will increase the number of detail page views by 50% (from 100,000 to 150,000):

  • The conversion rate from search moves up 2pp, from 4% to 6%.
  • The overall conversion rate moves up 0.5pp from 2% to 2.5%.
  • You sell 5,000 more socks per month.

To keep the maths simple, I’ve assumed a linear increase in conversion, where 50% more detail page views translate into 50% more sales. This is very much a stretch, as some of those extra detail page views may be casual browsing rather than purchase intent, so consider this an absolute maximum benefit. You’ll see how close it really is by running an experiment.
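And here is that forecast as a few lines of Python, reusing the (made-up) funnel numbers from above:

```python
# Forecast under the linear assumption: sales scale with detail page views.
search_sessions = 250_000
search_sales = 10_000
total_sessions = 1_000_000

uplift = 1.5  # hypothesis: 50% more detail page views

new_search_sales = search_sales * uplift        # 15,000
extra_sales = new_search_sales - search_sales   # 5,000 more socks per month

search_cr_change = extra_sales / search_sessions * 100   # +2pp (4% -> 6%)
overall_cr_change = extra_sales / total_sessions * 100   # +0.5pp (2% -> 2.5%)

print(f"Extra sales per month: {extra_sales:,.0f}")
print(f"Search conversion: +{search_cr_change:.1f}pp, overall: +{overall_cr_change:.1f}pp")
```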

Now you have a well-considered hypothesis with a reasonable goal (even if you started with no idea) and an idea of what to test to reach that goal. By calculating the potential yield of our hypothesis, we have a baseline for prioritising the work to test it. You can start writing tasks and people can start designing and coding. Awesome!

… There is, however, an enormous “but” when forecasting the impact of a change to the order of search results (and I don’t just mean your momma’s*): who cares about the order of socks on search results? The biggest mistake I see over and over and over and over and over again is that teams and companies celebrate launches and not impact. This is fine if you are contractually obliged to deliver something in return for money, but not if your job is to improve your own product. If changing the order of search results doesn’t increase sales of socks, nobody cares. Go and try something else until you hit that goal. To be clear: in planning, you are looking for an effort estimate and a measurable goal. If you don’t hit that goal, then learn and try something different.

Comparing Apples with Apples

A significant challenge for companies building products is knowing which of their myriad ideas to pursue. What you should work on next may occasionally be obvious. However, it is likely that you have more ideas than you can possibly execute on. To figure out what to do first, a lot of companies and teams perform some scheduled planning; this might happen annually, but it could be as frequent as every week. In this planning, ideas from multiple sources are presented, prioritised and approved… Or rejected and ignored. As established in the first paragraph: life is hard. Here are four things that are important in, but often missing from, these exercises.

Consistent impact measurement. As previously stated, unless you have some contractual obligation, the most important thing when executing a project is the impact, not the delivery. Despite it often being quite possible to work out a goal (see above), it is common to hear project proposals without one. This doesn’t have to be lengthy prose; just something like “0.5pp increase in conversion”, an explanation of how you plan to reach that number and the assumptions you made.

When multiple teams propose projects, Conway’s law often applies and you may end up with different expressions for the same thing. For example, you might hear “0.5pp increase in conversion” stated as “305 extra socks sold”, “10 extra socks sold per day”, “50% increase in conversion from search results”, “$6,982 extra revenue per month”, “increased user satisfaction from finding cheap socks” and so on. All of these could be stating the same thing, but in ways that make it hard to understand and compare. It is therefore beneficial to provide your teams with a small number of metrics and the units they should use. For our sock website, this could be:

  • Percentage point change in conversion rate (sessions that result in a sale).
  • Number of new users (those without a previously recorded session).
  • Number of returning users (those with a previously recorded session).

A very large proportion of our sock projects should be able to impact one of these three metrics. Your product may not be a sock website, but the key thing is using simple metrics that can be understood and expressed consistently throughout your organisation.
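A cheap way to enforce this is to normalise every proposal into the agreed units before comparing them. A minimal sketch, assuming the monthly session count from the funnel analysis earlier:

```python
TOTAL_SESSIONS = 1_000_000  # monthly sessions, from the funnel analysis

def extra_sales_to_pp(extra_sales_per_month: float) -> float:
    """Convert an 'extra socks sold per month' claim into a
    percentage point change in overall conversion rate."""
    return extra_sales_per_month / TOTAL_SESSIONS * 100

# Normalising makes it obvious whether two claims actually agree.
print(f"{extra_sales_to_pp(5_000):.2f}pp")  # 0.50pp
print(f"{extra_sales_to_pp(305):.2f}pp")    # 0.03pp -- a very different claim
```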

Work backwards. An extension of the point above is this: start with a goal. Personally, I love having a goal set and being given the freedom to work out how to achieve it. Start with “we plan a 0.5pp increase in conversion” and only then add “we will try to achieve this by changing the order of the search results, because we assume most people are interested in cheap socks”. By doing so, you are emphasising what is important: a 0.5 percentage point increase in conversion. You’re also expressing how you might get there, but you shouldn’t be afraid to change your plans.

T-shirt sizes are localised measures. This is literally true: t-shirt sizes reflect the size of the people in the local market. Who knew? It’s also figuratively true when it comes to measuring the complexity or effort of work. Two teams might have very different ideas of what “small”, “medium” and “large” mean. So when comparing proposals from multiple teams, try sticking to how much time you will invest and who will work on it. The mental model I go with is: “if a team were locked in a room to just do this, who would be in that room and for how long?”. You can note extra uncertainty if you plan to run an experiment whose results may affect your later decisions, or if there is some other unpredictable factor. All of this is flawed, but you’re looking for a way to reasonably compare work in order to prioritise it, and days, weeks and months are well-understood units of time.

Your hypothesis could be wrong. But that’s cool. Optimise your execution style around this belief. It’s ok to be wrong if you’ve just learned something new about how your users interact with your product. In fact it is important. It’s less cool to expend needless energy to achieve that learning. When figuring out how to test your hypothesis, focus on the least possible effort and ways to fake the experience for users rather than build the whole thing.

Wrap Up

At ImmobilienScout24, we’ve been using planning cards to help consistently frame our projects across multiple teams so we can compare, prioritise and plan. Below is a version of that card you might use for your sock website:

A planning card and an illustration of why my parents bought me a computer

As this might be hard to read, the card above includes:

  • What we’re planning to do (“increase conversion by promoting cheap socks”).
  • Our three goals and their forecasted impact (+0.5pp on conversion, incorrectly written as 0.5% on the card, and no expected impact on the volume of new or returning users).
  • The experiments we plan to run to get there (“search results ascending” and “cheap socks on homepage”).
  • The assumptions we’ve made (“price is more important than recency for users”).
  • The people we will require, which helps us plan capacity.
  • How long we think it’ll take to set up, run and analyse our experiments. As mentioned above, units of time help normalise across multiple teams. It’s not cool but it is practical. Practicool, if you like.
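If you want to keep such cards somewhere machine-readable (say, for capacity planning across many teams), a card boils down to a small record. A hypothetical sketch; the staffing and duration below are invented, not from our actual card:

```python
from dataclasses import dataclass

@dataclass
class PlanningCard:
    plan: str
    goals: dict[str, str]    # metric -> forecast impact, in agreed units
    experiments: list[str]
    assumptions: list[str]
    people: list[str]        # who would be "locked in the room"...
    duration_weeks: int      # ...and for how long

card = PlanningCard(
    plan="Increase conversion by promoting cheap socks",
    goals={"conversion": "+0.5pp", "new users": "0", "returning users": "0"},
    experiments=["search results ascending", "cheap socks on homepage"],
    assumptions=["price is more important than recency for users"],
    people=["1 designer", "2 engineers", "1 analyst"],  # hypothetical staffing
    duration_weeks=4,  # hypothetical estimate
)
```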

The key message for this post is encapsulated in this planning card: have a hypothesis, something to measure and a starting point (in this case experiments); the final deliverable is less important than the goal.

The next post will be all about “Using Data to Shape Your Organisation”, where you can start applying what’s been outlined here to design team structures and processes.

* Although I do hear it has its own moon.
