Designing a better feedback loop

Kevin Schiroo
Camber Product, Data & Tech
Feb 11, 2022

Using data science to improve our ability to learn


Without effective feedback loops, all businesses are on a random walk towards success or failure. Each new idea is a step in one direction; we hope it’s towards success, but there are many more paths to failure. Knowing this, it’s worth taking a moment to think about how we can make our feedback loops better. Over the several years I’ve spent working in data science, I’ve become convinced that its greatest value is in its ability to build better feedback loops. Good feedback satisfies two requirements:

  1. It is prompt. It’s very difficult to learn anything when action and result are separated by large amounts of time.
  2. It is unambiguous. Needing to spend large amounts of time trying to figure out what the feedback actually means dramatically increases the odds that it will be misinterpreted and decreases the odds that people will pay attention to it at all.

These are both areas where the tools of data science can be brought to bear to design better feedback loops.

To put this in concrete terms, let’s look at an example. Let’s pretend we’re working in marketing for a SaaS company. We have a product that offers a four-week demo before customers must either convert to a paid plan or leave. As marketers, our goal is to get as many soon-to-be paying customers into the product as possible. How do we set up our feedback loop to help us achieve this goal?

Iteration 1: The simple metric and its proxy

A natural starting point might be to just look at the goal directly and draw feedback from the number of conversions we’re seeing in the product. Let’s go to the numbers!

Now for that core question of learning: Is what we’re doing working? The initial answer would seem to be “yes, we’ve recently seen a really substantial uptick in conversions,” but let’s take another moment to think it through. The product has a four-week trial period, so people are probably waiting most of those four weeks before converting. The feedback we’re getting right now really isn’t about what we’re doing right now, but rather about what we were doing four weeks ago. Is that when we started using that new channel, or is that just after that one trade show? Having feedback so substantially delayed dramatically limits its effectiveness: it isn’t prompt. That sends us to the next best thing, a proxy.

Soon-to-be paying customers are what we really care about, but it’s really difficult to learn anything if we need to wait that long. Instead we need something that is related to the metric we care about but can be observed much sooner. Sign-ups seem like they would fit the bill.

Now that metric tells us a different story. We had been doing pretty well four weeks ago, but in the last week we’ve actually seen a pretty steep decline; we should figure out what’s causing that. The sign-up proxy gives us the promptness that we need for effective feedback.

Beware though: this metric doesn’t provide us with unambiguous feedback! This is especially dangerous because as humans it’s very easy for us to forget that the proxy metric is just a proxy and not the actual thing we care about. Combine this with Goodhart’s law and danger emerges.

Goodhart’s Law: When a measure becomes a target, it ceases to be a good measure.

Remember, our goal is to juice these numbers, and our metric is sign-ups. We do all the math and find that with the current conversion rate and customer lifetime value (LTV) we can responsibly improve our numbers by offering a free T-shirt for every new sign-up. We launch our campaign and sign-ups go through the roof, high fives all around! Yet lo and behold, four weeks later those conversions aren’t manifesting, and in fact the raw conversion count is actually lower than normal. It turns out that most of those people were just signing up for a free shirt, and all that volume actually made it more difficult for the sales team to find the good sign-ups to work.
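For the curious, the “all the math” step is usually just a back-of-the-envelope expected-value check. Here’s a minimal sketch with entirely made-up numbers for the conversion rate, the LTV, and the cost of a shirt.

```python
# Back-of-the-envelope check: is a free T-shirt per sign-up "responsible"?
# All three numbers below are invented for illustration.
conversion_rate = 0.04  # assumed: 4% of sign-ups convert within four weeks
ltv = 1200.00           # assumed: average customer lifetime value, in dollars
shirt_cost = 12.00      # assumed: fully loaded cost of one T-shirt

expected_value_per_signup = conversion_rate * ltv  # 0.04 * 1200 = 48 dollars
print(f"Expected value per sign-up: ${expected_value_per_signup:.2f}")
print(f"Shirt pays for itself: {shirt_cost < expected_value_per_signup}")
```

The catch, of course, is that this check assumes the shirt-seekers convert at the same rate as everyone else, which is exactly the assumption the campaign broke.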

While this example may be extreme, it illustrates the risks of using proxy metrics, especially when there isn’t good visibility into and focus on the metric that actually matters. In our effort to improve a proxy we can make its meaning more ambiguous.

Iteration 2: The cohorted metric

Aware now of the hazards of proxy metrics, we go looking for something better: something that keeps us focused on what actually matters, but also gives us feedback quickly enough that we can learn from it. This is where cohorting can become useful.

Cohorting is simply the practice of grouping people based on some characteristic; in this context that characteristic would likely be sign-up time. We group all of our sign-ups by sign-up week[1] and, once they are cohorted, monitor each cohort’s conversion rate week by week.
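To make that concrete, here’s a minimal sketch of how such a cohort table might be assembled with pandas. The file name and column names (accounts.csv, signed_up_at, converted_at) are placeholders for illustration, not an actual schema.

```python
import pandas as pd

# Assumes one row per sign-up with two timestamp columns: `signed_up_at` and
# `converted_at` (NaT if the account never converted). Names are illustrative.
accounts = pd.read_csv("accounts.csv", parse_dates=["signed_up_at", "converted_at"])

# Cohort each sign-up by the week it signed up in.
accounts["cohort_week"] = accounts["signed_up_at"].dt.to_period("W").dt.start_time

# How many weeks into its lifecycle each account converted (1-based; NaN if never).
weeks_to_convert = (accounts["converted_at"] - accounts["signed_up_at"]).dt.days // 7 + 1

cohort_sizes = accounts.groupby("cohort_week").size()
today = pd.Timestamp.today()

# Cumulative conversion rate (%) for each cohort at ages one through four weeks.
table = pd.DataFrame(index=cohort_sizes.index)
for age in range(1, 5):
    converted = accounts[weeks_to_convert <= age].groupby("cohort_week").size()
    rate = converted.reindex(cohort_sizes.index, fill_value=0) / cohort_sizes * 100
    # Leave cells blank for cohorts that haven't reached this age yet
    # (the lower-right corner of the chart).
    too_young = cohort_sizes.index + pd.Timedelta(weeks=age) > today
    table[f"week {age}"] = rate.mask(too_young)

print(table.round(1))
```

Each row of the resulting table is a cohort and each column is an age, which is exactly the shape of the chart discussed next.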

Let’s take a moment to discuss what this chart means. Along the left side is the cohort date and along the top is the cohort age; the numerical values between them denote the percentage of the cohort that had converted by that point in their lifecycle. If we look at January 1 (the cohort for all accounts that signed up during the week of January 1) we see that 0.2% converted by the end of the first week, 0.5% by the end of the second week, 1% by the end of the third week, and 4% by the end of the fourth week. Why are there a bunch of blank cells in the lower right corner? Those are weeks that haven’t happened yet; each cohort is one week younger than the one before it.

What does this chart tell us about how we’re doing right now? It looks like what we’re doing is working! Between January 15 and January 22 our one-week conversion rate saw a meaningful increase; if we can maintain that while also maintaining our sign-up volume, it’s a good sign for the weeks ahead.

Cohorted metrics allow us to get better feedback much more quickly than we would otherwise. Often the effects of our current actions get washed out by the momentum of our past actions. In this case, where most conversions happen in week four, the 0.05% change in week one that tells us we’re on the right track would be very difficult to see without cohorting.

It is important to remember, though, that these metrics still are not perfect. In this example, if we allow ourselves to get too focused on the early conversion figures, we risk spending a bunch of energy trying to accelerate the conversion timeline rather than increase total conversions. It’s nice if we can move conversions from week four to week one, but if the goal was to increase overall conversions, we’ve missed the mark.

Iteration 3: Pull in the meaningful metric

In our ideal world, the moment we act we would see how that action impacted conversions. That would give us a tight, clear feedback loop we can act against. This is a case where we can use machine learning to support human learning.

Once we’ve managed to get a sign-up into a properly instrumented SaaS product, they very quickly start demonstrating their intentions. There is no deep magic here — if you could watch over someone’s shoulder you’d probably quickly get an idea as to whether they intended to convert. A person who signed up to get some free swag acts differently than someone who is seriously evaluating the product. Machine learning simply lets us answer that question at scale.

We ask our machine learning model one question: “Based on everything we’ve seen this person do, do they look like the sort of person that is going to convert within four weeks?” In asking this one question we pull in the metric we actually care about from four weeks out to right now.
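Here’s a rough sketch of what that could look like with scikit-learn, assuming a feature table of early-behavior signals (sessions in the first few days, invites sent, key features touched) plus a converted_within_4_weeks label for cohorts old enough to have one. The file names and column names are illustrative, not an actual pipeline.

```python
import pandas as pd
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Historical sign-ups: early-behavior features plus a 4-week conversion label.
history = pd.read_csv("signup_features.csv")
X = history.drop(columns=["account_id", "converted_within_4_weeks"])
y = history["converted_within_4_weeks"]

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0
)

model = GradientBoostingClassifier()
model.fit(X_train, y_train)
print(f"holdout AUC: {roc_auc_score(y_test, model.predict_proba(X_test)[:, 1]):.3f}")

# Score this week's brand-new sign-ups: each probability answers the question
# "does this account look like it will convert within four weeks?"
new_signups = pd.read_csv("this_weeks_signups.csv")
ltc_scores = model.predict_proba(new_signups[X.columns])[:, 1]
print(f"expected conversions among this week's sign-ups: {ltc_scores.sum():.1f}")
```

Summing the scores gives a rough “conversions in flight” number that moves the day a campaign launches instead of four weeks later.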

Likely to convert (LTC)

Now we can see things very quickly and very clearly. How are we doing this week? Better than last week and a lot better than two weeks ago. We are able to keep our attention focused on the thing that really matters, conversions, while still getting feedback in time to act on it and learn from it.

While there are hazards to be aware of with such an approach, they tend to originate from places that are fairly easy to avoid. The trickiest one is the desire for an explanation as to why things are the way they are. Why are numbers up? Why are they down? The truth is that machine learning models are correlation engines; they can’t determine causes. There are methods for causal inference in the scientific community, but they take significant amounts of effort and still tend to leave results requiring asterisks[2]. Confusing correlation with causation will just lead you down blind alleys.

To illustrate the point, suppose we find a relationship between inviting three other users and converting. Does that mean we should launch a campaign to get people to invite three new users? Probably not. It’s more likely that people who are seriously considering converting are also going to invite other users. Pushing them to do something they aren’t ready for may actually hurt conversions.

Humans and machines share an odd commonality here — you can make them talk, but you can’t make them tell you what they don’t know. It’s better to take the soft approach: let them tell you what they can and let human learning do the rest. Use their feedback to form theories, combining it with all the information you have that the machine simply can’t know. Try those theories out, and let the machine learning model give you the quick feedback you need to refine your ideas and try again.

Effective feedback loops are key to guiding a business towards success. In order to determine whether we’re heading in a good direction we need prompt and unambiguous feedback. The tools of data science help us deliver that high-quality feedback and are well worth the effort.

[1] A week tends to be a nice cohorting period in that it smooths out the arbitrary peaks and troughs that come with weekdays and weekends.

[2] They often tend towards, “We weren’t able to show that it wasn’t causal,” which is an improvement, but not a silver bullet.
