How to Prevent Customer Churn With Machine Learning

3 steps to prevent churn without overthinking it

Published in The Startup · 5 min read · Jul 2, 2020

You’ve been hired as the first data scientist at a hot new startup, with a revolutionary idea that can change the world. The vibe is electric at the office; everyone knows you’re working on something special that only comes around once in a decade.

Thanks to some good marketing efforts, users sign up in droves! In order to take the company to that 10X level, however, users need to get hooked. The problem? Most don’t even last a month.

Meme made by Richie Frost, using Mematic

“You’re the data scientist”, they say. “Tell us what the data says. Why are they leaving?”

Is the app buggy? Not exciting enough? Poor product-market fit? These are important questions that aren’t always easy to quantify.

There are a million different directions you can go from here. But you don’t have forever to do this. If you don’t sell (AKA get more subscribers to stay), you and your team don’t eat. So you really need to spend your time on high-leverage activities in order to be effective.

What I’m going to tell you isn’t new. But the techniques you use first in your churn analysis could make all the difference. Here are three steps to using machine learning to prevent churn.

But first…

What is churn?

To put it bluntly, churn is when your customer has decided to, well, stop being your customer. If they’re a subscriber, they’ve canceled. Simple as that.

Obviously that’s not something we want. So how do we prevent it?

1. Figure out when they churn

Technique: The Engagement-Retention curve

Some users stay longer than others. How long does a user need to be engaged before we know if they’re hooked?

The engagement-retention curve tells you how engaged users need to be before they’re hooked. Typically, users who are more engaged over a period of time are more likely to stick around when there are bumps in the road.

Here’s an example. If you’re measuring engagement by the number of requests users make to your chat bot, your engagement-retention curve might look something like this. The x-axis is the minimum number of requests per user (e.g. ≥ 10), and the y-axis is how likely those users are to stick around:

Example engagement-retention curve for a hypothetical chat bot startup

It’s pretty clear that the retention rate levels off around 90–95% once users hit around 30 total requests. If you can get users to that point, you’re much more likely to crush it. You’ve also got some clear milestones along the way that you can use for setting goals for user engagement in order to keep them engaged long-term.
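If you want to build a curve like this yourself, here’s a minimal sketch using Pandas. The column names (total_requests, retained), the thresholds, and the data are all made up for illustration; swap in whatever engagement metric and retention definition fit your product.

```python
import pandas as pd

# Hypothetical per-user data: total chat bot requests and whether the
# user was still subscribed at the end of the observation window.
users = pd.DataFrame({
    "total_requests": [2, 5, 8, 12, 15, 22, 31, 35, 40, 55],
    "retained":       [0, 0, 1,  0,  1,  1,  1,  1,  1,  1],
})

def retention_curve(df, thresholds):
    """Retention rate among users with at least `t` requests, per threshold."""
    rows = []
    for t in thresholds:
        cohort = df[df["total_requests"] >= t]
        rate = cohort["retained"].mean() if len(cohort) else float("nan")
        rows.append({"min_requests": t, "users": len(cohort), "retention_rate": rate})
    return pd.DataFrame(rows)

# Each row is one point on the curve: x = min_requests, y = retention_rate.
curve = retention_curve(users, thresholds=[1, 10, 20, 30])
print(curve)
```

Plot min_requests against retention_rate and look for the point where the curve flattens out. Keep an eye on the users column too: with only a handful of users in a cohort, the rate is too noisy to trust.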

2. Focus on the most valuable users first

Technique: Clustering

One common way to find out which users are most valuable is with an unsupervised machine learning technique called RFM clustering (Recency, Frequency, Monetary Value). The trick is to bucket or bin each individual metric and order the bins by business value before combining them into a single value score. A good place to start is to have 4 bins per metric.

Once you’ve calculated these metrics, you can easily see how likely your users are to be valuable. For example, an excited new user might have made a request fairly recently (recency bucket 4, highly valuable), talks to the chat bot every day (frequency bucket 4, highly valuable), and made a purchase or two (monetary value bucket 2, somewhat valuable). They’d have an overall value score of 4 + 4 + 2 = 10 out of 12.
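The bucketing and scoring described above could be sketched like this with Pandas. The column names, the toy data, and the quartile_score helper are assumptions for illustration, and equal-sized quartile bins are just one reasonable way to get 4 bins per metric.

```python
import pandas as pd

# Hypothetical per-user summary; column names are illustrative.
rfm = pd.DataFrame({
    "days_since_last_request": [1, 3, 7, 14, 30, 45, 60, 90],
    "requests_per_week":       [35, 20, 14, 7, 5, 3, 2, 1],
    "total_spend":             [40, 250, 10, 90, 0, 15, 5, 120],
}, index=[f"user_{i}" for i in range(8)])

def quartile_score(series, higher_is_better=True):
    """Bin a metric into 4 equal-sized buckets, scored 1 (least) to 4 (most valuable)."""
    # Ranking first breaks ties so qcut always finds 4 distinct bin edges.
    ranks = series.rank(method="first", ascending=higher_is_better)
    return pd.qcut(ranks, q=4, labels=[1, 2, 3, 4]).astype(int)

scores = pd.DataFrame({
    # Fewer days since the last request is better, so the order is flipped.
    "recency": quartile_score(rfm["days_since_last_request"], higher_is_better=False),
    "frequency": quartile_score(rfm["requests_per_week"]),
    "monetary": quartile_score(rfm["total_spend"]),
})

# Overall value score: 3 (least valuable) to 12 (most valuable).
scores["value_score"] = scores[["recency", "frequency", "monetary"]].sum(axis=1)
print(scores.sort_values("value_score", ascending=False))
```

Ordering the bins by business value is the important part: recency is scored in reverse because a smaller number of days is the better outcome.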

Cluster your users by this value score to see which groups tend to be the most valuable to the business and are leaving sooner than expected, and focus primarily on these users for the next step.

Here’s a simple example of how to do this in Python using Pandas and Scikit-Learn:

Simple RFM clustering example
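As a rough sketch of the clustering step (not necessarily the exact approach above), you could run k-means from Scikit-Learn on the value score and then compare churn rates per cluster. This assumes each user already has a value score and a churned flag; the data, the column names, and the choice of 3 clusters are made up for illustration.

```python
import pandas as pd
from sklearn.cluster import KMeans

# Hypothetical value scores (3 = least valuable, 12 = most valuable),
# e.g. the sum of recency, frequency and monetary buckets per user.
users = pd.DataFrame({
    "value_score": [3, 4, 3, 4, 7, 8, 7, 8, 11, 12, 12, 11],
    "churned":     [1, 1, 0, 1, 0, 1, 0, 0,  1,  1,  0,  1],
})

# Group users into 3 clusters by value score. n_clusters=3 is an
# assumption for this toy data; tune it for real data.
kmeans = KMeans(n_clusters=3, n_init=10, random_state=42)
users["cluster"] = kmeans.fit_predict(users[["value_score"]])

# KMeans labels are arbitrary, so relabel clusters by mean value score:
# 0 = low value, 1 = mid value, 2 = high value.
order = users.groupby("cluster")["value_score"].mean().sort_values().index
users["segment"] = users["cluster"].map({c: rank for rank, c in enumerate(order)})

# Size, average value and churn rate of each segment.
summary = users.groupby("segment").agg(
    users=("churned", "size"),
    mean_value=("value_score", "mean"),
    churn_rate=("churned", "mean"),
)
print(summary)
```

A segment with a high mean_value and a high churn_rate is the group to carry into the next step.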

3. Figure out why they churn

Technique: Calculating entropy with Decision Trees

Now that you’ve identified your most valuable users, you should figure out why they’re churning.

One quick way to do this is to borrow an idea from information theory — entropy. You can think of entropy as a way of quantifying how much “surprise” there is in a random variable. When looking for reasons why users churn, sorting attributes in the data by their “surprise factor” gives you the biggest rocks to move first.

In practice, it’s easy to use machine learning to get the most “surprising” reasons for churn. Decision trees are a machine learning algorithm that measures how much each attribute in your data reduces that “surprise”, and uses the answer to decide how to separate training examples as cleanly as possible.

Lower entropy means a group is dominated by one outcome, so there is less “surprise” left about what a user in that group will do. When an attribute split produces a group of mostly churned examples, for example, it’s likely that the attribute used for the split is a strong indicator of why customers in that group have churned.

Feature importances in decision trees tell us which features, on average, reduce entropy the most. So sorting by feature importance will give us the most obvious reasons for churn.
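To make the entropy idea concrete, here’s a small worked example in plain Python. The counts and the “hit a bug in week one” attribute are invented; the point is only to show how a split that produces lopsided groups lowers entropy.

```python
import math

def entropy(churned, stayed):
    """Shannon entropy (in bits) of a group with the given outcome counts."""
    total = churned + stayed
    result = 0.0
    for count in (churned, stayed):
        if count:  # a zero count contributes nothing
            p = count / total
            result -= p * math.log2(p)
    return result

# Hypothetical group of 100 valuable users: 50 churned, 50 stayed.
parent = entropy(50, 50)   # maximally mixed group

# Split on a hypothetical attribute, "hit a bug in week one".
hit_bug = entropy(36, 4)   # 40 users, mostly churned
no_bug = entropy(14, 46)   # 60 users, mostly stayed

# Information gain: parent entropy minus the size-weighted child entropy.
weighted_children = (40 / 100) * hit_bug + (60 / 100) * no_bug
information_gain = parent - weighted_children

print(f"parent={parent:.3f} hit_bug={hit_bug:.3f} no_bug={no_bug:.3f}")
print(f"information gain={information_gain:.3f}")
```

The bigger the information gain, the more that attribute tells you about who churns. A decision tree trained with the entropy criterion picks, at each step, the split with the largest gain.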

Say we have an imaginary CSV file with only the most valuable customers from step 2. We can load this CSV into a Pandas data frame, then fit this data to a decision tree to get the feature importances. Then, sort the features by their importance to highlight the most relevant features. Let’s do this in Python:

Determining reasons for churn with Decision Trees
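Here’s a minimal sketch of that workflow. To keep it self-contained it builds a small made-up data frame in place of the CSV, and the feature names are hypothetical. Note that Scikit-Learn’s decision tree uses Gini impurity by default, so the sketch asks for the entropy criterion explicitly.

```python
import pandas as pd
from sklearn.tree import DecisionTreeClassifier

# Hypothetical data for the most valuable users. In practice this would
# come from something like pd.read_csv(...) on your own export.
df = pd.DataFrame({
    "bugs_encountered": [0, 1, 5, 6, 0, 7, 1, 8, 0, 6, 1, 5],
    "monthly_fee":      [10, 20, 10, 20, 10, 20, 10, 20, 10, 20, 10, 20],
    "support_tickets":  [1, 0, 2, 1, 0, 2, 1, 0, 2, 1, 0, 2],
    "churned":          [0, 0, 1, 1, 0, 1, 0, 1, 0, 1, 0, 1],
})

features = df.drop(columns="churned")
target = df["churned"]

# A shallow tree keeps the result easy to interpret.
tree = DecisionTreeClassifier(criterion="entropy", max_depth=3, random_state=0)
tree.fit(features, target)

# Feature importances sum to 1; higher means the feature reduced entropy more.
importances = pd.Series(
    tree.feature_importances_, index=features.columns
).sort_values(ascending=False)
print(importances)
```

In this toy data, churn lines up perfectly with bugs_encountered, so that feature takes all of the importance. Real data will be messier, and importances from a single tree can shift from one sample to the next, so treat the ranking as a list of leads to investigate, not a verdict.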

Once you have the reasons for churn that matter to your most valuable users, you can leverage the results of your churn model and take action.

Subscription fees too high? Experiment with offers, or even lower fees a bit.

Too many bugs in the app? Work with your engineering team to prioritize bugs.

You get the picture.

Interestingly, this type of churn analysis can also tell you the good things to stick with — the things that keep users from churning. When the positive attributes in your data are most important, you know that that’s what your customers love.

What’s next?

What’s important moving forward is to measure progress effectively. Based on the results of your churn model, run A/B tests on random samples of your users to see if the actions you determined from step 3 actually do have a good chance of decreasing churn. And decide as a team whether the effect of the experiment is good enough to apply to the masses.
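One simple way to check whether an experiment moved churn is a two-proportion z-test, sketched below in plain Python. The function name and the numbers are hypothetical, and this is only one of several reasonable tests you could use here.

```python
import math

def two_proportion_z_test(churned_a, n_a, churned_b, n_b):
    """Two-sided z-test for a difference in churn rate between two groups."""
    p_a, p_b = churned_a / n_a, churned_b / n_b
    pooled = (churned_a + churned_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided p-value from the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value

# Hypothetical results: group A keeps current pricing (120 of 1000 churned),
# group B gets a discount offer (90 of 1000 churned).
z, p_value = two_proportion_z_test(churned_a=120, n_a=1000, churned_b=90, n_b=1000)
print(f"z={z:.2f}, p-value={p_value:.3f}")
```

A small p-value suggests the difference in churn is unlikely to be noise, but it says nothing about whether the change is worth its cost. That part is still the team’s call.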

Conclusion

In this article, we used machine learning to figure out:

when customers churn
who’s most valuable
why valuable customers churn

Then, based on the results from your churn analysis, take action — by running and evaluating experiments that have potential to make big changes. Your customers will thank you!