
How to estimate customer lifetime using churn rates

An introduction to the Geometric Distribution

Prash Majmudar
7 min read · Feb 22, 2019


Recently I was listening to Tom, our CEO, explain how we use average customer lifetime in our Return on Investment (ROI) calculations. It’s one of many elements that we estimate when building a business case with our customers. Tom explained how we use 1 / churn to estimate average customer lifetime, a commonly adopted approach. For example, if we know that the annual churn is 20%, we would estimate the average lifetime as 1 / 0.2 = 5 years.

Listening to him, I realised I didn’t know why this was valid. It is simple to calculate and is an accepted method, but I wanted to dig into why it works.

Defining churn and lifetime

To get started, it’s important to explain what we mean by churn. We can define churn as the probability that a customer leaves at the end of a year. For example, if churn is 50%, then at the end of a year we flip a coin: heads the customer churns, tails the customer renews. Obviously we can pick any timescale to consider churn over; it doesn’t have to be one year (e.g. it could be daily, weekly or monthly churn).

Clearly 50% churn is insanely high, so let’s continue by assuming a 20% churn rate. Writing Y for the customer’s lifetime in years, we can write the probability, P, of a customer churning after 1 year like this:

$$P(Y = 1) = 0.2$$

We can also ask the question: what is the probability that a customer will churn after two years? In this case they renewed in year 1 (there was an 80% chance of renewing), but they churned at the end of year 2. So the probability of churning after 2 years is:

$$P(Y = 2) = 0.8 \times 0.2 = 0.16$$

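Thus, we have a 0.16 or 16% chance of a customer churning after two years. Similarly we can ask what is the chance of a customer churning after three years, or even ten years (nine renewals and subsequent churn)?

$$P(Y = 3) = 0.8 \times 0.8 \times 0.2 = 0.8^{2} \times 0.2 = 0.128$$

$$P(Y = 10) = 0.8^{9} \times 0.2 \approx 0.027$$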

We calculated a 12.8% chance of a customer churning after 3 years or, equivalently, having a lifetime of 3 years. We can also say there is a 2.7% chance of a customer having a 10 year lifetime.
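The same arithmetic can be sketched in a few lines of Python. This is only an illustration under the assumptions used so far (a fixed 20% annual churn, with each year's outcome independent of the last); `churn_probability` is a hypothetical helper name, not something from a library.

```python
def churn_probability(years, churn_rate):
    """Chance that a customer renews (years - 1) times and then churns."""
    renew_rate = 1 - churn_rate
    return renew_rate ** (years - 1) * churn_rate


# Reproduce the lifetimes discussed above for a 20% annual churn rate.
for years in (1, 2, 3, 10):
    probability = churn_probability(years, 0.2)
    print(f"lifetime of {years:>2} years: {probability:.1%}")
```

Changing the second argument (for example to 0.1) shows how quickly the chance of longer lifetimes grows as churn falls.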

A simple model for churn: the geometric distribution

The last calculation was a little tedious to write, but we can write it more generally for a lifetime of y years as:

$$P(Y = y) = (1 - p)^{y - 1} \, p$$

where p is the probability of churning — a value between 0.0 and 1.0.

This equation describes the geometric distribution, sometimes called the shifted geometric distribution. It is a probability distribution used to model the number of trials (in our case a trial is each successive year a customer decides to renew or churn) up to an event (in our case the churn event).

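To show how it relates to our earlier examples, for a 20% churn probability, p=0.2:

$$P(Y = 1) = (1 - 0.2)^{0} \times 0.2 = 0.2$$

$$P(Y = 2) = (1 - 0.2)^{1} \times 0.2 = 0.16$$

$$P(Y = 3) = (1 - 0.2)^{2} \times 0.2 = 0.128$$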

We can plot this probability against each year — to visualise the chance of a customer churning after 1,2,3,4… years. We can also plot what this would look like if p=0.1, i.e. with a 10% probability of churn:

Probability distributions of Customer Lifetime with 20% (left) and 10% (right) annual churn

These distributions seem intuitive — as each year passes, the chance of a customer remaining reduces. What this also tells us is that the most common lifetime for a customer is 1 year — this is the mode of this distribution.

In the left-hand plot, as expected, we can see the probabilities are 20% after one year and 16% after two years, as we calculated previously. The probability of a customer churning after exactly five years is only 8.2%, and yet (perhaps unintuitively) five years is what we estimate to be the average customer lifetime. How can this be?

The answer lies in the long tail of this distribution: unlike the normal distribution, the geometric distribution’s most likely value is not equal to its average. A lifetime of one year is most likely (20% chance), whilst the likelihood of five years is only 8.2%. In fact, some customers will only churn after 10, 20, or more years. These small chances of longer lifetimes all contribute to the average, i.e. these longer-term, loyal customers pull the average up above the shorter, more common lifetimes.
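A small simulation is one way to see the long tail at work. The sketch below assumes the simple model described so far (an independent 20% chance of churning at the end of every year); `simulate_lifetime`, the seed and the sample size are illustrative choices rather than anything prescribed.

```python
import random
from collections import Counter


def simulate_lifetime(churn_rate, rng):
    """Count the years until churn, flipping a weighted coin each year."""
    years = 1
    while rng.random() >= churn_rate:  # customer renews with probability 1 - churn_rate
        years += 1
    return years


rng = random.Random(42)  # fixed seed so the run is repeatable
lifetimes = [simulate_lifetime(0.2, rng) for _ in range(100_000)]

mean_lifetime = sum(lifetimes) / len(lifetimes)
most_common_lifetime = Counter(lifetimes).most_common(1)[0][0]
share_over_10_years = sum(1 for years in lifetimes if years > 10) / len(lifetimes)

print(f"mean lifetime:            {mean_lifetime:.2f} years")
print(f"most common lifetime:     {most_common_lifetime} year(s)")
print(f"share lasting > 10 years: {share_over_10_years:.1%}")
```

The most common simulated lifetime should be one year, while the mean should sit close to five years because roughly one customer in ten stays for more than a decade.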

We can calculate the average lifetime as the sum of all possible lifetimes of a customer multiplied by the chance of a customer having that lifetime. A simple example of such a weighted sum is the average number you get from rolling a die. The outcomes are 1, 2, 3, 4, 5 or 6 and the chance of each outcome occurring is 1/6, hence the average score from repeatedly rolling the die is:

$$\frac{1}{6}(1 + 2 + 3 + 4 + 5 + 6) = 3.5$$

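Similarly, we can proceed to calculate the average customer lifetime by calculating the weighted contribution each year makes. The first few contributions are:

$$1 \times 0.2 + 2 \times 0.16 + 3 \times 0.128 + 4 \times 0.1024 + \dots = 0.2 + 0.32 + 0.384 + 0.4096 + \dots$$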

This sum goes on forever, but we can plot how it grows with each contribution:

How the average lifetime converges to 5 years for 20% churn

As expected, this average converges towards 5 years: the contributions from the first 30 terms already add up to roughly 4.96 years. It turns out we can show that this sum is exactly 5 years, but in order to do that we’ll need to get into some maths.
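The running total behind that convergence can be sketched as follows. This is a minimal illustration assuming the geometric model with a 20% churn rate; `partial_average_lifetime` is a hypothetical helper name.

```python
def partial_average_lifetime(churn_rate, n_terms):
    """Sum the first n_terms weighted contributions, years * P(Y = years)."""
    total = 0.0
    for years in range(1, n_terms + 1):
        probability = (1 - churn_rate) ** (years - 1) * churn_rate
        total += years * probability
    return total


# Watch the partial sums approach 1 / 0.2 = 5 as more terms are included.
for n_terms in (5, 10, 30, 100):
    print(f"first {n_terms:>3} terms: {partial_average_lifetime(0.2, n_terms):.4f} years")
```

Lower churn rates need more terms before the total settles, because more of the average sits in the long tail.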

Limitations

Before getting to that, it’s worth explaining the limitations of using the geometric distribution as a model for customer lifetime.

Firstly, this model assumes that each year’s churn event is independent of previous years, i.e. the chance of churning is the same however many times the customer has already renewed. This independence assumption intuitively feels too simplistic: you may expect your first-year churn to be 20%, but if a customer renewed in year one, you’d expect that outcome to tell you something about their chance of renewing or churning in year two. Unlike coin flips, where the outcome of each flip is completely independent of the previous one, a customer’s renewal decisions from year to year are unlikely to be independent events.

The second assumption is that the churn probability is the same 20% for every customer; again, maybe this is too simplistic. Different types of customers may behave differently, e.g. perhaps smaller customers churn more easily. In fact, we may have many pieces of information about a customer (e.g. how engaged they are with your product, when they last responded to an email, how rapidly they’ve grown, which sector they operate in), all of which could be important signals when predicting churn. These signals can’t be incorporated into the simple model we’ve so far described.

Finally, the possible outcomes each year might not be limited to two (customer churns vs customer renews). Customers may become dormant or go out of business; we may want to model these outcomes differently.

A more flexible approach, e.g. using Machine Learning, can help overcome these limitations — allowing you to incorporate additional data or signals and train a more sophisticated model of churn.
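To make the point about different customer types concrete, here is a rough sketch with entirely made-up numbers: two hypothetical segments of equal size, one churning at 30% a year and one at 10%. It assumes the geometric model holds within each segment, and compares 1 / (blended churn) with the average of the per-segment lifetimes.

```python
# Hypothetical customer base: (share of customers, annual churn rate) per segment.
segments = {
    "smaller customers": (0.5, 0.30),
    "larger customers": (0.5, 0.10),
}

# Blended churn rate across the whole base, and the lifetime it implies.
blended_churn = sum(share * churn for share, churn in segments.values())
naive_lifetime = 1 / blended_churn

# Average lifetime when each segment follows its own geometric distribution.
segment_weighted_lifetime = sum(share / churn for share, churn in segments.values())

print(f"blended churn:             {blended_churn:.0%}")
print(f"1 / blended churn:         {naive_lifetime:.2f} years")
print(f"segment-weighted lifetime: {segment_weighted_lifetime:.2f} years")
```

With these illustrative figures the blended churn is 20%, giving the familiar 5 years, yet the segment-weighted average is about 6.7 years, so a single churn rate can understate lifetime when customers differ.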

Summary

When building business cases, it is perfectly acceptable to use 1 / churn as an estimate of average customer lifetime — this is a consequence of modelling lifetime using the geometric distribution. It should be noted that this approach is not particularly robust for more sophisticated use cases.

I’ve just tried to cover the basics of probability distributions, averages and infinite sums — if you’re interested in one way of proving that the expectation really is exactly 1/p then this is covered in the final section.

Average customer lifetime is exactly 1 / p

In order to calculate the Average Customer Lifetime, we need to evaluate the infinite sum of weighted contributions. This sum is called the mean or expectation, E, of the probability distribution and it is given by:

$$E[Y] = \sum_{y=1}^{\infty} y \, P(Y = y)$$

To remind ourselves, we are trying to prove that this expectation evaluates to 1 / p or 1 / churn. Let’s work through this, substituting in for P(Y=y) from earlier.

$$E[Y] = \sum_{y=1}^{\infty} y \, (1 - p)^{y - 1} \, p$$

Let’s substitute 1 - p with q for convenience, and move the constant p outside the sum:

$$E[Y] = p \sum_{y=1}^{\infty} y \, q^{y - 1}$$

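The trick here is to recognise that each term y q^(y-1) is the derivative of q^y with respect to q, so the summation can be replaced by the derivative of a power series (also note that the summation starts from y=0 now, which changes nothing because the y=0 term is a constant whose derivative is zero):

$$E[Y] = p \sum_{y=0}^{\infty} \frac{d}{dq} q^{y} = p \, \frac{d}{dq} \sum_{y=0}^{\infty} q^{y}$$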

Now, we can recognise that we are taking the derivative of a geometric series:

$$\sum_{y=0}^{\infty} q^{y} = 1 + q + q^{2} + q^{3} + \dots$$

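As we know that 0 ≤ q < 1 (because q = 1 - p and the churn probability p is greater than zero), the geometric series will converge to:

$$\sum_{y=0}^{\infty} q^{y} = \frac{1}{1 - q}$$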

We can substitute this for the summation:

$$E[Y] = p \, \frac{d}{dq} \left( \frac{1}{1 - q} \right)$$

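If we evaluate the derivative and simplify, remembering that 1 - q = p:

$$E[Y] = p \cdot \frac{1}{(1 - q)^{2}} = \frac{p}{p^{2}} = \frac{1}{p}$$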

Hence, for the geometric distribution the mean is 1 / p, which is why it is valid to use this value when estimating average customer lifetime from churn.
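As a rough numerical cross-check of the derivative step, the sketch below differentiates the closed form 1 / (1 - q) with a central finite difference and multiplies by p. The helper names and the step size `h` are illustrative assumptions; this is a sanity check under the geometric model, not a proof.

```python
def geometric_series_sum(q):
    """Closed form of 1 + q + q^2 + ... for 0 <= q < 1."""
    return 1 / (1 - q)


def expectation_via_derivative(p, h=1e-6):
    """Approximate p * d/dq [1 / (1 - q)] at q = 1 - p by a central difference."""
    q = 1 - p
    derivative = (geometric_series_sum(q + h) - geometric_series_sum(q - h)) / (2 * h)
    return p * derivative


# Compare the numerical result with 1 / p for a few churn rates.
for p in (0.05, 0.1, 0.2, 0.5):
    print(f"p = {p:.2f}: numerical {expectation_via_derivative(p):.4f}, 1 / p = {1 / p:.4f}")
```

The two columns should agree to several decimal places for any churn rate comfortably larger than the step size.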
