Looking at retention & lifetime value with data science

Yuiti Ara
Liv Up — Inside the Kitchen
Jul 30, 2019

In this post, we briefly introduce the Beta-Geometric/Negative Binomial Distribution (BG/NBD) model and the Gamma-Gamma model, two simple but very useful Bayesian models that we use here at Liv Up to evaluate the retention and lifetime value of our customers.

Why is it useful?

Customer retention and customer lifetime value are both very important metrics for any e-commerce business that wants to evaluate its past relationship with its customers. With these, we can answer important questions such as:

  • How frequently have our customers returned to our e-commerce site?
  • How much revenue have our customers generated?

However, in some cases it would be very useful to project these metrics into the future. We would then be able to answer more interesting questions, such as:

  • How much will this new feature improve/deteriorate retention in our e-commerce?
  • How much revenue will this customer acquisition campaign bring in the long run?

How does it work?

So the context in which we are working is the following:

We have customers with purchase histories and would like to know what their future timeline of purchases will look like. In this scenario we focus on the following historical information for each customer:

  • How long ago they made their last purchase (recency)
  • How many purchases they have made so far (frequency)
  • How much they have spent so far (monetary value)
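As a minimal sketch of how these three quantities could be derived from a raw purchase log, the function below follows the definitions listed above. The function name, the tuple layout, and the sample log are all hypothetical. Note that the lifetimes package uses slightly different conventions: its frequency counts repeat purchases only, and its recency is the time between a customer's first and last purchase.

```python
from datetime import date


def rfm_summary(transactions, observation_end):
    """Summarise a purchase log into recency, frequency and monetary value.

    transactions: iterable of (customer_id, purchase_date, amount) tuples.
    observation_end: the date on which the history is evaluated.
    """
    by_customer = {}
    for customer_id, purchase_date, amount in transactions:
        by_customer.setdefault(customer_id, []).append((purchase_date, amount))

    summary = {}
    for customer_id, purchases in by_customer.items():
        last_purchase = max(d for d, _ in purchases)
        summary[customer_id] = {
            # Days since the last purchase, as defined in the list above.
            "recency_days": (observation_end - last_purchase).days,
            # Total number of purchases so far.
            "frequency": len(purchases),
            # Total amount spent so far.
            "monetary_value": sum(a for _, a in purchases),
        }
    return summary


# Hypothetical purchase log, for illustration only.
log = [
    ("ana", date(2019, 1, 5), 120.0),
    ("ana", date(2019, 2, 10), 80.0),
    ("ana", date(2019, 6, 20), 100.0),
    ("bob", date(2019, 3, 1), 60.0),
]
summary = rfm_summary(log, observation_end=date(2019, 6, 30))
print(summary["ana"])
```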

Now we break the modeling problem into three questions:

  • How likely is it that a customer will buy again?
  • How many purchases would we expect a customer to make in the future?
  • How much can we expect them to spend in the future?

For the first question, the model actually answers the opposite: how likely is it that a customer will never buy again? It assumes that after every purchase a customer has a probability p of not returning, so the number of purchases a customer makes before dropping out follows a Geometric distribution. To model the variation in p between customers, the model uses a Beta distribution. Together these give a Beta-Geometric distribution, which is the first half of the BG/NBD model.

Geometric & Beta distributions
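To make the dropout mechanism concrete, here is a small simulation sketch using only Python's standard library. Each simulated customer gets their own dropout probability p drawn from a Beta(a, b) distribution, and then keeps buying until a dropout occurs. The function name, the parameter values, and the safety cap on the number of purchases are assumptions made for illustration, not values fitted to any real data.

```python
import random


def simulate_purchases_before_dropout(a, b, n_customers, seed=0, max_purchases=10_000):
    """Simulate how many purchases each customer makes before dropping out.

    Each customer draws p ~ Beta(a, b); after every purchase the customer
    drops out with probability p, so the count is geometric given p.
    """
    rng = random.Random(seed)
    counts = []
    for _ in range(n_customers):
        p = rng.betavariate(a, b)
        purchases = 1  # every customer has made at least one purchase
        # The customer survives a purchase with probability 1 - p.
        while rng.random() >= p and purchases < max_purchases:
            purchases += 1
        counts.append(purchases)
    return counts


# Hypothetical Beta parameters, for illustration only.
counts = simulate_purchases_before_dropout(a=3.0, b=5.0, n_customers=20_000)
print("average purchases before dropout:", sum(counts) / len(counts))
print("share of one-time buyers:", counts.count(1) / len(counts))
```

Changing a and b shifts how much customers differ from each other: a Beta distribution concentrated near zero produces mostly loyal customers, while one concentrated near one produces mostly one-time buyers.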

Now for the second question: what can we say about the number of purchases a customer will make in a given period? The model assumes that each customer has their own natural purchase rate and uses a Poisson distribution to model the resulting number of purchases. For a given time period, some customers will have a higher purchase rate and some a lower one. To capture this variation in purchase rates, a Gamma distribution is used. The combination of these distributions, a Gamma mixture of Poisson distributions, is also known as the “Negative Binomial distribution”, which makes up the second half of the BG/NBD model.

Poisson & Gamma distributions
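The claim that a Gamma mixture of Poisson distributions gives a Negative Binomial distribution can be checked numerically. The sketch below, which ignores dropout, draws a purchase rate per customer from a Gamma distribution with shape r and rate alpha, counts Poisson purchases over a period of length t, and compares the simulated frequencies with the Negative Binomial probabilities. The function names and parameter values are hypothetical.

```python
import math
import random


def nbd_pmf(x, r, alpha, t):
    """Probability of exactly x purchases in a period of length t
    under a Gamma(shape=r, rate=alpha) mixture of Poisson distributions."""
    log_coeff = math.lgamma(r + x) - math.lgamma(r) - math.lgamma(x + 1)
    log_prob = (log_coeff
                + r * math.log(alpha / (alpha + t))
                + x * math.log(t / (alpha + t)))
    return math.exp(log_prob)


def simulate_purchase_counts(r, alpha, t, n_customers, seed=0):
    """Draw a purchase rate per customer, then count Poisson purchases in [0, t]."""
    rng = random.Random(seed)
    counts = []
    for _ in range(n_customers):
        lam = rng.gammavariate(r, 1.0 / alpha)  # shape r, scale 1 / alpha
        # Poisson arrivals are generated through exponential waiting times.
        n, clock = 0, rng.expovariate(lam)
        while clock <= t:
            n += 1
            clock += rng.expovariate(lam)
        counts.append(n)
    return counts


# Hypothetical parameters, for illustration only.
counts = simulate_purchase_counts(r=2.0, alpha=4.0, t=10.0, n_customers=20_000)
for x in range(4):
    simulated = counts.count(x) / len(counts)
    print(x, round(simulated, 3), round(nbd_pmf(x, 2.0, 4.0, 10.0), 3))
```

The two printed columns should be close to each other for every x, which is the mixture property at work.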

Finally, for the third question: how much can we expect a customer to spend? Here a Gamma distribution models the value of each customer's purchases around that customer's average. To capture the variation in spending between customers, we use another Gamma distribution. The result is our Gamma-Gamma model. One caveat is that this model assumes independence between the number of purchases and their values, which may or may not hold depending on the context.
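One way to build intuition for the Gamma-Gamma model is through its estimate of a customer's expected average purchase value, which works out to a weighted average of the population mean and the customer's own observed mean. The sketch below implements that formula; p, q and gamma follow the notation used by the model's authors, and the parameter values are hypothetical rather than fitted to any data.

```python
def expected_average_transaction_value(p, q, gamma, observed_mean, n_purchases):
    """Expected average purchase value under the Gamma-Gamma model.

    Equivalent to p * (gamma + observed_mean * n_purchases)
    / (p * n_purchases + q - 1), which requires q > 1.
    """
    population_mean = p * gamma / (q - 1)
    # The more purchases we have seen, the less weight the population gets.
    weight_on_population = (q - 1) / (p * n_purchases + q - 1)
    return (weight_on_population * population_mean
            + (1 - weight_on_population) * observed_mean)


# Hypothetical parameters, for illustration only.
p, q, gamma = 6.25, 3.74, 15.44
for n in (1, 5, 50):
    estimate = expected_average_transaction_value(p, q, gamma, observed_mean=80.0, n_purchases=n)
    print(n, round(estimate, 2))
```

With only one observed purchase the estimate is pulled toward the population mean, while after many purchases it moves toward the customer's own average.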

How can I use it?

For practical use here at Liv Up we rely on the lifetimes package, an awesome Python implementation of both models by Cameron Davidson-Pilon. Below are a few interesting ways to use the results from the models.

Probability to Buy Again

The blue line is the probability that an individual customer is “alive” (will buy again) on each day, while the dashed red lines are actual purchases. We can see how the model adapts to each customer's history: for a slow buyer the probability takes longer to drop, while for a fast buyer it drops quickly after a missed purchase.
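The behavior described above can be reproduced with the closed-form BG/NBD expression for the probability of being alive. The sketch below is a standalone illustration rather than the lifetimes implementation. It uses the model's own conventions, which differ from the everyday definitions: frequency is the number of repeat purchases, recency is the time of the last purchase measured from the first one, and T is the time since the first purchase. The parameter values r, alpha, a and b are hypothetical.

```python
def probability_alive(frequency, recency, T, r, alpha, a, b):
    """P(alive) for one customer under the BG/NBD model.

    frequency: number of repeat purchases.
    recency:   time of the last purchase, measured from the first purchase.
    T:         time elapsed since the first purchase.
    """
    if frequency == 0:
        # Without a repeat purchase the model has no dropout opportunity yet.
        return 1.0
    ratio = a / (b + frequency - 1)
    growth = ((alpha + T) / (alpha + recency)) ** (r + frequency)
    return 1.0 / (1.0 + ratio * growth)


# Hypothetical model parameters, for illustration only.
params = dict(r=0.243, alpha=4.414, a=0.793, b=2.426)

# Both customers last bought at time 20; we watch the probability as time passes.
for T in (20, 25, 30, 40):
    slow = probability_alive(frequency=2, recency=20, T=T, **params)
    fast = probability_alive(frequency=10, recency=20, T=T, **params)
    print(T, round(slow, 3), round(fast, 3))
```

The fast buyer starts with a higher probability but loses it more quickly, because a long silence is much more surprising for someone who used to buy often.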

Customer segmentation by lifetime value

Here we sort all customers by their predicted lifetime value (from lowest to highest) and then split them into percentile groups. Now we can check how much each group contributes to our overall revenue. This is useful information when we want to understand more about our different groups of customers.
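The segmentation described above takes only a few lines once each customer has a predicted lifetime value. The sketch below assumes the predictions are already available as a plain list of numbers; the function name and the sample values are hypothetical.

```python
def revenue_share_by_group(predicted_ltv, n_groups=10):
    """Sort customers by predicted lifetime value (ascending), split them into
    n_groups near-equal chunks, and return each chunk's share of the total."""
    values = sorted(predicted_ltv)
    total = sum(values)
    n = len(values)
    shares = []
    for g in range(n_groups):
        start = g * n // n_groups
        end = (g + 1) * n // n_groups
        shares.append(sum(values[start:end]) / total)
    return shares


# Hypothetical predicted lifetime values, for illustration only.
predicted = [12.0, 30.0, 45.0, 50.0, 80.0, 95.0, 150.0, 210.0, 400.0, 900.0]
for group, share in enumerate(revenue_share_by_group(predicted, n_groups=5), start=1):
    print(f"group {group}: {share:.1%} of predicted value")
```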

How we use it at Liv Up

Today we use the predicted probability to rebuy and the predicted lifetime value in several areas at Liv Up.

Measuring product performance on retaining and resurrecting customers

Our business growth is driven by retention. We need to know whether our new products and features are helping us retain customers. However, it is hard to define when a customer is active or inactive, and the model provides a data-driven approach to this problem. Using the probability to rebuy, we classify each repurchase as coming from a customer who was either retained or resurrected.
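A minimal sketch of such a classification could look like the following. It assumes that the probability to rebuy is evaluated just before each repurchase, and that a fixed cutoff separates active from inactive customers. Both the cutoff of 0.5 and the function name are hypothetical choices made for illustration, not the exact rule used in production.

```python
def classify_repurchase(prob_alive_before, threshold=0.5):
    """Label a repurchase from the model's P(alive) just before it happened.

    A customer the model still considered active was retained; one it had
    already considered inactive was resurrected. The threshold is an assumption.
    """
    return "retained" if prob_alive_before >= threshold else "resurrected"


# Hypothetical probabilities observed right before four repurchases.
for prob in (0.92, 0.61, 0.35, 0.08):
    print(prob, classify_repurchase(prob))
```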

Creating a less lagging indicator for retention

Retention takes time to measure. We are a three-year-old company and we can’t wait 12 months to measure retention. With our model, we get an indicator that needs a drastically shorter observation period.

Segmenting customers for campaigns

Based on the probability to rebuy, we segment our customers to deliver a personalized campaign based on their engagement. Combined with the predicted lifetime value, we can send a more efficient message, which helps in optimizing ROI on retention campaigns.

Summary

In this post, we gave a very brief overview of the BG/NBD and Gamma-Gamma models. We have mostly focused on the overall intuition, but if you are still curious you can find the complete details in the original authors' papers. The official documentation for the lifetimes package is also a great resource, and with it you can quickly run the models on your own data. The package also offers useful methods for plotting and evaluating model performance.

The BG/NBD and Gamma-Gamma models, along with the lifetimes package, have been very useful here at Liv Up, and we think they can be just as useful for any business that wants to understand its relationship with its customers.
