Customer Life Time Value Prediction by Using BG-NBD & Gamma-Gamma Models and Applied Example in Python

Published in Analytics Vidhya · 8 min read · Sep 13, 2021

Hi everyone. In this story, I’ll explain what the BG-NBD model and the Gamma-Gamma submodel are, and how we can calculate Customer Lifetime Value (CLV) with these statistical methods. At the end of the story, we’ll work through an applied example in Python. I’ll explain what the code does here and also share it in a Kaggle notebook. You can access the notebook from here.

With these techniques we can predict the following:

Which customers will make a purchase in the next period?
Which top N customers are expected to make the most purchases in the next period?
How much value will each customer create for our business?

The BG-NBD model describes the distribution of each customer’s purchasing behaviour and predicts the expected number of transactions for each customer.

The Gamma-Gamma submodel describes the distribution of the average profit per transaction and predicts the expected average profit for each customer.

BG-NBD Model

BG-NBD stands for Beta Geometric / Negative Binomial Distribution. It belongs to the family of models often called “Buy Till You Die”. It gives us the conditional expected number of transactions in the next period. This model can answer questions such as:

How many transactions will there be next week?
How many transactions will there be in the next 3 months?
Which customers will make the most purchases in the next 2 weeks?

To predict the expected number of transactions, the model describes two processes probabilistically:

Transaction Process (Buy)
Dropout Process (Till You Die)

Transaction Process (Buy)

This process describes purchasing.
As long as a customer is alive, the number of transactions they make in a given period follows a Poisson distribution governed by their transaction rate.
In other words, while alive, each customer keeps purchasing at around their own transaction rate.
Transaction rates differ from customer to customer and follow a gamma distribution (r, α) across the population.
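
To get a feeling for the buy process, here is a minimal simulation sketch with NumPy. The values of r and α, the number of customers and the 10-week window are assumptions I picked for illustration; they are not fitted on any dataset.

```python
import numpy as np

rng = np.random.default_rng(42)

# Hypothetical population parameters of the gamma distribution (assumed values)
r, alpha = 2.0, 4.0
n_customers = 100_000
weeks = 10

# Each customer gets their own weekly transaction rate: lambda ~ Gamma(shape=r, rate=alpha)
transaction_rates = rng.gamma(shape=r, scale=1 / alpha, size=n_customers)

# While alive, the number of purchases in `weeks` weeks is Poisson(lambda * weeks)
purchases = rng.poisson(transaction_rates * weeks)

print("average weekly rate:", transaction_rates.mean())  # theoretical mean is r / alpha
print("average purchases in 10 weeks:", purchases.mean())
```

Mixing Poisson counts over gamma-distributed rates is what produces the negative binomial distribution, the “NBD” part of the model’s name.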

Dropout Process (Till You Die)

This process describes customers who stop purchasing.
Each customer has their own dropout probability p.
After any transaction, the customer drops out with probability p.
Dropout probabilities differ from customer to customer and follow a beta distribution (a, b) across the population.
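
The dropout process can be simulated in the same spirit. This is only a sketch: the values of a and b and the number of customers are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical population parameters of the beta distribution (assumed values)
a, b = 2.0, 6.0
n_customers = 100_000

# Each customer gets their own dropout probability: p ~ Beta(a, b)
dropout_probs = rng.beta(a, b, size=n_customers)

# A customer drops out right after a transaction with probability p, so the
# transaction after which they drop out follows a geometric distribution
transactions_until_dropout = rng.geometric(dropout_probs)

print("average dropout probability:", dropout_probs.mean())  # theoretical mean is a / (a + b)
print("median transactions until dropout:", np.median(transactions_until_dropout))
```

The beta-distributed probabilities combined with the geometric dropout give the “BG” part of the model’s name.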

BG-NBD Formula

We see the formula of the BG-NBD model below.
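
Written out in LaTeX, the conditional expectation of the BG-NBD model (in the form given by Fader, Hardie and Lee) reads as follows. Here 2F1 is the Gaussian hypergeometric function, δ equals 1 when x > 0 and 0 otherwise, and t is the length of the prediction period.

```latex
E\bigl(Y(t) \mid X = x,\, t_x,\, T,\, r,\, \alpha,\, a,\, b\bigr)
= \frac{\dfrac{a + b + x - 1}{a - 1}
        \left[\, 1 - \left(\dfrac{\alpha + T}{\alpha + T + t}\right)^{r + x}
        {}_2F_1\!\left(r + x,\; b + x;\; a + b + x - 1;\; \dfrac{t}{\alpha + T + t}\right) \right]}
       {1 + \delta_{x > 0}\, \dfrac{a}{b + x - 1}
        \left(\dfrac{\alpha + T}{\alpha + t_x}\right)^{r + x}}
```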

I’ll explain the terms on the left-hand side of the equation.

E refers to the expected value.
| means the expectation is conditional (the conditional expected number of transactions).
x refers to the frequency of each customer, i.e. the number of repeat purchases (so x > 0 only for customers who purchased at least 2 times).
tx refers to the recency of each customer: the time between their first and their last purchase. In this case, we measure it in weeks.
T refers to the customer’s age: the time between their first purchase and today’s date (weeks).
r, α come from the gamma distribution (buy process) and describe the transaction rate across the population.
a, b come from the beta distribution (till you die process) and describe the dropout rate across the population.
Y(t) refers to the number of transactions a customer makes in the next t periods; the formula gives its expected value.
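
To make these terms concrete, here is a minimal sketch that evaluates the standard BG-NBD conditional expectation by hand, using only plain Python. The helper names, the series-based hypergeometric function and the parameter values are my own assumptions for illustration; in practice the lifetimes package does this calculation for us. Note that the closed form divides by a - 1, so this sketch does not cover a = 1.

```python
def hyp2f1_series(a, b, c, z, tol=1e-12, max_terms=10_000):
    """Gaussian hypergeometric function 2F1 via its power series (valid for |z| < 1)."""
    term, total = 1.0, 1.0
    for n in range(max_terms):
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
        total += term
        if abs(term) < tol * abs(total):
            break
    return total


def expected_purchases(t, x, t_x, T, r, alpha, a, b):
    """E[Y(t) | x, t_x, T] under BG-NBD; t, t_x and T share one time unit (weeks here)."""
    z = t / (alpha + T + t)  # always below 1, so the series converges
    hyp = hyp2f1_series(r + x, b + x, a + b + x - 1, z)
    numerator = (a + b + x - 1) / (a - 1) * (
        1 - ((alpha + T) / (alpha + T + t)) ** (r + x) * hyp
    )
    # Odds that the customer has already dropped out (zero if they never repurchased)
    inactive_odds = (a / (b + x - 1)) * ((alpha + T) / (alpha + t_x)) ** (r + x) if x > 0 else 0.0
    return numerator / (1 + inactive_odds)


# Illustrative parameter values (assumptions, not fitted on this dataset)
params = dict(r=0.243, alpha=4.414, a=0.793, b=2.426)

# A customer with x=2 repeat purchases, recency 30.43 weeks and age 38.86 weeks
next_13_weeks = expected_purchases(13, x=2, t_x=30.43, T=38.86, **params)
next_39_weeks = expected_purchases(39, x=2, t_x=30.43, T=38.86, **params)
print(next_13_weeks, next_39_weeks)
```

The longer the prediction window t, the more purchases we expect, but the growth slows down because the customer may drop out along the way.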

Gamma-Gamma Submodel

We use this model to predict how much profit per transaction we can expect from each customer. It first models the distribution of average profit across the population and then gives us the expected average profit for each customer.

The monetary value of a customer’s transactions varies randomly around that customer’s average transaction value.
The average transaction value differs between customers, but it does not change over time for any given customer.
The average transaction value follows a gamma distribution across all customers.

Gamma-Gamma Submodel Formula

We see the formula of the Gamma-Gamma submodel below.
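
Written out in LaTeX, the conditional expectation of the Gamma-Gamma submodel (in its standard form) is shown below. The second line rewrites it as a weighted average of the population mean and the customer’s own observed average.

```latex
E(M \mid p,\, q,\, \gamma,\, m_x,\, x)
= \frac{(\gamma + m_x x)\, p}{p x + q - 1}
= \left(\frac{q - 1}{p x + q - 1}\right) \frac{\gamma p}{q - 1}
+ \left(\frac{p x}{p x + q - 1}\right) m_x
```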

Again, I’ll explain the terms on the left-hand side of the equation.

E refers to the expected value.
x refers to the frequency of each customer.
mx refers to the monetary value of each customer: their observed average transaction value.
M refers to the customer’s underlying average transaction value; its expected value is the expected average profit.
p, q, γ are the parameters of the gamma distributions.
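
Here is a small worked sketch of the standard Gamma-Gamma expectation in plain Python. The function name and the parameter values are assumptions for illustration only. It shows that the prediction is a weighted average: the more transactions x we have observed, the more weight the customer’s own average mx receives compared with the population mean.

```python
def gg_expected_average_profit(x, m_x, p, q, gamma):
    """E[M | x, m_x] = (gamma + m_x * x) * p / (p * x + q - 1)."""
    return (gamma + m_x * x) * p / (p * x + q - 1)


# Illustrative parameter values (assumptions, not fitted on this dataset)
p, q, gamma = 6.25, 3.74, 15.44

population_mean = gamma * p / (q - 1)  # expected average profit with no customer history

# The same customer average (50.0) observed over 2 and over 20 transactions
few_transactions = gg_expected_average_profit(x=2, m_x=50.0, p=p, q=q, gamma=gamma)
many_transactions = gg_expected_average_profit(x=20, m_x=50.0, p=p, q=q, gamma=gamma)

# Equivalent weighted-average form for x = 2
weight = (q - 1) / (p * 2 + q - 1)
weighted_form = weight * population_mean + (1 - weight) * 50.0

print(population_mean, few_transactions, many_transactions)
```

With only 2 transactions the prediction is pulled towards the population mean; with 20 transactions it stays close to the observed 50.0.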

Now we can move on to the applied example.

An Applied Example in Python

In this section, we will use this dataset. You can also get it from here. As I said before, I’ll explain what the code does here and share the full code on Kaggle.

Let’s Start

I’m going to import the packages below. We’ll use datetime to create the today_date variable. The dataset contains data from 2009 to 2011; I’ll use the 2010–2011 sheet.

import pandas as pd
import datetime as dt
from lifetimes import BetaGeoFitter
from lifetimes import GammaGammaFitter

We’ll use the lifetimes package for the statistical models. You can install it in your workspace as follows:

!pip install lifetimes
!pip install openpyxl

I’ve also installed the openpyxl package for reading Excel files.

Loading Dataset

I’ve already uploaded the dataset to my Kaggle workspace.

raw_data = pd.read_excel('../input/uci-online-retail-ii-data-set/online_retail_II.xlsx',
                         sheet_name='Year 2010-2011')
df = raw_data.copy()
df.head()

You can find the meaning of each variable below.

Invoice: Invoice number. If this number starts with ‘C’, it means this transaction is cancelled.
StockCode: Product code
Description: Product name
Quantity: Number of units of the product in the transaction
InvoiceDate: Transaction date
Price: Unit price of the product
Customer ID: Unique customer number
Country: Customer’s country name

Data Preparing

First, I’ll drop the rows that contain missing values. Then I’ll keep only the rows whose Invoice value doesn’t start with ‘C’, because an invoice number starting with ‘C’ marks a cancelled transaction. Finally, I’ll keep only positive quantities and calculate TotalPrice for each row.

# drop rows with missing values (e.g. a missing Customer ID)
df.dropna(inplace=True)
# drop cancelled transactions: their invoice number starts with 'C'
df = df[~df["Invoice"].astype(str).str.startswith("C")]
# keep only rows with a positive quantity
df = df[df["Quantity"] > 0].copy()
# revenue of each row
df['TotalPrice'] = df['Price'] * df['Quantity']
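
If you want to see the effect of these filters on something small, here is a minimal sketch on a made-up table. All the values in it are invented for illustration; only the column names follow the dataset.

```python
import pandas as pd

# Made-up rows: two normal invoices and two cancelled ones (prefixed with 'C')
toy = pd.DataFrame({
    "Invoice": [536365, "C536379", 536370, "C536383"],
    "Quantity": [6, -1, 3, -2],
    "Price": [2.55, 27.50, 4.00, 1.25],
})

# Invoice mixes numbers and strings, so convert to text before checking the prefix
is_cancelled = toy["Invoice"].astype(str).str.startswith("C")

kept = toy[~is_cancelled & (toy["Quantity"] > 0)].copy()
kept["TotalPrice"] = kept["Price"] * kept["Quantity"]
print(kept)
```

Only the two non-cancelled rows survive, each with its own TotalPrice.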

We could explore the dataset further and draw some insights from it, but I won’t do that here. I want to focus on the main topic.

Preparing Dataset for Calculating CLV

I’m going to create a dummy date that will act as today’s date in the calculations.

today_date = dt.datetime(2011, 12, 11)

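Now I need to calculate the columns that the statistical models use.

cltv = df.groupby('Customer ID').agg({
    'InvoiceDate': [
        lambda x: (x.max() - x.min()).days,  # recency: days between first and last purchase
        lambda x: (today_date - x.min()).days  # T: days between first purchase and today_date
    ],
    'Invoice': lambda x: x.nunique(),  # frequency: distinct invoices (lifetimes itself counts repeat purchases, one less)
    'TotalPrice': lambda x: x.sum()  # monetary: total spend
})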

I need to fix the column names. Pandas created multi-level columns because of the two lambda functions on the InvoiceDate column. I’ll also rename the columns to match the names the statistical models use.

cltv.columns = cltv.columns.droplevel(0)

cltv.columns = ['recency', 'T', 'frequency', 'monetary']
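
If you’d like to check these four columns on data small enough to verify by hand, here is a minimal sketch on a made-up table. It uses pandas named aggregation, which gives flat column names directly, as an alternative to renaming afterwards. The toy rows, the analysis_date value and the two helper functions are assumptions for illustration.

```python
import datetime as dt
import pandas as pd

# Made-up transactions for two customers
toy = pd.DataFrame({
    "Customer ID": [1, 1, 1, 2, 2],
    "Invoice": ["A1", "A1", "A2", "B1", "B2"],
    "InvoiceDate": pd.to_datetime(
        ["2011-01-01", "2011-01-01", "2011-01-15", "2011-03-01", "2011-03-29"]
    ),
    "TotalPrice": [10.0, 5.0, 20.0, 8.0, 12.0],
})
analysis_date = dt.datetime(2011, 4, 12)  # plays the role of today_date


def days_between_first_and_last(dates):
    return (dates.max() - dates.min()).days


def days_since_first(dates):
    return (analysis_date - dates.min()).days


summary = toy.groupby("Customer ID").agg(
    recency=("InvoiceDate", days_between_first_and_last),
    T=("InvoiceDate", days_since_first),
    frequency=("Invoice", "nunique"),
    monetary=("TotalPrice", "sum"),
)
print(summary)
```

Customer 1 bought on two different days that are 14 days apart, so their recency is 14 and their frequency is 2, even though they have three rows.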

Next, I’ll calculate the average earning per transaction. In this case, we’ll treat the monetary value as the average earning per transaction, because that is how the Gamma-Gamma model uses it.

cltv = cltv[cltv['monetary'] > 0]
cltv['monetary'] = cltv['monetary'] / cltv['frequency']

I also need to convert days to weeks for the BG-NBD model.

# transforming days to weeks
cltv['recency'] = cltv['recency'] / 7
cltv['T'] = cltv['T'] / 7

And I’m keeping only the customers whose frequency is greater than 1.

cltv = cltv[(cltv['frequency'] > 1)]

Creating BG-NBD Model

I’ll use BetaGeoFitter from the lifetimes package. I’ll create an instance and fit it with the frequency, recency and T values.

bgf = BetaGeoFitter(penalizer_coef=0.001)
bgf.fit(cltv['frequency'], cltv['recency'], cltv['T'])

We’ve now fitted a BG-NBD model. Next we can predict some expected values. Below are a few example questions.

Top 10 customers expected to make the most purchases in a week

bgf.conditional_expected_number_of_purchases_up_to_time(1, # week
                                                        cltv['frequency'],
                                                        cltv['recency'],
                                                        cltv['T']).sort_values(ascending=False).head(10)

Top 10 customers expected to make the most purchases in a month

bgf.conditional_expected_number_of_purchases_up_to_time(4,  # 4 weeks = 1 month
                                                        cltv['frequency'],
                                                        cltv['recency'],
                                                        cltv['T']).sort_values(ascending=False).head(10)

There is an important point we shouldn’t miss: this model works in weeks, because we expressed recency and T in weeks. Therefore we need to pass the time argument (the first argument of the function) in weeks too.
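
In this story I treat a month as 4 weeks, which is a rough approximation. If you prefer a closer conversion, a small helper like the one below could be used. The helper name and the 52-week year are my own assumptions, not something the lifetimes package requires.

```python
WEEKS_PER_MONTH = 52 / 12  # about 4.33 weeks, assuming a 52-week year


def months_to_weeks(months, weeks_per_month=WEEKS_PER_MONTH):
    """Convert a horizon in months to weeks for a model fitted on weekly data."""
    return months * weeks_per_month


one_month = months_to_weeks(1)
six_months = months_to_weeks(6)
print(one_month, six_months)
```

With this conversion, 6 months is about 26 weeks instead of the 24 weeks we get from 4 * 6, so the predicted purchase counts would be slightly higher.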

Top 10 customers expected to make the most purchases in the next 6 months

bgf.conditional_expected_number_of_purchases_up_to_time(4 * 6,  # 4 weeks * 6 = 6 months
                                                        cltv['frequency'],
                                                        cltv['recency'],
                                                        cltv['T']).sort_values(ascending=False).head(10)

The total number of transactions expected in the next 6 months

bgf.conditional_expected_number_of_purchases_up_to_time(4 * 6,
                                                        cltv['frequency'],
                                                        cltv['recency'],
                                                        cltv['T']).sum()

Creating Gamma-Gamma Submodel

I’ll use GammaGammaFitter from the lifetimes package. I’ll create an instance and fit it with the frequency and monetary values.

ggf = GammaGammaFitter(penalizer_coef=0.01)

ggf.fit(cltv['frequency'], cltv['monetary'])

We can also answer questions with this model, as shown below.

The top 10 customers expected to be most valuable

ggf.conditional_expected_average_profit(cltv['frequency'],
                               cltv['monetary']).sort_values(ascending=False).head(10)

Predicting CLV by Using BG-NBD and Gamma-Gamma Models

# expected customer lifetime value over the next 3 months
cltv['cltv_pred_3_months'] = ggf.customer_lifetime_value(bgf,
                                   cltv['frequency'],
                                   cltv['recency'],
                                   cltv['T'],
                                   cltv['monetary'],
                                   time=3,  # 3 months
                                   freq="W",  # time unit of recency and T; 'W' means weeks
                                   discount_rate=0.01)
cltv
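
To show the idea behind combining the two models, here is a simplified sketch of a discounted cash-flow calculation in plain Python. It is my own illustration and not necessarily identical to what lifetimes does internally. The function names, the linear toy purchase curve and the weeks-per-month factor are all assumptions.

```python
def discounted_clv(expected_purchases_up_to, avg_profit, months,
                   monthly_discount_rate, weeks_per_month=52 / 12):
    """Sum, month by month, the expected purchases times the average profit, discounted."""
    clv = 0.0
    for month in range(1, months + 1):
        end_week = month * weeks_per_month
        start_week = end_week - weeks_per_month
        # Purchases expected inside this month only
        purchases_in_month = expected_purchases_up_to(end_week) - expected_purchases_up_to(start_week)
        clv += avg_profit * purchases_in_month / (1 + monthly_discount_rate) ** month
    return clv


def toy_expected_purchases(weeks):
    """Hypothetical stand-in for the BG-NBD prediction: 0.5 purchases per week."""
    return 0.5 * weeks


clv_3_months = discounted_clv(toy_expected_purchases, avg_profit=20.0,
                              months=3, monthly_discount_rate=0.01)
undiscounted = 20.0 * toy_expected_purchases(3 * 52 / 12)
print(clv_3_months, undiscounted)
```

The BG-NBD model supplies the expected number of purchases, the Gamma-Gamma model supplies the average profit, and the discount rate makes money earned later count a little less.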

Also, if we want, we can segment our customers by their CLV values. This can help us decide which actions to take for which customers. You can see an example below.

cltv['segment'] = pd.qcut(cltv['cltv_pred_3_months'], 4, labels=['D', 'C', 'B', 'A'])

cltv
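
Here is a minimal sketch of how the quartile segmentation behaves, on made-up CLV values. The numbers are invented for illustration.

```python
import pandas as pd

# Made-up CLV predictions for eight customers
toy_clv = pd.Series([1, 2, 3, 4, 5, 6, 7, 8], name="cltv_pred_3_months")

# Four equally sized groups: 'D' holds the lowest values, 'A' the highest
toy_segments = pd.qcut(toy_clv, 4, labels=["D", "C", "B", "A"])
print(pd.DataFrame({"clv": toy_clv, "segment": toy_segments}))
```

Because qcut splits by quantiles, every segment receives roughly the same number of customers, no matter how skewed the CLV values are.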

Final Thoughts

Hopefully this was helpful and you enjoyed it. I didn’t dive deep into the more complex statistics. To be honest, they can be too sophisticated for people who don’t have a maths background. However, the models are much easier to use once you understand their inputs, and that was the important point here: we learnt the inputs of each model. Now we can tell the BG-NBD and Gamma-Gamma models apart and use them for our business.

Kind regards.