Customer Lifetime Value (CLV) Analysis using Python.

Aleem Adewoyin
4 min read · Jul 1, 2023

Customer Lifetime Value (CLV) analysis estimates the total revenue a business can expect from a single customer account over the whole relationship. It combines a customer’s revenue value with the company’s predicted customer lifespan. Businesses use this metric to identify the customer segments that are most valuable to the company.

Understanding and working to increase CLV can result in increased revenue for the company. It can help businesses make decisions about how much money to invest in acquiring new customers and how much they’re willing to spend to retain existing ones. It also affects many other business decisions, like marketing, customer support, product development, and pricing.

Average Purchase Value:
First, calculate this by dividing your company’s total revenue in a time period (usually one year) by the number of purchases over the course of that same time period.

Average Purchase Frequency:
This is calculated by dividing the number of purchases over the course of the time period by the number of unique customers who made purchases during that time period.

Customer Value:
This is calculated by multiplying the Average Purchase Value by the Average Purchase Frequency.

Average Customer Lifespan:
This is calculated by averaging out the number of years a customer continues purchasing from your company.

Customer Lifetime Value:
Multiply the Customer Value by the Average Customer Lifespan. This will give you the CLV.
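
The five steps above can be sketched in plain Python. The figures used here are invented for illustration and do not come from the dataset, and the function name `customer_lifetime_value` is hypothetical.

```python
def customer_lifetime_value(total_revenue, num_purchases, num_customers, avg_lifespan_years):
    """Return (avg purchase value, avg purchase frequency, customer value, CLV)."""
    # Average Purchase Value: revenue per purchase in the period
    avg_purchase_value = total_revenue / num_purchases
    # Average Purchase Frequency: purchases per unique customer in the period
    avg_purchase_frequency = num_purchases / num_customers
    # Customer Value: value of one customer for the period
    customer_value = avg_purchase_value * avg_purchase_frequency
    # Customer Lifetime Value: customer value over the average lifespan
    clv = customer_value * avg_lifespan_years
    return avg_purchase_value, avg_purchase_frequency, customer_value, clv


# Hypothetical one-year figures, for illustration only
apv, apf, cv, clv = customer_lifetime_value(
    total_revenue=100_000,
    num_purchases=2_000,
    num_customers=500,
    avg_lifespan_years=3,
)
print(apv, apf, cv, clv)  # 50.0 4.0 200.0 600.0
```

Note that the number of purchases cancels out when the first two quantities are multiplied, so Customer Value is simply revenue per customer for the period.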

1. Download dataset
https://statso.io/wp-content/uploads/2023/04/acquisition_data.zip
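
If you prefer to fetch the archive from Python rather than a browser, a minimal sketch using only the standard library could look like this. The helper name `download_and_extract` and the local archive name are assumptions; the function returns the names of the extracted files so you can check which CSV file name to pass to `pd.read_csv`.

```python
import urllib.request
import zipfile
from pathlib import Path

DATASET_URL = "https://statso.io/wp-content/uploads/2023/04/acquisition_data.zip"


def download_and_extract(url, dest_dir="."):
    """Download a zip archive into dest_dir, extract it, and return the extracted file names."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # Local file name for the downloaded archive (an arbitrary choice)
    archive_path = dest / "acquisition_data.zip"
    urllib.request.urlretrieve(url, str(archive_path))
    with zipfile.ZipFile(archive_path) as archive:
        archive.extractall(dest)
        return archive.namelist()


# Example call (needs network access):
# print(download_and_extract(DATASET_URL))
```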


2. Read the data into Python and set up the colour palette for seaborn

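import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Jupyter magic: render plots inline (remove this line outside a notebook)
%matplotlib inline

# Setting seaborn colour palette (set_palette returns None, so there is nothing to assign)
sns.set_palette('Set2')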

data = pd.read_csv('customer-acquisition-data.csv')
data.head()

3. Check the structure of the data (columns, data types, non-null counts)

data.info()

4. Statistical descriptions of the data

data.describe()

5. Examine the Distribution of Numeric Variables

fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.set_title('Distribution of Cost')
sns.histplot(data, x='cost', kde=True, ax=ax1)
ax2.set_title('Distribution of Revenue')
sns.histplot(data, x='revenue', kde=True, ax=ax2);


6. Examine the Distribution of Non-Numeric (Categorical) Variables

sns.countplot(data, x='channel');


7. Aggregating Channels

# Grouping the data based on 'channel'
channel_groups = data.groupby('channel')
# Aggregating 'cost' for each channel group
cost_by_channel = channel_groups['cost'].mean().reset_index()

sns.barplot(cost_by_channel, x='channel', y='cost');

8. Aggregating Conversion Rate

# Aggregating 'conversion_rate' for each channel group
conversion_rate_by_channel = channel_groups['conversion_rate'].mean().reset_index()

sns.barplot(conversion_rate_by_channel, x='channel', y='conversion_rate');


9. Aggregating Revenue

# Aggregating 'revenue' for each channel group
revenue_by_channel = channel_groups['revenue'].sum().reset_index()

# Pie chart of each channel's share of total revenue
fig, ax = plt.subplots(figsize=(8, 6))
ax.pie(revenue_by_channel['revenue'], labels=revenue_by_channel['channel'], autopct='%1.1f%%');


10. ROI

data['roi'] = (data['revenue'] - data['cost']) / data['cost']
data.head()
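
ROI here is the net return per unit of cost: a value of 2 means each unit spent came back plus two more, and a negative value means a loss. A minimal sketch on a hypothetical three-row frame (all values invented for illustration) shows the arithmetic. It also guards against a zero cost, which would otherwise produce an infinite value; whether the real dataset contains zero-cost rows is an assumption worth checking, not something the data description states.

```python
import numpy as np
import pandas as pd

# Toy data, invented for illustration only
toy = pd.DataFrame({
    'channel': ['referral', 'social media', 'referral'],
    'cost': [10.0, 20.0, 0.0],
    'revenue': [30.0, 10.0, 5.0],
})

# Replace zero costs with NaN so the division yields NaN instead of inf
safe_cost = toy['cost'].replace(0, np.nan)
toy['roi'] = (toy['revenue'] - toy['cost']) / safe_cost
print(toy)
```

The first row gives an ROI of 2.0, the second -0.5, and the zero-cost row is left as NaN so it does not distort the channel means.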


11. ROI by channel analysis

roi_by_channel = data.groupby('channel')['roi'].mean().reset_index()

sns.barplot(roi_by_channel, x='channel', y='roi');


12. ROI and other variable analysis

# Computing the correlation matrix
corr = data[['roi', 'cost', 'conversion_rate', 'revenue']].corr()

sns.heatmap(corr, annot=True, cmap=sns.color_palette('Purples_d'));


13. CLTV

data['cltv'] = (data['revenue'] - data['cost']) * data['conversion_rate'] / data['cost']
data.head()
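
The cltv column computed in this step is the per-row ROI scaled by the conversion rate. That is a different quantity from the purchase value, frequency and lifespan calculation described in the introduction, since this dataset has no purchase counts or lifespans to work from. A small check on invented values, assuming the same column names as the dataset, shows the equivalence:

```python
import pandas as pd

# Toy data, invented for illustration only
toy = pd.DataFrame({
    'cost': [10.0, 20.0],
    'revenue': [30.0, 50.0],
    'conversion_rate': [0.5, 0.25],
})

toy['roi'] = (toy['revenue'] - toy['cost']) / toy['cost']
toy['cltv'] = (toy['revenue'] - toy['cost']) * toy['conversion_rate'] / toy['cost']

# cltv should match roi * conversion_rate up to floating-point rounding
same = ((toy['cltv'] - toy['roi'] * toy['conversion_rate']).abs() < 1e-12).all()
print(toy[['roi', 'cltv']])
print(same)
```

Because of this scaling, a channel can rank high on ROI but lower on cltv if its conversion rate is small.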


14. CLTV by Channel

cltv_by_channel = data.groupby('channel')['cltv'].mean().reset_index()

sns.barplot(cltv_by_channel, x='channel', y='cltv');

15. Box plot

# Selecting records where channel is referral or social media
select = data[data['channel'].isin(['referral', 'social media'])]

sns.catplot(select, x='channel', y='cltv', kind='box');

Aleem Adewoyin

Certified Microsoft data analyst, BSc Computer Science👩‍🎓, MS Azure Fundamentals badge holder, an advocate for a better world💯, chess lover♟.