Customer Lifetime Value (CLV) Analysis using Python
Customer Lifetime Value (CLV) analysis estimates the total revenue a business can expect from a single customer account over the whole relationship. It combines the revenue a customer generates per period with the company's predicted customer lifespan. Businesses use this metric to identify the customer segments that are most valuable to the company.
Understanding and working to increase CLV can result in increased revenue for the company. It can help businesses make decisions about how much money to invest in acquiring new customers and how much they’re willing to spend to retain existing ones. It also affects many other business decisions, like marketing, customer support, product development, and pricing.
Average Purchase Value:
First, calculate this by dividing your company’s total revenue in a time period (usually one year) by the number of purchases over the course of that same time period.
Average Purchase Frequency:
This is calculated by dividing the number of purchases over the course of the time period by the number of unique customers who made purchases during that time period.
Customer Value:
This is calculated by multiplying the Average Purchase Value by the Average Purchase Frequency.
Average Customer Lifespan:
This is calculated by averaging out the number of years a customer continues purchasing from your company.
Customer Lifetime Value:
Multiply the Customer Value by the Average Customer Lifespan. This will give you the CLV.
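The five definitions above chain together into a single calculation. The following is a minimal sketch of that chain; the function name, parameter names, and the figures passed in are hypothetical and chosen only for illustration.

```python
def customer_lifetime_value(total_revenue, num_purchases,
                            num_customers, avg_lifespan_years):
    """Compute CLV from aggregate figures for one time period (e.g. one year)."""
    # Average Purchase Value: revenue per purchase
    avg_purchase_value = total_revenue / num_purchases
    # Average Purchase Frequency: purchases per unique customer
    avg_purchase_frequency = num_purchases / num_customers
    # Customer Value: revenue per customer in the period
    customer_value = avg_purchase_value * avg_purchase_frequency
    # Customer Lifetime Value: customer value over the average lifespan
    return customer_value * avg_lifespan_years


# Hypothetical figures, not taken from the dataset used below
clv = customer_lifetime_value(
    total_revenue=100_000,
    num_purchases=2_000,
    num_customers=500,
    avg_lifespan_years=3,
)
print(clv)  # 600.0
```

Note that the number of purchases cancels out, so Customer Value reduces to total revenue divided by the number of unique customers.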
1. Download dataset
https://statso.io/wp-content/uploads/2023/04/acquisition_data.zip
2. Read the data into Python and set up the colour palette for seaborn
import pandas as pd
%matplotlib inline
import matplotlib.pyplot as plt
import seaborn as sns
# Setting seaborn colour palette
# set_palette() changes the global default and returns None, so its result is not assigned
sns.set_palette('Set2')
data = pd.read_csv('customer-acquisition-data.csv')
data.head()
3. Check the structure of the data (column types and non-null counts)
data.info()
4. Statistical description of the data
data.describe()
5. Examine the Distribution of Numeric Variables
fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(12, 4))
ax1.set_title('Distribution of Cost')
sns.histplot(data, x='cost', kde=True, ax=ax1)
ax2.set_title('Distribution of Revenue')
sns.histplot(data, x='revenue', kde=True, ax=ax2);
6. Examine the Distribution of Non-Numeric (Categorical) Variables
sns.countplot(data, x='channel');
7. Aggregating Channels
# Grouping the data based on 'channel'
channel_groups = data.groupby('channel')
# Aggregating 'cost' for each channel group
cost_by_channel = channel_groups['cost'].mean().reset_index()
sns.barplot(cost_by_channel, x='channel', y='cost');
8. Aggregating Conversion Rate
# Aggregating 'conversion_rate' for each channel group
conversion_rate_by_channel = channel_groups['conversion_rate'].mean().reset_index()
sns.barplot(conversion_rate_by_channel, x='channel', y='conversion_rate');
9. Aggregating Revenue
# Aggregating 'revenue' for each channel group
revenue_by_channel = channel_groups['revenue'].sum().reset_index()
fig, ax = plt.subplots(figsize=(8, 6))
# Draw on the axes created above; each slice is a channel's share of total revenue
ax.pie(revenue_by_channel['revenue'], labels=revenue_by_channel['channel'], autopct='%1.1f%%');
10. ROI
data['roi'] = (data['revenue'] - data['cost']) / data['cost']
data.head()
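The ROI formula above divides by cost, so a row with zero cost has no meaningful ROI. With pandas columns such a division does not raise an error but yields inf or NaN, which would distort the per-channel means. Whether this dataset contains zero-cost rows is not stated here, so the following is only a sketch of the guard in plain Python; the `roi` helper and the sample rows are hypothetical.

```python
def roi(revenue, cost):
    """Return (revenue - cost) / cost, or None when cost is zero."""
    if cost == 0:
        return None  # ROI is undefined without any spend
    return (revenue - cost) / cost


# Hypothetical rows, not taken from the dataset
rows = [
    {"channel": "a", "cost": 10.0, "revenue": 25.0},
    {"channel": "b", "cost": 0.0, "revenue": 5.0},
]
rois = [roi(r["revenue"], r["cost"]) for r in rows]
print(rois)  # [1.5, None]
```

An ROI of 1.5 means the row returned 150% of its cost as profit. Rows that come back as None can then be dropped or reported separately before averaging.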
11. ROI by channel analysis
roi_by_channel = data.groupby('channel')['roi'].mean().reset_index()
sns.barplot(roi_by_channel, x='channel', y='roi');
12. Correlation between ROI and the other variables
# Computing the correlation matrix
corr = data[['roi', 'cost', 'conversion_rate', 'revenue']].corr()
sns.heatmap(corr, annot=True, cmap=sns.color_palette('Purples_d'));
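By default `DataFrame.corr()` computes the Pearson correlation coefficient for each pair of columns. To make the numbers in the heatmap easier to interpret, here is a minimal sketch of that coefficient for one pair of variables in plain Python; the `pearson` helper and the sample values are hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of co-deviations, and sums of squared deviations for each variable
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Hypothetical values: revenue is exactly twice the cost
cost = [1.0, 2.0, 3.0, 4.0]
revenue = [2.0, 4.0, 6.0, 8.0]
r = pearson(cost, revenue)
print(r)  # 1.0
```

A value near 1 indicates a strong positive linear relationship, a value near -1 a strong negative one, and a value near 0 little linear relationship.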
13. CLTV
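# CLTV proxy used in this analysis: ROI scaled by the conversion rate.
# Note that this differs from the purchase-value formula defined in the introduction.
data['cltv'] = (data['revenue'] - data['cost']) * data['conversion_rate'] / data['cost']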
data.head()
14. CLTV by Channel
cltv_by_channel = data.groupby('channel')['cltv'].mean().reset_index()
sns.barplot(cltv_by_channel, x='channel', y='cltv');
15. Box plot
# Selecting records where channel is referral or social media
select = data[data['channel'].isin(['referral', 'social media'])]
sns.catplot(select, x='channel', y='cltv', kind='box');
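A box plot summarises each group by its quartiles, and by default draws whiskers out to 1.5 times the interquartile range, with points beyond that shown as outliers. The following sketch reproduces those summary numbers in plain Python, assuming linear interpolation between data points for the quartiles; the `box_summary` helper and the sample values are hypothetical.

```python
import statistics


def box_summary(values):
    """Quartiles, whisker fences, and outliers for one group of values."""
    # method="inclusive" interpolates linearly between sorted data points
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower_fence = q1 - 1.5 * iqr
    upper_fence = q3 + 1.5 * iqr
    outliers = [v for v in values if v < lower_fence or v > upper_fence]
    return {
        "q1": q1,
        "median": median,
        "q3": q3,
        "lower_fence": lower_fence,
        "upper_fence": upper_fence,
        "outliers": outliers,
    }


# Hypothetical CLTV values for a single channel
summary = box_summary([1, 2, 3, 4, 5, 30])
print(summary["median"], summary["outliers"])  # 3.5 [30]
```

Comparing the medians and interquartile ranges of the two channels gives a more robust picture than comparing their means alone, since a few extreme rows can pull a mean substantially.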