RFM Analysis: An Effective Customer Segmentation Technique Using Python
RFM analysis enables personalized marketing, increases engagement, and lets you target specific, relevant offers at the right groups of customers.
This post explores the benefits of RFM analysis, gives step-by-step instructions for performing it in Python, and finally showcases the resulting RFM customer segments, which can be used to maximize ROI.
What is RFM Analysis?
RFM analysis is a data-driven customer behavior segmentation technique where RFM stands for recency, frequency, and monetary value.
The idea is to segment customers based on when their last purchase was (Recency), how often they have purchased in the past (Frequency), and how much they spent (Monetary). All three of these measures have proven to be effective predictors of a customer's willingness to engage with marketing messages and offers.
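To make the three measures concrete, here is a minimal sketch that derives them from a raw transaction log. The table and its column names (`customer_id`, `order_date`, `amount`) are hypothetical and are not part of the dataset used in this post. Monetary is taken here as total spend, whereas the walkthrough below uses spend per visit.

```python
import pandas as pd

# Hypothetical transaction log; names and values are made up for illustration.
transactions = pd.DataFrame({
    "customer_id": [1, 1, 2, 3, 3, 3],
    "order_date": pd.to_datetime([
        "2023-01-05", "2023-03-10", "2023-02-20",
        "2023-01-15", "2023-02-15", "2023-03-28",
    ]),
    "amount": [50.0, 70.0, 20.0, 30.0, 45.0, 60.0],
})

# Reference date: one day after the most recent order in the data.
snapshot = transactions["order_date"].max() + pd.Timedelta(days=1)

rfm = transactions.groupby("customer_id").agg(
    # Days since the customer's last purchase (lower is better).
    Recency=("order_date", lambda d: (snapshot - d.max()).days),
    # Number of purchases.
    Frequency=("order_date", "count"),
    # Total amount spent.
    Monetary=("amount", "sum"),
)
print(rfm)
```

Each row of `rfm` then describes one customer, which is the shape the rest of the analysis works with.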
How to do an RFM Analysis in Python?
We will follow five steps to perform RFM analysis, each explained below using data from an apparel retail store.
Importing useful Python libraries:
import numpy as np
import pandas as pd
import datetime as dt
from datetime import datetime
%matplotlib inline
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker
import seaborn as sns
import squarify
from sklearn.cluster import KMeans
Step 1: Data Import
rfm_single_view=pd.read_csv('RFM Data.csv')
rfm_single_view.head()
The sample records of the imported data:
Step 2: Data Preprocessing
Preprocessing involves two steps:
I- Dropping records with empty values
rfm_single_view.dropna(axis=0,inplace=True)
II- Removing the top 1% of records on each metric, as they might skew the analysis. These customers can be studied separately to determine whether they are outliers or genuine bulk buyers.
#Creating individual tables of RFM after removing top 1%
recency_cleaned = rfm_single_view[rfm_single_view['Recency']<rfm_single_view['Recency'].quantile(0.99)]
frequency_cleaned = rfm_single_view[rfm_single_view['Visits']<rfm_single_view['Visits'].quantile(0.99)]
monetary_cleaned = rfm_single_view[rfm_single_view['Spend Per Visit']<rfm_single_view['Spend Per Visit'].quantile(0.99)]
#Merging three dataframes to create rfm table
rfm_table=pd.merge(pd.merge(recency_cleaned[['Cap User ID','Recency','Visits','Spend Per Visit']],frequency_cleaned[['Cap User ID']],on='Cap User ID'),monetary_cleaned[['Cap User ID']],on='Cap User ID')
In the preprocessed data, Visits serves as the Frequency field and Spend Per Visit is taken as the Monetary field.
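As an alternative to the three filtered tables and two merges, the same filtering can be sketched with a single boolean mask, followed by the column renaming described above. This assumes one row per `Cap User ID`. The toy DataFrame stands in for 'RFM Data.csv' and its values are made up.

```python
import pandas as pd

# Toy stand-in for the imported data; values are made up for illustration.
rfm_single_view = pd.DataFrame({
    "Cap User ID": range(1, 201),
    "Recency": range(1, 201),
    "Visits": [1 + (i % 10) for i in range(200)],
    "Spend Per Visit": [10.0 + i for i in range(200)],
})

cols = ["Recency", "Visits", "Spend Per Visit"]

# Drop rows with missing values, as in the first preprocessing step.
rfm_single_view = rfm_single_view.dropna(axis=0)

# 99th percentile of each metric (a Series indexed by column name).
cutoffs = rfm_single_view[cols].quantile(0.99)

# Keep customers strictly below the cutoff on all three metrics.
keep = (rfm_single_view[cols] < cutoffs).all(axis=1)

rfm_table = (
    rfm_single_view.loc[keep, ["Cap User ID"] + cols]
    .rename(columns={"Visits": "Frequency", "Spend Per Visit": "Monetary"})
    .reset_index(drop=True)
)
print(rfm_table.head())
```

Because the comparison is strict, every customer tied at the cutoff value is removed as well, so slightly more than 1% of rows can be dropped when a metric has many repeated values.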
Step 3: Deciding RFM Clusters
First, we decide on the optimum number of clusters using the K-means clustering algorithm. Here, 3 turns out to be the optimum number of clusters, which means recency, frequency, and monetary value are each split into three groups.
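The post does not show the clustering code, so the following is only a sketch of one common way to do it: the elbow method on K-means inertia to choose the number of clusters, then a relabeling step so that cluster numbers become ordered scores. The helper `cluster_score`, the synthetic data, and the choice of the elbow method are assumptions for illustration, not the author's exact procedure.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans

# Synthetic recency values (in days) forming three loose groups; made-up data.
rng = np.random.default_rng(0)
recency = pd.DataFrame({"Recency": np.concatenate([
    rng.normal(10, 2, 100), rng.normal(60, 5, 100), rng.normal(150, 10, 100),
])})

# Elbow method: record the within-cluster sum of squares for k = 1..6
# and look for the k after which the improvement flattens out.
inertia = {}
for k in range(1, 7):
    model = KMeans(n_clusters=k, n_init=10, random_state=0)
    model.fit(recency[["Recency"]])
    inertia[k] = model.inertia_


def cluster_score(df, column, k=3, ascending=True):
    """Cluster one metric with K-means and return ordered scores 0..k-1.

    K-means labels are arbitrary, so clusters are ranked by their mean.
    With ascending=True the lowest-mean cluster gets score 0.
    """
    labels = KMeans(n_clusters=k, n_init=10, random_state=0).fit_predict(df[[column]])
    means = df.groupby(labels)[column].mean().sort_values(ascending=ascending)
    rank = {label: score for score, label in enumerate(means.index)}
    return pd.Series(labels, index=df.index).map(rank)


# A lower recency is better, so rank in descending order of the mean:
# the most recent buyers end up with score 2.
recency["R"] = cluster_score(recency, "Recency", ascending=False)
print(recency.groupby("R")["Recency"].agg(["min", "max", "count"]))
```

For frequency and monetary value the default `ascending=True` would apply, so that the highest-value cluster receives score 2.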
Data visualizations after deciding RFM clusters
Step 4: Finding a Combined RFM Score
The individual R, F, and M scores, which range from 0 to 2 because we decided on 3 clusters, are now summed up to get a combined RFM score for each customer.
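A minimal sketch of this step, assuming the per-metric scores are stored in columns named `R`, `F`, and `M` (hypothetical names, with made-up values). Alongside the summed score it also builds a three-digit code such as 222, since the segment definitions in the next step are written as digit patterns.

```python
import pandas as pd

# Made-up per-metric scores for four customers.
scores = pd.DataFrame({
    "Cap User ID": [101, 102, 103, 104],
    "R": [2, 2, 0, 1],
    "F": [2, 0, 0, 1],
    "M": [2, 1, 2, 1],
})

# Summed score: a single number between 0 and 6.
scores["RFM_Sum"] = scores[["R", "F", "M"]].sum(axis=1)

# Concatenated code: keeps the three digits apart, e.g. "201".
scores["RFM_Code"] = (
    scores["R"].astype(str) + scores["F"].astype(str) + scores["M"].astype(str)
)
print(scores)
```

Note that the sum alone cannot distinguish 201 from 111 (both add up to 3), which is why the digit code is the handier form for rule-based segments.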
Step 5: Generating Unique Customer Segments based on RFM Score
1. Core - Your Best Customers
RFM Score: 222
Who They Are: Highly engaged customers who have bought most recently, most often, and generated the most revenue.
2. Loyal - Your Most Loyal Customers
RFM Score: X2X
Who They Are: Customers who buy the most often from your store.
3. Whales - Your Highest Paying Customers
RFM Score: XX2
Who They Are: Customers who have generated the most revenue for your store.
4. Rookies - Your Newest Customers
RFM Score: 20X
Who They Are: First-time buyers on your site.
5. Slipping - Once Loyal, Now Gone
RFM Score: 00X
Who They Are: Great past customers who haven't bought in a while.
6. Regular - Customers with Common Behaviour Across These Metrics
RFM Score: Remaining scores
Who They Are: Customers who have average metrics across the RFM scores.
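The segment rules above can be sketched as a small function. Several patterns overlap (222 also matches X2X and XX2, and 002 matches both XX2 and 00X), so some precedence is needed. Checking the rules in the order they are listed is an assumption made here, and `assign_segment` is a hypothetical helper name.

```python
def assign_segment(r, f, m):
    """Map R, F, M scores (each 0, 1, or 2) to a segment name.

    Rules are checked in the order the segments are listed, so the
    first matching pattern wins (an assumed precedence).
    """
    if (r, f, m) == (2, 2, 2):
        return "Core"        # 222
    if f == 2:
        return "Loyal"       # X2X
    if m == 2:
        return "Whales"      # XX2
    if r == 2 and f == 0:
        return "Rookies"     # 20X
    if r == 0 and f == 0:
        return "Slipping"    # 00X
    return "Regular"         # remaining scores


for code in [(2, 2, 2), (0, 2, 1), (1, 1, 2), (2, 0, 1), (0, 0, 0), (1, 1, 1)]:
    print(code, assign_segment(*code))
```

With scores held in a DataFrame, the function could be applied row by row, for example with `df.apply(lambda row: assign_segment(row["R"], row["F"], row["M"]), axis=1)`.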
A snapshot of some of the KPIs for each customer segment clearly shows that the best groups are the Core and Loyal customer segments.
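The code that builds the per-segment summary is not shown in this post. One way to produce such a table, including a `CustomerCount` column like the one used for plotting, might look like the sketch below. The `tagged` DataFrame, its values, and the KPI column names are hypothetical.

```python
import pandas as pd

# Made-up customers already tagged with a segment.
tagged = pd.DataFrame({
    "Cap User ID": [1, 2, 3, 4, 5, 6],
    "Segment": ["Core", "Loyal", "Loyal", "Regular", "Rookies", "Regular"],
    "Recency": [3, 20, 35, 60, 5, 80],
    "Frequency": [12, 10, 9, 4, 1, 3],
    "Monetary": [90.0, 40.0, 50.0, 30.0, 25.0, 20.0],
})

# One row per segment with the customer count and average R, F, M values.
segment_kpis = (
    tagged.groupby("Segment")
    .agg(
        CustomerCount=("Cap User ID", "nunique"),
        AvgRecency=("Recency", "mean"),
        AvgFrequency=("Frequency", "mean"),
        AvgMonetary=("Monetary", "mean"),
    )
    .reset_index()
)
print(segment_kpis)
```

`groupby` sorts the segment names alphabetically by default. Taking plot labels from the `Segment` column of such a table, rather than typing them by hand, keeps each label aligned with its count.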
Let’s create a nice visualization for our data.
#Create our RFM Segment plot and resize it.
fig = plt.gcf()
ax = fig.add_subplot()
fig.set_size_inches(12, 8)
squarify.plot(sizes=rfm_single_view_after_tags_v1['CustomerCount'],
              label=['Core',
                     'Loyal',
                     'Regular',
                     'Rookies',
                     'Slipping',
                     'Whales'], alpha=0.8)
plt.title("RFM Segments",fontsize=18,fontweight="bold")
plt.axis('off')
plt.show()
Conclusion
The RFM technique is a proven marketing model that helps retailers and e-commerce businesses maximize the return on their marketing investments.
The RFM customer segments generated above can be used to identify high-ROI segments and engage them with personalized offers.