Unveiling Customer Behavior: An In-depth Analysis of RFM Segmentation

Adesanya Oreoluwa
6 min readApr 24, 2023

--

The process of segmenting customers using RFM analysis in MySQL

Customer Performance

Are you curious about how to identify the most valuable customers in your organization? One effective technique for customer segmentation is RFM analysis, which can help you group customers based on their activities.

In today’s competitive business landscape, understanding customer behavior is key to staying ahead of the game. Customer segmentation is a popular method for dividing customers into groups based on shared characteristics, such as demographics, behavior, or purchase history.

One effective approach for customer segmentation is RFM analysis, which stands for Recency, Frequency, and Monetary value. This technique allows marketers to analyze customer behavior based on three factors and create customer segments that can be used for targeted marketing campaigns.

I will delve into the details of RFM analysis and explain how to perform it using SQL to segment customers based on their purchasing behavior.

What is RFM Analysis

RFM analysis is a marketing technique that involves analyzing customer behavior based on three factors: recency, frequency, and monetary value.

These three factors are used to create customer segments that can be targeted for specific marketing campaigns.

The recency factor measures the time since the customer’s last purchase, with recent customers being more valuable than those who have not purchased in a long time.

Frequency refers to how often the customer makes purchases, with more frequent buyers being more valuable.

The monetary value measures how much the customer spends on purchases, with higher spenders being more valuable.

By analyzing these factors, marketers can create customer segments based on purchasing behavior.

Performing RFM analysis using SQL

In order to conduct RFM analysis through SQL, I collected data on the purchase history of each customer, which included information such as the date of purchase, purchase amount, and customer ID. By utilizing SQL, I was able to derive the recency, frequency, and monetary value metrics for each individual customer. This approach allowed me to effectively analyze customer behavior and develop valuable insights for the business.

select 
CUSTOMERNAME,
ROUND(SUM(sales), 2) Monetary,
COUNT(ORDERNUMBER) Frequency,
datediff((SELECT MAX(STR_T0_DATE(ORDERDATE, '%m/%d/%Y %t'))
FROM sales_data_sample), MAX(STR_TO_DATE(ORDERDATE, '%m/%d/%Y %t'))) AS Recency
FROMsales_data_sample
GROUP BY CUSTOMERNAME;

This is the query I used to calculate the monetary value, frequency, and recency of purchases made by customers. To calculate the monetary value, I employed the SUM function to determine the total amount spent by each customer. For frequency, I used the COUNT function to count the number of unique purchase dates for each customer. Lastly, for recency, I utilized the MAX function to calculate the time since the most recent purchase by subtracting the maximum purchase date for each customer from the current date.

In order to analyze customer behaviors, I have employed a customer segmentation strategy utilizing RFM analysis. This method allows me to better focus my marketing efforts by dividing my customer base into segments.

For this purpose, I have opted for the quintiles method, which is a straightforward and uncomplicated way to perform customer segmentation.

What are Quintiles

Quintiles refer to a statistical technique that involves dividing a dataset into four intervals or five equal parts based on the distribution of values. Each interval represents a quintile that businesses can use to categorize customer data into groups for analysis and comparison.

By using quintiles to segment their customers, businesses can identify different customer groups with unique needs and behaviors. Analyzing the characteristics and behaviors of each quintile enables businesses to customize their marketing and communication strategies to better meet the needs of each group.

Quintiles can help businesses identify their most valuable customers and develop loyalty programs or personalized offers to retain them. Additionally, they can identify customers who are at risk of churning and implement strategies to re-engage them.

Overall, quintile-based customer segmentation provides businesses with a powerful tool to understand their customers and develop targeted strategies that improve customer satisfaction and retention, leading to increased revenue and profitability.

I utilized SQL to compute the quintile value for Recency, Frequency, and Monetary metrics. I achieved this by utilizing the NTILE() function.

SELECT*,
NTILE(4) OVER (order by Recency desc) rfm_recency,
NTILE(4) OVER (order by Frequency) rfm_frequency,
NTILE(4) OVER (order by Monetary) rfm_monetary
FROM rfm;

I performed customer segmentation into four distinct groups based on the recency and frequency of their orders. The closer a customer’s maximum order date is to their last order date, the higher their recency score. Additionally, customers with a higher number of orders were considered to be more valuable.

I utilized the Create Table Function in MySql to create a temporary table called rfm_table for the queries I had previously performed. To calculate the RFM values, I added the recency, monetary, and frequency values together using the CONCAT function instead of the SUM function, as I needed the values to be in string format rather than numeric format.

create table rfm_table as WITH  rfm as 
(
select
CUSTOMERNAME,
ROUND(SUM(sales), 2) Monetary,
COUNT(ORDERNUMBER) Frequency,
datediff((select max(str_to_date(ORDERDATE, '%m/%d/%Y %t'))
from sales_data_sample), max(str_to_date(ORDERDATE, '%m/%d/%Y %t'))) AS Recency
from sales_data_sample
group by CUSTOMERNAME
),
rfm2 AS (
select *,
NTILE(4) OVER (order by Recency desc) rfm_recency,
NTILE(4) OVER (order by Frequency) rfm_frequency,
NTILE(4) OVER (order by Monetary) rfm_monetary
from rfm
)
select rfm2.*, rfm_recency+ rfm_frequency+ rfm_monetary as rfm_cell,
concat(cast(rfm_recency as char),cast(rfm_frequency as char),cast(rfm_monetary as char))rfm_cell_string
from rfm2;

Next, I used the CASE function to segment customers into six distinct groups based on their purchase behavior using RFM. These customer segments are:

Lost Customers: These customers have stopped visiting the store and are considered lost due to their low recency and high-frequency scores.

Slipping Away Customers: This group includes valuable customers who are big spenders, but have low recency scores. The company cannot afford to lose them as they tend to spend a lot when they do visit.

New Customers: This group includes customers who have recently visited the store and thus have a high recency score.

Potential Churners: These customers have recently visited the store, made purchases, and spent money. However, they have the potential to stop visiting the store and should be closely monitored.

Active Customers: This group includes customers who buy frequently, but at low price points, and have made recent purchases.

Loyal Customers: This is the most valuable group of customers who visit the store and make purchases often, and are willing to spend more money on high-price point items.

select CUSTOMERNAME, rfm_recency, rfm_frequency, rfm_monetary, concat(cast(rfm_recency as char),cast(rfm_frequency as char),cast(rfm_monetary as char))rfm_cell_string,
(case
when rfm_cell_string in (111, 112 , 121, 122, 123, 132, 211, 212, 114, 141) then 'lost_customers'
when rfm_cell_string in (133, 134, 143, 244, 334, 343, 344, 144) then 'slipping_away_customers'
when rfm_cell_string in (221,222, 223, 232, 234, 243, 233, 312, 322) then 'potential_churners'
when rfm_cell_string in (323, 333,321, 412, 421, 422, 423, 332, 432) then 'active'
when rfm_cell_string in (433, 434, 443, 444) then 'loyal'
end) as ref_segment
from rfm_table;

These customer segments can be leveraged to design tailored marketing strategies and personalize our approach to each segment.

An exclusive promotional offer can be extended to customers classified under the active customers and potentials to incentivize them to make additional purchases.

Alternatively, loyalty rewards should be provided to customers in the Loyal segment to cultivate their continued patronage.

Additionally, it is imperative to enhance the commercial rapport with customers from the slipping away and new customer segments, to mitigate potential churn and fortify their engagement with our brand.

However, developing targeted retention strategies to win back these lost customers could potentially lead to increased revenue and business growth.

Finally, it is important to analyze the reasons behind the loss of these customers to avoid similar situations in the future.

click here to view the full SQL code

--

--

Adesanya Oreoluwa

Data Anaylst| Research Analyst | Power Bi | Microsoft Excel | SQL