RFM Analysis Step by Step: Guide to building an RFM model (+ Open-source Example)

Tadeh Alexani
Published in Formaloo
7 min read · May 5, 2021

Introduction

In a previous article, we talked about different ways to segment customers.

(If this is the first time you are hearing about customer segmentation, I strongly suggest reading that article first.)

One of the most powerful segmentation methods businesses can use today is RFM segmentation. In this article, we will cover what RFM is, why your business should use it, and how we at Formaloo built it for our CDP customers.

About Formaloo CDP:
Formaloo Customer Data Platform (CDP) collects, analyzes, and unifies data from all data sources in order to grow customers’ loyalty.

What is RFM Segmentation?

Source: Particle, Inc.

In short, RFM is a method used for analyzing customer value.

RFM stands for three dimensions:
• Recency — How recently did the customer purchase?
• Frequency — How often do they purchase?
• Monetary Value — How much do they spend?

For example, a service-based business could use these calculations:

• Recency = the maximum of “5 minus the number of months that have passed since the customer last purchased” and 1
• Frequency = the maximum of “the number of purchases by the customer in the last 12 months (with a limit of 5)” and 1
• Monetary = the highest value of all purchases by the customer expressed as a multiple of some benchmark value
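The three calculations above can be sketched in a few lines of Python. This is only an illustration of the example formulas, not Formaloo's implementation; the function names and the benchmark value of 100 are assumptions made for the example.

```python
def recency_score(months_since_last_purchase: int) -> int:
    """max(5 - months since the last purchase, 1)."""
    return max(5 - months_since_last_purchase, 1)


def frequency_score(purchases_last_12_months: int) -> int:
    """Purchases in the last 12 months, capped at 5 and floored at 1."""
    return max(min(purchases_last_12_months, 5), 1)


def monetary_score(purchase_values: list[float], benchmark: float) -> float:
    """Highest single purchase expressed as a multiple of the benchmark."""
    return max(purchase_values) / benchmark


# Hypothetical customer: last bought 2 months ago, made 7 purchases this year,
# largest purchase 300 against an assumed benchmark of 100.
example = (
    recency_score(2),
    frequency_score(7),
    monetary_score([120.0, 300.0, 80.0], benchmark=100.0),
)
print(example)  # (3, 5, 3.0)
```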

Alternatively, categories can be defined for each attribute. For instance, Recency might be broken into three categories: customers with purchases within the last 90 days; between 91 and 365 days; and longer than 365 days. Such categories may be derived from business rules or using data mining techniques to find meaningful breaks.
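As a small sketch of the category approach, the three Recency bands mentioned above could be expressed like this. The band boundaries come from the text; the category labels are hypothetical names chosen for the example.

```python
def recency_category(days_since_last_purchase: int) -> str:
    """Map days since the last purchase to one of three Recency bands."""
    if days_since_last_purchase <= 90:
        return "recent"    # purchased within the last 90 days
    if days_since_last_purchase <= 365:
        return "lapsing"   # between 91 and 365 days
    return "inactive"      # longer than 365 days


print([recency_category(d) for d in (30, 200, 400)])
# ['recent', 'lapsing', 'inactive']
```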

Why should we use RFM Segmentation?

RFM is one of the cleanest and most powerful segmentation methods. As a business, you can gain strong insight into the status of your customers from just 3 simple variables, which any business can calculate and store easily with a customer data platform and AI tools.

It enables marketers and business owners to target specific customer segments with communications that are much more relevant to their particular behavior, and thus to generate much higher response rates, plus increased loyalty and customer lifetime value.

Depending on the nature of your e-commerce business, an RFM analysis will most likely show that 10–25% of your customers account for 60–80% of sales (the 80/20 rule). You can then target these customers and keep them active by using the different RFM-based segments.

How to effectively segment your customers with RFM Analysis?

To demonstrate this, I will first explain how our gamification system works. Formaloo connects to your business and integrates your business data (customers, orders, etc.) from different data sources. Using the Formaloo actions system, you as a business owner can define custom actions (for example, a click on a particular button can count as its own action) or use a predefined action, such as completing an order, which records the order date, the order’s monetary amount, and so on for each customer.

We generate meaningful variables from this information and store them as separate gamification variables. You can get the information below (plus dozens of other data points) about your customers through our API or simply by checking customer profiles on the CDP’s dashboard. These are the 3 variables we use in Formaloo’s CDP gamification system to calculate RFM scores for customers:

  • R = Recency: How recently did the customer purchase?
  • F = Frequency: How often does the customer purchase?
  • TM = Total Monetary Value: How much do they spend in total?

So the task is to generate an RFM score for each customer and use it to assign each customer a relevant segment tag. During implementation, we faced 2 obstacles. The first was that the values of those 3 variables weren’t normalized, and the second was choosing the right segment title based on the final RFM score.

We started by illustrating a flowchart for the whole process based on what we knew and what we wanted to achieve:

RFM Calculator Flowchart

Using Formaloo’s Customer Insights API, we first retrieve a business’s customers. The gamification variables in the response contain the R, F & M values, which we normalize to fall between 1 and 5 (the RFM standard). Then, using the resulting RFM score for each customer, we assign a tag with the proper segment title according to that score.

In Formaloo CDP, each customer can have any number of tags, which help business owners and marketers analyze and cluster their customers.

Finally, we will use the API’s customers batch update endpoint to update the CDP’s customers.
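The overall flow described above (retrieve, normalize, tag, batch update) can be sketched as a small pipeline. This is a simplified illustration rather than our production code: the four callables are hypothetical stand-ins, and no real Formaloo API endpoints or response shapes are shown here.

```python
from typing import Callable


def run_rfm_pipeline(
    fetch_customers: Callable[[], list[dict]],
    normalize: Callable[[list[dict]], list[str]],
    pick_tag: Callable[[str], str],
    batch_update: Callable[[list[dict]], None],
) -> list[dict]:
    """Retrieve customers, score them, tag them, and push the tags back."""
    customers = fetch_customers()        # 1. customers with their R, F, M values
    scores = normalize(customers)        # 2. one RFM score such as "535" each
    updates = [                          # 3. choose a tag for each score
        {"id": customer["id"], "tag": pick_tag(score)}
        for customer, score in zip(customers, scores)
    ]
    batch_update(updates)                # 4. send all tag updates in one batch
    return updates


# In-memory stand-ins (assumptions) so the sketch runs without network access.
sent: list[dict] = []
result = run_rfm_pipeline(
    fetch_customers=lambda: [{"id": 1, "R": 3, "F": 40, "M": 900.0}],
    normalize=lambda customers: ["555" for _ in customers],
    pick_tag=lambda score: "Champion" if score == "555" else "Promising",
    batch_update=sent.extend,
)
print(result)  # [{'id': 1, 'tag': 'Champion'}]
```

Passing the steps in as functions keeps the flow readable and lets each step be swapped for a real API call or a real scoring rule later.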

So let’s open up the normalization process. As discussed, we have 3 variables (R, F & M) in different ranges, but we want each final value to range between 1 and 5. We normalized the scores using the formula below, taking the target minimum to be 1 and the target maximum to be 5, together with the minimum and maximum of each of the R, F & M variables across the list of customers and the customer’s own value:

Normalization Formula
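A common way to write such a rescaling (min-max normalization), consistent with the description above, is: scaled = 1 + (value − min) × (5 − 1) / (max − min), where min and max are taken over all customers for that variable. The sketch below applies it to one variable. The function names are hypothetical, and the handling of the case where every customer has the same value is an assumption.

```python
def rescale(value: float, low: float, high: float,
            new_min: float = 1.0, new_max: float = 5.0) -> float:
    """Min-max rescale `value` from [low, high] to [new_min, new_max]."""
    if high == low:
        # Every customer has the same value; using the midpoint is an
        # arbitrary choice made for this sketch.
        return (new_min + new_max) / 2
    return new_min + (value - low) * (new_max - new_min) / (high - low)


def normalize_column(values: list[float]) -> list[float]:
    """Rescale one variable (R, F, or M) across the whole customer list."""
    low, high = min(values), max(values)
    return [rescale(v, low, high) for v in values]


frequencies = [1, 3, 9]  # hypothetical purchase counts for three customers
print(normalize_column(frequencies))  # [1.0, 2.0, 5.0]
```

One thing to keep in mind: if Recency is stored as time since the last purchase, a smaller raw value is better, so that variable may need to be inverted before or after scaling.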

Now that our first obstacle is removed, let’s move on to the second: choosing the proper tags based on each score (ranging from 111 to 555).
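To make the 111 to 555 range concrete, here is a minimal sketch of how three values already scaled to the 1 to 5 range could be combined into a three-digit score. Rounding to the nearest whole number is an assumption for this example, and `rfm_score` is a hypothetical helper name.

```python
def rfm_score(r: float, f: float, m: float) -> str:
    """Combine scaled R, F, M values into a score string such as '524'."""
    # Round half up, then clamp to the 1..5 range to be safe.
    digits = [min(5, max(1, int(x + 0.5))) for x in (r, f, m)]
    return "".join(str(d) for d in digits)


print(rfm_score(5.0, 2.4, 3.6))  # 524
```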

In this step, our goal is to select groups of customers to whom specific types of communications will be sent, based on the RFM segments in which they appear. It is helpful to assign names to segments of interest. After spending some time researching different naming styles and segmentation options, we prepared the chart below, which is Formaloo’s exclusive RFM-based segments chart:

Formaloo RFM-based Segments

Based on the chart above, we segment customers into 10 separate segments and, as a rule, each customer must be in exactly one segment at any time. The next step was to define a set of rules for these segments, so that whenever a customer meets one of them, we assign the matching tag.

After discussing and revising the rules for some time, we arrived at the set of rules described below (with 5 as the highest value and 1 as the lowest):

1. Champion

Your best customers: they buy and spend a lot and made their last purchase recently. -> 555

2. Loyal Customer

Very good customers — they spend a lot. -> X5X

3. Potential Loyalist

Recent customers who have already spent a lot. -> Added (registered) less than 3 months ago, but have spent more than the ATM (Average Monetary Value)

4. New Customer

Recent customers, who made only a few purchases. -> 52X

5. Promising (Default)

Customers who buy frequently and spend a lot, but made their last purchase some time ago. -> X53 and X52

6. Need Attention (Default)

Customers with recency and above-average spending. -> Customers whose New Level < Last Level

The level is each customer’s health and loyalty score. For every customer, it is a number between 0 and 10, calculated once after a data source is imported or connected and then once every month.

7. At Risk (About to leave)

Customers who bought frequently, but haven’t made any purchases in a long time. -> Customers whose New Level < Last Level (3 times)

8. Can’t lose them

Customers who have spent a lot, but have been inactive for a long time. -> 22X

9. Hibernate

Low-frequency, low-spender customers who haven’t bought in a long time. (6 months) -> R & F & M < 3

10. Lost

Your worst customers. They haven’t bought in a long time, they only bought once (or very few times) and they spent very little. -> 111

As a rule of thumb, if none of the conditions above is met while processing a customer’s RFM score, we check whether the final RFM score is above or below average and assign one of the two default tags, Promising or Need Attention.
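The score-based rules above can be sketched as a simple pattern matcher. This is a hedged illustration, not the code from our calculator. It makes several assumptions: digits are read in R, F, M order; patterns are checked from most to least specific so that overlapping rules (for example X53 and X5X) stay reachable; the default compares the mean of the three digits with an assumed average of 3, giving Promising above it and Need Attention otherwise. The rules that depend on registration date or level history (Potential Loyalist, the level-based Need Attention rule, and At Risk) are left out because they need data beyond the score.

```python
# "X" matches any digit. The ordering is an assumption: most specific first.
PATTERN_RULES = [
    ("555", "Champion"),
    ("111", "Lost"),
    ("X53", "Promising"),
    ("X52", "Promising"),
    ("52X", "New Customer"),
    ("22X", "Can't lose them"),
    ("X5X", "Loyal Customer"),
]


def matches(pattern: str, score: str) -> bool:
    """True if every non-wildcard position of the pattern equals the score."""
    return all(p == "X" or p == s for p, s in zip(pattern, score))


def segment_for(score: str, average: float = 3.0) -> str:
    """Return exactly one segment tag for a score such as '453'."""
    for pattern, tag in PATTERN_RULES:
        if matches(pattern, score):
            return tag
    digits = [int(ch) for ch in score]
    if all(d < 3 for d in digits):
        return "Hibernate"  # R & F & M < 3
    # Default tags: compare the mean digit with the assumed average.
    return "Promising" if sum(digits) / 3 > average else "Need Attention"


for s in ("555", "454", "521", "121", "313"):
    print(s, "->", segment_for(s))
# 555 -> Champion
# 454 -> Loyal Customer
# 521 -> New Customer
# 121 -> Hibernate
# 313 -> Need Attention
```

Because the function returns on the first match, each customer ends up with exactly one tag, which matches the one-segment-per-customer rule.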

Wrap Up

That's it! We've walked through every part of our RFM calculator. We are happy to announce that the code of this calculator is available as open source here:

It is written in PHP, but you can easily port it to whatever language you develop your service in. So feel free to use it and contribute!

Thank you for reading this story. I would love to hear your feedback and your experiences with it. If you want to contact me or ask me any questions, here is my LinkedIn; I would be happy to hear from you.
