Using customer segments to drive decision making and increase revenue
Understanding customer behaviour is key to capturing growth opportunities and solving pressing business problems. In a study by Boston Consulting Group¹, more than 75% of senior executives said customer insights were key to accelerating growth. When asked which capabilities their companies needed to develop, they listed customer insights at the top.
Although the world of customer experience and insights has expanded significantly over the last decade, with a range of tools now available to companies, the focus of this article is customer segmentation. In Bain’s Customer Experience Tools and Trends survey 2020², which covered 1,200 executives across a range of industries, 67% said they had adopted customer segmentation as an important customer experience tool over the last 5 years.
On the surface, the concept of segmentation is quite simple: divide the customers into segments so that the customers in each segment share some common characteristics. How these segments are best defined depends on the use case. This article discusses two key techniques:
1. Cohort Analysis
2. RFM Segmentation
Cohort Analysis
Cohort analysis groups customers into mutually exclusive cohorts, which can then be measured over time for high-level trends and insights. There are three different types of cohorts:
1. Time cohorts: Time cohorts group customers based on their purchase activity in a defined time frame. Comparing customers acquired in January with those acquired in December is an example of time cohort analysis. This comparison can be especially insightful since December is likely to have more new customers and higher spending due to the holidays.
2. Behaviour cohorts: These cohorts group customers based on their behaviour in terms of the type of product or service purchased. An airline, for example, has regular business flyers and the occasional leisure flyers. Analysing them as different cohorts will help personalise the customer experience effectively.
3. Size cohorts: These cohorts group customers based on their amount of spending, or the quantity of products purchased in a defined timeframe. Low, medium, and high spend cohorts can be defined based on spending levels over a 12-month period.
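As a rough illustration of size cohorts, the sketch below buckets customers by their 12-month spend with pandas. The spend figures, the thresholds (100 and 500) and the names `spend` and `size_cohort` are hypothetical and chosen only for the example; they do not come from the article's dataset.

```python
import pandas as pd

# Hypothetical 12-month spend per customer (illustrative values only).
spend = pd.Series(
    [40.0, 120.0, 560.0, 75.0, 1500.0, 310.0],
    index=pd.Index([1, 2, 3, 4, 5, 6], name="CustomerID"),
    name="Spend12M",
)

# Assumed thresholds: up to 100 is Low, up to 500 is Medium, anything above is High.
size_cohort = pd.cut(
    spend,
    bins=[0, 100, 500, float("inf")],
    labels=["Low", "Medium", "High"],
)

# Number of customers and average spend in each size cohort.
summary = spend.groupby(size_cohort, observed=False).agg(["count", "mean"])
print(summary)
```

In practice the thresholds would be set from the business context or from the spend distribution rather than fixed in advance.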
Let’s see how cohort analysis can be implemented. The focus for this article is on insights rather than the code. However, you can find a link to the code at the end of the article. I am using a dataset comprising online retail transactions for a UK retailer from January to December 2011³. It includes ~17K transactions by ~4.2K customers for ~3.6K unique products. Figure 1 gives a snapshot of the data.
Taking the month of a customer's first purchase as their acquisition month, we can define the retention rate as the percentage of customers in a cohort who came back in each subsequent month. The retention table below visualises this. Cohort month is the month of acquisition, and cohort index counts the months from acquisition, starting at 1 for the acquisition month itself. All the values in column 1 (Cohort Index = 1) are 100% because every customer, by definition, made a purchase in their acquisition month.
To understand the percentages in the retention table, look at the highlighted number, 36%. It tells us that 36% of the customers acquired in January 2011 made another purchase in March 2011, the third month counting from their acquisition month.
The retention table provides an early indication of whether changes across the business are affecting customer purchase behaviour. For a business with a typical month-2 retention rate in the range 25–30%, any drop below this range can be a cause for investigation. Perhaps a pricing change affected purchase behaviour, or the release of new product features did not sit well with customers. Once the underlying cause is identified, the business can quickly take mitigating action.
For businesses where the basket size varies significantly across customers, tracking the average quantity bought or the average money spent since the acquisition month could be a better metric than retention.
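The retention calculation described above can be sketched in pandas as follows. This is a minimal illustration on made-up transactions, not the article's actual code; the column names `CustomerID` and `InvoiceDate` are assumptions about the schema.

```python
import pandas as pd

# Hypothetical toy transactions; column names are assumed, not taken from the dataset.
tx = pd.DataFrame({
    "CustomerID": [1, 1, 1, 2, 2, 3, 3, 4],
    "InvoiceDate": pd.to_datetime([
        "2011-01-05", "2011-02-10", "2011-03-15",
        "2011-01-20", "2011-03-02",
        "2011-02-07", "2011-03-21",
        "2011-02-25",
    ]),
})

# Acquisition date = date of each customer's first purchase.
first = tx.groupby("CustomerID")["InvoiceDate"].transform("min")
tx["CohortMonth"] = first.dt.strftime("%Y-%m")

# Cohort index: 1 for the acquisition month, 2 for the following month, and so on.
tx["CohortIndex"] = (
    (tx["InvoiceDate"].dt.year - first.dt.year) * 12
    + (tx["InvoiceDate"].dt.month - first.dt.month)
    + 1
)

# Unique active customers per cohort and index, divided by the cohort size.
counts = tx.pivot_table(
    index="CohortMonth", columns="CohortIndex",
    values="CustomerID", aggfunc="nunique",
)
retention = counts.divide(counts[1], axis=0).round(2)
print(retention)
```

Cells for months that have not yet occurred for a cohort come out as missing values, which is why a retention table has its characteristic triangular shape.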
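The same cohort layout can carry an average instead of a retention rate. The sketch below assumes hypothetical `Quantity` and `UnitPrice` columns and averages per transaction; averaging per customer would be an equally reasonable choice.

```python
import pandas as pd

# Hypothetical toy transactions with assumed Quantity and UnitPrice columns.
tx = pd.DataFrame({
    "CustomerID": [1, 1, 2, 2, 3],
    "InvoiceDate": pd.to_datetime([
        "2011-01-05", "2011-02-10", "2011-01-20", "2011-02-02", "2011-02-07",
    ]),
    "Quantity": [2, 4, 10, 6, 3],
    "UnitPrice": [5.0, 5.0, 1.0, 2.0, 4.0],
})
tx["Spend"] = tx["Quantity"] * tx["UnitPrice"]

# Same cohort month and cohort index definitions as for the retention table.
first = tx.groupby("CustomerID")["InvoiceDate"].transform("min")
tx["CohortMonth"] = first.dt.strftime("%Y-%m")
tx["CohortIndex"] = (
    (tx["InvoiceDate"].dt.year - first.dt.year) * 12
    + (tx["InvoiceDate"].dt.month - first.dt.month)
    + 1
)

# Mean quantity and mean spend per transaction for each cohort and index.
avg_quantity = tx.pivot_table(
    index="CohortMonth", columns="CohortIndex", values="Quantity", aggfunc="mean"
)
avg_spend = tx.pivot_table(
    index="CohortMonth", columns="CohortIndex", values="Spend", aggfunc="mean"
)
print(avg_quantity)
print(avg_spend)
```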
RFM Segmentation
In addition to segmenting customers in cohorts, another popular technique is RFM segmentation, which stands for Recency, Frequency and Monetary value segmentation. For this segmentation, we calculate three customer metrics — Recency, Frequency, and Monetary Value. Recency measures the number of days (or another time period) since the customer’s last purchase, frequency measures the number of transactions over the last 12 months (or another time period) and the monetary value measures the money spent by the customer over the last 12 months (or another time period). Figure 3 below is a snapshot of the customers in the dataset with their recency, frequency, and monetary values.
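A minimal sketch of how the three metrics might be computed with pandas is given below. The toy transactions, the column names (`InvoiceNo`, `InvoiceDate`, `Amount`) and the choice of a snapshot date one day after the last transaction are all assumptions made for illustration.

```python
import pandas as pd

# Hypothetical toy transactions; column names are assumed.
tx = pd.DataFrame({
    "CustomerID": [1, 1, 2, 3, 3, 3],
    "InvoiceNo": ["A1", "A2", "B1", "C1", "C2", "C3"],
    "InvoiceDate": pd.to_datetime([
        "2011-11-01", "2011-12-01", "2011-06-15",
        "2011-10-10", "2011-11-20", "2011-12-08",
    ]),
    "Amount": [20.0, 35.0, 50.0, 10.0, 15.0, 30.0],
})

# Assumed reference point: the day after the most recent transaction.
snapshot = tx["InvoiceDate"].max() + pd.Timedelta(days=1)

rfm = tx.groupby("CustomerID").agg(
    # Days since the customer's last purchase.
    Recency=("InvoiceDate", lambda d: (snapshot - d.max()).days),
    # Number of distinct invoices in the period.
    Frequency=("InvoiceNo", "nunique"),
    # Total money spent in the period.
    MonetaryValue=("Amount", "sum"),
)
print(rfm)
```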
These metrics can be used in a variety of ways. One option is to assign each customer a quartile for each metric. For frequency and monetary value, higher values are better, so customers in quartile 4 have a higher purchase frequency and a higher spend than customers in quartile 1. For recency, however, lower values are better, so the quartile labels need to be reversed: customers in quartile 4 are the ones who purchased most recently, and their average recency is lower than that of customers in quartile 1. Once the quartiles have been defined, they can be added together to create an RFM score.
RFM Score = Recency quartile + Frequency quartile + Monetary Value quartile
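The quartile scoring can be sketched with `pandas.qcut`, as below. The eight customers and their values are hypothetical; note the reversed labels for recency so that the most recent customers receive a 4.

```python
import pandas as pd

# Hypothetical RFM values for eight customers (illustrative only).
rfm = pd.DataFrame(
    {
        "Recency": [3, 10, 25, 40, 60, 90, 150, 300],
        "Frequency": [12, 9, 7, 6, 4, 3, 2, 1],
        "MonetaryValue": [900.0, 700.0, 450.0, 400.0, 250.0, 120.0, 60.0, 20.0],
    },
    index=pd.Index(range(1, 9), name="CustomerID"),
)

# Recency: lower is better, so the lowest quartile of values gets label 4.
rfm["R"] = pd.qcut(rfm["Recency"], q=4, labels=[4, 3, 2, 1]).astype(int)
# Frequency and monetary value: higher is better, so labels run 1 to 4.
rfm["F"] = pd.qcut(rfm["Frequency"], q=4, labels=[1, 2, 3, 4]).astype(int)
rfm["M"] = pd.qcut(rfm["MonetaryValue"], q=4, labels=[1, 2, 3, 4]).astype(int)

# RFM score is the sum of the three quartile labels (range 3 to 12).
rfm["RFM_Score"] = rfm[["R", "F", "M"]].sum(axis=1)
print(rfm)
```

On real data with many tied values (frequency is a common case), `qcut` can fail because the quartile edges are not unique; ranking the values first or passing `duplicates="drop"` are two possible workarounds.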
Figure 4 below is a snapshot of the data with customers’ recency, frequency and monetary values, recency quartile (R), frequency quartile (F), monetary value quartile (M) and RFM score.
Analysing the RFM score cohorts, we can see the differences in the purchase behaviour between them.
In addition to defining RFM scores by adding the quartile scores, we can group customers on their recency, frequency, and monetary value using k-means segmentation, one of the most popular unsupervised learning techniques.
K-means works best when three conditions hold: all the variables have symmetrical distributions (no skewness), all the variables have the same average value so that they get an equal weight in the segmentation, and all the variables have the same standard deviation. These conditions are not true for most raw data, and they are not met for the recency, frequency, and monetary values in our dataset. The data therefore needs to be pre-processed before k-means is applied: centring and scaling equalise the averages and standard deviations, while skewness is usually reduced first with a transformation such as a logarithm.
Once the pre-processing step is complete, we need to define the number of customer segments (or clusters). Typically, this is done using the elbow criterion method, which plots the sum of squared errors (SSE) for different numbers of clusters. The SSE is the sum of squared distances from each data point to its cluster centre. The elbow is the point on the plot where the decrease in SSE slows and becomes marginal, indicating diminishing returns from adding more clusters. This point represents the optimum number of clusters from an SSE perspective. However, it is good practice to test two or three different cluster numbers around the elbow to see what makes the most business sense. Here is the elbow criterion plot for this dataset.
Based on the plot, we will implement k-means segmentation using both 2 and 3 clusters. Figures 7 and 8 show the results of the k-means segmentation for both cases.
Based on these results, it makes more sense to use 3 clusters because they differ significantly from each other and each is of considerable size. To better visualise these 3 clusters and how they compare with one another across the different metrics, we can create a snake plot. Using the plot, we can clearly see that customers in cluster 2 spend more and purchase more frequently than customers in the other clusters.
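The pre-processing and elbow steps can be sketched with scikit-learn as follows. This is an illustration on synthetic, deliberately skewed data, not the article's code; the log transform (`np.log1p`) used to reduce skewness before centring and scaling is an assumption about a reasonable pre-processing choice, and the range of cluster counts is arbitrary.

```python
import numpy as np
import pandas as pd
from sklearn.cluster import KMeans
from sklearn.preprocessing import StandardScaler

# Synthetic, right-skewed RFM values (hypothetical; stands in for the real table).
rng = np.random.default_rng(0)
rfm = pd.DataFrame({
    "Recency": rng.lognormal(3.5, 1.0, 300),
    "Frequency": rng.lognormal(1.0, 0.8, 300),
    "MonetaryValue": rng.lognormal(5.0, 1.2, 300),
})

# Assumed step: log transform to reduce skewness (valid for non-negative values).
rfm_log = np.log1p(rfm)

# Centre to mean 0 and scale to standard deviation 1 for every variable.
scaled = StandardScaler().fit_transform(rfm_log)

# Fit k-means for a range of cluster counts and record the SSE (inertia).
sse = {}
for k in range(1, 8):
    model = KMeans(n_clusters=k, n_init=10, random_state=1).fit(scaled)
    sse[k] = model.inertia_

for k, value in sse.items():
    print(k, round(value, 1))
```

Plotting `sse` against the number of clusters gives the elbow plot; the candidate values of k are those around the point where the curve flattens.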
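The data behind a snake plot is simply the average normalised value of each metric per cluster. The sketch below shows one way to prepare it with pandas; the scaled values and cluster labels are hypothetical and would normally come from the pre-processing and k-means steps.

```python
import pandas as pd

# Hypothetical normalised RFM values with assumed cluster labels.
scaled = pd.DataFrame(
    {
        "Recency": [-1.2, -0.8, 0.1, 0.3, 1.1, 0.9],
        "Frequency": [1.0, 1.4, 0.0, -0.2, -1.0, -1.2],
        "MonetaryValue": [1.3, 0.9, 0.1, -0.1, -1.1, -0.9],
        "Cluster": [2, 2, 1, 1, 0, 0],
    },
    index=pd.Index([101, 102, 103, 104, 105, 106], name="CustomerID"),
)

# Long format: one row per customer and metric, ready for a line plot.
melted = scaled.reset_index().melt(
    id_vars=["CustomerID", "Cluster"], var_name="Metric", value_name="Value"
)

# Average value of each metric within each cluster (one "snake" per cluster).
snake = melted.groupby(["Cluster", "Metric"])["Value"].mean().unstack("Metric")
print(snake)
```

Drawing one line per cluster across the three metrics, for example with a line plot of `Value` against `Metric` coloured by `Cluster`, produces the snake plot.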
Now that we have three customer segments based on this exercise, they can be used in a variety of ways.
1. Product strategy: Segmentation allows companies to understand the preferences of their segments and allows them to offer products geared towards their target customers.
2. Customised marketing strategy: Each segment requires a different targeting strategy. The underlying characteristics and behaviour can help personalise the experience for each segment.
3. Cross-sell and upsell opportunities: Certain segments offer more upsell or cross-sell opportunities. Moving customers from a low spend segment to a high spend one will increase the overall revenue.
4. Increase customer retention: With a better understanding of the segments, companies can personalise the experience for each segment, which is likely to increase customer satisfaction and retention.
5. Targeted advertising and media buying: Knowledge of segments can also allow companies to run targeted marketing campaigns rather than a broad ‘hit-all’ approach.
Customer segmentation offers great advantages when used effectively in conjunction with other techniques. It is relatively straightforward to implement and offers a convenient way to provide a tailored experience to groups of customers with shared attributes, while improving the company’s revenue streams.
Link to the code: https://github.com/j-arora/Customer_Segmentation.git
References:
2. https://www.bain.com/insights/customer-experience-tools-and-trends-2020-let-no-tool-stand-alone/
3. http://archive.ics.uci.edu/ml/datasets/Online+Retail+II
4. https://learn.datacamp.com/courses/customer-segmentation-in-python
5. https://www.kaggle.com/fabiendaniel/customer-segmentation/notebook