Customer Segmentation

Strategy development based on customer lifetime value

Rashid Kazmi, Ph.D.
The Startup
19 min read · Oct 14, 2020

(https://www.istockphoto.com)

INTRODUCTION

Business analytics, big data, and data science are very hot topics today, and for good reason. Companies are sitting on a treasure trove of data but usually lack the skills and people to analyse and exploit it efficiently. Companies that develop those skills and hire the right people will have a clear competitive advantage. This is especially true in one domain: marketing. About 90% of the data collected by companies today relates to customer actions and marketing activities: which pages customers visit, what products they buy, in what quantities and at what price, which banners they see, which emails they open, and how effective these actions have been at influencing their behavior.

The domain of marketing analytics is huge and may cover fancy topics such as text mining, social network analysis, sentiment analysis, real-time bidding, online campaign optimisation, and so on. But at the heart of marketing lie a few basic questions that often remain unanswered.

  • One, who are my customers?
  • Two, which customer should I target, and spend most of my marketing budget on?
  • And three, what’s the future value of my customers?

The core of customer relationship management (CRM) is understanding customers’ profitability and retaining profitable customers, so that one can concentrate on those who will be worth the most to the company in the future. That’s exactly what this article will cover. Segmentation is all about understanding your customers, whereas customer lifetime value (CLV) is about anticipating their future value.

STATISTICAL SEGMENTATION

To illustrate how you build a segmentation, let’s take a very simple graphical example. Let’s assume that you have only ten customers in your database and that these ten customers are described by only two factors, or what we call segmentation variables. Graphically, these ten customers can be represented on a two-dimensional map, where the horizontal axis represents the frequency of purchase, with the most frequent shoppers on the right, and the vertical axis represents the average purchase amount, with the customers who spend the most on each trip appearing at the top of the chart.

Intuitively, you can see that the market neatly separates into three segments. Each segment represents a group of customers that is distinct from the other groups, while all customers within a group are quite similar. The purple segment contains customers who buy very often and spend a lot at each purchase. The green segment groups together customers who make regular purchases, but for lower amounts. The orange segment contains customers who shop much less frequently but spend a good amount whenever they do. From a managerial point of view, these three segments make a lot of sense. The purple segment is strategic and may generate a vast portion of the firm’s profit. The green segment constitutes the bulk of the purchases, because these customers buy frequently, but it may not be as profitable as one might think. And the orange group is a perfect target for marketing actions that might encourage more frequent purchases, such as special store events or seasonal coupons.

In this article, we propose a framework for analysing customer value and segmenting customers based on that value. After segmenting customers, we illustrate how to build strategies for each segment using customer purchase data.

HIERARCHICAL SEGMENTATION

Segmentation is all about finding a good balance between usability and completeness: simplifying enough that it remains usable, but not so much that it stops being valuable. A good segmentation should be both statistically and managerially relevant. But how can statistical software automatically find the three segments mentioned in the example above? There are many methods, but the one I am going to explore is called hierarchical clustering. The internal process the software goes through is as follows. First, you consider each and every customer to be its own segment, so there are as many segments as there are customers, in this case 10.

Then you ask which two clusters (or, at the very beginning, which two customers) could be grouped together so that you would lose the least information. That is, they would be so similar that treating them as absolutely identical would make almost no difference. In the example, the two most similar customers are those in the lower right corner, because they are almost identical in terms of frequency of purchase and purchase amount. So if you group them together and treat them as if they were identical, you wouldn’t lose much information. This process goes on until you find a workable solution for the business’s needs.
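The merging process described above can be sketched in a few lines of plain Python. This is only an illustration under stated assumptions: the ten data points are made up, and the "closest pair" is measured as the distance between cluster centroids, which is one of several possible linkage rules (statistical packages often default to others, such as Ward's method).

```python
from itertools import combinations
from math import dist

# Hypothetical data: (purchase frequency, average purchase amount) for ten customers.
customers = [
    (9.0, 9.0), (8.5, 8.0), (9.5, 8.5),               # frequent, high amounts
    (8.0, 2.0), (9.0, 2.5), (8.6, 1.8), (9.2, 2.0),   # frequent, low amounts
    (1.5, 6.0), (2.0, 7.0), (1.0, 6.5),               # infrequent, higher amounts
]

def centroid(cluster):
    """Mean point of a cluster given as a list of customer indices."""
    xs = [customers[i][0] for i in cluster]
    ys = [customers[i][1] for i in cluster]
    return (sum(xs) / len(xs), sum(ys) / len(ys))

def agglomerate(n_clusters):
    """Merge the two closest clusters repeatedly until n_clusters remain."""
    clusters = [[i] for i in range(len(customers))]  # every customer starts alone
    while len(clusters) > n_clusters:
        # Pick the pair of clusters whose centroids are closest (a < b always holds).
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: dist(centroid(clusters[p[0]]), centroid(clusters[p[1]])),
        )
        clusters[a] = clusters[a] + clusters[b]
        del clusters[b]
    return [sorted(c) for c in clusters]

segments = agglomerate(3)
print(sorted(segments))
```

Stopping at three clusters is a choice made by the analyst, not by the algorithm; calling `agglomerate` with another number cuts the same merge sequence at a different point.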

RECENCY, FREQUENCY, AND MONETARY VALUE

In many marketing studies, three specific indicators often turn out to be invaluable: recency, frequency, and monetary value. They have been shown to be some of the best predictors of future purchases and customer profitability. Marketing segmentations that use recency, frequency, and monetary value as segmentation variables are often referred to by the acronym RFM [Analysis Notebook].

  • Recency indicates when a customer made his or her last purchase. Usually, the smaller the recency, that is, the more recent the last purchase, the more likely the next purchase will happen soon. On the other hand, if a customer has lapsed and has not made any purchase for a long period of time, he or she may have lost interest or switched to a competitor, which is bad news for future business.
  • Frequency refers to the number of purchases made in the past. The more purchases have been made in the past, the more likely additional purchases will occur in the future.
  • Finally, monetary value refers to the amount of money spent on average at each purchase occasion. Obviously, the more a customer spends on average, the more valuable he or she is.
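As a rough sketch of how the three indicators can be derived from raw purchase records, the snippet below computes them per customer. The transactions, the customer identifiers, and the analysis date are all hypothetical; recency is expressed in days since the last purchase.

```python
from datetime import date

# Hypothetical transactions: (customer_id, purchase_date, amount in euros).
transactions = [
    ("c1", date(2015, 11, 2), 30.0),
    ("c1", date(2015, 12, 20), 50.0),
    ("c2", date(2013, 6, 15), 120.0),
    ("c3", date(2015, 3, 1), 20.0),
    ("c3", date(2014, 3, 1), 40.0),
    ("c3", date(2015, 9, 1), 30.0),
]
today = date(2016, 1, 1)  # assumed analysis date

def rfm(transactions, today):
    """Return {customer_id: (recency_days, frequency, average_amount)}."""
    grouped = {}
    for customer_id, day, amount in transactions:
        grouped.setdefault(customer_id, []).append((day, amount))
    result = {}
    for customer_id, purchases in grouped.items():
        last_purchase = max(day for day, _ in purchases)
        recency = (today - last_purchase).days           # days since last purchase
        frequency = len(purchases)                       # number of past purchases
        monetary = sum(a for _, a in purchases) / frequency  # average amount
        result[customer_id] = (recency, frequency, monetary)
    return result

profiles = rfm(transactions, today)
for customer_id, values in sorted(profiles.items()):
    print(customer_id, values)
```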

We extracted recency, frequency, and monetary value from purchase data for 1,800 customers. With that many customers, the dendrogram is obviously not readable as plotted. You can make it prettier if you’d like, and then you can see how all these individuals have been clustered together progressively, step by step, up to the stage where there is only one big cluster. I cut the clustering tree, the dendrogram, much lower, at the point where there were five clusters. Deciding where to stop is tricky because there are multiple criteria to weigh: statistical fit, managerial relevance, and targetability [Analysis Notebook].

Of course, that’s not very useful if you don’t know what these clusters are all about, so I computed the average profile of each segment. What you care about are the averages in terms of recency, purchase amount in euros, and frequency, that is, the number of purchases. These three variables were averaged by cluster membership, which comes from the number of clusters at which we decided to cut the tree, i.e. 5 [Analysis Notebook].
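Computing an average profile per cluster amounts to a group-by followed by a mean. The sketch below uses invented rows and cluster labels purely for illustration; it does not reproduce the figures from the notebook.

```python
from collections import defaultdict

# Hypothetical rows: (cluster_id, recency_days, frequency, avg_amount_eur).
rows = [
    (1, 30, 8, 40.0),
    (1, 50, 6, 60.0),
    (2, 900, 1, 20.0),
    (2, 1100, 1, 30.0),
    (2, 1000, 1, 25.0),
]

def cluster_profiles(rows):
    """Average recency, frequency and amount per cluster, plus cluster size."""
    members = defaultdict(list)
    for cluster_id, *features in rows:
        members[cluster_id].append(features)
    profiles = {}
    for cluster_id, feats in members.items():
        size = len(feats)
        # zip(*feats) turns the list of rows into one tuple per variable.
        means = tuple(sum(column) / size for column in zip(*feats))
        profiles[cluster_id] = {"size": size, "mean_rfm": means}
    return profiles

profiles = cluster_profiles(rows)
for cluster_id, profile in sorted(profiles.items()):
    print(cluster_id, profile)
```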

What you see, for instance, is that cluster number 2 contains 306 individuals, and that cluster number 3 has an average recency of 162 days, an average frequency of 4.39 purchases made in the past, and an average purchase amount of €63. Cluster number 2 spends much less. Cluster number 3 made a huge number of purchases in the past, and so on and so forth. So if you study that more carefully, you can see that the segmentation mechanism grouped people into clusters of similar customers, all with different profiles and all of which can be characterised in terms of managerial interest.

MANAGERIAL SEGMENTATION

We have talked about recency, frequency, and monetary value, and run a statistical segmentation to identify segments in the data set. Although it might come as a disappointment to some of you, many companies do not use these kinds of statistical methods to segment their customer database, but rather use a simple set of rules, and for good reasons.

First of all, customer data is added on a continuous basis. Every day, every second, customers make purchases and modify their behavior. Old customers disappear, new customers are acquired, and the segmentation solution you’ve put in place becomes obsolete almost the moment you run it. So you need to update your segmentation very frequently. Second, you cannot run that kind of statistical segmentation automatically without the supervision or involvement of a data analyst or statistician. It could become very expensive to run frequent updates on a fast-moving database.

Finally, even if you could solve the issue of frequent costly updates, the optimal segmentation solution will vary over time. Maybe a bunch of new customers will be acquired and will create a new segment solution. Or maybe customer behavior is seasonal, and the best segmentation solution from a strictly statistical point of view will not be the same, whether you analyze your data during the summer season or around Christmas.

This makes managers’ lives very tough, because segments change all the time and may not be comparable from one update to the next. How do you follow customers over time, or put in place marketing actions specific to a segment, if the very definition of that segment may be modified during the next update? So, let’s create an example of managerial segmentation together. Let’s go back to our data set and think about ways managers might use the segmentation. For example, which customers should receive more catalogs, more coupons, more emails, phone calls, or direct mail solicitations? How should we split or segment our database? We’d like these decisions to be based on who the customers are, how much they spend, and how likely they are to buy from us in the future. So, let’s take our customers and divide them into four groups, or segments, based on their recency [Analysis Notebook].

  • Active: a customer who purchased something within the last 12 months.
  • Warm: a customer whose last purchase happened the year before, that is, between 13 and 24 months ago.
  • Cold: a customer whose last purchase was between two and three years ago.
  • Inactive: a customer who has not purchased anything for more than three years.

Now, let’s go one step further and divide the active customers into two subgroups based on how much money they spend on average. Based on our analysis, we put the bar at €100 and qualify as high value those customers who spend €100 or more, on average, on each purchase occasion. We qualify as low value those who spend less.

Next, I selected all customers whose segment is set to active for the time being and who made their first purchase within the last year. If they are active and made their first purchase in the last year, they are new active. So I’m requalifying the segment based on the segment they had before and on their first purchase date, or on the amount. If you are qualified as active and your average purchase is less than €100, you are qualified as active low value. If it is €100 or more, you are active high value. And remember, your segments cannot overlap, and they need to cover everybody. In total, we now have eight segments.
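The rules spelled out above translate almost directly into a small function. The sketch below is an assumption-laden illustration: the function name and its arguments are hypothetical, a year is approximated as 365 days, and only the six segments whose rules are stated explicitly in the text are covered (the remaining segments of the eight are not defined here, so they are left out).

```python
def assign_segment(recency_days, first_purchase_days, avg_amount):
    """Assign one customer to a segment using only the rules stated in the text.

    recency_days: days since the last purchase
    first_purchase_days: days since the first purchase
    avg_amount: average amount spent per purchase, in euros
    """
    if recency_days > 3 * 365:
        return "inactive"
    if recency_days > 2 * 365:
        return "cold"
    if recency_days > 365:
        return "warm"
    # Everyone left purchased within the last 12 months and is active.
    if first_purchase_days <= 365:
        return "new active"
    if avg_amount >= 100:
        return "active high value"
    return "active low value"

print(assign_segment(30, 900, 150.0))
print(assign_segment(800, 1200, 40.0))
```

Because the checks run in order and each branch returns, every customer lands in exactly one segment, which satisfies the requirement that segments neither overlap nor leave anyone out.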

This segmentation is obviously very simple, and you could imagine much more complex segmentations with dozens of segments and many more segmentation variables. But you get the point: even a simple scheme can be very useful, and it is quite relevant for managers.

CUSTOMER LIFETIME VALUE (CLV)

Imagine the following situation: your company launches a large-scale customer acquisition campaign. Your firm invests in public relations, advertising, and email campaigns. It buys a host of keywords online, sets up a promotion, and offers a hard discount for customers who purchase for the first time. At the end of the campaign, because it’s so hard to get through the clutter and to reach, convince, and acquire new customers, the acquisition campaign turns out to be only a minor success. Sure enough, the company has acquired new customers and generated new sales, but the sales generated do not even cover the cost of the campaign. The company has lost money.

  • Now what?
  • Was it a success?

But of course, the manager in charge can always argue that these newly acquired customers will remain loyal, purchase again, and generate additional revenue in the future. In that sense, the acquisition campaign is not a loss but an investment, and the company will reap its benefits over the next few weeks, months, or even years.

Well, maybe it’s true, or maybe it’s just wishful thinking and the manager is trying to find an excuse to justify a campaign that was not successful. Wouldn’t it be nice to be able to actually compute the value of a new customer over a given period of time, and compare that figure to the cost of acquiring that customer, to prove or disprove that it was a good investment? That’s the whole point of customer lifetime value (CLV) models.

The goal of such methods is to analyse what is happening today and what has happened in the recent past in order to predict the revenue customers will generate in the future. That is what the rest of this article covers.

Customer lifetime value models have many other applications in practice. For instance, you could compare one acquisition campaign to another. Suppose you compare two campaigns, one with a very deep price discount and one with a moderate discount. It’s likely that the campaign offering the larger price discount will attract more customers, but these customers will be less profitable in the short term and much less loyal in the long run once the price discount disappears, because the only reason they came in the first place was the price promotion. At the end of the day, offering a moderate price discount and attracting fewer customers might be the best strategy if they generate more revenue and remain loyal longer. But how can you know for sure if you cannot put a euro figure on the long-term value of your customers?

Customer lifetime value is not only useful when applied to new customers. It’s also very useful when applied to existing ones. One of the best applications of customer lifetime value is to identify which customers, or which segments of customers, are strategic for the future success of the firm [Analysis Notebook].

TRANSITION PROBABILITIES & TRANSITION MATRIX

Different methods have been proposed to compute customer lifetime value. Some are extremely simplistic, probably too simple to be of value. Other techniques are overly complex. We’ll choose a middle-ground approach and learn to compute customer lifetime value using a concept we’ve already covered in the segmentation section. We discussed in the previous section how to construct a managerial segmentation and how you can develop rules to assign each and every customer to a segment based on his or her most recent behaviour, such as recency, frequency, and monetary value [Analysis Notebook]. You can run a segmentation study not only on the most recent available data, but also go back in the past and run a segmentation analysis retrospectively.

It is interesting to note that this gives us insights into how the firm is doing. It also tells us which segments are growing and whether we are adding more new customers today compared to a year ago. Each segmentation study can thus be seen as a snapshot of the customer database, and companies can easily build a series of snapshots. More interestingly, companies can do even better and analyse how the transitions have happened from one snapshot to the next.

For example, we have a very simple segmentation in place. For this analysis we use a simplified version of the segmentation and keep only five segments, namely active high-value customers, active low-value customers, warm, cold, and inactive customers. In this ordering, the most active and most profitable customers are on top and the least valuable customers are at the bottom. Using this segmentation, you can take a snapshot of your customer database to date. You now know how many customers you have and how many fall into each segment. Interestingly, you can also check how customers went from one segment to another.

If you take the active high-value customers, maybe some kept making high-volume purchases and remained in the same segment, which is good news. Others remained active but spent less and moved to the lower-value segment. Yet other customers might not have purchased anything at all over the last 12 months and, because of that, have now fallen into different segments. The trick is to look at these figures, transform them into probabilities, and consider that these probabilities will likely remain stable over time. In other words, if half of your high-value customers a year ago remain in the same segment this year, then it’s likely that about 50% of your high-value customers today will remain high-value customers next year. And the same logic will apply to the year after, and the year after that, and so on. What is striking is that, by analysing what happened in the recent past, we can predict what will likely happen in the near future.
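Turning two snapshots into transition probabilities can be sketched as follows. The two snapshots below are invented, with only six customers, so the resulting probabilities are purely illustrative; the function and variable names are hypothetical.

```python
from collections import Counter

SEGMENTS = ["active high value", "active low value", "warm", "cold", "inactive"]

# Hypothetical snapshots: customer_id -> segment, taken one year apart.
last_year = {"a": "active high value", "b": "active high value",
             "c": "active low value", "d": "warm", "e": "warm", "f": "cold"}
this_year = {"a": "active high value", "b": "active low value",
             "c": "warm", "d": "cold", "e": "active low value", "f": "inactive"}

def transition_matrix(before, after, segments):
    """Rows are last year's segment, columns this year's, values are probabilities."""
    counts = Counter((before[c], after[c]) for c in before if c in after)
    matrix = {}
    for origin in segments:
        row_total = sum(counts[(origin, dest)] for dest in segments)
        matrix[origin] = {
            # A segment with no customers last year gets a row of zeros.
            dest: (counts[(origin, dest)] / row_total if row_total else 0.0)
            for dest in segments
        }
    return matrix

P = transition_matrix(last_year, this_year, SEGMENTS)
print(P["active high value"])
```

In this toy data, half of last year's active high-value customers stayed in their segment, which is the 50% situation described above.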

REVENUE GENERATION PER SEGMENT

We now move on to using the transition matrix to predict segment membership over the coming years or decade, and ultimately to translating those segments into euros. The next logical step is to assume that the revenue generated by a customer can be fully explained and predicted by the segment to which he or she belongs [Analysis Notebook].

This holds whether today or ten years from now. If an average customer in a high-value segment generates an average of over €200, we can simply assume that this figure will not change over the years. In reality, it might go up or down, but without additional information, our best guess is that it will remain stable over time. But of course, not all €200 are created equal. If one customer spends €200 today and another spends the same amount, but only five years from now, the first customer is more valuable. Why? Because future revenues are uncertain and distant rather than immediate. Therefore, they need to be discounted.

REVENUE GENERATION PER YEAR

Now we’re going to use the predictions we’ve just made to compute the value of the database. We have calculated the revenue per segment. Customers classified as inactive, cold, or warm by definition generate no revenue, whereas active high-value customers generate on average €323 of revenue per year. The revenue for active low value and new active goes down to €52 and €79. After multiplying yearly revenue by the predicted segment membership over the next 11 years, we can see that the active high-value customers generated €185,405 in 2015. Then the number of customers in that segment increases, decreases, and stabilises, and every customer in that segment generates €323 each year, so the values go up and down as a function of the predictions. That is segment by segment, year by year. The next thing we’d like to do is compute the sum of each column, so that we have the yearly revenue of the whole database for each year. In terms of revenue predictions, without any discounting, we generated €478,413 in 2015 [Analysis Notebook].
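The projection described above (apply the transition matrix year after year, then multiply segment sizes by revenue per customer) can be sketched as below. The €323 and €52 revenue figures come from the text; the three-segment layout, the transition probabilities, and the starting counts are hypothetical simplifications, and the sketch assumes no new customers are acquired.

```python
# Revenue per customer per year; 323 and 52 are quoted in the text.
REVENUE = {"active high value": 323.0, "active low value": 52.0, "inactive": 0.0}

# Hypothetical transition probabilities (each row sums to 1).
TRANSITIONS = {
    "active high value": {"active high value": 0.5, "active low value": 0.25, "inactive": 0.25},
    "active low value": {"active high value": 0.0, "active low value": 0.5, "inactive": 0.5},
    "inactive": {"active high value": 0.0, "active low value": 0.0, "inactive": 1.0},
}

def project(counts, transitions, years):
    """Return one {segment: expected customers} dict per year, starting with today."""
    history = [dict(counts)]
    for _ in range(years):
        current = history[-1]
        following = {segment: 0.0 for segment in transitions}
        for origin, row in transitions.items():
            for destination, probability in row.items():
                following[destination] += current[origin] * probability
        history.append(following)
    return history

def yearly_revenue(history, revenue):
    """Sum segment size times revenue per customer, for each year."""
    return [sum(n * revenue[s] for s, n in year.items()) for year in history]

start = {"active high value": 100, "active low value": 200, "inactive": 0}  # hypothetical
history = project(start, TRANSITIONS, years=2)
revenues = yearly_revenue(history, REVENUE)
print(revenues)
```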

So how much money will we have made by 2025, cumulated over 2015, 2016, 2017, 2018, and so on? You can compute that using the cumulative sum (cumsum) of yearly revenue. You compute it, print it, and that’s how much revenue will have been generated over the years. As you can see, it can only increase, since every year you add new revenue to it. But you probably also see that the slope of the curve is slowly flattening, because every year we lose some customers and hence some revenue [Analysis Notebook].
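A cumulative sum is a one-liner with the standard library. The yearly figures below are hypothetical and chosen to shrink over time, which mirrors the flattening curve described above.

```python
from itertools import accumulate

# Hypothetical yearly revenue in euros, decreasing as customers lapse.
yearly = [42700.0, 22650.0, 11975.0]

# Running total: each entry is the revenue generated up to and including that year.
cumulative = list(accumulate(yearly))
print(cumulative)

# The year-on-year gains are the yearly revenues themselves, so they shrink.
gains = [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]
print(gains)
```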

DISCOUNTING REVENUE

When we discount revenues in computations, we are not simply talking about inflation. We are talking about risk and uncertainty. Even if there were zero uncertainty about future revenues, most firms put more weight on short-term revenues. So, a euro today is worth more than the same euro tomorrow. For that reason, we discount future revenues in our computations. It’s like a reverse interest rate. The longer you have to wait to generate revenues, the less these revenues are worth to the firm today. There’s no clear guideline about the kind of discount rate you should apply. But simply remember this: the higher the discount rate, the more short-term focused you will be. On the other hand, with a discount rate closer to zero, future revenues are not discounted as much and computations are much more long-term focused. Either way, a euro ten years from now is not worth a euro today. It needs to be discounted. I’m going to set the discount rate at 10% and compute the discount factor for years 1 to 11, that is, from 2015 to 2025, knowing that for 2015 no discount applies since it’s today. So the exponent starts at zero for 2015, as if that revenue were received immediately. The discount factor is 1 / (1 + discount_rate) raised to the power of the number of years you need to wait to get the money [Analysis Notebook].
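The discount factor just described can be computed directly. The sketch below assumes the 10% rate and the 11 periods (2015 to 2025) mentioned in the text, with an exponent of zero for the current year; the function name is hypothetical.

```python
def discount_factors(rate, n_years):
    """Factor for year k is 1 / (1 + rate) ** k, with k = 0 for the current year."""
    return [1.0 / (1.0 + rate) ** k for k in range(n_years)]

factors = discount_factors(0.10, 11)
rounded = [round(f, 2) for f in factors]
print(rounded)
# [1.0, 0.91, 0.83, 0.75, 0.68, 0.62, 0.56, 0.51, 0.47, 0.42, 0.39]
```

A higher rate makes the factors fall faster, which is the sense in which a high discount rate makes the analysis more short-term focused.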

In the first year, we’re not going to discount anything: that’s today’s money, worth its full value today. After a year, a euro is worth about €0.91, then €0.83, €0.75, and so on. And as you can see, after 10 years it is only worth about €0.39. How am I going to use that information? Simply by taking the values generated so far, such as the yearly revenue, and multiplying them by the discount factor to get what they are worth in today’s euros. That’s how much money, in today’s worth, is going to be generated in 2018, 2019, and so on. By the year 2025, we generate the equivalent of what would be worth €626,709 today, and that figure is reduced for two reasons. Number one, it’s in ten years, so it needs to be discounted. Number two, it’s in ten years, meaning many customers will have left and will not be active anymore. If I plot that, one curve shows the undiscounted revenue generated over the years and the other shows the discounted revenue. As you can see, it drops dramatically, since the further away in the future you get the revenue, the less it is worth in today’s euros. In terms of cumulative revenue, it is exactly the same thing. You take the discounted yearly revenue and compute the cumulative sum, and you get everything we’ve had already, except that it’s now expressed in discounted revenue [Analysis Notebook].

The last question we could ask is: over the next ten years, how much is my database worth? What’s the true value, the discounted cumulative value, of my database in terms of expected revenue over the next ten years? What you can do is look at how much it would be worth, accumulated, at the end of the period you are analysing, in this case the 11th period. That would be 2025, from which you remove the revenue of today, 2015, which has already happened. If you run and print that, the answer is 12,777,695. So, if you had to value your database in today’s euros in terms of how much revenue it will generate over the next ten years, the answer is about €12.78 million [Analysis Notebook].
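The valuation step (discount each year, accumulate, then subtract the current year) might look like the sketch below. The revenue figures are hypothetical and were picked so that each future year is worth exactly €1,000 in today's euros; they are unrelated to the figures quoted above.

```python
from itertools import accumulate

def database_value(yearly_revenue, rate):
    """Discounted cumulative revenue of future years, excluding the current year."""
    discounted = [rev / (1.0 + rate) ** k for k, rev in enumerate(yearly_revenue)]
    cumulative = list(accumulate(discounted))
    # The first entry is the current year, which has already happened.
    return cumulative[-1] - discounted[0]

# Hypothetical yearly revenue: current year, then two future years.
value = database_value([1000.0, 1100.0, 1210.0], rate=0.10)
print(round(value, 2))
```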

CONCLUSIONS

Since increased importance is placed on customer satisfaction in today’s business environment, many firms are focusing on customer loyalty and profitability to increase market share and customer satisfaction. CRM, the core business concept for enhancing customer relationships, is emerging as a core competence of the firm. Building successful CRM starts with identifying customers’ true value and loyalty, since customer value provides the basic information needed to deploy more targeted and personalised marketing. Understanding and measuring the true value of customers is a starting point for relationship management, since marketing management as a whole should be directed toward targeted, profitable customers to foster their full profit potential. Corporate success depends on an organisation’s ability to build and maintain loyal and valued customer relationships. The analysis presented here supports building refined strategies for customers based on their value. Based on customer segmentation and CLV, we can decide where to allocate resources and make sure our most valuable customers have access to the best customer service, while closely monitoring their loyalty. By analysing the characteristics of segmented customer groups, we can develop refined strategies for each segment.
