Customer Segmentation: Fooled by Randomness
How simple statistical phenomena can surprise us
Let’s imagine you work for a retail company that averages $1,000 per transaction. In practice, this number varies widely, from hundreds to thousands of dollars per sale. You look a little closer and discover that among customers whose first transaction is under $200, the next is, on average, twice that amount. Is this a meaningful discovery? Should you congratulate yourself on successfully up-selling this segment? It is certainly possible (and quite probable!) that the initial transaction made a good first impression, but let’s explore this problem from another angle and show how randomness can also play a role. What I hope to share with you here is a simple statistical outcome that might seem surprising to those not familiar with it, known as regression to the mean.
There are countless ways to segment customers: gender, location, age, product preferences, and so on. These divisions can help a company identify and understand its own market. They are also proxies for what we really want to know: at the most fundamental level, companies want to anticipate where to expect their next dollar. And the more dollars, the better. So the most intuitively direct way to gather this information is to segment customers by absolute spending. After all, identifying and acting on loyalty is the foundation of a successful business.
For the sake of argument, say we are in a business of variable customer spending, with one transaction per customer, and that our past quarterly transactions looked like this:
Somewhat arbitrarily, a marketing executive decides to split up spending into ‘low’ (under $700), ‘mid-range’ ($700 to $1,300), and ‘high’ ($1,300+) buckets. Slicing customers into revenue groups appears to tackle the problem head-on, such that we can explore each group separately, perhaps between three different teams. ‘Divide and conquer’, so to speak. What I haven’t told you yet is how I generated the data, although maybe you could have guessed:
# --- Python code ---
import numpy as np

# create random sample of spending on 10k customers
samples = np.random.normal(1000, 250, size=10000)
# spending cannot be negative, so set any negative draw to zero
samples[samples < 0] = 0
Very simply, I created a normally distributed sample of 10,000 customer ‘data’ points, with a mean of 1,000 and a standard deviation of 250. Although this sample is artificially constructed, there are many real-world cases where such distributions hold true. It turns out that normal distributions explain a lot of things, from test scores to extreme weather events. Sometimes, they can be used to explain purchase behaviour, too.
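To make the executive’s buckets concrete, here is a minimal sketch of how the labelling could be done on data generated with the same recipe. The names `q1_spend`, `q1_segment` and `labels`, and the fixed seed, are my own illustrative choices rather than part of the original analysis, and I am assuming the $700 and $1,300 cut-offs are applied as half-open intervals.

```python
import numpy as np

# Seeded generator so the sketch is reproducible (the seed is arbitrary).
rng = np.random.default_rng(42)

# Same recipe as the article: 10,000 draws, mean 1,000, standard deviation 250.
q1_spend = rng.normal(1000, 250, size=10_000)
q1_spend[q1_spend < 0] = 0

# Bucket edges from the text: 'low' < $700, 'mid' $700 to $1,300, 'high' >= $1,300.
# np.digitize returns 0, 1 or 2 depending on which side of the edges a value falls.
labels = np.array(["low", "mid", "high"])
q1_segment = labels[np.digitize(q1_spend, bins=[700, 1300])]

# Size and average spend of each bucket.
for name in labels:
    in_segment = q1_segment == name
    count = int(in_segment.sum())
    mean_spend = float(q1_spend[in_segment].mean())
    print(f"{name:>4}: n={count:5d}  mean=${mean_spend:,.0f}")
```

Because the cut-offs sit 1.2 standard deviations either side of the mean, the ‘low’ and ‘high’ buckets should each hold a bit over a tenth of the customers, with everyone else landing in ‘mid’.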
In our make-believe scenario, after having labelled our historical data, we track these customers over the next 3 months while implementing various marketing actions according to these subdivisions. Assuming they all returned for a second transaction, the outcome would look like this:
For the group labelled ‘low’, average spending went up! For the group labelled ‘high’, spending went down, while the middle group stayed put. Does the low-spending team deserve promotions while the high-spending team reassess their careers? The answer is an emphatic ‘no’ in both cases. Why? Because of regression to the mean! Here’s the punch line: The reality is that all 10,000 customers were really cut from the same cloth. By sheer chance, a subset of them spent a little less last quarter. But random numbers have no memory, and so each collection of labelled points naturally migrated back towards the $1,000 mean when the numbers were drawn again. Not only were the segments artificial, but here they were also completely imaginary.
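For readers who want to reproduce the effect, the following is a rough sketch under one key assumption: the second quarter is a fresh, independent draw from the same distribution for every customer, which is how I read ‘the drawing of numbers was reset’. The helper `draw_quarter`, the `summary` dictionary and the seed are hypothetical names chosen for this illustration.

```python
import numpy as np

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only


def draw_quarter(n=10_000):
    """One quarter of spending: normal(1000, 250), floored at zero."""
    spend = rng.normal(1000, 250, size=n)
    return np.clip(spend, 0, None)


q1 = draw_quarter()  # the quarter used to label customers
q2 = draw_quarter()  # the following quarter, drawn independently of q1

# Label customers once, using first-quarter spending only:
# 0 = 'low' (< $700), 1 = 'mid' ($700 to $1,300), 2 = 'high' (>= $1,300).
segment = np.digitize(q1, bins=[700, 1300])

# Compare each labelled group's average spend across the two quarters.
summary = {}
for idx, name in enumerate(["low", "mid", "high"]):
    mask = segment == idx
    summary[name] = (float(q1[mask].mean()), float(q2[mask].mean()))
    print(f"{name:>4}: Q1 mean=${summary[name][0]:,.0f}  Q2 mean=${summary[name][1]:,.0f}")
```

No marketing action appears anywhere in this code, yet the ‘low’ group’s second-quarter average rises and the ‘high’ group’s falls, because the labels were assigned using the very noise that then disappears.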
Perhaps you are not convinced, since maybe some distinct differences between lower and higher spending groups really do exist. I couldn’t agree more, but that does not mean the ‘problem’ necessarily goes away. Here is a slightly messier example with three overlapping groups of spending:
sample_groups = np.array([])
for i in (500, 1000, 2000):
    segment = np.random.normal(i, i/3, size=4000)
    sample_groups = np.append(sample_groups, segment)
sample_groups[sample_groups<0]=0
This time we have segmented out the top 25% and bottom 25% by spending. The next hypothetical quarterly review reveals different numbers but the same story. Although we have found statistically significant differences between our segments, regression to the mean still dominates the explanation. If your company is surprised that ‘premium’ customers show a decline in spending from one quarter to the next, the explanation could be as simple as random noise. This oversight will appear time and again unless the customer groups are truly separate and distinct.
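A sketch of this second scenario might look as follows. It rests on assumptions the text does not spell out: each customer stays in the same underlying group (mean of 500, 1,000 or 2,000, with a standard deviation of one third of that mean) from one quarter to the next, and the second quarter is an independent redraw around that customer’s own group mean. The names `group_means`, `draw_quarter` and `results` are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(1)  # arbitrary seed, for reproducibility only

# Each customer's underlying average spend: 4,000 customers in each of three groups.
group_means = np.repeat([500.0, 1000.0, 2000.0], 4000)


def draw_quarter(means):
    """One quarter of spending around each customer's own mean, floored at zero."""
    return np.clip(rng.normal(means, means / 3), 0, None)


q1 = draw_quarter(group_means)
q2 = draw_quarter(group_means)  # same customers, fresh noise

# Segment on observed first-quarter spending, not on the hidden group.
low_cut, high_cut = np.quantile(q1, [0.25, 0.75])
bottom = q1 <= low_cut
top = q1 >= high_cut

results = {
    "bottom 25%": (float(q1[bottom].mean()), float(q2[bottom].mean())),
    "top 25%": (float(q1[top].mean()), float(q2[top].mean())),
}
for name, (q1_mean, q2_mean) in results.items():
    print(f"{name:>10}: Q1 mean=${q1_mean:,.0f}  Q2 mean=${q2_mean:,.0f}")
```

Under these assumptions the top quartile still outspends the bottom quartile in the second quarter, so the differences between groups are real, but both quartiles drift back toward the middle, because each one was selected partly on a lucky or unlucky draw.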
The assumption that individual outliers stay anomalous over time is a fallacy, and a famous one at that. There are well-known instances of it appearing in all kinds of settings. One trivial case has been called the Sports Illustrated Jinx, whereby famous athletes are captured at their peak, then said to have declined in performance thereafter (a decline mistakenly attributed to the cover photo). In a case noted by Daniel Kahneman, military pilots were observed to perform better after having been disciplined, but this was a case of mistaken causality: statistically, on average, a poor performance tends to be followed by a better one (yelling or not). It is somewhat like capturing a pendulum at its widest swing and falsely assuming it will prefer that position thereafter. By contrast, certain divisions like age, sex, and location do not randomly fluctuate over time, making these reliable features in a predictive model.
Customer segmentation is a challenging but rewarding topic, and we can easily be fooled by randomness. As we gather more data, we want to feel that we can anticipate actions, but sometimes we must consider the limits to our knowledge. The most important takeaway here is to not think we know more than we actually do! Models help account for the nature of these random walks and the freedom of choice people experience while shopping. This, in turn, makes interventions to improve customer satisfaction more meaningful at every level, and the metrics behind these actions more significant. Even with all this uncertainty, we still want to have a good relationship with everyone who comes our way, and that is something here at SSENSE we are very certain about.
Editorial reviews by Deanna Chow, Liela Touré, & Prateek Sanyal.