Finding Cash Cow Customers in Your Data

A 30 Day Writing Challenge

It’s Day 5 of my 30 Day Writing Challenge. I’m on a mission over the next couple of weeks to learn as much about machine learning as possible. I’m starting a new job later this month where I’ll be building a machine learning team, so I want to get a running start.

So far we’ve focussed mainly on supervised learning: algorithms that require us to train a model by feeding it data and telling it what the outcome should be. We’ve explored tf-idf for finding important words in a document and then used Bayesian classification to categorise documents based on the words they contain. Yesterday, we stumbled as we tried to apply machine learning to a problem that didn’t require it.

Today, let’s look at unsupervised learning. That is, algorithms which find patterns and correlations in raw data.

I want to change tack, moving away from finding insights in text. Imagine we run a mobile network and all of our customers pay a monthly fee to make phone calls. It’s safe to assume that some of our customers will be more valuable than others. Some customers will use all of their minutes every month and some will only use their phone occasionally.

We want to assign each customer to one of three groups: our Cash Cows (people who spend a lot but don’t consume much resource), our Dogs (people who spend little but consume a lot of resource) and those in between. We could offer a discount to the customers who pay the most but use the fewest minutes. We could offer a different plan to the customers who pay the least but use the most minutes.

Here is some sample customer data. I’ve plotted the number of minutes each customer has used against their lifetime value.

A little Googling turns up the k-means clustering algorithm, which assigns each point to one of k groups, where k is a number we choose up front (here, three). It’s an unsupervised algorithm, so we don’t need any prior knowledge about which points belong to which group.

The algorithm works iteratively.

  1. Generate k random points as the initial mid-points, one for each group
  2. Assign each point to the group with the closest mid-point
  3. Move each mid-point to the centre (the average position) of the points in its group

Repeat steps 2 and 3 until the mid-points no longer move.
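To make the loop concrete, here’s a minimal sketch of those steps in plain Python. It isn’t the code behind the plots in this post. The function name `k_means` and its `seed` and `max_iterations` parameters are my own choices for illustration, and for step 1 I’m assuming it’s fine to pick k of the existing data points as the starting mid-points rather than generating brand new random ones.

```python
import random
from math import dist  # Euclidean distance, available from Python 3.8


def k_means(points, k, seed=0, max_iterations=100):
    """Cluster 2-D points into k groups; returns (mid_points, assignments)."""
    rng = random.Random(seed)
    # Step 1 (assumption): use k distinct data points as the starting mid-points.
    mid_points = rng.sample(points, k)
    assignments = []
    for _ in range(max_iterations):
        # Step 2: assign each point to the group with the closest mid-point.
        assignments = [
            min(range(k), key=lambda g: dist(p, mid_points[g])) for p in points
        ]
        # Step 3: move each mid-point to the average position of its group.
        new_mid_points = []
        for g in range(k):
            members = [p for p, a in zip(points, assignments) if a == g]
            if members:
                xs, ys = zip(*members)
                new_mid_points.append((sum(xs) / len(xs), sum(ys) / len(ys)))
            else:
                # A group with no points keeps its old mid-point.
                new_mid_points.append(mid_points[g])
        # Stop once the mid-points no longer move.
        if new_mid_points == mid_points:
            break
        mid_points = new_mid_points
    return mid_points, assignments
```

Calling `k_means(customers, 3)` on a list of `(minutes, lifetime_value)` tuples returns the three mid-points and a group index for every customer. Because the starting mid-points are random, a different `seed` can produce a different grouping.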

Running this process against our example data clusters our customers like this:

The points in red look like our Cash Cows — they use the fewest minutes but add a disproportionate amount of value. We could offer them a discount to incentivise them to keep using the service. The points in blue look like our Dogs — they consume the most minutes without adding proportional value. We could suggest a different plan which might suit their needs better.
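I picked out the Cash Cows and Dogs by eye, but the same judgement could be roughly automated. The sketch below is one possible heuristic, not part of k-means itself: it ranks each group’s mid-point by lifetime value per minute used. The function name `label_groups` is hypothetical, and it assumes each mid-point is a `(minutes, lifetime_value)` pair with minutes greater than zero and that there are at least two groups.

```python
def label_groups(mid_points):
    """Map each group index to 'Cash Cow', 'Dog' or 'In between'."""
    # Value earned per minute consumed at each (minutes, value) mid-point.
    ratios = [value / minutes for minutes, value in mid_points]
    # Group indices ordered from least to most value per minute.
    ranked = sorted(range(len(mid_points)), key=lambda g: ratios[g])
    labels = {g: "In between" for g in ranked}
    labels[ranked[0]] = "Dog"        # lowest value per minute
    labels[ranked[-1]] = "Cash Cow"  # highest value per minute
    return labels
```

For example, `label_groups([(100, 900), (900, 300), (500, 500)])` treats the first group as the Cash Cows, because it earns the most per minute used.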

It’s amazing to see patterns extracted from our data without us training a model. Even without knowing in advance what our Cash Cows and our Dogs look like, we can start from random mid-points and keep refining the groups until they stop changing. The computer tells us what each group looks like.

There’s no guarantee that this process will always produce actionable results, but it’s so simple to run that it’s worth checking to see what patterns you find.


This is a post in my 30 Day Writing Challenge. I’m a software engineer, trying to understand machine learning. I haven’t got a PhD, so I’ll be explaining things with simple language and lots of examples.
