Uplift modelling is hard — but worth it
One of the main tasks in marketing is to spend money wisely to bring about a “positive” change in customer behaviour through actions. Because every action tries to change uncertain future behaviour, selecting the course of action and the people it is directed at is hard.
The whole industry of analytical customer relationship management (CRM) is built on this hard decision problem.
For quite some time now, a large chunk of the analytical CRM space has relied mainly on predictive modelling. The most commonly used models are called propensity models and response models.
Propensity models are simple models that try to infer the future conversion and spending behaviour of existing clients by taking into account past interactions and purchases. These models are well suited as a starting point in the world of predictive analytics and for getting a grasp of the future lifetime value of a customer. They are, however, not well suited to managing individual campaigns.
Response models are the next step. These models try to connect customer behaviour with certain marketing activities. They are built by looking at customers who received a marketing activity and at their actions afterwards. This allows for modelling expected responses to campaigns and is thus better suited to campaign management than propensity models, but this method also has a significant drawback: it does not take into account the actions of the customers who did not receive that marketing activity. This may lead to targeting people who would have spent money anyway. Even worse, it may lead to targeting people who decide not to spend any more money precisely because they received the marketing campaign. This is a common problem in the telecommunication industry, for example, where retention offers at the end of the customers’ contract term “remind” these customers that they can switch carriers in the first place.
If we look at this challenge from an analytical point of view, we can easily categorize the individual outcome (buy/not buy) of campaign treatment into four cases:
- Buy only if treated: The persuadables
- Buy always, regardless of treatment: The sure things
- Never buy, regardless of treatment: The lost causes
- Buy only if NOT treated: The sleeping dogs / Do not disturbs
Clearly, as a marketer, you don’t want to wake sleeping dogs and you also don’t want to waste money on treating customers that will never buy or that will always buy. This leaves you with only one interesting group to treat: The persuadables. These are the customers that show an increased propensity to buy if they are treated with a marketing activity.
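To make the four cases concrete, here is a minimal sketch that maps a customer's two potential outcomes to a segment. The function names are made up for illustration, and keep in mind that for a real customer only one of the two outcomes can ever be observed.

```python
def segment(buys_if_treated: bool, buys_if_untreated: bool) -> str:
    """Map a customer's two potential outcomes to one of the four segments."""
    if buys_if_treated and not buys_if_untreated:
        return "persuadable"
    if buys_if_treated and buys_if_untreated:
        return "sure thing"
    if not buys_if_treated and not buys_if_untreated:
        return "lost cause"
    return "sleeping dog"  # buys only if NOT treated


def individual_effect(buys_if_treated: bool, buys_if_untreated: bool) -> int:
    """Net effect of treating this customer: +1, 0 or -1 purchases."""
    return int(buys_if_treated) - int(buys_if_untreated)


# Only persuadables have a positive effect; sleeping dogs have a negative one.
for treated_outcome in (True, False):
    for untreated_outcome in (True, False):
        print(segment(treated_outcome, untreated_outcome),
              individual_effect(treated_outcome, untreated_outcome))
```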
Put boldly, the holy grail of analytical customer relationship management lies in arriving at predictive models that tell you:
a) Which customers can be influenced positively by marketing spend?
(Who are the persuadables?)
b) Which marketing activity influences each individual the most?
(What makes the persuadables react?)
To answer these questions, marketers need to take a different approach to building predictive models. They need to model the net effect a marketing activity has on a customer, compared with doing nothing for this customer. These kinds of models are called uplift models, because they directly model the net increment in response probability due to a marketing action.
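Written as a formula, the quantity an uplift model targets could look like this. The notation is my own assumption and not taken from a particular library: Y is the response (1 = buys), X the customer's features and T the treatment indicator (1 = received the marketing activity).

```latex
\[
  \text{uplift}(x) \;=\; P(Y = 1 \mid X = x,\; T = 1) \;-\; P(Y = 1 \mid X = x,\; T = 0)
\]
```

A response model only estimates the first term, which is why it cannot tell a persuadable from a customer who would have bought anyway.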
Doing this seems like a hard task at first, because it is only possible to observe one of the two possible scenarios for any one customer: either she received the treatment or she didn’t. This is the reason why uplift models are in fact much harder to implement than normal response models. The solution is to find similar patterns in customers and then randomly assign customers with similar patterns to a treated group and to a control group.
This means that it is not possible to estimate uplift models without having randomized control groups that are large enough for statistical inference. Using large(ish) control groups is something marketers often struggle to do. The reason is simple: most marketing activities do have a positive effect, so withholding them from a large group of your customers results in a loss of revenue.
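In its simplest form, this idea can be sketched as follows: group customers by a shared pattern and compare the conversion rates of the randomly treated and the randomly untreated customers within each group. This is only a toy estimator with invented segment names and numbers, not a full uplift model, and it assumes the assignment to treatment and control really was random.

```python
from collections import defaultdict


def uplift_by_segment(records):
    """Estimate uplift per segment from (segment, treated, converted) tuples."""
    # For each segment and each group (True = treated, False = control)
    # keep a pair [number of customers, number of conversions].
    counts = defaultdict(lambda: {True: [0, 0], False: [0, 0]})
    for seg, treated, converted in records:
        cell = counts[seg][bool(treated)]
        cell[0] += 1
        cell[1] += int(converted)

    result = {}
    for seg, groups in counts.items():
        (n_t, c_t), (n_c, c_c) = groups[True], groups[False]
        if n_t == 0 or n_c == 0:
            continue  # no estimate possible without both groups
        result[seg] = c_t / n_t - c_c / n_c
    return result


def make_records(seg, treated, n, conversions):
    """Hypothetical helper: n customers, the first `conversions` of them buy."""
    return [(seg, treated, i < conversions) for i in range(n)]


# Invented campaign data for two segments.
records = (
    make_records("lapsed", True, 100, 10) + make_records("lapsed", False, 100, 4)
    + make_records("loyal", True, 100, 30) + make_records("loyal", False, 100, 30)
)
result = uplift_by_segment(records)
print(result)
```

In this made-up data the "loyal" segment converts far more often than the "lapsed" one, yet its estimated uplift is zero. A response model would rank it first, while the uplift view would send the campaign to the "lapsed" customers.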
I constantly try to convince our clients at Gpredictive to use control groups. They are not only useful for estimating uplift models but also for measuring the net effects of campaigns. If you don’t have random control groups, you’ll never know if your campaign has really increased your revenue or if you just targeted all the loyal clients who would have reacted anyway. Even worse, if you use vouchers and rebate offers, you might actually decrease your revenue by sending rebates to customers who would have bought without them.
With uplift modelling, it is possible to answer these kinds of questions and truly optimize your campaign targeting for net effects. Some use cases of uplift modelling are:
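Measuring the net effect of a campaign against a random control group could look roughly like this. It is a sketch that uses a standard two-proportion z-test; the function name and the campaign numbers are invented for illustration.

```python
from math import sqrt
from statistics import NormalDist


def net_effect(n_treated, conv_treated, n_control, conv_control):
    """Compare conversion rates of the treated group and the control group."""
    rate_t = conv_treated / n_treated
    rate_c = conv_control / n_control
    diff = rate_t - rate_c

    # Pooled two-proportion z-test (normal approximation).
    pooled = (conv_treated + conv_control) / (n_treated + n_control)
    se = sqrt(pooled * (1 - pooled) * (1 / n_treated + 1 / n_control))
    z = diff / se if se > 0 else 0.0
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))

    return {
        "uplift": diff,
        # Conversions the campaign added on top of what would have happened anyway.
        "incremental_conversions": diff * n_treated,
        "z": z,
        "p_value": p_value,
    }


# Invented example: 90,000 treated customers, 10,000 held back as control.
effect = net_effect(n_treated=90_000, conv_treated=1_350,
                    n_control=10_000, conv_control=100)
print(effect)
```

Without the control group you would credit the campaign with all 1,350 conversions. With it, the estimate in this invented example is about 450 incremental conversions.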
- Decide which customers really need to receive discounted offers
- Decide which customers need expensive marketing collateral like catalogues or even phone calls
- Decide which customers really need a high-frequency contact approach and which would spend just as much with a lower contact frequency
- Decide which customers not to send a retention offer for renewing contracts (not waking sleeping dogs)
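Decisions like these can be turned into a simple targeting rule once an uplift estimate per customer is available. The sketch below assumes a hypothetical predicted uplift, margin per conversion and cost per contact. All names and numbers are made up, and a real rule would likely be more nuanced.

```python
def should_treat(predicted_uplift, margin_per_conversion, cost_per_contact):
    """Treat only if the expected incremental margin exceeds the contact cost."""
    return predicted_uplift * margin_per_conversion > cost_per_contact


# Hypothetical customers with predicted uplift from some uplift model.
customers = [
    {"id": "A", "predicted_uplift": 0.030},   # persuadable
    {"id": "B", "predicted_uplift": 0.005},   # hardly reacts to the treatment
    {"id": "C", "predicted_uplift": -0.020},  # sleeping dog
]

# Assumed economics: 50 margin per conversion, 1 per catalogue sent.
targets = [
    c["id"] for c in customers
    if should_treat(c["predicted_uplift"], margin_per_conversion=50, cost_per_contact=1)
]
print(targets)
```

Note that a customer with negative predicted uplift is never selected, no matter how cheap the contact is.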
We have seen tremendous cost savings and revenue increases with uplift modelling in our industry. It does not come without cost: you need a certain level of data quality and large enough control groups to be able to estimate these models, and they are much more sensitive to misspecification than traditional models. But, when implemented correctly, the ROI on using control groups and uplift modelling approaches can exceed 10x.
P.S.: In order to calculate the required size of your control group, you can use the excellent A/B-testing calculator of Evan Miller. If you want to know the minimum size of your control group, you just have to enter your assumed or already measured baseline conversion rate (i.e., the conversion in a campaign timeframe when there is no marketing treatment) and the minimum uplift effect you want to detect.
For example, if you have a baseline conversion rate of 1% and you want to be able to detect an uplift effect of your treatment if it exceeds 40% in relative terms (i.e., the conversion rate with treatment is greater than 1.4%), then you’ll need a control group of about 10,000 customers in your campaign. If you are okay with a significance level of 10% instead of 5%, you only need about 8,000 customers in the control group.
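If you want to redo this arithmetic yourself, the sketch below uses a common normal-approximation formula for comparing two proportions. It rests on several assumptions of mine: a two-sided test, 80% statistical power, and reading the 5% and 10% figures above as significance levels. Whether the calculator uses exactly this formula is also an assumption, although the results land in the same range as the numbers above.

```python
from math import ceil, sqrt
from statistics import NormalDist


def control_group_size(baseline, relative_uplift, alpha=0.05, power=0.80):
    """Approximate customers needed per group to detect a relative uplift."""
    delta = baseline * relative_uplift   # absolute difference to detect
    treated = baseline + delta           # conversion rate with treatment

    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided significance
    z_power = NormalDist().inv_cdf(power)

    # Standard deviations under "no effect" and under "effect of size delta".
    sd_null = sqrt(2 * baseline * (1 - baseline))
    sd_alt = sqrt(baseline * (1 - baseline) + treated * (1 - treated))

    return ceil((z_alpha * sd_null + z_power * sd_alt) ** 2 / delta ** 2)


# Baseline 1%, minimum detectable uplift 40% (i.e. 1.4% with treatment).
n_5 = control_group_size(0.01, 0.40, alpha=0.05)
n_10 = control_group_size(0.01, 0.40, alpha=0.10)
print(n_5, n_10)
```

Halving the uplift you want to detect roughly quadruples the required group size, because the difference enters the formula squared. That is the main reason small effects need such large control groups.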