An Introduction to Uplift Modeling

Emily Strong
Published in The Data Nerd
2 min read · Jun 22, 2022

Marketing campaigns can be expensive. You want to contact only the people who you think will be persuaded to do whatever it is your campaign is about — buying something, donating money, voting, etc. If they were already going to do it, or never will, contacting them is a waste of resources. Worse, there are some people who would have done it on their own, but being contacted causes them to change their mind!

To optimize the return on our investment in the campaign, we need a model that can isolate only the people who are persuadable.

This is where uplift modeling comes in. An uplift model predicts the incremental impact of a treatment (being contacted) on an individual, compared with how that same individual would behave if left untreated, as the control group is. Uplift modeling is a form of causal modeling in which we attempt to isolate the effect of being treated on a person’s behavior. Comparing a treated group against an untreated control group lets us determine the true lift in the target variable that is attributable to the treatment. Uplift modeling is also sometimes called true lift modeling or incremental modeling.
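
As a rough formalization of the idea above (the notation is an assumption of this sketch, not taken from a particular library): let X be a person's features, T = 1 if they are contacted and T = 0 if they are in the control group, and Y = 1 if they take the desired action. When treatment is assigned at random, the uplift for a person with features x can be written as the difference between two response probabilities:

```latex
\text{uplift}(x) = P(Y = 1 \mid X = x,\ T = 1) - P(Y = 1 \mid X = x,\ T = 0)
```

As a purely hypothetical illustration, if people with features x respond 12% of the time when contacted and 8% of the time when not, the uplift is 0.12 - 0.08 = 0.04, or four percentage points. A negative value would indicate that contact makes the action less likely.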

This approach divides the people who are eligible to be contacted into four behavior segments:

  1. Sure Things: the folks who would do the thing regardless of whether or not you contact them.
  2. Persuadables: the folks who will only do the thing if contacted.
  3. Lost Causes: the folks who will never do the thing regardless of whether you contact them.
  4. Do Not Disturbs: the folks who will be dissuaded from doing the thing because you contacted them. Also called Sleeping Dogs (as in “don’t wake a sleeping dog”).
[Figure: Behavior Segments in an Uplift Model]
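
One way to read the four segments is as the four combinations of two yes/no outcomes: what a person does if contacted, and what they do if not. The sketch below is only illustrative, and the function name and arguments are hypothetical. In real data only one of the two outcomes is ever observed for a given person, which is why the segment has to be estimated by a model rather than looked up.

```python
def behavior_segment(acts_if_contacted: bool, acts_if_not_contacted: bool) -> str:
    """Map a person's two hypothetical outcomes to a behavior segment."""
    if acts_if_contacted and acts_if_not_contacted:
        return "Sure Thing"          # acts either way: contact changes nothing
    if acts_if_contacted:
        return "Persuadable"         # acts only when contacted: positive uplift
    if acts_if_not_contacted:
        return "Do Not Disturb"      # acts only when left alone: negative uplift
    return "Lost Cause"              # never acts: contact changes nothing


# Enumerate all four combinations of the two outcomes.
for contacted in (True, False):
    for not_contacted in (True, False):
        print(contacted, not_contacted, behavior_segment(contacted, not_contacted))
```

Only the Persuadable combination gains anything from contact, and only the Do Not Disturb combination is harmed by it.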

There are algorithms specifically designed for uplift modeling available in the CausalML library. However, you can also use simple classification models with one of these techniques:

  1. Train separate models on the treatment and control groups. At inference, generate a prediction from each model for the individual and calculate the difference between the two predicted values of the target variable. This difference is the lift attributable to treatment.
  2. Include a treatment indicator as one of the model features. At inference, generate predictions for the individual with the indicator set to treated and again with it set to untreated, and calculate the difference.
  3. Assign people to the 4 behavior segments and train a multi-class classifier to predict which segment someone belongs to.
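
The first two techniques can be sketched with scikit-learn as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the data is synthetic, with a treatment effect deliberately built in for people whose first feature is positive. The function names `two_model_uplift` and `indicator_uplift` are hypothetical, and `GradientBoostingClassifier` is just one reasonable choice of classifier.

```python
import numpy as np
from sklearn.ensemble import GradientBoostingClassifier

# Synthetic campaign data (an assumption for illustration only).
rng = np.random.default_rng(0)
n = 4000
X = rng.normal(size=(n, 2))            # two made-up customer features
treated = rng.integers(0, 2, size=n)   # 1 = contacted, 0 = control, assigned at random
true_uplift = np.where(X[:, 0] > 0, 0.4, 0.0)      # only people with x0 > 0 are persuadable
y = rng.binomial(1, 0.2 + treated * true_uplift)   # 1 = did the thing

X_new = rng.normal(size=(1000, 2))     # people we want to score


def two_model_uplift(X, treated, y, X_new):
    """Technique 1: one model per group, uplift = difference in predictions."""
    model_t = GradientBoostingClassifier(random_state=0)
    model_c = GradientBoostingClassifier(random_state=0)
    model_t.fit(X[treated == 1], y[treated == 1])
    model_c.fit(X[treated == 0], y[treated == 0])
    return model_t.predict_proba(X_new)[:, 1] - model_c.predict_proba(X_new)[:, 1]


def indicator_uplift(X, treated, y, X_new):
    """Technique 2: one model with a treatment indicator as an extra feature."""
    model = GradientBoostingClassifier(random_state=0)
    model.fit(np.column_stack([X, treated]), y)
    as_treated = np.column_stack([X_new, np.ones(len(X_new))])
    as_control = np.column_stack([X_new, np.zeros(len(X_new))])
    return model.predict_proba(as_treated)[:, 1] - model.predict_proba(as_control)[:, 1]


uplift_two = two_model_uplift(X, treated, y, X_new)
uplift_ind = indicator_uplift(X, treated, y, X_new)

# Compare average predicted uplift for the simulated persuadable group vs. everyone else.
persuadable = X_new[:, 0] > 0
print("two-model:", uplift_two[persuadable].mean(), uplift_two[~persuadable].mean())
print("indicator:", uplift_ind[persuadable].mean(), uplift_ind[~persuadable].mean())
```

In a real campaign you would rank people by predicted uplift and contact those with the highest scores. A tree-based model is used here because the single-model variant needs to capture the interaction between the treatment indicator and the other features.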

The use of uplift modeling can significantly improve the return on investment for a marketing campaign, both by avoiding wasting time and resources on contacting people who are sure things, and by decreasing the risk of triggering someone in the “Do Not Disturb” group to change their mind.

Uplift modeling and other key concepts for working with models in real-world settings are covered in my Machine Learning Flashcards: Modeling Core Concepts deck. Check it out on Etsy!

Emily Strong is a senior data scientist and data science writer, and the creator of the MABWiser open-source bandit library. https://linktr.ee/thedatanerd