Spotting The Right Targets for Your Marketing Strategies

Uplift model: a predictive model that identifies the targets most likely to buy because they are targeted, unlike spray-and-pray marketing strategies built on customer segmentation and propensity modeling.

Hshan.T
CodeX
7 min read · Jun 2, 2022


The previous post discussed customer segmentation with RFM analysis, where the associated marketing strategy is the so-called spray-and-pray approach. From basic transaction data, we identify groups of customers with specific spending behaviour based on three measures (Recency, Frequency and Monetary Value) and use them as targets for marketing promotions. The target group might be lapsed customers, new acquisitions or frequent purchasers of similar items. The drawback of such a strategy is that we have no idea who will react to the content; some recipients, like me, might not even open the email. Compared with the catalogue marketing of older days, digital channels such as email marketing and social media marketing are now the mainstream in the industry. Flooding customers’ mailboxes with unrelated promotions might frustrate them and make them less likely to view the content.

Instead of searching for marketing targets by predicting who is likely to make a purchase, the uplift model predicts who will buy because the marketing campaign is directed at them. In other words, they buy because of the marketing effort, so the business’s spending is more profitable (higher ROI) and creates value. Advantages of the uplift model:

  • improving customer experience (fewer unrelated promotional emails)
  • increasing the efficiency of marketing strategies (higher return on marketing investment)
  • avoiding overuse of, or addiction to, promotions

Before talking about the uplift model, let’s have a brief overview of another popular marketing model, the propensity model. In simpler terms, the propensity model is a classical supervised machine learning model; it is the specific problem statement and the procedure that follows the prediction that give it the name. A propensity model can be built with minimal RFM data, but as the data collected by a business grows in width and depth, the model can become more extensive and perform better. From past marketing campaigns, we get the binary target variable: whether each customer responded by making a purchase. It works as follows:

  1. Data collection about marketing campaigns and the customers who received the promotion.
  2. Data pre-processing, including cleansing, feature synthesis, data normalization, etc.
  3. Modeling.
  4. The top-k customers with the highest response probability are targeted, up to the available quota.
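The last step above can be sketched in a few lines. This is only an illustration: the `predicted` scores, the customer ids and the `select_top_k` helper are hypothetical, standing in for whatever the trained propensity model returns.

```python
import heapq


def select_top_k(response_probability, k):
    """Return the k customer ids with the highest predicted response probability."""
    if k <= 0:
        return []
    # Iterating a dict yields its keys; rank them by their predicted probability.
    return heapq.nlargest(k, response_probability, key=response_probability.get)


# Hypothetical model output: customer id -> predicted probability of responding.
predicted = {"c1": 0.82, "c2": 0.15, "c3": 0.64, "c4": 0.47, "c5": 0.91}
campaign_quota = 3  # assumed number of promotions we can afford to send

targets = select_top_k(predicted, campaign_quota)
print(targets)
```

Note that this ranking says nothing about whether these customers would have bought anyway, which is exactly the gap the uplift model tries to close.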

In contrast to the propensity model, the uplift model gives a more actionable prediction. Instead of acting passively by sending campaign brochures to potential responders and leaving everything to chance, the uplift model puts the business in a more proactive position to influence outcomes, by exploring how our intervention increases the likelihood of the desired outcome. Why uplift models? Let me answer this question with another question.

“Would you rather spend an extra dollar on someone who is already likely to make a purchase, or on persuading someone to change their mind and make a purchase?”

The uplift model starts with two distinct setups for data collection, control and treatment. If you remember your science lessons, the control is the setup where all conditions are kept constant; in other words, samples in the control set are not treated. Having two different datasets lets us compute the differential advantage of the action taken. At this point, you might be confused or wondering how uplift modeling is done. The two models approach will be discussed briefly in this post. The target variable for an uplift model can be numerical or categorical, which determines the type of models to be trained on the control and treated datasets. Similar to the propensity model, it is a supervised modeling problem, only slightly more complex because it involves two separate datasets at once.

Nevertheless, the name ‘two models approach’ should already hint at how the problem is handled. The 2x2 matrix below gives you an intuition of what we are hunting for.

R: Responding; N: Not responding

If you take a second look at the matrix, targeting group A makes no difference because they respond whether they are targeted or not. Why not save the marketing expense for better use? We are looking for the customers in group B, who respond only when targeted, as campaign targets, and we try to avoid sending campaigns to those in group C, where our marketing effort has a negative impact and may even lose customers. Two models, MC and MT, are trained on the control set and the treated set respectively. The uplift score, Uplift_Score(X) = MT(X) − MC(X), is then calculated to evaluate the impact of the treatment.

In some cases, the Uplift_Score mentioned here is also known as the individual treatment effect. The preferred sign of the score depends on the nature of the target variable. In particular,

  • If predicting churn probability, we expect a drop in churn probability and hence MT(X)<MC(X); a negative score is preferable.
  • If predicting purchase probability, we expect an increase in buying probability and hence MT(X)>MC(X); a positive score is preferable.
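To make the two models approach concrete, here is a deliberately tiny sketch. It assumes the score is taken as MT(X) − MC(X), which matches the sign discussion above, and it replaces a real classifier with the simplest possible "model": the observed response rate per customer segment. The segments, the data rows and the helper names are all made up for illustration.

```python
from collections import defaultdict


def fit_response_rate_model(rows):
    """Fit a toy 'model': the observed response rate of each segment.

    rows is an iterable of (segment, responded) pairs, responded being 0 or 1.
    A real project would train a proper classifier here instead.
    """
    totals = defaultdict(int)
    responders = defaultdict(int)
    for segment, responded in rows:
        totals[segment] += 1
        responders[segment] += responded
    rates = {segment: responders[segment] / totals[segment] for segment in totals}
    return lambda segment: rates[segment]


# Hypothetical past-campaign data; the target is "made a purchase" (1) or not (0).
treated_rows = [("new", 1), ("new", 1), ("new", 0), ("new", 1),
                ("loyal", 1), ("loyal", 1), ("loyal", 1), ("loyal", 0)]
control_rows = [("new", 0), ("new", 1), ("new", 0), ("new", 0),
                ("loyal", 1), ("loyal", 1), ("loyal", 1), ("loyal", 1)]

model_treated = fit_response_rate_model(treated_rows)  # plays the role of MT
model_control = fit_response_rate_model(control_rows)  # plays the role of MC


def uplift_score(segment):
    """Difference between the treated and control predictions for a segment."""
    return model_treated(segment) - model_control(segment)


for segment in ("new", "loyal"):
    print(segment, uplift_score(segment))
```

Since the target here is a purchase, a positive score is the one we want: in this made-up data the "new" segment looks worth targeting, while the "loyal" segment reacts worse when treated.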

The next step is to answer the aim of the analysis, which you should have decided at the very beginning. Do you intend to conduct a population-level analysis, or prediction at the finer granularity of the individual level? For population-level analysis, we look at the average uplift score and its confidence interval. If zero is included in the interval, the treatment effect may not be considered significant, depending on the overall distribution of the scores. Within an A/B test, uplift modeling can serve as an intermediate step for choosing between the range of treatments available. At the individual level, as discussed earlier, the scores let us identify the right targets for our campaign to maximize return on investment. I hope the discussion above is brief enough as an introduction to uplift models, and that it sparks interest and curiosity to explore the topic further.
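For the population-level view, a rough sketch of "average uplift score and its confidence interval" could look like the following. The scores are invented, and the interval uses a plain normal approximation of the mean, which is an assumption on my side and is crude for a sample this small.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_confidence_interval(scores, confidence=0.99):
    """Return (mean, lower, upper) using a normal approximation of the mean."""
    centre = mean(scores)
    # Two-sided critical value, e.g. about 2.576 for a 99% interval.
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    margin = z * stdev(scores) / sqrt(len(scores))
    return centre, centre - margin, centre + margin


# Hypothetical individual uplift scores predicted for ten customers.
uplift_scores = [0.12, 0.05, -0.03, 0.20, 0.08, 0.01, 0.15, -0.06, 0.10, 0.04]

average, lower, upper = mean_confidence_interval(uplift_scores)
zero_inside = lower <= 0.0 <= upper  # True would suggest no clear overall effect

print(f"average uplift {average:.3f}, 99% interval ({lower:.3f}, {upper:.3f})")
print("zero inside interval:", zero_inside)
```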

Example

Consider a retailer who would like to conduct an uplift analysis by offering a 20% discount to sell a new product. 1000 customers are selected randomly from the customer database for the experiment. Every marketing campaign has a cost, so it is logical to keep the treatment set smaller. The cost referred to here can reduce the profit margin.

Experiment (Treated) Set: 250 customers randomly picked from the 1000.

Control Set: the remaining 750 customers.

In the treated set, 100 customers (a 40% response rate) respond by placing an order; in the control set, 600 customers (an 80% response rate) do. The overall uplift score for the treatment is 0.4 − 0.8 = −0.4, showing an adverse effect. A 99% confidence interval is computed from the control set using the individual uplift scores. Plotting a histogram of the uplift scores helps visualize the distribution and draw insights from a statistical point of view, such as the shape of the distribution, range of values, location of zero, skewness, etc.
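The arithmetic of this example is easy to reproduce. The interval at the end is not the one described above (which relies on individual uplift scores); it is a simple normal-approximation interval for a difference of two proportions, swapped in here as an assumption so that the sketch stays self-contained.

```python
from math import sqrt
from statistics import NormalDist

# Numbers taken from the example above.
treated_size, treated_responders = 250, 100
control_size, control_responders = 750, 600

treated_rate = treated_responders / treated_size  # 100 / 250
control_rate = control_responders / control_size  # 600 / 750
overall_uplift = treated_rate - control_rate

# Assumed method: two-proportion normal approximation at 99% confidence.
z = NormalDist().inv_cdf(0.995)
standard_error = sqrt(
    treated_rate * (1 - treated_rate) / treated_size
    + control_rate * (1 - control_rate) / control_size
)
lower = overall_uplift - z * standard_error
upper = overall_uplift + z * standard_error

print(f"treated rate {treated_rate:.2f}, control rate {control_rate:.2f}")
print(f"overall uplift {overall_uplift:.2f}, 99% interval ({lower:.3f}, {upper:.3f})")
```

Under these assumptions the whole interval sits below zero, which is consistent with reading the discount as having an adverse effect in this made-up experiment.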

Implementation of Uplift Model

Let’s get our hands dirty by applying the uplift model with scikit-uplift, a simple Python library that covers implementation, evaluation and visualization.

Dataset.
Two models approach training.
Average uplift predicted and its confidence interval at 99%.
Distribution of uplift prediction.
Qini Curve for evaluating uplift models.
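For readers who want to see what sits behind a Qini curve, here is a hand-rolled sketch in plain Python. It is not scikit-uplift's implementation; it follows the commonly used definition in which customers are ranked by predicted uplift and, at each cut-off, the responders in the treated group are compared with the responders in the control group rescaled to the treated group's size. The input lists are invented.

```python
def qini_curve(uplift, responded, treated):
    """Qini values after targeting the top 1, 2, ..., n customers by predicted uplift.

    uplift:    predicted uplift score per customer
    responded: 1 if the customer responded, else 0
    treated:   1 if the customer was in the treated set, else 0 (control)
    """
    order = sorted(range(len(uplift)), key=lambda i: uplift[i], reverse=True)
    n_treated = n_control = y_treated = y_control = 0
    curve = []
    for i in order:
        if treated[i]:
            n_treated += 1
            y_treated += responded[i]
        else:
            n_control += 1
            y_control += responded[i]
        # Rescale control responders to the size of the treated group so far.
        scaled_control = y_control * n_treated / n_control if n_control else 0.0
        curve.append(y_treated - scaled_control)
    return curve


# Hypothetical predictions and outcomes for eight customers.
uplift = [0.9, 0.8, 0.6, 0.5, 0.2, 0.1, -0.1, -0.3]
treated = [1, 0, 1, 0, 1, 0, 1, 0]
responded = [1, 0, 1, 0, 0, 0, 0, 1]

curve = qini_curve(uplift, responded, treated)
print(curve)
```

A curve that climbs early and flattens or drops later suggests the model ranks the customers who benefit from the treatment near the top.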

The implementation shown above is the most basic application of the uplift model. scikit-uplift is selected here because I think it is a good idea to begin with something fundamental. The two-models approach has some drawbacks, which is why single-model approaches were introduced, and now even more advanced meta-learner algorithms. Other elegant and sophisticated Python libraries are available for building better-performing models. Explore them if you are interested in going further and deeper!

Summary

Advances in modeling algorithms, storage and computing hardware make data applications and data-driven decision making more seamless, provided you have enough good-quality data. It is often too late by the time we realize that the data captured for past events is insufficient for a model to learn from. For example, in the case discussed above, a business might consider storing not only customer data but also campaign data. Data is always the foundation of a decent data solution.

Regardless of the type of marketing model you choose, improving customer relationship management (CRM) is the ultimate goal, typically for up-selling, cross-selling, customer retention (new acquisitions are costly), boosting the sales cycle, etc. The uplift model investigates the causal effect of actions/treatments for effective resource allocation, enhancing customer conversion and minimizing the negative impact of marketing campaigns. Although we focus on business applications, the uplift model suits use cases in other industries as well, as long as the main objective is to model the incremental impact of an action. Its modeling setup, identical to the classical scientific experimental setup, shows how widely it generalizes.
