Optimizing a Promotion Campaign with Uplift Models and Machine Learning

Jumei Lin
5 min read · Aug 7, 2021

Maximizing profits and reducing the cost of large-scale marketing campaigns

A great marketing offer complements the products and services your business sells, but sending offers to every customer is not always profitable. For example, suppose a customer has a paid subscription and has forgotten that he pays a fixed monthly fee. He would keep paying unless the company sends him a discount offer. At that moment, he may realize he does not need the subscription and cancel it. The campaign ends up cannibalizing your own revenue by sending the promotion.

In this article, we will show you how to send marketing offers that make the right impact by using uplift models and machine learning. The data used in this article is from Kaggle: Marketing Promotion Campaign Uplift Modelling.

Uplift modeling is a technique that estimates the gain in purchase probability caused by sending a customer a marketing offer.

It is widely used to maximize the incremental return of marketing campaigns and to estimate the effectiveness of a marketing strategy.

Uplift models work with four groups of customers:

  1. Treatment Responders (TR): Customers who received an offer and bought. This group includes the customers who buy because of the offer (positive uplift).
  2. Treatment Non-Responders (TN): Customers who received an offer and did not buy.
  3. Control Responders (CR): Customers who did not receive an offer and bought anyway.
  4. Control Non-Responders (CN): Customers who did not receive an offer and did not buy.
uplift model by Lumei Digital

Our target is the Treatment Responders (TR), customers who will not purchase unless you offer a promotion, so we will target those customers in the promotional campaigns. Conversely, we will not send the offer to Treatment Non-Responders (TN), Control Non-Responders (CN), and Control Responders (CR), which reduces the marketing cost and maximizes the return on investment (ROI).

Hence, today we use uplift models and a classifier to identify the likelihood of a customer purchasing a product. After that, we can target those identified as Treatment Responders (TR) in the marketing campaign.

▹Step 1: Label the customers and predict the probability of each customer being in each group.

▹Step 2: Calculate the uplift score. A higher score implies higher uplift.

Uplift Score = P(TR)/P(T) + P(CN)/P(C) - P(TN)/P(T) - P(CR)/P(C)
uplift modeling by Lumei Digital
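
To make the formula concrete, here is a minimal sketch that computes the score from four class probabilities. The helper name `uplift_score` and the example probabilities are invented for illustration. P(T) and P(C) are assumed to be P(TR) + P(TN) and P(CR) + P(CN), which matches how the code later in this article computes them.

```python
def uplift_score(p_cn, p_cr, p_tn, p_tr):
    """Uplift score from the four class probabilities (hypothetical helper)."""
    p_t = p_tr + p_tn  # P(T): probability mass of the treatment classes
    p_c = p_cr + p_cn  # P(C): probability mass of the control classes
    return p_tr / p_t + p_cn / p_c - p_tn / p_t - p_cr / p_c


# Illustrative probabilities only, not taken from the dataset
likely_persuadable = uplift_score(p_cn=0.40, p_cr=0.10, p_tn=0.10, p_tr=0.40)
unlikely_persuadable = uplift_score(p_cn=0.10, p_cr=0.40, p_tn=0.40, p_tr=0.10)
print(likely_persuadable, unlikely_persuadable)
```

Under this definition the score always falls between -2 and 2, and a customer who looks like a TR or CN gets a higher score than one who looks like a TN or CR.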

★ Step 1: Predicting the probability of customers being in each group by using KMeans and an XGBoost classifier.

We label the customers based on their purchase behaviors. If a customer never receives a promotional offer and purchases the product, he is labeled as “1”, CR: Control Responders. In other words, the customer buys without an offer.

After that, cluster the customers into 4 groups by purchasing history (the history column) using KMeans and keep the cluster as a feature, then use an XGBoost classifier to predict the probability of each customer being in each of the four labeled classes.

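# imports
import pandas as pd
import xgboost as xgb
from sklearn.cluster import KMeans
from sklearn.model_selection import train_test_split
# load data
df = pd.read_csv('data.csv')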
# label for classifier
df['campaign_group'] = 'treatment'
df.loc[df.offer == 'No Offer', 'campaign_group'] = 'control'
#0 = CN: Control Non-Responders
df['target_class'] = 0
#1 = CR: Control Responders
df.loc[(df.campaign_group == 'control') & (df.conversion > 0),'target_class'] = 1
#2 = TN: Treatment Non-Responders
df.loc[(df.campaign_group == 'treatment') & (df.conversion == 0),'target_class'] = 2
#3 = TR: Treatment Responders
df.loc[(df.campaign_group == 'treatment') & (df.conversion > 0),'target_class'] = 3
#creating clusters
kmeans = KMeans(n_clusters=4)
kmeans.fit(df[['history']])
df['history_cluster'] = kmeans.predict(df[['history']])
#dropping unnecessary columns
df_model = df.drop(['offer','campaign_group','conversion'],axis=1)
df_model = pd.get_dummies(df_model)
#create feature set and labels
X = df_model.drop(['target_class'],axis=1)
y = df_model.target_class
#splitting train and test groups
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=23)
#fitting the model and predicting the probabilities
xgb_model = xgb.XGBClassifier().fit(X_train, y_train)
class_probs = xgb_model.predict_proba(X_test)

★ Step 2: Calculating the uplift score with the formula above, where P(T) = P(TN) + P(TR) and P(C) = P(CN) + P(CR):

#predict class probabilities for all customers
proba = xgb_model.predict_proba(df_model.drop(['target_class'],axis=1))
df_model['proba_CN'] = proba[:,0]
df_model['proba_CR'] = proba[:,1]
df_model['proba_TN'] = proba[:,2]
df_model['proba_TR'] = proba[:,3]
#calculate uplift score for customers
df_model['uplift_score'] = df_model.eval('\
proba_CN/(proba_CN+proba_CR) \
+ proba_TR/(proba_TN+proba_TR) \
- proba_TN/(proba_TN+proba_TR) \
- proba_CR/(proba_CN+proba_CR)')
#assign it back to main dataframe
df['uplift_score'] = df_model['uplift_score']
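
Once every customer has a score, one way to use it is to rank customers and send the offer only to the top share. The sketch below assumes a toy dataframe with invented values. The `customer_id` column, the `select_targets` helper, and the 40% cutoff are hypothetical and do not come from the original notebook.

```python
import pandas as pd

# Toy stand-in for the scored dataframe; the values are invented
scored = pd.DataFrame({
    "customer_id": [1, 2, 3, 4, 5],
    "uplift_score": [0.9, -0.3, 0.1, 1.4, -1.1],
})


def select_targets(frame, top_fraction=0.4):
    """Return the top share of customers ranked by uplift score (hypothetical helper)."""
    n_targets = int(len(frame) * top_fraction)
    return frame.sort_values("uplift_score", ascending=False).head(n_targets)


targets = select_targets(scored)
print(targets["customer_id"].tolist())
```

The cutoff is a business decision: a larger share reaches more customers but also spends more of the budget on customers with low or negative uplift.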

★ Model Evaluation: After calculating the uplift score for each client, let’s evaluate the results by using the QINI curve.

The QINI uplift value is calculated by

QINI = TR - [(CR*T)/C]
# Normalizing the QINI
QINI = (TR/T) - (CR/C)
where
T denotes the total treated population (TR + TN), and
C denotes the total untreated population (CR + CN)
Lumei Digital
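
The curve itself can be built by sorting customers from highest to lowest uplift score and evaluating the QINI formula on the first k customers, for every k. The sketch below is one possible implementation under that assumption, not the code from the original notebook. The `qini_curve` helper and the toy arrays are hypothetical.

```python
import numpy as np


def qini_curve(uplift_score, treated, converted):
    """Cumulative QINI values after targeting the top-k customers by score.

    treated and converted are 0/1 arrays (hypothetical helper).
    """
    order = np.argsort(-np.asarray(uplift_score, dtype=float))
    treated = np.asarray(treated)[order]
    converted = np.asarray(converted)[order]

    cum_t = np.cumsum(treated)                     # T: treated customers so far
    cum_c = np.cumsum(1 - treated)                 # C: control customers so far
    cum_tr = np.cumsum(treated * converted)        # TR: treated who bought
    cum_cr = np.cumsum((1 - treated) * converted)  # CR: control who bought

    # QINI = TR - CR * T / C; the T / C ratio is taken as 0 while C is still 0
    ratio = np.divide(cum_t, cum_c, out=np.zeros(len(cum_t)), where=cum_c > 0)
    return cum_tr - cum_cr * ratio


# Invented toy data: six customers, already scored
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4]
treated = [1, 0, 1, 0, 1, 0]
converted = [1, 0, 1, 0, 0, 1]

qini = qini_curve(scores, treated, converted)
# A random targeting order is usually drawn as a straight line to the final value
random_baseline = qini[-1] * np.arange(1, len(qini) + 1) / len(qini)
print(qini, random_baseline)
```

A model whose curve sits above the straight baseline is ranking customers better than random targeting would.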

Based on the QINI curve, the uplift model performs better than a random model.

With this model, you can predict whether a new customer is a persuadable (buys only when offered a promotion) or a sleeping dog (stops buying when offered a promotion) based on the customer’s demographics and purchase history, and then choose different marketing strategies to increase the likelihood of a purchase.

Ultimately, the goal of the uplift model is to find Treatment Responders (TR), Persuadables, so you can concentrate your resources on those who are likely to be positively impacted by your marketing promotions.

In the retail industry, the most important thing is not to predict the likelihood of a client purchasing but to know what can be done to increase the likelihood of clients making a purchase.

Uplift is the increase in the likelihood of the outcome with the treatment compared to the outcome without the treatment.

Uplift modeling focuses on the effectiveness of the treatment and on finding persuadables.
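
As a rough illustration of this definition, uplift can be estimated per subgroup as the conversion rate of treated customers minus the conversion rate of untreated customers. The data and the column names `segment`, `treated`, and `converted` below are invented for the example.

```python
import pandas as pd

# Invented toy data: two customer segments, half of each treated
toy = pd.DataFrame({
    "segment":   ["A", "A", "A", "A", "B", "B", "B", "B"],
    "treated":   [1, 1, 0, 0, 1, 1, 0, 0],
    "converted": [1, 1, 0, 1, 0, 1, 1, 1],
})

# Conversion rate per segment, with one column per treatment flag (0 and 1)
rates = toy.groupby(["segment", "treated"])["converted"].mean().unstack("treated")

# Uplift = conversion rate with treatment minus conversion rate without it
subgroup_uplift = rates[1] - rates[0]
print(subgroup_uplift)
```

In this toy data the offer helps segment A and hurts segment B, so only segment A would be worth targeting.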

Apply the uplift model to various use cases:

The uplift model can be used in scenarios well beyond marketing campaigns. It estimates the treatment effect at an individual or subgroup level. For instance, it can be used to evaluate the effect of fertilizer on crop yield, or the effect of sending emails in political campaigns.

uplift model by lumei digital

More Optimized Marketing Campaign Topics:

The Complete Guide to A/B Testing in Python

The Jupyter Notebook is available on GitHub.

Keywords: Optimized campaign, Uplift

Jumei Lin

Entrepreneur, Writes about artificial intelligence, AWS, and mobile app. @Lumei Digital