After A/B Testing: Optimize an Advertising Promotion Strategy by Customer Targeting

The A/B test has shown that the promotion strategy boosts sales with practical significance. What next? Let’s target the right clients to maximize profit!

TING DS
9 min read · Jan 24, 2024

The notebook presenting all the analyses and attempts can be seen here. The full project, including datasets, can be found in my Github.

Background

The initial dataset comes from a task prepared by Starbucks for data science candidates. It contains approximately 120,000 data points, split into training and testing files in a ratio of 2:1. The simulated A/B test evaluates an advertising promotion aimed at attracting more customers to purchase a specific product priced at $10.

Each data point includes:

ID: individual id

Promotion: indicates whether an individual received the promotion

Purchase: indicates whether that individual ultimately made a purchase

V1-V7: additional features of each individual (we are not told what these characteristics actually represent, and it’s our job to understand their behavior in order to target the right clients)

Besides testing whether a new feature works, another important benefit of A/B testing is the data collected during the experiment. From this data, we can infer which customer characteristics are related to ultimately making a purchase (that is, which customers are more responsive to the new feature). This matters most when the new feature carries an operational cost, such as sending advertising promotions: relying solely on the significance of the A/B test to send the promotion to every customer may reduce the expected benefit or even cause a loss. Since each promotion costs $0.15 to send, the best next step after the A/B test has reached the required statistical and practical significance is to limit the promotion to the audience most likely to respond.

The goal of targeting the right audience is to maximize two metrics:

  • Incremental Response Rate (IRR): How many additional customers purchased the product with the promotion compared to those who did not receive it?
  • Net Incremental Revenue (NIR): What is the net profit (or loss) from sending out the promotion? (product price: $10, cost of promotion: $0.15)
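To make the two metrics concrete, here is a minimal sketch. It assumes the definitions commonly used for this exercise: IRR is the purchase rate in the treatment group minus the purchase rate in the control group, and NIR is the treatment revenue net of promotion cost minus the control revenue. The function names and the example counts are hypothetical, not taken from the notebook.

```python
PRICE = 10.0  # product price from the article
COST = 0.15   # cost of sending one promotion, from the article


def irr(purch_treat, cust_treat, purch_ctrl, cust_ctrl):
    """Incremental Response Rate: treatment purchase rate minus control purchase rate."""
    return purch_treat / cust_treat - purch_ctrl / cust_ctrl


def nir(purch_treat, cust_treat, purch_ctrl):
    """Net Incremental Revenue: treatment revenue net of promotion cost, minus control revenue."""
    return (PRICE * purch_treat - COST * cust_treat) - PRICE * purch_ctrl


# Made-up counts, for illustration only (not the article's data).
example_irr = irr(purch_treat=30, cust_treat=1000, purch_ctrl=10, cust_ctrl=1000)
example_nir = nir(purch_treat=30, cust_treat=1000, purch_ctrl=10)
print(round(example_irr, 4), round(example_nir, 2))
```

With these toy counts, the 20 extra purchases bring in $200, but sending 1,000 promotions costs $150, so most of the gain is eaten by the promotion cost. This is why a significant lift alone does not guarantee a positive NIR.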

If we send the promotion to all the clients in the test dataset:

we get an IRR of 0.01 (the company expects at least 0.0188) and an NIR of -1132 (the company expects at least 189). The strongly negative NIR means that sending the promotion to all clients would lose Starbucks money.

How to optimize promotion strategy by audience targeting?

Analyzing features by correlation and distribution

Build several strategies by domain knowledge and statistics

Test strategies

Correlation Analysis

Pearson’s correlation coefficient between V1-V7 and IsPurchase in the two groups

Among the features with relatively high correlation, V4 responds most strongly to the promotion, with a positive correlation. Because the A/B test is a randomized experiment, we can say that customers with higher V4 are more likely to be influenced by the promotion and make a purchase. Since V4 is a binary feature (1 or 2), this means customers with V4 = “2” are more receptive to the promotion than customers with V4 = “1”.

V5 behaves similarly to V4, but with lower sensitivity to the promotion. V5 is a categorical variable (1, 2, 3, 4). We can say that the higher the category of V5, the more likely the audience is to be positively influenced by promotions and make purchases.

In the control group and experimental group, V3 showed opposite correlation directions (highest positive correlation in the control group, highest negative correlation in the experimental group). When there is no promotional activity, customers with higher V3 are more likely to make purchases. However, when there is a promotional activity, customers with higher V3 are less likely to make purchases. These customers are not sensitive to promotional activities and may even have a negative reaction towards them. Interesting! What kind of features might V3 represent?

V1, V2, V6, and V7 either have low correlation with the outcome or have low sensitivity, making it difficult to obtain useful conclusions.
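The per-group comparison above can be sketched as follows. This is a standard-library illustration, not the notebook's code: the row layout, the "Yes"/"No" encoding of Promotion, and the `purchase` key are assumptions, and the toy rows are invented.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def correlation_by_group(rows, feature, outcome="purchase", group="Promotion"):
    """Return {group value: correlation of feature with outcome inside that group}."""
    result = {}
    for value in sorted({r[group] for r in rows}):
        subset = [r for r in rows if r[group] == value]
        result[value] = pearson([r[feature] for r in subset],
                                [r[outcome] for r in subset])
    return result


# Toy rows (hypothetical values, not the Starbucks data).
rows = [
    {"Promotion": "No", "V4": 1, "purchase": 0},
    {"Promotion": "No", "V4": 2, "purchase": 0},
    {"Promotion": "No", "V4": 1, "purchase": 1},
    {"Promotion": "No", "V4": 2, "purchase": 1},
    {"Promotion": "Yes", "V4": 1, "purchase": 0},
    {"Promotion": "Yes", "V4": 1, "purchase": 0},
    {"Promotion": "Yes", "V4": 2, "purchase": 1},
    {"Promotion": "Yes", "V4": 2, "purchase": 0},
]
corr = correlation_by_group(rows, "V4")
print(corr)
```

A feature whose correlation with purchase is clearly higher in the treatment group than in the control group, as V4 is in this toy data, is a candidate for targeting.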

Considerations on Correlation Analysis

First of all, even the highest correlation among V1-V7 is very low, which suggests that there are other important customer aspects that these features do not capture.

Still focusing on the features we possess, customers in category 2 of feature V4 and higher categories in feature V5, as well as customers with lower values in feature V3, have greater potential to respond positively to promotional activities (showing relatively strong positive correlation with purchase and higher sensitivity towards promotional activities).

Distribution Analysis

Histogram by purchase or not and promotion or not

Note: only the features with the most visible differences in distribution are shown, to keep the layout of the article clear

Control group (No promotion):

Red — not purchase; Blue — purchase

Treatment group (promotion):

Red — not purchase; Blue — purchase

Considerations of Distribution Analysis:

V3 strengthens the conclusions we drew from the correlation analysis: customers with higher V3 values are more likely to purchase the product without any promotional activities. However, this trend is completely reversed when there are promotional activities. (Inference: V3 may represent the income of the clients, as those with higher incomes can purchase the product at any time, while those with lower incomes increase their purchasing volume after being exposed to promotional activities. On the other hand, clients with higher incomes may reduce their purchasing volume during promotional activities due to concerns about product quality etc.)

V2 shows that customers who are closer to the average value are more sensitive to promotions and more likely to make purchases due to promotions (Inference: V2 may represent age, with V2 concentrated around thirty years old on average).

Among the customers who purchase products, customers with V4 = “2” tend to be more sensitive to promotional activities (Inference: V4 may represent user level (VIP or not); VIP customers and regular customers may behave differently when facing promotional activities).
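The histogram comparison can also be expressed numerically. The sketch below bins one feature and reports, per bin, the purchase rate with promotion minus the purchase rate without it. The bin edge, the row layout, and the toy values are assumptions chosen to mimic the V3 pattern described above; they are not the article's data.

```python
from collections import defaultdict


def uplift_by_bin(rows, feature, edges):
    """Purchase rate in treatment minus control, per bin of `feature`.

    `edges` are ascending cut-offs; a value falls in bin i when exactly
    i of the edges are less than or equal to it.
    """
    # For each bin: [purchases, customers] per experiment group.
    counts = defaultdict(lambda: {"Yes": [0, 0], "No": [0, 0]})
    for r in rows:
        b = sum(r[feature] >= e for e in edges)
        cell = counts[b][r["Promotion"]]
        cell[0] += r["purchase"]
        cell[1] += 1
    uplift = {}
    for b, c in sorted(counts.items()):
        if c["Yes"][1] and c["No"][1]:  # both groups are needed to compare
            uplift[b] = c["Yes"][0] / c["Yes"][1] - c["No"][0] / c["No"][1]
    return uplift


def make(v3, promo, purchases):
    """Build toy rows sharing one V3 value and one experiment group."""
    return [{"V3": v3, "Promotion": promo, "purchase": p} for p in purchases]


# Toy data (hypothetical): low V3 reacts well to the promotion, high V3 badly.
rows = (make(-1.0, "Yes", [1, 1, 0, 0]) + make(-1.0, "No", [1, 0, 0, 0])
        + make(1.0, "Yes", [0, 0, 0, 1]) + make(1.0, "No", [1, 1, 0, 0]))
uplift = uplift_by_bin(rows, "V3", edges=[0.0])
print(uplift)
```

Bins with a positive difference are where the promotion appears to help; bins with a negative difference are where it appears to hurt.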

Build potential strategies

We send the promotion to each audience below and calculate the IRR and NIR on the test dataset:

  1. All clients
  2. Clients whose feature values have a relatively high positive correlation with purchase
  3. Clients who are predicted as “purchase = 1” by an ML classifier tuned with GridSearchCV, with SMOTE used to balance the “purchase=1” and “purchase=0” groups
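Each of these strategies can be scored the same way: keep only the test customers the strategy would target, then compare treatment and control within that subset. The sketch below assumes the same IRR and NIR definitions as the usual version of this exercise; the `evaluate` helper, the row layout, and the toy data are hypothetical.

```python
PRICE, COST = 10.0, 0.15  # product price and promotion cost from the article


def evaluate(rows, should_target):
    """Return (IRR, NIR) among the customers that `should_target` selects."""
    selected = [r for r in rows if should_target(r)]
    treat = [r for r in selected if r["Promotion"] == "Yes"]
    ctrl = [r for r in selected if r["Promotion"] == "No"]
    if not treat or not ctrl:
        raise ValueError("strategy must select customers from both groups")
    purch_treat = sum(r["purchase"] for r in treat)
    purch_ctrl = sum(r["purchase"] for r in ctrl)
    irr = purch_treat / len(treat) - purch_ctrl / len(ctrl)
    nir = PRICE * purch_treat - COST * len(treat) - PRICE * purch_ctrl
    return irr, nir


def make(v4, promo, n, n_purchases):
    """Build n toy rows, the first n_purchases of which are purchases."""
    return [{"V4": v4, "Promotion": promo, "purchase": int(i < n_purchases)}
            for i in range(n)]


# Toy test set (hypothetical counts, not the Starbucks data).
rows = (make(2, "Yes", 10, 3) + make(2, "No", 10, 1)
        + make(1, "Yes", 10, 0) + make(1, "No", 10, 1))

all_irr, all_nir = evaluate(rows, lambda r: True)          # strategy 1
v4_irr, v4_nir = evaluate(rows, lambda r: r["V4"] == 2)    # a rule like strategy 2
print(all_irr, all_nir, v4_irr, v4_nir)
```

A model-based strategy fits the same interface: pass a function that returns the classifier's prediction for a row.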

ML classifier trials:

Starting from a simple model configuration, we check whether the gap between the test metrics and the company's desired metrics has narrowed, and gradually increase model complexity or re-wrangle the features.

We trained the ML classifier using only treatment group data (promotion = 1 in the training set), because the company cares about revenue from future promotions. The exception is the one-versus-rest feature engineering described below, which uses both groups.

Because we have already defined the business metrics to optimize (IRR and NIR), we can skip evaluating AUC, recall, precision, and F1-score. However, we still need to consider which group's prediction accuracy has the greatest impact on IRR and NIR, and apply this logic in feature engineering.

The company cares most about the proportion of people predicted to make a purchase who actually end up purchasing the product, which means precision, TP / (TP + FP), is the most important.
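As a small illustration of why precision matters here: every false positive is a promotion sent (at $0.15) to someone who does not buy. The helper below is a plain-Python sketch with invented labels, not the notebook's evaluation code.

```python
def precision(y_true, y_pred):
    """Share of predicted purchasers who actually purchased: TP / (TP + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if p == 1 and t == 0)
    return tp / (tp + fp) if (tp + fp) else 0.0


# Invented labels: three customers are predicted to purchase, one actually does.
y_true = [1, 0, 0, 1, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0]
prec = precision(y_true, y_pred)
print(round(prec, 3))
```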

  • Logistic Regression
  • Logistic Regression with feature selection (feature importance by decision tree)
  • XGBoost Classifier with one-versus-rest classification, using all data from both the control and treatment groups to create a new outcome, should_send_promotion (0/1):

Clients who both received the promotion and completed a purchase are the ones the company wants to target, so should_send_promotion is 1.

All other clients should not get the promotion (they show no sensitivity to it), so should_send_promotion is 0.
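The labelling rule can be written in a few lines. The sketch assumes Promotion is encoded as "Yes"/"No" and the outcome is stored under a `purchase` key; both are assumptions about the data layout.

```python
def should_send_promotion(row):
    """1 if the customer received the promotion and purchased, otherwise 0."""
    return int(row["Promotion"] == "Yes" and row["purchase"] == 1)


# One toy row per combination of group and outcome.
rows = [
    {"Promotion": "Yes", "purchase": 1},
    {"Promotion": "Yes", "purchase": 0},
    {"Promotion": "No", "purchase": 1},
    {"Promotion": "No", "purchase": 0},
]
labels = [should_send_promotion(r) for r in rows]
print(labels)  # prints [1, 0, 0, 0]
```

Only one of the four combinations is labelled 1, so the new outcome is even more imbalanced than the original purchase flag.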

  • XGBoost Classifier with categorizing numeric variables based on histogram distribution (more detailed cut-offs)
  • XGBoost Classifier with categorizing numeric variables based on histogram distribution (less detailed cut-offs for skewed distributions). If the distribution approximates a normal distribution, such as V2 (inferred to be age), most clients are concentrated around the mean, so we truncate the tails on both ends to form a single group, as the company wants to focus attention on the largest common group.
  • XGBoost Classifier with numeric variables categorized by weights that reflect the importance of the target audience: the closer a V2 value is to the mean, or the more responsive a V3 value is to the promotion, the higher its weight. For V3, lower values were assigned a higher category, since those customers are more likely to purchase when there is a promotional event.
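The tail-merging idea for a roughly normal feature such as V2 might look like the sketch below. The cut-offs are assumptions: the central band is taken as one standard deviation around the mean and split into three equal bins, which is not necessarily what the notebook uses.

```python
import statistics


def bin_with_merged_tails(values, n_std=1.0, inner_bins=3):
    """Map each value to a category: 0 for both tails, 1..inner_bins inside the central band."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    low, high = mean - n_std * std, mean + n_std * std
    width = (high - low) / inner_bins
    categories = []
    for v in values:
        if width == 0 or v < low or v >= high:
            categories.append(0)  # both tails share one category
        else:
            # Clamp to guard against floating-point rounding at the upper edge.
            categories.append(min(inner_bins, 1 + int((v - low) // width)))
    return categories


# Toy V2-like values (hypothetical): most are near 30, two are far out in the tails.
values = [10, 25, 28, 30, 32, 35, 50]
cats = bin_with_merged_tails(values)
print(cats)
```

The two extreme values land in the same category 0, while values near the mean keep finer categories, so the model's attention stays on the largest common group.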

Test several strategies

Metric Evaluation in different audience targeting strategies

In this final optimization strategy, we have achieved results that surpass Starbucks’ expectations. The IRR has similar values compared to the Starbucks model, and the NIR shows a 13% improvement.

We win!

Conclusion

In this project, we are able to:

Analyze the results of AB testing and determine that promotional strategies should be released based on the significant improvement in purchase conversion rate.

Infer the meaning of features through business knowledge, feature behavior, distribution and correlation analysis, and perform reasonable feature engineering.

Optimize a promotion strategy by targeting the right clients to maximize the business metrics Incremental Response Rate (IRR) and Net Incremental Revenue (NIR).

AB testing can determine whether a new event/feature should be released, and subsequent data analysis can enhance its effectiveness. There are many dimensions to analyze the data, not just machine learning modeling. It includes understanding business problems, changes in feature behavior and their relationships, trade-offs between business metrics, etc. Only by achieving all of these can we truly unleash the power of data!

Summary

When a company wants to test whether a new product or advertising strategy will increase customer engagement, it often does so by designing and conducting an A/B test within a certain customer segment. The changes in the A/B testing metrics (such as user engagement, purchase conversion rate, average click-through rate, etc.) are analyzed, and a significant increase indicates that the new feature or strategy is effective. The next step might seem to be rolling out the new feature/product/strategy to all customer groups or the entire platform, but that is a complete mistake!

Launching new products/advertising strategies/features requires investment in terms of money and time. The results of A/B testing only prove the effectiveness of intervention, but if this intervention is blindly applied to all future customers, the company may not only fail to make a profit but also incur losses! The correct approach is to establish new business metrics while considering costs and benefits. These metrics represent the profit growth that the company expects to achieve through the launch of new features/strategies.

Detailed analysis should be conducted on experimental data from A/B testing, including exploring changes in customer behavior, clustering characteristics, feature analysis, statistical forecasting modeling, etc. Ultimately, it is necessary to accurately determine the characteristics of customer groups that are more sensitive to interventions and develop customer segmentation strategies. By evaluating the profitability ratio of various strategies using new core business metrics, targeted advertising strategies can be selectively deployed to precisely positioned customer groups instead of covering all groups. This will maximize enterprise profits!

If you found this article helpful, please clap and follow to encourage me. I will publish more posts on data science and statistical analysis regularly!

Thanks for reading, and feel free to leave comments and discuss!

