Measuring Digital Ad Effectiveness using Incrementality Testing

Sneha Chokshi
MiQ Tech and Analytics
6 min read · Dec 5, 2019

The last few decades have seen a major shift in advertising dynamics due to the rise of digital advertising. According to eMarketer, digital ad spending continues to rise globally and is set to cross the $330 billion mark in 2019. With increased investment, quantifying the effectiveness of advertising becomes essential for advertisers, who can then use these learnings as a basis for targeting strategies, bidding strategies, and budgeting within campaign design. At the same time, the complexity of the ad-tech ecosystem is constantly growing, with brands running marketing activities across multiple channels, ad formats, and targeting capabilities. As a result, traditional digital campaign measurement metrics like Cost per Click, Return on Investment, Cost per Acquisition, and Conversion Rate only scratch the surface when it comes to measuring the impact of marketing strategies. These metrics fail to answer the most common question today's advertisers have: "Did my ad campaign cause the user to convert and generate more revenue for my brand, or would that have happened anyway?" This gap in measurement leads us to incremental lift as a metric for measuring the impact of a marketing strategy.

What is Incrementality testing?

Incrementality testing is a mathematical approach to measuring the causal impact of ad investments, whether on site visits, conversions, or bottom-line sales. An incremental customer (or incremental revenue) is a customer (or earnings) the advertiser would not have gained without the advertising campaign. This impact is measured as a variation from a baseline, by analyzing and comparing performance between two user groups -

1. Users exposed to the advertiser's ad

2. Users not exposed to the advertiser's ad

Stages of Incrementality Test Experiment Design

Incrementality test experiment design can be split into four stages: Research, Randomization, Target, and Analyze.

Let’s understand the entire process with a real-world example

Advertiser: Fashion Retailer

Advertising Budget: $5k for a month for digital programmatic advertising

Advertiser Objective: Measure Incremental customers driven by an advertising campaign

1. Research

Data exploration and market analysis to understand the advertiser's business goals, the products/services to be promoted, and user behavior, in order to formulate -

a. Problem Statement: Does my campaign drive users who would not have made a purchase otherwise?

b. Duration of experiment: Define the duration of the experiment depending on the experiment's budget and goal. For this use case, let's consider one month.

c. Target Audience: Users in the age group 20–25 browsing for fashion apparel

d. Identify one metric to measure business goals: incremental sales, incremental page visits, or incremental customers. As the client wants to drive incremental customers via the advertising campaign, we will use incremental customers for this use case.

2. Randomization

Decide the test vs. control split depending on the budget and the target audience sample size. If the sample size is large, a test-to-control ratio of 95%:5% can give enough data points for testing. The two groups need to be mutually exclusive and have similar behavioral characteristics, as sketched below.
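To make the split concrete, here is a minimal sketch of deterministic, hash-based group assignment in Python. The function name and the MD5-based bucketing are illustrative assumptions, not a description of any specific platform's implementation:

```python
import hashlib

def assign_group(user_id: str, test_share: float = 0.95) -> str:
    """Deterministically bucket a user into 'test' or 'control'.

    Hashing the user ID gives a stable, pseudo-random assignment:
    the same user always lands in the same group, which keeps the
    two groups mutually exclusive across ad requests.
    """
    digest = hashlib.md5(user_id.encode("utf-8")).hexdigest()
    bucket = int(digest, 16) % 10_000  # roughly uniform value in 0..9999
    return "test" if bucket < test_share * 10_000 else "control"

print(assign_group("user-123"))  # always the same answer for this user
```

Because assignment depends only on the user ID, a returning user never switches groups, which is what keeps the test and control populations mutually exclusive.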

3. Target

Show the advertiser's ad to the test group, and either suppress advertising for the control group or show it a charity ad. Targeting strategies and bidding algorithms should be exactly the same across both groups. Ensure that no control user is exposed to the advertiser's ad; if exposure occurs due to a technical error, remove those users from the analysis, as in the sketch below.
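As a small illustration of that cleanup step, here is a pandas sketch; the column names and the exposure log are hypothetical:

```python
import pandas as pd

# Hypothetical exposure log: assigned group plus a flag for whether
# the user was actually served the advertiser's ad.
logs = pd.DataFrame({
    "user_id": ["u1", "u2", "u3", "u4"],
    "group": ["test", "control", "control", "test"],
    "saw_advertiser_ad": [True, True, False, True],
})

# Control users exposed to the advertiser's ad (a technical error)
# would contaminate the baseline, so drop them from the analysis.
contaminated = (logs["group"] == "control") & logs["saw_advertiser_ad"]
clean_logs = logs[~contaminated]
print(clean_logs)
```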

4. Analyze

This stage can be further broken into two parts: Calculation and Validation.

Calculation: Measure performance for both groups, i.e., compute each group's conversion rate and the relative lift of the test group over the control group, as sketched below.
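A minimal Python sketch of these calculations; the counts below are illustrative placeholders, not the campaign's actual data:

```python
def conversion_rates_and_lift(test_conv, test_users, ctrl_conv, ctrl_users):
    """Per-group conversion rates and the relative lift of test over control."""
    cr_test = test_conv / test_users
    cr_ctrl = ctrl_conv / ctrl_users
    lift = (cr_test - cr_ctrl) / cr_ctrl
    return cr_test, cr_ctrl, lift

# Placeholder counts for a 95%:5% split of ~1M targeted users.
cr_t, cr_c, lift = conversion_rates_and_lift(5_000, 950_000, 260, 50_000)
print(f"test CR = {cr_t:.4%}, control CR = {cr_c:.4%}, lift = {lift:.2%}")
```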

Validation: Here we validate whether the above results are statistically significant. Statistical significance is the likelihood that a relationship between two or more variables is caused by something other than chance. There are various hypothesis testing methods for determining the significance of our results. Let's take the example of the z-test, which is useful when dealing with a large sample size (n >= 30). A z-test determines whether two population means are different when the population variances are known. First, we formulate the hypotheses -

Null hypothesis: There is no difference between the user conversion ratios of the test and control groups. In other words, the advertising investment does not lead to a change in customer engagement.

Alternative hypothesis: There is a significant difference between the user conversion ratios of the test and control groups, i.e., the advertising campaign did have an impact on customer engagement. Whether this impact is positive or negative is indicated by the lift value calculated above.

The goal is to determine which claim - the null or the alternative - is better supported by the evidence from the sample data. To begin with, it's important to define the confidence level (99%, 95%, 90%, ...) you want to work with, and then work your way to the p-value, which helps you determine the significance of your results. In this use case, let's take a confidence level of 95%. Below are the mathematical calculations to arrive at the p-value and how to draw inferences from it -
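As a sketch of the math behind this step, a standard way to compare two conversion ratios is the pooled two-proportion z-statistic; this is one common formulation, assumed here rather than confirmed as the exact one used in the original analysis:

```latex
\hat{p} = \frac{x_t + x_c}{n_t + n_c},
\qquad
z = \frac{\hat{p}_t - \hat{p}_c}
         {\sqrt{\hat{p}\,(1 - \hat{p})\left(\tfrac{1}{n_t} + \tfrac{1}{n_c}\right)}}
```

where x_t and x_c are conversions, n_t and n_c are user counts in the test and control groups, p̂_t = x_t/n_t and p̂_c = x_c/n_c are the group conversion rates, and p̂ is the pooled conversion rate.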

From the z-score, there are various ways to compute the p-value, using statistical functions in Python, Excel, or R, or a z-table, all of which can be found on the internet.
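For example, in Python with SciPy's standard normal survival function; the z-score below is an assumed input, chosen only because it roughly reproduces the quoted p-value:

```python
from scipy.stats import norm

z = 2.13  # assumed z-score from the previous step

# Two-sided p-value: probability of a result at least this extreme
# in either tail of the standard normal distribution.
p_value = 2 * norm.sf(abs(z))
print(round(p_value, 4))  # ~0.033
```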

P-value = 0.0329

Drawing inference from the confidence level, the p-value, and the upper and lower bounds:

Depending on the confidence level, select the significance threshold (e.g., 0.05 for a 95% confidence level) to compare with the p-value and determine whether to reject the null hypothesis.

For a 95% confidence level:

  1. p-value < 0.05 indicates the null hypothesis can be rejected and the calculated lift value is statistically significant, i.e., we accept the alternative hypothesis
  2. p-value > 0.05 means we fail to reject the null hypothesis and the calculated lift value is not statistically significant.

In this use case, p-value = 0.0329 < 0.05, so we can reject the null hypothesis and say that there is a significant difference between test and control performance.

Furthermore, the upper and lower bounds of the 95% confidence interval provide another layer of validation of the outputs. On this basis, we can say that if this experiment were run many times with the same settings, about 95% of the resulting intervals would contain the true lift; here, that interval spans 0.23% to 3.29%.
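As a sketch of how such bounds can be computed, here is the usual large-sample interval for the difference in conversion rates, using an unpooled standard error; the inputs are placeholders, and this construction is an assumption rather than the confirmed method of the original analysis:

```python
from math import sqrt

def diff_ci(cr_t, n_t, cr_c, n_c, z_crit=1.96):
    """95% confidence interval for the difference in conversion rates,
    assuming samples large enough for the normal approximation."""
    se = sqrt(cr_t * (1 - cr_t) / n_t + cr_c * (1 - cr_c) / n_c)
    diff = cr_t - cr_c
    return diff - z_crit * se, diff + z_crit * se

# Placeholder inputs, not the campaign's actual data.
low, high = diff_ci(0.0053, 950_000, 0.0052, 50_000)
print(f"95% CI for the difference: [{low:.4%}, {high:.4%}]")
```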

Finally, we can say that -

“The advertising campaign was successful in driving a statistically significant incremental lift of 1.8% ± 1.53% in user conversion rate, with 95% confidence. Additionally, this experiment identified 4,805 incremental customers who would not have made a purchase had they not been exposed to the ad.”

These outputs can further be used for campaign strategy planning and optimization to improve efficiency. There are various methodologies for implementing this experiment design on the platforms of the digital advertising ecosystem. I will be talking about one of those methodologies, and how to use these outputs for campaign strategy planning and optimization, in my next blog.
