
How to measure the impact of a marketing campaign without a randomised experiment?

6 min read · Jan 21, 2022


Image source: https://www.inovex.de/de/blog/causal-inference-introduction-to-causal-effect-estimation/

Advertising across digital channels will exceed 60% of the global ad spend of US$769bn¹ in 2022. While companies use a variety of tools to make sure their digital ad spend delivers the maximum possible ROI, there is a lot left to be desired. These tools try to capture all the touchpoints in a user’s journey to a conversion or a click, but the methodology is imperfect, especially with the growing limitations on using cookies to track consumer behaviour. And when multiple campaigns run in parallel across different media channels and platforms, it becomes difficult to determine how each campaign contributes to conversions. With several campaigns ongoing at any time, if a business wants to launch, modify or stop a campaign, the question to answer is, “How is the change affecting overall conversions, or the status quo?” Did the 10% increase in clicks come from the new campaign, or is bad weather making customers more likely to visit your website?

To answer these questions, causal inference comes to the rescue. Causal inference is the branch of statistics that deals with the consequences of an action within a larger system. In marketing, the gold standard for causal inference is a randomised experiment, in which the change is implemented in a test group while a control group is left untouched. Any effects observed in the test group but not in the control group can then be attributed to the change. This might mean launching a campaign in a group of cities in a region while making no change in the other cities. While randomised experiments are effective for understanding the causal effect of a change, it is not always possible to run them: they can be expensive, time consuming, complex to design or sometimes unethical. There is another technique that can be used in the absence of a randomised experiment. To explain it, consider the potential outcomes table below.
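The test-versus-control logic of a randomised experiment can be sketched in a few lines. The city-level conversion counts below are invented for illustration; with random assignment, the difference in group means estimates the campaign's effect:

```python
# Toy illustration of a randomised experiment: the campaign runs in the
# test cities, nothing changes in the control cities. Numbers are made up.
test = [120, 135, 128, 142]      # conversions in cities with the campaign
control = [110, 118, 112, 121]   # conversions in comparable control cities

avg_test = sum(test) / len(test)
avg_control = sum(control) / len(control)

# Because assignment is random, the two groups are comparable, and the
# difference in means estimates the causal effect of the campaign.
effect = avg_test - avg_control
print(avg_test, avg_control, effect)  # → 131.25 115.25 16.0
```

The whole point of randomisation is that it licenses this simple subtraction; without it, the groups may differ for reasons other than the campaign.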

Table 1: Potential Outcomes Table

The experimental unit A could represent the customers of a business in the market. They have been subjected to a new campaign and we have the observed data. What we want to know is what would have happened in the absence of the treatment or the campaign. The outcome under no treatment is the counterfactual estimate referred to as ‘CF estimate’ in the above table. The difference between the observed data and the counterfactuals is the causal impact of the campaign. The covariates are a list of attributes that can explain the observed data reasonably well and are not affected by the campaign or the treatment. These could be data such as stock price, weather data, Google search trends etc. It is advisable to choose at least a few covariates to build a good model.
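The arithmetic behind Table 1 can be sketched directly: for a treated unit, the causal impact in each period is the observed outcome minus the counterfactual estimate. The observed and counterfactual values below are invented for illustration:

```python
# Minimal sketch of the potential-outcomes logic in Table 1.
# Each row is one period for the treated unit A; values are made up.
rows = [
    # (observed downloads, counterfactual (CF) estimate under no campaign)
    (400, 260),
    (380, 265),
    (360, 258),
]

# Causal impact per period = observed - counterfactual.
impacts = [obs - cf for obs, cf in rows]
print(impacts)       # → [140, 115, 102]
print(sum(impacts))  # cumulative impact over the period → 357
```

The hard part, of course, is producing the counterfactual column, which is where the covariates and the model come in.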

Using an example, we will see how we can estimate the causal impact of a campaign. I have created a simulated dataset which contains app downloads for a food delivery company — ‘ABC Foods’ from January 2020 to March 2021. As covariates, I have chosen the Google search trends for the term — ‘food delivery’ and the ad spend in thousands of euros. Here is a snapshot of their data.

Table 2: Data Snapshot

The company overhauled one of their existing campaigns and implemented it on 1st January 2021 without changing the overall ad spend. They want to understand the causal impact of the modified campaign on their app downloads.

Figure 1: Monthly App Downloads

In the monthly app downloads graph above, the black line represents the observed data, i.e., the app downloads. Using the data from the period Jan–Dec 2020, we build a prediction model that explains the observed data using the covariates. The model is shown by the blue dotted line, with the blue shaded region representing the model’s confidence interval. Once we have the model, we use it to predict the observed data in the post-period, after the launch of the modified campaign. The difference between the observed data and the predicted data (the counterfactuals) gives the causal impact of the campaign. To calculate the counterfactual estimate, a team at Google developed an R package called ‘CausalImpact’ and made it available on GitHub². The package uses Bayesian structural time series models to calculate the counterfactual estimate. I have used this package in R to calculate the causal impact of the campaign for ABC Foods. Let’s look at the results.
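The article's analysis uses the CausalImpact package in R, which fits a full Bayesian structural time series model. As a deliberately simplified Python sketch of the same counterfactual idea, the snippet below fits an ordinary least squares line on the pre-period (downloads explained by one covariate, the search trend), then uses it to predict the counterfactual in the post-period. All numbers are invented for illustration, and this is not the package's method, only its core logic:

```python
# Simplified counterfactual sketch (NOT the CausalImpact/BSTS model):
# fit downloads ~ search trend on the pre-period, predict the post-period.
pre_trend = [50, 55, 60, 58, 62, 65]            # covariate, pre-period
pre_downloads = [200, 215, 240, 230, 245, 260]  # observed, pre-period

post_trend = [66, 70, 68]          # covariate values after the launch
post_downloads = [330, 355, 340]   # observed downloads after the launch

# Closed-form simple linear regression on the pre-period.
n = len(pre_trend)
mean_x = sum(pre_trend) / n
mean_y = sum(pre_downloads) / n
slope = sum((x - mean_x) * (y - mean_y)
            for x, y in zip(pre_trend, pre_downloads)) \
        / sum((x - mean_x) ** 2 for x in pre_trend)
intercept = mean_y - slope * mean_x

# Counterfactual: what the model predicts would have happened
# in the post-period had the campaign not changed anything.
counterfactual = [intercept + slope * x for x in post_trend]

# Pointwise and cumulative causal impact, as in the CausalImpact plots.
pointwise = [obs - cf for obs, cf in zip(post_downloads, counterfactual)]
cumulative = sum(pointwise)
print(round(cumulative))  # positive: observed exceeds the counterfactual
```

CausalImpact improves on this sketch by modelling trend and seasonality, combining multiple covariates, and producing Bayesian credible intervals rather than a single point prediction.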

Figure 2: Causal Impact Analysis

There are three graphs in the above figure (Figure 2). In the first one, the black line shows the app downloads over time and the blue line gives the counterfactual estimate; this is the same view as Figure 1. The shaded blue region gives the confidence interval. After the launch of the modified campaign, the black line begins to trend up, while the counterfactual estimate shows what would have happened in the absence of the change. The second graph shows the pointwise estimate, which is the difference between the observed data and the counterfactual. This gives a good view of the causal impact and shows that the modified campaign increased app downloads. The third graph provides a cumulative view.

It shows that the campaign led to roughly 10.7K additional app downloads in the period Jan–Mar 2021. Let’s review the results from the analysis in R in more detail. During the post-intervention period, the response variable had an average daily value of approximately 380.01. By contrast, in the absence of an intervention, we would have expected an average response of 261.64. The 95% interval of this counterfactual prediction is [241.55, 280.14]. Subtracting this prediction from the observed response yields an estimate of the causal effect the intervention had on the response variable. This daily effect is 118.37, with a 95% interval of [99.87, 138.46].

Summing up the individual data points during the post-intervention period, the response variable had an overall value of 34.20K. By contrast, had the intervention not taken place, we would have expected a sum of 23.55K over the 3-month period. The 95% interval of this prediction is [21.74K, 25.21K]. The above results are given in terms of absolute numbers. In relative terms, the response variable showed an increase of +45%. The 95% interval of this percentage is [+38%, +53%].
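The summary figures fit together arithmetically, which is a quick sanity check worth doing on any CausalImpact report. Using the numbers reported above:

```python
# Checking the arithmetic of the summary, using the figures reported
# by the analysis for the ABC Foods example.
daily_observed = 380.01        # average daily downloads, post-period
daily_counterfactual = 261.64  # predicted daily average without the campaign
daily_effect = daily_observed - daily_counterfactual

observed_sum = 34.20           # total downloads post-period, in thousands
predicted_sum = 23.55          # counterfactual total, in thousands
absolute_effect = observed_sum - predicted_sum
relative_effect = absolute_effect / predicted_sum

print(round(daily_effect, 2))        # → 118.37
print(round(absolute_effect, 2))     # → 10.65 (thousand downloads)
print(round(relative_effect * 100))  # → 45 (percent)
```

The relative effect is always computed against the counterfactual (what would have happened without the campaign), not against the observed total.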

This means that the positive effect observed during the intervention period is statistically significant and unlikely to be due to random fluctuations: the probability of obtaining this effect by chance is very small (Bayesian one-sided tail-area probability p = 0.001).

Please refer to the Github link below for the data and the R code.

https://github.com/j-arora/CausalImpact

As demonstrated using the example above, calculating the causal impact using this package in the absence of a randomised experiment is an effective way for a business to measure the impact of a change on their business metrics.

Reference:

  1. https://www.zenithmedia.com/digital-advertising-to-exceed-60-of-global-adspend-in-2022/#:~:text=Advertising%20across%20all%20digital%20channels,rise%20to%2065.1%25%20by%202024
  2. https://google.github.io/CausalImpact/CausalImpact.html


Written by Jay Arora

Startups | Analytics | Public Speaking
