Consider the causal structure in Marketing Mix Modeling with Robyn

Ryo Tanaka
4 min read · Jan 15, 2023

What is Marketing Mix Modeling?

Marketing Mix Modeling (MMM) measures how the spend and advertising performance (impressions/clicks) of each channel affect a target KPI (sales/conversions). MMM basically consists of three steps:
1. Model adstock (carryover and saturation) with a specific distribution
2. Model the KPI with the spend and impressions/clicks of each channel
3. Optimize the budget allocation across channels
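Step 1 can be made concrete with a minimal sketch. It assumes a geometric carryover and a Hill-type saturation curve; the function names, the parameter values and the spend figures are illustrative only, and `gamma` is expressed directly in spend units here, which is a simplification and not necessarily how Robyn parameterizes it.

```python
def geometric_adstock(spend, theta):
    """Carry a fraction `theta` of the previous period's adstock into the next one."""
    adstocked = []
    carry = 0.0
    for x in spend:
        carry = x + theta * carry
        adstocked.append(carry)
    return adstocked


def hill_saturation(x, alpha, gamma):
    """Hill curve with diminishing returns; equals 0.5 when x == gamma."""
    return x ** alpha / (x ** alpha + gamma ** alpha)


# Illustrative weekly spend for one channel (made-up numbers, not real data).
weekly_spend = [100.0, 0.0, 0.0, 50.0]

adstocked = geometric_adstock(weekly_spend, theta=0.5)
saturated = [hill_saturation(x, alpha=2.0, gamma=50.0) for x in adstocked]

for week, (a, s) in enumerate(zip(adstocked, saturated), start=1):
    print(f"week {week}: adstock={a:.1f}, saturated response={s:.3f}")
```

The saturated series, not the raw spend, is what would enter the KPI model in step 2, so a week with no spend can still contribute through carryover.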

Currently, several big tech companies are developing R/Python packages that make it easy to run MMM. One of the most useful is Robyn, developed by Meta Platforms, Inc. Besides the steps above, Robyn also handles time-series components (trend, seasonality, holiday, and weekday), multicollinearity among many regressors, and so on.

Concerns when we introduce MMM with Robyn in business

As shown in the official guide, Robyn mainly provides two types of one-pager modeling results:
1. An evaluation of the performance of current advertising activity
2. Suggestions for optimized budget allocation in the future

These results give us many data-driven insights that help drive the business from an advertising point of view.
However, they need to be interpreted with care, because the current model in Robyn treats all channels on an equal footing. In practice, each ad has a role that depends on the funnel stage of the customers it is meant to reach. For example, clients often spend on TV commercials to expand awareness of their product, while they spend on digital ads such as banners to reach customers who are closer to conversion. In that case, TV commercials are less likely to affect conversion directly, which may lead to an underestimation of that channel's importance.
Moreover, channels may influence each other. Customers follow a journey to conversion, and each channel contributes to a different step of that journey. From a theoretical point of view this raises the problem of endogeneity, since it creates a correlation between the endogenous regressor and the model error term.
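The underestimation can be illustrated with a small simulation. This is a sketch under strong assumptions, not a description of Robyn's model: the data-generating process (TV drives search, and search alone drives revenue), the coefficients 0.8 and 1.0, and all variable names are hypothetical. It shows mediation between channels rather than a full treatment of endogeneity.

```python
import random

random.seed(0)
n = 5000

# Hypothetical data-generating process: TV lifts search, search lifts revenue.
tv = [random.gauss(0, 1) for _ in range(n)]
search = [0.8 * t + random.gauss(0, 1) for t in tv]
revenue = [1.0 * s + random.gauss(0, 1) for s in search]


def mean(values):
    return sum(values) / len(values)


def cov(a, b):
    """Population covariance of two equally long sequences."""
    ma, mb = mean(a), mean(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)


# Total effect of TV: simple regression of revenue on TV only.
total_tv = cov(tv, revenue) / cov(tv, tv)

# "Flat" model: revenue regressed on TV and search together
# (closed-form OLS for two regressors, intercept absorbed by centering).
s_tt, s_ss, s_ts = cov(tv, tv), cov(search, search), cov(tv, search)
s_ty, s_sy = cov(tv, revenue), cov(search, revenue)
det = s_tt * s_ss - s_ts ** 2
flat_tv = (s_ty * s_ss - s_sy * s_ts) / det
flat_search = (s_sy * s_tt - s_ty * s_ts) / det

print(f"total effect of TV:        {total_tv:.2f}")
print(f"TV coefficient, flat model: {flat_tv:.2f}")
print(f"search coefficient, flat:   {flat_search:.2f}")
```

In this setup the flat model assigns TV a coefficient close to zero even though its total effect is roughly 0.8, because all of TV's contribution flows through search.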

Considering the causal structure in Robyn

To address the problems above, one possible approach is to take the causal structure into account when modeling. I tried to incorporate Structural Equation Modeling (SEM) into the existing modeling part of Robyn. The SEM is based on a Python package called “semopy”, which can run regularized SEM to deal with multicollinearity. You can find more detail in the project below, which I submitted for the Robyn hackathon.

In this post, I run the above “Robyn with SEM” on the sample dataset bundled with Robyn as a quick hands-on. Most of the modeling process follows the official demo script.
The initial configuration is as follows:

library(Robyn)
data("dt_simulated_weekly") # sample dataset bundled with Robyn
data("dt_prophet_holidays") # holiday table bundled with Robyn

InputCollect <- robyn_inputs(
  dt_input = dt_simulated_weekly,
  dt_holidays = dt_prophet_holidays,
  date_var = "DATE", # date format must be "2020-01-01"
  dep_var = "revenue", # there should be only one dependent variable
  dep_var_type = "revenue", # "revenue" (ROI) or "conversion" (CPA)
  prophet_vars = c("trend", "season", "holiday"), # "trend", "season", "weekday" & "holiday"
  prophet_country = "DE", # input one country. dt_prophet_holidays includes 59 countries by default
  context_vars = c("competitor_sales_B"), # e.g. competitors, discount, unemployment etc.
  paid_media_spends = c("tv_S", "ooh_S", "print_S", "facebook_S", "search_S"), # mandatory input
  paid_media_vars = c("tv_S", "ooh_S", "print_S", "facebook_I", "search_clicks_P"), # mandatory input
  organic_vars = "newsletter", # marketing activity without media spend
  window_start = "2016-11-21",
  window_end = "2018-08-20",
  adstock = "geometric" # geometric, weibull_cdf or weibull_pdf
)

After setting the initial hyperparameters, the causal model is defined manually for the SEM calculation.

model <- '
# measurement model
interest =~ revenue
awareness =~ revenue

# regressions
interest ~ facebook_S + search_S
awareness ~ tv_S + ooh_S + print_S
dep_var ~ competitor_sales_B
dep_var ~ newsletter
dep_var ~ holiday
dep_var ~ season
dep_var ~ trend

# residual correlations
search_S ~~ facebook_S
search_S ~~ tv_S
search_S ~~ ooh_S
search_S ~~ print_S
facebook_S ~~ tv_S
facebook_S ~~ ooh_S
facebook_S ~~ print_S
tv_S ~~ ooh_S
tv_S ~~ print_S
ooh_S ~~ print_S
'

In this hands-on, the following is assumed:

  • Facebook and Search affect people who already have an interest in a product, so the latent variable “interest” is defined for these channels.
  • TV, OOH and Print expand the exposure of a product and are expected to improve recognition. Therefore, the latent variable “awareness” is defined for these channels.
  • Competitor sales, the newsletter and all of the time-series components (trend, seasonality and holiday) are assumed to affect revenue directly.
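Because the structure is written by hand, changing an assumption means editing many lines of the model description. As a rough sketch, the same kind of description can be generated from a mapping of latent variables to channels. The helper `build_model_description` is hypothetical and not part of Robyn or semopy, and it writes the KPI name passed as `kpi` on the left-hand side of the direct regressions, which is an assumption of this sketch.

```python
from itertools import combinations

# Assumed mapping from each latent funnel stage to the channels that feed it.
latent_channels = {
    "interest": ["facebook_S", "search_S"],
    "awareness": ["tv_S", "ooh_S", "print_S"],
}
# Variables assumed to affect the KPI directly.
direct_vars = ["competitor_sales_B", "newsletter", "holiday", "season", "trend"]


def build_model_description(kpi, latent_channels, direct_vars):
    """Assemble a lavaan-style model description string (hypothetical helper)."""
    lines = ["# measurement model"]
    lines += [f"{latent} =~ {kpi}" for latent in latent_channels]

    lines.append("# regressions")
    lines += [
        f"{latent} ~ {' + '.join(channels)}"
        for latent, channels in latent_channels.items()
    ]
    lines += [f"{kpi} ~ {var}" for var in direct_vars]

    lines.append("# residual correlations")
    all_channels = [c for channels in latent_channels.values() for c in channels]
    # One covariance term per unordered pair of media channels.
    lines += [f"{a} ~~ {b}" for a, b in combinations(all_channels, 2)]
    return "\n".join(lines)


description = build_model_description("revenue", latent_channels, direct_vars)
print(description)
```

Moving a channel from one latent variable to another then only requires changing the mapping, which makes it easier to compare alternative structures.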

After 10000 iterations (the default), one of the optimized models can be described as follows.

According to the result, the performance of the model was not good at all.

Part of the reason is certainly that there were too few iterations for both evaluation metrics to converge, but the model as built also seems to leave a lot of room for improvement.

First of all, the structure was defined entirely from the analyst's assumptions. One option is therefore to explore a structure that is optimized based on the existing data.

In addition, SEM is not the only way to express the causal structure. SEM is somewhat distinctive in that it can include latent variables in the structure, but other approaches such as Bayesian networks could also be applied.

These considerations will be necessary to make MMM more useful and to enrich what MMM can describe.
