Optimising Marketing Allocation with Marketing Mix Models

Chris Fenton
Trusted Data Science @ Haleon
10 min read · Sep 25, 2023

by Christopher Fenton and Tom Dawson

“Half the money I spend on advertising is wasted; the trouble is I don’t know which half.” — John Wanamaker

In the dynamic landscape of marketing, understanding the impact of various elements is crucial. This is especially true at Haleon, where a large amount is invested in the global marketing of its products. Enter Marketing Mix Models (MMM) — a powerful analytical and predictive tool that we use at Haleon to better understand and optimise our marketing strategies. This article delves into the intricacies of MMM, how it is used within Haleon, and its role in driving informed decision-making within marketing.

Haleon products on display.
Haleon produces a range of products (examples shown above) that can use MMM to drive the efficiency of their marketing.

Why Do We Use MMM?

MMM is used to objectively measure the effectiveness of different marketing activities and to quantify the impact of external influences on the sales volume of Haleon products. Through this analysis, we can observe the following benefits:

  • Improved Resource Allocation: Identify high-impact marketing activities in order to prioritise their budget allocation, allowing us to focus our spend on areas with the highest ROI.
  • Informed Decision-Making: Enable marketing managers to fine-tune their strategies based on empirical evidence and regularly updated results.
  • Holistic View of Performance: Understand the interconnectedness of various marketing elements and their cumulative effect and avoid narrow-sighted evaluations by considering the broader marketing ecosystem.

Haleon’s in-house MMM solutions provide the above benefits through statistical modelling of aggregated historical data covering marketing spend, sales and external factors. By modelling with both simple and conditional linear models, we are able to determine the contribution of each marketing activity to sales volume and relay this information to marketing specialists. This allows them to shift marketing spend towards higher-impact channels and thus maximise ROI and marketing effectiveness.

Examples of marketing activities can include:

  • Search Engine Optimisation: Budget invested to improve a product’s search engine presence.
  • Social Media Adverts: Adverts on popular sites such as Instagram or Facebook.
  • Pricing Promotions: Temporary reductions in price to increase sales of a product.
  • TV Adverts: Adverts aired on local and international television.

Our Approach to MMM

Data

Data Collection: The first step is to gather historical data on various marketing variables, sales information and external factors such as seasonality, the economic environment and COVID. The more comprehensive the set of variables used, the more descriptive the model can become, which enriches the insights we can provide to our stakeholders. With this in mind, we devote much of our attention to sourcing and understanding the available data.

Data Preprocessing: Once collected, this data needs to be cleaned and organised to remove errors, inconsistencies, and duplicates. Following this step, an important decision must be made on what granularity (level) to model at. This choice directly influences the decisions that stakeholders will be able to make. A higher-level (coarser) analysis gives more general results and may lack the detail needed to take meaningful action, whereas a lower-level (finer) analysis may not have enough data to support a good-quality model. Furthermore, if a finer granularity is used, more models may need to be created to fully cover the range of products being marketed.

Choices of granularity include modelling with weekly vs. monthly data, modelling at a brand vs. sub-brand level, or using campaign-level vs. driver-level data. These choices are usually shaped by the quality of the data that is available for the market or product in question. While modelling at a granular level is usually preferable, in some cases it is not feasible due to data constraints.

3 examples of parodontax sub-brands.
Brands can have many sub brands or products within them. 3 parodontax sub-brands are shown above.

Transformations: Marketing initiatives often do not have a simple linear relationship with sales volume, so feature transformations are required to capture their effect on sales more accurately. We used two feature transformations: Adstock and Lag.

  1. Adstock: This effect is a crucial concept in marketing that underscores the prolonged impact of advertising on consumer behaviour. It acknowledges that the effects of advertising efforts do not disappear immediately after a campaign ends, but instead exhibit a delayed and lingering influence. Much like a stock, the impact of past advertising expenditure accumulates and decays over time, affecting consumer responses. Understanding Adstock allows marketers to fine-tune their advertising strategies by considering not only the immediate effects but also the longer-term repercussions of their campaigns.
  2. Lag: This incorporates the idea that an advert’s impact won’t always be observed immediately. For example, if an advert airs on a Sunday evening and most of the target demographic do their shopping on Saturdays, there might be a week-long lag between the ad and its first impact on sales volume.
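
The two transformations above can be sketched in a few lines of Python. This is a minimal illustration rather than the production implementation: it assumes a geometric (exponential-decay) form of adstock, which is one common choice, and the function names, the `decay` parameter and the spend figures are all hypothetical.

```python
def adstock(spend, decay):
    """Geometric adstock (an assumed form): each period keeps `decay`
    of the previous period's accumulated effect and adds new spend."""
    carried = 0.0
    out = []
    for x in spend:
        carried = x + decay * carried
        out.append(carried)
    return out


def lag(series, periods, fill=0.0):
    """Shift a series later in time by `periods` steps, padding the start."""
    if periods <= 0:
        return list(series)
    return ([fill] * periods + list(series))[:len(series)]


weekly_tv_spend = [100.0, 0.0, 0.0, 50.0, 0.0]  # illustrative numbers only
adstocked = adstock(weekly_tv_spend, decay=0.5)
lagged = lag(weekly_tv_spend, periods=1)

print(adstocked)  # [100.0, 50.0, 25.0, 62.5, 31.25]
print(lagged)     # [0.0, 100.0, 0.0, 0.0, 50.0]
```

In practice the decay rate and the lag length would be chosen per channel, for example by comparing model fit across a small grid of candidate values.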

Data Exploration: Once data has been collated and processed, it is important to analyse and attempt to find correlations within this data. Correlation Matrices are a useful tool for identifying these correlations, not just between input variables and sales but also between input variables themselves.

When features are highly correlated, it becomes difficult to interpret the coefficients associated with each variable in the final model. The coefficients can then become unstable, and small changes in the data can lead to significant changes in the estimated coefficients. Furthermore, correlated features can lead to misleading conclusions about the importance of individual variables. A variable that is statistically significant in a multi-collinear model might not be as important as it appears because it shares its effect with other correlated variables.
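
As a rough sketch of this check, the snippet below builds a correlation matrix with NumPy and lists pairs of input variables whose correlation exceeds a threshold. The variable names, the values and the 0.9 cut-off are assumptions made for illustration, not figures from our models.

```python
import numpy as np

# Hypothetical weekly series; names and values are illustrative only.
features = {
    "tv_spend":     [10.0, 12.0, 15.0, 11.0, 18.0, 20.0],
    "social_spend": [5.0, 6.1, 7.4, 5.6, 9.1, 9.9],  # moves almost in step with TV
    "price_promo":  [0.0, 1.0, 0.0, 0.0, 1.0, 0.0],
}
names = list(features)

# np.corrcoef treats each row as one variable by default.
corr = np.corrcoef(np.array([features[n] for n in names]))

threshold = 0.9  # assumed cut-off for "highly correlated"
flagged = [
    (names[i], names[j], round(float(corr[i, j]), 3))
    for i in range(len(names))
    for j in range(i + 1, len(names))
    if abs(corr[i, j]) > threshold
]
print(flagged)
```

Pairs flagged this way are candidates for the collinearity remedies discussed under Variable Selection.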

This is particularly prevalent in marketing, where initiatives are often synchronised to obtain the greatest possible uplift. Plotting sales volume side-by-side with input variables will also help identify any drivers that may be missing. If a large increase in sales volume is seen in a given period but not in any input variable, it may indicate that extra information is required. This information may be pertinent only to a particular brand; e.g. sales of cold and flu medications may spike during particularly cold winter months, whereas sales of allergy medication may be higher during summer periods.

Modelling

Model Architecture: In this step, statistical models are created to determine relationships between marketing variables and Key Performance Indicators (KPIs). The most common type of model used is a multiple linear regression model. Multiple linear regression assumes a linear relationship between these variables, allowing it to quantify the impact of each independent factor while controlling for the effects of others. Fitting a model using this technique will allow us to determine a coefficient for each variable and its total impact on sales.

Linear regression is the first model we try for any use case.
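
To make the idea of coefficients and contributions concrete, here is a minimal sketch of a multiple linear regression fitted by ordinary least squares with NumPy. The data is synthetic and the variable names are hypothetical; the "contribution" calculation (coefficient multiplied by the driver's values, summed over the period) is one simple way to decompose sales under a linear model and is shown here as an assumption.

```python
import numpy as np

rng = np.random.default_rng(0)
n_weeks = 104
tv = rng.uniform(0, 100, n_weeks)      # synthetic (already transformed) TV spend
promo = rng.integers(0, 2, n_weeks)    # synthetic promotion on/off flag
sales = 500 + 2.0 * tv + 80.0 * promo + rng.normal(0, 5, n_weeks)

# Design matrix with an intercept column that captures baseline sales.
X = np.column_stack([np.ones(n_weeks), tv, promo])
coef, *_ = np.linalg.lstsq(X, sales, rcond=None)
baseline, beta_tv, beta_promo = coef

# Contribution of a driver: its coefficient times its values, summed over time.
contribution = {
    "baseline": baseline * n_weeks,
    "tv": beta_tv * tv.sum(),
    "promo": beta_promo * promo.sum(),
}
share = {k: v / sales.sum() for k, v in contribution.items()}
print(coef.round(2), {k: round(float(v), 3) for k, v in share.items()})
```

Because the model includes an intercept, the fitted values sum to the observed sales, so the contribution shares add up to one.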

Another technique we use at Haleon is known as Bayesian Hierarchical Regression. This innovative approach involves constructing a hierarchical data structure, wherein multiple regression models are intricately nested. This nesting operates on different tiers:

At the highest level, the model captures global influences that span across all observed data groups. Progressing to lower levels, more refined models come into play, concentrating on factors that are distinct to each specific group. In the context of MMM, these groups could correspond to individual brands or sub-brands, or other hierarchies within the data.

A distinguishing feature of this approach lies in the application of Bayesian techniques. These techniques offer the flexibility to integrate prior beliefs and account for uncertainty when estimating parameters. This allows us to incorporate domain expertise and market-specific insights from stakeholders, providing a balance between data-driven and expert-driven insights.
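
The intuition behind the hierarchy can be illustrated with a deliberately simplified calculation. The sketch below assumes a normal-normal model with known variances, in which each brand's coefficient estimate is pulled towards a global value in proportion to how uncertain it is. All numbers are hypothetical, and a full hierarchical model would estimate the global mean and spread from the data (typically by sampling) rather than fixing them as done here.

```python
import numpy as np

# Hypothetical per-brand estimates of one media coefficient and their standard errors.
brand_estimates = np.array([1.8, 2.4, 0.2])
brand_std_errs = np.array([0.2, 0.3, 1.5])  # the third brand has little data

# Assumed top-level ("global") distribution of the coefficient across brands.
global_mean, global_sd = 2.0, 0.5

data_precision = 1.0 / brand_std_errs**2
prior_precision = 1.0 / global_sd**2

# Normal-normal posterior mean: a precision-weighted average of the
# brand-level estimate and the global mean.
pooled = (data_precision * brand_estimates + prior_precision * global_mean) / (
    data_precision + prior_precision
)
print(pooled.round(2))
```

Brands with plenty of data keep estimates close to their own values, while data-poor brands borrow more strength from the global level.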

Variable Selection: We perform this step iteratively in a process where the modeller will decide on a set of features to input into an initial model. The modeller will then analyse the performance of this model using several performance metrics to drive further changes and improvements to the model. This process will repeat until the modeller is satisfied that the model performs well and meets the business requirements.

As with many other machine learning models, there is a trade-off between the number of features used and the model overfitting to its training data. In cases with limited data, too many variables can lead to overfitting whereas too few can lead to poor explainability within the model. Ensuring that the model performs well on both training and test data can help ensure that this trade-off is being well balanced.

Each feature can also be analysed using its relative contribution. Features with contributions much larger or smaller than expected may be suffering from collinearity with other features, which can distort the results. In that case, the following steps can be taken to rectify this:

  1. Remove one or more correlated variables: Keep the most relevant variable(s) and remove the redundant ones. During initial data exploration, it can be difficult to determine which features will be the most impactful within the model. As such, collinear features may be kept until this stage, when model results are used to identify and remove the less useful features.
  2. Combine correlated variables: Create a new variable that represents the combined effect of correlated variables.
  3. Regularise the model: Techniques like Ridge Regression or Lasso Regression can help mitigate multicollinearity by introducing a penalty for large coefficients.
Equations for Lasso (L1, upper) and Ridge (L2, lower) regression. Both forms of regularisation are used to reduce the risk of overfitting.
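
For reference, Lasso adds a penalty proportional to the sum of the absolute values of the coefficients, while Ridge adds a penalty proportional to the sum of their squares. The sketch below shows the effect of the Ridge penalty on two nearly collinear drivers, using the standard closed-form solution on centred data. The data is synthetic, and the function name and the penalty strength `lam` are assumptions chosen for illustration.

```python
import numpy as np


def ridge_fit(X, y, lam):
    """Closed-form ridge on centred data: solve (X'X + lam*I) beta = X'y."""
    Xc = X - X.mean(axis=0)
    yc = y - y.mean()
    n_features = X.shape[1]
    return np.linalg.solve(Xc.T @ Xc + lam * np.eye(n_features), Xc.T @ yc)


rng = np.random.default_rng(1)
n = 60
tv = rng.normal(0, 1, n)
social = tv + rng.normal(0, 0.05, n)   # nearly collinear with TV
sales = 3.0 * tv + 3.0 * social + rng.normal(0, 1, n)
X = np.column_stack([tv, social])

ols_coef = ridge_fit(X, sales, lam=0.0)    # lam=0 reduces to ordinary least squares
ridge_coef = ridge_fit(X, sales, lam=5.0)
print(ols_coef.round(2), ridge_coef.round(2))
```

With collinear inputs the unpenalised coefficients can swing far from the true values while their sum stays roughly right; the penalty shrinks the coefficient vector and tends to stabilise the split between the two drivers.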

If a specific value or range is known for a given variable, then constraints can be used to create an upper and lower limit.

Model Evaluation: The model’s performance is assessed using various statistical measures, including R-Squared and P-Values, and by performing a residual analysis.

  1. R-Squared: This indicates the proportion of the variance in the target variable explained by the model. MAPE and RMSE may also be used to quantify model performance.
  2. P-Values: These are used to test the significance of each variable in the model.
  3. Residual Analysis: This is used to evaluate the errors of the model to identify any shortcomings.
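
The fit metrics named above are straightforward to compute. The following is a small sketch with made-up numbers; the function name is hypothetical, and P-values are left out because they are usually read from a statistics package's regression summary rather than computed by hand.

```python
import numpy as np


def evaluate(actual, predicted):
    """Return R-squared, MAPE (%), RMSE and the residuals for a set of predictions."""
    actual = np.asarray(actual, dtype=float)
    predicted = np.asarray(predicted, dtype=float)
    residuals = actual - predicted
    ss_res = float(np.sum(residuals**2))
    ss_tot = float(np.sum((actual - actual.mean()) ** 2))
    return {
        "r2": 1.0 - ss_res / ss_tot,
        "mape": float(np.mean(np.abs(residuals / actual)) * 100),  # assumes no zero actuals
        "rmse": float(np.sqrt(np.mean(residuals**2))),
        "residuals": residuals,
    }


metrics = evaluate(actual=[100, 120, 90, 110], predicted=[98, 125, 92, 105])
print({k: round(v, 3) for k, v in metrics.items() if k != "residuals"})
```

Plotting the returned residuals against time or against each driver is a simple way to spot patterns the model has missed.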

In addition to these key metrics, cross-validation is used to help detect overfitting by estimating the model’s performance on unseen data. In the context of MMM, where multiple variables and complex interactions are at play, cross-validation involves dividing the dataset into multiple subsets (folds). The model is trained on some of the folds and evaluated on a fold that was held out of training. This process is repeated multiple times, with different subsets used for training and validation in each iteration.

By assessing the model’s performance across various subsets in the data, cross-validation provides a more robust estimate of how well the model is likely to generalise to new market scenarios. It helps ensure that the model is capturing meaningful patterns while avoiding overfitting, thus enhancing its reliability and predictive accuracy in MMM tasks.

Cross-validation is used to ensure the model is robust. An example of how training and test sets can be allocated in time-series data is shown above.
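
One way to allocate training and test sets in time-series data is an expanding window, in which every test block comes strictly after the data used for training. The sketch below assumes that scheme; the function name and parameters are hypothetical, and other layouts (such as a sliding window) are equally possible.

```python
def expanding_window_splits(n_periods, n_folds, test_size):
    """Yield (train_indices, test_indices) pairs in which each test block
    immediately follows its training block in time."""
    first_test_start = n_periods - n_folds * test_size
    if first_test_start <= 0:
        raise ValueError("not enough periods for the requested folds")
    for fold in range(n_folds):
        test_start = first_test_start + fold * test_size
        train = list(range(0, test_start))
        test = list(range(test_start, test_start + test_size))
        yield train, test


splits = list(expanding_window_splits(n_periods=12, n_folds=3, test_size=2))
for train, test in splits:
    print(f"train 0..{train[-1]}  test {test[0]}..{test[-1]}")
```

Keeping the test periods later than the training periods avoids leaking future information into the model, which a random shuffle of weeks would not guarantee.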

Model Updates: Markets and consumer behaviour change over time, so it’s essential to regularly update MMMs to keep them accurate and relevant to the current market. As new sales and marketing data is obtained, the model can be refreshed to show how the effectiveness of marketing initiatives has improved or declined.

Industry Applications and Success Stories

MMM has found widespread applications across various industries, revolutionising marketing strategies and resource allocation decisions. For these industries, MMM has been instrumental in optimising advertising spend, helping brands determine the most effective channels and timing for their campaigns.

Case Study 1

At Haleon, we’ve used our in-house MMM models to bring dynamic solutions to previously underserved brands and markets in EMEA. The insights generated from our models have been surfaced on a dashboard that our stakeholders can now use to gain a deeper understanding of how their decisions impact sales volumes. The initial use case piloted 5 brands operating in an initial market. Following its initial success, it has since expanded to further markets and brands, and it represents an important step forward for the Data Science team at Haleon.

Some of Haleon’s brands are shown above. MMM is being implemented across many of these and more.

Case Study 2

Resident, an e-commerce company, embraced MMM as its primary means of determining monthly budget allocations. It witnessed a remarkable 20% quarter-over-quarter growth in overall US revenue across all its products. The integration of MMM not only facilitated Resident’s increase in profitability but also offered invaluable insights into its marketing mix, empowering the company to make informed and transformative budget decisions.

Challenges and Limitations

MMM offers powerful insights, but also presents certain challenges and limitations. One key challenge is the potential over-reliance on historical data and assumptions. MMM heavily relies on past trends and patterns, which may not always accurately predict future market behaviour, especially in rapidly changing markets or during significant events like economic crises.

Additionally, the model’s effectiveness is contingent on the quality and completeness of the data, which can be difficult to obtain. Moreover, MMM assumes linearity and may struggle to capture nonlinear relationships or abrupt market shifts. In rapidly evolving markets, where consumer preferences, technologies, and competitors change swiftly, MMM might not be the most suitable tool. In such cases, real-time data analysis, machine learning models, or alternative forecasting methods might provide more timely and accurate insights.

Despite these limitations, when applied judiciously and with an awareness of its constraints, MMM can still be a valuable tool for understanding and optimising marketing budget allocation.

Conclusion

In conclusion, Marketing Mix Modelling (MMM) represents a dynamic and data-driven approach to support companies while navigating the complexities in today’s multifaceted economic landscape. As we’ve explored throughout this blog post, MMM enables organisations to make more informed decisions and optimise their marketing strategies. By harnessing historical data, advanced statistical techniques, and an understanding of market dynamics, MMM empowers businesses to decipher the intricate interplay of factors that influence their performance. As we move forward, the role of MMM will likely continue to grow, helping organisations thrive in mixed-market environments, adapt to new challenges, and unlock new opportunities in the pursuit of sustainable growth and success.
