Unlocking Influencer Marketing Success with Google’s Lightweight MMM

Aug 23, 2023

In today’s dynamic digital landscape, influencer marketing has evolved into a formidable strategy for businesses seeking to forge authentic connections with their target audiences. By harnessing the influence of individuals who command devoted followings on social media, brands can effectively engage consumers and cultivate meaningful relationships. As the prominence of influencer marketing continues to rise, optimizing costs has emerged as a pivotal pursuit for enterprises aiming to achieve heightened returns on investment (ROI). This article delves into the realm of influencer cost optimization and introduces the innovative application of Google’s lightweight Marketing Mix Modeling (MMM) algorithm to tackle this challenge head-on.

Understanding Influencer Marketing and Its Challenges:

The realm of influencer marketing revolves around collaborations with individuals who wield significant influence over their online communities. These influencers possess the capacity to shape consumer opinions and guide purchasing decisions, making them indispensable assets for modern brands. Consequently, businesses are increasingly channeling resources into influencer marketing to tap into these engaged audiences. Nevertheless, this surge in demand has ushered in a new set of complexities. Escalating influencer costs, propelled by the surging demand for their services, can strain marketing budgets. Furthermore, accurately quantifying the return on investment derived from influencer campaigns remains a complex puzzle due to the intricate interplay of diverse metrics.

The Power of Marketing Mix Modeling (MMM):

Marketing Mix Modeling (MMM) stands as a time-tested analytical technique renowned for dissecting the effectiveness of diverse marketing strategies. By scrutinizing the impact of assorted variables on business outcomes, MMM provides invaluable insights into the contribution of marketing endeavors to sales and ROI. Traditionally, MMM has been harnessed to analyze a spectrum of marketing channels encompassing television, radio, print, and digital advertising. Now, the innovative lightweight MMM model from Google introduces a streamlined and agile approach that champions simplicity, speed, and scalability. This renders it an ideal companion for navigating the fluid landscape of influencer marketing.

Applying Lightweight MMM to Influencer Marketing:

The adaptability of the lightweight MMM model to influencer marketing heralds a structured framework for cost optimization and ROI maximization. To operationalize this model effectively, several pivotal variables and data inputs come into play:

Influencer Costs: A meticulous record of expenses linked to influencer collaborations serves as the bedrock for cost analysis.
Engagement Metrics: Metrics encompassing likes, comments, shares, and click-through rates offer insights into the reach and resonance of influencer-generated content.
Sales/Conversions: Tracing the correlation between influencer-driven campaigns and tangible sales or conversions provides a tangible gauge of their impact.
Audience Demographics: A deep comprehension of the demographics and preferences characterizing an influencer’s audience facilitates precise targeting of key consumer segments.
Competitive Landscape: Scrutinizing how influencer efforts measure up against competitors’ strategies confers a strategic edge.

Case Study: A Glimpse Into Practical Implementation:

To underscore the practicality and impact of Google’s lightweight MMM in influencer marketing, a recent project serves as a compelling case study. Tasked by a client to optimize influencer-related expenditures and maximize ROI, the challenge encompassed identifying the most influential brand-associated figures driving substantial sales. Additionally, the project required strategic allocation of the weekly budget across these influential figures and other pivotal social media channels. This intricate puzzle found its solution through the precision of Marketing Mix Modeling, specifically leveraging Google’s innovative “Lightweight Marketing Mix Modeling” algorithm.

Step-by-Step Execution: From Data to Insights:

Embarking on the path towards optimizing influencer costs involves a structured sequence of steps that I am excited to guide you through. I’ve successfully navigated this journey using the Lightweight Marketing Mix Modeling (LMMM) approach. To provide you with a practical insight into this process, I’ve prepared an illustrative example that you can access through the provided link to the Google Colab Notebook.

This notebook serves as a virtual guide, enabling you to grasp the essence of each step with ease. Allow me to lead you through these steps in a straightforward yet informative manner.

01. Data Collection and Preparation:

The consolidation of data from Facebook, Google Ads, and TikTok was accomplished using Airbyte, optimizing cost-effectiveness.
The creation of a comprehensive data warehouse within BigQuery facilitated seamless analysis.
A unified dataset comprising influencer metrics, media spend, payouts, impressions, and sales was curated.

02. Model Development:

The Google Colab environment provided an interactive space to explore, comprehend, and apply the Lightweight MMM model.

!pip install --upgrade git+https://github.com/google/lightweight_mmm.git
!pip uninstall -y matplotlib
!pip install matplotlib==3.1.3

Libraries such as JAX, NumPyro, and pandas were harnessed to build a robust analysis environment.

import jax.numpy as jnp
import numpyro
import pandas as pd

The essential lightweight_mmm modules, including preprocessing, plot, and optimize_media, were imported to empower the analysis.

from lightweight_mmm import lightweight_mmm
from lightweight_mmm import optimize_media
from lightweight_mmm import plot
from lightweight_mmm import preprocessing
from lightweight_mmm import utils

03. Uploading the Sample Dataset and Defining the Data Variables:

Upload the dataset to Colab and load it into the environment.

csv = "/content/sample_data.csv"
df = pd.read_csv(csv)
df

Define the variables

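# Display labels for the plots (illustrative; any labels in channel order work).
media_names = ['Influ_1', 'Influ_2', 'Influ_3', 'Influ_4', 'Influ_5', 'Google', 'Facebook']
media_data = df[['Influ_1_views', 'Influ_2_views', 'Influ_3_views', 'Influ_4_views', 'Influ_5_views', 'Google_Impressions', 'Facebook_Impressions']].to_numpy()
target = df[['Sales']].to_numpy()
# Cost columns follow the same channel order as media_data (Google before Facebook).
costs = df[['Influ_1_s', 'Influ_2_s', 'Influ_3_s', 'Influ_4_s', 'Influ_5_s', 'gads_s', 'fb_s']].sum().to_numpy()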

Media data refers to the metrics that showcase how well a marketing channel is performing, including aspects like impressions, views, clicks, and more. On the other hand, the target variable represents the source of revenue generation. This could involve actual sales or other conversions that contribute to a company’s income. While there might be a technical distinction between revenue and conversion, for practical purposes, we treat them similarly, and the model accommodates this approach effectively.

Additionally, it’s essential to establish the costs associated with each of the media channels mentioned earlier. If you find yourself in a situation where you have information about costs but not about specific media performance, you can still work with the model. In this case, you would fill the media data variable with the cost data. A critical point to note is that the media data and the cost data must describe the same channels in the same order: the number of columns in the media data has to equal the number of entries in the cost vector, with each cost lined up against its own channel. This alignment ensures the accuracy and reliability of the analysis.
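
One way to keep the two in step is to declare each channel once and derive both column lists from that single mapping. The sketch below is only an illustration: `CHANNELS` and `column_lists` are hypothetical names, the column names are the ones from the sample dataset above, and it assumes that `gads_s` and `fb_s` hold the Google Ads and Facebook spend.

```python
# Hypothetical helper: declare each channel once so that media columns
# and cost columns cannot drift out of order.
CHANNELS = {
    "Influ_1": ("Influ_1_views", "Influ_1_s"),
    "Influ_2": ("Influ_2_views", "Influ_2_s"),
    "Influ_3": ("Influ_3_views", "Influ_3_s"),
    "Influ_4": ("Influ_4_views", "Influ_4_s"),
    "Influ_5": ("Influ_5_views", "Influ_5_s"),
    "Google": ("Google_Impressions", "gads_s"),    # assumed: gads_s = Google Ads spend
    "Facebook": ("Facebook_Impressions", "fb_s"),  # assumed: fb_s = Facebook spend
}


def column_lists(channels):
    """Return (names, media_columns, cost_columns) in one consistent order."""
    names = list(channels)
    media_cols = [channels[name][0] for name in names]
    cost_cols = [channels[name][1] for name in names]
    return names, media_cols, cost_cols


media_names, media_cols, cost_cols = column_lists(CHANNELS)
```

With a DataFrame `df`, `df[media_cols].to_numpy()` and `df[cost_cols].sum().to_numpy()` would then line up column for column.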

04. Data Scaling and Splitting:

Splitting the dataset into training and testing subsets ensured thorough model evaluation.

# Split and scale data.
data_size = media_data.shape[0]
split_point = data_size - 30  # hold out the last 30 rows for testing
# Media data
media_data_train = media_data[:split_point, ...]
media_data_test = media_data[split_point:, ...]
# Target
target_train = target[:split_point].reshape(-1)

Media and target data underwent scaling to enhance model accuracy and normalize costs.

media_scaler = preprocessing.CustomScaler(divide_operation=jnp.mean)
target_scaler = preprocessing.CustomScaler(divide_operation=jnp.mean)
cost_scaler = preprocessing.CustomScaler(divide_operation=jnp.mean)
media_data_train = media_scaler.fit_transform(media_data_train)
target_train = target_scaler.fit_transform(target_train)
costs2 = cost_scaler.fit_transform(costs)

This scaling step is what allows LMMM to function well. Dividing each media series, the target, and the costs by its own mean puts quantities measured in very different units (views, impressions, currency) on a comparable scale centred around 1, which keeps the model’s priors sensible and simplifies interpretation. The fitted scalers are kept so that results can later be converted back to the original units.
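
To make the idea concrete, here is a minimal plain-Python sketch of mean scaling and its inverse. It mirrors the concept behind `CustomScaler(divide_operation=jnp.mean)` rather than the library’s code; the function names and the sales figures are made up for illustration.

```python
def fit_mean_scale(values):
    """'Fit' step: the divisor is simply the mean of the series."""
    return sum(values) / len(values)


def transform(values, divisor):
    """Scale a series so that its mean becomes 1."""
    return [v / divisor for v in values]


def inverse_transform(scaled, divisor):
    """Convert scaled values back to the original units."""
    return [s * divisor for s in scaled]


weekly_sales = [120.0, 80.0, 100.0]  # made-up numbers for illustration
divisor = fit_mean_scale(weekly_sales)
scaled = transform(weekly_sales, divisor)
restored = inverse_transform(scaled, divisor)
```

The important design point is that the divisor is learned from the training data only and then reused for the test data and for converting predictions back.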

05. Model Configuration and Fitting:

The lightweight MMM model was initialized with the “carryover” method, capturing lagged effects of media on sales.

mmm = lightweight_mmm.LightweightMMM(model_name="carryover")

Next, a pivotal decision comes into play: the choice of how lagged and saturating media effects are modeled. Because the influence of a media channel on sales can exhibit a lagged effect that gradually diminishes over time, the Lightweight MMM architecture offers three distinct approaches to capture this phenomenon. Users are strongly encouraged to fit and compare all three, ultimately opting for the one that yields the best results on their data.

The three approaches are as follows:

Adstock: This method employs an infinite lag mechanism, progressively reducing the weight of a media channel’s impact as time advances.
Hill-Adstock: Here, a sigmoid-like function is applied to model diminishing returns in the output of the adstock function. This approach accommodates the nuanced effects of media influence.
Carryover: Operating with a causal convolution, this approach assigns greater weight to values closer in time and gradually decreases this weight for more distant values.
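
The three transformations described above can be sketched in plain Python. These are simplified illustrations of the ideas, not the library’s implementation: the function names, parameter names, and the example values are all assumptions chosen for readability.

```python
def adstock(media, lag_weight):
    """Geometric adstock: each period keeps lag_weight of the previous adstocked value."""
    out, carried = [], 0.0
    for x in media:
        carried = x + lag_weight * carried
        out.append(carried)
    return out


def hill(x, half_saturation, slope):
    """Hill curve: response rises towards 1 and equals 0.5 at x == half_saturation."""
    if x <= 0:
        return 0.0
    return 1.0 / (1.0 + (x / half_saturation) ** (-slope))


def carryover(media, retention_rate, peak_delay, number_lags):
    """Causal weighted sum of current and past values; weights peak at peak_delay."""
    weights = [retention_rate ** ((lag - peak_delay) ** 2) for lag in range(number_lags)]
    total = sum(weights)
    out = []
    for t in range(len(media)):
        acc = 0.0
        for lag, w in enumerate(weights):
            if t - lag >= 0:  # only look backwards in time
                acc += w * media[t - lag]
        out.append(acc / total)
    return out


impulse = [1.0, 0.0, 0.0, 0.0]  # a single burst of activity in period 0
adstocked = adstock(impulse, lag_weight=0.5)
hill_adstocked = [hill(v, half_saturation=1.0, slope=1.0) for v in adstocked]
carried = carryover(impulse, retention_rate=0.5, peak_delay=0, number_lags=3)
```

Feeding a single impulse through each function shows the difference: adstock decays without ever cutting off, Hill-Adstock additionally compresses large values, and carryover spreads the effect over a fixed window of lags.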

For the sake of simplicity and to uphold accuracy, we’ve chosen the “carryover” method. This approach harmonizes with our study’s objectives, facilitating a clear and effective analysis of influencer marketing impact.

Warmup and sampling parameters were tuned to balance accuracy and processing efficiency.

number_warmup=100
number_samples=100

Next, it’s important to determine the number of warmup iterations and samples. Generally, opting for larger values enhances the accuracy of the model. However, it’s important to note that increasing these values also extends the time required for the analysis to complete. So, while a higher number of warmups and samples yield more precise results, they come at the cost of increased processing time. Finding the right balance ensures a robust analysis within a reasonable timeframe.
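
Why do more samples help? The Monte Carlo error of an estimate such as a posterior mean shrinks roughly with the square root of the number of effective samples. The sketch below illustrates this with independent normal draws standing in for posterior samples. That is a simplifying assumption, since real MCMC draws are autocorrelated and warmup draws are discarded; `spread_of_mean_estimates` is a hypothetical helper.

```python
import random
import statistics


def spread_of_mean_estimates(n_samples, n_repeats=200, seed=0):
    """Standard deviation of the sample mean across repeated runs of n_samples draws."""
    rng = random.Random(seed)
    means = []
    for _ in range(n_repeats):
        draws = [rng.gauss(0.0, 1.0) for _ in range(n_samples)]
        means.append(sum(draws) / n_samples)
    return statistics.stdev(means)


spread_100 = spread_of_mean_estimates(100)
spread_1000 = spread_of_mean_estimates(1000)
```

Going from 100 to 1,000 draws should shrink the spread by roughly a factor of three, which is why ten times the sampling cost does not buy ten times the precision.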

Model fitting was executed using training data, encompassing media, media_prior, and target inputs.

mmm.fit(
    media=media_data_train,
    media_prior=costs2,
    target=target_train,
    number_warmup=number_warmup,
    number_samples=number_samples,
    number_chains=1,
)

06. Evaluation and Visualization:

The model’s accuracy and efficacy were evaluated through visualization tools and metrics like R-squared and MAPE.

Using the following command, we can plot the posterior distribution of each media channel’s effect. When a channel’s distribution is concentrated towards the left (close to zero), it indicates a lower estimated effectiveness for that channel. This visualization offers a straightforward way to comprehend and compare the impact of various media channels in influencing desired outcomes.

plot.plot_media_channel_posteriors(media_mix_model=mmm, channel_names=media_names)

The assessment of our model’s performance on the training data is represented by two important metrics: R-squared (R2) and Mean Absolute Percentage Error (MAPE). These metrics provide valuable insights into how well our model aligns with the actual data.

R-squared (R2): This metric gauges the proportion of the variation in the target variable (e.g., sales) that our model can explain. A higher R2 value signifies that our model captures a significant portion of the data’s variability, which is desirable.
Mean Absolute Percentage Error (MAPE): This metric measures the average difference between our model’s predictions and the actual values, expressed as a percentage of the actual values. A lower MAPE indicates that our model’s predictions are closer to the actual values.
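
Both metrics are easy to compute by hand, which helps when sanity-checking reported values. The following is a minimal sketch with made-up sales figures; the function names are illustrative and not part of the library.

```python
def r_squared(actual, predicted):
    """Share of the variance in `actual` explained by `predicted`."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual, predicted):
    """Mean absolute percentage error in percent; assumes no actual value is zero."""
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


actual = [100.0, 200.0, 300.0]       # made-up sales
predicted = [110.0, 190.0, 300.0]    # made-up predictions
r2 = r_squared(actual, predicted)    # 1 - 200 / 20000 = 0.99
error_pct = mape(actual, predicted)  # (10% + 5% + 0%) / 3 = 5%
```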

As a common rule of thumb, an R2 value exceeding 0.8 and a MAPE value below 15% are taken as indicative of a well-performing model. However, it’s important to note that these benchmarks can vary based on the unique characteristics of each business scenario. These metrics serve as valuable guides in evaluating our model’s effectiveness, helping us make informed decisions about its reliability and suitability for our specific needs.

Model fit plots, out-of-sample assessments, and media channel effectiveness posteriors provided actionable insights.

Once our model has been trained and fine-tuned, it’s crucial to assess how well it performs. We achieve this through visualization and evaluation.

plot.plot_model_fit(mmm, target_scaler=target_scaler)

The function above generates a graph that illustrates how closely our model’s predictions align with the actual data we used for training; its title also reports the R2 and MAPE values discussed earlier. In essence, this graph showcases the accuracy of our model in capturing real-world patterns.

Moreover, we’re not limited to only the data we’ve used for training. We can also gauge how our model fares with unseen data by predicting on the held-out test period and plotting the result with the function below.

new_predictions = mmm.predict(media=media_scaler.transform(media_data_test))
plot.plot_out_of_sample_model_fit(out_of_sample_predictions=new_predictions, out_of_sample_target=target_scaler.transform(target[split_point:].squeeze()))

This function produces a similar graph, but this time it demonstrates the model’s performance on data it hasn’t encountered during training. This is essential to ensure that our model can generalize well to new situations.

Once we’re confident in our model’s performance, we can transition into the exciting phase of gaining insights from the Lightweight MMM (LMMM). These insights offer a deeper understanding of how different media channels contribute to our goals, enabling us to make informed decisions that optimize our influencer marketing strategy.

07. Deriving Insights and Optimization:

The model’s outcomes facilitated the derivation of media insights, including contribution percentages and ROI evaluations.

Three views are particularly useful here: each channel’s contribution, its estimated ROI, and its saturation curve.

a. Understanding Media Contribution:

By executing the command,

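media_contribution, roi_hat = mmm.get_posterior_metrics(target_scaler=target_scaler, cost_scaler=cost_scaler)
plot.plot_bars_media_metrics(metric=media_contribution, metric_name="Media Contribution Percentage", channel_names=media_names)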

we gain a clear understanding of each media channel’s contribution as a percentage. This information helps us discern the relative impact of different channels on the overall marketing strategy. Identifying the media channels with the highest contribution percentage guides informed decision-making regarding resource allocation.

b. Analyzing ROI Potential:

The command,

plot.plot_bars_media_metrics(metric=roi_hat, metric_name="ROI hat", channel_names=media_names)

enables us to assess the potential Return on Investment (ROI) from each media channel. By evaluating the “ROI hat” metric, we can pinpoint the media channels that hold the promise of delivering the best returns. This aids in prioritizing channels that align with our strategic goals and yield optimal financial outcomes.
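
Contribution and ROI answer different questions, and a channel can rank differently on each. The sketch below shows the arithmetic with made-up numbers. It is a simplification for intuition only: the model works with full posterior distributions, and `roi_per_channel` is a hypothetical helper.

```python
def roi_per_channel(contribution_share, total_sales, spend):
    """ROI = sales attributed to a channel divided by the spend on that channel."""
    return {
        channel: contribution_share[channel] * total_sales / spend[channel]
        for channel in spend
    }


# Made-up numbers for illustration only.
contribution_share = {"Influ_1": 0.10, "Facebook": 0.20}
spend = {"Influ_1": 5_000.0, "Facebook": 20_000.0}
roi = roi_per_channel(contribution_share, total_sales=100_000.0, spend=spend)
```

In this toy case Facebook contributes twice as much to sales as the influencer, yet the influencer returns twice as much per unit of spend, which is exactly the kind of trade-off the two bar charts make visible.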

c. Revealing Saturation Curves:

Employing the command,

plot.plot_response_curves(media_mix_model=mmm, target_scaler=target_scaler)

we gain access to saturation curves for individual media channels. These curves offer a visual representation of how media channels perform concerning normalized spending. By studying these curves, we uncover which channels exhibit maximum effectiveness with increasing investment. This insight empowers organizations to strategically direct their marketing efforts towards channels that yield the greatest returns, ultimately maximizing their investment outcomes.

In essence, these commands serve as analytical tools that distill complex data into actionable insights. By visualizing media contribution, assessing ROI potential, and understanding saturation curves, businesses can make well-informed choices about resource allocation and prioritize channels that offer the highest likelihood of success.

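Finally, the optimized budget allocation was plotted against the previous one, guiding effective allocation of resources for maximum impact.

# Inputs are assumed to come from optimize_media.find_optimal_budgets (not shown here).
# "optimal_buget_allocation" is spelled as in the library's parameter name.
plot.plot_pre_post_budget_allocation_comparison(media_mix_model=mmm,
    kpi_with_optim=solution['fun'],
    kpi_without_optim=kpi_without_optim,
    optimal_buget_allocation=optimal_buget_allocation,
    previous_budget_allocation=previous_budget_allocation)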
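
The intuition behind budget optimization can be shown with a toy example: keep handing the next unit of budget to whichever channel currently offers the largest marginal gain. This greedy sketch is not the library’s optimizer, and the response curves, parameter values, and function names below are hypothetical.

```python
def response(spend, scale, half_saturation):
    """Toy saturating response curve (hypothetical, not fitted to any data)."""
    return scale * spend / (spend + half_saturation)


def marginal_gain(current, scale, half_saturation, step):
    """Extra response from spending `step` more at the current spend level."""
    return response(current + step, scale, half_saturation) - response(current, scale, half_saturation)


def allocate(budget, channels, step=1.0):
    """Greedily give each `step` of budget to the channel with the largest marginal gain."""
    allocation = {name: 0.0 for name in channels}
    for _ in range(int(budget // step)):
        best = max(
            channels,
            key=lambda name: marginal_gain(allocation[name], *channels[name], step),
        )
        allocation[best] += step
    return allocation


def total_response(allocation, channels):
    """Sum of the toy responses across all channels for a given allocation."""
    return sum(response(allocation[name], *channels[name]) for name in channels)


# (scale, half_saturation) per channel; made-up values.
channels = {"Influ_1": (100.0, 20.0), "Facebook": (100.0, 80.0)}
allocation = allocate(budget=100.0, channels=channels)
equal_split = {"Influ_1": 50.0, "Facebook": 50.0}
```

Because each curve flattens as spend grows, the budget ends up spread across channels rather than concentrated in the one that looks best at low spend, and the resulting total response is at least as high as an equal split.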

Conclusion: Pioneering a New Era in Influencer Marketing Optimization:

In a rapidly evolving landscape, influencer marketing stands as a powerful avenue for brands to engage audiences. The application of Google’s lightweight MMM introduces a new era of strategic optimization, equipping businesses with the tools to make informed decisions. Through meticulous data handling, precise model development, and actionable insights, influencer cost optimization emerges as a reality, enabling brands to unlock the full potential of influencer partnerships while amplifying their returns on investment. As businesses navigate the ever-changing digital ecosystem, the innovation inherent in lightweight MMM proves to be an invaluable ally, propelling influencer marketing towards unprecedented success.