Time-Series Market Mix Modeling for Sales Forecasting vs Advertising Expenses using Azure Blob Storage

Nithya Thimmaraju
12 min read · Jun 26, 2024


Introduction to Market Mix Modeling and Forecasting

Market mix modeling is an analytical approach used to evaluate the effectiveness of various marketing strategies on sales and demand. This method considers a combination of factors like advertising expenses, consumer behavior metrics, and external economic indicators. In this project, we utilize a comprehensive dataset from Shenzhen city, which includes data on demand, sales, supply, Point of Sale (POS) data, and advertising expenditures across multiple channels such as Mobile SMS, Newspaper Ads, Radio, TV, Posters, and the Internet. These metrics are supplemented by Gross Rating Points (GRP), which measure the effectiveness of advertising campaigns.

Understanding how various advertising expenses impact sales is crucial for optimizing marketing budgets. In this blog post, we’ll walk through a comprehensive approach to analyze and predict sales based on different advertising platforms using a market mix model. This involves several steps including data preprocessing, feature engineering, model training, and providing actionable insights.

Step 1: Utilizing Azure Blob Storage Container to Store and Load Data into the Model

First, we ensure our data is securely stored and easily accessible. Azure Blob Storage is a great solution for this, allowing us to load the data directly into our model for analysis. Here’s a step-by-step guide on how to use Azure Blob Storage:

Create an Azure Storage Account:

  • Go to the Azure Portal and create a new storage account.
  • Choose a unique name and select the appropriate region and performance options.

Create a Blob Container:

  • Navigate to your storage account and select “Blob service” -> “Containers”.
  • Create a new container, give it a name, and set the access level (private, blob, or container).

Upload Data to the Blob Container:

  • Use the Azure Portal to upload your dataset to the newly created blob container.
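If you prefer to script this step rather than use the portal, the same azure-storage-blob SDK used in the next step can upload the file as well. This is a minimal sketch, assuming the container already exists and reusing the account, container, and file names from this post:

from azure.storage.blob import BlobServiceClient

# Connection string for the storage account (replace the key with your own access key)
connect_str = "DefaultEndpointsProtocol=https;AccountName=marketmixmodel;AccountKey=<your-account-key>;EndpointSuffix=core.windows.net"
blob_service_client = BlobServiceClient.from_connection_string(connect_str)

# Upload the local Excel file into the 'datacontainer' container
blob_client = blob_service_client.get_blob_client(container='datacontainer', blob='MMM_data.xlsx')
with open('MMM_data.xlsx', 'rb') as source_file:
    blob_client.upload_blob(source_file, overwrite=True)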

Access Data from Google Colab:

  • Install the Azure Storage SDK in your Colab environment, then connect to the storage account and download the dataset:
!pip install azure-storage-blob
from azure.storage.blob import BlobServiceClient
import pandas as pd

# Storage account details (replace the key with your own access key)
account_name = 'marketmixmodel'
account_key = '<your-account-key>'
container_name = 'datacontainer'

connect_str = f"DefaultEndpointsProtocol=https;AccountName={account_name};AccountKey={account_key};EndpointSuffix=core.windows.net"

# Connect to the storage account and download the dataset blob to a local file
blob_service_client = BlobServiceClient.from_connection_string(connect_str)
blob_name = 'MMM_data.xlsx'

blob_client = blob_service_client.get_blob_client(container=container_name, blob=blob_name)
with open(blob_name, "wb") as download_file:
    download_file.write(blob_client.download_blob().readall())

# Load the downloaded Excel file into a DataFrame
data = pd.read_excel(blob_name)
data.head()
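Before moving on, it’s worth a quick sanity check that the file loaded as expected: confirming the DATE column parses as a datetime (it is used for the time-series plots later) and looking for missing values. A minimal check, using column names as they appear in this dataset:

# Quick sanity checks on the loaded dataset
print(data.shape)
data['DATE'] = pd.to_datetime(data['DATE'])
print(data.isnull().sum())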

Step 2: Relationship Analysis Summary

We analyzed the relationships between demand, CPI, CCI, POS, unit price, and sales for each marketing platform. Here are the key findings:

Positive Correlations:

  • Demand and POS/Supply Data (0.92): Higher supply data is associated with higher demand.
  • Demand and Sales (0.92): As sales increase, demand also increases.
  • Sales and POS/Supply Data (0.998): Direct relationship between supply data and sales.

Negative Correlations:

  • Demand and CCI (-0.48): Higher consumer confidence might be associated with lower demand.
  • Sales and CCI (-0.47): Similar negative correlation.

Weak Correlations:

  • Demand and CPI (0.29): Weak positive relationship.
  • Demand and Unit Price (-0.02): Very weak negative relationship.
  • Advertising expenses and GRP metrics: Generally weak correlations with demand and sales.

Understanding Pearson Correlation:

The Pearson correlation coefficient measures the linear relationship between two variables, ranging from -1 to 1:

  • +1: Indicates a perfect positive linear relationship.
  • -1: Indicates a perfect negative linear relationship.
  • 0: Indicates no linear relationship.
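Formally, for paired observations (xᵢ, yᵢ) with sample means x̄ and ȳ, the coefficient is

r = Σ(xᵢ − x̄)(yᵢ − ȳ) / √( Σ(xᵢ − x̄)² · Σ(yᵢ − ȳ)² )

that is, the covariance of the two variables divided by the product of their standard deviations.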

The p-value tests the null hypothesis that the correlation is zero (no linear relationship). A low p-value (typically ≤ 0.05) indicates that the observed correlation is statistically significant.

from scipy.stats import pearsonr

# Helper that returns both the Pearson coefficient and its p-value
def pearson_correlation_test(x, y):
    corr, p_value = pearsonr(x, y)
    return corr, p_value

variables = ['Consumer Price Index (CPI)', 'Consumer Confidence Index(CCI)', 'POS/ Supply Data', 'Unit Price ($)', 'SALES ($)']
for var in variables:
    corr, p_value = pearson_correlation_test(data[var], data['DEMAND'])
    print(f'Pearson Correlation between DEMAND and {var}: {corr:.2f} (p-value: {p_value:.2e})')

    corr, p_value = pearson_correlation_test(data[var], data['SALES ($)'])
    print(f'Pearson Correlation between SALES and {var}: {corr:.2f} (p-value: {p_value:.2e})')
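As a complementary view, pandas can compute all of these Pearson coefficients at once as a correlation matrix (it does not report p-values, which is why the explicit test above is still useful):

# Pairwise Pearson correlations for the same variables in one table
print(data[['DEMAND'] + variables].corr())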

Interpretation of Statistical Test Output:

Let’s interpret each Pearson correlation and p-value from the provided output:

DEMAND and Consumer Price Index (CPI):

  • Correlation: 0.29
  • P-value: 5.72e-52
  • Interpretation: There is a weak positive linear relationship between demand and CPI. The relationship is statistically significant (p-value < 0.05).
  • Prediction: As CPI increases, demand tends to increase slightly.

SALES and Consumer Price Index (CPI):

  • Correlation: 0.23
  • P-value: 1.94e-32
  • Interpretation: There is a weak positive linear relationship between sales and CPI. The relationship is statistically significant (p-value < 0.05).
  • Prediction: As CPI increases, sales tend to increase slightly.

DEMAND and Consumer Confidence Index (CCI):

  • Correlation: -0.48
  • P-value: 3.84e-150
  • Interpretation: There is a moderate negative linear relationship between demand and CCI. The relationship is statistically significant (p-value < 0.05).
  • Prediction: As CCI increases, demand tends to decrease.

SALES and Consumer Confidence Index (CCI):

  • Correlation: -0.47
  • P-value: 4.73e-145
  • Interpretation: There is a moderate negative linear relationship between sales and CCI. The relationship is statistically significant (p-value < 0.05).
  • Prediction: As CCI increases, sales tend to decrease.

DEMAND and POS/Supply Data:

  • Correlation: 0.92
  • P-value: 0.00e+00
  • Interpretation: There is a very strong positive linear relationship between demand and POS/Supply Data. The relationship is statistically significant (p-value < 0.05).
  • Prediction: As supply data increases, demand significantly increases.

SALES and POS/Supply Data:

  • Correlation: 1.00
  • P-value: 0.00e+00
  • Interpretation: There is an almost perfect positive linear relationship between sales and POS/Supply Data. The relationship is statistically significant (p-value < 0.05).
  • Prediction: As supply data increases, sales significantly increase.

DEMAND and Unit Price ($):

  • Correlation: -0.02
  • P-value: 2.60e-01
  • Interpretation: There is a very weak negative linear relationship between demand and unit price. The relationship is not statistically significant (p-value > 0.05).
  • Prediction: Changes in unit price have little to no effect on demand.

SALES and Unit Price ($):

  • Correlation: 0.04
  • P-value: 5.51e-02
  • Interpretation: There is a very weak positive linear relationship between sales and unit price. The relationship is not statistically significant (p-value > 0.05).
  • Prediction: Changes in unit price have little to no effect on sales.

DEMAND and SALES ($):

  • Correlation: 0.92
  • P-value: 0.00e+00
  • Interpretation: There is a very strong positive linear relationship between demand and sales. The relationship is statistically significant (p-value < 0.05).
  • Prediction: As sales increase, demand significantly increases.

Step 3: Feature Engineering

Based on the output from Step 1, we’ll focus on creating new features to enhance our dataset. Feature engineering involves creating new variables from existing data to improve model performance and capture more information about the relationships between variables.

Goals for Feature Engineering:

  • Create Lag Features: Capture the effect of past values on current demand and sales. We’ll create lag features for up to 7 days (one week).
# Creating lag features for DEMAND and SALES, lagging up to a week
for lag in range(1, 8):
    data[f'DEMAND_LAG_{lag}'] = data['DEMAND'].shift(lag)
    data[f'SALES_LAG_{lag}'] = data['SALES ($)'].shift(lag)

# Drop the rows at the start of the series whose lag values are NaN
data = data.dropna()
  • Create Rolling Mean Features: Capture trends and seasonality by averaging values over a 7-day window.
# Creating rolling mean features for DEMAND and SALES over a week
data['DEMAND_ROLLING_MEAN_7'] = data['DEMAND'].rolling(window=7).mean()
data['SALES_ROLLING_MEAN_7'] = data['SALES ($)'].rolling(window=7).mean()
data = data.dropna()
  • Create Interaction Features: Capture the combined effect of two or more variables. We’ll create interaction features between advertising expenses and GRP metrics.
# Creating interaction features between each platform's advertising expenses and its GRP
# (column names follow the dataset's headers, including their original spelling and spacing)
data['SMS_GRP_INTERACTION'] = data['Advertising Expenses (SMS)'] * data['GRP(SMS)']
data['Newspaper_GRP_INTERACTION'] = data['Advertising Expenses(Newspaper ads)'] * data['GRP (NewPaper ads)']
data['Radio_GRP_INTERACTION'] = data['Advertising Expenses(Radio)'] * data['GRP(Radio']
data['TV_GRP_INTERACTION'] = data['Advertising Expenses(TV)'] * data['GRP(TV)']
data['Internet_GRP_INTERACTION'] = data['Advertising Expenses(Internet)'] * data['GRP(Internet)']
  • Normalize/Standardize Features: Ensure all features are on the same scale, which is important for many machine learning algorithms.
from sklearn.preprocessing import StandardScaler

# Selecting numeric features
numeric_features = data.select_dtypes(include=[float, int]).columns

# Standardizing the features
scaler = StandardScaler()
data[numeric_features] = scaler.fit_transform(data[numeric_features])
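A quick way to confirm the standardization worked is to check that each numeric feature now has a mean near 0 and a standard deviation near 1:

# Means should be close to 0 and standard deviations close to 1 after scaling
print(data[numeric_features].describe().loc[['mean', 'std']])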

Step 4: Understanding and Calculating Gross Rating Point (GRP)

What is the Gross Rating Point (GRP)? Gross Rating Point (GRP) is a measure used in advertising to gauge the strength of an ad campaign. It is calculated by multiplying the percentage of the target audience reached by an advertisement by the frequency with which they see the ad.

Essentially: GRP = Reach (%) × Frequency

  • Reach: The percentage of the target audience exposed to the ad at least once.
  • Frequency: The average number of times the target audience is exposed to the ad.

GRP helps advertisers understand the total exposure of their campaign, which can then be used to evaluate the effectiveness of different advertising channels.
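For example, a campaign that reaches 40% of its target audience an average of three times earns 40 × 3 = 120 GRPs.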

Calculation of GRP: Since we already have GRP values provided in our dataset for different advertising platforms (SMS, Newspaper ads, Radio, TV, and Internet), we don’t need to calculate it manually. Instead, we will use these values to analyze the advertising campaign's effectiveness.

Using GRP for Campaign Effectiveness: Steps to Analyze Advertising Campaign Effectiveness:

  • Visualize GRP Over Time: Plot the GRP for each advertising platform over time to understand their trends.
  • Correlation Analysis Between GRP and Sales: Calculate the correlation between GRP values for each platform and sales to understand their impact.
  • Interaction Analysis Between GRP and Advertising Expenses: Create interaction features between GRP and advertising expenses to capture their combined effect on sales.
# Advertising Expenses and GRP over time for different platforms
import plotly.graph_objects as go
from plotly.subplots import make_subplots

fig = make_subplots(rows=5, cols=2, subplot_titles=[
    'Advertising Expenses (SMS)', 'GRP (SMS)',
    'Advertising Expenses (Newspaper ads)', 'GRP (Newspaper ads)',
    'Advertising Expenses (Radio)', 'GRP (Radio)',
    'Advertising Expenses (TV)', 'GRP (TV)',
    'Advertising Expenses (Internet)', 'GRP (Internet)'
])

# SMS
fig.add_trace(go.Scatter(x=data['DATE'], y=data['Advertising Expenses (SMS)'], mode='lines', name='Advertising Expenses (SMS)'), row=1, col=1)
fig.add_trace(go.Scatter(x=data['DATE'], y=data['GRP(SMS)'], mode='lines', name='GRP (SMS)'), row=1, col=2)

# Newspaper ads
fig.add_trace(go.Scatter(x=data['DATE'], y=data['Advertising Expenses(Newspaper ads)'], mode='lines', name='Advertising Expenses (Newspaper ads)'), row=2, col=1)
fig.add_trace(go.Scatter(x=data['DATE'], y=data['GRP (NewPaper ads)'], mode='lines', name='GRP (Newspaper ads)'), row=2, col=2)

# Radio
fig.add_trace(go.Scatter(x=data['DATE'], y=data['Advertising Expenses(Radio)'], mode='lines', name='Advertising Expenses (Radio)'), row=3, col=1)
fig.add_trace(go.Scatter(x=data['DATE'], y=data['GRP(Radio'], mode='lines', name='GRP (Radio)'), row=3, col=2)

# TV
fig.add_trace(go.Scatter(x=data['DATE'], y=data['Advertising Expenses(TV)'], mode='lines', name='Advertising Expenses (TV)'), row=4, col=1)
fig.add_trace(go.Scatter(x=data['DATE'], y=data['GRP(TV)'], mode='lines', name='GRP (TV)'), row=4, col=2)

# Internet
fig.add_trace(go.Scatter(x=data['DATE'], y=data['Advertising Expenses(Internet)'], mode='lines', name='Advertising Expenses (Internet)'), row=5, col=1)
fig.add_trace(go.Scatter(x=data['DATE'], y=data['GRP(Internet)'], mode='lines', name='GRP (Internet)'), row=5, col=2)

fig.update_layout(title='Advertising Expenses and GRP Over Time for Different Platforms', height=2000, showlegend=False)
fig.show()
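For the second item in the list above, the correlation between each platform’s GRP and sales can be computed with the same pearson_correlation_test helper defined earlier (GRP column names written exactly as they appear in this dataset):

# Correlation between each platform's GRP and sales
grp_columns = ['GRP(SMS)', 'GRP (NewPaper ads)', 'GRP(Radio', 'GRP(TV)', 'GRP(Internet)']
for col in grp_columns:
    corr, p_value = pearson_correlation_test(data[col], data['SALES ($)'])
    print(f'Pearson Correlation between SALES and {col}: {corr:.2f} (p-value: {p_value:.2e})')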

Step 5: Visualizing Demand, Supply, and Sales Profit Trends for Each Marketing Platform

In this step, we will create visualizations to analyze trends in demand, supply, and sales profit. We will also identify which marketing platform impacts sales the most on a daily basis.

Steps to Visualize and Analyze Trends:

  1. Visualize Demand, Supply, and Sales Over Time: Plot demand, supply, and sales over time to understand their trends.
  2. Analyze Marketing Platform Impact on Sales: Create scatter plots to visualize the relationship between sales and advertising expenses for each marketing platform.
  3. Visualize Advertising Expenses and GRP for Each Platform: Plot advertising expenses and GRP for each platform over time to understand their trends and combined effects.
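The code for these plots follows the same plotly pattern as the GRP charts above. Here is a minimal sketch of the first two steps (demand, supply, and sales over time, plus one sales-versus-expenses scatter), assuming the column names used elsewhere in this post:

import plotly.graph_objects as go

# Step 1: demand, supply, and sales over time on a single figure
fig = go.Figure()
fig.add_trace(go.Scatter(x=data['DATE'], y=data['DEMAND'], mode='lines', name='Demand'))
fig.add_trace(go.Scatter(x=data['DATE'], y=data['POS/ Supply Data'], mode='lines', name='POS/Supply'))
fig.add_trace(go.Scatter(x=data['DATE'], y=data['SALES ($)'], mode='lines', name='Sales ($)'))
fig.update_layout(title='Demand, Supply, and Sales Over Time')
fig.show()

# Step 2 (example): sales vs. advertising expenses for one platform (SMS)
scatter = go.Figure(go.Scatter(x=data['Advertising Expenses (SMS)'], y=data['SALES ($)'], mode='markers'))
scatter.update_layout(title='Sales vs. Advertising Expenses (SMS)',
                      xaxis_title='Advertising Expenses (SMS)', yaxis_title='SALES ($)')
scatter.show()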

Interpretation and Insights from Visualizations

Demand, Supply, and Sales Over Time:

  • Observations: Demand shows significant fluctuations with noticeable peaks around 2010 and 2011. There was a period of relatively low demand from 2013 to 2015, with a noticeable increase starting in 2016. Supply mirrors the demand pattern closely, suggesting a direct relationship between supply and demand. Sales follow a similar trend to both demand and supply, with significant fluctuations around the same periods.
  • Insights: The close mirroring of demand, supply, and sales indicates a strong relationship between these variables. The peaks in demand are met with corresponding peaks in supply and sales, highlighting efficient supply chain management. The period of low activity (2013–2015) may warrant further investigation to understand underlying causes, such as market conditions or changes in advertising strategies.

Sales vs. Advertising Expenses for Each Marketing Platform:

  • Observations: SMS shows a spread of sales values across different levels of SMS advertising expenses. There is no clear trend, indicating a weak or no direct relationship. Similar observations are made for newspaper ads, radio, TV, and internet expenses.
  • Insights: The scatter plots suggest that advertising expenses alone do not directly drive sales. This implies that other factors or a combination of factors (including advertising effectiveness, market conditions, and consumer behavior) play a significant role. A more detailed analysis using interaction terms or advanced modeling techniques may be needed to uncover the true impact of advertising expenses on sales.

Advertising Expenses and GRP Over Time for Different Platforms:

  • Observations: SMS and GRP (SMS) both show a cyclical pattern, with regular peaks and troughs, indicating consistent advertising efforts over time. Similar patterns are observed for newspaper ads, radio, TV, and the internet.
  • Insights: The cyclical patterns in both advertising expenses and GRP suggest well-planned and consistent advertising campaigns across all platforms. The synchronization of peaks in expenses and GRP indicates that higher spending directly translates to higher audience reach and frequency (as measured by GRP). The pronounced peaks in GRP for the internet suggest high effectiveness in reaching and engaging the audience, even with relatively lower advertising expenses.

Final Step: Creating a Comprehensive Market Mix Analysis

We trained a linear regression model to predict sales based on advertising expenses and provided actionable insights based on the model’s coefficients.

import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_absolute_error, r2_score

# Reload the raw dataset downloaded from Azure Blob Storage earlier
data = pd.read_excel('MMM_data.xlsx')
data['DATE'] = pd.to_datetime(data['DATE'])

features = [
    'Consumer Price Index (CPI)', 'Consumer Confidence Index(CCI)', 'Producer Price Index (PPI)',
    'Unit Price ($)', 'POS/ Supply Data', 'Advertising Expenses (SMS)', 'Advertising Expenses(Newspaper ads)',
    'Advertising Expenses(Radio)', 'Advertising Expenses(TV)', 'Advertising Expenses(Internet)'
]
target = 'SALES ($)'

X = data[features]
y = data[target]

scaler = StandardScaler()
X_scaled = scaler.fit_transform(X)

# Splitting the data into training and testing sets
X_train, X_test, y_train, y_test = train_test_split(X_scaled, y, test_size=0.2, random_state=42)

# Training the linear regression model
model = LinearRegression()
model.fit(X_train, y_train)

y_pred = model.predict(X_test)

mae = mean_absolute_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print(f"Mean Absolute Error: {mae}")
print(f"R-squared: {r2}")

coefficients = pd.DataFrame({
    'Feature': features,
    'Coefficient': model.coef_
})

# Scaling each standardized coefficient by its feature's standard deviation to gauge effectiveness
coefficients['Effectiveness'] = coefficients['Coefficient'] * scaler.scale_
print(coefficients)

def predict_sales_and_recommend(budget_input):
    # Build a single-row input: non-advertising features are held fixed at 1.0
    sample_input = pd.DataFrame({
        'Consumer Price Index (CPI)': [1.0],
        'Consumer Confidence Index(CCI)': [1.0],
        'Producer Price Index (PPI)': [1.0],
        'Unit Price ($)': [1.0],
        'POS/ Supply Data': [1.0],
        'Advertising Expenses (SMS)': [budget_input['Advertising Expenses (SMS)']],
        'Advertising Expenses(Newspaper ads)': [budget_input['Advertising Expenses(Newspaper ads)']],
        'Advertising Expenses(Radio)': [budget_input['Advertising Expenses(Radio)']],
        'Advertising Expenses(TV)': [budget_input['Advertising Expenses(TV)']],
        'Advertising Expenses(Internet)': [budget_input['Advertising Expenses(Internet)']]
    })

    # Apply the same standardization used during training, then predict sales
    sample_input_standardized = scaler.transform(sample_input)
    predicted_sales = model.predict(sample_input_standardized)

    # Rank features by effectiveness and recommend the one with the highest value
    coefficients = pd.DataFrame({
        'Feature': features,
        'Coefficient': model.coef_
    })
    coefficients['Effectiveness'] = coefficients['Coefficient'] * scaler.scale_
    recommendation = coefficients.loc[coefficients['Effectiveness'].idxmax()]

    return predicted_sales[0], recommendation

# Sample budget input for prediction
budget_input = {
    'Advertising Expenses (SMS)': 1000,
    'Advertising Expenses(Newspaper ads)': 1000,
    'Advertising Expenses(Radio)': 1000,
    'Advertising Expenses(TV)': 1000,
    'Advertising Expenses(Internet)': 1000
}

# Predicting sales and providing recommendation
predicted_sales, recommendation = predict_sales_and_recommend(budget_input)

print("Predicted Sales:", predicted_sales)
print(f"Recommendation: Investing more in {recommendation['Feature']} might improve the sales.")

Conclusion

This journey into market mix modeling has been a valuable learning experience, revealing how different advertising expenses impact sales. By analyzing relationships, creating new features, and training a predictive model, we’ve developed a reliable framework for making informed decisions on advertising budgets. We’ve discovered the significant role of supply data in driving demand and sales, and we’ve gained insights into how consumer indices and advertising strategies interact.

Our predictive model, backed by detailed analysis, allows us to recommend the most effective advertising channels for maximizing sales. This project has deepened my understanding of data-driven decision-making and its power to optimize marketing efforts. The skills and knowledge gained here will be invaluable for future projects, helping to ensure that every marketing dollar is used wisely and effectively.

Dataset referenced from — https://github.com/jamesrawlins1000/Market-mix-modelling-data/blob/master/README.md
