Marketing Mix Modeling vs. Multi-touch Attribution: differences, advantages and disadvantages

Published in DP6 US · 5 min read · Oct 18, 2023

Introduction

Whenever we compare different methods of evaluating a given reality, we can be sure to come across the phrase “it depends”. This is because different methods can be more or less advantageous depending on the context in which they are applied.

The title of this text may therefore seem misleading, as we cannot say that Marketing Mix Modeling (MMM) has a universal advantage over multi-touch attribution (MTA) or lift methods. Rather, there are specific contexts in which MMM can prove more advantageous.

We’re not going to go into the workings of MMM here, as this has already been described in another blog post. We’ll deal with this topic in a more objective way that will help us understand the differences between the two techniques mentioned above.

Technical differences

The fundamental difference between MMM and MTA is the perspective each has on the data. MMM is a technique that uses machine learning to understand the relationship, over time, between investment in media and sales. To do this, MMM organizes the data into a timeline, grouping investments and sales into days or weeks within the analysis period.
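As a rough illustration of this timeline view, the sketch below groups hypothetical daily spend and sales records into weekly rows. The record layout, channel names, values, and the helper names `week_start` and `build_timeline` are assumptions made for illustration; they do not come from any particular MMM tool.

```python
from collections import defaultdict
from datetime import date, timedelta

# Hypothetical raw inputs (all values are made up).
spend_log = [
    (date(2023, 10, 2), "search", 100.0),
    (date(2023, 10, 3), "tv", 500.0),
    (date(2023, 10, 9), "search", 120.0),
]
sales_log = [
    (date(2023, 10, 2), 40.0),
    (date(2023, 10, 4), 55.0),
    (date(2023, 10, 10), 60.0),
]


def week_start(day):
    """Return the Monday of the week that contains `day`."""
    return day - timedelta(days=day.weekday())


def build_timeline(spend_log, sales_log):
    """Aggregate spend per channel and total sales into one row per week."""
    rows = defaultdict(lambda: defaultdict(float))
    for day, channel, spend in spend_log:
        rows[week_start(day)][channel] += spend
    for day, sales in sales_log:
        rows[week_start(day)]["sales"] += sales
    # Sort by week so the result reads as a timeline.
    return {week: dict(cols) for week, cols in sorted(rows.items())}


timeline = build_timeline(spend_log, sales_log)
for week, cols in timeline.items():
    print(week, cols)
```

Each weekly row would then serve as one observation relating media investment to sales.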

MTA, by contrast, takes advantage of the more detailed observation of each step the customer takes up to the point of purchase, and distributes the credit for that purchase across the media channels the customer interacted with. MTA data is therefore organized by customer journeys, which identify the touchpoints with media up to the moment of purchase.
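To make the journey view concrete, here is a minimal sketch that splits the revenue of each converting journey equally among its touchpoints. The equal-split ("linear") rule is only one of several possible attribution rules and is chosen here as an assumption; the journey structure, channel names, and values are likewise hypothetical.

```python
from collections import defaultdict

# Hypothetical journeys: ordered touchpoints plus the revenue of the purchase
# (0.0 means the journey did not end in a purchase).
journeys = [
    {"touchpoints": ["social", "search", "email"], "revenue": 90.0},
    {"touchpoints": ["search"], "revenue": 50.0},
    {"touchpoints": ["social", "search"], "revenue": 0.0},
]


def linear_attribution(journeys):
    """Split each purchase's revenue equally across the journey's touchpoints."""
    credit = defaultdict(float)
    for journey in journeys:
        touches = journey["touchpoints"]
        if not touches or journey["revenue"] <= 0:
            continue  # nothing to distribute
        share = journey["revenue"] / len(touches)
        for channel in touches:
            credit[channel] += share
    return dict(credit)


credit = linear_attribution(journeys)
print(credit)
```

Other rules (for example, weighting the first or last touchpoint more heavily) would change only how `share` is computed for each channel.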

It is clear, therefore, that there is a difference in the approach to the problem for each method. These differences have an impact on two fundamental issues: a more technical one, which concerns the data required for each application, and a more contextual one, which takes into account the timing of the business.

Data requirements

MMM requires simpler data collection and organization, since we only need to know investments and sales on a daily or weekly basis. Most of the time, this data is already available and needs little processing. However, there are some complexities:

1. Need to include each relevant event: If there is an external factor that has an impact on sales, it is important that it is taken into account in the data, otherwise it could confuse the model. This is because, while the model takes into account total sales, regardless of the effects that caused these sales, it only looks at the events that we include in it. If a relevant external event is left out, the model won’t be able to attribute its impact, forcing it to distribute this effect over the events it has available.

2. Limit on independent events: As this is a machine learning model, there is a limit to the number of variables we can include depending on the amount of data. This can force us to group channels with different behaviors into a single channel, making it difficult for the model to understand the size of the impact of these channels on sales.

3. Collection time: Depending on the complexity of the media mix, this model may require considerable collection time. In simpler scenarios, we can usually consider 2 years to be enough, but this is a case-by-case study, and certain scenarios may require more than that.
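The first complexity in the list above can be illustrated with a toy regression. In this sketch, sales are generated from TV spend and a promotion that mostly coincides with heavy TV weeks; when the promotion is left out, its effect is absorbed by the TV coefficient. Plain least squares is used here as a simplified stand-in for an MMM, and the data, the "true" coefficients, and the helper `fit_ols` are all assumptions for illustration.

```python
def fit_ols(rows, y):
    """Least-squares coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(k)]
        + [sum(r[i] * t for r, t in zip(rows, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(k):
        pivot = max(range(col, k), key=lambda idx: abs(a[idx][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for idx in range(k):
            if idx != col:
                factor = a[idx][col] / a[col][col]
                a[idx] = [x - factor * p for x, p in zip(a[idx], a[col])]
    return [a[i][k] / a[i][i] for i in range(k)]


# Made-up weekly data: the promotion mostly falls on heavy TV weeks.
tv = [5, 20, 6, 22, 4, 21, 5, 19]
promo = [0, 1, 0, 1, 0, 1, 0, 0]
# Assumed "true" process: base of 100, 3 sales per unit of TV, 50 per promo week.
sales = [100 + 3 * t + 50 * p for t, p in zip(tv, promo)]

# Columns: intercept, TV spend, and (in the first model only) the promo flag.
with_promo = fit_ols([[1, t, p] for t, p in zip(tv, promo)], sales)
without_promo = fit_ols([[1, t] for t in tv], sales)

print("TV effect with promo included:", round(with_promo[1], 2))
print("TV effect with promo omitted: ", round(without_promo[1], 2))
```

With the promotion included, the fit recovers the assumed TV effect; with it omitted, the TV effect is overstated because the model has nowhere else to place the promotion's impact.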

MTA, on the other hand, has the advantage of producing relevant analyses with just 2 months of data, and it allows media to be observed at a more granular level, without such strict limits on how channels are grouped. Despite these advantages over MMM, there are some complexities related to data collection:

1. Limits to cookie-based collection: reprocessing or blocking cookies often prevents the collection of journeys. Find out more about the end of cookies here and here.

2. Inability to observe relevant events: by relying on clicks to collect data, we miss out on relevant events such as ad impressions. As a result, the effect of some top-of-funnel channels is often overlooked.

3. Loss of relevant events outside the collection environment: As this is first-party data, we cannot observe journeys that end outside our own environment, such as purchases made through marketplaces or offline.
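The second limitation above can be sketched in a few lines: if only clicks are collected, channels that appear in a journey solely as impressions disappear from the observed data. The event structure and channel names below are hypothetical.

```python
# Hypothetical full journey, including events a click-based tracker cannot see.
true_journey = [
    {"channel": "video", "event": "impression"},
    {"channel": "display", "event": "impression"},
    {"channel": "search", "event": "click"},
    {"channel": "email", "event": "click"},
]


def observed_by_click_tracking(journey):
    """Keep only the touchpoints a click-based collection would record."""
    return [touch for touch in journey if touch["event"] == "click"]


def channels(journey):
    """Set of channels present in a journey."""
    return {touch["channel"] for touch in journey}


observed = observed_by_click_tracking(true_journey)
missing = channels(true_journey) - channels(observed)
print("Channels invisible to attribution:", sorted(missing))
```

Any attribution rule applied to `observed` would assign no credit to the channels in `missing`, which is how top-of-funnel media ends up overlooked.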

Context

In general, the most suitable contexts for using MTA are those in which:

• Most investments are made in digital media

• The focus of the business is on sales on the website

• There is little time for data collection

• There is great diversity in the media mix

As for MMM, the most suitable contexts are those in which:

• There is a significant investment in offline media, such as radio or TV

• There are several points of sale

• A sufficiently long data collection period is available

• The media mix can be grouped into relatively homogeneous channels

Combination of analyses

Although both analyses focus on the media, they present different possibilities. MMM is geared towards a more managerial analysis, providing a general and periodic overview of media impacts. MTA, on the other hand, can provide more useful insights at the media operation level, such as performance or the relationship between specific channels, or differences in short-term results.

The best scenario here is a combination of the two analyses, since with the advantages of one we can mitigate the limitations of the other. However, both techniques presented here have their technical complexities and data maturity requirements. A careful evaluation is therefore required, taking into account both the possibilities and the analysis needs of the moment. It is important to bear in mind that if the minimum requirements in terms of data and resource maturity are not met, the results of the analysis may not be reliable. Therefore, in these cases, a more data-engineering-oriented process would be appropriate, seeking to create the conditions for using the chosen technique.

Conclusion

As stated at the beginning, we can’t point to a universal advantage of one technique over the other; there are contexts, related to the data or to the business itself, that make one or the other more advantageous. Furthermore, when possible, it can be very advantageous to combine both techniques.

Count on DP6 to help you identify which technique is most advantageous according to your data maturity and business context.