MTA vs Sales Lift: Understanding the Differences in Measurement Models

Multi-touch Attribution (MTA) and Sales Lift are both ad measurement methodologies intended to pinpoint the amount of sales driven by an advertising campaign. Given that, it would be easy to think that these methods are interchangeable. However, they answer different questions, and each is the right tool in the marketer’s toolbox for different situations.

Sales Lift

Sales Lift measures the percentage increase in sales caused by a marketing campaign. Put another way, Sales Lift answers the question: How many more sales did I have than I would have had if I had not run the ad campaign? As such, it is a type of counterfactual analysis: we are making a prediction about a world we cannot observe.

Traditionally, a control and exposed methodology is used to measure sales lift. That is, we compare sales among people who saw an ad to sales among people who did not see an ad. The most familiar example of a control and exposed methodology is a drug study. In the classic drug study, you have, say, 100 sick people. They are randomly sorted into two groups: 50 get the drug being studied, and 50 get a placebo. At the end of the study, we examine how many people in each group got better. Let’s say 30 people who got the placebo got better, while 40 people who got the drug under study got better. So, the drug made an incremental 10 people better, a lift of about 33% over the 30 who recovered on the placebo. A study of this type is called a Randomized Controlled Trial (RCT), and it is considered the gold standard for measuring causal effects.
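The arithmetic in the drug study can be written out as a short Python sketch. The function name and its arguments are illustrative only; the numbers are the ones from the example above.

```python
def sales_lift(control_conversions, control_size, exposed_conversions, exposed_size):
    """Return (incremental conversions, relative lift) for a randomized test."""
    control_rate = control_conversions / control_size
    exposed_rate = exposed_conversions / exposed_size
    # Conversions the exposed group would have had at the control group's rate
    expected_without_treatment = control_rate * exposed_size
    incremental = exposed_conversions - expected_without_treatment
    # Lift is the relative increase over the control rate
    lift = (exposed_rate - control_rate) / control_rate
    return incremental, lift


# Drug study from the text: 30 of 50 recover on placebo, 40 of 50 on the drug
incremental, lift = sales_lift(30, 50, 40, 50)
print(f"{incremental:.0f} incremental recoveries, {lift:.1%} lift")
```

Scaling by the control rate, rather than subtracting raw counts, keeps the calculation valid when the two groups are not the same size.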

Unfortunately, it is often not feasible to measure an advertising campaign with an RCT, especially if the campaign is cross-channel. For example, there is no way to suppress an audience from a Linear TV buy. You may also not want to have a large holdout, because that limits the reach of your buy in your target, and a small holdout may be too hard to measure. The good news is that a set of techniques known as causal inference has been developed to cover the multitude of situations when an RCT is not feasible. For example, let’s say you wanted to perform an epidemiological study to understand if living in a house next to a freeway was bad for your health. You cannot randomly assign people to live next to the freeway, so you must do an observational study instead. However, houses next to the freeway are often less expensive than houses that are not next to the freeway, so how can we be sure that any differences in outcomes we observe are caused by living next to the freeway, and not by preexisting socioeconomic differences? Causal inference gives us mathematical techniques for adjusting for these preexisting differences, called confounding variables, or simply confounders.
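One of the simplest adjustments for a confounder is stratification: compare exposed and unexposed people within groups that share the confounder's value, then average the within-group differences. The sketch below is only an illustration of that idea, not a full causal inference workflow. The counts are hypothetical, and household income stands in for whatever the ads were actually targeted on.

```python
# Hypothetical counts per stratum:
# (stratum, exposed_n, exposed_buyers, unexposed_n, unexposed_buyers)
strata = [
    ("high_income", 800, 160, 200, 30),
    ("low_income", 200, 10, 800, 32),
]


def naive_difference(rows):
    """Exposed minus unexposed purchase rate, ignoring the confounder."""
    exposed_n = sum(r[1] for r in rows)
    exposed_buyers = sum(r[2] for r in rows)
    unexposed_n = sum(r[3] for r in rows)
    unexposed_buyers = sum(r[4] for r in rows)
    return exposed_buyers / exposed_n - unexposed_buyers / unexposed_n


def stratified_difference(rows):
    """Average the within-stratum differences, weighted by stratum size."""
    total = sum(r[1] + r[3] for r in rows)
    adjusted = 0.0
    for _, e_n, e_buyers, u_n, u_buyers in rows:
        within = e_buyers / e_n - u_buyers / u_n
        adjusted += within * (e_n + u_n) / total
    return adjusted


naive = naive_difference(strata)
adjusted = stratified_difference(strata)
print(f"naive: {naive:.3f}, adjusted for income: {adjusted:.3f}")
```

In this made-up data the ads reach mostly high-income households, who buy more with or without ads, so the naive comparison overstates the effect several times over. Stratification only works for confounders you have measured, which is why data availability matters so much for lift studies.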

The major strength of Sales Lift is in its measure of incremental sales. It does not just show that someone both saw an ad and purchased; it measures which sales were caused by the advertising. A classic example of what happens when you don’t measure incrementality comes from eBay and their study of branded keywords. Obviously, people who search for “eBay” are likely to go to eBay and make purchases, so when eBay buys the keyword, the ad will correlate with sales. The question is whether the ad causes purchases that wouldn’t have happened otherwise. When eBay did an incrementality study, they found that it rarely did.

Sales Lift is neither perfect nor for everyone, however. First, it can be difficult in practice to have data about confounding variables. An ecommerce company may have data on hand about impressions and conversions, but it may not know factors like gender or household income for both the people who see ads and the people who do not. Because of this data availability problem, lift measurement is often done using panels. Panel data is often limited in scale, so it can be difficult to get statistically significant measurement below the campaign level, such as for specific media tactics.
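To see why scale matters, consider a two-proportion z-test, one common way to check whether a difference in conversion rates is statistically significant. The sketch below uses hypothetical panel sizes and conversion counts. The same one-point difference in conversion rate is tested once at the campaign level and once for a single tactic with a tenth of the sample.

```python
import math


def two_proportion_p_value(conv_a, n_a, conv_b, n_b):
    """Two-sided p-value for the difference between two conversion rates."""
    p_a, p_b = conv_a / n_a, conv_b / n_b
    pooled = (conv_a + conv_b) / (n_a + n_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / std_err
    # Two-sided tail probability of a standard normal distribution
    return math.erfc(abs(z) / math.sqrt(2))


# Hypothetical panel: 6% exposed vs 5% control conversion rate in both cases
campaign_p = two_proportion_p_value(300, 5000, 250, 5000)  # whole campaign
tactic_p = two_proportion_p_value(30, 500, 25, 500)        # one media tactic

print(f"campaign-level p-value: {campaign_p:.3f}")
print(f"tactic-level p-value:   {tactic_p:.3f}")
```

With these assumed numbers the campaign-level result clears the conventional 0.05 threshold, while the tactic-level result does not, even though the observed lift is identical.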

You can read more about how to implement Sales Lift on Snowflake using causal inference.

Multi-touch Attribution

Multi-touch Attribution (MTA) seeks to divvy up credit among the advertising touch points on the customer’s path to purchase. “First touch” attribution would argue that the credit for a transaction goes to the ad that initially introduced the brand to the customer. “Last touch” attribution, on the other hand, would argue that the touch point closest to the purchase was the most responsible. Most marketers, however, would agree that all touch points deserve at least some credit, and that is where multi-touch comes in.
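The two single-touch rules are easy to express in code, and doing so shows how much the choice of rule changes the answer. This is a minimal sketch: the converting paths and channel names below are hypothetical.

```python
from collections import Counter

# Hypothetical converting paths, ordered from first ad seen to last ad seen
paths = [
    ["display", "social", "paid_search"],
    ["social", "paid_search"],
    ["display", "email"],
]


def single_touch_credit(paths, position):
    """Give each conversion's full credit to one touch point.

    position=0 is first touch; position=-1 is last touch.
    """
    credit = Counter()
    for path in paths:
        credit[path[position]] += 1
    return dict(credit)


first_touch = single_touch_credit(paths, 0)
last_touch = single_touch_credit(paths, -1)
print("first touch:", first_touch)
print("last touch: ", last_touch)
```

On the same three conversions, first touch credits display and social, while last touch credits paid search and email. Neither rule gives any credit to touch points in the middle of a path.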

There are many ways to perform MTA. Some models give equal credit to all touch points, others give outsized credit to the first and last touch points (U-shaped), and still others give credit based on a time decay, with the most recent touch point given the most credit. The most sophisticated models, however, use machine learning (often Shapley Regression) to allow the data to tell us which touch points were most valuable. Shapley Regression comes from cooperative game theory, which asks: if a team of players in a game cooperate, how can they fairly account for the different contributions of each player? The application to an ad campaign follows naturally, as different channels are working together to try to get consumers to convert.
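The models above can be sketched in a few lines of Python. The rule-based weights are shown first. The 40% end share and the 7-day half-life are assumed parameters, not values from the text. The last function computes exact Shapley values from a table of conversions per channel combination. It illustrates the game theory idea only and is not a Shapley Regression: in practice the table would be estimated from data, and here it is hypothetical.

```python
from itertools import combinations
from math import factorial


def linear_weights(n):
    """Equal credit to each of n touch points."""
    return [1 / n] * n


def u_shaped_weights(n, end_share=0.4):
    """Outsized credit to first and last touch; the rest split evenly."""
    if n == 1:
        return [1.0]
    if n == 2:
        return [0.5, 0.5]
    middle = (1 - 2 * end_share) / (n - 2)
    return [end_share] + [middle] * (n - 2) + [end_share]


def time_decay_weights(days_before_purchase, half_life_days=7.0):
    """Credit halves for every half_life_days between touch and purchase."""
    raw = [0.5 ** (d / half_life_days) for d in days_before_purchase]
    total = sum(raw)
    return [w / total for w in raw]


def shapley_values(channels, value):
    """Exact Shapley values; value maps a frozenset of channels to conversions."""
    n = len(channels)
    result = {}
    for channel in channels:
        others = [c for c in channels if c != channel]
        total = 0.0
        for size in range(n):
            # Probability that exactly this many channels precede `channel`
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                coalition = frozenset(subset)
                marginal = value[coalition | {channel}] - value[coalition]
                total += weight * marginal
        result[channel] = total
    return result


# Hypothetical conversions observed for each combination of channels
conversions = {
    frozenset(): 0,
    frozenset({"search"}): 60,
    frozenset({"social"}): 20,
    frozenset({"search", "social"}): 100,
}
credit = shapley_values(["search", "social"], conversions)
print(credit)
```

In this example the two channels together produce more conversions than the sum of each alone, and the Shapley calculation splits that extra amount evenly between them. The exact calculation enumerates every subset of channels, so its cost grows exponentially with the number of channels.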

A major advantage of MTA is its simplicity. You need only data about the advertising and about conversions; no information about confounders or about those not exposed to the advertising is required. In addition, the output is not merely a measurement of the advertising campaign as a whole. You also get detailed information about which advertising channels correlate most strongly with purchases.

The simplicity sword has a second edge, however. Because attribution credits any touch point that precedes a conversion, whether or not the ad caused it, it can leave your campaign vulnerable to fraud. Uber famously sued several of its advertising partners for fraud after turning off the advertising merely resulted in app installs being reclassified from paid to free, not in a drop in installs. After all, there is a fine line between advertising to those most likely to be interested in your products or services and advertising to those you predict will purchase anyway.

Which tool should marketers use?

The tool a marketer should use depends on both the data they have available and the brand’s position in the market.

Understanding the incrementality of advertising becomes more important as a brand gets bigger, as a higher percentage of sales are driven by brand awareness or ongoing brand loyalty, and a lower percentage are directly driven by advertising. Some specialty online retailers, for instance, may sell products that last a long time and therefore are rarely purchased. As such, most customers may be driven by paid search, with low organic traffic. In this case, the simplicity of MTA is a big advantage, and the downside of MTA is low. A nationally known brand or one with a big presence in physical retail, on the other hand, has a much higher baseline probability of purchase. As such, attributing purchases of the brand to marketing touch points without establishing causality may lead to misleading results. For example, it may overstate the impact of things like branded keywords, and understate the impact of long-term brand advertising.
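A small calculation makes the role of the baseline concrete. The sketch below compares the conversions an attribution model would credit to advertising with the conversions the advertising actually caused. All rates and audience sizes are hypothetical, and it assumes the baseline purchase rate is known, which in practice is exactly what a lift study has to estimate.

```python
def attribution_vs_incremental(exposed_people, baseline_rate, exposed_rate):
    """Compare conversions credited to ads with conversions caused by ads."""
    # Attribution credits every conversion that followed an ad exposure
    attributed = exposed_people * exposed_rate
    # These buyers would have purchased without seeing any ad
    baseline = exposed_people * baseline_rate
    incremental = attributed - baseline
    return attributed, incremental, incremental / attributed


# Hypothetical niche retailer: almost nobody buys without an ad
niche = attribution_vs_incremental(10_000, baseline_rate=0.005, exposed_rate=0.05)
# Hypothetical national brand: most buyers would have bought anyway
national = attribution_vs_incremental(10_000, baseline_rate=0.04, exposed_rate=0.05)

for name, (attributed, incremental, share) in [("niche", niche), ("national", national)]:
    print(f"{name}: {attributed:.0f} attributed, "
          f"{incremental:.0f} incremental ({share:.0%} of attributed)")
```

Both brands see the same number of attributed conversions, but under these assumptions most of the niche retailer's are incremental while most of the national brand's are not.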

Sales Lift requires either the execution of an RCT or enough data to perform causal analysis. That is, data is required about those who saw an ad and about those who did not. Further, you need to understand how the ads are targeted and have data about these confounders, again both for those who saw the ad and for those who did not. Limited data availability can lead people to choose MTA, where the only data required is impressions and sales for those who saw an ad.

How Snowflake can help

Whether you choose MTA or Sales Lift, Snowflake can help speed development.
