An In-Depth Look at Attribution Modeling in Digital Marketing

A.K.A. Why your marketing ROI is almost always inflated

Published in

Analytics for Humans

12 min read · Sep 17, 2018

If you are a marketer, return on investment from various advertising platforms is probably the most important metric you have to pay attention to on a weekly, or even daily, basis.

In most cases, you probably log into an advertising platform such as Facebook or Google Ads, grab the pre-made number it calculates, and call it a day.

For some platforms, you may have to do the math manually with a formula such as Total Cost / Total Conversions (strictly speaking, your cost per conversion), but nothing more complex than that.

Well, I’m really sorry to tell you, but those numbers you obtain from your platforms do not capture your real marketing ROI accurately, and in many cases, they will be significantly inflated.

This article will show you the reason for this inflation, and explain why you should incorporate attribution modeling into your marketing ROI calculation to learn which platforms are really effective at making customers buy on your website.


Why are my marketing ROIs inflated?

To understand why your marketing ROIs are inflated, let’s consider a user journey of a user named Mike.

Mike came to know about your brand via a Facebook ad that showed him the benefits of your product.

He visited your website, became very interested, and even added the product to his cart, but was called away by his girlfriend before he was able to complete the purchase.

Real life kicked in, and Mike forgot about your product completely until, one day later, he saw a display ad for it on the sidebar of his favorite blog (Google Display).

Reminded of that great first encounter, he recovered his cart and purchased your product.

Many of you may think Mike’s experience is an isolated case. In fact, it has become a lot more common in recent years.

Research has shown that over 92% of customers do not intend to purchase when they first visit a brand or company’s website (link), and in our own experience, customers who convert after seeing a single advertisement are becoming rarer and rarer.

For Mike, how is his purchase (let’s say it’s 100 dollars) reflected in your advertising platforms?

Most advertising platforms use a “conversion window” model when counting conversions. This means that as long as a customer buys or converts within a certain number of days of interacting with an ad, that ad takes full credit for the conversion.

In this case, Facebook will recognize Mike’s purchase via a pixel code embedded on your website, and since it falls within the usual 30-day conversion window, Facebook will take the full 100-dollar credit for that conversion.

Google, following the same logic, will also take the full 100-dollar credit for the conversion. This creates an insane double-counting problem when you look at the ROI of either platform, and it is the source of the inflation I described at the beginning of this article.
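To put numbers on the double counting, here is a minimal Python sketch. The $100 purchase comes from Mike's example; the ad spend figures are invented purely for illustration, and "return per dollar spent" is just one simple way to express the inflation.

```python
# One real purchase, as in Mike's example.
actual_revenue = 100.0

# Each platform claims the whole purchase because it fell inside its own
# conversion window.
platform_reported = {"Facebook": 100.0, "Google Display": 100.0}

# Hypothetical ad spend per platform (not from the article).
ad_spend = {"Facebook": 20.0, "Google Display": 20.0}

claimed_total = sum(platform_reported.values())
total_spend = sum(ad_spend.values())

reported_return = claimed_total / total_spend   # what the dashboards add up to
actual_return = actual_revenue / total_spend    # what you really earned

print(f"Revenue claimed by platforms: ${claimed_total:.0f}")
print(f"Revenue actually earned:      ${actual_revenue:.0f}")
print(f"Reported return per $1 spent: {reported_return:.2f}")
print(f"Actual return per $1 spent:   {actual_return:.2f}")
```

With these made-up spend figures, the two dashboards together claim twice the revenue you actually earned, so the combined return looks twice as good as it is.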

To solve this double-counting problem, and truly assign deserved credit to the correct channel (I would assign Facebook 80 and Google 20 here), we need the help of attribution models.

What is an attribution model?

Attribution modeling describes the methods marketers use to split conversion credit among channels when users make multiple website visits, via multiple channels, before they finally convert.

A valid attribution model for your business must cover ALL channels your users might visit your website from, or else your calculation will be inaccurate and mostly inflated.

This warning matters because most advertising platforms, such as Facebook and Google, offer various ways to analyze attribution only for data WITHIN their own platforms.

For example, Google Ads can tell you which keyword contributed most to a user’s conversion if the user clicked through on multiple keywords within Google Ads. However, it does not offer a way to work out attribution across multiple platforms such as Facebook and Instagram.

This lack of cross-platform integration is primarily due to the data barrier between various advertising platforms. Google, being a rival of Facebook, will not share its advertising data with Facebook, and vice versa due to a conflict of interests — making cross-platform attribution impossible.

Unless one day Google and Facebook decide to open up their data to each other, it is unlikely this cross-channel attribution barrier will be breached anytime soon.

The best free tool I can identify for attribution modeling is Google Analytics (don’t be fooled by the name). I like Google Analytics because it 1) offers various pre-built attribution models for you to choose from, and 2) is one of the most popular web analytics solutions out there with a very good supporting community.

The weakness of Google Analytics is that it does not offer a quick way to attribute all of your traffic down to the level of a single advertisement (you can accomplish this by carefully structuring your UTM codes and adding custom dimensions, but that’s really complex).

However, ad-level attribution is rarely required for most small and medium-sized businesses, since it produces far more raw attribution data than their analytics resources can make sense of. Google Analytics is therefore still a good place to start attribution modeling at your company.

Attribution Model Examples

Sitting at the core of attribution modeling are (not surprisingly) attribution models — the logic you will use to assign credit to various traffic sources of your customers’ conversion journey.

Here are some typical attribution models you will encounter when using Google Analytics. We are going to use the example of Mike above to showcase each of them.

Last Interaction Attribution

The last interaction attribution model is the default model in Google Analytics’ Multi-Channel Funnels reports (the standard reports default to Last Non-Direct Click instead).

It gives credit ONLY to the very last traffic source that resulted in the conversion of a user.

In Mike’s case, Google Analytics will completely ignore the fact that Mike performed his first visit via Facebook, and assign the full conversion credit ($100) to Google.
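As a rough sketch of that logic in Python (the function name and the list-of-sources representation of a journey are my own illustration, not anything Google Analytics exposes):

```python
def last_interaction(journey, value):
    """Give the full conversion value to the final touchpoint."""
    credit = {source: 0.0 for source in journey}
    credit[journey[-1]] += value
    return credit


# Mike's journey: Facebook first, Google last, $100 purchase.
mike = ["Facebook", "Google"]
print(last_interaction(mike, 100.0))
# {'Facebook': 0.0, 'Google': 100.0}
```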

While this is the simplest way to deal with the misattribution issue, it does not guarantee accurate data.

From Mike’s example above, a common-sense marketer would probably assign most of the credit for the conversion to Facebook.

This is because the visit from Facebook was what made Mike interested in our product, and the visit through Google was merely the final nudge that sealed the deal.

Unfortunately, a simple model like the last interaction model is not complex enough to cover the case of Mike, leaving much to be desired.

Variations of Last Interaction Attribution

Two derivatives of the “Last Interaction Attribution” model are the “Last Non-Direct Click” and “Last Google Ads Click” attribution models.

The “Last Non-Direct Click” model assigns all credit to the user’s last visit that did not come from the “direct” channel, since it is very hard to measure user intention from that channel.

The “Last Google Ads Click” attribution model assigns all credit to the user’s last visit that resulted from Google Ads. Honestly, I don’t know why this attribution model is even a thing (it is basically Google trying to show you how important they are in making you money), and I venture to say that in most cases this model carries little value.

Both of those models are considered slightly better than the original last interaction model. However, the same criticism applies: they are too simple for the needs of modern marketers.
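Here is a small sketch of the “Last Non-Direct Click” idea in Python. The function name and the "Direct" label are assumptions for illustration, and I have extended Mike's journey with a hypothetical direct visit so the difference is visible. Falling back to the direct visit when the journey contains nothing else is also my assumption.

```python
def last_non_direct_click(journey, value, direct_label="Direct"):
    """Give all credit to the last touchpoint that is not a direct visit."""
    credit = {source: 0.0 for source in journey}
    non_direct = [source for source in journey if source != direct_label]
    # Assumption: if every visit was direct, the last direct visit gets the credit.
    winner = non_direct[-1] if non_direct else journey[-1]
    credit[winner] += value
    return credit


# Hypothetical variation: Mike types the URL directly for his final visit.
journey = ["Facebook", "Google", "Direct"]
print(last_non_direct_click(journey, 100.0))
# {'Facebook': 0.0, 'Google': 100.0, 'Direct': 0.0}
```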

First Click Attribution

The first click attribution model is very similar to the last interaction attribution model.

Instead of giving all the credit to the last channel the user visited through, it gives all the credit to the first channel.

In Mike’s case, Facebook will get the full $100 credit, while Google Ads gets none.
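Sketched in Python, with an illustrative function name and a journey represented as a simple list of sources:

```python
def first_click(journey, value):
    """Give the full conversion value to the very first touchpoint."""
    credit = {source: 0.0 for source in journey}
    credit[journey[0]] += value
    return credit


mike = ["Facebook", "Google"]
print(first_click(mike, 100.0))
# {'Facebook': 100.0, 'Google': 0.0}
```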

It is another simple way to deal with the misattribution issue, and it runs into a similar problem as the last interaction model.

Even though Google Ads’ role in making the user convert is less significant than Facebook’s, it was not completely useless in making Mike buy, which is what first click attribution would lead our data to conclude.

Linear Attribution

Now let’s move on to a few more complex attribution models, starting with the linear attribution model.

The linear attribution model splits credit evenly among all traffic sources involved in the conversion process.

For example, if there are 10 traffic sources involved in the ultimate conversion of a user of $100, then each of the traffic sources will get a $10 credit.

In the case of Mike, it would mean that Facebook and Google Ads split the attribution value 50/50.
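A minimal Python sketch of the even split (the function name is illustrative; a source that appears more than once in a journey simply collects one share per visit):

```python
def linear(journey, value):
    """Split the conversion value evenly across every touchpoint."""
    share = value / len(journey)
    credit = {}
    for source in journey:
        credit[source] = credit.get(source, 0.0) + share
    return credit


mike = ["Facebook", "Google"]
print(linear(mike, 100.0))
# {'Facebook': 50.0, 'Google': 50.0}
```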

Compared with the previous models, this is a vast improvement in terms of reporting accuracy.

However, in many cases, the credit should NOT be distributed evenly.

What if Mike was unimpressed by his first visit via Facebook and left the homepage without any interaction with your product or your brand, only to come back later, impressed by your display ads on Google?

In this case, Facebook should get much less credit than Google Ads, and the even split made by the linear attribution model does not seem fair.

Time Decay Attribution

Time decay attribution is a more advanced variation of linear attribution. It gives more credit to the traffic sources that are closer (in time) to the ultimate conversion.

In Mike’s case, since the first visit is through Facebook and the final visit through Google, Google will get slightly more credit than Facebook because its visit is the closest in time to the conversion.

The details of how this is computed are a mathematical problem that I don’t want to get into, but Google Analytics is most likely using some sort of time decay function to gradually reduce the weight placed on visits further away from the point of conversion.
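For the curious, one common form of such a function is exponential decay with a half-life: a visit that happened one half-life before the conversion gets half the weight of a visit on the conversion day. The sketch below assumes that form and a 7-day half-life. Both are my assumptions for illustration, not a confirmed description of what Google Analytics does internally.

```python
def time_decay(touches, value, half_life_days=7.0):
    """touches: list of (source, days_before_conversion) pairs."""
    # Weight halves for every `half_life_days` between the visit and the conversion.
    weights = [(source, 0.5 ** (days / half_life_days)) for source, days in touches]
    total = sum(weight for _, weight in weights)
    credit = {}
    for source, weight in weights:
        credit[source] = credit.get(source, 0.0) + value * weight / total
    return credit


# Mike saw the Facebook ad one day before converting through Google.
mike = [("Facebook", 1), ("Google", 0)]
credit = time_decay(mike, 100.0)
print({source: round(amount, 2) for source, amount in credit.items()})
```

Under these assumptions Google ends up with roughly $52.50 and Facebook with roughly $47.50, which matches the "slightly more credit" intuition. A shorter half-life would tilt the split further toward Google.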

Compared with the other attribution models, this is perhaps one of the more “scientific” ones, but I still have a few problems with it.

First of all, it doesn’t entirely resolve our concern about assigning too much credit to useless visits.

If the user makes a bounced visit the day before they convert, that visit gets far more credit than it deserves.

Secondly, I do not believe in the statement “the closer a visit is to conversion, the more important it is”.

Take Mike’s case, for example: he completely trusted the product after the first visit via Facebook, and was merely going through the motions of conversion later on via channels such as Google Ads.

In his case, the earlier visit should be considered a lot more important than the later visit, which is contrary to the time decay model.

This example shows us that engagement on the website is a much more important factor than time distance to conversion, and it is not reflected at all in this model.

Position Based Attribution

I have to say that this is my favorite pre-built attribution model in Google Analytics.

The position based attribution model gives 40% of the credit to each of the first and last interactions of the conversion journey, and distributes the remaining 20% evenly across the rest of the visits.

When there are only two visits, it behaves just like the linear attribution model and attributes 50% to each of the first and last visits.

So in Mike’s case, both Google and Facebook get $50.
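Here is a minimal Python sketch of the 40/20/40 rule described above. The function name and parameters are illustrative. Giving a one-visit journey 100% of the credit is my own assumption to make the function complete.

```python
def position_based(journey, value, first=0.4, last=0.4):
    """40% to the first visit, 40% to the last, the rest spread over the middle."""
    n = len(journey)
    if n == 1:
        weights = [1.0]           # assumption: a single visit takes everything
    elif n == 2:
        weights = [0.5, 0.5]      # two visits split evenly, as described above
    else:
        middle = (1.0 - first - last) / (n - 2)
        weights = [first] + [middle] * (n - 2) + [last]
    credit = {}
    for source, weight in zip(journey, weights):
        credit[source] = credit.get(source, 0.0) + weight * value
    return credit


mike = ["Facebook", "Google"]
print(position_based(mike, 100.0))
# {'Facebook': 50.0, 'Google': 50.0}

# A hypothetical four-visit journey shows the 40/10/10/40 pattern.
longer = position_based(["Facebook", "Email", "Organic", "Google"], 100.0)
print({source: round(amount, 2) for source, amount in longer.items()})
```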

While this model still does not make up for the two major weaknesses of the time decay model, its logic (“the first and last visits are more important than all intermediary visits”) seems the most sound of all the models so far.

Why? Across the entire customer journey, there are two interactions marketers care about most: 1) the first time customers get to know and trust our brand, and 2) the first time they make a purchase.

It may be argued that the first website visit is sometimes not a very good illustration of the first time they “trust” our brand (this might take multiple visits), but it is still the best proxy we have for that moment without additional attribution setup.

With similar logic, even though in many cases the last visit might not be the time your customers decide to make a purchase, it is still a fairly good proxy of that moment of decision.

Therefore, I recommend using this attribution model in Google Analytics if you don’t have access to a more advanced attribution product and are too busy to do the modeling yourself.

The Humanlytics Take On Attribution Model

Attribution modeling is one of my biggest passions when it comes to web analytics, and our team at Humanlytics is spending a lot of time developing our own attribution model to cover the weaknesses of the existing models in Google Analytics. So I think the last section of this article is a good opportunity to share our thinking on attribution modeling with you guys.

This is not designed to be an advertisement for our product (in fact, the feature we are talking about is still pretty far from being released), but merely to serve as food for thought on one view of attribution.

For us, the most important factor that determines the importance of a visit is what users actually did during that visit.

For example, let’s say the user’s conversion journey follows the path of source 1, source 2, source 3, and source 4 (the visit in which the user converted).

For source 1, the user bounced right away from the front page, without any further engagement.

Even though this should be considered the user’s “first impression” of the brand, we should not really consider the user “acquired” at that point, since he didn’t even engage in a valid session with the company.

For source 2, the user visited 4 pages on the website, spent 1 minute in the session, and added some items to his cart (engagement conversion).

This is when the real engagement between the user and the brand started, and at this point, the user should be considered “acquired” by the company.

For source 3, the user visited his cart and the checkout page (engagement conversion), but fell just short of converting and buying your product.

This is where the user deepened his engagement with the company by completing more “engagement objectives”.

For source 4, the user recovered the cart from previous cookies and finished his purchase (business conversion).

Here, the user completed his conversion and achieved the “business objective”.


Now, we should really think about each of the visits in terms of the following factors:

How intense was the engagement between the brand and the users? This can be measured by a combination of engagement metrics (such as pages/session and session duration) and completion of what we call “engagement objectives”.
Did the visit result in a business conversion? This is measured by the number of “business conversions” accomplished at each visit, such as buying a product.

In our initial hypothesis, we weight those two factors evenly. However, this weighting will be readjusted, both automatically and manually, as we progress with more testing and experimentation.

Based on our scoring system, the first source will get no credit at all, since its engagement and conversion scores are both zero.

The second source will get a pretty high score, as it ranks high on both components of the engagement factor (engagement metrics and engagement objectives).

The third source will rank slightly lower, as it accomplishes only one of those components.

The final source will rank slightly higher than the second source, but not by much, as it does not have any engagement score.
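To make the ranking concrete, here is a toy Python sketch of an engagement-plus-conversion score. Everything in it is hypothetical: the Visit fields, the 0-to-1 scores, and the simple averaging are invented for illustration and are not the actual Humanlytics model. Only the even weighting of the two factors comes from the description above.

```python
from dataclasses import dataclass


@dataclass
class Visit:
    source: str
    engagement_metrics: float     # hypothetical 0..1 score (pages/session, duration)
    engagement_objectives: float  # hypothetical 0..1 score (add to cart, checkout)
    business_conversion: bool     # did the visit end in a purchase?


def visit_score(visit, engagement_weight=0.5):
    """Weight the engagement factor and the conversion factor evenly by default."""
    engagement = (visit.engagement_metrics + visit.engagement_objectives) / 2
    conversion = 1.0 if visit.business_conversion else 0.0
    return engagement_weight * engagement + (1 - engagement_weight) * conversion


# Invented scores that mirror the four-source journey described above.
journey = [
    Visit("source 1", 0.0, 0.0, False),  # bounced from the front page
    Visit("source 2", 0.8, 1.0, False),  # browsed 4 pages and added to cart
    Visit("source 3", 0.0, 1.0, False),  # reached checkout, did not buy
    Visit("source 4", 0.0, 0.0, True),   # recovered the cart and purchased
]

scores = {visit.source: visit_score(visit) for visit in journey}
total = sum(scores.values())
credit = {source: round(100.0 * score / total, 2) for source, score in scores.items()}
print(credit)
```

With these made-up inputs, source 1 gets nothing, source 4 edges out source 2, and source 3 trails both, which is the ordering described above. Changing `engagement_weight` is the knob that the "readjusted" weighting would turn.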

Because most of the data we use to calculate our score is going to be obtained on an aggregate level to avoid user privacy concerns, most of the attribution scores are going to be calculated based on the overall user experience of a specific traffic source, instead of the action of an individual user.

While this method may reduce the accuracy of our result, the noise is not large enough in most cases to sway business decisions or produce an unreasonable estimate.

Furthermore, since we integrate data from platforms such as Facebook and Google Ads with Google Analytics, we can attribute conversions not only to a specific channel, but also to specific ad groups, campaigns, or even individual ads if we want to. This makes the attribution a lot more detailed and accurate, which is super exciting for me.


This wraps up our overview of attribution modeling. I hope this in-depth look helps you understand the techniques and thinking behind this common but vital digital analytics practice.

Next, we are going to show you how to actually do attribution modeling in Google Analytics, with practical workflows on how you can set up attribution modeling for your company — stay tuned!

This article was produced by Humanlytics. Looking for more content just like this? Check us out on Twitter and Medium, and join our Analytics for Humans Facebook community to discuss more ideas and topics like this!