Marketing Attribution Modelling at Cazoo

António Lima
Cazoo Technology Blog
Dec 15, 2021 · 12 min read

Why do we need an attribution model

Marketing Attribution Modelling is a hot topic, no doubt about it. A quick search on Google returns 7 ads from technology providers to marketing agencies and content generators. These companies are paying for ads because they know there’s enough appetite from marketers, data professionals and senior leadership to successfully implement an attribution model for their companies. Why is that though? Companies are spending thousands, hundreds of thousands and sometimes millions of pounds on advertising, and they see attribution (I’ll use attribution to refer to marketing attribution modelling from now on) as a way to:

  • Understand their ROAS (Return On Ad Spend) and prove they are making profitable and sensible investments with their marketing budget;
  • Inform how they can optimise their marketing spend by shifting ad spend from the least performing channels and campaigns into the better performing ones.

How does attribution work

Attribution is the process of distributing the credit for sales/revenues/leads (I’ll use conversions from now on to refer to all of these) amongst the different channels that a customer interacts with from the start of their journey through to when they convert.

An attribution model will distribute no more and no less than 100% of total conversions across all of the channels. This is key and one of the main reasons why one shouldn’t take attribution within a single advertising platform at face value. As multiple advertising platforms will claim credit for the same conversions you’ll soon see that the total conversions attributed will be more than 100% of your total conversions.
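To make the over-counting concrete, here is a minimal sketch with invented data. The platform names and the three conversions are hypothetical, and the cross-channel model used for comparison is a simple linear split.

```python
# Hypothetical data: each conversion lists the ad platforms the customer touched.
conversions = {
    "order_1": ["google_ads", "facebook_ads"],
    "order_2": ["facebook_ads"],
    "order_3": ["google_ads", "facebook_ads", "bing_ads"],
}

# Each platform's own reporting claims every conversion it touched in full.
platform_claims = {}
for platforms in conversions.values():
    for platform in set(platforms):
        platform_claims[platform] = platform_claims.get(platform, 0) + 1

# A cross-channel model (linear here) splits each conversion so its credit sums to 1.
attributed = {}
for platforms in conversions.values():
    share = 1 / len(platforms)
    for platform in platforms:
        attributed[platform] = attributed.get(platform, 0.0) + share

total_claimed = sum(platform_claims.values())
total_attributed = sum(attributed.values())
print(f"actual conversions:   {len(conversions)}")
print(f"claimed by platforms: {total_claimed}")
print(f"attributed by model:  {total_attributed:.2f}")
```

In this toy data the platforms together claim twice as many conversions as actually happened, whereas the attributed total matches the real number of conversions.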

Attribution is also a deterministic process in the sense that it assumes that all of a customer’s touchpoints explain all of the reasons that led them to eventually convert. In other words, it assumes correlation equals causation (later in the article I’ll explain some of the pitfalls that might come with this and some complementary methods and techniques to attribution, but for now we’ll accept this).

*Note: Whilst touchpoints are mostly clicks to the website/app, they can also be other interactions like calls, emails, in-person visits, impressions, etc.

Now, how do we distribute those conversions amongst the different channels?

Here’s where we introduce different types of attribution modelling.

At Cazoo we use first touch, last touch, linear and position-based attribution. There are others, including time-decay, Markov chains, algorithmic, etc., but I will only dive into the ones we use here at Cazoo:

  • First touch gives 100% of credit to the channel responsible for the customer’s first touchpoint.
  • Last touch gives 100% to the last touchpoint.
  • Linear attribution distributes credit equally amongst all touchpoints.
  • Position-based attribution gives 40% of the credit to the first touchpoint, 40% to the last touchpoint and the remaining 20% shared amongst all touchpoints in between.

[Figure: Marketing Attribution Models]
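
The four models above can be sketched as functions that return one weight per touchpoint. This is a minimal illustration rather than Cazoo’s implementation: the function names are hypothetical, and how position-based should behave for journeys with fewer than three touchpoints is an assumption, as the article does not say.

```python
def first_touch(n):
    """All credit to the first of n touchpoints."""
    return [1.0] + [0.0] * (n - 1)


def last_touch(n):
    """All credit to the last of n touchpoints."""
    return [0.0] * (n - 1) + [1.0]


def linear(n):
    """Equal credit to every touchpoint."""
    return [1.0 / n] * n


def position_based(n, first=0.4, last=0.4):
    """40% first, 40% last, remainder shared equally by the middle touchpoints."""
    if n == 1:          # assumption: a single touchpoint takes everything
        return [1.0]
    if n == 2:          # assumption: no middle touchpoints, so split evenly
        return [0.5, 0.5]
    middle = (1.0 - first - last) / (n - 2)
    return [first] + [middle] * (n - 2) + [last]


def credit_by_channel(channels, weights):
    """Sum touchpoint weights per channel."""
    totals = {}
    for channel, weight in zip(channels, weights):
        totals[channel] = totals.get(channel, 0.0) + weight
    return totals


# Hypothetical four-touchpoint journey, oldest touchpoint first.
journey = ["Generic Paid Search", "Email", "Direct", "Organic Search"]
for model in (first_touch, last_touch, linear, position_based):
    print(model.__name__, credit_by_channel(journey, model(len(journey))))
```

Whatever the model, the weights for one journey add up to 1, which is what keeps the total at exactly 100% of conversions.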
Cazoo’s Attribution Models

In addition to these 4 models in their purest form, we created variations to specifically meet Cazoo’s needs: we sell cars (usually anyone’s 2nd biggest purchase after buying a house), so we have longer customer journeys than most businesses. For that reason we have modified these models in two ways.

  • Lead generation based:

In the ‘to lead’ attribution we only distribute credit to the channels that played a part in turning a customer into a lead (we generally consider a lead someone we have contact details for). In these to-lead variations everything that happens between us getting the contact details and the user placing an order is disregarded from an attribution perspective as the marketing objective of generating a lead has been met.

  • Non-direct touchpoints based:

In the ‘non-direct’ attribution direct touchpoints are not given attribution credit. This is because we have long customer journeys, so a customer will visit us directly multiple times before converting, and attributing credit to those direct touchpoints would steer considerable credit away from the marketing channels. Often marketing channels are still responsible for driving direct interactions; they just happen in different follow-up sessions. The only exception in this variation is when Direct is the first touchpoint: it still receives credit, as in that case no marketing channel was responsible for starting the customer journey.

This means we end up with a total of 13 attribution models: First touch is always first touch regardless, and then 4 variations (“to order”, “to lead”, “to order non direct”, “to lead non direct”) for last touch, linear and position based.
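
The count of 13 can be checked by enumerating the combinations. The model labels below are just illustrative strings, not names used in Cazoo’s codebase.

```python
from itertools import product

base_models = ["last touch", "linear", "position based"]
variations = ["to order", "to lead", "to order non direct", "to lead non direct"]

# First touch has no variations; every other model exists in all four.
models = ["first touch"] + [f"{m} {v}" for m, v in product(base_models, variations)]

for name in models:
    print(name)
print(len(models))
```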

I like to think of position-based to lead non-direct as a good candidate for a company default model, but the truth is we built it in a way that lets us be flexible and switch between attribution models as needed. Let’s actually take a look at a real-life example at Cazoo (PII protected obviously):

Touchpoint is Lead = True means the customer became a lead on their 2nd touchpoint (after clicking on a FB Ads ad); Touchpoint is Order = True means that the customer converted (ordered a car) on their 21st touchpoint (in a session originated by an Organic Search).

So, given this customer journey how is credit distributed amongst the different channels?

  • Generic Paid Search being the first touchpoint will naturally be credited 100% of the conversion on a first touch basis and Organic Search being the last touchpoint will be credited 100% on a last touch basis;
  • However if we use the variation last touch to lead it will be Facebook ads receiving 100% of the credit as it was the last touchpoint before the customer became a lead;
  • As for linear attribution, in its original form Email gets most of the credit (most of the touchpoints in the journey originate from email) and Direct gets ~14% of the credit, whereas the other 3 channels only get ~5% each. However, in the to lead variation Facebook Ads and Generic Paid Search get 50% of the credit each, and I think this illustrates why the to lead variation can be so useful from a marketing point of view (Email was mostly just working with the lead that FB and Google “gave” to it, so it makes more sense to credit the channels responsible for the acquisition of this customer);
  • Finally, the non-direct variations strip Direct from any credit (still on the linear attribution example, the linear attribution to order non direct gives 0% credit to Direct when its original variation was attributing ~14% of the credit to Direct);
  • The table below summarises how credit is distributed amongst the different channels for each of the attribution models.
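
The figures in the bullets above can be reproduced with a short sketch. The journey below is an illustrative reconstruction: the channel counts (1 Generic Paid Search, 1 Facebook Ads, 15 Email, 3 Direct, 1 Organic Search) are chosen to match the quoted percentages, but the order of the middle touchpoints is invented, and the helper names are hypothetical.

```python
from collections import defaultdict

# Illustrative reconstruction of the 21-touchpoint journey (middle order invented).
journey = (
    ["Generic Paid Search", "Facebook Ads"]   # touchpoint 2 creates the lead
    + (["Email"] * 5 + ["Direct"]) * 3        # 15 Email and 3 Direct sessions
    + ["Organic Search"]                      # touchpoint 21 places the order
)
LEAD_POSITION = 2  # 1-based position of the touchpoint where the customer became a lead


def eligible(journey, to_lead=False, non_direct=False):
    """Return the touchpoints a given model variation is allowed to credit."""
    points = journey[:LEAD_POSITION] if to_lead else list(journey)
    if non_direct:
        # Direct keeps its credit only when it is the very first touchpoint.
        points = [ch for i, ch in enumerate(points) if ch != "Direct" or i == 0]
    return points


def linear(points):
    """Equal share of one conversion for every eligible touchpoint."""
    credit = defaultdict(float)
    for channel in points:
        credit[channel] += 1 / len(points)
    return dict(credit)


def last_touch(points):
    """Whole conversion to the last eligible touchpoint."""
    return {points[-1]: 1.0}


def show(name, credit):
    print(name, {ch: f"{share:.0%}" for ch, share in credit.items()})


show("linear to order", linear(eligible(journey)))
show("linear to lead", linear(eligible(journey, to_lead=True)))
show("linear to order non direct", linear(eligible(journey, non_direct=True)))
show("last touch to lead", last_touch(eligible(journey, to_lead=True)))
```

Under these assumptions Direct holds 3 of 21 touchpoints (~14%) in plain linear attribution and drops out entirely in the non-direct variation, while the to lead variation only ever sees the first two touchpoints.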

If you’ve read this far you can now understand how attribution solves the problems mentioned at the beginning: we now know how much credit is due to each channel and we also know how much we are spending on those channels (not only in £ ad spend but also in terms of effort/people), so we can understand if the return is worth the investment. Now, if you were mostly interested from a business point of view you can go directly to the last bit of the article (i.e. final thoughts). However, if you are interested in learning about the data and tech parts of this project, and how to implement an attribution model in real life in a world of imperfect data and tracking, then stick around for the next chapter.

Actually implementing attribution modelling in your organisation

Building an attribution model involves a wide range of technical challenges. They are all worth mentioning, but some could fill an article of their own, so in the interest of reading time I’ll try to be both thorough and succinct. These are roughly the steps we took to make it happen here at Cazoo:

  1. Event Tracking. This is not necessarily part of an attribution modelling project, but it’s definitely a prerequisite. To actually capture the touchpoints we talked about above, we need a front-end event tracking solution. There are several options out there, but our setup is quite simple yet really powerful. We use Google Tag Manager to track events into Segment, which in turn forwards those events to several other platforms, notably Google Analytics, which we then connect seamlessly to our BigQuery data warehouse (as they are both Google products).

[Figure: Cazoo’s Front-End Event Tracking]

2. Put together a holistic customer journey: cross-device customer tracking. This one is particularly challenging for Cazoo, yet it is a key step in the whole process. As our customers are not logged in when they browse our website, we don’t have the same weapons to tackle identity resolution as other companies do, so we need alternatives. We mostly have two at the moment: 1) Our best one leverages Segment’s Personas product: when you submit your email (or click on an email that takes you to the Cazoo website) on multiple devices, we know that those devices belong to the same customer; 2) We use some internal IDs to tie together cross-device journeys: whenever you submit a finance application we assign you a finance checkout ID. If you then resume that journey on a different device, for example by clicking on an email that leads you to that same finance checkout ID, we understand that those two devices belong to the same customer.

And why is this important? Take a look at the image below: we see a customer who did most of their research and got familiar with the website on a mobile device, but eventually placed an order on a desktop. If we did not consider this cross-device behaviour, we would have wrongly assumed that this customer’s journey was a single session originated by an Organic Search, when in fact their journey started 43 days earlier via Generic Paid Search.
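
The idea of stitching devices together through shared identifiers can be sketched with a small union-find. This is only an illustration of the principle, not how Segment Personas or Cazoo’s pipeline actually works; the device IDs, identifier strings and function name are all hypothetical.

```python
def resolve_identities(links):
    """Group device IDs that share any identifier, directly or transitively.

    links: iterable of (device_id, identifier) pairs, where an identifier is
    for example a hashed email or a finance checkout ID.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving keeps lookups short
            x = parent[x]
        return x

    def union(a, b):
        parent[find(a)] = find(b)

    first_device_for = {}
    for device_id, identifier in links:
        find(device_id)  # register the device even if it never links to another
        if identifier in first_device_for:
            union(device_id, first_device_for[identifier])
        else:
            first_device_for[identifier] = device_id

    groups = {}
    for device in parent:
        groups.setdefault(find(device), set()).add(device)
    return list(groups.values())


# Hypothetical observations: the desktop shares an email with one device and
# a finance checkout ID with another, so all three belong to one customer.
links = [
    ("mobile_A", "email:hash_1"),
    ("desktop_B", "email:hash_1"),
    ("tablet_C", "finance_checkout:77"),
    ("desktop_B", "finance_checkout:77"),
    ("mobile_D", "email:hash_2"),
]
customers = resolve_identities(links)
print(customers)
```

Once devices are grouped like this, their sessions can be merged into a single journey and ordered by time before any credit is distributed.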

3. Put together a holistic customer journey: cross-platform customer tracking. This one is also very particular to Cazoo and stems from the fact that we advertise on aggregator websites like Autotrader. It is also probably the one that made the biggest difference in terms of fully understanding the impact of a channel: without it we’d have a completely different read on the importance of these aggregators in our marketing mix. When you are browsing Autotrader and you come across one of our ads/listings you have three options to reach Cazoo: click on a link to our website (which is tracked like any other channel), email us directly from the Autotrader platform, or call us. When you email us, we can compare that email against any email you submitted on our website (for example, when you submit a finance application with us or subscribe to our stock alerts). Finally, when you call us about an Autotrader listing, we later try to match your number against the phone number you submit on our website when placing an order.

Again, let’s look at the image below to understand why this is important. If we did not attempt to connect the customers contacting us on aggregator websites with their journeys on Cazoo’s website, we would assume that in this particular case their journey had started via email on our website (which is impossible, because you can only receive an email from Cazoo if you have been in touch with us). However, when we understand that you first reached out to us via the email form on Autotrader, your whole journey suddenly makes sense and we can give credit where it’s due: to the aggregator listing.

Note: we do this in a completely anonymised way. We never actually look at users’ emails or phone numbers, only at a hashed version of them.
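
Matching on hashed identifiers can be sketched as follows. The choice of SHA-256, the normalisation step (trimming and lower-casing), the field names and the example addresses are all assumptions made for illustration; the article does not say which hashing scheme Cazoo uses. Raw emails appear here only to build the toy data; in a real pipeline the hashing would happen upstream.

```python
import hashlib


def hash_identifier(value):
    """Normalise then hash, so the same email always yields the same digest."""
    return hashlib.sha256(value.strip().lower().encode("utf-8")).hexdigest()


# Hashed emails captured on the website, keyed to an internal customer ID.
website_submissions = {
    hash_identifier("Jane.Doe@example.com"): "customer_42",
}

# Hashed emails from enquiry forms on an aggregator listing.
aggregator_enquiries = [
    {"hashed_email": hash_identifier("jane.doe@example.com "), "ts": "2021-03-01", "channel": "Autotrader"},
    {"hashed_email": hash_identifier("someone.else@example.com"), "ts": "2021-03-02", "channel": "Autotrader"},
]

# An enquiry becomes a touchpoint in a customer's journey when the hashes match.
matched_touchpoints = [
    {**enquiry, "customer_id": website_submissions[enquiry["hashed_email"]]}
    for enquiry in aggregator_enquiries
    if enquiry["hashed_email"] in website_submissions
]
print(matched_touchpoints)
```

Normalising before hashing matters because a hash comparison is exact: a stray space or a capital letter would otherwise produce a different digest and the match would be missed.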

4. Data Manipulation, ETLing and lots of SQL code. In this stage we start by taking the work done on putting together holistic customer journeys and collating it in one big table we call customer_touchpoints (i.e. a place where we store all relevant customer interactions from a marketing point of view; for us at Cazoo this is mostly website sessions and form submissions on aggregator websites like Autotrader). From here, the goal is to reach a version in which each touchpoint has been attributed its due credit for each conversion. In between there is lots of data manipulation, window functions and ranking of touchpoints (this is how a computer reading SQL in an automated way makes sense of the order of touchpoints, versus a human looking at them manually in a spreadsheet) and ETLing using dbt (massive time saver, you should check it out!). In case you are interested in the more technical details of it you can find some code snippets below using the same example in the intro chapter but now working it out in a more SQLish way.

The output table (attributed_conversions) is basically still a big collection of touchpoints, each of them carrying the credit it is owed for its influence on the path to conversion. This table is quite granular, which allows us to then calculate an aggregate sum of attributed conversions at whichever level we need (by channel, date, device type, campaign, etc.).
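
The ranking-then-crediting pattern can be sketched with window functions. This is a simplified illustration, not Cazoo’s dbt code: it runs on an in-memory SQLite database as a stand-in for BigQuery, the table names come from the article but the column names and rows are hypothetical, and it assumes every journey ends in exactly one conversion.

```python
import sqlite3

# Window functions need SQLite 3.25 or newer, which recent Python builds bundle.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE customer_touchpoints (customer_id TEXT, touchpoint_ts TEXT, channel TEXT);
INSERT INTO customer_touchpoints VALUES
    ('c1', '2021-01-01', 'Generic Paid Search'),
    ('c1', '2021-01-05', 'Email'),
    ('c1', '2021-01-09', 'Direct'),
    ('c1', '2021-01-12', 'Organic Search'),
    ('c2', '2021-02-01', 'Facebook Ads'),
    ('c2', '2021-02-03', 'Email');
""")

query = """
WITH ranked AS (
    -- Rank each touchpoint within its customer's journey, oldest first.
    SELECT
        customer_id,
        channel,
        ROW_NUMBER() OVER (PARTITION BY customer_id ORDER BY touchpoint_ts) AS touchpoint_rank,
        COUNT(*) OVER (PARTITION BY customer_id) AS touchpoints_in_journey
    FROM customer_touchpoints
),
attributed_conversions AS (
    -- One row per touchpoint, carrying its credit under each model.
    SELECT
        customer_id,
        channel,
        CASE WHEN touchpoint_rank = 1 THEN 1.0 ELSE 0.0 END AS first_touch_credit,
        CASE WHEN touchpoint_rank = touchpoints_in_journey THEN 1.0 ELSE 0.0 END AS last_touch_credit,
        1.0 / touchpoints_in_journey AS linear_credit
    FROM ranked
)
SELECT
    channel,
    SUM(first_touch_credit) AS first_touch,
    SUM(last_touch_credit) AS last_touch,
    ROUND(SUM(linear_credit), 4) AS linear
FROM attributed_conversions
GROUP BY channel
ORDER BY channel
"""
rows = conn.execute(query).fetchall()
for row in rows:
    print(row)
```

Because credit is stored per touchpoint, the final SELECT could just as easily group by date, device type or campaign if those columns were present.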

5. Democratise it across the organisation. The final stage is both critical and relatively simple. It’s critical because without it our stakeholders in the marketing teams wouldn’t benefit from the project; simple because the heavy lifting of data manipulation and tracking has already been done by now. The last step is to make your attributed_conversions table available in a visualisation tool (we use Looker at Cazoo) so that the marketing teams can explore this data without the need to run any SQL queries at all.

Final Thoughts

Ok great we built an attribution model, so what? What is the business impact so far?

  1. Knowledge is power, you can’t optimise what you don’t measure. The first impact has to be the fact that all our marketing teams can now better understand the impact that their channels and campaigns are having on the bottom line. In addition, they can monitor this in nearly real time in charts similar to the one below (here we are comparing channels monthly, but we could also be looking at totals per channel, campaign or device type, and at daily or weekly views, etc.).

[Figure: Attributed Orders per Channel in Looker]

2. The Performance Marketing team has already made some investment decisions off the back of it. We combined the attributed conversions data with the advertising platforms’ data (things like impressions, clicks and spend), which gives us a cost per attributed conversion per channel (or per campaign, ad group, etc.). With that, we were able to see which channels had a higher cost per attributed conversion and move some of their budget into other channels. It also provided evidence to negotiate a contract with one of the aggregator platforms we list on (the leads they were sending us turned out to be of much lower quality and were not resulting in many conversions, leading to an extremely high cost per attributed conversion when compared with other aggregators).
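
The calculation itself is a simple join and division, sketched below. All channel names and numbers are invented for illustration and do not reflect Cazoo’s actual spend or results.

```python
# Invented figures: attributed conversions (fractional, as credit is shared)
# and ad spend in pounds for the same period.
attributed_conversions = {
    "Generic Paid Search": 120.0,
    "Facebook Ads": 45.5,
    "Aggregator A": 30.0,
    "Aggregator B": 4.0,
}
spend = {
    "Generic Paid Search": 60000,
    "Facebook Ads": 18200,
    "Aggregator A": 9000,
    "Aggregator B": 8000,
}

# Cost per attributed conversion, skipping channels with no attributed conversions.
cost_per_conversion = {
    channel: spend[channel] / conversions
    for channel, conversions in attributed_conversions.items()
    if channel in spend and conversions > 0
}

# Most expensive channels first: candidates for budget reallocation or renegotiation.
ranked = sorted(cost_per_conversion.items(), key=lambda item: item[1], reverse=True)
for channel, cost in ranked:
    print(f"{channel}: £{cost:,.0f} per attributed conversion")
```

In this made-up data the second aggregator stands out with by far the highest cost per attributed conversion, which is the kind of signal described above.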

I want to leave a final paragraph just to say that although attribution has been a major enhancement to the way we measure our marketing activity at Cazoo, it shouldn’t be the end of it. As I mentioned earlier, attribution is deterministic and real life is not always so, particularly when we do so much offline marketing activity like TV. We are now a well-recognised brand, so we must accept that some of the conversions we get from some digital channels would potentially still have happened had we not run that digital activity. How many, we don’t know for sure, and that’s why we should include some incrementality studies in our marketing measurement roadmap (basically running tests to understand how many conversions we would still get if we ran no activity at all in a given digital marketing channel). Finally, another (less obvious) caveat is that attribution measures average cost per conversion but tells us nothing about our marginal performance (i.e. it doesn’t tell us what our cost per attributed order would have been had we invested 10% more or less in a given channel, and the change would likely not be 10% in either direction). Curious about this topic? More in here.

António Lima
Cazoo Technology Blog

Experienced Data & Analytics Leader || 20 Rising Stars in Data & Analytics 2024 || Best Use of Data @ UK Search Awards 2023 || LinkedIn: https://bit.ly/3baKxFU