A Friendly Introduction to Data-Driven Marketing for Business Leaders

James Le
Oct 22


Marketing attribution has been around for many years, and as the number of available advertising channels continues to shift and expand, so do the strategies employed by teams to leverage those channels.

In this blog post, I want to dive deep specifically into using machine learning models as opposed to heuristic models for marketing attribution across digital channels. Hopefully, the post can teach you what it means to use data science for marketing attribution, as well as how this can make the difference in scaling efforts to reach customers with more customized targeting whether in B2C or B2B.

Data Science in Marketing Attribution

Marketing attribution is the process of measuring campaign effectiveness by quantifying the influence those campaigns have on the desired outcome (e.g., starting a free trial, making a purchase, etc.). By understanding which channels or what content leads to a higher conversion rate to these desired outcomes, marketing teams can better optimize spend and messaging.

Today, machine learning and artificial intelligence allow marketing teams to go far beyond the attribution methods introduced in the previous decade. For example, teams can build ideal customer journeys down to granular user segments or, in some cases, down to the individual level for hyper-personalization, which generally translates to more desired actions. With algorithms handling data from multiple sources and giving near real-time feedback on the most effective channels, they can scale their efforts to reach more people more effectively, ideally while spending less money. Furthermore, tight integration with a customer relationship management (CRM) system or ad platforms can reduce manual processes and introduce more automation. Finally, with machine learning (ML) models doing the heavy lifting, marketers are freer to get creative and experiment with channels and messaging, especially when real-time feedback on effectiveness lets them pivot if needed.

The 7 Steps to Build a Marketing Attribution Machine Learning Model

As with any data science project, marketing attribution must begin on the business side. Before diving into the data, the team needs to take a step back and answer the following questions (preferably with business/marketing and data teams together):

How are we currently doing marketing attribution? Before starting a new project, it’s important to understand what teams are already doing to address the question of channel attribution. Every member of the team tackling marketing attribution should know how it’s currently being done, why it’s being done that way, how it works, the results it’s delivering, and who is using those results. This will provide a clearer picture of needs.

How many different types of campaigns do we have, and what is the desired action for each? For some businesses or for particular campaigns, the desired action might be making a purchase. For others, the focus might be awareness, so a potential customer simply visiting the website would be considered the goal action. In any case, the desired action for each marketing campaign must be defined specifically. Different attribution models might work better or worse with certain campaigns, so mapping this out clearly before getting started is critical.

What is the ideal way to deliver results that will have a real business impact? Failing to define deliverables before kicking off a marketing attribution project sets the stage for failure — especially when data scientists and marketing teams aren’t aligned and the result is something the marketing team can’t make use of.

Coming up with a good data science solution for a business question starts with properly scoping out the business needs, but once that’s finished, the second most essential component is good data.

The first step is to map out all channels and touch-points along the customer journey to be sure that no channels are forgotten. From there, good data means the prerequisite tracking of all user actions on each targeted channel. Moreover, it means understanding exactly what data is attached to each touchpoint and where the data comes from as well as what limitations might exist. Understanding attribution data is not only fundamental to the accuracy of models, but it’s also essential for business teams and leaders to trust model outcomes.

After identifying all the right data sources, no matter what algorithm is ultimately chosen for the attribution, the next step in all cases is to ensure the data is clean and in the right format. This requires that the user sessions are constructed and well-defined. It is at this point in the process that one may discover channels where data is missing altogether.
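As a rough illustration of what "constructing user sessions" can look like, here is a minimal Python sketch. It assumes each tracked event is a (user_id, timestamp, channel) tuple and that a session ends after 30 minutes of inactivity. Both the record layout and the timeout are assumptions made for this example, not something the process above prescribes.

```python
from datetime import datetime, timedelta
from itertools import groupby

# Assumption: a session ends after 30 minutes without activity.
SESSION_TIMEOUT = timedelta(minutes=30)


def build_sessions(events, timeout=SESSION_TIMEOUT):
    """Group (user_id, timestamp, channel) events into per-user sessions.

    Returns a dict mapping user_id to a list of sessions; each session is
    a list of (timestamp, channel) tuples in chronological order.
    """
    sessions = {}
    ordered = sorted(events, key=lambda e: (e[0], e[1]))
    for user_id, user_events in groupby(ordered, key=lambda e: e[0]):
        user_sessions = []
        for _, ts, channel in user_events:
            # Start a new session on the first event or after a long gap.
            if not user_sessions or ts - user_sessions[-1][-1][0] > timeout:
                user_sessions.append([])
            user_sessions[-1].append((ts, channel))
        sessions[user_id] = user_sessions
    return sessions


# Hypothetical sample data, for illustration only.
events = [
    ("u1", datetime(2019, 10, 1, 9, 0), "email"),
    ("u1", datetime(2019, 10, 1, 9, 10), "paid_search"),
    ("u1", datetime(2019, 10, 1, 14, 0), "social"),
    ("u2", datetime(2019, 10, 2, 8, 0), "display"),
]
sessions = build_sessions(events)
```

A channel that never shows up in the resulting sessions, even though the team knows it is running campaigns there, is exactly the kind of missing data this step is meant to surface.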

Should this process uncover holes in the data, the best approach is to stop and address the problem. It’s not possible to build an accurate attribution model with missing data, so taking the time to fix the issue to ensure data is attributed properly before moving forward is critical.

It is at this point that it’s necessary to define the model that will be used for the project. This is because all the subsequent steps of working with the data are dependent upon which model is being used for attribution.
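For reference, the heuristic models that machine learning approaches are usually compared against are simple fixed rules. The sketch below shows two common ones, last-touch and linear attribution, applied to a handful of made-up converting journeys. The function names and sample data are hypothetical, and each journey is assumed to be a non-empty, ordered list of channel names.

```python
from collections import defaultdict


def last_touch(journeys):
    """Give all credit for each conversion to the final channel touched."""
    credit = defaultdict(float)
    for path in journeys:
        credit[path[-1]] += 1.0
    return dict(credit)


def linear(journeys):
    """Split credit for each conversion equally across every touch."""
    credit = defaultdict(float)
    for path in journeys:
        share = 1.0 / len(path)
        for channel in path:
            credit[channel] += share
    return dict(credit)


def as_shares(credit):
    """Normalize raw credit so the channel shares sum to 1."""
    total = sum(credit.values())
    return {channel: value / total for channel, value in credit.items()}


# Hypothetical converting journeys, each an ordered list of touchpoints.
converting_journeys = [
    ["social", "email", "paid_search"],
    ["email", "paid_search"],
    ["social"],
    ["email"],
]
last_touch_shares = as_shares(last_touch(converting_journeys))
linear_shares = as_shares(linear(converting_journeys))
```

On this toy data, last-touch gives paid search half of the credit, while linear gives email the largest share. The same journeys produce different answers depending on the rule, which is why the model choice has to be settled before the remaining data work.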

With other types of machine learning models (for example, churn, predictive maintenance, or anomaly detection), it’s possible to split data into train and test sets and compare the model’s predictions to actual outcomes. Marketing attribution isn’t truly a predictive model, so there are no “actual” outcomes to compare against before making the model live. The only way to actually test a marketing attribution model is to use it.

Traditionally in a data project, once data is clean and prepared, predictive models can be applied. In the case of marketing attribution, nothing is actually being predicted. Instead, the outcome of the model will be a percentage or score for each channel.
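To make the idea of "a percentage or score for each channel" concrete, here is a small sketch of one data-driven technique sometimes used for attribution: Shapley values computed over sets of channels. This is one possible choice, not a method named above. It assumes each journey is a (set of channels, converted) pair, and it scores a set of channels by the number of conversions from journeys that used only channels in that set. That scoring rule is a simplifying assumption for this example.

```python
from itertools import combinations
from math import factorial


def coalition_value(journeys, coalition):
    """Count conversions from journeys that used only channels in `coalition`."""
    return sum(
        1
        for channels, converted in journeys
        if converted and channels and channels <= coalition
    )


def shapley_attribution(journeys):
    """Return a Shapley-value score per channel (scores sum to conversions)."""
    all_channels = sorted(set().union(*(channels for channels, _ in journeys)))
    n = len(all_channels)
    scores = {}
    for channel in all_channels:
        others = [c for c in all_channels if c != channel]
        score = 0.0
        for size in range(n):
            # Standard Shapley weight for a coalition of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                without = frozenset(subset)
                with_channel = without | {channel}
                # Marginal conversions gained by adding this channel.
                score += weight * (
                    coalition_value(journeys, with_channel)
                    - coalition_value(journeys, without)
                )
        scores[channel] = score
    return scores


# Hypothetical journeys: (channels touched, whether the user converted).
journeys = [
    (frozenset({"email"}), True),
    (frozenset({"search"}), True),
    (frozenset({"email", "search"}), True),
    (frozenset({"email", "search"}), True),
    (frozenset({"social"}), False),
]
scores = shapley_attribution(journeys)
total = sum(scores.values())
shares = {channel: score / total for channel, score in scores.items()}
```

The loop visits every subset of channels, so this sketch is only practical for a small number of channels. With more channels, teams typically turn to sampling or a different model altogether.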

Of course, visualizations can be useful when it comes to marketing attribution to illustrate the distribution of the conversions for the channels themselves. This might be a bar chart showing conversions (or percentage of conversions) per channel for all time. Or it could be a line chart showing conversions per channel over time, which can be useful to see if there is fluctuation. Fluctuation could either indicate seasonality or, more likely, that the algorithm is unstable, which is a good sign that iteration is necessary.
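Alongside a chart, the same per-period numbers can be checked for fluctuation directly. The sketch below assumes attribution shares have already been computed for each week and flags channels whose share varies a lot relative to its average. The weekly figures and the 0.25 cutoff are made up for illustration; a sensible threshold depends on the business and on how much seasonality is expected.

```python
from statistics import mean, pstdev


def share_variability(shares_by_period):
    """Return each channel's coefficient of variation across periods."""
    channels = sorted({ch for shares in shares_by_period for ch in shares})
    result = {}
    for ch in channels:
        series = [shares.get(ch, 0.0) for shares in shares_by_period]
        avg = mean(series)
        # Standard deviation relative to the mean; 0 if the channel got no credit.
        result[ch] = pstdev(series) / avg if avg > 0 else 0.0
    return result


# Hypothetical weekly attribution shares per channel.
weekly_shares = [
    {"email": 0.40, "paid_search": 0.40, "social": 0.20},
    {"email": 0.42, "paid_search": 0.38, "social": 0.20},
    {"email": 0.10, "paid_search": 0.70, "social": 0.20},
    {"email": 0.41, "paid_search": 0.39, "social": 0.20},
]
THRESHOLD = 0.25  # hypothetical cutoff for "worth investigating"

variability = share_variability(weekly_shares)
unstable = [ch for ch, cv in variability.items() if cv > THRESHOLD]
```

Here the third week pulls both email and paid search over the cutoff. A flag like this does not say whether the cause is seasonality or an unstable algorithm, only that the channel deserves a closer look.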

Deploying a marketing attribution project can mean any number of things depending on the deliverables defined with the business and marketing teams. But at a very minimum, it means having a model working on actual data and updating regularly based on current data. The update frequency should have been defined in the deliverables agreed upon with marketing. Depending on their needs and the nature of the business, it could be daily, weekly, monthly, or even annually.

Marketing attribution is unique as a data science project in that the only way to see its effects is to deploy the model, update marketing spending accordingly, and observe the change on the business side. In other words, after adjusting spend based on the model, look at the number of conversions: how did allocating less budget to a specific channel affect conversions overall?

By repeating this process for different channels and measuring the resulting business outcome, marketing teams will be able to identify the optimal balance.


It is true that there is traditionally no way to test marketing attribution models before using them in a real business context, since it’s not possible to compare the algorithm’s output to a source of truth. That said, marketing attribution is still an evolving discipline, and data scientists are exploring ways to gauge a model’s performance before applying it to real-time data and testing it in real life by moving marketing budget around.

Attributing advertising channel conversions is perhaps the biggest — yet also most complex — challenge that today’s marketing teams face. And there is no magic bullet solution; though employing data science and machine learning techniques can significantly lower the time spent and deliver better results than traditional heuristic models, it’s still not a one-and-done deal. Marketing teams must continuously evaluate channels, and the use of those channels, at regular intervals to understand and address shifts in consumer behavior over time.

What’s more, this landscape will continue to grow more complex over time as new avenues for reaching potential customers emerge. Taking an algorithmic approach to attribution is just the beginning of driving change by moving toward a more detailed, data-driven approach in marketing.

— —

If you enjoyed this piece, I’d love it if you hit the clap button 👏 so others might stumble upon it. You can find my own code on GitHub, and more of my writing and projects at https://jameskle.com/. You can also follow me on Twitter, email me directly or find me on LinkedIn. Sign up for my newsletter to receive my latest thoughts on data science, machine learning, and artificial intelligence right in your inbox!

Cracking The Data Science Interview

Your Ultimate Guide to Data Science Interviews
