Attribution Modeling using Markov Chain

Akanksha Anand (Ak)
5 min read · Jan 12, 2024

In the last post, we used Google Analytics 4 to see the impact of different marketing channels on final conversion and revenue using various attribution models. Now it's time to try a probabilistic approach.

With the advent of the internet, customers go through a long journey of researching a product and looking for good deals before finally buying it. This leads marketing teams to track the customer journey through the different marketing channels, analyze which ones contributed most to conversion, and strategize campaigns to attract more customers. Tracking the performance of each channel involved in the customer's journey creates robust data, which ultimately helps with analysis and with allocating the marketing budget according to each channel's performance.

But how can Markov chains help us demystify the customer's journey? Before getting to that, let's take a step back and learn what the Markov chain methodology is and how it can help us explore channel value.

Markov Chain

A Markov chain is a mathematical model that represents a system that undergoes transitions from one state to another, with the assumption that the future state depends only on the current state and not on the sequence of events that preceded it. This concept is known as the Markov property.

In a Markov chain, the system is described by a set of states, and the transition probabilities between these states. The basic idea is that the probability of transitioning from one state to another depends only on the current state and not on how the system arrived at that state.

Here are some key terms associated with Markov chains:

  1. States: The distinct conditions or situations that the system can be in. These are the possible values that the system can take.
  2. Transitions: The movement from one state to another. Each state has associated probabilities for transitioning to other states.
  3. Transition Probabilities: The probabilities assigned to each possible transition. If the system is in state i, the probability of moving to state j is represented by a transition probability P(i, j).
  4. Markov Property: This property implies that the probability of moving to any particular state depends only on the current state and not on the system's history.

Imagine creating a model to understand a baby’s actions using the Markov chain. In this model, you’d consider activities like “playing,” “eating,” and “crying” as different states that the baby can be in. Together with other behaviors, these states make up what we call a ‘state space’ — basically, a list of all possible situations the baby might be in. Now, the Markov chain helps us figure out the likelihood of the baby moving from one state to another. For example, it could tell us the chance that a baby, who is currently playing, will start crying in the next five minutes without eating first.

Markov State Diagram
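The baby example can be sketched in a few lines of Python. This is a minimal illustration with made-up probabilities (the states come from the example above; none of the numbers are measured data):

```python
import numpy as np

# Hypothetical transition probabilities for the baby example.
# Rows = current state, columns = next state.
states = ["playing", "eating", "crying"]
P = np.array([
    [0.6, 0.3, 0.1],  # from "playing"
    [0.4, 0.2, 0.4],  # from "eating"
    [0.5, 0.3, 0.2],  # from "crying"
])

# Every row must sum to 1: the baby always moves to *some* next state.
assert np.allclose(P.sum(axis=1), 1.0)

# Starting from "playing", the distribution over states two steps later
# is obtained by multiplying the start vector by P twice.
start = np.array([1.0, 0.0, 0.0])
after_two = start @ np.linalg.matrix_power(P, 2)
print(dict(zip(states, after_two.round(3))))
```

Because of the Markov property, everything about the baby's future is captured by the current state and the matrix `P`; no history is needed.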

Markov Chain for Marketing Attribution

Let’s dive into how Markov chains apply to marketing attribution. The key lies in the sequence of a customer’s journey through various marketing channels, so we need pathing data that captures the order of interactions. The goal is to understand not just the impact of individual channels, but how they work together to lead customers to conversion.

The graph below illustrates the detailed web of customer journeys. Every step from one channel to another, clearly presented in the pathing dataset, is assigned a probability. This creates a compelling story of customer interactions, providing practical insights into how these probabilities evolve.

Markov Chain for Marketing Attribution

Consider a scenario based on the above graph: a customer initially receives a product-related email highlighting its features. After a few days, they encounter a display ad for the same product, strategically placed as part of the marketing campaign, while perusing their favorite blog. Later that evening, while casually scrolling through social media, they stumble upon a video by a social media influencer providing detailed insights into the product along with a discount code. This ultimately convinces the customer to make the purchase.

A crucial aspect to grasp is that Markov chains operate on the probability of one interaction influencing the next, solely based on the current interaction. Imagine a scenario where a customer, having just experienced paid social, is predicted to interact with paid search next. This prediction relies solely on the recent encounter with paid social, excluding the wealth of prior interactions.
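In practice, the transition probabilities in the graph are estimated from pathing data by counting consecutive touchpoint pairs. Here is a minimal sketch using a hypothetical handful of journeys (channel names and counts are illustrative only):

```python
from collections import Counter, defaultdict

# Hypothetical pathing data: each list is one customer's ordered journey,
# ending in either "conversion" or "null" (no purchase).
paths = [
    ["start", "email", "display", "social", "conversion"],
    ["start", "email", "null"],
    ["start", "display", "social", "conversion"],
    ["start", "social", "null"],
]

# Count transitions between consecutive touchpoints.
counts = defaultdict(Counter)
for path in paths:
    for current, nxt in zip(path, path[1:]):
        counts[current][nxt] += 1

# Normalize counts into transition probabilities P(current -> next).
transition_probs = {
    state: {nxt: n / sum(nexts.values()) for nxt, n in nexts.items()}
    for state, nexts in counts.items()
}
print(transition_probs["start"])
```

Note that only consecutive pairs are counted: consistent with the Markov property, the estimate for each transition ignores everything that happened earlier in the journey.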

Removal Effect

The Removal Effect is a method to measure how much each marketing channel contributes to generating conversions. We achieve this by systematically taking out each channel from the overall picture and observing the resulting impact on conversions. The greater the impact, the more value is assigned to that specific channel. This process is repeated for each channel, allowing us to gauge their influence on conversions.

To compute the Removal Effect, we begin by figuring out the likelihood of all paths leading to conversions. The table below illustrates this, including the overall probability of conversion considering all possible paths. When we add up all the probabilities, we find that the total likelihood of conversion is approximately 64%.

Path Probability
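A table like this can be produced by walking the graph, multiplying the transition probabilities along each route, and summing over all paths that end in a conversion. A minimal sketch with a hypothetical, loop-free transition graph (the figures below are illustrative, not the article's):

```python
# Hypothetical transition probabilities (illustrative figures).
transitions = {
    "start":   {"email": 0.4, "display": 0.3, "social": 0.3},
    "email":   {"display": 0.5, "conversion": 0.3, "null": 0.2},
    "display": {"social": 0.5, "conversion": 0.3, "null": 0.2},
    "social":  {"conversion": 0.6, "null": 0.4},
}

def conversion_probability(trans, state="start"):
    """Sum of probabilities over all paths from `state` that reach conversion.
    Assumes the graph has no loops, so the recursion terminates."""
    if state == "conversion":
        return 1.0
    if state not in trans:  # "null", or any state with no outgoing edges
        return 0.0
    return sum(p * conversion_probability(trans, nxt)
               for nxt, p in trans[state].items())

total = conversion_probability(transitions)
print(round(total, 3))
```

With these made-up numbers the total conversion probability comes out to 60%; in the article's example the same summation yields roughly 64%.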

Once the total conversion probability has been calculated, we can remove each channel one by one and calculate its Removal Effect using the formula below:

Removal Effect = 1 − (Conversion probability without the channel / Total conversion probability)

Removal Effect Calculation

Based on the above table, if Email is removed from the marketing mix, 79% of conversions are affected. Similarly, removing Display or Social would each affect 46% of conversions.

To simplify interpretation, we can standardize the Removal Effects so that they sum to one. Each standardized value then directly indicates the percentage of the overall credit assigned to that channel.

Channel Value
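Putting the two steps together, the removal loop and the standardization can be sketched as follows. The same caveats apply: the transition graph and its figures are hypothetical, and the simple path enumeration assumes no loops:

```python
# Hypothetical transition probabilities (illustrative figures).
transitions = {
    "start":   {"email": 0.4, "display": 0.3, "social": 0.3},
    "email":   {"display": 0.5, "conversion": 0.3, "null": 0.2},
    "display": {"social": 0.5, "conversion": 0.3, "null": 0.2},
    "social":  {"conversion": 0.6, "null": 0.4},
}

def conversion_probability(trans, state="start"):
    """Total probability of reaching conversion from `state` (loop-free graph)."""
    if state == "conversion":
        return 1.0
    if state not in trans:  # "null" or a removed channel: the path dies here
        return 0.0
    return sum(p * conversion_probability(trans, nxt)
               for nxt, p in trans[state].items())

base = conversion_probability(transitions)

# Removal Effect = 1 - (conversion probability without the channel / base)
removal = {}
for channel in ("email", "display", "social"):
    without = {s: nxt for s, nxt in transitions.items() if s != channel}
    removal[channel] = 1 - conversion_probability(without) / base

# Standardize so the effects sum to 1: each channel's share of the credit.
total_effect = sum(removal.values())
channel_value = {ch: re / total_effect for ch, re in removal.items()}
print({ch: round(v, 3) for ch, v in channel_value.items()})
```

Dropping a channel's outgoing edges sends every journey through it to a dead end, which is exactly the "remove it and see what breaks" idea behind the Removal Effect.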

While computing this example was manageable using traditional methods, standard campaigns involve a higher number of channels and intricate connections. This complexity is compounded by scenarios where customers repeatedly engage with the same channel, introducing more intricate calculations.

I hope you found this exploration of Markov chains insightful. Next week, we’ll get hands-on with a Python implementation of Markov chains for attribution. I’m eager to hear your thoughts and welcome any feedback you might have; your insights will play a crucial role in shaping my upcoming content.

Stay tuned for more exciting content and thank you for being part of my journey!



Akanksha Anand (Ak)

Data @CIAI, Marketing Media Analytics for Life Science and Healthcare