Measuring Shopping Page Performance with Markov Chain Attribution Modelling

Published in Bukalapak Data · 6 min read · Dec 5, 2018

On online shopping websites and apps, users pass through various pages to find the products they want. The company therefore needs to measure page performance to find out which pages contribute the most to overall purchases.

In this article, we cover how web and app page performance is measured at my company.

It started when my Product Manager asked me a question:

Beth, suppose a buyer’s journey goes like this: they visit the homepage, click item A, click item B that we show on item A’s page, and then buy item B. How should we measure the performance of the homepage and of the first item (item A)? Without these actions, the buyer wouldn’t have found item B. In other words, visiting the homepage and clicking item A from the homepage both contributed to the buyer finding item B.

Then we found Channel Attribution (the Google Analytics approach), a method for giving credit to each touchpoint in a conversion path.

Linear Attribution Model

The basic idea of Linear Attribution is this: when I visit our app’s homepage, click a product, and then click another product before eventually purchasing, the model gives equal credit to each touchpoint.

Journey: visit homepage → click product → click product → purchase

With linear attribution, each touchpoint gets:

homepage = 1/3
clicked product = 2/3

However, what I don’t like about this approach is that it gives equal credit to every touchpoint (page) and does not consider the order of the steps in the journey.
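The linear split above can be reproduced in a few lines of R. This is only a sketch: the `journey` vector is a hypothetical encoding of the example journey, with the purchase itself left out because it is the outcome rather than a touchpoint.

```r
# Touchpoints of the example journey, in order (purchase excluded).
journey <- c("homepage", "clicked product", "clicked product")

# Every step gets an equal share (1/3 here); repeated touchpoints add up.
linear_credit <- table(journey) / length(journey)

print(linear_credit)
```

Note that shuffling the order of `journey` does not change the result, which is exactly the weakness described above.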

Markov Attribution Model

In the Linear Attribution model, a touchpoint gets less credit when the journey is longer, even if the page converts well. Markov Attribution handles this problem because it focuses on how each touchpoint affects conversion.

A Markov chain is a stochastic model that can be drawn as a directed graph, which suits the sequence of touchpoints in a customer journey. Each touchpoint is represented as a vertex, and each edge carries the transition probability from its source node to its destination node. Because the journey graph is treated as a Markov chain, the method is called Markov attribution.

What makes Markov attribution different from other attribution models is that it applies the Removal Effect to measure the importance of a touchpoint.

Let’s assume we have the following customer journey:

visited the homepage (home)
clicked product A from the trending section on the homepage
clicked product B from the “recommendation for you” section on the product detail page (pdp) of product A
finally purchased product B

We can then convert this journey into a session, stored as a single row. Note that this is only done for sessions that end in a purchase.

Journey of A User

Why merge the popular and recommendation sections into their landing pages? Why not treat popular and recommendation as separate paths? Because they are part of the landing page. If we made them independent paths, they would reduce the attribution value of their landing pages, which is contrary to what we believe (recommendations add value to the landing page). So in the analysis, we treat “home”, “home-popular”, and “home-recommendation” as belonging to the same page. In the end, we can also see how much each product section on a page contributes.

Other examples of user journeys:

home → search → pdp → purchase
home → search → pdp → pdp-recommendation → purchase

How Markov Attribution Works

Before we jump into the Markov stage, we need to sessionize the journeys. A session runs from the moment the user opens the web/app until they eventually purchase a product. Any activity after the purchase is treated as a new session.
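As an illustration of this sessionizing rule, here is a minimal base R sketch. The `events` data frame and its `user` and `page` columns are hypothetical, and the rows are assumed to be already sorted by time for each user.

```r
# Hypothetical click log for one user, already ordered by time.
events <- data.frame(
  user = rep("u1", 9),
  page = c("home", "search", "pdp", "purchase",
           "home-popular", "pdp-recommendation", "purchase",
           "home", "search"),
  stringsAsFactors = FALSE
)

is_purchase <- events$page == "purchase"

# Session number = purchases seen BEFORE this event (per user) + 1,
# so the first event after a purchase opens a new session.
events$session <- ave(as.integer(is_purchase), events$user,
                      FUN = function(x) cumsum(c(0L, head(x, -1)))) + 1L

# Keep only sessions that end in a purchase, then drop the purchase rows.
key <- paste(events$user, events$session)
touch <- events[key %in% key[is_purchase] & !is_purchase, ]

# Collapse every session into a single row, touchpoints separated by " > ".
paths <- aggregate(page ~ user + session, data = touch,
                   FUN = paste, collapse = " > ")
print(paths)
```

The last two events (home, search) form a third session without a purchase, so they are dropped from `paths`.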

Let’s look at how the journeys are processed in the Markov model. As an example, we have five unique paths (home, home-popular, search, pdp, pdp-recommendation) across three sessions.

Next, we need to calculate the probability of each transition from one touchpoint to another, and we create an edge from each source to each destination.

Probability of each edge = number of occurrences of the (source, destination) pair / number of occurrences of the source.

Each transition gets a value as follows:

start → home-popular = 1/3 = 33.3%
start → home = 2/3 = 66.7%
home-popular → pdp-recommendation = 1/1 = 100%
home → search = 2/2 = 100%
search → pdp = 2/2 = 100%
pdp-recommendation → pdp = 1/2 = 50%
pdp-recommendation → purchase = 1/2 = 50%
pdp → pdp-recommendation = 1/3 = 33.3%
pdp → purchase = 2/3 = 66.7%
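The counting rule above can be sketched in base R. The three sessions below are an assumption: they are reconstructed so that they match the transition counts listed above and the example journeys shown earlier, not taken from the original data.

```r
# Three converting sessions (assumed, consistent with the counts above).
sessions <- c(
  "home > search > pdp",
  "home > search > pdp > pdp-recommendation",
  "home-popular > pdp-recommendation > pdp"
)

# Wrap every session in explicit start and purchase states.
paths <- lapply(strsplit(sessions, " > ", fixed = TRUE),
                function(p) c("start", p, "purchase"))

# One row per observed transition (source, destination).
pairs <- do.call(rbind, lapply(paths, function(p) {
  data.frame(from = head(p, -1), to = tail(p, -1), stringsAsFactors = FALSE)
}))

# Count each pair, then divide each row by the occurrences of its source.
counts <- table(pairs$from, pairs$to)
trans_prob <- prop.table(counts, margin = 1)

print(round(trans_prob, 3))
```

Rows of `trans_prob` are sources and columns are destinations, so each row sums to 1.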

After that, we estimate the credit of every touchpoint using the Removal Effect approach. The basic idea is to measure how many conversions would be lost if a particular touchpoint were removed. For example, if we remove pdp from the graph, only one path from start to purchase remains:

start → home-popular → pdp-recommendation → purchase

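Then, to calculate the removal effect, we multiply the probabilities along the remaining path and divide the result by the overall probability of conversion. The removal effect is the share of conversions lost, that is, one minus this ratio. Every session in this example ends in a purchase, so the overall probability of conversion is 1:

Removal Effect (pdp) = 1 - (1/3 × 1 × 1/2) / 1 = 5/6 ≈ 83.3%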

After calculating the removal effect for each touchpoint, we compute the attribution value by normalizing: each removal effect is divided by the sum of all removal effects.
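To make the two steps concrete, here is a rough base R sketch that computes the removal effects exactly from the transition probabilities of the example (including search → pdp = 100%, which the example journeys imply). The helper `conversion_prob` is hypothetical, and it assumes that traffic entering a removed touchpoint is simply lost. Because it solves the chain exactly rather than simulating it, its numbers may differ slightly from what a simulation-based package reports.

```r
states <- c("start", "home", "home-popular", "search", "pdp",
            "pdp-recommendation", "purchase")
P <- matrix(0, length(states), length(states),
            dimnames = list(states, states))
P["start", "home-popular"]              <- 1/3
P["start", "home"]                      <- 2/3
P["home-popular", "pdp-recommendation"] <- 1
P["home", "search"]                     <- 1
P["search", "pdp"]                      <- 1
P["pdp-recommendation", "pdp"]          <- 1/2
P["pdp-recommendation", "purchase"]     <- 1/2
P["pdp", "pdp-recommendation"]          <- 1/3
P["pdp", "purchase"]                    <- 2/3

# Probability of eventually reaching "purchase" from "start".
# If `removed` is given, anything flowing into that touchpoint is lost.
conversion_prob <- function(P, removed = NULL) {
  transient <- setdiff(rownames(P), c("purchase", removed))
  Q <- P[transient, transient, drop = FALSE]  # moves between remaining pages
  r <- P[transient, "purchase"]               # direct moves to purchase
  reach <- solve(diag(length(transient)) - Q, r)
  unname(reach[match("start", transient)])
}

base <- conversion_prob(P)  # overall conversion probability
touchpoints <- setdiff(states, c("start", "purchase"))

# Removal effect: share of conversions lost without the touchpoint.
removal_effect <- sapply(touchpoints,
                         function(tp) 1 - conversion_prob(P, tp) / base)

# Attribution: removal effects normalized to sum to 1.
attribution <- removal_effect / sum(removal_effect)

print(round(removal_effect, 3))
print(round(attribution, 3))
```

For pdp this gives a removal effect of 5/6, matching the single remaining path start → home-popular → pdp-recommendation → purchase with probability 1/3 × 1 × 1/2.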

Let’s build the Markov attribution in R

Luckily, R has a package for Markov attribution called ChannelAttribution, provided by Davide Altomare and David Loris.

It’s effortless: we only need to supply the data with > as the separator between touchpoints.

Load (or install) the packages.
library(ChannelAttribution)
library(dplyr)
Use the markov_model function to build the attribution model.

Use model_markov$result to get the credit of each touchpoint.

Normalize total_conversions by dividing it by the sum of total conversions across all touchpoints.
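A minimal sketch of these three steps could look like the following. The `sessions` data frame and its column names `path` and `conversions` are assumptions made for illustration, with three sessions chosen to match the example above. `out_more = TRUE` is used so that `markov_model` returns a list containing `result`.

```r
library(ChannelAttribution)

# Assumed input: one row per converting session, touchpoints separated by ">".
sessions <- data.frame(
  path = c("home > search > pdp",
           "home > search > pdp > pdp-recommendation",
           "home-popular > pdp-recommendation > pdp"),
  conversions = c(1, 1, 1),
  stringsAsFactors = FALSE
)

# Build the first-order Markov attribution model.
model_markov <- markov_model(sessions,
                             var_path = "path",
                             var_conv = "conversions",
                             out_more = TRUE)

# Credit per touchpoint: columns channel_name and total_conversions.
result <- model_markov$result

# Normalize so that the attribution values sum to 1.
result$attribution <- result$total_conversions / sum(result$total_conversions)

print(result)
```

The package estimates the model by simulation, so the exact figures can vary a little between runs and settings.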

How to Interpret The Result

The total conversions can be read as each path’s contribution to purchases, expressed as a ratio. We can say that the contribution of search is twice that of home-popular. Also, if we lose our pdp-recommendation, we could lose 0.44 out of 3 purchases. This is particularly useful when a page malfunctions and we want to measure the impact quickly in order to assign the right emergency level.

This ratio can also be used to measure indirect metrics (GMV, revenue, and sales) for each touchpoint. For example, if we have a daily GMV of $1 billion and the search page has an attribution of 0.3 out of 1, we can say that the search page is worth $300 million daily.
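The arithmetic is a single multiplication. In the sketch below, `other_touchpoints` is a hypothetical bucket that simply holds the remaining 0.7 of the attribution.

```r
daily_gmv <- 1e9  # daily GMV of $1 billion, as in the example

# Normalized attribution shares; they must sum to 1.
attribution <- c(search = 0.3, other_touchpoints = 0.7)

# GMV credited to each touchpoint.
gmv_by_touchpoint <- daily_gmv * attribution

print(gmv_by_touchpoint)
```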

It turns out that...

We should indeed consider indirect metrics for certain pages, because some users visit them without buying directly from them, yet get inspiration there to buy other products. For example, it would be unfair to measure the recommender system’s performance by its direct purchases alone, because users usually browse many product detail pages and recommendations to compare alternatives. After all, the most important thing is that users eventually find the product they want.