Cartoon of the network graph representing customer buying journeys.
Image by Espen Klem

Predictive Customer Journey

Tatiana, Teradata
8 min read · Apr 1, 2024

Ok, you have data on your customers, such as how much money they spend with you and how often they visit, and you may have a few predictive models running.

Great! But do you include their behavioural features in these models? If the answer is “no” or “some of them” then this blog is for you. Let me show how to estimate the likelihood of a customer doing something from their behaviour.

A little while ago, Monica Woolmer, one of Teradata’s most senior Business Consultants, wrote a blog about the Power of Path Analysis. In it, she described the potential business applications of the path and the underlying Vantage functionality that enables the analysis.

What stood out to me after reading her article was how naturally the output of nPath leads into Markov Chain analysis. I was immediately reminded of how, at one of our conferences, a colleague of mine presented an evolution of nPath to show its great predictive power. I was struck by its simplicity and elegance, and that’s why I wanted to share this flash of insight with a wider audience.

Another reason I wanted to write this blog is that many data scientists don’t see relational databases as a viable tool for Data Science. I think this is a great loss to the community. I am a true believer in minimising the number of technologies used for analytics. This is because the more analytical tools you use, the more technical debt you accumulate and the more data sprawls across those tools. Frankly, if I can accomplish what I need to do in the place where data is managed, then this is “gold” to me.

What is a Markov Chain?

To quote Wikipedia: “A Markov chain, or Markov process, is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.” [1]

Well, what exactly does that mean? In business terms, a Markov Chain predicts the next step a customer may take.

If you are a visual person and want to understand the algorithm, I find Victor Powell’s explanation [2] simple to follow. In my words, a Markov Chain is a mathematical model that describes a chronological journey from one state or event to another. Think of it as a more complicated version of an old-fashioned hopscotch game, where players hop from one square to another. The probability of a player hopping to the next square depends on the square the player currently occupies.

If you prefer a mathematical description: for any positive integer n and any possible states e_1, e_2, …, e_n, the following statement holds:

Mathematical equation to describe a probability of a transition.
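
The equation image is not reproduced in this text. As a sketch, the standard way of writing the Markov property is shown below. The symbol X_k for the state at step k is my own notation (an assumption, not taken from the original figure), and the statement is meaningful for n ≥ 2:

```latex
P(X_n = e_n \mid X_{n-1} = e_{n-1}, X_{n-2} = e_{n-2}, \ldots, X_1 = e_1)
  = P(X_n = e_n \mid X_{n-1} = e_{n-1})
```

In words: once you know the most recent state, the earlier history adds nothing to the prediction.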

The main building block of the Markov Chain is the transitional probability. Assuming you have several options for transition from your current state to the next one, ask yourself what is the probability that you are going to pick a particular transition.

How do we find the transitional probability from one state to another? Well, that depends on the problem that you are trying to solve. For example, if you roll a six-sided die, the transitional probability from one roll to any particular face on the next roll is 1/6. That means there’s a one-in-six chance that the die will land on a particular number.

Let’s return to our simple hopscotch game, where squares follow one after another. Then the probability that you can hop to the next square might be 98%, accounting for the slight chance of making a mistake like falling.

The transitional probabilities between each state are recorded in a transition matrix.

For example, if you are rolling your six-sided die again, then your matrix is a 6x6 grid with a 1/6 entry in every cell. If you are still playing our simple hopscotch game, then each row has a 0.98 entry in the column for the next consecutive square, and the remaining 0.02 represents falling instead of landing on that square.
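
To make the two matrices concrete, here is a minimal Python sketch. The three-square hopscotch layout and the explicit "Fall" state are assumptions I made for illustration; they are not part of the original example. The helper simply checks that every row of a transition matrix sums to 1.

```python
from fractions import Fraction

# Fair six-sided die: whatever the last roll was, each face is equally
# likely next, so every cell of the 6x6 matrix is 1/6.
die = [[Fraction(1, 6) for _ in range(6)] for _ in range(6)]

# Toy hopscotch (hypothetical layout): three squares plus a "Fall" state.
# From a square you reach the next one with 0.98 and fall with 0.02.
states = ["S1", "S2", "S3", "Fall"]
hopscotch = [
    [0.0, 0.98, 0.0, 0.02],  # from S1
    [0.0, 0.0, 0.98, 0.02],  # from S2
    [0.0, 0.0, 1.0, 0.0],    # S3 is the last square, so we stay there
    [0.0, 0.0, 0.0, 1.0],    # falling ends the game
]


def rows_sum_to_one(matrix, tol=1e-9):
    """A valid transition matrix has rows that each sum to 1."""
    return all(abs(float(sum(row)) - 1.0) < tol for row in matrix)


print(rows_sum_to_one(die), rows_sum_to_one(hopscotch))
```

Row i describes where you can go from state i, which is why the rows, not the columns, must sum to 1.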

To sum things up in an elegant mathematical definition, we define the (i, j) element of the transition matrix P as follows:

Mathematical equation describing an entry in the transitional probability matrix.
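
The equation image is not reproduced in this text. A standard way to write the entry, using my own notation X_n for the state at step n (an assumption, not taken from the original figure), is:

```latex
P_{ij} = P(X_n = j \mid X_{n-1} = i), \qquad \sum_{j} P_{ij} = 1 \ \text{for every } i
```

The second condition says that each row of P is a probability distribution over the possible next states.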

This means that the probability of being in state j at the next step depends only on the state i you are in now.

What is nPath?

nPath is a function that is specific to Teradata [5]. I call it a “regular expression on a time series.” The simplest application of nPath is to sequence the events in a customer journey that lead up to an outcome. That leads us to another question. What is the outcome? Well, it could be anything, such as death, divorce, or moving. You can define a regular expression to find any pattern of customer behaviour.

By this point, you might be wondering how nPath works exactly. Let’s look at a simple example.

Network graph representing how many customers are transitioning from one webpage to another.

Imagine that we have 5 customers buying from a very simple website with only 6 pages (Start, C1, C2, C3, C4, Conversion). The desired outcome, in our case, is the Conversion event. The customers’ complete journeys are shown in the figure above. We want to examine what most of our customers do before the Conversion event.
In the database, the following information is stored in the table shown below:

Table that describes what the data looks like. There are 3 columns —  customer id, date_tb and event.

What we need to do is accumulate all the events into a row representing the customer’s journey:

Table that shows the output of the nPath function. There are 3 columns: cust_id, last event, and the page path column. The last column is the sequence of events in order.
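
To illustrate the reshaping described above (one row per event turned into one row per customer), here is a minimal Python sketch. It is not nPath and not the Teradata code; the rows are made-up values I invented for illustration, and only the column names cust_id, date_tb and event come from the tables above.

```python
from collections import defaultdict
from datetime import date

# Hypothetical rows (cust_id, date_tb, event); not the data from the article.
rows = [
    (1, date(2024, 1, 2), "C1"),
    (1, date(2024, 1, 1), "Start"),
    (1, date(2024, 1, 3), "Conversion"),
    (2, date(2024, 1, 1), "Start"),
    (2, date(2024, 1, 2), "C2"),
    (2, date(2024, 1, 3), "C3"),
]


def accumulate_paths(rows):
    """Group events by customer and order them by date."""
    by_customer = defaultdict(list)
    for cust_id, date_tb, event in rows:
        by_customer[cust_id].append((date_tb, event))

    paths = {}
    for cust_id, events in by_customer.items():
        ordered = [event for _, event in sorted(events)]
        paths[cust_id] = {"last_event": ordered[-1], "page_path": ordered}
    return paths


paths = accumulate_paths(rows)
for cust_id, info in sorted(paths.items()):
    print(cust_id, info["last_event"], " -> ".join(info["page_path"]))
```

The grouping step plays the role of partitioning by customer, and the sort plays the role of ordering by date within each partition.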

This is where nPath helps us. In Teradata Vantage, the call to the function looks like this:

Picture showing the example of the nPath code.

As you can see, it is SQL-like but a bit different. Let's dive a little deeper into the syntax:

Image of the nPath code and explanation of the inputs.

Applications for the customer journey

I hope by now you have a pretty good intuition for how nPath and Markov chains can be applied to your customers’ journeys. But we aren’t done yet. Now, let’s examine how we combine a Markov chain and nPath.

We’ll use the same data as in the above nPath example. What do we need to do to create our model?

1. Create the Transition Probabilities Matrix:
Table showing the probabilities of transitions for pairwise events.
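
As a rough sketch of this step, the Python below counts pairwise transitions separately for journeys that reached the outcome and journeys that did not. The journeys are hypothetical, not the article's data, and I am assuming that a transition's probability is its share of all pairwise transitions within its group.

```python
from collections import Counter

# Hypothetical journeys as (path, converted) pairs; not the article's data.
journeys = [
    (["Start", "C1", "C4"], True),
    (["Start", "C1", "C3"], True),
    (["Start", "C2", "C3"], False),
    (["Start", "C1", "C3", "C4"], False),
]


def transition_probabilities(journeys):
    """Share of each pairwise transition within each outcome group."""
    counts = {True: Counter(), False: Counter()}
    for path, converted in journeys:
        # zip(path, path[1:]) yields consecutive pairs such as ("Start", "C1").
        counts[converted].update(zip(path, path[1:]))

    probs = {}
    for outcome, counter in counts.items():
        total = sum(counter.values())
        probs[outcome] = {pair: n / total for pair, n in counter.items()}
    return probs


probs = transition_probabilities(journeys)
for outcome, table in probs.items():
    for pair, p in sorted(table.items()):
        print(outcome, pair, round(p, 3))
```

Notice that a pair missing from one group simply has no entry there, which is exactly the zero-probability problem the next step deals with.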

2. Fill in probabilities if the pair of events does not exist: Laplacian Smoothing

For example, the transition between Start and the C2 page does not exist in the successful journeys, yet it appears in 50% of the journeys that did not end in conversion. Similarly, the transition between C1 and C4 exists only in the journeys that ended in the outcome. This matters because the next step requires the log-odds ratio of each transition:

Formula describing how log odds ratios are calculated.
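
The formula image is not reproduced in this text. One reading that is consistent with the surrounding description (a zero in the numerator gives a log of 0, a zero in the denominator gives a division by 0) is the following, where t is a pairwise transition. The notation is my assumption, not the original figure:

```latex
\text{log-odds ratio}(t) = \log \frac{P(t \mid \text{outcome})}{P(t \mid \text{no outcome})}
```

A positive value means the transition is relatively more common in journeys that reached the outcome, and a negative value means the opposite.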

If the probability of one of the transitions is zero, then we get:

Log-odds formula with 0 in the numerator, which gives a log of 0. Oh no!

or

Log-odds ratio formula with division by 0. Oh no!
Screaming face emoji.
Image by iEmoji

Laplacian smoothing lets us give a bit more weight to transitions that do not exist and take a bit of weight away from the existing transitions. It is calculated by the following formula:

Formula that describes how Laplacian smoothing is calculated.
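
The formula image is not reproduced in this text. The standard add-one form of Laplace smoothing, which I am assuming is what the figure shows, is given below. Here count(t, c) is how often transition t occurs in journey group c, N_c is the total number of transitions in that group, and V is the number of unique transitions across both groups:

```latex
P_{\text{smoothed}}(t \mid c) = \frac{\operatorname{count}(t, c) + 1}{N_c + V}
```

Because every transition gets at least a count of 1, no smoothed probability is ever zero, and the probabilities within each group still sum to 1.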

In our example, the number of unique transitions (V) across both sets of journeys is 7.

Table showing the detailed calculation of Laplacian smoothing for each pairwise transition.
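
Here is a minimal Python sketch of add-one smoothing applied to transition counts. The counts are hypothetical (so V is 6 here, not the 7 from the article's data), and the add-one form is the standard version of the technique, which I am assuming matches the formula above.

```python
# Hypothetical transition counts per journey group; not the article's data.
converted_counts = {("Start", "C1"): 2, ("C1", "C4"): 1, ("C1", "C3"): 1}
not_converted_counts = {
    ("Start", "C2"): 1,
    ("C2", "C3"): 1,
    ("Start", "C1"): 1,
    ("C1", "C3"): 1,
    ("C3", "C4"): 1,
}

# V: every unique transition seen in either group.
vocabulary = set(converted_counts) | set(not_converted_counts)


def laplace_smooth(counts, vocabulary, alpha=1):
    """Add alpha to every count, then renormalise over the vocabulary."""
    total = sum(counts.values())
    v = len(vocabulary)
    return {
        t: (counts.get(t, 0) + alpha) / (total + alpha * v)
        for t in vocabulary
    }


p_converted = laplace_smooth(converted_counts, vocabulary)
p_not_converted = laplace_smooth(not_converted_counts, vocabulary)

for t in sorted(vocabulary):
    print(t, round(p_converted[t], 3), round(p_not_converted[t], 3))
```

Smoothing over the shared vocabulary, rather than over each group's own transitions, is what guarantees that both groups have a non-zero probability for every pair.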

3. Calculate the log-odds ratio:

Table showing the detailed calculation of the log-odds ratio for each pairwise transition.
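
A minimal Python sketch of this step follows. The smoothed probabilities are hypothetical values chosen for illustration, and I am assuming the log-odds ratio is the natural log of the probability in converting journeys divided by the probability in non-converting journeys.

```python
import math

# Hypothetical smoothed probabilities; not the article's data.
p_converted = {
    ("Start", "C1"): 0.3,
    ("C1", "C3"): 0.2,
    ("C3", "C4"): 0.1,
}
p_not_converted = {
    ("Start", "C1"): 0.2,
    ("C1", "C3"): 0.2,
    ("C3", "C4"): 0.2,
}


def log_odds_ratios(p_pos, p_neg):
    """log(p_pos / p_neg) for every transition present in both tables."""
    return {t: math.log(p_pos[t] / p_neg[t]) for t in p_pos if t in p_neg}


log_odds = log_odds_ratios(p_converted, p_not_converted)
for t, value in sorted(log_odds.items()):
    print(t, round(value, 3))
```

The sign carries the meaning: positive for transitions that favour the outcome, zero for neutral ones, negative for transitions that are more typical of journeys that did not convert.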

4. Score New Journeys
Now that we have our model, scoring new journeys becomes a trivial task. Let’s assume we have a new customer whose journey looks like this:

Start → C1 → C3 → C4.

What is the likelihood of the customer reaching the outcome? To calculate this, we add up the log-odds ratios of the journey’s pairwise transitions.

Table showing the calculation for a new journey.

The final sum of the log-odds ratios is -0.42, so this customer is unlikely to reach the outcome. In general, the larger the sum of the log-odds ratios, the more likely a customer is to reach the outcome.
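
The scoring step can be sketched in a few lines of Python. The log-odds values below are hypothetical, so the result is not meant to reproduce the -0.42 above. Treating a transition that the model has never seen as neutral (0.0) is my own assumption.

```python
# Hypothetical log-odds ratios per transition; not the article's values.
log_odds = {
    ("Start", "C1"): 0.2,
    ("C1", "C3"): -0.1,
    ("C3", "C4"): -0.5,
}


def score_journey(path, log_odds, default=0.0):
    """Sum the log-odds ratios of the journey's consecutive transitions.

    Transitions missing from the model contribute `default` (assumed neutral).
    """
    return sum(log_odds.get(pair, default) for pair in zip(path, path[1:]))


new_journey = ["Start", "C1", "C3", "C4"]
score = score_journey(new_journey, log_odds)
print(round(score, 2))
```

Because the score is a plain sum, you can rank many open journeys by it and focus on the ones with the highest values.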

Code: Teradata SQL code can be found on GitHub (here).

Additional resources:

1. Markov Chain, Wikipedia
2. Markov Chains, Setosa
3. Łapczyński M., Białowąs S. (2013) Discovering Patterns of Users’ Behaviour in an E-shop: Comparison of Consumer Buying Behaviours in Poland and Other European Countries, Studia Ekonomiczne, nr 151, La société de l’information: perspective européenne et globale: les usages et les risques d’Internet pour les citoyens et les consommateurs, p. 144–153 (Link)
4. Clickstream Data for Online Shopping (2019). UCI Machine Learning
5. Teradata Path and Pattern Analysis Functions Documentation

Questions and comments
We’d love to hear from you! Please leave any comments, questions, or ideas in the comment section below. Additionally, we encourage you to try the example in this blog post for free using ClearScape Analytics Experience, and to explore the Teradata Developer Portal and Teradata Developer Community.

Author: Dr. Tatiana Bokareva is the Principal Lead Data Scientist for the international region at Teradata. She leads a team of technology experts and leaders in data science. Her mission is to help businesses realize real value from ML and AI projects, given the rapid change and evolution in the analytics landscape. Tatiana has extensive experience helping clients with analytical vision, roadmap, and architecture, all the way through to the delivery of analytical projects. She works with clients ranging from federal and state government agencies to enterprises in FSI, telco, and retail. She helps clients identify analytical opportunities and accelerate their monetization. Her leadership, management, and communication skills have been shaped by over 20 years of in-depth industry experience, working for Fortune 500 companies and hands-on data science expertise. Connect with Tatiana on LinkedIn.
