Markov Chains

Lakshmi Prakash
Design and Development
4 min read · Dec 13, 2022

Markov chains, also called Markov processes, are an important topic in the fields of data science and machine learning. The concept is applicable in a wide range of areas, including reinforcement learning.

Prerequisites to Understand Markov Chains:

To understand this part of statistics, you would need to be familiar with probability and matrices. If you do not have working knowledge of either of these areas, I suggest that you take some time to learn the basics before you get into Markov processes. And to understand the applications of Markov chains, you need a solid understanding of the Markov property.

The Markov Property:

To explain the Markov property in simple words: the next state of a process or a chain of events depends only on its current state, and not on any earlier events in its history.

When the Markov property is satisfied in a chain of events, you can call it a “Markov chain” or a “Markov process”.

Markov Property and Markov Chain

A bit of history about the Markov property: this part is not necessary for understanding how the concept works mathematically, but I’m adding it here for those of you who are curious to learn.

What Markov wanted to prove was that in the evolution of some processes, knowing just the present is enough to predict the future, and knowledge of the past is unnecessary; this is often referred to as “memorylessness”. He showed that for such processes, now called Markov processes, the predictions would be the same as those you would obtain if you knew the entire history of the chain as well. This was in response to another mathematician who argued that independence is a requirement that must be fulfilled for the weak law of large numbers to hold.

“States” in Markov Chains: As the term implies, a “state” is the condition a process is in. A process can take different directions or paths and go from one state to another, or it can remain in the same state as it continues to run. You could think of each state as a set of defining features: when those features hold, you understand that the process is in that state. Let’s not overcomplicate this. Take, for example, “sunny day” and “rainy day”; each of these is a “state”. Similarly, take the 7 days of the week, and here we have 7 states, but they strictly follow a cycle or an order, which is also possible.

Or take, for example, the relationship status of a person in your neighbourhood. They could either be “single” or “in a relationship” or “married” or “divorced” or “separated” or “widowed”. If you notice, from the state “married”, one can go to 3 possible states. And one could continue being single forever or move to a different state.

  • But what you must bear in mind is that since we are talking about probabilities, the probabilities of transitioning from one state to each of the possible next states should always add up to 1.
  • And all these states are mutually exclusive; that is, a process can’t be in two states at once.
  • “Probabilities” in this context refer to the different probabilities associated with the different transition possibilities at each state.
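The rules above can be made concrete with a small sketch in Python. The weather states and probability values here are illustrative assumptions (borrowing the “sunny day”/“rainy day” example from earlier), not real data:

```python
import random

# Transition probabilities for a hypothetical two-state weather chain.
# Each inner dict maps a next state to its probability.
transitions = {
    "sunny": {"sunny": 0.8, "rainy": 0.2},
    "rainy": {"sunny": 0.4, "rainy": 0.6},
}

# The outgoing probabilities from every state must add up to 1.
for state, probs in transitions.items():
    assert abs(sum(probs.values()) - 1.0) < 1e-9, state

def step(state):
    """Pick the next state using only the current state's probabilities."""
    next_states = list(transitions[state])
    weights = [transitions[state][s] for s in next_states]
    return random.choices(next_states, weights=weights)[0]

# Simulate a short chain: each step depends only on the current state,
# which is exactly the Markov property.
state = "sunny"
path = [state]
for _ in range(5):
    state = step(state)
    path.append(state)
print(path)
```

Note that the process is always in exactly one state at a time, and the next state is drawn only from the current state’s row of probabilities.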

Transition Matrix:

The different probability values are listed out according to the current states and state transitions in what’s called a transition matrix.

Let us consider this example. There is a fictional story which comes in two parts. There are 2 books published (B1 and B2) and 2 movies made (M1 and M2) based on the two parts of the story. Now, let us say that the starting state is reading B1, and the probabilities of the user’s next states are as follows: B1->M1 = 0.4, B1->B2 = 0.4, B1->M2 = 0.1, and finally, B1->B1 = 0.1.

This chain can be expressed as a sketch in the following way:

Transition Probabilities in a Markov Chain

Similarly, you can add your own probability values for different states being the initial states. And when you put all these together in the form of a matrix, it would look similar to this:

Transition Matrix — an example

Note that all the probabilities in each row add up to 1. Similarly, on the diagram, too, the probabilities on all outgoing arrows from any state would have to add up to 1.
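Here is a sketch of that matrix in Python. Only the B1 row comes from the example above; the other three rows are made-up placeholder values, chosen only so that each row sums to 1:

```python
import numpy as np

# States, in order: B1, B2, M1, M2
states = ["B1", "B2", "M1", "M2"]

# Row i holds the probabilities of moving from states[i] to each state.
# Only the B1 row is given in the text; the rest are illustrative.
P = np.array([
    [0.1, 0.4, 0.4, 0.1],  # from B1 (given: B1->B1=0.1, B1->B2=0.4, B1->M1=0.4, B1->M2=0.1)
    [0.2, 0.1, 0.3, 0.4],  # from B2 (assumed)
    [0.1, 0.5, 0.1, 0.3],  # from M1 (assumed)
    [0.3, 0.3, 0.2, 0.2],  # from M2 (assumed)
])

# Every row of a transition matrix must sum to 1.
print(P.sum(axis=1))
```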

The Purpose of Markov Chains:

Okay, fine, so we have a Markov chain, but what’s the point of learning all this? Well, if a process is a Markov process, you can calculate the probability of the process being in a certain state at a given time.

