Introduction to Markov Chains: An Intersection of Probability and Linear Algebra

Sruthi Subramanian
Stamatics, IIT Kanpur
Jul 13, 2024

What will the weather be tomorrow?

Whether or not it rains is a concern many of us have in everyday life. A heavy rain can change how the rest of our day goes. Weather prediction, and knowing how likely it is to rain, is therefore something we are all interested in, and often depend on to plan our schedules! Let us now dive into a simple probability problem that revolves around this.

If you knew the probabilities of the weather being Rainy, Nice or Snowy tomorrow (one day later), given the weather today (either Rainy, Nice or Snowy), how would you go about finding the probability of it being Rainy the day after tomorrow, given that it is Nice today (or if it is Snowy or Rainy today)?

Let our example’s probabilities be as follows:

  • The probability that it is Rainy tomorrow, given that it is Rainy today = 0.5
  • The probability that it is Nice tomorrow, given that it is Rainy today = 0.25
  • The probability that it is Snowy tomorrow, given that it is Rainy today = 0.25
  • The probability that it is Rainy tomorrow, given that it is Nice today = 0.5
  • The probability that it is Nice tomorrow, given that it is Nice today = 0
  • The probability that it is Snowy tomorrow, given that it is Nice today = 0.5
  • The probability that it is Rainy tomorrow, given that it is Snowy today = 0.25
  • The probability that it is Nice tomorrow, given that it is Snowy today = 0.25
  • The probability that it is Snowy tomorrow, given that it is Snowy today = 0.5

Given today’s weather (say Rainy), observe that the sum of the probabilities of it being Rainy, Nice and Snowy tomorrow is 1 (0.5+0.25+0.25). Why must this be the case?

(Note that we assume these probabilities don’t change from day to day, and remain the same for any two consecutive days.)

Now the next question that arises is: given today’s weather, what are the probabilities for it to be Rainy/Nice/Snowy two days later? Or three days later? Or, to generalize this problem, how about n days later?

First let us look at the specific problem: What is the probability that it is Rainy two days later, given that it is Nice today?

Using the basic concepts of probability, we can approach this problem as follows.

There are three possible cases that arise here, one for each possible weather tomorrow:

  • It is Rainy tomorrow and Rainy two days later, given that it is Nice today

For which the probability is

P(Rainy tomorrow | Nice today) * P(Rainy day after tomorrow | Rainy tomorrow)

= P(Rainy tomorrow | Nice today) * P(Rainy tomorrow | Rainy today), using our assumption that the probabilities are the same for any two consecutive days

= 0.5 * 0.5 = 0.25

  • It is Nice tomorrow and Rainy two days later, given that it is Nice today

For which the probability is

P(Nice tomorrow | Nice today) * P(Rainy day after tomorrow | Nice tomorrow)

= 0 * 0.5 = 0

  • It is Snowy tomorrow and Rainy two days later, given that it is Nice today

For which the probability is

P(Snowy tomorrow | Nice today) * P(Rainy day after tomorrow | Snowy tomorrow)

= 0.5 * 0.25 = 0.125

Hence the probability that it is Rainy the day after tomorrow, given that it is Nice today, is the sum of these three probabilities: 0.25 + 0 + 0.125 = 0.375.
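If you would like to check this case analysis with a few lines of code, here is a minimal Python sketch (the nested-dictionary encoding and the state labels R, N and S are just one convenient choice, not part of the original problem statement):

```python
# Transition probabilities from the list above:
# trans[today][tomorrow] = P(tomorrow's weather | today's weather)
trans = {
    "R": {"R": 0.5,  "N": 0.25, "S": 0.25},
    "N": {"R": 0.5,  "N": 0.0,  "S": 0.5},
    "S": {"R": 0.25, "N": 0.25, "S": 0.5},
}

# Sum over the three possible weathers for the intermediate day (tomorrow):
# P(Rainy two days later | Nice today)
prob = sum(trans["N"][w] * trans[w]["R"] for w in ("R", "N", "S"))
print(prob)  # 0.375
```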

Now this seems like a fairly good method for finding the probability two days ahead. What if we want to find the probability for 4 days later? Taking cases would become a very tedious and time-consuming process, and could also be very confusing. Do observe that while this is likely to be taxing for a human, this is the kind of problem that a computer would do very well!

Using a matrix approach to solve this problem allows us to keep track of all the probability values and take care of all these cases in a much more organized manner. This is where the idea of Markov Chains comes in!

How can we solve this problem with matrices?

Let us try to represent the given probabilities in matrix form. Let the (ij)th element of the matrix represent the probability that the weather is in state s_j tomorrow, given that it is in state s_i today.

Note that the states, in this case, are the possible weathers:

s_1 = R (Rainy), s_2 = N (Nice), s_3 = S (Snowy).

The row of an element represents the weather today, and the column represents the weather tomorrow (one day later) for which we are looking at the probability.

Constructing such a matrix, we get:

        R     N     S
  R   0.50  0.25  0.25
  N   0.50  0.00  0.50
  S   0.25  0.25  0.50

Let this matrix be P. If we take P^2, we have:

         R       N       S
  R   0.4375  0.1875  0.3750
  N   0.3750  0.2500  0.3750
  S   0.3750  0.1875  0.4375

Remember that our problem was to find the probability that it is Rainy two days later, given that it is Nice today.

Observe that the element in the second row (representing that it is Nice today) and the first column (representing that it is Rainy two days later) is 0.375, which is the same value that we calculated previously by taking cases.

Where does this value come from?

  • From matrix multiplication, we know that we calculate this element of P^2 by taking the inner product of the second row of P with the first column of P, which is:

(0.5 * 0.5 + 0 * 0.5 + 0.5 * 0.25) = 0.375

  • See that the first term represents the probability for the case where the intermediate day is Rainy, while the next two terms do the same for Nice and Snowy.
  • Writing this out and taking all the possible cases, we can show that, in general, the (ij)th element of P^2 represents the probability that the weather two days later (the day after tomorrow) is s_j, given that it is s_i today. The short sketch after this list checks this numerically.
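Here is a small numpy sketch of the matrix approach (the array layout mirrors the table above; the state order R, N, S is our own convention):

```python
import numpy as np

# Transition matrix P: row = today's weather, column = tomorrow's.
# State order: R (Rainy), N (Nice), S (Snowy).
P = np.array([
    [0.50, 0.25, 0.25],  # today Rainy
    [0.50, 0.00, 0.50],  # today Nice
    [0.25, 0.25, 0.50],  # today Snowy
])

P2 = P @ P
# Second row (Nice today), first column (Rainy two days later):
print(P2[1, 0])       # 0.375, matching the case-by-case calculation
print(P.sum(axis=1))  # each row sums to 1, as observed earlier
```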

What is a Markov Chain?

A process like this, which moves between a set of states in such a way that the probability of tomorrow’s state depends only on today’s state, is called a Markov Chain. The matrix P is called the Transition matrix. In a general Markov Chain, the states can be anything, and we can have any number of states. (Do observe that as the number of states increases, the computation for finding the powers of the matrix P becomes more complicated.)

Let our Markov Chain have k states. Then the transition matrix will be a (k x k) matrix, where the (ij)th element represents the probability of the state s_j occurring tomorrow (one day later), given that s_i is today’s state. If the transition matrix is P, then the (ij)th element of P^n represents the same probability for n days later.
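As a quick illustration (reusing the same P as before), the “4 days later” question from earlier now takes one line instead of dozens of cases:

```python
import numpy as np

P = np.array([
    [0.50, 0.25, 0.25],
    [0.50, 0.00, 0.50],
    [0.25, 0.25, 0.50],
])

# P(Rainy 4 days later | Nice today) is the (Nice, Rainy) entry of P^4:
P4 = np.linalg.matrix_power(P, 4)
print(P4[1, 0])  # 0.3984375
```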

There are various types of Markov Chains, including Absorbing Markov Chains and Regular Markov Chains, and they have various interesting properties. Those are some interesting things to look forward to in the upcoming articles! Do stay tuned :)

This series was inspired by Prof. Jaideep Mulherkar during the course ‘Calculus 3 — Linear Algebra’, MITES Summer 2024.

Reference:

Grinstead, C. M. and Snell, J. L., Introduction to Probability: prob.pdf (dartmouth.edu)
