# Markov Chains with Python

*Learn about Markov Chains and how to implement them in Python through a basic example of a discrete-time Markov process in this guest post by Ankur Ankan, the coauthor of **Hands-On Markov Models with Python**.*

# What is a Markov Chain?

A Markov chain is a type of Markov process in which time is discrete. There is, however, a lot of disagreement among researchers about which categories of Markov process should be called a **Markov chain**. Most commonly, though, the term refers to discrete-state-space Markov processes. So, a Markov chain is a stochastic process over a discrete state space satisfying the Markov property.

More formally, a discrete-time Markov chain is a sequence of random variables **X1**, **X2**, **X3**, … that satisfy the Markov property — the probability of moving from the current state to the next state depends solely on the present state.

In terms of probability distributions: given that the system is at time instance **n**, the conditional distribution of the states at the next time instance, **n + 1**, is conditionally independent of the states of the system at time instances **{1, 2, …, n − 1}**.

This can be written as follows:

**P(X_{n+1} = x | X_1 = x_1, X_2 = x_2, …, X_n = x_n) = P(X_{n+1} = x | X_n = x_n)**

# Markov Chain Graph Representation

Markov chains are often represented using directed graphs. The nodes in the directed graphs represent the different possible states of the random variables, while the edges represent the probability of the system going from one state to the other in the next time instance.

As a simple example, take a look at predicting the weather to understand this representation better. Consider that there are three possible states of the random variable **Weather = {Sunny, Rainy, Snowy}**, and the possible Markov chains for this can be represented as shown in **Figure 1.1**:

One of the main points to understand in Markov chains is that you’re modeling the outcomes of a sequence of random variables over time. The nodes of the above graph represent the different possible states of **Weather**, and the edges between them show the probability of the next random variable taking different possible states, given the state of the current random variable. The self-loops show the probability of the model staying in its current state.

In the above Markov chain, consider that the observed state of the current random variable is **Sunny**. Then, the probability that the random variable at the next time instance will also take the value **Sunny** is **0.8**. It could also take the value **Rainy** with a probability of **0.19**, or **Snowy** with a probability of **0.01**. One thing to note here is that the sum of the probability values on all the outward edges from any state should equal 1, since these transitions form an exhaustive set of events.
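This normalization is easy to check in code. Here is a minimal sketch, using the probability values of the weather example, that verifies every state's outgoing probabilities sum to 1:

```python
import math

# Transition probabilities of the weather example (values as given above).
transition_prob = {'Sunny': {'Sunny': 0.8, 'Rainy': 0.19, 'Snowy': 0.01},
                   'Rainy': {'Sunny': 0.2, 'Rainy': 0.7, 'Snowy': 0.1},
                   'Snowy': {'Sunny': 0.1, 'Rainy': 0.2, 'Snowy': 0.7}}

# The outgoing edges from every state form an exhaustive set of events,
# so each inner dict must sum to 1 (up to floating-point tolerance).
for state, transitions in transition_prob.items():
    total = sum(transitions.values())
    assert math.isclose(total, 1.0), f"{state}: probabilities sum to {total}"
```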

# Coding the Markov Chain Example

It’s time now to try coding this simple Markov chain. Start by defining a simple `MarkovChain` class:

```python
import numpy as np


class MarkovChain(object):
    def __init__(self, transition_prob):
        """
        Initialize the MarkovChain instance.

        Parameters
        ----------
        transition_prob: dict
            A dict object representing the transition
            probabilities in Markov Chain.
            Should be of the form:
                {'state1': {'state1': 0.1, 'state2': 0.4},
                 'state2': {...}}
        """
        self.transition_prob = transition_prob
        self.states = list(transition_prob.keys())

    def next_state(self, current_state):
        """
        Returns the state of the random variable at the next time
        instance.

        Parameters
        ----------
        current_state: str
            The current state of the system.
        """
        return np.random.choice(
            self.states,
            p=[self.transition_prob[current_state][next_state]
               for next_state in self.states]
        )

    def generate_states(self, current_state, no=10):
        """
        Generates the next states of the system.

        Parameters
        ----------
        current_state: str
            The state of the current random variable.

        no: int
            The number of future states to generate.
        """
        future_states = []
        for i in range(no):
            next_state = self.next_state(current_state)
            future_states.append(next_state)
            current_state = next_state
        return future_states
```

Now, try out the **Weather** example with this `MarkovChain` class:

```python
>>> transition_prob = {'Sunny': {'Sunny': 0.8, 'Rainy': 0.19,
                                 'Snowy': 0.01},
                       'Rainy': {'Sunny': 0.2, 'Rainy': 0.7,
                                 'Snowy': 0.1},
                       'Snowy': {'Sunny': 0.1, 'Rainy': 0.2,
                                 'Snowy': 0.7}}
>>> weather_chain = MarkovChain(transition_prob=transition_prob)
>>> weather_chain.next_state(current_state='Sunny')
'Sunny'
>>> weather_chain.next_state(current_state='Snowy')
'Snowy'
>>> weather_chain.generate_states(current_state='Snowy', no=10)
['Snowy', 'Snowy', 'Snowy', 'Rainy', 'Snowy', 'Snowy', 'Rainy',
 'Rainy', 'Snowy', 'Snowy']
```

# Parameterization of Markov chains

The code for the Markov chain in the previous section uses a dictionary to parameterize the Markov chain that had the probability values of all the possible state transitions. Another way of representing state transitions is using a **transition matrix**. The transition matrix, as the name suggests, uses a tabular representation for the transition probabilities.

The following table shows the transition matrix for the Markov chain shown in **Figure 1.1**. The probability values represent the probability of the system going from the state in the row to the states mentioned in the columns:

|           | Sunny | Rainy | Snowy |
|-----------|-------|-------|-------|
| **Sunny** | 0.8   | 0.19  | 0.01  |
| **Rainy** | 0.2   | 0.7   | 0.1   |
| **Snowy** | 0.1   | 0.2   | 0.7   |

The transition matrix represents the same information as in the dictionary, but in a more compact way. For this reason, the transition matrix is the standard way of representing Markov chains.
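To see this correspondence concretely, here is a minimal sketch (assuming the dictionary parameterization from the previous section) that builds the transition matrix from the dictionary:

```python
import numpy as np

# Dictionary parameterization from the previous section.
transition_prob = {'Sunny': {'Sunny': 0.8, 'Rainy': 0.19, 'Snowy': 0.01},
                   'Rainy': {'Sunny': 0.2, 'Rainy': 0.7, 'Snowy': 0.1},
                   'Snowy': {'Sunny': 0.1, 'Rainy': 0.2, 'Snowy': 0.7}}

states = list(transition_prob.keys())
# Row i holds the outgoing probabilities of states[i], columns in the
# same order as `states`.
transition_matrix = np.array([[transition_prob[row][col] for col in states]
                              for row in states])
```

The order of `states` fixes both the row and column indexing, which is why the class below has to be told the state order alongside the matrix.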

The `MarkovChain` class can be modified as follows so that it can accept a transition matrix:

```python
import numpy as np


class MarkovChain(object):
    def __init__(self, transition_matrix, states):
        """
        Initialize the MarkovChain instance.

        Parameters
        ----------
        transition_matrix: 2-D array
            A 2-D array representing the probabilities of change of
            state in the Markov Chain.

        states: 1-D array
            An array representing the states of the Markov Chain. It
            needs to be in the same order as transition_matrix.
        """
        self.transition_matrix = np.atleast_2d(transition_matrix)
        self.states = states
        self.index_dict = {self.states[index]: index for index in
                           range(len(self.states))}
        self.state_dict = {index: self.states[index] for index in
                           range(len(self.states))}

    def next_state(self, current_state):
        """
        Returns the state of the random variable at the next time
        instance.

        Parameters
        ----------
        current_state: str
            The current state of the system.
        """
        return np.random.choice(
            self.states,
            p=self.transition_matrix[self.index_dict[current_state], :]
        )

    def generate_states(self, current_state, no=10):
        """
        Generates the next states of the system.

        Parameters
        ----------
        current_state: str
            The state of the current random variable.

        no: int
            The number of future states to generate.
        """
        future_states = []
        for i in range(no):
            next_state = self.next_state(current_state)
            future_states.append(next_state)
            current_state = next_state
        return future_states
```

Running this code should also give similar results to what you got in the previous section.

Using a transition matrix might not seem like a good idea because it requires you to create extra variables to store the indices. However, in cases with hundreds of states, using a transition matrix is much more efficient than using the simple dictionary implementation.

In the case of a transition matrix, you can simply use NumPy indexing to get the probability values in the `next_state` method, whereas in the previous implementation you were looping over all the state names:

```python
>>> transition_matrix = [[0.8, 0.19, 0.01],
                         [0.2, 0.7, 0.1],
                         [0.1, 0.2, 0.7]]
>>> weather_chain = MarkovChain(transition_matrix=transition_matrix,
                                states=['Sunny', 'Rainy', 'Snowy'])
>>> weather_chain.next_state(current_state='Sunny')
'Sunny'
>>> weather_chain.next_state(current_state='Snowy')
'Sunny'
>>> weather_chain.generate_states(current_state='Snowy', no=10)
['Snowy', 'Rainy', 'Rainy', 'Rainy', 'Rainy', 'Rainy',
 'Rainy', 'Rainy', 'Sunny', 'Sunny']
```
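The contrast between the two lookups can be sketched directly. Here is a minimal illustration, reusing the example's numbers, showing that the list comprehension over state names and the NumPy row lookup produce the same probability vector:

```python
import numpy as np

states = ['Sunny', 'Rainy', 'Snowy']
transition_matrix = np.array([[0.8, 0.19, 0.01],
                              [0.2, 0.7, 0.1],
                              [0.1, 0.2, 0.7]])
index_dict = {state: index for index, state in enumerate(states)}

transition_prob = {'Sunny': {'Sunny': 0.8, 'Rainy': 0.19, 'Snowy': 0.01},
                   'Rainy': {'Sunny': 0.2, 'Rainy': 0.7, 'Snowy': 0.1},
                   'Snowy': {'Sunny': 0.1, 'Rainy': 0.2, 'Snowy': 0.7}}

# Dictionary version: a Python-level loop over all state names.
p_dict = [transition_prob['Sunny'][s] for s in states]

# Matrix version: a single NumPy row lookup.
p_matrix = transition_matrix[index_dict['Sunny'], :]

assert np.allclose(p_dict, p_matrix)
```

For a handful of states the difference is negligible, but the row lookup avoids the per-state Python dictionary accesses, which is where the savings come from as the state space grows.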

Markov chains are important mathematical tools that simplify the prediction of stochastic processes by treating the future as independent of the past, given the present state of the process.

*Hope you found this article interesting. If you want to learn more about Hidden Markov Models and leveraging Python to implement them, you can explore **Hands-On Markov Models with Python**. Replete with deep theoretical insights and numerous practical implementations, the book is a comprehensive guide to help you implement probabilistic models for learning complex data sequences using the Python ecosystem.*