Hidden Markov Models — Part 2: the Decoding Problem

Maria Burlando
7 min read · Jan 3, 2019


In my previous article, I introduced the concept of the Hidden Markov Model and solved the first problem, the Likelihood problem, with the Forward and Backward algorithms.

If you haven’t read it yet, I advise you to do so, because what I’m going to explain in this article builds on the previous one.

We have initialized Lea’s HMM as such:

And the model’s properties:

Let’s change what Lea has been doing in the past four days. Suppose she’s been shopping, cleaning, biking and painting. Our observation sequence would be:
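If you don’t have Part 1 at hand, here is the model written out in plain JavaScript (a sketch of mine, not the mary-markov API). The numbers are the ones used in the worked calculations of this article; the entries not quoted directly in the text are inferred so that they reproduce those calculations exactly.

```javascript
// Lea's HMM from Part 1, written out as plain objects.
const states = ['Sunny', 'Rainy'];

// Initial state probabilities (pi)
const initial = { Sunny: 0.6, Rainy: 0.4 };

// Transition probabilities: transition[from][to]
const transition = {
  Sunny: { Sunny: 0.8, Rainy: 0.2 },
  Rainy: { Sunny: 0.4, Rainy: 0.6 },
};

// Emission probabilities: emission[state][observable]
const emission = {
  Sunny: { Shop: 0.2, Clean: 0.1, Bike: 0.3, Paint: 0.4 },
  Rainy: { Shop: 0.2, Clean: 0.45, Bike: 0.05, Paint: 0.3 },
};

// Lea's activities over the past four days
const observations = ['Shop', 'Clean', 'Bike', 'Paint'];
```

Each row of the transition and emission tables sums to 1, as any probability distribution must.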

Given this new observation sequence, we’re going to be looking at the second problem of HMMs: the Decoding problem.

Problem 2 — Decoding

The Decoding problem is asking us to find the best sequence of hidden states Q given the observation sequence O.

So, if Lea’s been shopping, cleaning, biking and painting, what was the weather like during these four days?

Q = ???

To find the solution to this problem we’re going to make use of the Viterbi Algorithm.

The Viterbi Algorithm

Like the forward algorithm, Viterbi is a dynamic programming algorithm that makes use of a dynamic programming trellis.

Dynamic programming refers to simplifying a complicated problem by breaking it down into simpler sub-problems in a recursive manner.

This is why the Viterbi algorithm features similar steps to the forward algorithm — Initialization, Recursion and Termination — plus a Backtracking step to recover the sequence of hidden states.

Initialization

Viterbi Initialization Equation:
δ_1(i) = π_i · b_i(O_1)

If you’ve read the previous article attentively, the equation above should look quite familiar: it’s the same as the initialization equation of the Forward algorithm.

Here we multiply the initial probability of state i by the emission probability from state i to observable O at time t = 1 (Shop).

Initialization of Viterbi Algorithm

For the Sunny state, we multiply its initial probability, 0.6, by its emission probability to the observable Shop, 0.2. Hence, 0.12.

For the Rainy state, we multiply its initial probability, 0.4, by its emission probability to the observable Shop, 0.2. Hence, 0.08.
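In code, the initialization step is a one-liner per state (a small sketch; variable names are mine):

```javascript
// delta_1(i) = pi_i * b_i(O_1), with O_1 = 'Shop'
const initial = { Sunny: 0.6, Rainy: 0.4 };
const emitShop = { Sunny: 0.2, Rainy: 0.2 }; // b(state -> Shop)

const delta1 = {};
for (const state of ['Sunny', 'Rainy']) {
  delta1[state] = initial[state] * emitShop[state];
}
// delta1 now holds approximately { Sunny: 0.12, Rainy: 0.08 }
```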

To retrieve the state sequence, we also need to keep track, for each t and j, of the argument that maximized the equation, i.e. the previous state that produced the maximum. We do this via the array ψ. In the initialization step, the first ψ variable of every state is set to 0, because there is no previous state leading into the first time step.

Viterbi Initialization Array Equation:
ψ_1(i) = 0

ψ(Sunny) = [ 0 ]
ψ(Rainy) = [ 0 ]

Recursion

Viterbi Recursion Equation:
δ_t(j) = max_i [ δ_{t-1}(i) · a_ij ] · b_j(O_t)

Unlike the Forward and Backward algorithms, the Viterbi algorithm does not sum up the results of all the multiplications. Instead, it takes the maximum value among them and assigns it to the new Viterbi variable.

The multiplication consists of the previous Viterbi variable of state i, times the transition probability from state i to j, times the emission probability from state j to observation O.

Recursion of Viterbi Algorithm 1

In the case above, we’re choosing the highest value of two multiplications:

  • The Viterbi variable of the previous Sunny state, 0.12, times the transition probability from Sunny to Sunny, 0.8, times the emission probability from Sunny to Clean, 0.1.
    0.12 * 0.8 * 0.1 = 0.0096
  • The Viterbi variable of the previous Rainy state, 0.08, times the transition probability from Rainy to Sunny, 0.4, times the emission probability from Sunny to Clean, 0.1.
    0.08 * 0.4 * 0.1 = 0.0032

0.0096 > 0.0032, hence δ = 0.0096
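The two candidate products and the max over them can be checked directly (a sketch; names are mine, values from the article):

```javascript
// delta_2(Sunny) = max_i [ delta_1(i) * a(i -> Sunny) ] * b(Sunny -> Clean)
const delta1 = { Sunny: 0.12, Rainy: 0.08 };
const intoSunny = { Sunny: 0.8, Rainy: 0.4 }; // transition probabilities into Sunny
const emitClean = 0.1; // b(Sunny -> Clean)

const candidates = ['Sunny', 'Rainy'].map(
  (i) => delta1[i] * intoSunny[i] * emitClean
);
const delta2Sunny = Math.max(...candidates);
// delta2Sunny is approximately 0.0096
```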

As for the ψ array, we append the argument that achieved the maximum, defined by the following equation:
ψ_t(j) = argmax_i [ δ_{t-1}(i) · a_ij ]

In the case above, the argument that maximized the Viterbi variable was the previous Sunny state that gave the result of 0.0096 for the Viterbi variable of the next Sunny state. Consequently, this Sunny state will be added to the array ψ of Sunny:

ψ(Sunny) = [ 0, Sunny ]

Similarly:

Recursion of Viterbi Algorithm 2

ψ(Rainy) = [ 0, Rainy ]

And so on, until we have all the Viterbi variables.

Recursion of Viterbi Algorithm 3

ψ(Sunny) = [ 0, Sunny, Rainy, Sunny ]
ψ(Rainy) = [ 0, Rainy, Rainy, Sunny ]
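The whole recursion (initialization included) can be sketched as a double loop over time steps and states. This is my own sketch, not the mary-markov code, and it reproduces the δ and ψ tables above:

```javascript
// Viterbi recursion over the four observations, filling the
// delta (probability) and psi (backpointer) tables.
const states = ['Sunny', 'Rainy'];
const initial = { Sunny: 0.6, Rainy: 0.4 };
const transition = {
  Sunny: { Sunny: 0.8, Rainy: 0.2 },
  Rainy: { Sunny: 0.4, Rainy: 0.6 },
};
const emission = {
  Sunny: { Shop: 0.2, Clean: 0.1, Bike: 0.3, Paint: 0.4 },
  Rainy: { Shop: 0.2, Clean: 0.45, Bike: 0.05, Paint: 0.3 },
};
const observations = ['Shop', 'Clean', 'Bike', 'Paint'];

// Initialization: delta_1(i) = pi_i * b_i(O_1), psi_1(i) = 0
const delta = [{}];
const psi = [{}];
for (const s of states) {
  delta[0][s] = initial[s] * emission[s][observations[0]];
  psi[0][s] = 0; // no predecessor at the first time step
}

// Recursion: delta_t(j) = max_i [ delta_{t-1}(i) * a_ij ] * b_j(O_t)
for (let t = 1; t < observations.length; t++) {
  delta.push({});
  psi.push({});
  for (const j of states) {
    let best = -Infinity;
    let bestState = null;
    for (const i of states) {
      const score = delta[t - 1][i] * transition[i][j];
      if (score > best) {
        best = score;
        bestState = i;
      }
    }
    delta[t][j] = best * emission[j][observations[t]];
    psi[t][j] = bestState; // argmax_i [ delta_{t-1}(i) * a_ij ]
  }
}

console.log(psi.map((col) => col.Sunny)); // [ 0, 'Sunny', 'Rainy', 'Sunny' ]
```

Note that the backpointers for Sunny match the ψ(Sunny) array built by hand above (minus the final element, which is added in the termination step).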

Termination

Viterbi Termination Equation:
P* = max_i δ_T(i)

This equation represents the probability of the entire state sequence up to time T having been produced, given the observations and the HMM’s parameters.

So, we need to find the maximum value among all the Viterbi variables at time T, i.e. the variables of every state at the end of the observation sequence. Hence, in our example above:

At time T the Sunny state has a Viterbi variable equal to 0.00082944, and the Rainy state has a Viterbi variable equal to 0.00015552.

0.00082944 > 0.00015552, hence P* = 0.00082944

And the final element of the ψ arrays is Sunny, because the argument that maximizes the following equation is Sunny:
q*_T = argmax_i δ_T(i)

ψ(Sunny) = [ 0, Sunny, Rainy, Sunny, Sunny]
ψ(Rainy) = [ 0, Rainy, Rainy, Sunny, Sunny ]
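The termination step is just a max (and argmax) over the final Viterbi variables (a sketch; names are mine):

```javascript
// P* = max_i delta_T(i); the best final state is the corresponding argmax.
const deltaT = { Sunny: 0.00082944, Rainy: 0.00015552 };

const bestFinal = Object.keys(deltaT).reduce((a, b) =>
  deltaT[a] >= deltaT[b] ? a : b
);
console.log(bestFinal, deltaT[bestFinal]); // Sunny 0.00082944
```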

Backtracking

The start of the backtrace corresponds to the last state of the hidden state sequence, and is given by the ψ equation at the termination step above, therefore, the Sunny state.

So, our hidden state sequence Q looks like this:
Q = [ ?, ?, ?, Sunny ]

The equation below finds the hidden state sequence by backtracing through the ψ arrays:
q*_t = ψ_{t+1}(q*_{t+1})

This might look very, very odd, but its explanation is very, very simple once you get the hang of it.

The q*s are the states of the state sequence we’re trying to find. So, when t+1 is equal to T (the end of the sequence i.e. 4) the state q* is the last state Sunny.

Now, to find q* at time t, we look in the ψ array of the state q*_{t+1}, at position t+1.

Hold up, this is tricky.

Let’s look at our arrays ψ:

ψ(Sunny) = [ 0, Sunny, Rainy, Sunny, Sunny ]
ψ(Rainy) = [ 0, Rainy, Rainy, Sunny, Sunny ]

The last state q* at time t=4 is Sunny, so to find the hidden state at time t=3 we look in the ψ array of Sunny at position t+1, i.e. 3+1=4. In this case, it’s Sunny again.

ψ(Sunny) = [ 0, Sunny, Rainy, Sunny, Sunny ]

So, our q* at time t=3 is Sunny.
Q = [ ?, ?, Sunny, Sunny ]

Now, if we want to find q* at time t=2, we look in the ψ array of q* at t=3 (Sunny), at position 3. This time, it’s Rainy.

ψ(Sunny) = [ 0, Sunny, Rainy, Sunny, Sunny ]

So, q* at time t=2 is Rainy.
Q = [ ?, Rainy, Sunny, Sunny ]

Similarly, to find q* at time t=1, we look in the ψ array of q* at t=2 (Rainy), at position 2. This gives us Rainy again.

ψ(Rainy) = [ 0, Rainy, Rainy, Sunny, Sunny ]

So, q* at time t=1 is Rainy.

Consequently, if Lea’s observation sequence was:
Shop => Clean => Bike => Paint
Thanks to the backtracking step we now know the sequence of hidden states:
Q = [ Rainy, Rainy, Sunny, Sunny ]

Hidden State Sequence derived from Backtracking
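The whole backtracking walk above can be sketched in a few lines (the ψ table is written time-major here, and the names are mine):

```javascript
// psi[t][j] = the state at step t-1 that led to state j at step t
// (psi[0] is the initialization column, where there is no predecessor).
const psi = [
  { Sunny: 0, Rainy: 0 },
  { Sunny: 'Sunny', Rainy: 'Rainy' },
  { Sunny: 'Rainy', Rainy: 'Rainy' },
  { Sunny: 'Sunny', Rainy: 'Sunny' },
];

let q = 'Sunny'; // best final state, from the termination step
const Q = [q];
for (let t = psi.length - 1; t >= 1; t--) {
  q = psi[t][q]; // q*_t = psi_{t+1}(q*_{t+1})
  Q.unshift(q); // prepend, so Q ends up in chronological order
}
console.log(Q); // [ 'Rainy', 'Rainy', 'Sunny', 'Sunny' ]
```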

Conclusion & What’s next

So, I know this was probably very hard, especially getting your head round the backtracking step, but I hope you guys managed to understand all of it. If not, let me know in the responses what doesn’t look clear.

*Bonus tip for developers*

Like in the previous article, I would like to share with every developer who’s reading this how to run the Viterbi algorithm with my npm package, mary-markov.

You can download the npm package here, and see how I wrote the JavaScript code here.

Once you have installed it with npm install --save mary-markov, you can write the following code in your index.js file and run it. This will solve the Decoding problem for you by running the Viterbi algorithm and backtracking through the hidden states.

*********

In the next article we’re going to be looking at the Learning problem of HMMs and how to solve it using the Baum-Welch algorithm.

I’ll see you next time!
