Introduction to Hidden Markov Models (HMM)

David Mavrodiev
4 min read · May 24, 2024

Hidden Markov Models (HMMs) are powerful statistical models, widely used in fields ranging from finance to speech recognition, and understanding them can significantly enhance your analytical abilities. Before diving into HMMs, I recommend familiarizing yourself with the basics of Markov Chains, as they form the foundation for understanding HMMs.

Definition

Let’s start with a definition of the Hidden Markov Model. It is simply a Markov Chain with one difference: each state emits an output according to some probability distribution. Let’s look at an example.

In the example HMM we have two states, Sunny and Rainy, and three possible outputs (called emissions): Golf, Boxing, and Football. Notice that each state has its own emission probability distribution. Emission probabilities should not be confused with transition probabilities.

In the example, the transition probability distribution is:

  • If it is currently Sunny, the chance of staying Sunny is 0.7 (70%) and the chance of transitioning to Rainy is 0.3 (30%).
  • If it is currently Rainy, the chance of staying Rainy is 0.6 (60%) and the chance of transitioning to Sunny is 0.4 (40%).

In the example, the emission probability distributions are:

  • If it is Sunny, there is a 0.45 (45%) chance to golf, a 0.2 (20%) chance to do boxing, and a 0.35 (35%) chance to play football.
  • If it is Rainy, there is a 0.2 (20%) chance to golf, a 0.5 (50%) chance to do boxing, and a 0.3 (30%) chance to play football.

This example HMM is a weather model: based on the observed activities (Golf, Boxing, and Football), we can try to deduce what the weather was (Sunny or Rainy).
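For concreteness, here is a minimal sketch of this example model in Python with NumPy. The matrix names A and B follow common HMM notation; they are not part of the article itself:

```python
import numpy as np

states = ["Sunny", "Rainy"]
emissions = ["Golf", "Boxing", "Football"]

# Transition probabilities: rows = current state, columns = next state.
A = np.array([
    [0.7, 0.3],  # Sunny -> Sunny, Sunny -> Rainy
    [0.4, 0.6],  # Rainy -> Sunny, Rainy -> Rainy
])

# Emission probabilities: rows = hidden state, columns = observed activity.
B = np.array([
    [0.45, 0.20, 0.35],  # Sunny: Golf, Boxing, Football
    [0.20, 0.50, 0.30],  # Rainy: Golf, Boxing, Football
])

# Sanity check: each row is a probability distribution and sums to 1.
assert np.allclose(A.sum(axis=1), 1.0)
assert np.allclose(B.sum(axis=1), 1.0)
```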

Goal of the HMM

Hidden Markov Models are used to model systems where you observe sequences of events and want to understand underlying processes that are not directly observable. These hidden states influence the sequence of observable events, and HMMs help infer the hidden states from the observed data. In short, HMMs model a system and predict its hidden states based on observable events. Typically the hidden states and their transitions are treated as a black box and are not known directly, which is why the states are referred to as “hidden”. The only things that are known are the emissions. So in the common case, we have a sequence of emissions and try to deduce information about the hidden states and transitions of the HMM.

The three fundamental HMM problems

In the context of Hidden Markov Models (HMMs), there are three classic types of problems that are commonly addressed:

Evaluation Problem (Likelihood Problem)

This involves determining the probability of observing a given sequence of emissions from an HMM. Let’s take our example HMM and define the evaluation problem for it as follows:

“Calculate the probability of the emission sequence [Boxing, Golf, Golf, Football, Boxing].”

Solving this type of problem helps in understanding how well an observed sequence of events fits the model. The Forward and Backward algorithms are typically used to compute this probability efficiently.
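As a rough illustration, here is the forward algorithm for this evaluation problem in Python, reusing the A and B matrices defined above. The article does not specify an initial state distribution, so a uniform one is assumed here:

```python
def forward_likelihood(A, B, pi, obs):
    """Likelihood of an observation sequence, via the forward algorithm."""
    # alpha[i] = P(observations so far, hidden state at current step = i)
    alpha = pi * B[:, obs[0]]
    for o in obs[1:]:
        alpha = (alpha @ A) * B[:, o]  # propagate one step, then emit
    return alpha.sum()

pi = np.array([0.5, 0.5])   # assumed uniform initial distribution
seq = [1, 0, 0, 2, 1]       # Boxing, Golf, Golf, Football, Boxing
print(forward_likelihood(A, B, pi, seq))
```

The point of the recursion is efficiency: it needs only O(T·N²) operations for T observations and N states, whereas summing over every possible hidden-state path grows exponentially with the sequence length.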

Decoding Problem

The goal here is to find the most likely sequence of hidden states that led to the observed sequence of emissions. Let’s take our example HMM and define the decoding problem for it as follows:

“Given the observed emission sequence [Boxing, Golf, Golf, Football, Boxing], determine the most likely sequence of hidden states.”

Solving this type of problem is essential for interpreting the underlying states that are not directly observable. The Viterbi Algorithm is the most common method for solving it, as it efficiently finds the most probable path through the hidden states.
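Here is a compact sketch of the Viterbi Algorithm for our example, again reusing A, B, and the assumed uniform initial distribution pi from above:

```python
def viterbi(A, B, pi, obs):
    """Most likely hidden-state path for an observation sequence."""
    # delta[i] = probability of the best path so far that ends in state i
    delta = pi * B[:, obs[0]]
    backptr = []
    for o in obs[1:]:
        trans = delta[:, None] * A        # trans[i, j] = delta[i] * A[i, j]
        backptr.append(trans.argmax(axis=0))
        delta = trans.max(axis=0) * B[:, o]
    # Reconstruct the best path by walking the back-pointers in reverse.
    path = [int(delta.argmax())]
    for bp in reversed(backptr):
        path.append(int(bp[path[-1]]))
    return path[::-1]

seq = [1, 0, 0, 2, 1]  # Boxing, Golf, Golf, Football, Boxing
print([states[i] for i in viterbi(A, B, pi, seq)])
```

Unlike the forward algorithm, which sums over all paths, Viterbi takes a maximum at each step and records which previous state achieved it, so the single best path can be read back at the end.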

Learning Problem

In this scenario, the task is to estimate the HMM parameters (transition probabilities between hidden states, emission probabilities from hidden states to observed emissions, and initial state probabilities) that maximize the likelihood of the observed sequence of emissions. Let’s take our example HMM and define the learning problem for it as follows:

“Given the observed sequence of emissions [Boxing, Golf, Golf, Football, Boxing], estimate the most likely set of state transition and emission probabilities for the HMM.”

This problem is often tackled using the Baum-Welch Algorithm, a specific case of the Expectation-Maximization (EM) algorithm, which iteratively estimates these probabilities.
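As one possible illustration, the third-party hmmlearn library implements Baum-Welch for discrete-emission HMMs; in recent versions (hmmlearn ≥ 0.3) the relevant class is CategoricalHMM. This is just a sketch of the workflow, not a method prescribed by the article, and a five-observation sequence is far too short to learn meaningful parameters:

```python
import numpy as np
from hmmlearn import hmm  # third-party: pip install hmmlearn

# Observations encoded as integers: Golf=0, Boxing=1, Football=2.
obs = np.array([[1], [0], [0], [2], [1]])  # Boxing, Golf, Golf, Football, Boxing

# Two hidden states; fit() runs Baum-Welch (EM) under the hood.
model = hmm.CategoricalHMM(n_components=2, n_iter=100, random_state=0)
model.fit(obs)

print(model.startprob_)      # estimated initial state distribution
print(model.transmat_)       # estimated transition probabilities
print(model.emissionprob_)   # estimated emission probabilities
```

In practice you would fit on many long sequences (passing a lengths argument to fit), since EM can only recover parameters that the data actually constrains.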

These three problems cover the main functionalities of HMMs in various applications, from speech recognition and bioinformatics to finance and weather prediction, providing a robust framework for dealing with temporal data where one wishes to infer hidden states from observable data.

In follow-up articles I will examine in detail how each of these three fundamental problems is solved algorithmically. Stay tuned!
