Hidden Markov Models: The Secret Sauce in Natural Language Processing

om pramod
18 min read · Dec 30, 2023


Part 9: The Viterbi Algorithm in Action: Practical Examples

Let’s consider a simple HMM with two states: ‘Rainy’ and ‘Sunny’. Let’s say we have an observation sequence [‘Play’, ‘Study’, ‘Work’]. We want to find the most likely sequence of states that could have resulted in this observation sequence.

And the initial state probabilities are:

  • P(Rainy) = 0.6
  • P(Sunny) = 0.4

The transition probabilities used throughout the calculations below are:

  • A[‘Rainy’][‘Rainy’] = 0.7, A[‘Rainy’][‘Sunny’] = 0.3
  • A[‘Sunny’][‘Rainy’] = 0.4, A[‘Sunny’][‘Sunny’] = 0.6

The emission probabilities are introduced alongside each observation.

Let’s discuss how we can solve this problem using the Viterbi algorithm. The Viterbi algorithm follows a recursive approach to find the most probable sequence of hidden states.

Here are the main steps of the Viterbi algorithm.

Initialization: The initialization step sets up the starting probabilities for the algorithm. For the first observation, the algorithm computes, for each state si, the probability of the most likely sequence ending in si at time 1: the product of the initial probability of si and the emission probability of the first observation given si. These values are the initial path probabilities for each state at the first time step.

Let’s break down the process using our running example: an HMM that models the weather (Rainy or Sunny) and the corresponding activities (Play, Study, Work).

First, initialize the Delta and Psi matrices, with one row per state and one column per time step:

Delta = [[0, 0, 0],

[0, 0, 0]]

Psi = [[0, 0, 0],

[0, 0, 0]]

The observation at the first time step is ‘Play’. The emission probabilities for ‘Play’ given each state are:

  • P(‘Play’|’Rainy’) = 0.1
  • P(‘Play’|’Sunny’) = 0.7

For the first observation, calculate the initial path probability for each state by multiplying the initial state probability by the emission probability of the first observation given that state.

· For Rainy, the initial path probability is P(Rainy) * P(Play | Rainy) = 0.6 * 0.1 = 0.06

· For Sunny, the initial path probability is P(Sunny) * P(Play | Sunny) = 0.4 * 0.7 = 0.28

Then, fill in the Delta and Psi matrices:

For the first observation (O[1] = ‘Play’):

Delta[‘Rainy’][1] = P(‘Rainy’) * B[‘Rainy’][‘Play’] = 0.6 * 0.1 = 0.06

Delta[‘Sunny’][1] = P(‘Sunny’) * B[‘Sunny’][‘Play’] = 0.4 * 0.7 = 0.28

Psi[‘Rainy’][1] = ‘Rainy’

Psi[‘Sunny’][1] = ‘Sunny’

After processing the first observation, update the Delta and Psi matrices. Here’s what the matrices would look like after the first step:

Delta Matrix:

Delta = [[0.06, 0, 0],

[0.28, 0, 0]]

In this matrix, each row corresponds to a state (‘Rainy’ and ‘Sunny’), and each column corresponds to a time step; only the first column has been filled so far. The value at Delta[i][j] represents the maximum probability of reaching state i at time j.

Psi Matrix:

Psi = [[‘Rainy’, 0, 0],

[‘Sunny’, 0, 0]]

In this matrix, each row corresponds to a state, and each column corresponds to an observation. The value at Psi[i][j] represents the state that had the maximum probability at time j.

At this stage of the algorithm, the Psi entries for the first time step simply record the state itself; meaningful backpointers are filled in as we progress through the observations.

Explanation: The lines above form the initialization phase of the Viterbi algorithm. They calculate the initial probability for each state and store it in the Delta matrix: the initial state probability (Pi) multiplied by the corresponding emission probability (B) for the first observation.

The Psi matrix is also initialized here. Each entry in the Psi matrix represents the state that had the maximum probability at the corresponding time step. Since we’re at the first time step, the only possible previous state is the initial state itself.

Let’s break down each line:

  1. Delta[‘Rainy’][1] = P(‘Rainy’) * B[‘Rainy’][‘Play’] = 0.6 * 0.1 = 0.06

This line calculates the initial probability for the ‘Rainy’ state. The probability of being in the ‘Rainy’ state at the first time step (time t=1) is calculated by multiplying the initial probability of being in the ‘Rainy’ state (P(‘Rainy’)) by the emission probability of observing ‘Play’ given that the weather is ‘Rainy’ (B[‘Rainy’][‘Play’]). The result is stored in Delta[‘Rainy’][1].

  2. Delta[‘Sunny’][1] = P(‘Sunny’) * B[‘Sunny’][‘Play’] = 0.4 * 0.7 = 0.28

Similarly, this line calculates the initial probability for the ‘Sunny’ state. The probability of being in the ‘Sunny’ state at the first time step is calculated by multiplying the initial probability of being in the ‘Sunny’ state (P(‘Sunny’)) by the emission probability of observing ‘Play’ given that the weather is ‘Sunny’ (B[‘Sunny’][‘Play’]). The result is stored in Delta[‘Sunny’][1].

  3. Psi[‘Rainy’][1] = ‘Rainy’

This line sets the initial state for the ‘Rainy’ state to ‘Rainy’. Since we’re at the first time step, the only possible previous state is the initial state itself.

  4. Psi[‘Sunny’][1] = ‘Sunny’

Similarly, this line sets the initial state for the ‘Sunny’ state to ‘Sunny’.

These initial path probabilities represent the likelihood of being in each state at the first time step, given the initial state probabilities and the emission probability of the first observation. The higher the value, the more likely it is that the corresponding state produced the first observation.
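To make the initialization concrete, here is a minimal Python sketch of this step, using the probabilities from the example above. The variable names (pi, B_play, delta, psi) are illustrative choices, not part of any library:

```python
# Model parameters from the example above.
pi = {'Rainy': 0.6, 'Sunny': 0.4}        # initial state probabilities
B_play = {'Rainy': 0.1, 'Sunny': 0.7}    # P('Play' | state)

n_obs = 3  # the observation sequence has three time steps
# delta[s][t] holds the maximum path probability ending in state s at time t;
# psi[s][t] holds the backpointer (previous state) for that path.
delta = {s: [0.0] * n_obs for s in pi}
psi = {s: [None] * n_obs for s in pi}

# Initialization: delta[s][0] = pi[s] * P('Play' | s).
for s in pi:
    delta[s][0] = pi[s] * B_play[s]

print(delta['Rainy'][0], delta['Sunny'][0])  # approximately 0.06 and 0.28
```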

Recursion:

Next, for each subsequent observation, the algorithm computes the maximum probability of the most likely sequence ending in each state.

The recursion step in the Viterbi algorithm is crucial as it calculates the maximum path probability for each state at each subsequent time step. This is done by selecting the maximum value obtained by considering all possible previous states and computing the product of three factors:

  1. The maximum probability of the most likely sequence ending in the previous state (δ[j][t-1]).
  2. The transition probability from the previous state to the current state (A[j][i]).
  3. The emission probability of the current observation given the current state (P(ot | i)).

The max function in the Viterbi algorithm selects the maximum path probability among all possible previous states for each state at each time step:

max over prev_state of: path_probability[prev_state][time-1] * transition_probability[prev_state][state] * emission_probability[state][observation]

Here’s what each component means:

  • path_probability[prev_state][time-1] represents the maximum path probability up to the previous time step for the previous state.
  • transition_probability[prev_state][state] is the transition probability from the previous state to the current state.
  • emission_probability[state][observation] is the emission probability of the current observation given the current state.

The formula for this calculation is:

δ[i][t] = max over j of ( δ[j][t-1] * A[j][i] * B[i][O_t] )

Here, δ[i][t] is the maximum path probability for state i at time t, δ[j][t-1] is the maximum path probability for state j at time t-1, A[j][i] is the transition probability from state j to state i, and B[i][O_t] is the emission probability of the t-th observation given state i.

The state that contributes most to the maximum path probability for state i at time t is recorded in the ψ matrix:

ψ[i][t] = argmax over j of ( δ[j][t-1] * A[j][i] )

Here, ψ[i][t] is the previous state that contributed most to the maximum path probability for state i at time t, and argmax() is the function that returns the argument that maximizes the expression inside the parentheses.

The goal is to maximize this value over all possible previous states for each state at each time step, because the most likely sequence of states is the one that maximizes the joint probability of the observed sequence and the hidden states. By doing this for each observation, the algorithm builds up a table of maximum probabilities for each state at each time step. At the end, the most likely sequence of states is the one that ends with the highest maximum probability. This recursive computation allows the Viterbi algorithm to efficiently find the most likely sequence of hidden states, even when there are many possible states and observations.
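The recursion step can be expressed as a small Python helper. This is a sketch, with my own function name and dictionary layout; the transition and emission values in the usage below are the ones used in the worked calculations:

```python
def viterbi_step(delta_prev, A, B_col, states):
    # delta_prev: state -> max path probability at time t-1
    # A[j][i]: transition probability from state j to state i
    # B_col: state -> emission probability of the current observation
    delta_t, psi_t = {}, {}
    for i in states:
        # Maximize over all possible previous states j, not just the
        # self-transition.
        best_j = max(states, key=lambda j: delta_prev[j] * A[j][i])
        delta_t[i] = delta_prev[best_j] * A[best_j][i] * B_col[i]
        psi_t[i] = best_j
    return delta_t, psi_t

# One step of the worked example: from t = 1 to t = 2 ('Study').
states = ['Rainy', 'Sunny']
A = {'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
     'Sunny': {'Rainy': 0.4, 'Sunny': 0.6}}
delta_1 = {'Rainy': 0.06, 'Sunny': 0.28}
B_study = {'Rainy': 0.3, 'Sunny': 0.2}
delta_2, psi_2 = viterbi_step(delta_1, A, B_study, states)
print(delta_2, psi_2)
```

Note that the helper returns both the new probabilities and the backpointers, mirroring the δ and ψ updates above.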

For the second observation (O[2] = ‘Study’):

Delta[‘Rainy’][2] = max(Delta[‘Rainy’][1] * A[‘Rainy’][‘Rainy’] * B[‘Rainy’][‘Study’], Delta[‘Sunny’][1] * A[‘Sunny’][‘Rainy’] * B[‘Rainy’][‘Study’]) = max(0.06 * 0.7 * 0.3, 0.28 * 0.4 * 0.3) = max(0.0126, 0.0336) = 0.0336

Delta[‘Sunny’][2] = max(Delta[‘Rainy’][1] * A[‘Rainy’][‘Sunny’] * B[‘Sunny’][‘Study’], Delta[‘Sunny’][1] * A[‘Sunny’][‘Sunny’] * B[‘Sunny’][‘Study’]) = max(0.06 * 0.3 * 0.2, 0.28 * 0.6 * 0.2) = max(0.0036, 0.0336) = 0.0336

Psi[‘Rainy’][2] = argmax(Delta[‘Rainy’][1] * A[‘Rainy’][‘Rainy’] * B[‘Rainy’][‘Study’], Delta[‘Sunny’][1] * A[‘Sunny’][‘Rainy’] * B[‘Rainy’][‘Study’]) = ‘Sunny’

Psi[‘Sunny’][2] = argmax(Delta[‘Rainy’][1] * A[‘Rainy’][‘Sunny’] * B[‘Sunny’][‘Study’], Delta[‘Sunny’][1] * A[‘Sunny’][‘Sunny’] * B[‘Sunny’][‘Study’]) = ‘Sunny’

The same procedure is applied at every subsequent time step: for each state, we take the maximum over all possible previous states.

After processing the second observation, update the Delta and Psi matrices:

Delta = [[0.06, 0.0336, 0],

[0.28, 0.0336, 0]]

Psi = [[‘Rainy’, ‘Sunny’, 0],

[‘Sunny’, ‘Sunny’, 0]]

These maximum path probabilities represent the probabilities of being in each state at the second time step, given the initial state probabilities, the emission probabilities of the first and second observations, and the transition probabilities between states.

Explanation: In the second step of the Viterbi algorithm, we calculate the maximum probability of reaching each state at time 2. This is done by considering all possible paths from all states at time 1 to each state at time 2, multiplying the probabilities along each path, and selecting the maximum value.

For the ‘Rainy’ state at time 2:

Delta[‘Rainy’][2] is calculated by considering two possible paths: one where we stay in the ‘Rainy’ state and another where we transition from the ‘Sunny’ state. Each path is weighted by its path probability at time 1 (Delta[‘Rainy’][1] or Delta[‘Sunny’][1]), the transition probability from the previous state to ‘Rainy’ (A[‘Rainy’][‘Rainy’] or A[‘Sunny’][‘Rainy’]), and the emission probability of the current observation given ‘Rainy’ (B[‘Rainy’][‘Study’]). The maximum of these two values gives the maximum probability of being in the ‘Rainy’ state at time 2.

Similarly, for the ‘Sunny’ state at time 2:

Delta[‘Sunny’][2] is calculated by considering two possible paths: one where we stay in the ‘Sunny’ state and another where we transition from the ‘Rainy’ state. Each path is weighted by its path probability at time 1 (Delta[‘Rainy’][1] or Delta[‘Sunny’][1]), the transition probability from the previous state to ‘Sunny’ (A[‘Rainy’][‘Sunny’] or A[‘Sunny’][‘Sunny’]), and the emission probability of the current observation given ‘Sunny’ (B[‘Sunny’][‘Study’]). The maximum of these two values gives the maximum probability of being in the ‘Sunny’ state at time 2.

The argmax() function is used to identify the state that led to the maximum probability. In this case, Psi[‘Rainy’][2] and Psi[‘Sunny’][2] store the state that gave the maximum probability at time 2 for each state. These will be used in the next step of the algorithm to trace back the most probable sequence of states

The Psi matrix is used to keep track of the state that achieved the maximum path probability at each time step. The Psi value for the ‘Rainy’ state at the second time step is determined as the state that achieves the maximum value in the above calculation, and similarly for the ‘Sunny’ state.

For the second observation (O[2] = ‘Study’), we calculate the maximum probability of being in each state (‘Rainy’ or ‘Sunny’) and update the Delta and Psi matrices accordingly:

  1. Delta[‘Rainy’][2] is calculated as the maximum, over both previous states, of the previous path probability (Delta[‘Rainy’][1] or Delta[‘Sunny’][1]) multiplied by the transition probability into ‘Rainy’ (A[‘Rainy’][‘Rainy’] or A[‘Sunny’][‘Rainy’]) and by the emission probability of the current observation given ‘Rainy’ (B[‘Rainy’][‘Study’]). This gives the maximum probability of being in the ‘Rainy’ state after observing ‘Study’.
  2. Similarly, Delta[‘Sunny’][2] is calculated as the maximum, over both previous states, of the previous path probability multiplied by the transition probability into ‘Sunny’ (A[‘Rainy’][‘Sunny’] or A[‘Sunny’][‘Sunny’]) and by the emission probability B[‘Sunny’][‘Study’]. This gives the maximum probability of being in the ‘Sunny’ state after observing ‘Study’.
  3. The Psi matrix is updated to keep track of the previous state that led to the maximum probability at each time step. For example, Psi[‘Rainy’][2] is the previous state that led to the maximum probability of being in the ‘Rainy’ state after observing ‘Study’, and Psi[‘Sunny’][2] is the previous state that led to the maximum probability of being in the ‘Sunny’ state after observing ‘Study’.

The goal of the Viterbi algorithm is to find the most probable sequence of hidden states that leads to the observed sequence of outputs, given the model’s parameters. This is achieved by recursively calculating the maximum probability of being in each state at each time step, considering all possible previous states and their corresponding transition probabilities.

For the third observation (O[3] = ‘Work’):

Delta[‘Rainy’][3] = max(Delta[‘Rainy’][2] * A[‘Rainy’][‘Rainy’] * B[‘Rainy’][‘Work’], Delta[‘Sunny’][2] * A[‘Sunny’][‘Rainy’] * B[‘Rainy’][‘Work’]) = max(0.0336 * 0.7 * 0.6, 0.0336 * 0.4 * 0.6) = max(0.014112, 0.008064) = 0.014112

Delta[‘Sunny’][3] = max(Delta[‘Rainy’][2] * A[‘Rainy’][‘Sunny’] * B[‘Sunny’][‘Work’], Delta[‘Sunny’][2] * A[‘Sunny’][‘Sunny’] * B[‘Sunny’][‘Work’]) = max(0.0336 * 0.3 * 0.1, 0.0336 * 0.6 * 0.1) = max(0.001008, 0.002016) = 0.002016

Psi[‘Rainy’][3] = argmax(Delta[‘Rainy’][2] * A[‘Rainy’][‘Rainy’] * B[‘Rainy’][‘Work’], Delta[‘Sunny’][2] * A[‘Sunny’][‘Rainy’] * B[‘Rainy’][‘Work’]) = ‘Rainy’

Psi[‘Sunny’][3] = argmax(Delta[‘Rainy’][2] * A[‘Rainy’][‘Sunny’] * B[‘Sunny’][‘Work’], Delta[‘Sunny’][2] * A[‘Sunny’][‘Sunny’] * B[‘Sunny’][‘Work’]) = ‘Sunny’

After processing the third observation, update the Delta and Psi matrices:

Delta = [[0.06, 0.0336, 0.014112],

[0.28, 0.0336, 0.002016]]

Psi = [[‘Rainy’, ‘Sunny’, ‘Rainy’],

[‘Sunny’, ‘Sunny’, ‘Sunny’]]

Explanation: For the third observation (O[3] = ‘Work’), the Viterbi algorithm again calculates the maximum probability of being in each state (‘Rainy’ or ‘Sunny’), considering both possible previous states, the transition probabilities between them, and the emission probability of the current observation given the current state.

  1. Delta[‘Rainy’][3] is the maximum, over both previous states, of the previous path probability (Delta[‘Rainy’][2] or Delta[‘Sunny’][2]) multiplied by the transition probability into ‘Rainy’ and by the emission probability B[‘Rainy’][‘Work’]. This gives the maximum probability of being in the ‘Rainy’ state after observing ‘Work’.
  2. Similarly, Delta[‘Sunny’][3] is the maximum, over both previous states, of the previous path probability multiplied by the transition probability into ‘Sunny’ and by the emission probability B[‘Sunny’][‘Work’]. This gives the maximum probability of being in the ‘Sunny’ state after observing ‘Work’.
  3. The Psi matrix is updated to record the previous state that led to each maximum. Psi[‘Rainy’][3] is the previous state that led to the maximum probability of being in the ‘Rainy’ state after observing ‘Work’.

By performing these calculations, the Viterbi algorithm builds up a table of maximum probabilities and the states that lead to them. At the end of the process, the state with the highest probability in the final column of the Delta matrix is the most probable state for the last observation, and the Psi matrix can be used to trace back the most probable sequence of states.

Termination: Finally, trace back from the state with the highest probability in the last column of the Delta matrix using the Psi matrix:

Maximum probability = max(Delta[‘Rainy’][3], Delta[‘Sunny’][3]) = max(0.014112, 0.002016) = 0.014112

State with maximum probability = argmax(Delta[‘Rainy’][3], Delta[‘Sunny’][3]) = ‘Rainy’

Traceback path:

State ‘Rainy’ at time 3 was reached from state ‘Rainy’ at time 2, because Psi[‘Rainy’][3] = ‘Rainy’

State ‘Rainy’ at time 2 was reached from state ‘Sunny’ at time 1, because Psi[‘Rainy’][2] = ‘Sunny’

So, the most probable path is [‘Sunny’, ‘Rainy’, ‘Rainy’]

Explanation: After filling the Delta and Psi matrices, the Viterbi algorithm proceeds to the termination phase. This phase involves finding the most probable sequence of states that led to the observed sequence of outputs.

  1. First, identify the state with the highest probability in the last column of the Delta matrix. In this case, the maximum probability is 0.014112, which corresponds to the ‘Rainy’ state.
  2. Then, trace back from this state to find the sequence of states that led to it, using the Psi matrix. The state ‘Rainy’ at time 3 was reached from state ‘Rainy’ at time 2, because Psi[‘Rainy’][3] = ‘Rainy’. The state ‘Rainy’ at time 2 was reached from state ‘Sunny’ at time 1, because Psi[‘Rainy’][2] = ‘Sunny’.
  3. Continue this process until you reach the start of the sequence. The resulting sequence of states is the most probable explanation of the observed outputs. In this case, the most probable path is [‘Sunny’, ‘Rainy’, ‘Rainy’].

In conclusion, the Viterbi algorithm provides a way to find the most likely sequence of hidden states in a Hidden Markov Model given a set of observations. It does this by building up a table of maximum probabilities and the states that lead to them, then tracing back from the state with the highest probability to find the most probable sequence of states.
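Putting the three phases together, here is a compact end-to-end sketch of the procedure, run on the Rainy/Sunny example. The transition and emission values are the ones used in the worked calculations above; the function itself is a generic textbook Viterbi, and all names are illustrative:

```python
def viterbi(obs, states, pi, A, B):
    """Return (best_prob, best_path) for an observation sequence."""
    # Initialization: delta[0][s] = pi[s] * B[s][obs[0]].
    delta = [{s: pi[s] * B[s][obs[0]] for s in states}]
    psi = [{}]  # no backpointers at the first time step
    # Recursion: maximize over all previous states at each step.
    for t in range(1, len(obs)):
        delta.append({})
        psi.append({})
        for i in states:
            best_j = max(states, key=lambda j: delta[t - 1][j] * A[j][i])
            delta[t][i] = delta[t - 1][best_j] * A[best_j][i] * B[i][obs[t]]
            psi[t][i] = best_j
    # Termination: pick the best final state, then trace back through psi.
    last = max(states, key=lambda s: delta[-1][s])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.insert(0, psi[t][path[0]])
    return delta[-1][last], path

states = ['Rainy', 'Sunny']
pi = {'Rainy': 0.6, 'Sunny': 0.4}
A = {'Rainy': {'Rainy': 0.7, 'Sunny': 0.3},
     'Sunny': {'Rainy': 0.4, 'Sunny': 0.6}}
B = {'Rainy': {'Play': 0.1, 'Study': 0.3, 'Work': 0.6},
     'Sunny': {'Play': 0.7, 'Study': 0.2, 'Work': 0.1}}

prob, path = viterbi(['Play', 'Study', 'Work'], states, pi, A, B)
print(prob, path)
```

In practice, real implementations work with log probabilities to avoid the underflow that repeated multiplication causes on long sequences; the structure of the algorithm is unchanged.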

Let’s discuss another example. Consider the following HMM:


State Space (S) = {“Sunny”, “Rainy”}

Observation Space (O) = {“Painting”, “Cleaning the house”, “Biking”, “Shopping for groceries”}

Initial probabilities (π) = |0.6 0.4| (Sunny = 0.6, Rainy = 0.4)

Let’s take the initial example about Om’s activities over four days. The observation sequence is: shopping, cleaning, biking and painting, i.e., O = [Shop, Clean, Bike, Paint].

The Decoding problem is asking us to find the best sequence of hidden states Q given the observation sequence O.

If Om has been shopping, cleaning, biking, and painting, what was the weather like during these four days? Q = ???

To solve this problem, we’ll apply the Viterbi Algorithm. The Viterbi algorithm features the same steps as the Forward algorithm (Initialization, Recursion and Termination), plus a Backtracking step to recover the sequence of hidden states.

Initialization:

δ[i][1] = π(i) * B[i][O₁]

This equation should look familiar if you’ve carefully read the previous article: it mirrors the initialization equation of the Forward Algorithm. Here, we multiply the initial probability of state i by the emission probability of the observable at time t = 1 (Shop) given state i.


For the ‘Sunny’ state, we multiply its initial probability, 0.6, with its emission probability to the observable ‘Shop’, 0.2. This gives us 0.12.

For the ‘Rainy’ state, we multiply its initial probability, 0.4, with its emission probability to the observable ‘Shop’, 0.2. This gives us 0.08.

To obtain the state sequence, we also need to keep track of the argument that maximized the equation for each t and j, i.e., the previous state that produced the maximum. We achieve this via the array ψ. During the initialization step, the first ψ entry of every state is set to 0, because at time 1 there is no previous state.

ψ(Sunny) = [ 0 ]
ψ(Rainy) = [ 0 ]

Recursion:

Unlike the Forward and Backward algorithms, in the Viterbi algorithm, we don’t sum up the results of all the multiplications. Instead, we find the maximum value among the multiplication results and assign that to the new Viterbi variable. The multiplication involves the previous Viterbi variable of state i, the transition probability from state i to j, and the emission probability from state j to the observation O.


In this situation, we’re selecting the higher value from two multiplications:

  1. The previous Viterbi variable of the ‘Sunny’ state, 0.12, multiplied by the transition probability from ‘Sunny’ to ‘Sunny’, 0.8, and then multiplied by the emission probability from ‘Sunny’ to ‘Clean’, 0.1. The result is 0.0096.
  2. The previous Viterbi variable of the ‘Rainy’ state, 0.08, multiplied by the transition probability from ‘Rainy’ to ‘Sunny’, 0.4, and then multiplied by the emission probability from ‘Sunny’ to ‘Clean’, 0.1. The result is 0.0032.

Since 0.0096 is greater than 0.0032, δ[Sunny][2] = 0.0096.

Regarding the ψ array, we record the argument that maximized the product: ψ[j][t] = argmax over i of ( δ[i][t-1] * A[i][j] ).

In the case above, the argument that maximized the Viterbi variable was the previous Sunny state that gave the result of 0.0096 for the Viterbi variable of the next Sunny state. Consequently, this Sunny state will be added to the array ψ of Sunny:

ψ(Sunny) = [ 0, Sunny ]

Similarly:


ψ(Rainy) = [ 0, Rainy ]

And so on, until all the Viterbi variables have been computed.


ψ(Sunny) = [ 0, Sunny, Rainy, Sunny ]

ψ(Rainy) = [ 0, Rainy, Rainy, Sunny ]

Termination:

The termination step computes P* = max over i of δ[i][T]: the probability of the most likely complete state sequence, given the observations and the HMM’s parameters. So, we need to find the maximum value among all Viterbi variables at time T, i.e., the variables of every state at the end of the observation sequence. In our example:

At time T, the ‘Sunny’ state has a Viterbi variable equal to 0.00082944, and the ‘Rainy’ state has a Viterbi variable equal to 0.00015552.

Since 0.00082944 is greater than 0.00015552, P* = 0.00082944.

The final element appended to the ψ arrays is ‘Sunny’, because the argument that maximizes δ[i][T], i.e., q*[T] = argmax over i of δ[i][T], is ‘Sunny’.

ψ(Sunny) = [ 0, Sunny, Rainy, Sunny, Sunny]

ψ(Rainy) = [ 0, Rainy, Rainy, Sunny, Sunny]

Backtracking:

The beginning of the backtrace corresponds to the last state of the hidden state sequence, which is given by the ψ equation at the termination step above, hence, the ‘Sunny’ state.

So, our hidden state sequence Q looks like this:

Q = [ ?, ?, ?, Sunny ]

The backtracking equation, q*[t] = ψ[q*[t+1]][t+1], is used to find the hidden state sequence by backtracing through the ψ arrays.

This might seem unusual, but it’s quite straightforward once you understand it.

The q*s represent the states we’re trying to find in the state sequence. When t equals T (the end of the sequence, i.e., 4), the state q*[4] is the last state, ‘Sunny’.

To find q*[t], we look in the array ψ of the state q*[t+1] at time t+1.

Let’s look at our ψ arrays:

ψ(Sunny) = [ 0, Sunny, Rainy, Sunny, Sunny ]

ψ(Rainy) = [ 0, Rainy, Rainy, Sunny, Sunny ]

The last state q* at time t=4 is ‘Sunny’, so to find the hidden state at time t=3 we search the array ψ of ‘Sunny’ at time t+1 i.e., 3+1=4. In this case, it’s ‘Sunny’ again.

So, our q* at time t=3 is ‘Sunny’. Q = [ ?, ?, Sunny, Sunny ]

If we want to find q* at time t = 2, we have to search the array ψ of q*t=3 (‘Sunny’) at time 3. In this case, it’s ‘Rainy’.

So, q* at time t=2 is ‘Rainy’. Q = [ ?, Rainy, Sunny, Sunny ]

If we want to find q* at time t = 1, we search the array ψ of q*t=2 (‘Rainy’) at time 2. This gives us ‘Rainy’ again.

So, q* at time t=1 is ‘Rainy’.

So, if Om’s observation sequence was Shop => Clean => Bike => Paint, then thanks to the backtracking step we now know the sequence of hidden states: Q = [ Rainy, Rainy, Sunny, Sunny ]
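The backtracking walk can be reproduced in a few lines of Python, using the ψ arrays exactly as listed above. Here, 0-based list index t-1 holds the ψ entry for time t, and the fifth entry is the terminal state appended at the termination step:

```python
# psi arrays as listed above; list index t-1 holds the entry for time t,
# and the last entry is the terminal state chosen at the termination step.
psi = {'Sunny': [0, 'Sunny', 'Rainy', 'Sunny', 'Sunny'],
       'Rainy': [0, 'Rainy', 'Rainy', 'Sunny', 'Sunny']}

T = 4
path = ['Sunny']                 # q*[4], selected at the termination step
for t in range(T - 1, 0, -1):    # t = 3, 2, 1
    # q*[t] = psi of state q*[t+1] at time t+1, which is list index (t+1) - 1 = t
    path.insert(0, psi[path[0]][t])

print(path)  # the recovered hidden state sequence
```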


Ending note- As we conclude our ninth installment, I trust that you have enjoyed the journey we’ve taken together. I am deeply grateful for your engagement. Your feedback fuels my passion for creating content that resonates with you. However, our adventure is far from over! There’s more to come, and I’m excited to continue sharing our journey with you in Part 10: The Learning Problem and the Baum-Welch Algorithm in HMM. So, if you haven’t already, please mark your calendars and get ready for another thrilling episode.

In the meantime, I invite you to revisit the previous parts, reflect on what we’ve covered, and share your thoughts in the comments section. Let’s make this a community of learners and grow together. Lastly, don’t forget to subscribe and hit the notification bell icon if you haven’t done so. This way, you won’t miss any future posts.

Keep seeking answers, keeping up with trends, and staying updated!!
