Lyrics/Text Prediction using Markov Chain from scratch in Python

Dhruv Shrinet
4 min readAug 8, 2020

--

Markov chain model is a very simple model which works on the probability distribution , depending upon the past events the future events are predicted and so on, Many text editor use this model for further text prediction.

A very simple example here, for example if we have a X which has “Hello” as the value in it and we want to predict what can come after it , in the example below there are four cases and out of which “<space> ” came three times, and rest of the words came only once so the probability of “<space> ” is the highest

Let’s check out the probability distribution for the above text of sequence of length 5(k=5,which will make a distribution dictionary of every k characters and then store their frequency)

Let’s make a class and then proceed

I have used defaultdict() here just because it doesn’t throw key value error if the key is not found here in this case it will return 0 (because we have initialized it with int),This function goes through the text and picks X as the first k characters and Y as the k+1 character and if it’s not in the dictionary , initialize it with new dictionary and like inside the “if” statement and then initialize with 1, and if its already present increment by 1

Let’s see how it looks after making an object and passing our text

Each of the text on left side if of 5 characters(k=5) long, along with spaces also and on the right side is the frequency like “<space>” were three times after “Hello” and so on..

Now after the frequency distribution we need to probability distribution for the same so we can use it to predict the word.

This method iterates over the freq_dist keys first and from keys it goes through its values and sums up for example , here the function goes through ‘ello ’ and from there it goes to the values in it which have keys such as ‘t’,’D’ and ‘W’ it takes the sum of all values in it which comes up 3 and the place it in the original dictionary by averaging the (frequency/total sum) in place of their frequency values

This results in , here after dividing 1/3 which is the probability distribution of each word.

After creating the probability distribution we need to pick words depending upon the highest probability and if they are same then pick the random one.

Thanks to numpy we have function np.random.choice() which let us do the work.It returns single character

Lets try with “Hello “ , it gives “W” as it has the probability of 0.333.

what if we try for a word not in the dictionary

Now let’s create a method which makes a sentence out of a word , which keeps on predicting the word till it makes a sentence or passes a threshold value.

Here maxlen is the value of characters till which to predict, start is the first k letter word from which it will start predicting, the loop keeps iterating and adds to sentence which creates a whole new sentence from the previous k characters.

Let’s implement this model over lyrics , I have chosen Boulevard of dreams as the song (what’s better than that?).

Little per-processing

Final Result!!

The hyper-parameters such as k and maxlen can be adjusted as per your own wish for better result hit and try different values.

--

--