Markov Chain One Gram Model

Keru Chen
Feb 23, 2023


In this post, I want to share how to use a Markov Chain to build a one gram model, along with the code in Python.

Introduction

A Markov Chain describes a chain process in which the next outcome depends on the previous one. For instance, suppose you want to know tomorrow’s weather. You might then consider today’s weather and the historical weather when making this prediction. Say I collected the past 7 days of weather, and they are:
Day1: Sun, Day2: Cloudy, Day3: Rain, Day4: Rain, Day5: Sun, Day6: Sun, Day7: Cloudy.

Then we can write this as a sequence:
Sun → Cloudy → Rain → Rain → Sun → Sun → Cloudy

From this sequence, we can build a transition matrix:

Transition matrix: each cell is the count of the next event (column) occurring after the previous event (row).

          Sun   Rain   Cloudy
Sun        1     0       2
Rain       1     1       0
Cloudy     0     1       0

This is a 3×3 square matrix since we have 3 outcomes: Sun, Rain, Cloudy. In each cell, we write down how many times the pattern of two events occurring one after another happens. In this case, the row is the previous event and the column is the next event.

For the first row, the first cell (Sun → Sun) records how many times a sunny day happened after a sunny day. It happened only once (Day6 after Day5). The second cell (Sun → Rain) records how many times a rainy day happened after a sunny day; it happened 0 times in the given data. The third cell (Sun → Cloudy) records how many times a cloudy day happened after a sunny day; it happened 2 times (Day2 after Day1, Day7 after Day6).
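To make the counting concrete, here is a minimal Python sketch (the variable names are my own, not from the code later in the post) that tallies the transition counts directly from the 7-day sequence:

from collections import Counter

# illustrative sketch, not the post's implementation
# the 7-day weather sequence from above
days = ["Sun", "Cloudy", "Rain", "Rain", "Sun", "Sun", "Cloudy"]

# count every (previous day, next day) pair
counts = Counter(zip(days, days[1:]))

print(counts[("Sun", "Sun")])     # 1
print(counts[("Sun", "Rain")])    # 0 (Counter returns 0 for unseen pairs)
print(counts[("Sun", "Cloudy")])  # 2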

Then we can compute the probabilities based on the transition matrix:

Each cell is the probability of the next event (column) occurring after the previous event (row).

          Sun   Rain   Cloudy
Sun       1/3    0      2/3
Rain      1/2   1/2      0
Cloudy     0     1       0

Given that a day was sunny, there were 3 outcomes for the next day: 1 sunny day and 2 cloudy days. The probability of the next day’s weather, given that today is sunny, can then be calculated as:

P(next day is sunny | today is sunny) = 1/3;
P(next day is cloudy | today is sunny) = 2/3.

Based on the data we collected, it is twice as likely to have a cloudy day tomorrow given that today is sunny.
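Continuing the sketch above (repeating the setup so it runs on its own), the probabilities come from dividing each count by its row total; row_totals and probs are my own illustrative names:

from collections import Counter, defaultdict

# illustrative sketch, not the post's implementation
days = ["Sun", "Cloudy", "Rain", "Rain", "Sun", "Sun", "Cloudy"]
counts = Counter(zip(days, days[1:]))

# total number of transitions out of each state (the row sums of the transition matrix)
row_totals = defaultdict(int)
for (prev, nxt), n in counts.items():
    row_totals[prev] += n

# divide each count by its row total to get the conditional probabilities
probs = {(prev, nxt): n / row_totals[prev] for (prev, nxt), n in counts.items()}

print(probs[("Sun", "Sun")])     # 0.333...
print(probs[("Sun", "Cloudy")])  # 0.666...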

A random sampling process is then performed using these probabilities. Let’s sample one value from a uniform distribution over [0, 1]:

sample = a random draw from the uniform distribution on [0, 1];
if sample is in [0, 1/3], then the next day is sunny;
if sample is in (1/3, 1], then the next day is cloudy.

To answer what the weather is on Day8, first identify the weather on Day7: Cloudy. Since there is only one observed outcome after a cloudy day (Rain, with probability 1), the random sampling process is:

sample = random.uniform(0, 1)
say sample = 0.34;
if sample is in [0, 1], then the next day is Rain
--> Day8 is Rain.

Then we can conclude that tomorrow (Day8) will be Rain with probability 1.
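That interval trick is all the sampler does. Here is a minimal sketch, where sample_next is a hypothetical helper of my own (the post’s actual implementation appears in the Python section below); it walks the cumulative probabilities of one row and returns the state whose interval contains the draw:

import random

def sample_next(prob_row):
    # hypothetical helper, not from the post's code
    # walk the cumulative probabilities and keep the state whose interval contains the draw
    sample = random.uniform(0, 1)
    cumulative = 0.0
    for state, p in prob_row.items():
        cumulative += p
        if sample <= cumulative:
            return state
    return state  # fall back to the last state if rounding left a tiny gap

print(sample_next({"Rain": 1.0}))                # Day8: always 'Rain'
print(sample_next({"Sun": 1/3, "Cloudy": 2/3}))  # 'Sun' about a third of the time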

This is a very simple example of a Markov Chain. Using the same idea, we can build a one gram model to predict the next possible word.

One Gram Model

For example, I have the sentence: it was a hot day, I had a popsicle

The next step is generating the transition matrix:

Transition matrix (row is the previous word, column is the next word); every observed pair occurs exactly once:
it → was: 1; was → a: 1; a → hot: 1; a → popsicle: 1; hot → 'day,': 1; 'day,' → I: 1; I → had: 1; had → a: 1

Then compute the probability matrix:

Probability matrix (row is the previous word, column is the next word); every transition has probability 1 except from 'a', which splits evenly between 'hot' and 'popsicle':
it → was: 1; was → a: 1; a → hot: 0.5; a → popsicle: 0.5; hot → 'day,': 1; 'day,' → I: 1; I → had: 1; had → a: 1

Let’s build a new sentence starting with ‘it’ (max 5 words for this sentence). The next words are drawn one at a time:

Start: it

sample = random.uniform(0, 1)
say sample = 0.34;
if sample is in [0, 1], then the next word is 'was'
sample = 0.34 is in [0, 1]
next word = 'was'

it → was

sample = random.uniform(0, 1)
say sample = 0.67;
if sample is in [0, 1], then the next word is 'a'
sample = 0.67 is in [0, 1]
next word = 'a'

it → was → a

sample = random.uniform(0, 1)
say sample = 0.24;
if sample is in [0, 0.5], then the next word is 'hot'
if sample is in (0.5, 1], then the next word is 'popsicle'
sample = 0.24 is in [0, 0.5]
next word = 'hot'

it → was → a → hot

sample = random.uniform(0, 1)
say sample = 0.86;
if sample is in [0, 1], then the next word is 'day,'
sample = 0.86 is in [0, 1]
next word = 'day,'

Final output: “it was a hot day,”.
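For comparison, the same walk can be written in a few lines with random.choices, which samples in proportion to the raw counts (equivalent to using the normalized probabilities above). This is just a compact sketch with my own variable names; the implementation the post actually uses follows in the next section.

import random
from collections import Counter

# illustrative sketch, not the post's implementation
sentence = "it was a hot day, I had a popsicle"
tokens = sentence.split(" ")

# map each word to a Counter of the words that follow it
counts = {}
for prev, nxt in zip(tokens, tokens[1:]):
    counts.setdefault(prev, Counter())[nxt] += 1

# generate up to 5 words starting from 'it'
word, out = "it", ["it"]
while len(out) < 5 and word in counts:
    next_words = list(counts[word].keys())
    weights = list(counts[word].values())
    word = random.choices(next_words, weights=weights)[0]
    out.append(word)

print(" ".join(out))  # e.g. "it was a hot day," or "it was a popsicle"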

One Gram in Python

I load a couple of positive sentences as my training text:

Just for the record darling, not all positive change feels positive in the beginning. Forgiveness is not an occasional act, it is a constant attitude. Nothing in the world is more dangerous than sincere ignorance and conscientious stupidity. We must accept finite disappointment, but never lose infinite hope. Sometimes, you face difficulties not because you are doing something wrong, but because you are doing something right. Strength does not come from winning. Your struggles develop your strengths. When you go through hardships and decide not to surrender, that is strength. Whenever you find yourself doubting how far you can go, just remember how far you have come. At any given moment you have the power to say: This is not how the story is going to end.

First, build the transition matrix (stored here as a word map from each word to the list of words that follow it):

class MarkovChain:

    def __init__(self):
        self.memory = {}

    def _learn_key(self, key, value):
        # record that `value` was observed right after `key`
        if key not in self.memory:
            self.memory[key] = []
        self.memory[key].append(value)

    def learn(self, text):
        # split the text into tokens and learn every adjacent (previous, next) pair
        tokens = text.split(" ")
        bigrams = [(tokens[i], tokens[i + 1]) for i in range(0, len(tokens) - 1)]
        for bigram in bigrams:
            self._learn_key(bigram[0], bigram[1])

m = MarkovChain()
m.learn(text)  # `text` holds the training paragraph above
wordmap = m.memory
print(wordmap)
Out[97]: 
{'Just': ['for'],
'for': ['the'],
'the': ['record', 'beginning.', 'world', 'power', 'story'],
'record': ['darling,'],
'darling,': ['not'],
'not': ['all', 'an', 'because', 'come', 'to', 'how'],
'all': ['positive'],
'positive': ['change', 'in'],
'change': ['feels'],
'feels': ['positive'],
'in': ['the', 'the'],
'beginning.': ['Forgiveness'],
...}

Second, build the probability matrix:

from collections import Counter

wordprob = {}
for key in wordmap.keys():
    # count how often each next word follows `key`, then normalize to probabilities
    c = Counter(wordmap[key])
    s = sum(c.values())
    wordprob[key] = {k: round(v / s, 2) for k, v in c.items()}

print(wordprob)
Out[100]: 
{'Just': {'for': 1.0},
'for': {'the': 1.0},
'the': {'record': 0.2,
'beginning.': 0.2,
'world': 0.2,
'power': 0.2,
'story': 0.2},
'record': {'darling,': 1.0},
'darling,': {'not': 1.0},
'not': {'all': 0.17,
'an': 0.17,
'because': 0.17,
'come': 0.17,
'to': 0.17,
'how': 0.17},
'all': {'positive': 1.0},
'positive': {'change': 0.5, 'in': 0.5},
...}
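As a quick sanity check (not part of the original post), each row of wordprob should sum to roughly 1; it is only roughly 1 because the probabilities were rounded to two decimals:

for key, row in wordprob.items():
    total = sum(row.values())
    assert abs(total - 1) < 0.05, (key, total)  # small slack for the two-decimal rounding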

Third, create the function for random sampling:

import random

def random_sample(prob_dict):
    # build cumulative cut points over [0, 1]: [0, p1, p1 + p2, ...]
    cuts = [0]
    for i in range(len(prob_dict)):
        key = list(prob_dict.keys())[i]
        prob = prob_dict[key]
        cuts_add = prob + cuts[i]
        cuts.append(cuts_add)
    cuts[-1] = max(cuts[-1], 1)  # ensure the last interval reaches 1 despite two-decimal rounding
    sample = random.uniform(0, 1)
    # the first cut point >= sample marks the right edge of the sampled word's interval
    idx = cuts.index([x for x in cuts if x >= sample][0])
    return list(prob_dict.keys())[idx - 1]
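For example, sampling a next word for 'positive' should return 'change' about half the time and 'in' the other half; the exact word changes from run to run:

print(random_sample(wordprob['positive']))  # 'change' or 'in', each with probability ~0.5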

Last, create the function to generate a sentence:

def create_sentence(start_word, max_length):
    sentence = [start_word]
    while len(sentence) <= max_length:
        if start_word not in wordprob:
            break  # dead end: the current word never appears as a previous word
        next_word = random_sample(wordprob[start_word])
        sentence.append(next_word)
        start_word = next_word
    return ' '.join(sentence)

Let’s create a sentence starting with ‘you’, with max 15 words:

print(create_sentence('you', 15))
Out[111]: 'you go through hardships and decide not all positive in the beginning. Forgiveness is not how'

This is all for this post :)

Hope you enjoyed reading.

Special thanks to this post from Kevin — https://sookocheff.com/post/nlp/ngram-modeling-with-markov-chains/ . I learned a lot from it :)

