Markov Chains: A simple, powerful tool

Aditya Singh
Future Vision

--


About a century ago, the Russian mathematician A. A. Markov founded a new branch of probability theory by applying mathematics to poetry. Diving into the text of Alexander Pushkin’s novel in verse Eugene Onegin, Markov spent hours sifting through patterns of vowels and consonants, and proved that dependent variables, too, can converge to predictable distributions. Before him, another scientist of his time, Pavel Nekrasov, had claimed that independence is a necessary condition for the law of large numbers. The law of large numbers, stated by Bernoulli, says that if an experiment is repeated a very large number of times, its average outcome converges to the expected value. It was a clash of two different ideas, and it makes for a very interesting story (read about it).

So what are Markov chains?

Informally speaking, if you want to understand Markov chains, consider the following example.

You have a glass of clear water. Drop by drop, you add blue ink to it, and the ink mixes with the water. Eventually the water becomes completely blue, and adding more drops makes no visible difference to the naked eye, i.e. the water has reached saturation, its expected long-run state.

The formal definition is:

A Markov chain is a stochastic model describing a sequence of possible events in which the probability of each event depends only on the state attained in the previous event.
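
To make the definition concrete, here is a minimal sketch in C++ (my own illustrative example, not code from the article) of a two-state weather chain: the next state is drawn from a distribution that depends only on the current state.

```cpp
// A minimal sketch (illustrative, not from the article): a two-state weather chain.
// The next state is sampled using only the current state's row of the
// transition matrix -- nothing about earlier states is consulted.
#include <iostream>
#include <random>
#include <string>
#include <vector>

int main() {
    std::vector<std::string> states = {"Sunny", "Rainy"};
    // transition[i][j] = probability of moving from state i to state j
    std::vector<std::vector<double>> transition = {
        {0.8, 0.2},   // Sunny -> Sunny, Sunny -> Rainy
        {0.4, 0.6}    // Rainy -> Sunny, Rainy -> Rainy
    };

    std::mt19937 rng(std::random_device{}());
    int current = 0;  // start in "Sunny"

    for (int step = 0; step < 10; ++step) {
        std::cout << states[current] << ' ';
        // Sample the next state from the current state's distribution only.
        std::discrete_distribution<int> next(transition[current].begin(),
                                             transition[current].end());
        current = next(rng);
    }
    std::cout << '\n';
    return 0;
}
```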

It might look like a simple tool, but it has a huge presence in our day-to-day life. Some of its use cases are:

  • Google’s PageRank algorithm
  • Predicting market trends
  • Text autocomplete on your phone
  • Chatbots
  • Weather prediction
  • Compression

The list goes on. I collected manifestos of one of the major national political parties of India, the Congress, and tried to generate a random manifesto from them. The dataset was about 3,200 words, and I generated a manifesto of 200 words. (The dataset was small, but it was only meant to show how Markov chains work.) A Markov chain consists of states. At each state there is a probability distribution that dictates the chances of moving to the next state, i.e. the future state depends only on the present state and has nothing to do with past states.
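
For text generation, a state is simply a word (or a short prefix of words), and the outgoing probabilities are just the relative frequencies of the words that follow it in the training text. The snippet below is a small illustrative sketch of that counting step; the variable names and the toy sentence are my own, not taken from the article’s code.

```cpp
// Illustrative sketch: count how often each word follows a given word.
// The counts, once normalised, are the transition probabilities of the chain.
#include <iostream>
#include <map>
#include <sstream>
#include <string>

int main() {
    std::string text = "we will build we will ensure we will build";

    // follower counts: previous word -> (next word -> count)
    std::map<std::string, std::map<std::string, int>> counts;
    std::istringstream in(text);
    std::string prev, word;
    in >> prev;
    while (in >> word) {
        ++counts[prev][word];
        prev = word;
    }

    // "will" is followed by "build" twice and "ensure" once, so from the
    // state "will" we would pick "build" with probability 2/3.
    for (const auto& [next, c] : counts["will"])
        std::cout << "will -> " << next << " : " << c << '\n';
    return 0;
}
```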

I implemented the Markov chain in C++. It does not require any powerful tools, so you can implement it in any language.

  • Include the required standard libraries (one possible set is shown in the sketch after this list).
  • Create a Markov class. It consists of two private functions and one public function. The two private functions are textGen() and createGraph(). We choose a prefix length of two words, which determines the suffix: createGraph() builds a directed graph from each two-word prefix to the suffixes that follow it. The more often a prefix leads to a particular suffix, the higher that suffix’s probability of being selected when generating text.
  • The public method takes only the file name, the length of prefix + suffix, and the length of the text to be generated.
  • The main function initialises an object of the class and calls its public method.
  • One of the generated manifestos is shown here.
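
The original code listings appeared as images, so here is a minimal sketch in the same spirit, assuming a two-word prefix. The class name and the helpers createGraph() and textGen() follow the description above, but this is my reconstruction, not the author’s exact code, and "manifesto.txt" is a placeholder file name.

```cpp
#include <fstream>
#include <iostream>
#include <map>
#include <random>
#include <string>
#include <vector>

// Sketch of the Markov class described above: a directed graph from
// two-word prefixes to the suffix words that follow them in the text.
class Markov {
    std::map<std::string, std::vector<std::string>> graph;  // prefix -> suffixes
    std::vector<std::string> words;
    std::mt19937 rng{std::random_device{}()};

    // Read the file and map every two-word prefix to the word that follows it.
    // A suffix that follows a prefix more often appears more times in its
    // vector, so it is proportionally more likely to be picked later.
    void createGraph(const std::string& file) {
        std::ifstream in(file);
        std::string w;
        while (in >> w) words.push_back(w);
        for (size_t i = 0; i + 2 < words.size(); ++i) {
            std::string prefix = words[i] + " " + words[i + 1];
            graph[prefix].push_back(words[i + 2]);
        }
    }

    // Start from a random prefix and repeatedly append a suffix drawn at
    // random from the current prefix's list, then slide the prefix forward.
    std::string textGen(int length) {
        if (words.size() < 3) return "";
        std::uniform_int_distribution<size_t> start(0, words.size() - 3);
        size_t i = start(rng);
        std::string w1 = words[i], w2 = words[i + 1], out = w1 + " " + w2;
        for (int n = 2; n < length; ++n) {
            auto it = graph.find(w1 + " " + w2);
            if (it == graph.end() || it->second.empty()) break;
            std::uniform_int_distribution<size_t> pick(0, it->second.size() - 1);
            const std::string& next = it->second[pick(rng)];
            out += " " + next;
            w1 = w2;
            w2 = next;
        }
        return out;
    }

public:
    // Public entry point: file name, prefix+suffix window (fixed to
    // 3 = two-word prefix + one suffix in this sketch), and output length.
    std::string generate(const std::string& file, int window, int length) {
        (void)window;  // kept only to mirror the interface described above
        createGraph(file);
        return textGen(length);
    }
};

int main() {
    Markov m;
    // "manifesto.txt" is a placeholder for the training text file.
    std::cout << m.generate("manifesto.txt", 3, 200) << '\n';
    return 0;
}
```

Run over the roughly 3,200-word corpus, this prints a 200-word pseudo-manifesto, with the more frequent prefix-to-suffix transitions showing up proportionally more often.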

If you are into machine learning and have understood what Markov chains are, you must be wondering how the two differ. One basic difference is that a Markov chain is pure probability: it has no knowledge of the past beyond the current state, whereas in ML you have knowledge of the past; in fact, you learn from it. Variants of Markov chains (for example, hidden Markov models) are used in ML.

For a better understanding of Markov chains, you can watch these videos:

This is just a simple use of Markov chains, or you could say a way to understand the concept by doing. If you found this article interesting, don’t forget to clap. Happy Coding :)
