John Z
Jan 14, 2018

Creating Markov Chain Based Sonnets

Markov Chains have always fascinated me. Not because they allow for super useful technologies like speech recognition and web search, but because of their ability to create results that appear human and ‘creative’. Inspired by prior art (here (http://www.uliaea.ca/), here (https://filiph.github.io/markov/), and the infinite number of monkeys typing away at their typewriters), I thought I would take a crack at using Markov Chains to write something new, stealing from none other than the Bard himself.

Formally, a Markov Chain is a stochastic model of ‘memoryless’ state transitions, meaning the probability distribution of the next state depends only on the current state and none of the states before it (the Markov property). Here’s a quick example:

Let’s imagine that for a given year of 8th graders, 80% will move on to 9th grade, 15% will repeat 8th grade, and 5% will drop out of school. Let’s also suppose that student A has repeated 8th grade 5 years in a row. Common sense dictates that A will probably repeat 8th grade again, but to the memoryless Markov process the probabilities stay at 80%/15%/5%, since only the current grade (state) of the student is used to predict the future.
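To make the memoryless property concrete, here is a minimal sketch in Python of how such a transition could be sampled. The states and probabilities come straight from the example above; `TRANSITIONS` and `next_state` are just illustrative names:

```python
import random

# Transition probabilities out of the 8th-grade state, per the example above.
# Note that the student's history appears nowhere in this table.
TRANSITIONS = {
    "8th grade": [("9th grade", 0.80), ("8th grade", 0.15), ("dropped out", 0.05)],
}

def next_state(current):
    """Sample the next state using only the current one (the Markov property)."""
    states, weights = zip(*TRANSITIONS[current])
    return random.choices(states, weights=weights)[0]

# Whether student A is here for the first year or the fifth changes nothing:
print(next_state("8th grade"))  # 80% "9th grade", 15% "8th grade", 5% "dropped out"
```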

Since only the current step is ever taken into account, Markov Chains are simple to code yet can produce surprisingly powerful results. In our case the word frequencies will be generated from the 154 Shakespeare sonnets, with each current word being its own state (given the small data size, using more than one word per state would probably reproduce lines directly from the original sonnets).
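A word-level chain like this can be built in a few lines. The sketch below is one plausible way to do it, not the program’s actual code; `build_chain`, `generate`, and "sonnets.txt" are hypothetical names standing in for the real corpus and functions:

```python
import random
from collections import defaultdict

def build_chain(text):
    """Map each word (state) to the list of words that follow it in the corpus."""
    chain = defaultdict(list)
    words = text.split()
    for current, following in zip(words, words[1:]):
        chain[current].append(following)
    return chain

def generate(chain, start, length=8):
    """Walk the chain, sampling each next word from the current word's followers."""
    line = [start]
    for _ in range(length - 1):
        followers = chain.get(line[-1])
        if not followers:
            break
        line.append(random.choice(followers))
    return " ".join(line)

# Hypothetical usage; "sonnets.txt" stands in for the 154-sonnet corpus.
# chain = build_chain(open("sonnets.txt").read())
# print(generate(chain, "Shall"))
```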

While there have been other Markov sonnet attempts (see here (http://www.devjason.com/2010/12/28/shakespeare-sonnet-sourced-markov-text-generation/) and here (http://www.survo.fi/demos/ex55.html)), none of them preserve the classic ABAB CDCD EFEF GG rhyme scheme of the sonnets. Preserving the rhyme scheme proved to be an interesting problem that necessarily breaks the memoryless property of the Markov process (the last word of a rhyming line depends on a previous line rather than only the current word), but in the interest of making actual sonnets I’m willing to bend the rules. The rhyming problem can be solved by adding an additional Markov process for generating sonnet lines backwards and keeping lists of words that rhyme with each other to sample from randomly (sketched below). While we are at it, minimum and maximum line lengths were also added to make the couplets look more believable. The code for generating the sonnet pairs can be found here ().
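Here is a rough sketch of the backwards-generation idea under those assumptions: a second chain is built over reversed word order, a rhyming word is chosen as the line ending, and the line grows backwards from it within the length bounds. The function names, the rhyme table, and the 0.3 stop probability are all illustrative, not the actual implementation:

```python
import random

def build_reverse_chain(lines):
    """Chain over reversed word order, so a line can grow backwards
    from its final (rhyming) word toward its beginning."""
    chain = {}
    for line in lines:
        words = line.split()[::-1]
        for current, previous in zip(words, words[1:]):
            chain.setdefault(current, []).append(previous)
    return chain

def rhyming_line(chain, rhyme_word, min_len=6, max_len=10):
    """Generate a line that ends in rhyme_word, within the length bounds."""
    words = [rhyme_word]
    while len(words) < max_len:
        candidates = chain.get(words[-1])
        if not candidates:
            break
        words.append(random.choice(candidates))
        # Arbitrary stop chance once the minimum length is reached.
        if len(words) >= min_len and random.random() < 0.3:
            break
    return " ".join(reversed(words))

# Hypothetical rhyme list: pick two words from one group for an A/A line pair.
# rhymes = {"day": ["may", "stay", "way"], ...}
# first, second = random.sample(["day"] + rhymes["day"], 2)
# print(rhyming_line(chain, first))
# print(rhyming_line(chain, second))
```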

In the end, I’m pleasantly surprised by the sonnets generated by the program, despite the dozen or so eggs that were broken to get there. While Markov Chains can be a fantastic tool for modeling data, in some cases additional system logic needs to be implemented to reach a satisfactory result.