Scientists Use Big Data to Discover 6 Basic Emotional Story Arcs

Humans are fundamentally driven to find and tell stories. Stories have the power to share information and define our existence, which explains our innate fascination with storytelling. Advances in computing power, natural language processing, and the digitization of text now make it possible to study a culture through its texts using ‘big data.’

As American author Kurt Vonnegut said, “There is no reason why the simple shapes of stories can’t be fed into computers, they are beautiful shapes.”

Emotional arc of Harry Potter and the Deathly Hallows, by J.K. Rowling. The entire seven-book series can be classified as a complex “Kill the monster” plot. (Credit: Hedonometer / Andy Reagan / Kirsch)

Andrew Reagan at the Computational Story Lab at the University of Vermont in Burlington and his team used sentiment analysis to map the emotional arcs of over 1,300 stories and then used data-mining techniques to reveal the most common arcs. They discovered six core story arcs that form the building blocks of complex narratives.

This study is fascinating because it provides empirical evidence for the existence of basic story arcs for the first time, shedding light on the nature of storytelling and its appeal to humankind.

The Experiment

Using metadata from Project Gutenberg, the researchers assembled a collection of 1,327 books, most but not all of them fiction. To generate each emotional arc, they slid a 10,000-word window through the text (see image below) and rated the emotional content of each window.
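The sliding-window idea can be sketched in a few lines of Python. This is a minimal illustration, not the team's actual pipeline: the tiny word list, its scores, the function names, and the choice of 100 sample points are all assumptions made for the example, and a real analysis would load a full sentiment lexicon.

```python
from typing import Dict, List, Optional

# Hypothetical toy lexicon (word -> happiness score on a 1-9 scale).
# These scores are illustrative placeholders, not real lexicon values.
TOY_LEXICON: Dict[str, float] = {
    "love": 8.4, "happy": 8.3, "win": 7.5,
    "lost": 2.8, "dark": 3.2, "death": 1.5,
}


def window_score(words: List[str], lexicon: Dict[str, float]) -> Optional[float]:
    """Average score of the words found in the lexicon, or None if none match."""
    scores = [lexicon[w] for w in words if w in lexicon]
    return sum(scores) / len(scores) if scores else None


def emotional_arc(
    text: str,
    window_size: int = 10_000,
    n_points: int = 100,
    lexicon: Dict[str, float] = TOY_LEXICON,
) -> List[Optional[float]]:
    """Slide a fixed-size word window through the text and score each position."""
    # Naive tokenization: lowercase and split on whitespace (punctuation is kept).
    words = text.lower().split()
    if len(words) <= window_size:
        # Text is too short to slide a window; score it as a single window.
        return [window_score(words, lexicon)]

    last_start = len(words) - window_size
    steps = max(n_points - 1, 1)
    arc = []
    for i in range(n_points):
        # Spread the window start positions evenly from the first word
        # to the last position where a full window still fits.
        start = round(i * last_start / steps)
        arc.append(window_score(words[start:start + window_size], lexicon))
    return arc
```

Calling `emotional_arc(book_text)` returns one score per window position, which can then be plotted against the percentage of the book read to draw the arc.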

They arrived at the six emotional story arcs using three methods: as modes from a matrix decomposition by singular value decomposition (SVD), as clusters in a hierarchical clustering using Ward’s algorithm, and as clusters found by unsupervised machine learning.
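To make the first of these methods more concrete, here is a rough sketch of how SVD can pull common shapes out of a set of arcs. It assumes the arcs are stacked in a matrix with one row per book and one column per point in the story; the function name `arc_modes`, the row-centering step, and the synthetic straight-line arcs in the demo are assumptions for illustration, not details taken from the study.

```python
import numpy as np


def arc_modes(arcs: np.ndarray, n_modes: int = 6):
    """Decompose a (books x time points) matrix of arcs into shared modes.

    Returns the mode shapes, each book's weight on every mode,
    and the singular values that rank the modes by importance.
    """
    # Remove each book's average sentiment so only the shape remains.
    centered = arcs - arcs.mean(axis=1, keepdims=True)
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    # Rows of vt are the mode shapes; u * s gives per-book coefficients,
    # so that centered == (u * s) @ vt.
    weights = u * s
    return vt[:n_modes], weights[:, :n_modes], s[:n_modes]


# Synthetic demo arcs (not real book data): two rising and two falling lines.
t = np.linspace(0.0, 1.0, 50)
demo_arcs = np.vstack([t, -t, 2 * t, -0.5 * t])
modes, weights, singular_values = arc_modes(demo_arcs, n_modes=2)
```

In this toy case every arc is a scaled copy of one line, so a single mode explains everything, and books that rise and books that fall get weights of opposite sign on it. That sign flip is how one mode can stand for a pair of arcs such as rise and fall.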

Explore their story arcs and corresponding data visualizations here.

6 Basic Story Shapes

  • “Rags to riches” (rise)
  • “Tragedy”, or “Riches to rags” (fall)
  • “Man in a hole” (fall-rise)
  • “Icarus” (rise-fall)
  • “Cinderella” (rise-fall-rise)
  • “Oedipus” (fall-rise-fall)

From top left: rags to riches, man in a hole, Cinderella, tragedy, Oedipus, Icarus. (Credit: Reagan et al. / University of Vermont)


After establishing the six basic emotional story arcs, the team looked at the correlation between a story’s emotional arc and its number of downloads to see which types were most popular. It turns out the most popular stories followed the Icarus and Oedipus arcs.

Stories that followed more complex arcs, using the basic building blocks in sequence, were also popular. In fact, the research shows that the most popular stories involved either two sequential “Man in a hole” arcs or a “Cinderella” arc followed by a “Tragedy.”
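A popularity comparison like the one described above could be sketched as follows. This is only one simple way to do it, and it is an assumption rather than the study's method: it takes books that have already been labeled with an arc, groups them, and ranks the arcs by median download count (the median is used here because download counts tend to be skewed by a few blockbusters). The function name and the input format are hypothetical.

```python
from collections import defaultdict
from statistics import median
from typing import Iterable, List, Tuple


def median_downloads_by_arc(
    books: Iterable[Tuple[str, int]],
) -> List[Tuple[str, float]]:
    """Rank arc labels by the median download count of their books.

    `books` is an iterable of (arc_label, downloads) pairs.
    """
    by_arc = defaultdict(list)
    for arc_label, downloads in books:
        by_arc[arc_label].append(downloads)
    medians = {arc: median(counts) for arc, counts in by_arc.items()}
    # Most-downloaded arc first.
    return sorted(medians.items(), key=lambda item: item[1], reverse=True)
```

Given a list of (arc, downloads) pairs for a collection, the first entry of the result is the arc whose typical book is downloaded most often.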
