Storytelling That Tells Stories About (Lots and Lots of) Stories

Understanding Complex Systems

Jim Wildman
Sharon and Clyde
Nov 10, 2017


I’ve become a fan of a man I’ve never met who specializes in a data field that I understand about as much as I understand regional dialects in Papua New Guinea.

(That’s a small shout-out to a buddy who does, Jonathan Claussen. At least the Papua New Guinea part.)

Andy Reagan is the guy, a Senior Data Scientist at MassMutual’s Data Science Program.

Andy has a doctorate in Advanced Mathematics with a focus on Complex Systems.

He also describes himself as a “soon-to-be father.”

I became familiar with Andy’s research last summer after stumbling on coverage of a paper he co-authored about “The emotional arcs of stories.”

My first thought: another bite at that apple?

Project Gutenberg in a Nutshell

Boy, was I wrong.

The paper’s introduction begins with these claims, which are water to my soul:

  • “The power of stories to transfer information and define our own existence has been shown time and time again.”
  • “We are fundamentally driven to find and tell stories.”
  • “Stories are encoded in art, language, and even in the mathematics of physics.”

The paper goes on to:

Use a simple, robust sentiment analysis tool to extract the reader-perceived emotional content of written stories as they unfold on the page.
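For anyone who (like me) needs that idea made concrete, here is a minimal sketch of one way an emotional arc could be computed: slide a window of words across the text and average a per-word happiness score in each window. This is my own illustration, not the paper's actual code. The function name `emotional_arc`, the tiny `TOY_SCORES` lexicon, its made-up values, and the 1-to-9 scale are all assumptions chosen for the example.

```python
from typing import Dict, List

# Hypothetical toy lexicon: word -> happiness score on a 1-to-9 scale.
# The values are invented for illustration and come from no real word list.
TOY_SCORES: Dict[str, float] = {
    "love": 8.4, "happy": 8.3, "win": 7.8, "friend": 7.7,
    "lost": 2.8, "war": 1.8, "death": 1.5, "cry": 2.2,
}


def emotional_arc(text: str, window: int, step: int) -> List[float]:
    """Return the average score of the scored words in each sliding window.

    Windows that contain no scored words are skipped, so the result can be
    shorter than the number of windows.
    """
    # Lowercase each word and trim surrounding punctuation.
    words = [w.strip(".,;:!?\"'()").lower() for w in text.split()]
    arc: List[float] = []
    last_start = max(len(words) - window, 0)
    for start in range(0, last_start + 1, step):
        chunk = words[start:start + window]
        scores = [TOY_SCORES[w] for w in chunk if w in TOY_SCORES]
        if scores:
            arc.append(sum(scores) / len(scores))
    return arc


if __name__ == "__main__":
    story = "Love, happy friend, win! Then lost: war, death, cry."
    print(emotional_arc(story, window=4, step=4))
```

A short window reacts to every sentence, while a long one smooths the arc and can average away a brief emotional moment, so the window size is a real design choice in a sketch like this.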

After analyzing a huge database of texts in Project Gutenberg, Andy and his colleagues report that “the emotional arcs of stories are dominated by six basic shapes”:

  1. “Rags to Riches” (rise)
  2. “Tragedy” or “Riches to Rags” (fall)
  3. “Man in a Hole” (fall-rise)
  4. “Icarus” (rise-fall)
  5. “Cinderella” (rise-fall-rise)
  6. “Oedipus” (fall-rise-fall)

(For the record, the paper recognizes there have been “various attempts to enumerate and classify the core types of stories” — there’s also a reference to Kurt Vonnegut’s rejected master’s thesis about story arcs.)

Graphing Harry Potter

This is the part of the paper that caught my eye:

It’s a graphical representation of the emotional arc of “Harry Potter and the Deathly Hallows.”

It’s also explained this way in the paper:

Annotated emotional arc of Harry Potter and the Deathly Hallows, by JK Rowling, inspired by the illustration made by Medaris for The Why Files. The entire seven book series can be classified as a ‘Kill the monster’ plot, while the many sub plots and connections between them complicate the emotional arc of each individual book: this plot could not be readily inferred from the emotional arc alone. The emotional arc shown here, captures the major highs and lows of the story, and should be familiar to any reader well acquainted with Harry Potter. Our method does not pick up emotional moments discussed briefly, perhaps in one paragraph or sentence (e.g., the first kiss of Harry and Ginny).

So cool.

Sentiment in Books as Indicators of History

A second paper co-authored by Andy and others is equally captivating.

This one is titled “Sentiment analysis methods for understanding large-scale texts: a case for using continuum-scored words and word shift graphs.”

Again, there's lots there that extends well beyond my educational credentials … but the paper is a broad analysis of methods used to measure the sentiment of huge datasets such as tweets and news coverage.

A giant Google books database was the dataset this time.

Andy and his colleagues plotted the “sentiment time series”:

The graph shows the results from all the measurement methods they studied — those are all the acronym-like words on the right.

There are obvious consensus peaks and valleys.

The paper points out:

Three immediate trends stand out: a dip near the Great Depression, a dip near World War II, and a general upswing in the 1990’s and 2000’s.

Again, so cool.

Real Time Complex Data Analysis

It goes on.

Andy Reagan is also a contributor to Hedonometer.com, described this way:

Our hedonometer is based on people’s online expressions (via Twitter), capitalizing on data-rich social media, and we’re measuring how people present themselves to the outside world.

It’s interactive.

The site also has a really cool word-for-word sentiment slider analysis on the Declaration of Independence.

I think context analysis methods that are living and breathing like this are part of our future.

I also wonder what else is out there besides sentiment analysis.

Data is such an important storytelling resource — especially if it’s as up-to-the-moment as possible.


Jim Wildman
Sharon and Clyde

Jim helps clients tell brand stories for key audiences. He comes out of NPR where he spent 16 years with “Morning Edition.”