Punctuation in novels
Adam J Calhoun

Punctuation in some more novels.

I, too, was inspired by that series of posters and started a project but then set it aside. Adam did a great job moving forward with “heatmaps” and I thought I’d throw my hat in the ring.

Grabbing texts from Project Gutenberg, I stripped away everything that wasn’t one of:

. , ‘ “ : ; ! ? — _ ( ) * & [ ]

And made some “heatmaps” with a similar color scheme:
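For anyone who wants to try this at home, here is a minimal Python sketch of those two steps: strip a text down to its punctuation, then lay the marks out row by row so each cell can be given a color. It is not necessarily how my own code does it. The names `KEEP`, `strip_to_punctuation` and `to_grid` are made up for illustration, and adding the straight ASCII quotes to the set is an assumption, since plain-text files often use those instead of the curly ones shown above.

```python
import math

# The sixteen marks listed in the post, plus straight ASCII quotes
# (an assumption: plain-text files often use ' and " instead of curly quotes).
KEEP = set(".,‘“:;!?—_()*&[]") | set("'\"")


def strip_to_punctuation(text):
    """Return only the characters of `text` that are in KEEP, in order."""
    return "".join(ch for ch in text if ch in KEEP)


def to_grid(marks, width):
    """Lay the marks out row by row, `width` per row.

    The last row is padded with None so every row has the same length,
    which is what an image-plotting routine would typically expect.
    """
    rows = math.ceil(len(marks) / width)
    padded = list(marks) + [None] * (rows * width - len(marks))
    return [padded[r * width:(r + 1) * width] for r in range(rows)]


sample = "Hello, world! (Really?)"
marks = strip_to_punctuation(sample)
grid = to_grid(marks, width=2)
```

Each cell of `grid` holds one punctuation mark (or `None` for padding). Turning that into a picture is a matter of picking a color per mark and handing the grid to whatever plotting library you like.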

I noticed someone’s response was curious about comparing novels across authors, so here goes Mark Twain:

The Adventures of Tom Sawyer
The Adventures of Huckleberry Finn
The Prince and The Pauper
Life on the Mississippi
A Connecticut Yankee in King Arthur’s Court

I don’t see much similarity at first glance, other than that shared red line in Huck Finn and CT Yankee.

Here’s Charles Dickens:

A Tale of Two Cities
Nicholas Nickleby
David Copperfield
Great Expectations
Oliver Twist

Don’t know that I’d say there’s a lot of similarity. Nicholas Nickleby looks like it has a bunch of really long sentences though!

Some more analysis is called for, obvs. Proportion of colons/semicolons in all punctuation for each author? Average sentence length? Definitely there are more dimensions to examine.
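Both of those measures are easy to rough out. The sketch below is one way to do it, not a finished analysis: the function names are hypothetical, and splitting sentences on `.`, `!` and `?` is a crude assumption that will treat abbreviations like "Mr." as sentence ends.

```python
import re


def colon_semicolon_share(marks):
    """Fraction of a punctuation-only string that is ':' or ';'."""
    if not marks:
        return 0.0
    return sum(ch in ":;" for ch in marks) / len(marks)


def average_sentence_length(text):
    """Mean number of whitespace-separated words per sentence.

    Sentences are split on runs of '.', '!' or '?', which is only an
    approximation (abbreviations and ellipses are not handled).
    """
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


share = colon_semicolon_share(",;:.")
avg = average_sentence_length("One two three. Four five!")
```

Running these per novel and then averaging per author would give two numbers to set Twain against Dickens.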

A brief look at the frequency charts for Mark Twain:

Tom Sawyer
Huck Finn
CT Yankee
Life on MS

Almost definitely needs to be compared to other authors.
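To make that comparison fair across books of different lengths, it helps to work with relative frequencies rather than raw counts. Here is a small sketch using `collections.Counter` from the standard library. The function name is hypothetical, and it assumes the input is already a punctuation-only string.

```python
from collections import Counter


def punctuation_frequencies(marks):
    """Map each mark to its share of all marks, most common first."""
    counts = Counter(marks)
    total = sum(counts.values())
    if total == 0:
        return {}
    return {mark: n / total for mark, n in counts.most_common()}


freqs = punctuation_frequencies(",,.!")
```

Because the values for each book sum to 1, the dictionaries for two authors can be put side by side mark by mark.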

My code is available on GitHub.
