Punctuation in some more novels.
I, too, was inspired by that series of posters and started a project but then set it aside. Adam did a great job moving forward with “heatmaps” and I thought I’d throw my hat in the ring.
Grabbing texts from Project Gutenberg, I stripped away everything that wasn’t one of
. , ‘ “ : ; ! ? — _ ( ) * & [ ]
And made some “heatmaps” with a similar color scheme:
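The stripping step can be sketched in a few lines of Python. This is only a minimal illustration, not the code from the repository: the names `KEEP` and `strip_to_punctuation` are hypothetical, and the character set is simply the list given above (the curly quotes and the long dash are written as Unicode escapes).

```python
# Hypothetical sketch: keep only the punctuation marks listed in the post.
# \u2018 and \u201c are the opening curly quotes, \u2014 is the long dash.
KEEP = set(".,\u2018\u201c:;!?\u2014_()*&[]")


def strip_to_punctuation(text: str) -> str:
    """Return text with every character outside KEEP removed."""
    return "".join(ch for ch in text if ch in KEEP)


if __name__ == "__main__":
    sample = "Hello, world! (Really?)"
    print(strip_to_punctuation(sample))
```

Plain-text files often use straight quotes (`'` and `"`) instead of curly ones, so the set may need extending depending on the edition of each text.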
I noticed someone in the responses was curious about comparing novels across authors, so here goes Mark Twain:
I don’t see much similarity at first glance, other than that shared red line in Huck Finn and CT Yankee.
Here’s Charles Dickens:
Don’t know that I’d say there’s a lot of similarity. Nicholas Nickleby looks like it has a bunch of really long sentences though!
Some more analysis is called for, obvs. Proportion of colons/semicolons in all punctuation for each author? Average sentence length? Definitely there are more dimensions to examine.
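As a rough idea of what those two measures might look like, here is a hedged Python sketch. The function names are made up for illustration, the punctuation set is the one listed earlier, and "sentence" is crudely assumed to mean any run of text ending in a period, exclamation mark, or question mark, which will miscount abbreviations like "Mr." and dialogue.

```python
import re

# Same punctuation set as listed earlier in the post (assumed here).
PUNCT = set(".,\u2018\u201c:;!?\u2014_()*&[]")


def colon_semicolon_share(text: str) -> float:
    """Fraction of all punctuation marks that are colons or semicolons."""
    marks = [ch for ch in text if ch in PUNCT]
    if not marks:
        return 0.0
    return sum(ch in ":;" for ch in marks) / len(marks)


def average_sentence_length(text: str) -> float:
    """Mean words per sentence, splitting naively on . ! and ?"""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    if not sentences:
        return 0.0
    return sum(len(s.split()) for s in sentences) / len(sentences)


if __name__ == "__main__":
    sample = "Wait; stop: go. Now!"
    print(colon_semicolon_share(sample))
    print(average_sentence_length(sample))
```

Running both functions over every novel by an author and averaging the results would give one pair of numbers per author to compare.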
A brief look at the frequency charts for Mark Twain:
Almost definitely needs to be compared to other authors.
My code is available on GitHub.