A Wordsmith of Ice and Fire
There can be only one true Story King who sits on his Iron Throne: George R.R. Martin.
For this article, I bent the knee and pledged my allegiance to King George of Wordstar. I took the five books of his lore, threw them in the software fires of Wordsmith* and stared into the flames. Here is some of the wisdom I gained, which I now pass on to you, dear reader.
Although… Wisdom? Perhaps it’s more a game of thoughts, and I’m not yet sure what conclusions to bring to the map table.
The five books were cast into the flames not as one, but separately, so we can see how the statistics differ from book to book.
Now let’s travel to Westeros.
* With software like Wordsmith Tools, one can distill all sorts of nice statistics from (large) text files, such as mean word length and mean sentence length. It's a kind of linguistic supercollider.
1. Average water dance
First, I knew I needed some training before I delved into all this fickle matter.
I asked Syrio Forel about martial arts. He taught me a trick or two in water dancing, and I gained some insight into averages.
Here’s the mean sentence length per book:
- A Game of Thrones: 19.87
- A Clash of Kings: 18.85
- A Storm of Swords: 21.83
- A Feast for Crows: 18.61
- A Dance with Dragons: 21.99
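For readers without Wordsmith at hand, here is a minimal sketch of how a mean sentence length could be computed. It is not how Wordsmith does it: the function name `mean_sentence_length`, the naive rule that a sentence ends at a full stop, exclamation mark or question mark, and the definition of a word as a run of letters are all my own assumptions, so expect values that differ somewhat from the ones above.

```python
import re

def mean_sentence_length(text: str) -> float:
    """Average number of words per sentence, using a naive splitter."""
    # Assumption: sentences end at '.', '!' or '?'. Wordsmith's rules may differ.
    sentences = re.split(r"[.!?]+", text)
    # Assumption: a word is a run of letters, optionally joined by apostrophes.
    counts = [len(re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", s)) for s in sentences]
    counts = [n for n in counts if n > 0]  # drop empty fragments
    return sum(counts) / len(counts) if counts else 0.0

sample = "Winter is coming. The night is dark and full of terrors."
print(mean_sentence_length(sample))
```

The sample has one sentence of 3 words and one of 8, so the mean is 5.5. Abbreviations and dialogue punctuation will trip up a splitter this simple.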
One would expect a certain consistency in such matters, and indeed these values don't differ that much. Still, might we be inclined to deduce that Martin's moods swing bit by bit throughout the writing process?
Or perhaps in books 3 and 5 (the longest mean sentence lengths) he had more to tell between punctuation marks?
Or do books with longer sentences sell better?
Or was he in a rush to finish them?
In the next episode, I shall compare these values to those of similar bestseller series, like Dan Brown's Langdon series, 50 Shades, and Harry Potter. And the complete Sherlock Holmes, just to put things in perspective.
We might find some hard, data-backed evidence that explains the commercial success of these types of book series. Or not, I don't know.
Syrio then knew I was ready for the next lesson: mean word length.
Also sorted per book:
(Full stats below)
These are even closer together, which suggests that Martin's word lengths stay remarkably consistent from book to book.
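Mean word length can be sketched in the same spirit. Again, this is only an illustration under my own assumptions (the name `mean_word_length`, and counting letters only, so that an apostrophe does not add to a word's length); Wordsmith's own tokenisation settings may give slightly different numbers.

```python
import re

def mean_word_length(text: str) -> float:
    """Average number of letters per word."""
    # Assumption: a word is a run of letters, optionally joined by apostrophes.
    words = re.findall(r"[A-Za-z]+(?:'[A-Za-z]+)*", text)
    if not words:
        return 0.0
    # Count letters only, so "don't" has length 4.
    letters = sum(sum(ch.isalpha() for ch in w) for w in words)
    return letters / len(words)

print(mean_word_length("Winter is coming"))
```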
Ah well. That’s enough. Not a lot of adventure left here.
On to the next.
2. Single in the Citadel
I went to the Citadel and consulted with the Maesters. They taught me about the single-occurrence words. Well, these are magical and bear testament to GRRM's great prowess in language.
The ancient scrolls taught me that there are 6556 words that occur only once, in a combined corpus of 1,765,463 words total (or 19 MB, a lot of papyrus).
Divide the total by the single-occurrence count (1,765,463 / 6556) and you get a ratio of about 269. Let's call it the Bronts Ratio from now on.
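The Bronts Ratio as used here is simply the total number of words divided by the number of words that occur exactly once. A minimal sketch, assuming a crude lowercase tokeniser of my own (the function name `bronts_ratio` is also just for illustration, and Wordsmith will count words a little differently):

```python
import re
from collections import Counter

def bronts_ratio(text: str):
    """Return (total tokens, single-occurrence words, tokens per such word)."""
    # Assumption: lowercase runs of letters, optionally joined by apostrophes.
    tokens = re.findall(r"[a-z]+(?:'[a-z]+)*", text.lower())
    counts = Counter(tokens)
    singles = sum(1 for n in counts.values() if n == 1)
    ratio = len(tokens) / singles if singles else None
    return len(tokens), singles, ratio

print(bronts_ratio("the wolf and the lion and the stag"))

# The same division applied to the figures quoted in this article:
print(round(1_765_463 / 6556, 2))  # A Song of Ice and Fire, five books
print(round(45_000 / 3058, 2))     # Haroun and the Sea of Stories
print(round(4768 / 1233, 2))       # The Treatment of Bibi Haldar
```

In the toy sentence, 8 tokens contain 3 words that occur once (wolf, lion, stag), so the ratio is 8/3.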
Though I'm not yet sure how, this might tell us something about the vocabulary of the author. What I do know is that this number depends on corpus size: I did the same calculation for two other Wordsmith results, and the values differ greatly (3.87 for Lahiri and 14.72 for Rushdie), though both authors share an equal amount of virtuosity.**
(notes, stats and backgrounds soon here on Medium)
The bright insights of a Statistics Maester would be of great assistance. Send me a raven below.
Together we might shed some more light on the relationship between the Bronts Ratio and the STTR (Standardised Type-Token Ratio), and on how to interpret the latter. The STTR hovers around 43 to 44 throughout the Ice and Fire books. Rushdie's STTR in Haroun is 46, and Lahiri charts an STTR of 52 in Bibi.**
This might tell us something about the vocabulary used. This comparison, however, is not entirely fair, because Haroun and Bibi are short texts (about 45,000 and 4768 words) and the Ice and Fire books stretch half a bookshelf.
(Tokens are all the running words; types are the distinct words. Full stats below.)
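As far as I understand it, a standardised type-token ratio is computed by cutting the text into consecutive chunks of equal size, taking the percentage of distinct words in each chunk, and averaging those percentages. The sketch below follows that idea. The chunk size of 1000 tokens and the dropping of the incomplete final chunk are my assumptions here, and the function name `sttr` is only illustrative.

```python
def sttr(tokens, chunk_size=1000):
    """Mean type-token ratio (as a percentage) over full chunks of tokens."""
    ratios = []
    # Walk over full chunks only; a shorter trailing chunk is ignored.
    for start in range(0, len(tokens) - chunk_size + 1, chunk_size):
        chunk = tokens[start:start + chunk_size]
        ratios.append(len(set(chunk)) / chunk_size * 100)
    # None means the text is shorter than a single chunk.
    return sum(ratios) / len(ratios) if ratios else None

# Toy example with chunks of 4 tokens: the first chunk has 4 types (100%),
# the second has 1 type (25%), and the lone trailing token is ignored.
print(sttr(list("abcdeeeef"), chunk_size=4))
```

Because every chunk has the same length, texts of very different sizes become easier to compare than with a plain type-token ratio, although a text as short as a few thousand words only contributes a handful of chunks.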
3. Melisandre’s Concord
Back North, Melisandre kissed me and I felt both joy and darkness deep within. Although the hot season may draw to a close, we still had some work to do on putting one phrase in order. Before the night falls, which is dark and full of terrors.
A concord of Winter and Words is coming:
Here we can see that this powerful phrase occurs only 22 times throughout the 5 books. One would have expected it to appear more often (or at least I did).
Some occurrences are quotes from a character; at other times the narration takes us icewards. Most come at the close of a sentence, or stand as a sentence of their own.
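A concordance of this kind lists every occurrence of a phrase with a little context on either side. Here is a minimal sketch of that idea, not Wordsmith's Concord tool itself: the function name `concordance`, the case-insensitive matching and the context width of 30 characters are assumptions of mine.

```python
import re

def concordance(text: str, phrase: str, width: int = 30):
    """Return one keyword-in-context line per occurrence of phrase."""
    flat = " ".join(text.split())  # collapse line breaks and repeated spaces
    lines = []
    for m in re.finditer(re.escape(phrase), flat, flags=re.IGNORECASE):
        left = flat[max(0, m.start() - width):m.start()]
        right = flat[m.end():m.end() + width]
        # Right-align the left context so the phrase lines up in a column.
        lines.append(f"{left:>{width}}[{m.group(0)}]{right}")
    return lines

sample = "Winter is coming. He looked north and said winter is coming, again."
for line in concordance(sample, "winter is coming"):
    print(line)
```

Counting the returned lines gives the number of occurrences, and reading the right-hand context shows at a glance whether the phrase closes a sentence.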
At the crack of dawn, just before I left, Melisandre told me that I know nothing.
What other phrases would you like me to put in concord and shine a light on here on Medium? Send me a raven below in the comments.
4. Longest words Warlocks
It was high time I drank some Essence of Nightshade, so I sat down with the Warlocks in Qarth. Though the city had lost much of its splendour since Dany sacked and burnt it, the House of the Undying was unscathed and still standing tall.
After a few beakers of their thick intoxicating liquid, there appeared before me some visions of the longest words that flow from the quills of the One True King of Wordsmith. Here’s a list of 15- to 18-letter words:
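For those who would rather skip the Essence of Nightshade, a list like this can be approximated with a few lines of code. This is a sketch under my own assumptions: the function name `long_words` is illustrative, words are taken to be unbroken runs of letters (so hyphenated compounds are split), and the result will not match Wordsmith's word list exactly.

```python
import re

def long_words(text: str, min_len: int = 15, max_len: int = 18):
    """Distinct words of min_len..max_len letters, longest first."""
    # Assumption: a word is an unbroken run of letters, case ignored.
    words = set(re.findall(r"[a-z]+", text.lower()))
    hits = [w for w in words if min_len <= len(w) <= max_len]
    # Sort by length (descending), then alphabetically.
    return sorted(hits, key=lambda w: (-len(w), w))

sample = "The unceremoniously dismissed maester spoke of incomprehensible things."
print(long_words(sample))
```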
Guess I best stay in Qarth a little longer, just until my lips turn purple, and then we'll start our next adventure.
Send me a raven — below in the comments. Or through the fibers by mail at firstname.lastname@example.org
(Please note that I'm not a Wordsmith expert yet; I'd like to consider these explorations as the chronicles of my journey deeper into Wordsmith. Also, I'd love to team up with a mathematician/statistics expert. Let me know if you're interested. Send me a raven below in the comments.)
** In Haroun and the Sea of Stories (Rushdie), there are 3058 words that occur only once in a text of 45,000 words; that's a Bronts Ratio of 14.72.
In Lahiri's The Treatment of Bibi Haldar, there are 1233 words that occur only once in a text of 4768 words; that's a Bronts Ratio of 3.87.
More on Rushdie and Lahiri in my next episodes — here on Medium.