Deconstructing the Hemingway App

Sam Williams
We’ve moved to freeCodeCamp.org/news
4 min read · Dec 5, 2017

This is the first of two articles on how I deconstructed and re-implemented the logic from the Hemingway app. Read Part 2 here.

I’ve been using the Hemingway App to try to improve my posts. At the same time I’ve been trying to find ideas for small projects. I came up with the idea of integrating a Hemingway style editor into a markdown editor. So I needed to find out how Hemingway worked!

Getting the Logic

I had no idea how the app worked when I first started. It could have sent the text to a server to calculate the complexity of the writing, but I expected it to be calculated client side.

Opening developer tools in Chrome (Control + Shift + I or F12 on Windows/Linux, Command + Option + I on Mac) and navigating to Sources provided the answers. There, I found the file I was looking for: hemingway3-web.js.

Minified file on the top, Formatted file on the bottom. What a difference it makes!

This code is minified, which makes it a pain to read and understand. To solve this, I copied the file into VS Code and formatted the document (Control + Shift + I in VS Code on Linux, Shift + Alt + F on Windows). This changes a 3-line file into a 4859-line file with everything formatted nicely.

Exploring the Code

I started to look through the file for anything that I could make sense of. The start of the file contained immediately invoked function expressions. I had little idea of what was happening.

!function(e) {
    function t(r) {
        if (n[r])
            return n[r].exports;
        var o = n[r] = {
            exports: {},
            id: r,
            loaded: !1
        };
        ...

This continued for about 200 lines before I decided that I was probably reading the code that makes the page run (React?). I started skimming through the rest of the code until I found something I could understand. (I missed quite a lot that I would later find by following function calls back to their definitions.)

The first bit of code I understood was all the way at line 3496!

getTokens: function(e) {
    var t = this.getAdverbs(e),
        n = this.getQualifiers(e),
        r = this.getPassiveVoices(e),
        o = this.getComplexWords(e);
    return [].concat(t, n, r, o).sort(function(e, t) {
        return e.startIndex - t.startIndex
    })
}

And amazingly, all these functions were defined right below. Now I knew how the app defined adverbs, qualifiers, passive voice, and complex words. Some of them are very simple. The app checks each word against lists of qualifiers, complex words, and passive voice phrases. this.getAdverbs keeps the words that end in ‘ly’ and then drops any that appear in a list of non-adverb words ending in ‘ly’.
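To make that concrete, here is a rough sketch of the adverb and qualifier checks, merged and sorted by position the way getTokens does it. The word lists, the `getWords` helper and the `type` field are my own placeholders. The real app ships its own, much longer lists, and I have left passive voice and complex words out.

```javascript
// Hypothetical word lists, for illustration only.
const NOT_ADVERBS = new Set(["family", "only", "reply", "supply"]);
const QUALIFIERS = new Set(["maybe", "perhaps", "just"]);

// Split text into words, remembering where each one starts.
function getWords(text) {
  const words = [];
  const pattern = /[A-Za-z']+/g;
  let match;
  while ((match = pattern.exec(text)) !== null) {
    words.push({ word: match[0], startIndex: match.index });
  }
  return words;
}

// Words ending in "ly" that are not on the exception list.
function getAdverbs(text) {
  return getWords(text)
    .filter(({ word }) => word.toLowerCase().endsWith("ly"))
    .filter(({ word }) => !NOT_ADVERBS.has(word.toLowerCase()))
    .map((w) => ({ ...w, type: "adverb" }));
}

// Words that appear on the qualifier list.
function getQualifiers(text) {
  return getWords(text)
    .filter(({ word }) => QUALIFIERS.has(word.toLowerCase()))
    .map((w) => ({ ...w, type: "qualifier" }));
}

// Merge every kind of token and order them by position in the text.
function getTokens(text) {
  return []
    .concat(getAdverbs(text), getQualifiers(text))
    .sort((a, b) => a.startIndex - b.startIndex);
}

console.log(getTokens("Maybe my family quickly left."));
```

With these lists, “family” is skipped even though it ends in ‘ly’, while “Maybe” and “quickly” come back as tokens in the order they appear.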

The next bit of useful code was the implementation of highlighting words or sentences. In this code there is a line:

e.highlight.hardSentences += h

‘hardSentences’ was something I could understand, something with meaning. I then searched the file for hardSentences and got 13 matches. This led to a line that calculated the readability stats:

n.stats.readability === i.default.readability.hard && (e.hardSentences += 1),
n.stats.readability === i.default.readability.veryHard && (e.veryHardSentences += 1)
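Un-minified, those two lines look like the body of a loop that tallies sentences by difficulty. Here is a sketch of how that tally might be written out, assuming each sentence carries a `stats.readability` value. The `READABILITY` object stands in for `i.default.readability`, and its string values are made up.

```javascript
// Hypothetical stand-in for i.default.readability in the minified source.
const READABILITY = { normal: "normal", hard: "hard", veryHard: "veryHard" };

// Count how many sentences fall into each highlighted category.
function countHardSentences(sentences) {
  return sentences.reduce(
    (totals, sentence) => {
      if (sentence.stats.readability === READABILITY.hard) {
        totals.hardSentences += 1;
      }
      if (sentence.stats.readability === READABILITY.veryHard) {
        totals.veryHardSentences += 1;
      }
      return totals;
    },
    { hardSentences: 0, veryHardSentences: 0 }
  );
}

const totals = countHardSentences([
  { stats: { readability: READABILITY.normal } },
  { stats: { readability: READABILITY.hard } },
  { stats: { readability: READABILITY.veryHard } },
  { stats: { readability: READABILITY.hard } },
]);
console.log(totals);
```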

Now I knew that there was a readability parameter in both stats and i.default. Searching the file for readability, I got 40 matches. One of those matches was a getReadabilityStyle function, where they grade your writing.

There are three levels: normal, hard and very hard.

t = e.words;
n = e.readingLevel;
return t < 14
    ? i.default.readability.normal
    : n >= 10 && n < 14
        ? i.default.readability.hard
        : n >= 14
            ? i.default.readability.veryHard
            : i.default.readability.normal;

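Anything with fewer than 14 words is “normal”, whatever its reading level. With 14 words or more, it is “hard” if the reading level is 10 to 13, “very hard” if the reading level is 14 or above, and “normal” otherwise.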
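Read literally, the nested ternary unrolls into a few plain if statements, where `t` is the word count and `n` is the reading level. This is my own rewrite for readability, not code from the app, and `READABILITY` is a made-up stand-in for `i.default.readability`.

```javascript
// Hypothetical stand-in for i.default.readability in the minified source.
const READABILITY = { normal: "normal", hard: "hard", veryHard: "veryHard" };

// Same decision order as the nested ternary: word count first, then level.
function getReadabilityStyle({ words, readingLevel }) {
  if (words < 14) return READABILITY.normal;
  if (readingLevel >= 10 && readingLevel < 14) return READABILITY.hard;
  if (readingLevel >= 14) return READABILITY.veryHard;
  return READABILITY.normal;
}

console.log(getReadabilityStyle({ words: 10, readingLevel: 20 })); // normal
console.log(getReadabilityStyle({ words: 20, readingLevel: 12 })); // hard
console.log(getReadabilityStyle({ words: 20, readingLevel: 14 })); // veryHard
console.log(getReadabilityStyle({ words: 20, readingLevel: 5 })); // normal
```

The first example is the interesting one: a short piece of text stays “normal” even with a high reading level, because the word count check comes first.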

Now to find how to calculate the reading level.

I spent a while here trying to find any notion of how to calculate the reading level. I found it 4 lines above the getReadabilityStyle function.

// e = letters in paragraph
// t = words in paragraph
// n = sentences in paragraph
getReadingLevel: function(e, t, n) {
    if (0 === t || 0 === n) return 0;
    var r = Math.round(4.71 * (e / t) + 0.5 * (t / n) - 21.43);
    return r <= 0 ? 0 : r;
}

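That means your score is 4.71 * average word length + 0.5 * average sentence length - 21.43, rounded to the nearest whole number and never less than 0. These are the same coefficients as the Automated Readability Index. That’s it. That is how Hemingway grades each of your sentences.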
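Here is the same formula with readable parameter names, plus a quick worked example. The `scoreText` helper is my own assumption about how you might count letters, words and sentences. I did not find out exactly how the app does its counting.

```javascript
// The formula from the minified source, with descriptive names.
function getReadingLevel(letters, words, sentences) {
  if (words === 0 || sentences === 0) return 0;
  const level = Math.round(
    4.71 * (letters / words) + 0.5 * (words / sentences) - 21.43
  );
  return level <= 0 ? 0 : level;
}

// Hypothetical counting helper: the real app's counting rules may differ.
function scoreText(text) {
  const sentences = text
    .split(/[.!?]+/)
    .filter((s) => s.trim().length > 0).length;
  const words = text
    .split(/\s+/)
    .filter((w) => /[A-Za-z0-9]/.test(w)).length;
  const letters = (text.match(/[A-Za-z0-9]/g) || []).length;
  return getReadingLevel(letters, words, sentences);
}

// 100 letters, 20 words, 1 sentence:
// 4.71 * 5 + 0.5 * 20 - 21.43 = 12.12, which rounds to 12.
console.log(getReadingLevel(100, 20, 1)); // 12

// Short words in a short sentence give a negative score, clamped to 0.
console.log(scoreText("The cat sat.")); // 0
```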

Other Interesting Things I Found

  • The highlight commentary (information about your writing on the right-hand side) is a big switch statement. Ternary statements are used to change the response based on how well you’ve written.
  • The grading goes up to 16 before it’s classed as “Post-Graduate” level.
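The switch-plus-ternary pattern from the first bullet might look something like the sketch below. The case names, the `target` parameter and all of the messages are invented for illustration. They are not the app's actual wording or thresholds.

```javascript
// Hypothetical commentary builder: a switch on the highlight type,
// with a ternary in each case to vary the message by how well you did.
function getCommentary(type, count, target) {
  switch (type) {
    case "adverbs":
      return count <= target
        ? `${count} adverbs, meeting the goal of ${target} or fewer.`
        : `${count} adverbs. Aim for ${target} or fewer.`;
    case "hardSentences":
      return count === 0
        ? "No sentences are hard to read."
        : `${count} sentences are hard to read.`;
    default:
      return "";
  }
}

console.log(getCommentary("adverbs", 2, 3));
console.log(getCommentary("adverbs", 5, 3));
console.log(getCommentary("hardSentences", 0, 0));
```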

What I’m going to do with this

I am planning to make a basic website and apply what I’ve learned from deconstructing the Hemingway app. Nothing fancy, more as an exercise for implementing some logic. I’ve built a Markdown previewer before, so I might also try to create a writing application with the highlighting and scoring system.

Click here to read what I did next and how I implemented the logic I learnt from the Hemingway app.

What have you learnt from reverse engineering a website?

If you’ve done something similar, let me know in the comments. It’s great hearing about cool things that other developers have found.

Sam Williams

I'm a software developer currently building Chat Bots for E-Commerce companies. Outside of coding I love to go Rock Climbing and Traveling.