[Image: a blue theater curtain, partially open]

Behind the Scenes of Engramo, Pt. 2

Vojtech Janda
Engramo English Blog
3 min read · May 24, 2021

Engramo English is a mobile learning app. It’s the smart solution for the English learner of the 21st century. You download it, open it, and it does its job — it teaches you English. But how does it actually work? What is hidden behind that nice & clean Dashboard?

Corpus Material

We like to point out that the exercises presented to students in Engramo English consist of real sentences. While textbook sentences were used early in Engramo’s development, we later decided to commit to English as it is really used. The easiest way to do that, instead of, say, scouring newspaper articles manually, is to get access to a corpus of such texts, either by constructing a new one with specialised software tools or by licensing an existing one for commercial use. In Engramo English’s case, several corpora have been licensed (if you are really interested, you can find the full list below the article). Then it’s just a matter of searching through the selected corpora for the desired words, word forms, phrases and structures, which is fairly easy once you learn how to do it.

Of course, even in this step, we had to filter things out. Firstly, there was the matter of quality: we wanted sentences from native speakers, and unless they were meant for special Bits on important informal or “slang” phrases, they also had to be in standard English. Secondly, there was the matter of content: as we want Engramo to be suitable for learners of all ages and persuasions, we had to drop any sentences referring to sexual acts or to controversial topics such as current or recent political affairs and excessive violence. Sometimes, in order to avoid reference to particular brands, people, etc. (and thus unauthorised promotion or even slander), we automatically or manually substitute the name(s) with generic counterparts.

And then there is the question of context. Quite often we take several sentences as one to form an exercise. But in some cases, having enough context for a sentence would mean taking the whole paragraph from the corpus, which just wouldn’t be feasible. In that situation, we either expand or slightly modify the sentence so that the context is still there, only in a more compact form.

In a way, this approach resembles the work linguists do when writing grammar books for academic purposes: they start from the language material to formulate the rules, and while we have taken some shortcuts there, we still look to corpora to validate the grammar we teach or modify it if we find that it has been changing significantly.
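For readers who like to see things in code, here is a minimal sketch of the search, filter and substitute steps described above. It is only an illustration in plain Python over a list of sentences, not Engramo’s actual pipeline and not the query interface of any corpus tool. The blocklist, the name mapping and all function names are hypothetical stand-ins made up for this example.

```python
import re

# Hypothetical stand-ins: neither list comes from Engramo itself.
BLOCKED_TOPICS = {"election", "murder"}
GENERIC_NAMES = {"Coca-Cola": "soft drink"}


def find_candidates(sentences, word_forms):
    """Keep sentences containing any of the word forms as a whole word."""
    alternatives = "|".join(re.escape(form) for form in word_forms)
    pattern = re.compile(r"\b(?:" + alternatives + r")\b", re.IGNORECASE)
    return [s for s in sentences if pattern.search(s)]


def is_suitable(sentence):
    """Reject sentences that mention any blocked topic word."""
    words = set(re.findall(r"[a-z']+", sentence.lower()))
    return not (words & BLOCKED_TOPICS)


def genericise(sentence):
    """Swap specific names for generic counterparts."""
    for name, generic in GENERIC_NAMES.items():
        sentence = sentence.replace(name, generic)
    return sentence


def build_exercises(sentences, word_forms):
    """Search, filter, then substitute names, in that order."""
    candidates = find_candidates(sentences, word_forms)
    return [genericise(s) for s in candidates if is_suitable(s)]


sample = [
    "She has gone to the shop to buy a Coca-Cola.",
    "The election has gone badly for everyone.",
    "They walk to school every day.",
    "He went home early.",
]
exercises = build_exercises(sample, ["go", "goes", "went", "gone"])
```

With the sample above, only the first and last sentences survive: the second is dropped by the content filter and the third contains none of the requested word forms. A simple word blocklist like this would miss a lot in practice, which is one reason the manual checks mentioned above still matter.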

List of Corpora used for exercise creation in Engramo English

Araneum Anglicum Maius [2015]

English Broadsheet Newspapers 1993–2013 (SiBol with trends)

English Corpus for SkELL 3.8

English Web 2013 (enTenTen13)

English Wikipedia

EUROPARL7, English [at a smaller scale than the rest of the corpora]

Project Gutenberg English

TED Talks transcripts [although the description of the individual corpus has since been removed, the corpus should still be available according to Sketch Engine’s List of corpora]

Timestamped JSI web corpus 2014–2016 English

Vojtech Janda

Linguist specializing in usage-based, corpus linguistics & sociolinguistics, English-Czech translator, hobby programmer