DuoLingo: a (better than!) textbook approach to practical spaced repetition

Published in Ben and Dion, Feb 1, 2016

I am a big believer in spaced repetition as a scientifically proven method to help you learn both effectively and efficiently.

There are several implementations, and one of the most basic and famous is the Leitner system, which is a great way to explain the idea. Rather than going through a stack of flash cards serially, repeating each of them again and again, why not take into account whether you got the question right? When you answer correctly, the card moves to the next box, which is reviewed less frequently. Knowledge that you grok naturally moves further back in the line, so you don’t have to be quizzed on it as often. The minute you get it wrong, however, it comes back to the front.
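To make the box idea concrete, here is a minimal sketch of a Leitner-style scheduler. The number of boxes and the rule that box k comes up every 2**k sessions are my own assumptions for illustration; real decks use whatever schedule suits them.

```python
from dataclasses import dataclass

# Assumption: three boxes, and box k is reviewed every 2**k sessions.
NUM_BOXES = 3


@dataclass
class Card:
    prompt: str
    answer: str
    box: int = 0  # every card starts in the most frequently reviewed box


def review(card: Card, correct: bool) -> None:
    """Promote a card on a correct answer; send it back to box 0 on a miss."""
    if correct:
        card.box = min(card.box + 1, NUM_BOXES - 1)
    else:
        card.box = 0


def due_cards(cards: list[Card], session: int) -> list[Card]:
    """Return the cards whose box is scheduled for this session number."""
    return [c for c in cards if session % (2 ** c.box) == 0]


cards = [Card("hola", "hello"), Card("gato", "cat")]
review(cards[0], correct=True)   # "hola" moves up to box 1
review(cards[1], correct=False)  # "gato" stays in box 0
```

With this schedule "gato" is due in every session, while "hola" only comes up in every second one until it is missed again.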

The most efficient system will time each question so well that you struggle to answer but eventually come up with the right answer. This may not be exactly what you want, however, as you may want some knowledge at your fingertips.

There has been a lot of work on various algorithms, and computer systems have been developed to further the cause. The prototypical example of this is the SuperMemo software. Piotr Woźniak not only built and iterated on this software over decades, he also published details of each algorithm and showed what seemed to work well and what didn’t.

Software such as the original SuperMemo, and the currently popular open source Anki, rely on you to rate how well you knew the answer, to feed the right information back to the system.

Was the answer immediately available in your memory? Did it take a sec? Did you have to really think about it? Once you saw the answer did you think “oh crap, right!” or was it a “huh, really? I had no idea!”. These could have different effects on when to next ask you the question.
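As a rough illustration of how a self-rating can drive the schedule, here is a simplified sketch in the style of the published SM-2 algorithm from the SuperMemo family. The class and function names are my own, and current SuperMemo and Anki versions use different, more elaborate variants, so treat this as a sketch rather than what either program actually runs.

```python
from dataclasses import dataclass


@dataclass
class CardState:
    repetitions: int = 0    # consecutive successful recalls
    interval_days: int = 0  # days until the next review
    ease: float = 2.5       # SM-2's starting "easiness factor"


def sm2_update(state: CardState, quality: int) -> CardState:
    """Apply one SM-2 style review; quality is a self-rating from 0 to 5."""
    if not 0 <= quality <= 5:
        raise ValueError("quality must be between 0 and 5")
    if quality < 3:
        # Failed recall: restart the sequence, leave the ease unchanged.
        return CardState(0, 1, state.ease)
    if state.repetitions == 0:
        interval = 1
    elif state.repetitions == 1:
        interval = 6
    else:
        interval = round(state.interval_days * state.ease)
    # Easy answers (5) raise the ease, hesitant ones (3) lower it.
    miss = 5 - quality
    ease = state.ease + (0.1 - miss * (0.08 + miss * 0.02))
    return CardState(state.repetitions + 1, interval, max(1.3, ease))
```

An instant answer (5) stretches future intervals, a laboured one (3) shrinks them, and a miss (below 3) sends the card back to a one-day interval.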

A common misunderstanding about learning is that it is ideal to know the answer immediately. Instead, it has been shown that for long-term learning you want to have to think and generate the answer. This is known as the generation effect. Fantastic studies have shown that subtle changes to how you quiz have large effects. For example, simply asking a subject to fill in a word’s missing letters resulted in better memory of the word (crosswords anyone?).

There has been a backlash against standardized testing, but the big mistake is that we focus our quizzing on assessment rather than using quizzes as a fantastic learning tool.

If you are interested in lifelong learning, or how to help your kids not get stuck in the trap where we think we are learning but in fact we are just binging information that we will forget a week later, check out Make It Stick.

Where DuoLingo Comes In

With all of that context we come back to the product at hand. Once you learn about spaced repetition you quickly see how DuoLingo offers a user experience that applies the research without the user having to say how well they knew something (a self-rating that can be flawed in and of itself!).

Let’s look at the different types of questions and answers and how the system gathers information.

Choosing Pairs

This type of question offers words in two languages that you need to pair up. If you get one wrong you aren’t finished, but the system can:

Remember which words you may have trouble with (I say “may” because I have fat-fingered answers in the past)
Make sure to quiz you on some of these items again sooner than if you had nailed them
Take the time to answer into account. If you are matching quickly, that tells the system the words are on the tip of your tongue, whereas a more labored approach may mean that you are doing more generation of answers (or you may have been distracted or had to go to the bathroom…)

Translation: Typing or Picking

When asked to translate something into your language of choice there are a few pieces of information available:

Some words have dotted underlines that you can tap on for a hint. The act of doing this shows the system that the word wasn’t top of mind.
If you are struggling with typing the answer, the system can switch to multiple choice to give you a last chance to generate the answer. At this point the system also knows that even if you get it “right” you don’t know it as well.

Listening

When listening to the language and then typing in what you hear there is the option to repeat the phrase, and to repeat it slower. If you need to repeat it, the system learns a little about how well you caught it, but if you repeat it slowly it really knows that you didn’t quite get it.
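Pulling together the signals from the three question types above, here is a hypothetical sketch of how behaviour alone could be turned into the kind of 0 to 5 rating that other tools ask you to supply yourself. I don’t know how DuoLingo actually weighs these signals; every field name, threshold, and penalty below is an assumption made up for illustration.

```python
from dataclasses import dataclass


@dataclass
class Attempt:
    correct: bool
    seconds: float = 0.0                # time taken to answer
    used_hint: bool = False             # tapped a dotted-underline word
    fell_back_to_choices: bool = False  # typing switched to multiple choice
    replays: int = 0                    # audio replays at normal speed
    slow_replays: int = 0               # audio replays at slow speed


def implied_quality(a: Attempt, slow_threshold: float = 10.0) -> int:
    """Map behaviour to a 0-5 score; all weights here are invented."""
    if not a.correct:
        return 1
    score = 5
    if a.seconds > slow_threshold:
        score -= 1  # labored answer (or a distraction, we can't tell)
    if a.used_hint:
        score -= 1
    if a.fell_back_to_choices:
        score -= 1
    if a.replays > 0:
        score -= 1
    if a.slow_replays > 0:
        score -= 2  # the slow replay is the stronger signal
    return max(score, 3)  # a correct answer never scores below a bare pass
```

A score like this could then be fed into whatever scheduler sits behind the quiz, without the learner ever being asked how well they knew the answer.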

Grouping of questions

The grouping of questions matters. If you go into a block of questions on the “past tense” then you have context that makes it a lot easier to get the right answer. You get used to the patterns again as you go through. This isn’t all bad, because it allows you to tie together this knowledge and its patterns. However, we always need to remember why we are trying to learn something and get as close to that environment as possible. We probably won’t be stuck in conversations that are limited to these groupings.

“We practice as we want to play, so we can play like we do in practice.”

The Chicago Blackhawks famously made changes to how they practiced ice hockey. They got away from their regular blocks “on Thursdays at 2pm we do passing like this…” and randomized a lot more. Just when they were getting into the flow they would switch. At first the players hated it and thought they weren’t progressing, but this shows you one of the frustrating tricks… we aren’t good judges of how we are progressing. When we cram we feel confident that we are learning, but this is because it is shallow learning. We are familiar with the content, but this doesn’t mean that we know it.

DuoLingo has another option to allow for this, the “Practice Weak Skills” button that will feed you questions across the board rather than just in one of the groups. It won’t be totally random, and you will still get to fire neurons close to each other.
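One way to picture a session like this is to draw from the weakest items across every group, with a little randomness so that you can’t predict what comes next. This is only a sketch under my own assumptions: the strength scores, the pool size, and the function name are hypothetical, not how DuoLingo implements the button.

```python
import random


def weak_skill_session(strength, size=5, pool_factor=2, seed=None):
    """Pick `size` items at random from the weakest `size * pool_factor`.

    `strength` maps (skill, word) -> estimated recall strength in [0, 1];
    lower means weaker. All of these names are hypothetical.
    """
    rng = random.Random(seed)
    weakest = sorted(strength, key=strength.get)[: size * pool_factor]
    return rng.sample(weakest, min(size, len(weakest)))


strength = {
    ("past tense", "comí"): 0.2,
    ("past tense", "fui"): 0.9,
    ("food", "manzana"): 0.3,
    ("food", "pan"): 0.8,
    ("animals", "gato"): 0.1,
    ("animals", "perro"): 0.4,
}
session = weak_skill_session(strength, size=2, seed=1)
```

Because the pool is larger than the session, the questions cut across groups and vary from run to run, but they stay concentrated on the material you are shakiest on.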

There is a slew of research on how we learn, and it is criminal how little of it we put into practice. It has been a real pleasure to watch DuoLingo iterate over the years that I have been using it, and to see how it applies this research in clear, usable ways.

With systems like DuoLingo we can measure how large populations learn, in this case languages. We will be able to tweak our learning and test hypotheses. We will be able to better personalize these learnings to you, the learner. I can’t wait to see how we progress.

Have you seen other examples of spaced repetition or other practical implementations of learning research?