More about Duolingo, its founder and the story of CAPTCHA

Shengyu Chen
5 min read · Jun 27, 2020

Last week, I wrote about how impressed I was with Duolingo’s new feature and how they were able to convince me to pay for a premium subscription. You can read a little more about it here. This week, as I was drawing and listening to podcasts, I stumbled across this gem (you can listen to it here):

I love serendipitous discoveries like this. So I gave it a listen, and I think it is definitely worth your time if you want to learn a little more about the app and its founder. In this post, I am going to give a quick summary of what impressed me the most about the founder:

reCAPTCHA story

So it turns out the founder of Duolingo, Luis von Ahn, also birthed the idea of CAPTCHA. As much as I love the Duolingo app and its ability to bring free education to everyone, there’s something special about the creation story of CAPTCHA and its successor reCAPTCHA. Let me quickly summarize what impressed me the most about this story. (You should definitely listen to it yourself if you don’t want to miss out.)

For those who don’t know exactly what CAPTCHA or reCAPTCHA is, here you go:

CAPTCHA (left) & reCAPTCHA (right)

Back when Yahoo was the darling of the tech world (late 90s), it had a spam problem: hackers were creating millions of email addresses every day to send ads to people. They did it with code, and it was difficult to stop them. Yahoo’s chief scientist at the time came to Carnegie Mellon and gave a talk about this. Luis and his advisor were among the attendees. They decided to work on it and identified the core problem: Yahoo couldn’t tell computers and humans apart. The solution was CAPTCHA. Even the best image recognition software had a lot of trouble identifying letters in images, while humans had little problem. That was the winning idea.

Soon after the idea was introduced, it gained adoption all over the web because it was so effective. Luis essentially gave the idea away for free. Later, he found a way to monetize it through a deal with The New York Times to digitize its backlog of articles from the past 100 years. The way to do this was to take the words the machines found hard to identify and give them to users who wanted to register an account of some sort.

This was an ingenious discovery, and I really loved how he arrived at it. Once CAPTCHA had rolled out to virtually every website, users were spending about 30 seconds on average filling out the form. Luis also heard a lot of complaints from friends at parties who really detested it (I hope in a joking way). So he did a back-of-the-envelope calculation: millions of people were doing this every day, wasting 30 seconds each. Collectively, that’s hundreds of thousands of hours of human labor wasted. He thought there had to be a way to make more meaningful use of that time.
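To get a feel for the scale, here is a minimal Python sketch of that kind of back-of-the-envelope math. The 30 seconds is the figure mentioned above; the 12 million solves per day is a made-up placeholder, since the only number given is "millions", and the function name is just for illustration.

```python
def wasted_hours_per_day(solves_per_day: int, seconds_per_solve: float) -> float:
    """Total human hours spent solving CAPTCHAs in a single day."""
    return solves_per_day * seconds_per_solve / 3600  # 3600 seconds in an hour


# ASSUMPTION: hypothetical daily volume, chosen only to illustrate the scale.
SOLVES_PER_DAY = 12_000_000
# Average time per CAPTCHA, as quoted in this post.
SECONDS_PER_SOLVE = 30

hours = wasted_hours_per_day(SOLVES_PER_DAY, SECONDS_PER_SOLVE)
print(f"{hours:,.0f} hours per day")  # prints "100,000 hours per day"
```

Under these assumed inputs, the total lands right around the "hundreds of thousands of hours" range once the daily volume climbs a bit higher.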

One day, during a drive, the answer just came to him. He related it to the movement to digitize the world’s libraries. It was essentially the same problem: the most state-of-the-art programs weren’t able to identify 30% of the scanned letters from books. That became the problem reCAPTCHA was able to solve. Luis would take the unidentifiable letters from the digitization efforts and use them in the CAPTCHA prompts. The most interesting part was that the first major contributors to the digitization of our libraries were the users of onlinebootycall.com. (When I listened to this part, I really lmaoed.)

To put it another way, the users who were looking for booty calls accidentally advanced the progress of digitizing all of our books. The irony is simply way too strong with this one. After this, the rest was history.

Pricing, monetization and the system that figures out when to send notifications at Duolingo

Here are a few facts about Duolingo’s paying users and its revenue, which came up as Luis talked about his mission of bringing free education to everybody. I am writing them down in case I forget. Since this podcast was recorded around March, I assume the numbers Luis disclosed are still accurate:

  1. Duolingo makes 600,000 dollars every day (that’s around 219 million dollars in annual revenue). From the way Luis talked about it, it seemed that 20% of the revenue comes from ads and 80% from premium subscriptions.
  2. Duolingo has 40 million active learners every day.
  3. 20% of Duolingo’s users are from the United States.
  4. In the early days, Duolingo wanted the app to be free for everyone, but it quickly ran into trouble because the company just couldn’t stay afloat. It first introduced an ads model and later added a premium model because users complained and wanted to pay to remove the ads. The premium model is much, much more lucrative than ads.
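The arithmetic behind the first point is easy to check. Here is a small Python sketch that annualizes the daily figure and applies the rough 20/80 split; the split is only my impression from the podcast, not an official breakdown, and the function names are my own.

```python
DAILY_REVENUE_USD = 600_000  # daily figure quoted in this post
ADS_PERCENT = 20             # ASSUMPTION: rough split as I understood it
SUBSCRIPTION_PERCENT = 80    # ASSUMPTION: rough split as I understood it


def annual_revenue(daily_usd: int, days: int = 365) -> int:
    """Annualize a daily revenue figure, assuming every day looks the same."""
    return daily_usd * days


def share_of(total_usd: int, percent: int) -> int:
    """Whole-dollar share of a total for a given integer percentage."""
    return total_usd * percent // 100


total = annual_revenue(DAILY_REVENUE_USD)
print(f"Annual:        ${total:,}")                                   # $219,000,000
print(f"Ads:           ${share_of(total, ADS_PERCENT):,}")            # $43,800,000
print(f"Subscriptions: ${share_of(total, SUBSCRIPTION_PERCENT):,}")   # $175,200,000
```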

On the other hand, the feature that Luis was most proud of was the intelligent notification system. A few months back, I wrote about Pinterest’s targeted notification system here. (John Egan, who ran that program, also commented that the targeted notification system was one of the most effective programs he ran. Pinterest was among the few in the industry that actually implemented an intelligent notification system successfully.) Having learned about Pinterest’s example, I was more than thrilled to hear another success story from Duolingo.

Luis mentioned that they want users to use Duolingo every single day. (In the podcast, you can find it at around the 59:11 mark.) That is the only way to turn usage into a habit. They implemented this through an AI-backed notification system, which figures out when to send notifications and what to say in them. One example: if the user fails to log in after five consecutive notifications, Duolingo pops up a notification that says, in effect, “we are going to stop sending you notifications because it is not working.” This became an extremely powerful way to get people to come back.
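I don't know how Duolingo actually builds this, and the real system is AI-backed, but the "five ignored notifications" rule on its own can be sketched in a few lines of Python. Everything here (the class, function names and message wording) is hypothetical and only illustrates that one rule, not the timing or content model.

```python
from dataclasses import dataclass
from typing import Optional

MAX_IGNORED = 5  # threshold from the example in the podcast
REMINDER = "Time for your daily lesson!"  # hypothetical wording
FAREWELL = "These reminders don't seem to be working. We'll stop sending them for now."


@dataclass
class ReminderState:
    ignored_in_a_row: int = 0  # reminders sent since the user's last login
    paused: bool = False       # True once the farewell message has gone out


def next_notification(state: ReminderState) -> Optional[str]:
    """Return the next message to send, or None if reminders are paused."""
    if state.paused:
        return None
    if state.ignored_in_a_row >= MAX_IGNORED:
        state.paused = True
        return FAREWELL
    # Count the reminder as ignored until a login resets the counter.
    state.ignored_in_a_row += 1
    return REMINDER


def record_login(state: ReminderState) -> None:
    """The user came back, so reset the counter and resume reminders."""
    state.ignored_in_a_row = 0
    state.paused = False


# A user who never logs in gets five reminders, one farewell, then silence.
state = ReminderState()
messages = [next_notification(state) for _ in range(7)]
```

The interesting design choice is the farewell message itself: it is the last thing sent before going quiet, and a login at any point resets the whole cycle.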

After hearing about this, I definitely need to catch up with the folks at Duolingo and find out more about it. Hopefully I can learn about it and use it in my work.
