What is CAPTCHA?

Thinkestry
May 4, 2020 (5 min read)

Almost everyone who uses the internet has faced this at least once.

You may have seen a small part of a page asking you to solve a simple arithmetic problem, type out the distorted text shown in an image, click the images that contain traffic lights, or tick a box to verify that you are not a robot before you can continue with what you were doing on that site.

These are called CAPTCHAs, short for Completely Automated Public Turing test to tell Computers and Humans Apart. A CAPTCHA is a program that protects web pages from bots and other automated programs that can cause undesirable effects.

It may seem ironic that a machine is verifying that you are not a machine.

One study found that a person spends approximately 10 seconds solving a CAPTCHA. That may seem time-consuming, but CAPTCHAs have their uses.

Early CAPTCHAs, which some older sites still use, ask the user to do a simple arithmetic calculation on two or three numbers, where at least one of the numbers is distorted or shown as an image so that bots cannot read it.
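To make the arithmetic idea concrete, here is a minimal sketch in Python. The names `make_challenge` and `check_answer` are made up for this illustration and do not come from any real CAPTCHA library. The sketch also leaves out the step that matters most in practice: a real CAPTCHA would render the question as a distorted image rather than plain text.

```python
import operator
import random

# Hypothetical helper names for illustration only.
OPERATIONS = {"+": operator.add, "-": operator.sub, "*": operator.mul}


def make_challenge(rng):
    """Return (question_text, expected_answer) for two small numbers."""
    a = rng.randint(1, 9)
    b = rng.randint(1, 9)
    symbol = rng.choice(sorted(OPERATIONS))
    return f"{a} {symbol} {b}", OPERATIONS[symbol](a, b)


def check_answer(expected, submitted):
    """Accept the submission only if it parses to the expected integer."""
    try:
        return int(submitted.strip()) == expected
    except ValueError:
        # Anything that is not a whole number fails the check.
        return False


rng = random.Random(42)  # seeded only so the sketch is repeatable
question, expected = make_challenge(rng)
print("Please solve:", question)
```

The server keeps `expected` to itself and compares it with whatever the visitor types, so the answer never has to be sent to the browser.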

CAPTCHAs are gradually evolving to take much less of the user's time while still keeping bots out.

reCAPTCHA, which is now owned and maintained by Google, is the type of CAPTCHA you are most likely to see on modern web pages.

How is CAPTCHA helpful?

  1. Many providers, such as Gmail and Yahoo, offer free email services to the general public. Before CAPTCHAs were introduced, they suffered attacks from bots that created large numbers of email IDs for spamming. CAPTCHAs make sure that free email IDs are created only by humans.
  2. CAPTCHAs help prevent spam comments on blogs, which link to other sites to generate traffic for the spammer. Posting a comment is sometimes gated by a CAPTCHA to make sure the commenter is a human and not a bot automated to comment.
  3. CAPTCHAs help stop spammers from harvesting email IDs that are publicly listed on websites. Spammers used to crawl websites to collect email IDs they could send spam to. CAPTCHAs reduce the spam sent to those addresses by revealing an email ID only after a CAPTCHA verification.

There are many use cases like these where a CAPTCHA makes sure that bots cannot access information.

A spammer can still do all of this manually, but that takes a great deal of effort before spamming becomes viable at any useful scale. CAPTCHAs help keep spam on the internet comparatively low.

reCAPTCHA in Digitizing Books

Besides filtering out bots, reCAPTCHA served a side purpose: it helped digitize books.

Book digitization involves scanning books page by page and then converting the scans to text, so that they can be stored for a very long time without the degradation that affects physical copies.

Digitization made sharing knowledge much easier and faster as internet use rose sharply. It also made those books more widely available, since they can easily be copied and shared over the internet.

The digitization process is not always 100 percent accurate, so separate sub-processes are needed to ensure its accuracy. While converting scanned pages to text, programs had trouble recognizing some words and interpreted them wrongly.

reCAPTCHA enlisted the help of people indirectly. During verification, it showed two images, each containing a word.

One of these is a control word, which reCAPTCHA has already interpreted successfully. The other is a scanned image of a word from a book, where the program is unsure whether its interpretation is accurate.

When the control word is entered correctly, reCAPTCHA assumes that the other word has also been entered correctly, provided the result meets its accuracy expectations. The same word is shown to multiple people and the same procedure is followed. When the same answer comes back often enough, reCAPTCHA accepts it as the correct interpretation of the scanned word and corrects it in the digitized book it was taken from.
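The control-word procedure can be sketched as a small voting scheme. This is only an illustration of the idea described above, not reCAPTCHA's actual implementation. The class name `UnknownWord` and the thresholds `MIN_VOTES` and `MIN_AGREEMENT` are assumptions chosen for the example.

```python
from collections import Counter

# Assumed thresholds for this sketch; the real values are not public here.
MIN_VOTES = 3         # how many trusted answers we need before deciding
MIN_AGREEMENT = 0.75  # share of votes the top answer must reach


def normalize(word):
    """Compare words without regard to case or surrounding spaces."""
    return word.strip().lower()


class UnknownWord:
    """Collects human transcriptions of one scanned word."""

    def __init__(self):
        self.votes = Counter()

    def submit(self, control_expected, control_typed, unknown_typed):
        """Count the vote only if the control word was typed correctly."""
        if normalize(control_typed) != normalize(control_expected):
            return False  # failed the check: ignore the unknown-word answer
        self.votes[normalize(unknown_typed)] += 1
        return True

    def resolved(self):
        """Return the agreed transcription, or None if there is no consensus."""
        total = sum(self.votes.values())
        if total < MIN_VOTES:
            return None
        word, count = self.votes.most_common(1)[0]
        return word if count / total >= MIN_AGREEMENT else None


scanned = UnknownWord()
scanned.submit("morning", "morning", "upon")
scanned.submit("morning", "mornin", "xyz")   # control word wrong, vote dropped
scanned.submit("morning", "Morning", "upon")
scanned.submit("morning", "morning", "Upon")
print(scanned.resolved())
```

Answers from people who fail the control word never reach the tally, which is what lets the system trust strangers it cannot otherwise check.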

Note the tagline shown on the reCAPTCHA widget:

stop spam. read books.

This process helped Google considerably in digitizing the books available on Google Books.

reCAPTCHA in Understanding Maps

Google applied the idea it had used for digitizing books to understanding maps. Recent versions of reCAPTCHA ask people to select the images that contain specific things such as traffic lights, cars, trucks, buses, or zebra crossings.

Some of these are control images, in which the program has already identified the objects successfully. The others are shown to improve the accuracy of interpretation, which in turn can help improve Google's maps and Street View.

The technique is the same as the one explained above for book digitization. For example, when a person selects the images that contain a bus, some of those bus images have already been interpreted accurately by the program, while others were labeled with lower confidence and need a human to confirm that they show a bus.

When the control images are identified correctly, reCAPTCHA assumes the other images are marked correctly too, provided the result meets its accuracy expectations. The same image is shown to multiple people and the same procedure is followed. When the same images are selected often enough for the requested object, reCAPTCHA accepts that as the correct interpretation of the image and updates the maps data accordingly.
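The image version can be sketched in the same spirit. Again, this is a simplified illustration under stated assumptions, not Google's method. The function names, the tile IDs, and the `min_votes` and `min_share` thresholds are all hypothetical.

```python
def grade_selection(control_labels, selected):
    """Pass only if every control tile is handled correctly.

    control_labels maps a tile ID to True when the tile is known to
    contain the requested object (say, a bus) and False when it is not.
    """
    return all(
        (tile in selected) == has_object
        for tile, has_object in control_labels.items()
    )


def record_votes(tally, unknown_tiles, selected):
    """Add one yes/no vote per unknown tile from a trusted selection."""
    for tile in unknown_tiles:
        yes, no = tally.get(tile, (0, 0))
        if tile in selected:
            tally[tile] = (yes + 1, no)
        else:
            tally[tile] = (yes, no + 1)


def confirmed_tiles(tally, min_votes=3, min_share=0.8):
    """Tiles that enough trusted people agreed contain the object."""
    return {
        tile
        for tile, (yes, no) in tally.items()
        if yes + no >= min_votes and yes / (yes + no) >= min_share
    }


control = {"a": True, "b": False}  # answers already known to the program
unknown = ["c", "d"]               # tiles the program is unsure about
tally = {}

for selection in [{"a", "c"}, {"a", "c"}, {"a", "c", "d"}, {"b", "c"}]:
    if grade_selection(control, selection):  # the last selection fails
        record_votes(tally, unknown, selection)

print(confirmed_tiles(tally))
```

Here tile `c` collects three trusted "yes" votes and is confirmed, while tile `d` gets only one and stays unconfirmed.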

Although CAPTCHAs may take some time, they ultimately combine the small efforts of many people to keep the internet comparatively safe, and they put those efforts to use for the benefit of humanity.

Little drops of water make a mighty ocean.

Join our email list: https://www.thinkestry.in/subscribe

Check out our other mediums: https://linktr.ee/thinkestry


We are trying our best to make knowledge available free of cost and are striving hard to help this world and the upcoming generations to leverage science.