Aadhaar Card Verification and Information Extraction from Front & Back Side using AI-OCR Tool

Shiva Thavani
Published in Analytics Vidhya
4 min read · Mar 6, 2021

In the previous article, we validated and extracted text from the front side of the Aadhaar card. In this tutorial, we will see how to validate and extract useful information from the back side of the card.

Link of the previous article —

Importance of Aadhaar Card

Aadhaar is a unique 12-digit identification number issued to residents of India, and it serves as a centralized, universal identifier. The Aadhaar card is a biometric-linked document: an individual’s details are stored in a government database, and it is fast becoming the government’s base for public welfare and citizen services.

In this article, we focus on verification and information extraction from the back side of the Aadhaar card, which is also an important step in validating the document.

Every document has unique features that distinguish it from other documents. The distinguishing features of an Aadhaar card are:

  1. State Emblem of India
  2. GOI Symbol
  3. QR Code
  4. Aadhaar Logo
  5. UIDAI Symbol
Identifying Features of an Aadhaar Card

Now that we have identified the unique features, the next step is to train an object detection model to identify and validate them. I will not go into detail on how to train an object detection model, as I covered it thoroughly in the previous tutorial; kindly refer to that :)

I used the TensorFlow 2 Object Detection API for training and validation. You can use other deep learning frameworks, such as PyTorch or Keras, depending on your requirements and trade-offs.

STEP 1: Verification of Document

The object detection model is used to verify whether the input document is a valid Aadhaar card. If it is, we proceed to the next step; otherwise, the document is declared invalid and the process ends.
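
To make the verification rule concrete, here is a minimal sketch in plain Python. It assumes the detector's output has already been reduced to (label, score) pairs. The label names, the 0.5 score threshold, and the rule that every listed feature must be present are all assumptions for illustration; the article does not state the exact rule.

```python
# Hypothetical label names for the features listed earlier; the real names
# depend on how the training images were annotated.
REQUIRED_FEATURES = {
    "state_emblem",
    "goi_symbol",
    "qr_code",
    "aadhaar_logo",
    "uidai_symbol",
}


def is_valid_aadhaar(detections, min_score=0.5, required=REQUIRED_FEATURES):
    """Return (is_valid, missing_labels) for a list of (label, score) pairs.

    A feature counts as present only if it was detected with a score of at
    least min_score. The threshold value is an assumption, not a tuned number.
    """
    found = {label for label, score in detections if score >= min_score}
    missing = sorted(set(required) - found)
    return (not missing, missing)


# Made-up detections: two features are absent or below the threshold.
sample = [("state_emblem", 0.91), ("goi_symbol", 0.84), ("qr_code", 0.97),
          ("aadhaar_logo", 0.20)]
valid, missing = is_valid_aadhaar(sample)
print(valid, missing)
```

Returning the missing labels along with the verdict makes it easier to explain to a user why a document was rejected.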

STEP 2: Extracting Data using OCR

Once the submitted document is verified to be an Aadhaar card, the information on it is extracted by means of Optical Character Recognition (OCR). This information mainly consists of the user’s address, which can be verified against the bank’s database.
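
Once the OCR service has returned raw text, the address still has to be pulled out of it. The sketch below shows one possible way to do that with regular expressions. It assumes the back side prints an "Address:" label followed by the address ending in a 6-digit PIN code, and that the Aadhaar number appears as three groups of four digits. The function name, the layout assumptions, and the sample text are illustrative and not taken from the original tool.

```python
import re

# A PIN code is 6 digits and does not start with 0.
PIN_RE = re.compile(r"\b[1-9]\d{5}\b")
# Aadhaar numbers are commonly printed as three groups of four digits.
AADHAAR_RE = re.compile(r"\b\d{4}\s\d{4}\s\d{4}\b")
# Assumed layout: "Address:" label, then free text up to the first PIN code.
ADDRESS_RE = re.compile(r"Address\s*:?\s*(.+?\b[1-9]\d{5}\b)", re.IGNORECASE)


def parse_back_side(ocr_text):
    """Extract address, PIN code, and Aadhaar number from raw OCR text.

    Fields that cannot be found are returned as None.
    """
    # Collapse line breaks and repeated spaces so the patterns see one line.
    text = " ".join(ocr_text.split())

    address_match = ADDRESS_RE.search(text)
    address = address_match.group(1) if address_match else None

    pin_match = PIN_RE.search(address if address else text)
    number_match = AADHAAR_RE.search(text)

    return {
        "address": address,
        "pin_code": pin_match.group(0) if pin_match else None,
        "aadhaar_number": number_match.group(0).replace(" ", "")
        if number_match else None,
    }


# Entirely made-up sample text, for illustration only.
sample_text = """Address:
S/O Ramesh Kumar, 12 MG Road,
Patiala, Punjab - 147001
1234 5678 9012"""
fields = parse_back_side(sample_text)
print(fields)
```

Real OCR output is noisier than this sample, so a production version would likely need fuzzier matching before the extracted address is compared with the bank's records.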

Here is the process flow of the complete algorithm, from detection to validation and extraction of information:

Process flow Aadhaar AI-OCR & Computer Vision Tool
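
The flow can also be sketched as a small Python function that runs verification first and calls OCR only for documents that pass. The detector and OCR steps are passed in as callables so the sketch runs without a trained model or a cloud service. The names `process_document`, `fake_detector`, and `fake_ocr`, the result format, and the 0.5 threshold are hypothetical and not taken from the original tool.

```python
def process_document(image, detect_features, run_ocr, required_features,
                     min_score=0.5):
    """Verify first, then extract; OCR is skipped for invalid documents."""
    detections = detect_features(image)  # expected: list of (label, score)
    found = {label for label, score in detections if score >= min_score}
    missing = sorted(set(required_features) - found)
    if missing:
        # STEP 1 failed: declare the document invalid and stop.
        return {"status": "invalid", "missing": missing, "text": None}
    # STEP 2: the document passed verification, so extract its text.
    return {"status": "valid", "missing": [], "text": run_ocr(image)}


# Stand-in callables so the sketch runs end to end.
def fake_detector(image):
    return [("qr_code", 0.93), ("aadhaar_logo", 0.88)]


def fake_ocr(image):
    return "Address: (sample OCR text)"


result = process_document(b"image-bytes", fake_detector, fake_ocr,
                          {"qr_code", "aadhaar_logo"})
print(result["status"])
```

Injecting the two steps as callables keeps the control flow testable on its own and lets the detector or OCR backend be swapped without touching the pipeline.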

Now our AI-OCR and computer vision tool is ready. Let’s look at some outputs.

Unique features of the Aadhaar Card Front (left) & Back (right) detected by Object Detection Algorithm.

The textual information on the document is also extracted from both sides and saved in a .txt file, which can then be used for validation and verification.

Final Outcome

This AI-OCR tool is useful for financial institutions because KYC has been mandated by the Reserve Bank of India (RBI), and especially so in the post-COVID world, where every effort is being made to reduce human-to-human interaction. The tool addresses both needs and helps financial institutions streamline the whole process.

Tools & Technologies Used

Python: a well-suited programming language for carrying out the AI tasks.
Google Cloud OCR: to extract the text from the Aadhaar card and validate it.
TensorFlow: to train our ML model on Aadhaar features.
Labellerr: to annotate the images for training the model.
OpenCV: to pre-process the images and put them in a format suitable for the training step.
Docker: to containerize the whole application and deploy it on cloud platforms.

Technology Stack

About Me

I am a passionate programmer who is willing to take on work outside my comfort zone, from developing challenging large-scale software to small weekend hackathons. I am currently pursuing Computer Science Engineering at Thapar Institute of Engineering and Technology (TIET).
Connect with me on LinkedIn
