Aadhaar Card Verification and Information extraction from Front & Back Side using AI-OCR Tool.
In the previous article, we validated and extracted text from the front side of the Aadhaar card. In this tutorial, we will see how to validate and extract useful information from the back side of the Aadhaar card.
Link of the previous article —
Importance of Aadhaar Card
Aadhaar is a unique 12-digit identification number issued by the Unique Identification Authority of India (UIDAI) to residents of India, and it serves as a centralized, universal identifier. The Aadhaar card is linked to biometric data: an individual’s details are stored in a government database, and it is fast becoming the government’s base for public welfare and citizen services.
In this article, we will focus on verification and information extraction from the back side of the Aadhaar card, as this is also an important step in validating the document.
Every document has some unique features that make it different from other documents. The distinguishing features of an Aadhaar card are:
- State Emblem of India
- GOI Symbol
- QR Code
- Aadhaar Logo
- UIDAI Symbol
Now that we have identified the unique features, our next step is to train an object detection model to identify these features and validate them. I will not go into much detail on how to train an object detection algorithm, as I covered it thoroughly in the last tutorial. Kindly refer to that :)
I have used the TensorFlow 2 Object Detection API for training and validation. You can use other deep learning frameworks, such as PyTorch or Keras, depending on your requirements and trade-offs.
STEP 1: Verification of Document
The object detection model is used to verify whether the input document is a valid Aadhaar card. If it is, we proceed to the next step. Otherwise, the document is declared invalid and the process ends.
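The verification decision can be sketched roughly as follows. This is only a minimal illustration, not the exact code used in this project: it assumes the detector's output has already been converted into `(label, score)` pairs, and the label names, the `verify_document` helper, and the 0.5 score threshold are all hypothetical. Which features must be present on a given side of the card is also an assumption you would set yourself.

```python
def verify_document(detections, required, min_score=0.5):
    """Decide whether a document passes the feature check.

    detections: iterable of (label, score) pairs from an object detector.
    required:   set of labels that must be detected (assumed, not official).
    min_score:  detections below this confidence are ignored (assumed value).

    Returns (is_valid, missing_labels).
    """
    # Keep only the labels that were detected with enough confidence.
    found = {label for label, score in detections if score >= min_score}
    missing = set(required) - found
    return len(missing) == 0, sorted(missing)


# Made-up detector output for illustration.
detections = [("qr_code", 0.97), ("uidai_symbol", 0.91), ("aadhaar_logo", 0.32)]

is_valid, missing = verify_document(detections, required={"qr_code", "uidai_symbol"})
print(is_valid, missing)
```

Returning the list of missing features along with the verdict makes it easier to tell the user why a document was rejected.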
STEP 2: Extracting Data using OCR
Once the submitted document is verified to be an Aadhaar card, the information printed on it is extracted by means of Optical Character Recognition (OCR). On the back side, this information mainly consists of the user’s address, which can be verified against the bank’s database.
Here is the process flow of the complete algorithm, from detection to validation and extraction of information.
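To give an idea of how the address can be pulled out of the raw OCR output, here is a small sketch. It assumes the OCR engine returns the text as one plain string, that the address is introduced by the word "Address", and that it ends with a 6-digit PIN code. The `extract_address` helper and the sample text are made up for illustration, and real OCR output will usually need more cleanup than this.

```python
import re

# Indian PIN codes are 6 digits and do not start with 0.
PIN_RE = re.compile(r"\b[1-9]\d{5}\b")
# Assumed marker: the word "Address", optionally followed by ':' or '-'.
ADDRESS_MARKER_RE = re.compile(r"address\s*[:\-]?\s*", re.IGNORECASE)


def extract_address(ocr_text):
    """Return (address, pin_code) parsed from OCR text, or (None, None)."""
    marker = ADDRESS_MARKER_RE.search(ocr_text)
    if marker is None:
        return None, None

    tail = ocr_text[marker.end():]
    pin = PIN_RE.search(tail)
    if pin is None:
        # No PIN code found: keep everything after the marker.
        address, pin_code = tail, None
    else:
        # Treat the first PIN code as the end of the address.
        address, pin_code = tail[:pin.end()], pin.group()

    # Collapse line breaks and repeated spaces into single spaces.
    return " ".join(address.split()), pin_code


# Made-up sample text, not taken from a real card.
sample = (
    "Unique Identification Authority of India\n"
    "Address: S/O Ram Kumar, 12 MG Road,\n"
    "Sector 5, Patiala,\n"
    "Punjab - 147004\n"
    "1234 5678 9012"
)
address, pin_code = extract_address(sample)
print(address)
print(pin_code)
```

The PIN code is returned separately because it is an easy field to match exactly against a bank record, while the free-form address usually needs fuzzy comparison.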
Now our AI-OCR and computer vision tool is ready. Let’s look at some outputs.
The textual information written on the document is also extracted from both sides and saved in a .txt file, which can then be used for further validation and verification.
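Saving the text from both sides into a single .txt file could look something like the sketch below. The `save_extracted_text` helper, the `[FRONT]` / `[BACK]` section markers, and the output path are assumptions made for this example; the actual file layout used by the tool may differ.

```python
from pathlib import Path


def save_extracted_text(front_text, back_text, out_path):
    """Write the OCR text of both sides into one UTF-8 .txt file."""
    out_path = Path(out_path)
    # Create the output folder if it does not exist yet.
    out_path.parent.mkdir(parents=True, exist_ok=True)

    # Hypothetical layout: one labelled section per side.
    content = "\n".join(
        ["[FRONT]", front_text.strip(), "", "[BACK]", back_text.strip(), ""]
    )
    out_path.write_text(content, encoding="utf-8")
    return out_path
```

For example, `save_extracted_text(front, back, "output/aadhaar_text.txt")` would write both sections to that file, and a later validation step can split on the section markers.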
Final Outcome
This AI-OCR tool is useful for financial institutions, since KYC has been mandated by the Reserve Bank of India (RBI). It is especially relevant in the post-COVID world, where every effort is being made to reduce human-to-human interaction. The tool addresses both needs and helps financial institutions streamline the whole process.
Tools & Technologies Used
Python: The programming language used to build the whole pipeline, chosen for its mature AI and computer vision libraries.
Google Cloud OCR: To extract the text from the Aadhaar card and validate it.
TensorFlow: To train our ML model on Aadhaar features.
Labellerr: To annotate the images for training the model.
OpenCV: To pre-process the images and put them in a format suitable for the training step.
Docker: To containerize the whole application and deploy it on cloud platforms.
About Me
I am a passionate programmer who is willing to explore work outside my comfort zone, from developing challenging large-scale software to small weekend hackathons. I am currently pursuing Computer Science Engineering at Thapar Institute of Engineering and Technology (TIET).
Connect with me on LinkedIn