How the Aadhaar number is generated and validated

sharath krs
5 min read · Apr 6, 2020

--

Have you ever wondered how Aadhaar numbers are generated in India? How is random number generation handled, and how are palindromic numbers avoided?

What is an Aadhaar card?

Aadhaar is a unique, government-issued identity that is assigned once to each Indian resident. The Aadhaar number is a random 12-digit number linked to the individual’s biometric and demographic data.

The Aadhaar program began in 2009, when the Unique Identification Authority of India (UIDAI) was set up; UIDAI became a statutory authority under the Aadhaar Act in 2016. All Aadhaar cards are issued through this body, which collects the cardholder’s demographic and biometric data to enable a more streamlined and transparent method of allocating certain government benefits and subsidies to residents.

A sample Aadhaar card.

Aadhaar has many benefits, but we will not cover them here. We focus only on how the number is generated and validated.

Aadhaar generation involves processes such as quality checks, packet validation, and demographic and biometric de-duplication. An Aadhaar number is generated successfully only if:

  • The quality of the enrolment data meets the standards prescribed by UIDAI.
  • The enrolment packet passes all the validations done in the CIDR (Central Identities Data Repository).
  • No demographic or biometric duplicate is found.

An Aadhaar number consists of 12 digits. The first 11 digits are generated uniquely, and the last digit is a checksum.

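How is the checksum generated and validated?

So what is a checksum?

A checksum is an extra digit computed from the other digits, so that most typing mistakes can be caught by recomputing it and comparing. For Aadhaar, the checksum is generated using the Verhoeff algorithm.

One of the best-known uses of the Verhoeff algorithm is in the UIDAI Aadhaar number generation program.

The Verhoeff algorithm is more involved than simpler schemes. It relies on table lookups rather than plain arithmetic, so working it out by hand is tedious and error-prone. It is well suited to computers.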

The Verhoeff algorithm, a checksum formula for error detection first published in 1969, was developed by Dutch mathematician Jacobus Verhoeff (born 1927). Like the more widely known Luhn algorithm, it works with strings of decimal digits of any length. It detects all single-digit errors and all transposition errors involving two adjacent digits.

With more than 100 crore (1 billion) Aadhaar numbers to be generated, the Verhoeff algorithm was the one chosen, since nobody is expected to validate an Aadhaar number by hand.

What is the Verhoeff algorithm?

I will use JavaScript to demonstrate the Verhoeff algorithm.

First, we need three tables: the multiplication table, the permutation table, and the inverse table.

// The multiplication table d
var d = [
  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
  [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
  [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
  [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
  [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
  [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
  [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
  [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
  [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
  [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
];

// The permutation table p
var p = [
  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
  [1, 5, 7, 6, 2, 8, 3, 0, 9, 4],
  [5, 8, 0, 3, 7, 9, 6, 1, 4, 2],
  [8, 9, 1, 6, 0, 4, 3, 5, 2, 7],
  [9, 4, 5, 3, 1, 2, 6, 8, 7, 0],
  [4, 2, 8, 6, 5, 7, 3, 9, 0, 1],
  [2, 7, 9, 3, 8, 0, 6, 4, 1, 5],
  [7, 0, 4, 6, 9, 1, 3, 2, 5, 8]
];

// The inverse table inv
var inv = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9];
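The three tables are related to each other rather than arbitrary. As a rough sketch (the helper names `buildPermutationTable` and `buildInverseTable` are made up for this illustration), the snippet below rebuilds `p` and `inv` from `d` and the first non-trivial permutation row. It also shows why the checksum code indexes `p` with `% 8`: the permutation returns to the identity after eight applications.

```javascript
// Multiplication table (same values as `d` above).
const d = [
  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
  [1, 2, 3, 4, 0, 6, 7, 8, 9, 5],
  [2, 3, 4, 0, 1, 7, 8, 9, 5, 6],
  [3, 4, 0, 1, 2, 8, 9, 5, 6, 7],
  [4, 0, 1, 2, 3, 9, 5, 6, 7, 8],
  [5, 9, 8, 7, 6, 0, 4, 3, 2, 1],
  [6, 5, 9, 8, 7, 1, 0, 4, 3, 2],
  [7, 6, 5, 9, 8, 2, 1, 0, 4, 3],
  [8, 7, 6, 5, 9, 3, 2, 1, 0, 4],
  [9, 8, 7, 6, 5, 4, 3, 2, 1, 0],
];

// Base permutation: row 1 of the permutation table.
const basePermutation = [1, 5, 7, 6, 2, 8, 3, 0, 9, 4];

// Hypothetical helper: row i is the base permutation applied i times.
function buildPermutationTable(base, rows) {
  const table = [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]]; // row 0 is the identity
  for (let i = 1; i < rows; i++) {
    table.push(table[i - 1].map((digit) => base[digit]));
  }
  return table;
}

// Hypothetical helper: inv[j] is the digit k for which d[j][k] === 0.
function buildInverseTable(mult) {
  return mult.map((row) => row.indexOf(0));
}

const p = buildPermutationTable(basePermutation, 8);
const inv = buildInverseTable(d);
const ninthRow = buildPermutationTable(basePermutation, 9)[8];

console.log(inv.join(''));      // 0432156789
console.log(p[7].join(''));     // 7046913258
console.log(ninthRow.join('')); // 0123456789, back to the identity
```

Note also that `d` is not symmetric: `d[1][5]` is 6 while `d[5][1]` is 9. That lack of commutativity is part of what lets the scheme notice two swapped digits.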

The validateAadhaar function below rejects malformed input with an error and accepts only strings of digits.

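function validateAadhaar(aadhaarString) {
  // Throw instead of returning an Error object: a returned Error is truthy
  // and could be mistaken for a successful validation.
  if (aadhaarString.length != 12) {
    throw new Error('Aadhaar numbers should be 12 digits in length');
  }
  // Reject anything that is not a digit.
  if (aadhaarString.match(/\D/)) {
    throw new Error('Aadhaar numbers must contain only digits');
  }
  var aadhaarArray = aadhaarString.split('');
  var toCheckChecksum = aadhaarArray.pop();
  // generate() returns a number, so convert the popped character before comparing.
  return generate(aadhaarArray) === Number(toCheckChecksum);
}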

Basically, if your Aadhaar number is 1234 5678 9123 (passed to the function without the spaces), the last digit, 3, is the checksum, so we use pop to take the last digit:

var toCheckChecksum = aadhaarArray.pop();
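Aadhaar numbers are usually printed in groups of four, so the spaces need to be removed before the length check. A minimal sketch of that step, assuming a helper named `splitAadhaar` that is not part of the original code:

```javascript
// Hypothetical helper: strips whitespace, checks the shape, and separates
// the 11-digit payload from the trailing check digit.
function splitAadhaar(input) {
  const digits = String(input).replace(/\s+/g, '');
  if (!/^\d{12}$/.test(digits)) {
    throw new Error('Aadhaar numbers must be exactly 12 digits');
  }
  return {
    payload: digits.slice(0, 11),         // first 11 digits
    checkDigit: Number(digits.slice(11)), // last digit, as a number
  };
}

console.log(splitAadhaar('1234 5678 9123'));
// { payload: '12345678912', checkDigit: 3 }
```

Returning the two parts separately keeps the later comparison explicit: the payload goes into the checksum function, and the result is compared with `checkDigit`.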

The snippet below generates the checksum. The validate function compares the generated checksum with the last digit; if the two match, the Aadhaar number passes validation, meaning it is well formed.

// Generates the checksum digit
function generate(array) {
  var c = 0;
  // Work on a reversed copy so the caller's array is left untouched.
  var invertedArray = array.slice().reverse();
  for (var i = 0; i < invertedArray.length; i++) {
    c = d[c][p[((i + 1) % 8)][invertedArray[i]]];
  }
  return inv[c];
}
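To see generation and validation working together, here is a self-contained sketch. It repeats the tables in compact string form, and the names `checkDigit` and `isValid` are chosen for this example rather than taken from the linked code. The sample inputs are arbitrary digit strings, not real Aadhaar numbers.

```javascript
// The three Verhoeff tables, written as strings to keep the example short.
const toRows = (rows) => rows.map((row) => [...row].map(Number));
const d = toRows(['0123456789', '1234067895', '2340178956', '3401289567', '4012395678',
                  '5987604321', '6598710432', '7659821043', '8765932104', '9876543210']);
const p = toRows(['0123456789', '1576283094', '5803796142', '8916043527',
                  '9453126870', '4286573901', '2793806415', '7046913258']);
const inv = [0, 4, 3, 2, 1, 5, 6, 7, 8, 9];

// Check digit for a string of digits (the payload, without a check digit).
function checkDigit(payload) {
  let c = 0;
  [...payload].reverse().forEach((ch, i) => {
    c = d[c][p[(i + 1) % 8][Number(ch)]];
  });
  return inv[c];
}

// A full number is valid when its last digit equals the check digit of the rest.
function isValid(number) {
  return checkDigit(number.slice(0, -1)) === Number(number.slice(-1));
}

const payload = '12345';
const full = payload + checkDigit(payload);
console.log(full);              // 123451
console.log(isValid(full));     // true
console.log(isValid('123461')); // false: one digit changed (5 -> 6)
console.log(isValid('132451')); // false: two adjacent digits swapped
```

The last two calls illustrate the two error classes the algorithm is designed to catch: a single wrong digit and a swap of two neighbouring digits.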

The full code can be found at the GitHub link.

Some other thoughts.

The UID number was intended to be purely numeric rather than alphanumeric, so that people need only minimal literacy (knowledge of numbers) to relay or remember their UID numbers.

Identifying information about a person (such as date of birth or name) was intentionally left out of the number, to prevent stereotyping or guessing a person’s background in any way from his or her number.

This is why the first 11 digits were made random. Aadhaar number generation used one true random generator (with random bits drawn from sources such as atmospheric noise) and multiple pseudo-random generators with periodic re-salting, to create numbers with a fairly uniform distribution over a number space of about 80 billion numbers. It is 80 billion and not 100 billion because numbers starting with ‘0’ and ‘1’ are reserved for future expansion. (I personally hope, however, that India’s population never crosses that mark!)
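As an illustration only (this is not UIDAI's actual generator, and `Math.random` is a stand-in for the stronger random sources described above), drawing an 11-digit payload whose first digit avoids the reserved 0 and 1 might look like this. The function name `randomPayload` and the injectable `randomInt` parameter are assumptions made for the sketch.

```javascript
// Default source of randomness: an integer in [min, max], via Math.random.
// A production system would use a much stronger source than this.
function defaultRandomInt(min, max) {
  return min + Math.floor(Math.random() * (max - min + 1));
}

// Hypothetical helper: 11 random digits, first digit restricted to 2..9.
function randomPayload(randomInt = defaultRandomInt) {
  let payload = String(randomInt(2, 9));
  for (let i = 1; i < 11; i++) {
    payload += String(randomInt(0, 9));
  }
  return payload;
}

console.log(randomPayload()); // 11 digits, never starting with 0 or 1
```

The checksum digit produced by the `generate` function would then be appended to such a payload to form the full 12-digit number. Passing the random source in as a parameter makes it easy to swap in a better generator or a fixed one for testing.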

UID numbers need to be unique as well. The random number generation part of the algorithm can produce duplicates, say one in a few thousand, with no predictable frequency. So I think a uniqueness check is performed when post-processing the generated numbers, possibly using a file-based unique index and an alternative bit-array-based implementation. Other filters in the post-processing stage include logic to reject “nice”-looking numbers. Ever wondered why no one gets a UID number with repeating sequences, palindromes or the like? The reason might be to ensure no one even attempts an entitlement route to get a special number.

The randomness and democracy of number allocation doesn’t end there. Even the allocation to enrolments processed in the system (say 12 Aadhaars are generated every second) happens on a first-come, first-served basis, much like getting the next available token when you walk into a bank.
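The uniqueness check and the filter for “nice” numbers described above can be pictured roughly as follows. Everything here is an assumption for illustration: the rules for what counts as a “nice” number, the threshold of four repeated digits, and the names `isPalindrome`, `hasRepeatedRun` and `acceptCandidates` are invented, and an in-memory `Set` stands in for the file-based index or bit array.

```javascript
// True when the digits read the same forwards and backwards.
function isPalindrome(digits) {
  return digits === [...digits].reverse().join('');
}

// True when any digit appears `runLength` or more times in a row (assumed rule).
function hasRepeatedRun(digits, runLength = 4) {
  return new RegExp(`(\\d)\\1{${runLength - 1},}`).test(digits);
}

// Keeps candidates that have not been issued before and do not look "nice".
function acceptCandidates(candidates, issued = new Set()) {
  const accepted = [];
  for (const candidate of candidates) {
    if (issued.has(candidate)) continue;     // duplicate
    if (isPalindrome(candidate)) continue;   // e.g. 234565565432
    if (hasRepeatedRun(candidate)) continue; // e.g. 277779135246
    issued.add(candidate);
    accepted.push(candidate);
  }
  return accepted;
}

console.log(acceptCandidates([
  '293847561029', '293847561029', '234565565432', '277779135246', '384756102938',
]));
// [ '293847561029', '384756102938' ]
```

Accepted numbers come out in the order they went in, which loosely mirrors the first-come, first-served allocation mentioned above.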

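Check digits are also used in other identifiers, although not always with the same algorithm: credit card numbers use the Luhn algorithm, and a PAN ends in an alphabetic check character (a PAN also has a preset character that distinguishes individuals from businesses).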
