Hashing? Why? How?

Sharon Abishek
featurepreneur
Published in
4 min readFeb 19, 2021

What is it and where is it most widely used?

Hashing is the process of converting an input of any length into a fixed sized value using a mathematical function. The algorithm changes the message to be hashed, which is called the input into its corresponding hash value. This algorithm is generally called a hash function.

One major characteristic of hashing is that each hash value or output has to be unique, which means it must be impossible to produce the same hash value for different inputs. Hence, a same input message must always produce the same hash value.

The hash function should be quick in producing a hash value.Hashes are majorly used in blockchain to ensure it’s immutability. The amount to be transferred, the senders and receivers of the transaction are considered as inputs and a corresponding hash value is generated, which can be used to confirm and identify the transaction.

Photo by Ewan Kennedy on Unsplash

Why do I need hashing in data structures?

Hashing involves storing and retrieving of data with less key comparisons and the goal of hashing is to find the required target data as efficiently as possible.The time complexity reduces to 0(1) for a hashed search whereas for a sequential search it is 0(N) and for binary search 0(log₂N) where N is the total number of key value comparisons that needs to be traversed before the target data is found.Since in hashing each key has an unique address, we can directly go to the specified address to find the target value thus reducing the number of traversals and thereby reduce the time complexity. Dictionaries are built upon hash tables.

Hash tables are a way to cut down the amount of nil values and also to store the information in such a way that it is easily accessible. Lets talk about an example to understand the tragedy that can be avoided using hash tables!

Let’s assume there are a few key-value pairs that have to be stored in such a way that the index and the key of the key-value are the same so that we know exactly where they are, reducing time complexity . So, the 0th position would have the key with the integer value 0 and so on. This might seem like a great way to back our dictionary in memory, but the main problem awaits. For any position that we don’t have a key for, is stored as “nil”.

visualization of the example

So when the size of the array increases, to say a billion i.e., we have a set of 10 key values randomly spread from 1 to a billion, so now we are left with 9,999,990 Nil values! This is indeed a problem and has to be solved.

This is why we need hashing and hash tables.

So how does hashing really work?

Taking the same example, using a hash table we would be able to store the 10 keys ranging from 1 to a billion within a table of just 10 positions/elements! This is achieved using hash functions mentioned earlier. A hash function will take all the keys for a given dictionary and strategically map them to an index location in an array, so that it can later be retrieved easily. Basically a hash function determines what index location to store to store a particular key at.

A good hash function for this case might be to find the power of tens in the key value.(1 can be represented as 10⁰ ,thus 0th index, similarly 100 can be represented as 10²,hence the index position 2)

saved 9million nil values :D

Now we have consolidated the 10 key values ranging to a billion within 10 index positions. This example clearly illustrates the need and power of hashing. Some other simple yet famous hash functions include kmod10, kmodn, folding, midsquare method and so on! An ideal hash function is that which produces unique hash values for different inputs quickly. No two key values should be hashed to the same index (hash value), but in practical scenarios, hash functions might sometimes produce the same hash value for different inputs which leads to collisions.

Thanks for reading.

--

--

Sharon Abishek
featurepreneur

Trainee Decision Scientist @ Mu Sigma. Data enthusiast and an Electronics engineer.