How a hash-based collection works?

Apoorvakapoor
Xebia Engineering Blog
3 min readNov 15, 2019
Photo by Patrick Fore on unsplash

Before explaining how hash-based collection works, I will explain what is hashing and how it came into the picture. So let’s start with what is hashing -

Wikipedia states The term “hash” offers a natural analogy with its non-technical meaning (to “chop” or “make a mess” out of something), given how hash functions scramble their input data to derive their output. In simple words, Hashing is a process of converting the input of any length into a fixed-size string of text(Hash value) using a mathematical function (Hash Function).

One of the most popular uses of hashing is in search algorithm implementations, It helps to narrow down our search. Let’s consider a problem - the book call number. In a library there are plenty of books, to find a book from the library in the first attempt hashing provides a solution, Each book in the library has a unique call number take it hashcode. A call number is like an address, it tells us where the book is located in the library. Hashing is a technique to make things more efficient by effectively narrowing down the search at the outset in O(1) time complexity.

To understand how hash-based collection works we need to understand -

What is hashCode();?

The hashCode() method is used to get the hash code of an object. hashCode() method of Object class returns the memory reference of an object in integer form. It is possible to provide your own implementation of hashCode().In HashMap, hashCode() helps to calculate the bucket and therefore calculate the index where the element is placed. This method must be overridden in every class which overrides equals() method.

Object.equals() - The equals() method is used to compare equality of two Objects.The java.lang.Object.equals(Object obj) indicates whether some other object is “equal to” this one.

Java Hashed Collection

HashMap is one of the hash-based collections. It stores data in key-value pair format. It uses a single linked list called Bucket to store data in HashMap. Each bucket contains the information mentioned below :

  • Hash
  • Key
  • Value
  • Next

HashMap uses put method to store data and perform following steps internally:

Map<String,String> map = new HashMap<>();map.put(“key1”,”value1”);

HashCode of the key1 is generated and used to determine the bucket that will be used to store the mapping. First, calculate the index by taking hashcode of the key1 and n (as the number of buckets or the size of an array).

index = hashCode(key) & (n-1)

Once a bucket is identified, A node is created with the hash value, key, value and null as the address of the next element, only if the same key is not present at the same index. If the same key is present at the same index it will override the value with the existing one and if keys are different this is called collision in HashMap. In this case, a linked list is formed at that bucket location and a new entry is stored as the next node. In Java 8 linked list is replaced by a balanced tree.

HashMap uses get method with the key objects to retrieve corresponding value from hash-based collection.

map.get(new Key(“key1”));

It calculates the hash code of key and then find its index by using the same formula, Then it goes to the particular index and compare the key present. If both are equal then return the value otherwise, check for the next element if it exists.

--

--