HashMap and its implementation in Java

Chinmay Venkata Sabbam
Published in art of coding · 6 min read · Sep 22, 2019

1. Hierarchy of HashMap in Java

Hierarchy of HashMap in the Collections Framework

2. Prerequisites for Internal Working of Hash Map

Before learning about the internal working of HashMap, we first need to know the highlights of the hashCode() method in Java and how hashing is implemented in Java.

2.1 Highlights of the hashCode() Method

  1. Every object in Java has a default hashCode() method, inherited from the Object class.
  2. The default implementation of hashCode() in the Object class returns distinct integers for distinct objects.
  3. When a class overrides hashCode() (as String does), the hash code depends on two parameters:

(1) the object's type

(2) the content within the object

4. If either of these parameters changes, the hash code of the object changes.

5. In the example below, the object type (String) and the content within the object are the same for both objects, so they produce the same hash code.

Hash code generation for string s1 and s2 Objects
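The observation can be checked directly: String overrides hashCode() to be content-based, so two distinct String objects with the same characters report the same hash code:

```java
public class StringHashDemo {
    public static void main(String[] args) {
        // Two distinct String objects with identical content
        String s1 = new String("JAVA");
        String s2 = new String("JAVA");

        // Different objects in memory...
        System.out.println(s1 == s2);                        // false
        // ...but the same content-based hash code
        System.out.println(s1.hashCode() == s2.hashCode()); // true
    }
}
```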

Advantage: every object can be reduced to a hash code, so operations on objects, such as distributing them into buckets and looking them up, can be performed efficiently based on the hash code.

2.2 Hashing In Java

  1. We create a number of buckets.
  2. Each bucket has a hash index, starting from zero.
  3. Each object participating in hashing is converted into a hash code.
  4. The hash index of each object is calculated using the formula below:
Hash Index = Hash Code % number of buckets

5. Objects that share the same hash index are stored in the same bucket (generally as a linked list). This is how hashing is done in Java.
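The steps above can be sketched in plain Java. This is a simplified illustration of the formula, using Math.floorMod so that negative hash codes still map to a valid bucket (the real HashMap uses bit-masking instead):

```java
public class HashingDemo {
    public static void main(String[] args) {
        int numberOfBuckets = 16;
        String[] keys = {"KING", "CLARK", "BLAKE"};
        for (String key : keys) {
            // Step 1: convert the object into a hash code
            int hashCode = key.hashCode();
            // Step 2: map the hash code to a bucket index
            int hashIndex = Math.floorMod(hashCode, numberOfBuckets);
            System.out.println(key + " -> bucket " + hashIndex);
        }
    }
}
```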

3. Internal Working of Hash Map

3.1 Node Structure in Hash Map

A node in HashMap consists of a hash code, key, value, and a reference to the next node (i.e., it internally uses a linked-list data structure).

Node structure in Hash Map
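A simplified sketch of that node, modeled on the HashMap.Node inner class in the OpenJDK source:

```java
// Simplified sketch of HashMap's internal node
class Node<K, V> {
    final int hash;   // cached hash code of the key
    final K key;
    V value;
    Node<K, V> next;  // next node in the same bucket (linked list)

    Node(int hash, K key, V value, Node<K, V> next) {
        this.hash = hash;
        this.key = key;
        this.value = value;
        this.next = next;
    }
}
```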

3.2 Buckets in Hash Map to implement Hashing technique

  1. By default, HashMap creates 16 buckets (its initial capacity), with hash indexes running from 0 to 15:
Node<K,V>[] table;

2. Buckets are used to store nodes. Two or more nodes can be stored in the same bucket; in that case, a linked-list structure connects the nodes.

3. The point at which a HashMap resizes is governed by the following relation:

threshold = capacity * load factor

The default initial capacity of HashMap is 16 and the default load factor is 0.75f (i.e., 75% of the current capacity). The load factor represents how full the HashMap may get before its capacity is doubled.

For example, 16 * 0.75 = 12. This means that once the map holds more than 12 key-value pairs, its capacity is doubled from 16 to 32.

4. A single bucket can hold more than one node; how many depends on the hashCode() method. The better your hashCode() implementation, the better your buckets will be utilized.
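The threshold arithmetic can be checked directly (HashMap does not expose its capacity or threshold publicly, so this only reproduces the numbers):

```java
public class ThresholdDemo {
    public static void main(String[] args) {
        int capacity = 16;        // default initial capacity
        float loadFactor = 0.75f; // default load factor

        // The resize threshold HashMap computes internally
        int threshold = (int) (capacity * loadFactor);
        System.out.println("threshold = " + threshold); // 12

        // Once the map holds more than `threshold` entries,
        // the table doubles to capacity 32.
    }
}
```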

3.3 Put Method Operation Implementation In Hash Map

3.3.1 Algorithm of the put operation

Algorithm for the Hash Map Put Implementation
  1. Input: Key (K), Value (V)
  2. Step 1: compute the hash code for the key you provided.
  3. Step 2: calculate the hash index for the key from its hash code using the formula below (internally, HashMap actually computes (n - 1) & hash, a bit-mask equivalent of the modulo for the power-of-two table sizes it uses):
hash index = hash code % (number of buckets)

4. Step 3: a node is formed from the key, hash code, value, and next-node reference.

5. Step 4: place the node at that hash index of the bucket table.
If nodes end up with the same hash index (collision nodes), they are chained as a linked list.

Note: HashMap can store a null key; it always takes hash index 0.
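The null-key behavior can be observed directly:

```java
import java.util.HashMap;
import java.util.Map;

public class NullKeyDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put(null, 100);   // null key is allowed, stored in bucket 0
        map.put(null, 200);   // a second put with null replaces the value

        System.out.println(map.get(null)); // 200
        System.out.println(map.size());    // 1
    }
}
```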

3.3.2 Example to Understand Put Algorithm

  1. Consider the following map:

Map<String,Integer> map = new HashMap<>();
map.put("KING",100);
map.put("CLARK",100);
map.put("BLAKE",100);
map.put("FORD",100);
map.put("SMITH",100);
map.put("WARD",100);
map.put("JONES",100);

2. The hash index is calculated from the hash code for every key:

Hash index calculated for every key

3. Nodes are formed and placed in the buckets based on their hash index:

Placing nodes based on the hash index

Observe that the keys KING, BLAKE, and WARD share the same hash index, 4. Nodes with the same hash index are called collision nodes, and they are chained as a linked list, as shown in the figure above.
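The indexes in the figure can be reproduced. Since JDK 8, HashMap first spreads the hash code with h ^ (h >>> 16) and then masks it with (n - 1), where n is the table length (16 by default); applying that formula shows KING, BLAKE, and WARD all landing in bucket 4:

```java
public class BucketIndexDemo {
    // Same spreading HashMap applies before masking (see HashMap.hash)
    static int bucketIndex(String key, int tableLength) {
        int h = key.hashCode();
        h = h ^ (h >>> 16);          // spread higher bits downward
        return h & (tableLength - 1); // mask to a bucket index
    }

    public static void main(String[] args) {
        String[] keys = {"KING", "CLARK", "BLAKE", "FORD", "SMITH", "WARD", "JONES"};
        for (String key : keys) {
            // KING, BLAKE, and WARD all print bucket 4
            System.out.println(key + " -> bucket " + bucketIndex(key, 16));
        }
    }
}
```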

3.4 GET Method Operation Implementation In Hash Map

3.4.1 Algorithm of the get operation

Get operation algorithm
  1. Step 1: Input: Key (K); compute the hash code for the key you provided.
  2. Step 2: calculate the hash index for the key from its hash code using the formula below:
hash index = hash code % (number of buckets)

3. Step 3: go to that hash index and compare the computed hash code of the key to the hash code of the node present at that hash index of the bucket.

4. Step 4: if the hash codes match, compare the contents using the equals() method; if they are equal, return the value.

5. Step 5: if they do not match, move on to the next node in the linked list at the same hash index, until a matching node is found or the list ends (in which case null is returned).

3.4.2 Example to Understand get Algorithm

  1. Consider the same map we used to understand the put algorithm.
  2. If we fetch a key with map.get("WARD"), it calculates the hash code and hash index of the WARD key, which gives hash index 4. It compares the hash code of the WARD key against the KING and BLAKE nodes first; neither hash code matches. It then compares against the WARD node, finds a match, and returns the value.
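Exercising that lookup on the colliding keys (the bucket traversal happens inside HashMap; from the outside we only observe the returned values):

```java
import java.util.HashMap;
import java.util.Map;

public class GetDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("KING", 100);
        map.put("BLAKE", 100);
        map.put("WARD", 100);  // all three collide in the same bucket

        // get() walks the bucket's list, comparing hash codes and then
        // equals(), until it finds the WARD node
        System.out.println(map.get("WARD"));   // 100
        System.out.println(map.get("MILLER")); // null (no such key)
    }
}
```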

4. Improvements in Java 8

1. When too many unequal keys produce the same hash index, the number of nodes in that particular bucket increases.

2. When the number of nodes in a bucket grows beyond a threshold (TREEIFY_THRESHOLD = 8), the content of that bucket switches from a linked list of entry objects to a balanced tree (a red-black tree).

3. The balanced tree improves the worst-case performance from O(n) to O(log n).

4. Nodes are distributed in the balanced tree based on their hash code values.

5. If two hash codes are equal, the keys are ordered using compareTo() when they implement Comparable; otherwise, a tie-breaking rule based on class name and identity hash code is applied.

Nodes distributed based on hash code values in the balanced tree
Sample example of a balanced tree
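Treeification can be provoked with a key type whose hashCode() always returns the same value. BadKey below is a hypothetical class written only for illustration; implementing Comparable lets HashMap order the tree when hash codes tie:

```java
import java.util.HashMap;
import java.util.Map;

public class TreeifyDemo {
    // Hypothetical key whose hash code always collides
    static class BadKey implements Comparable<BadKey> {
        final int id;
        BadKey(int id) { this.id = id; }

        @Override public int hashCode() { return 42; } // every key -> same bucket
        @Override public boolean equals(Object o) {
            return o instanceof BadKey && ((BadKey) o).id == id;
        }
        // Comparable lets HashMap order the tree when hash codes tie
        @Override public int compareTo(BadKey other) {
            return Integer.compare(id, other.id);
        }
    }

    public static void main(String[] args) {
        Map<BadKey, Integer> map = new HashMap<>();
        // Well past TREEIFY_THRESHOLD (8): once the table is large enough,
        // the bucket converts from a linked list to a red-black tree,
        // yet the map behaves identically from the outside
        for (int i = 0; i < 20; i++) {
            map.put(new BadKey(i), i);
        }
        System.out.println(map.get(new BadKey(7))); // 7
        System.out.println(map.size());             // 20
    }
}
```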

5. Performance of Hash Map

The performance of HashMap depends on its initial capacity and load factor.

We can tune a HashMap through the constructors it provides.

HashMap provides 4 constructors, each with public access:

  1. HashMap(): the default constructor; creates a HashMap instance with initial capacity 16 and load factor 0.75.
  2. HashMap(int initialCapacity): creates a HashMap instance with the specified initial capacity and load factor 0.75.
  3. HashMap(int initialCapacity, float loadFactor): creates a HashMap instance with the specified initial capacity and the specified load factor.
  4. HashMap(Map map): creates a HashMap instance with the same mappings as the specified map.
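All four constructors in use:

```java
import java.util.HashMap;
import java.util.Map;

public class ConstructorDemo {
    public static void main(String[] args) {
        // 1. Default: capacity 16, load factor 0.75
        Map<String, Integer> m1 = new HashMap<>();

        // 2. Custom initial capacity, default load factor 0.75
        Map<String, Integer> m2 = new HashMap<>(64);

        // 3. Custom initial capacity and custom load factor
        Map<String, Integer> m3 = new HashMap<>(64, 0.5f);

        // 4. Copy the mappings of an existing map
        m1.put("KING", 100);
        Map<String, Integer> m4 = new HashMap<>(m1);
        System.out.println(m4.get("KING")); // 100
    }
}
```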

6. Impact of the Load Factor

1. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. The default value is 0.75.

2. The default load factor (0.75) offers a good trade-off between time and space costs.

3. Higher values decrease the space overhead but increase the lookup cost (reflected in most operations of the HashMap class, including get and put).

4. The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations.

5. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.
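Applying that rule: to store an expected number of entries without any rehash, choose an initial capacity just above expected / loadFactor. A common sizing idiom (the variable names here are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

public class SizingDemo {
    public static void main(String[] args) {
        int expectedEntries = 1000;
        float loadFactor = 0.75f;

        // initialCapacity > expectedEntries / loadFactor => no rehashing
        int initialCapacity = (int) (expectedEntries / loadFactor) + 1;
        Map<String, Integer> map = new HashMap<>(initialCapacity, loadFactor);

        System.out.println("initialCapacity = " + initialCapacity); // 1334
        for (int i = 0; i < expectedEntries; i++) {
            map.put("key" + i, i);
        }
        System.out.println(map.size()); // 1000
    }
}
```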
