Frequently Asked Java Concept Programs-Part3-HashMap Internal working

Published in

Javarevisited

13 min readMar 16, 2024

Dear Readers , this is the most common java interview question. If you consider yourself an experienced java developer , knowing hashmap internal working is very important.

In this article we explore what are the possible questions interviewer might expect.

Why to overload equals() and hashcode() method?
What happens if we dont override equals or hashcode method?
what are the Java8 changes to Hashmap?
When to use Hashmap over other collection?
Why Hashmap allows null key?
How to avoid collision in Hashmap?
Can HashMap be used in Multi-Threaded environment?
Why should the key be immutable in HashMap?

Hashing:

Hashing is a process of converting an object into integer form by using the method hashCode().This helps in indexing and for faster search.

Hashmap:

Its a part of Java collection framework.It uses a technique called Hashing.It extends AbstractMap that implements Map interface.Its implementation is based on Hashtable datastructure.

It stores the data in the pair of Key and Value. HashMap contains an array of the nodes, and the node(inner static class) is represented as a class. It uses an array and LinkedList data structure internally for storing Key and Value. There are four fields in HashMap.

Buckets: Array of the node is called buckets. Each node has a data structure like a LinkedList. More than one node can share the same bucket. It may be different in capacity.

Methods of HashMap

The initial capacity of HashMap is 16 and where as load factor is 0.75.

The initial capacity is the number of buckets in the hashmap and the load factor is a measure of how full the hash map is allowed to get before its capacity is automatically increased.

When we try to put a Key and Value within HashMap, first the equals method will check whether both objects are equal and then it hashes and find the bucket where to insert this Key and Value. If both the objects are equal, it will override the already existing Object in Bucket.

equals(): It checks the equality of two objects. It compares the Key, whether they are equal or not. It is a method of the Object class. It can be overridden. If you override the equals() method, then it is mandatory to override the hashCode() method.

hashCode(): This is the method of the object class. It returns the memory reference of the object in integer form. The value received from the method is used as the bucket number. The bucket number is the address of the element inside the map.
Note : Hash code of null Key is 0.

put(key, value)- Inserting a Key-Value Pair

When we initialize a HashMap, the Buckets or array are initialized. Calling put(key, value) to insert the content following things happens.

It generates the hash of the key object with the inbuilt hash function to determine the index number or the bucket. It generally looks like this.

index = keyObject.hashCode() & (numberOfBuckets -1)

Note: The Hash code of the key may be large enough to create an array. hash code generated may be in the range of integer and if we create arrays for such a range, then it will easily cause outOfMemoryException. So we generate an index to minimize the size of the array.The default size of the hashmap is 16.

2. The index is what the bucket is. Here we have used hashCode() to find index.

3. If the index or bucket is empty, we add the key, value from put, into the LinkedList of that bucket as Entry<K, V>.

4. Now, if the bucket is not empty, then we traverse or loop through the LinkedList of that bucket. This is called a hash collision.

5. For each keyObject in the list, we call its equals(newKeyObject).

6. If it returns true, then this Entry<K, V> will be replaced by the new Entry.

7. If it reached the end of the list, it gets added there i.e the new Entry<K, V> is get added at the end of the LinkedList for that bucket.

8. The procedure gets repeated for each put method.

get(key)- Retrieve the Value

HashMap is known for O(1) time complexity for retrieving the data, however, that is not entirely true. The O(1) is for the average-case scenario, not the worst case. It all depends on the hash function being used. If it is bad and gives the same bucket for each key, then you are having the time complexity of a LinkedList. That’s why it is very important to write a better hashCode and equals method implementation. If possible, always use wrapper classes for the keys. It is much more efficient and you don’t need to worry about poor implementation.

We have to call get(key), and it will return the value associated with that key. The process is similar to putting except for replacing or adding.

Determine the bucket from hashCode() using the hash function we used in put() method.
If empty returns null.
Else, it loops through the list of that bucket.
For each Entry <K, V> in the list, it calls the equals() method. If equals() returns true, then it returns this Entry’s value.
If no key is matched then returns null.

To summarise, equals() method helps us to identify if the object is unique and hashcode() method helps us to identify the bucket in which the values has to be Stored.

Let us go through an example to understand the working

Case1 : equals() and hashcode()- not over-ridden:

package collection;

import java.util.HashMap;

class Employee {

    String name;

    public Employee(String name) {
        this.name = name;
    }

}
public class HashMapWorking {
    public static void main(String args[]) {
        Employee emp1 = new Employee("One");
        Employee emp2 = new Employee("One");

        HashMap<Employee, String> hm = new HashMap<>();

        hm.put(emp1, "One");
        hm.put(emp2, "Two");

        System.out.println("Both Objects are Equal: "+emp1.equals(emp2));
        System.out.println("Employee 1 Hashcode: "+emp1.hashCode());
        System.out.println("Employee 2 Hashcode: "+emp2.hashCode());
        hm.forEach((k, v) -> System.out.println("Key is: " + k + " Value is: " + v));
    }
}

output:
Both Objects are Equal: false
Employee 1 Hashcode: 1523554304
Employee 2 Hashcode: 1175962212
Key is: collection.Employee@4617c264 Value is: Two
Key is: collection.Employee@5acf9800 Value is: One

Though both objects are equal() since we haven’t overridden the equals(), so it considers both the objects as unique keys. We also haven’t overridden the hashcode(), so even when both objects are same, their hashcode is different.

It compares two objects (eventhough they are equals, since we have not overridden the equals()) it shows both the objects are not equals and will be considered as Unqiue Keys and with values

Case 2: Override equals() and Not hashcode:

class Employee {

    String name;

    public Employee(String name) {
        this.name = name;
    }
    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Employee)) {
            return false;
        }
        Employee other = (Employee) obj;
        return Objects.equals(name, other.name);
    }

}
public class HashMapWorking {
    public static void main(String args[]) {
        Employee emp1 = new Employee("One");
        Employee emp2 = new Employee("One");

        HashMap<Employee, String> hm = new HashMap<>();

        hm.put(emp1, "One");
        hm.put(emp2, "Two");

        System.out.println("Both Objects are Equal: "+emp1.equals(emp2));
        System.out.println("Employee 1 Hashcode: "+emp1.hashCode());
        System.out.println("Employee 2 Hashcode: "+emp2.hashCode());
        hm.forEach((k, v) -> System.out.println("Key is: " + k + " Value is: " + v));
    }
}

Both Objects are Equal: true
Employee 1 Hashcode: 1523554304
Employee 2 Hashcode: 1175962212
Key is: collection.Employee@4617c264 Value is: Two
Key is: collection.Employee@5acf9800 Value is: One

After over-riding equals method, the objects are considered equal.

Although both the objects are equal() the hashcode() is different and so the both objects will be stored in different buckets

Case3: Over-ride hashcode and not equals() method:

package collection;

import java.util.HashMap;
import java.util.Objects;

class Employee {

    String name;

    public Employee(String name) {
        this.name = name;
    }
    @Override
    public int hashCode() {
        return Objects.hash(name);
    }

}
public class HashMapWorking {
    public static void main(String args[]) {
        Employee emp1 = new Employee("One");
        Employee emp2 = new Employee("One");

        HashMap<Employee, String> hm = new HashMap<>();

        hm.put(emp1, "One");
        hm.put(emp2, "Two");

        System.out.println("Both Objects are Equal: "+emp1.equals(emp2));
        System.out.println("Employee 1 Hashcode: "+emp1.hashCode());
        System.out.println("Employee 2 Hashcode: "+emp2.hashCode());
        hm.forEach((k, v) -> System.out.println("Key is: " + k + " Value is: " + v));
    }
}

Both Objects are Equal: false
Employee 1 Hashcode: 79461
Employee 2 Hashcode: 79461
Key is: collection.Employee@13665 Value is: One
Key is: collection.Employee@13665 Value is: Two

Since we have overridden only hashcode() and not equals method() ,the objects comparison becomes false,which means objects are unique.

So even though the hashcode points to same bucket, the objects are considered as unique keys and both the values will be stored in the bucket.

Case 4: override both equals() and hashcode():

package collection;

import java.util.HashMap;
import java.util.Objects;

class Employee {

    String name;

    public Employee(String name) {
        this.name = name;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
    @Override
    public int hashCode() {
        return Objects.hash(name);
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (!(obj instanceof Employee)) {
            return false;
        }
        Employee other = (Employee) obj;
        return Objects.equals(name, other.name);
    }

}
public class HashMapWorking {
    public static void main(String args[]) {
        Employee emp1 = new Employee("One");
        Employee emp2 = new Employee("One");

        HashMap<Employee, String> hm = new HashMap<>();

        hm.put(emp1, "One");
        hm.put(emp2, "Two");

        System.out.println("Both Objects are Equal: "+emp1.equals(emp2));
        System.out.println("Employee 1 Hashcode: "+emp1.hashCode());
        System.out.println("Employee 2 Hashcode: "+emp2.hashCode());
        hm.forEach((k, v) -> System.out.println("Key is: " + k + " Value is: " + v));
    }
}

Both Objects are Equal: true
Employee 1 Hashcode: 79461
Employee 2 Hashcode: 79461
Key is: collection.Employee@13665 Value is: Two

Since we have overridden the equals() and hashcode(), when inserting in map — It has identified both objects are equal and both has same hashcode values. So when trying to insert into the bucket, only one value will be inserted that is the reason we have only one element in hashmap.

Case 5: HashMap key Immutability

The hashcode helps in calculating the bucket position for storing the key-value pair in the Map. Different hashcode values may be referring to the different bucket locations.

If, accidentally, the hashcode of the key object changes after we have put a key-value pair in the map, then it’s almost impossible to fetch the value object back from the map because we don’t know in which bucket we had put the key-value in past. The old key-value pair is not reachable, and so It is a case of a memory leak.

On runtime, JVM computes hashcode for each object and provides it on demand. When we modify an object’s state, JVM set a flag that the object is modified and hashcode must be AGAIN computed. So, next time you call the object’s hashCode() method, JVM recalculates the hashcode for that object.

Immutability ensures that we will get the same hashcode every time, for a key object.

Immutable classes like String, Integer or other wrapper classes are a good key object candidate.

Note: Immutability is recommended and not mandatory

package collection;

import java.util.HashMap;
import java.util.Map;

public class ImmutableKeyTest {
    public static void main(String[] args) {
        Map<Employee, String> employeeMap = new HashMap<>();

        Employee emp1 = new Employee(1, "John");
        employeeMap.put(emp1, "Engineer");
        
        emp1.setId(100);  // Changing the ID

        String position = employeeMap.get(emp1);
        System.out.println("Position: " + position);  // Output: null
    }
}

class Employee {
    private int id;
    private String name;

    public Employee(int id, String name) {
        this.id = id;
        this.name = name;
    }

    public int getId() {
        return id;
    }

    public void setId(int id) {
        this.id = id;
    }

    public String getName() {
        return name;
    }

    public void setName(String name) {
        this.name = name;
    }
    
    @Override
    public int hashCode() {
        return id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) {
            return true;
        }
        if (obj == null || getClass() != obj.getClass()) {
            return false;
        }
        Employee other = (Employee) obj;
        return id == other.id;
    }
}

output:

Position: null

Hash Collision

A collision will occur when two different keys have the same hashCode, which can happen because two unequal objects in Java can have the same hashCode.

HashMap handles collision by using a linked list to store map entries ended up in same array location or bucket location.

Re-Hashing:

Whenever the number of entries in the hashmap crosses the threadshold value, then the bucket size of the hashmap is doubled and re-hashing is done.The already existing entries of the map are copied and new entries are added to this increased hashmap.

Threshold value = Bucket size * Load factor

Ex1: If the bucket size is 16 and the load factor is 0.75 then the threshold value is 12.

Ex2:

The bucket size is 16 initially which can grow to 32 when the number of entries on the map crosses the 75% which means inserting in 12 buckets, bucket size becomes 32.

HashMap allows null key and multiple null values:

Java does not call the hashcode method on the null key, instead it stores this null key in a default bucket (say bucket 0). Now when we try to get value at null key from HashMap, Java returns the value at null key in bucket 0.

Benefits of Rehashing:

Rehashing reduces the likelihood of collisions by increasing the number of buckets, spreading the key-value pairs more evenly across the hash map.
It allows the hash map to maintain its efficiency in terms of constant-time average complexity for basic operations (insertion, retrieval, deletion).

Java8 changes to HashMap

Before Java8, singly-linked lists was the data structure used for storing nodes in hash map buckets.
However, a significant change was introduced in Java 8, where the implementation shifted to self-balancing binary search trees (BST) once a specific threshold was crossed(static final int TREEIFY_THRESHOLD = 8;).
The rationale behind this change was to address performance issues related to lookup time in hash map buckets.
Using singly-linked lists for storage could lead to a worst-case time complexity of O(n) for lookups.
The typical binary search trees could result in uneven distribution of nodes, where one subtree becomes significantly larger than the other subtree leading to unbalanced struture resulting in look up to o(n) complexity.
Red-black trees and AVL trees, known as self-balancing trees, were developed to mitigate these issues by maintaining balance in the tree structure.
With the adoption of self-balancing binary search trees in Java 8, even scenarios where all items hash to the same bucket would ensure a worst-case lookup time of O(log n).

Java8 changes in HashMap — HashMap changes in Java8

Time and Space complexity of HashMap

In case of evenly distributed hashmap, the time complexity for insert, search and delete has o(1) complexity.

In worst case scenario, where all the entries goes to the same bucket,the time complexity is o(n)

In case when reached the threshold is reached to convert from linked list to a self-balancing binary search tree, the time complexity will be reduced to o(log n).

The space complexity is typically O(n). Where n is the number of key-value pairs stored in the hash map. Each key-value pair occupies constant space, and the space required grows linearly with the number of elements.

Why to select HashMap over other collection?

Provides constant time performance for get and put operation assuming the elements are placed well among all the buckets.

Allows one null key and multiple null values.

if the elements doesnt need ordering , then hashmap is a good option.

Can HashMap be used in Multi-Threaded environment?

HashMap can’t be used for multithreading because it is not Thread-safe.

In HashMap, if one thread is iterating over an object, another thread is trying to access the same object, it throws ConcurrentModificationException.

package concepts.threading;


import java.util.*;
import java.util.concurrent.*;


class HashMapTest extends Thread {
    static HashMap m = new HashMap();

    public void run()
    {
        try {

            Thread.sleep(2000);
        }
        catch (InterruptedException e) {
        }
        System.out.println("Child Thread updating Map");
        m.put(3, "Value C");
    }

    public static void main(String arg[])
            throws InterruptedException
    {

        m.put(1, "Value A");
        m.put(2, "Value B");

        HashMapTest t = new HashMapTest();

        t.start();

        Set s1 = m.keySet();

        Iterator itr = s1.iterator();

        while (itr.hasNext()) {

            Integer I1 = (Integer)itr.next();
            System.out.println(
                    "Main Thread Iterating Map and Current Entry is:"
                            + I1 + "..." + m.get(I1));
            Thread.sleep(3000);
        }

        System.out.println(m);
    }
}

output:

Main Thread Iterating Map and Current Entry is:1...Value A
Child Thread updating Map
Exception in thread "main" java.util.ConcurrentModificationException
 at java.base/java.util.HashMap$HashIterator.nextNode(HashMap.java:1597)
 at java.base/java.util.HashMap$KeyIterator.next(HashMap.java:1620)
 at concepts.threading.HashMapTest.main(HashMapTest.java:43)

To avoid this , we can use concurrentHashMap.

In ConcurrentHashMap while one thread is iterating the remaining threads are allowed to perform any modification in a safe manner. In the above program Main thread is updating Map, at the same time child thread is also trying to update the Map object. This Program will not throw ConcurrentModificationException.

package collection;

import java.util.*;
        import java.util.concurrent.*;

class ConcurrentHashMapTest extends Thread {

    static ConcurrentHashMap<Integer, String> m
            = new ConcurrentHashMap<Integer, String>();

    public void run()
    {

        try {

            Thread.sleep(2000);
        }
        catch (InterruptedException e) {
        }

        System.out.println("Child Thread updating Map");
        m.put(3, "Value C");
    }

    public static void main(String arg[])
            throws InterruptedException
    {

        m.put(1, "Value A");
        m.put(2, "Value B");

        ConcurrentHashMapTest t = new ConcurrentHashMapTest();

        t.start();

        Set<Integer> s1 = m.keySet();

        Iterator<Integer> itr = s1.iterator();


        while (itr.hasNext()) {

            Integer I1 = itr.next();

            System.out.println(
                    "Main Thread Iterating Map and Current Entry is:"
                            + I1 + "..." + m.get(I1));
            Thread.sleep(3000);
        }

        System.out.println(m);
    }
}

output:

Main Thread Iterating Map and Current Entry is:1...Value A
Child Thread updating Map
Main Thread Iterating Map and Current Entry is:2...Value B
Main Thread Iterating Map and Current Entry is:3...Value C
{1=Value A, 2=Value B, 3=Value C}

Thank you for reading this article. Please provide your valuable suggestions/ feedback.
Clap and Share if you liked the content.
📰 Read more content on my Medium (on Java Developer interview questions)
🔔 Follow me on : LinkedIn

Please find my other useful articles on Java Developer interview questions

Following are some of the famously asked Java8 Interview Questions

Frequently asked Java Programs

Dear Readers, these are the commonly asked java programs to check your ability on writing the logic

SpringBoot Interview Questions | Medium

Rest and Microservices Interview Questions| Medium

Frequently Asked Java Concept Programs-Part3-HashMap Internal working

Written by Veenarao