How HashMap Internally Works in Java

Liberatoreanita
6 min readFeb 27, 2023

--

Here we are to talk about HashMap in Java!

In this article I will analyze HashMap in Java and in particular:
1. What happens internally when creating a HashMap;
2.
Analyze put() method and how works internally in HashMap;
3. Analyze collision situation with diagrams;

What is HashMap?

The HashMap class of the Java collections framework provides the functionality of the hash table data structure. It stores elements in key/value pairs. Here, keys are unique identifiers used to associate each value on a map. The HashMap class implements the Map interface. HashMap extends AbstractMap! HashMap uses a technique called Hashing.

What is Hashing?

Hashing in Java HashMap is a technique used to efficiently store and retrieve key-value pairs. It involves computing a hash code for the key object and then using this hash code to locate the corresponding value object in the HashMap.

When a key-value pair is added to the HashMap, the key’s hash code is computed using the key object’s hashCode() method. This hash code is then used to determine the index of the bucket where the key-value pair should be stored. If multiple key-value pairs have the same hash code, they are stored in the same bucket as a linked list.

When a value is to be retrieved from the HashMap, the key’s hash code is computed again and used to locate the bucket where the key-value pair might be stored. If there is a linked list at that bucket, then the actual key object is compared against each key in the linked list using the equals() method until a match is found. Once a match is found, the corresponding value is returned.

Let’s talk about DEFAULT_INITIAL_CAPACITY of HashMap. Capacity is the number of buckets in the HashMap. The initial capacity is the capacity at the time the Map is created. Finally, the default initial capacity of the HashMap is 16. We can found DEFAULT_INITIAL_CAPACITY in HashMap class.

static final int DEFAULT_INITIAL_CAPACITY = 1 << 4; // aka 16

The result is this, from 0 to 15 bucket to store key-value pair.

If Object key is null, the value of the hash method is 0 otherwise the value can be similar to 80247871 generated by → h = key.hashCode()) ^ (h >>> 16

static final int hash(Object key) {
int h;
return (key == null) ? 0 : (h = key.hashCode()) ^ (h >>> 16);
}

As the number of elements in the HashMap increases, the capacity is expanded. The load factor is the measure that decides when to increase the capacity of the Map. The default load factor is 75% of the capacity.

static final float DEFAULT_LOAD_FACTOR = 0.75f;

So, let’s now analyze how to create a HashMap and insert elements.

HashMap<String, Integer> map = new HashMap<>();
map.put("One", 1);
map.put("Two", 2);
map.put("Three", 3);

When we call the put() method, then it calculates the hash code of the Key “One.” Suppose the hash code of “One” is 84525 (remember that it was calculated earlier in the hash method). To store the Key in memory, we have to calculate the index.

When we call put() method, there is another method called putVal():

public V put(K key, V value) {
return putVal(hash(key), key, value, false, true);
}
final V putVal(int hash, K key, V value, boolean onlyIfAbsent,
boolean evict) {
Node<K,V>[] tab; Node<K,V> p; int n, i;
if ((tab = table) == null || (n = tab.length) == 0)
n = (tab = resize()).length;
if ((p = tab[i = (n - 1) & hash]) == null)
tab[i] = newNode(hash, key, value, null);
else {
Node<K,V> e; K k;
if (p.hash == hash &&
((k = p.key) == key || (key != null && key.equals(k))))
e = p;
else if (p instanceof TreeNode)
e = ((TreeNode<K,V>)p).putTreeVal(this, tab, hash, key, value);
else {
for (int binCount = 0; ; ++binCount) {
if ((e = p.next) == null) {
p.next = newNode(hash, key, value, null);
if (binCount >= TREEIFY_THRESHOLD - 1) // -1 for 1st
treeifyBin(tab, hash);
break;
}
if (e.hash == hash &&
((k = e.key) == key || (key != null && key.equals(k))))
break;
p = e;
}
}
if (e != null) { // existing mapping for key
V oldValue = e.value;
if (!onlyIfAbsent || oldValue == null)
e.value = value;
afterNodeAccess(e);
return oldValue;
}
}
++modCount;
if (++size > threshold)
resize();
afterNodeInsertion(evict);
return null;
}

HashMap contains an array of the nodes, and the node is represented as a class. It uses an array and LinkedList data structure internally for storing Key and Value. There are four fields in HashMap. At the beginning Node<K,V>[] tab is empty and has 0 to 15 buckets.

So, HashMap contains an array of Node and Node can represent a class having the following objects :

  1. int hash
  2. K key
  3. V value
  4. Node next

Result in debug mode:

At the beginning, without any element on HashMap, all buckets are null!

Inside putVal() method, we can found index calculation to store the Key in memory.

i = (n - 1) & hash
i = (16 (Default initial capacity) -1) & 84525 (calculate by hash method)

So, i = 13 (in this case)

This means that, in index 13 we will have a situation like this:

At index 13 we have first Node that contains following data:

int hash → 84525 calculate by hash method

Hashing is the process of mapping an object or a primitive data type to some representative integer value using hashing algorithms. In Java, a hash code is an integer value that can be computed for all objects.

For example:

K key→ “One”, our first element to store inside our map as a key.

map.put("One", 1);

V value → 1, value for key“One” is 1

Next is pointer to next node if index calculation will be the same as the existing node. This is in case of collision. A collision, or more specifically, a hash code collision in a HashMap, is a situation where two or more key objects produce the same final hash value and hence point to the same bucket location or array index.

Let’s make an example of probably collision!

Assume that when we put inside our map map.put(“Two”, 2), the index value calculated by i = (n — 1) & hash is equal to 13. We already have bucket 13 occupied, in which case, the next pointer will have a new Node containing the key-value pair we need to insert, in this case two,2.

Our diagram will be:

Conclusion

The complexity of the put() method in a Java HashMap depends on the internal operations that the HashMap needs to perform to insert an element. Generally, the complexity of put() in a HashMap is O(1) on average, which means that the operation takes constant time, regardless of the size of the map.

However, in rare cases, the complexity of put() could become O(n), where n is the number of elements in the map, due to collisions between the keys of the elements. In these cases, the HashMap must create a linked list of elements with the same hash key, and searching for the correct element will take time proportional to the length of the list. However, if the hash function is well-implemented and the size of the HashMap is sufficiently large, collisions will be rare, and the complexity of put() will remain O(1) on average.

That’s all for this article!

If you have come this far in the following reading, all I can do is say thank you and see you in the next article.

Bye!!

--

--