HashMap in Java — Part 1

Devesh K Chaubey
4 min read · Jun 23, 2022


Recently, many interviewers have been asking about the internal working mechanism of HashMap; it has become a common question in Java interviews.

In this article we will try to understand and answer the following questions:

  1. What is HashMap
  2. Does it allow null keys/values
  3. Does it allow duplicate keys
  4. Does it maintain the order of inserted elements
  5. Is HashMap thread safe
  6. Why it should be avoided in multi-threaded environments, and the solution
  7. HashMap vs ConcurrentHashMap
  8. HashMap vs Hashtable
  9. What are Capacity and Load Factor
  10. Common methods and their time complexities
  11. Internal implementation

What is HashMap?

A HashMap is part of Java’s Collections Framework. It stores items as key/value pairs: each value is stored under a key, and that key is the index used to retrieve it. Internally it is backed by a hash table data structure (not the legacy java.util.Hashtable class).

java.util

public class HashMap<K,V>
extends AbstractMap<K,V>
implements Map<K,V>, Cloneable, Serializable

Type Parameters:
K - the type of keys maintained by this map
V - the type of mapped values

All Implemented Interfaces:
Serializable, Cloneable, Map<K,V>

Direct Known Subclasses:
LinkedHashMap, PrinterStateReasons
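For context, here is a minimal usage sketch (the class and variable names are illustrative, not from the Javadoc):

import java.util.HashMap;
import java.util.Map;

public class HashMapBasics {
    public static void main(String[] args) {
        Map<String, Integer> ages = new HashMap<>();
        ages.put("Alice", 30);                        // insert a key/value pair
        ages.put("Bob", 25);
        System.out.println(ages.get("Alice"));        // 30
        System.out.println(ages.containsKey("Bob"));  // true
        ages.remove("Bob");                           // delete an entry by key
        System.out.println(ages.size());              // 1
    }
}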

Does it allow NULL Keys/Values?

Yes: it allows at most one null key and any number of null values.
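A small sketch of both rules (names are illustrative):

import java.util.HashMap;
import java.util.Map;

public class NullsInHashMap {
    public static void main(String[] args) {
        Map<String, String> map = new HashMap<>();
        map.put(null, "first");   // a single null key is allowed
        map.put(null, "second");  // reusing the null key just replaces its value
        map.put("a", null);       // null values are allowed...
        map.put("b", null);       // ...under any number of keys
        System.out.println(map);  // e.g. {null=second, a=null, b=null}
    }
}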

Does it allow duplicate Keys?

We can call put with a key that already exists, but instead of creating a duplicate entry it overwrites the previous value.
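A quick sketch; note that put returns the value it replaced (or null if the key was new):

import java.util.HashMap;
import java.util.Map;

public class DuplicateKeys {
    public static void main(String[] args) {
        Map<String, Integer> scores = new HashMap<>();
        scores.put("alice", 1);
        Integer old = scores.put("alice", 2);     // overwrites; returns 1
        System.out.println(old);                  // 1
        System.out.println(scores.get("alice"));  // 2
        System.out.println(scores.size());        // still 1 entry
    }
}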

Does it maintain order of its inserted elements?

It does not preserve the order of insertion of entries into the map.
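A short illustration; if insertion order matters, LinkedHashMap (a direct subclass, as listed above) preserves it:

import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

public class IterationOrder {
    public static void main(String[] args) {
        Map<String, Integer> plain = new HashMap<>();
        plain.put("one", 1);
        plain.put("two", 2);
        plain.put("three", 3);
        // Order depends on the keys' hashes, not insertion order,
        // and can even change after a rehash.
        System.out.println(plain);

        Map<String, Integer> ordered = new LinkedHashMap<>();
        ordered.put("one", 1);
        ordered.put("two", 2);
        ordered.put("three", 3);
        System.out.println(ordered);  // {one=1, two=2, three=3}
    }
}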

Is HashMap thread safe?

No, HashMap is not thread safe. For concurrent access we can use a synchronized wrapper or ConcurrentHashMap, as discussed below.

Why Should HashMap Not Be Used in Multi-threaded Environments? The Solution

It is a bug for multiple threads to use a non-synchronized collection (really, any mutable class) in an unprotected manner. Certainly, if each thread has its own HashMap instance, there is no issue. It is a problem when multiple threads add to the same HashMap instance without synchronization. Even if just one thread modifies a HashMap while other threads read from that same map without synchronization, you will run into problems.

For example, when a new entry is inserted into a HashMap that is being iterated, the iterator over its keys fails. Iterators on Collection objects are fail-fast: any modification of the internal structure (e.g., a rehash) or of the number of entries triggers an exception, and hence ConcurrentModificationException is thrown. Internally, HashMap keeps a counter of structural modifications (modCount), and the iterator checks it each time you call its next() function to get the next entry.
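The failure is easy to reproduce even with a single thread (a minimal sketch):

import java.util.HashMap;
import java.util.Map;

public class FailFastDemo {
    public static void main(String[] args) {
        Map<String, Integer> map = new HashMap<>();
        map.put("a", 1);
        map.put("b", 2);
        for (String key : map.keySet()) {
            // Structural modification while iterating: the iterator's
            // modCount check fails on its next call to next().
            map.put("c", 3);  // throws ConcurrentModificationException
        }
    }
}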

Solution: if you need to use the same hash table object from multiple threads, consider using ConcurrentHashMap, wrapping each access to the HashMap in a synchronized block, or using the Collections.synchronizedMap(new HashMap<...>()) wrapper.
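Sketches of two of these options (names are illustrative):

import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

public class ThreadSafeOptions {
    public static void main(String[] args) {
        // Option 1: a map designed for concurrent access.
        Map<String, Integer> concurrent = new ConcurrentHashMap<>();
        concurrent.put("a", 1);

        // Option 2: wrap a HashMap so every method call locks one shared monitor.
        Map<String, Integer> synced = Collections.synchronizedMap(new HashMap<>());
        synced.put("a", 1);

        // Caveat: iterating a synchronizedMap still requires manual locking.
        synchronized (synced) {
            for (Map.Entry<String, Integer> e : synced.entrySet()) {
                System.out.println(e.getKey() + "=" + e.getValue());
            }
        }
    }
}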

HashMap vs ConcurrentHashMap
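In short:

  1. Thread safety: HashMap is not thread safe; ConcurrentHashMap is safe for use by multiple threads.
  2. Nulls: HashMap allows one null key and null values; ConcurrentHashMap allows neither.
  3. Iterators: HashMap's iterators are fail-fast; ConcurrentHashMap's are weakly consistent and do not throw ConcurrentModificationException.
  4. Performance: in single-threaded code HashMap is slightly faster; under concurrent load ConcurrentHashMap scales better because it synchronizes at a much finer granularity than one global lock.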

HashMap vs Hashtable?
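In short:

  1. Hashtable is a legacy class from JDK 1.0 whose public methods are all synchronized; HashMap (JDK 1.2) is unsynchronized.
  2. HashMap allows one null key and null values; Hashtable allows neither null keys nor null values.
  3. HashMap is generally faster in single-threaded code because it takes no locks.
  4. For new code that needs thread safety, ConcurrentHashMap is preferred over Hashtable.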

Capacity and Load Factor?

An instance of HashMap has two parameters that affect its performance: initial capacity and load factor. The capacity is the number of buckets in the hash table, and the initial capacity is simply the capacity at the time the hash table is created. The load factor is a measure of how full the hash table is allowed to get before its capacity is automatically increased. When the number of entries in the hash table exceeds the product of the load factor and the current capacity, the hash table is rehashed (that is, internal data structures are rebuilt) so that the hash table has approximately twice the number of buckets.

As a general rule, the default load factor (.75) offers a good tradeoff between time and space costs. Higher values decrease the space overhead but increase the lookup cost (reflected in most of the operations of the HashMap class, including get and put). The expected number of entries in the map and its load factor should be taken into account when setting its initial capacity, so as to minimize the number of rehash operations. If the initial capacity is greater than the maximum number of entries divided by the load factor, no rehash operations will ever occur.

If many mappings are to be stored in a HashMap instance, creating it with a sufficiently large capacity will allow the mappings to be stored more efficiently than letting it perform automatic rehashing as needed to grow the table. Note that using many keys with the same hashCode() is a sure way to slow down performance of any hash table. To ameliorate impact, when keys are Comparable, this class may use comparison order among keys to help break ties.
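A small sketch of pre-sizing, following the rule above (the numbers are illustrative):

import java.util.HashMap;
import java.util.Map;

public class PresizedMap {
    public static void main(String[] args) {
        int expectedEntries = 10_000;
        float loadFactor = 0.75f;
        // Keep expectedEntries <= capacity * loadFactor so no rehash occurs.
        int initialCapacity = (int) Math.ceil(expectedEntries / loadFactor);
        Map<Integer, String> map = new HashMap<>(initialCapacity, loadFactor);
        for (int i = 0; i < expectedEntries; i++) {
            map.put(i, "value-" + i);  // the table never needs to grow
        }
        System.out.println(map.size());  // 10000
    }
}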

For the rest of the details, please refer to Part 2 of this article.

Thanks for reading!

If you liked my article, please like and follow me.
