How HashMap works internally in Java: A debug approach
An interesting approach to learn how HashMap works in Java
Most common interview questions are “How HashMap works in Java”, “How to get and put method of HashMap work internally”. Here I am trying to explain internal functionality with an easy example.
HashMap is one of the most used Collections in java. Rather than going through theory, we will start with an example first, so that you will get a better understanding and then we will see how to get and put function work in java.
Let’s take a very simple example. I have a Country
class, we are going to use Country class object as key and its capital name(string) as value. Below example will help you to understand, how these key-value pair will be stored in hashmap.
1. Country.java
package org.arpit.java2blog;
public class Country {String name;
long population;public Country(String name, long population) {
super();
this.name = name;
this.population = population;
}
public String getName() {
return name;
}
public void setName(String name) {
this.name = name;
}
public long getPopulation() {
return population;
}
public void setPopulation(long population) {
this.population = population;
}// If length of name in country object is even then return 31(any random number) and if odd then return 95(any random number).
// This is not a good practice to generate hashcode as below method but I am doing so to give better and easy understanding of hashmap.
@Override
public int hashCode() {
if(this.name.length()%2==0)
return 31;
else
return 95;
}
@Override
public boolean equals(Object obj) {Country other = (Country) obj;
if (name.equalsIgnoreCase((other.name)))
return true;
return false;
}}
If you want to understand more about hashcode and equals method of an object, you may refer hashcode() and equals() method in java
2. HashMapStructure.java(main class)
import java.util.HashMap;
import java.util.Iterator;public class HashMapStructure {/**
* @author Arpit Mandliya
*/
public static void main(String[] args) {Country india=new Country("India",1000);
Country japan=new Country("Japan",10000);Country france=new Country("France",2000);
Country russia=new Country("Russia",20000);HashMap<Country, String> countryCapitalMap=new HashMap<Country,String>();
countryCapitalMap.put(india,"Delhi");
countryCapitalMap.put(japan,"Tokyo");
countryCapitalMap.put(france,"Paris");
countryCapitalMap.put(russia,"Moscow");Iterator countryCapitalIter=countryCapitalMap.keySet().iterator();//put debug point at this line
while(countryCapitalIter.hasNext())
{
Country countryObj=countryCapitalIter.next();
String capital=countryCapitalMap.get(countryObj);
System.out.println(countryObj.getName()+"----"+capital);
}
}}
Now put a break-point at line 24 and right click on project->debug as-> java application
. The program will stop execution at line 24 then right click on countryCapitalMap
then select a watch. You will be able to see structure as below.
Now From the above diagram, you can observe the following points
- There is an
Entry[]
array called table which has size 16. - This table stores the Entry class’s object. HashMap class has an inner class called Entry. This Entry has key value as an instance variable. Let’s see the structure of the entry class Entry Structure.
static class Entry implements Map.Entry
{
final K key;
V value;
Entry next;
final int hash;
...//More code goes here
}
- Whenever we try to put any key-value pair in hashmap, Entry class object is instantiated for key value and that object will be stored in above-mentioned Entry[](table). Now you must be wondering, where will above created Entry object gets stored(exact position in the table). The answer is, the hash code is calculated for a key by calling the hashcode() method. This hashcode is used to calculate the index for the above Entry[] table.
- Now, If you see at array index 10 in the above diagram, It has an Entry object named
HashMap$Entry
. - We have put 4 key-values in hashmap but it seems to have only 2!!!! This is because if two objects have the same hashcode, they will be stored at the same index. Now the question arises how? It stores objects in the form of LinkedList(logically).
So how hashcode of above country key-value pairs is calculated.
Hashcode for Japan = 95 as its length is odd.
Hashcode for India =95 as its length is odd
HashCode for Russia=31 as its length is even.
HashCode for France=31 as its length is even.
Below diagram will explain the LinkedList concept clearly.
So now if you have a good understanding of hash table data structure, Let's go throughput and get method.
Put
Let’s see the implementation of the put method:
/**
* Associates the specified value with the specified key in this map. If the
* map previously contained a mapping for the key, the old value is
* replaced.
*
* @param key
* key with which the specified value is to be associated
* @param value
* value to be associated with the specified key
* @return the previous value associated with key, or null
* if there was no mapping for key. (A null return
* can also indicate that the map previously associated
* null with key.)
*/
public V put(K key, V value) {
if (key == null)
return putForNullKey(value);
int hash = hash(key.hashCode());
int i = indexFor(hash, table.length);
for (Entry e = table[i]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k))) {
V oldValue = e.value;
e.value = value;
e.recordAccess(this);
return oldValue;
}
}modCount++;
addEntry(hash, key, value, i);
return null;
}
now let’s understand above code step by step
- The key object is checked for null. If a key is null then it will be stored at the table[0] because hashcode for null is always 0.
- Key object’s hashcode() method is called and the hash code is calculated. This hashcode is used to find the index of an array for storing the Entry object. It may happen sometimes that, this hashcode function is poorly written so JDK designer has put another function called hash() which takes the above-calculated hash value as an argument. If you want to learn more about hash() function, you can refer to the hash and indexFor method in hashmap.
indexFor(hash, table.length)
is used to calculate exact index in table array for storing the Entry object.- As we have seen in our example, if two key objects have same hashcode(which is known as collision) then it will be stored in the form of the linked list. So here, we will iterate through our linked list.
- If there is no element present at that index which we have just calculated then it will directly put our Entry object at that index.
- If There is element present at that index then it will iterate until it gets Entry->next as null.
- What if we are putting the same key again, logically it should replace old value. Yes, it will do that. While iterating it will check key equality by calling equals() method(key.equals(k)), if this method returns true then it replaces value object with the current Entry’s value object.
- If it did not find the duplicate key, then the current Entry object will become the first node in linked list and current Entry -> next will become an existing first node on that index.
Get
Let’s see the implementation of get now:
/**
* Returns the value to which the specified key is mapped, or {@code null}
* if this map contains no mapping for the key.
*
*
* More formally, if this map contains a mapping from a key {@code k} to a
* value {@code v} such that {@code (key==null ? k==null :
* key.equals(k))}, then this method returns {@code v}; otherwise it returns
* {@code null}. (There can be at most one such mapping.)
*
*
* A return value of {@code null} does not necessarily indicate that
* the map contains no mapping for the key; it's also possible that the map
* explicitly maps the key to {@code null}. The {@link #containsKey
* containsKey} operation may be used to distinguish these two cases.
*
* @see #put(Object, Object)
*/
public V get(Object key) {
if (key == null)
return getForNullKey();
int hash = hash(key.hashCode());
for (Entry e = table[indexFor(hash, table.length)]; e != null; e = e.next) {
Object k;
if (e.hash == hash && ((k = e.key) == key || key.equals(k)))
return e.value;
}
return null;
}
As you got the understanding of put functionality of hashmap. So to understand get functionality is quite simple. If you pass any key to get the value object from a hashmap.
- The key object is checked for null. If the key is null then the value of Object resides at the table[0] will be returned.
- Key object’s hashcode() method is called and the hash code is calculated.
- indexFor(hash, table.length) is used to calculate exact index in table array using generated hashcode for getting the Entry object.
- After getting index in table array, it will iterate through the linked list and check for key equality by calling equals() method and if it returns true then it returns the value of Entry object else returns null.
Key points to Remeber:
- HashMap has an inner class called Entry which stores key-value pairs.
- Above Entry, the object is stored in Entry[ ](Array) called table
- An index of the table is logically known as the bucket and it stores the first element of linked list
- Key object’s hashcode() is used to find the bucket of that Entry object.
- If two key objects have the same hashcode, they will go in the same bucket of table array.
- Key object ‘s equals() method is used to ensure the uniqueness of a key object.
- Value object ‘s equals() and hashcode() method is not used at all
Please go through core java interview questions and java interview questions for more interview questions.
You may also like:
- HashMap in java
- How HashMap works in java
- hash and index for the method in HashMap
- hashcode and equals method in java
- How to sort HashMap by keys and values
- Difference between HashMap and HashSet
- Difference between HashMap and Hashtable
- How to iterate over HashMap
Originally published at https://java2blog.com on May 3, 2019.