Java Hashcode Calculations Explained

Jun 19, 2024

Introduction

Understanding hashcode calculations in Java is important for developers working with collections like HashMap and HashSet. This article goes into how hashcodes are calculated, their importance, and their usage in Java. We will cover the basics, take a look at some detailed explanations, and include examples.

What is a Hashcode?

A hashcode is a numerical value derived from an object that hash-based collections use to locate that object quickly. It is not a unique identifier: two different objects can share the same hashcode. Hashcodes are integral to the functionality of hash-based collections, such as HashMap, HashSet, and Hashtable, which rely on these numerical values to store and retrieve objects efficiently.

Definition and Purpose

A hashcode is essentially a 32-bit signed integer that is generated by a hash function. The primary purpose of a hashcode is to facilitate the efficient distribution and quick retrieval of objects in a hash table. When an object is added to a hash table, its hashcode is used to determine the bucket where the object should be stored. This allows for faster searches, as it narrows down the potential locations of the object.

Hash Functions

A hash function takes an object’s data and returns a hashcode. The goal of a good hash function is to produce a wide distribution of hashcodes, minimizing collisions where different objects have the same hashcode. In Java, the hashCode method serves as the hash function for objects.

Importance of Hashcodes in Collections

Hashcodes are crucial in collections for several reasons:

Efficiency: Hashcodes allow collections to quickly locate objects. When you need to retrieve a value from a collection like a HashMap, the collection computes the key's hashcode and goes directly to the bucket associated with that hashcode. With well-distributed hashcodes, lookups take constant time on average instead of requiring a scan of every element.
Organization: Collections use hashcodes to organize objects into buckets. Each bucket can store multiple objects, and the hashcode determines which bucket an object belongs to. This organization helps in managing large datasets more effectively.

Contract Between `hashCode` and `equals`

For hash-based collections to work correctly, there is a crucial contract between the hashCode and equals methods:

Consistency: If two objects are equal according to the equals(Object) method, they must have the same hashcode. This makes sure that the hash-based collection can find the object correctly.
Inequality: If two objects are not equal, they can have the same or different hashcodes. However, different hashcodes for unequal objects help in reducing collisions and improving performance.

Violating this contract can lead to unexpected behavior in hash-based collections, such as inability to find objects or improper storage of objects.
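As a minimal sketch of what a violation looks like, the hypothetical BrokenKey class below overrides equals but deliberately returns a different hashcode for every instance. This is a deterministic stand-in for forgetting to override hashCode; the class and method names are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Hypothetical class for illustration: equals compares ids, but hashCode
// differs per instance, which violates the hashCode/equals contract.
class BrokenKey {
    private static int counter = 0;
    private final int id;
    private final int instanceNumber = ++counter;

    BrokenKey(int id) {
        this.id = id;
    }

    @Override
    public boolean equals(Object obj) {
        return obj instanceof BrokenKey && ((BrokenKey) obj).id == id;
    }

    @Override
    public int hashCode() {
        return instanceNumber; // equal objects get different hashcodes
    }
}

class ContractDemo {
    // Returns whether a set holding one key reports an equal key as present.
    static boolean findsEqualKey() {
        Set<BrokenKey> set = new HashSet<>();
        set.add(new BrokenKey(42));
        return set.contains(new BrokenKey(42));
    }

    public static void main(String[] args) {
        System.out.println(new BrokenKey(42).equals(new BrokenKey(42))); // true
        System.out.println(findsEqualKey());                             // false
    }
}
```

The two keys are equal, yet the set cannot find the second one because the lookup starts from a different hashcode than the one used at insertion time.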

Example

Consider a simple example of a class representing a person, where each person has a unique identifier (ID). The hashCode method in this class returns the ID of the person.

public class Person {
    private int id;
    private String name;
    
    public Person(int id, String name) {
        this.id = id;
        this.name = name;
    }
    
    @Override
    public int hashCode() {
        return id;
    }
    
    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        Person person = (Person) obj;
        return id == person.id;
    }
}

In this example, the hashCode method simply returns the person's ID, ensuring that each person has a unique hashcode based on their ID. The equals method checks if another object is a Person with the same ID.
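To see the effect in a collection, here is a small sketch that repeats the ID-based Person class so it compiles on its own and adds three people to a HashSet. The PersonSetDemo class and the sample names are assumptions made for illustration.

```java
import java.util.HashSet;
import java.util.Set;

// Same shape as the Person class above, repeated so the sketch is self-contained.
class Person {
    private final int id;
    private final String name;

    Person(int id, String name) {
        this.id = id;
        this.name = name;
    }

    @Override
    public int hashCode() {
        return id;
    }

    @Override
    public boolean equals(Object obj) {
        if (this == obj) return true;
        if (obj == null || getClass() != obj.getClass()) return false;
        return id == ((Person) obj).id;
    }
}

class PersonSetDemo {
    // Adds two people who share an ID plus one with a different ID.
    static int distinctPeople() {
        Set<Person> people = new HashSet<>();
        people.add(new Person(1, "Alice"));
        people.add(new Person(1, "Alicia")); // same ID, treated as a duplicate
        people.add(new Person(2, "Bob"));
        return people.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctPeople()); // prints 2
    }
}
```

Because both equals and hashCode look only at the ID, the name plays no part in deciding whether two people are the same element.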

Internal Mechanics

When you add an object to a HashMap, the following steps occur:

Compute Hashcode: The hashcode of the key object is computed using its hashCode method.
Index Calculation: The hashcode is then processed (often using bitwise operations) to determine the index of the bucket where the object should be stored.
Storage: The object is stored in the computed bucket. If the bucket already contains objects, the collection checks for collisions and handles them (often using linked lists or trees within buckets).
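The index calculation step can be sketched as follows. This mirrors what OpenJDK's HashMap has done since Java 8 (XOR the high 16 bits into the low 16 bits, then mask with the table length minus one), but the class and method names here are assumptions made for illustration, and other JVMs or versions may differ.

```java
class BucketIndexDemo {
    // Mixes the high 16 bits into the low 16 bits so that they influence the index.
    static int spread(int hashCode) {
        return hashCode ^ (hashCode >>> 16);
    }

    // Maps a hashcode to a bucket; tableLength is assumed to be a power of two.
    static int bucketIndex(int hashCode, int tableLength) {
        return (tableLength - 1) & spread(hashCode);
    }

    public static void main(String[] args) {
        int h = "Tom".hashCode();
        System.out.println("Bucket for \"Tom\" in a 16-bucket table: " + bucketIndex(h, 16));
    }
}
```

Masking with `tableLength - 1` only works as a modulo when the table length is a power of two, which is why the sketch states that assumption.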

Collision Handling

Collisions occur when different keys end up in the same bucket, either because they have the same hashcode or because their distinct hashcodes map to the same index. Hash tables handle collisions in several ways:

Chaining: Each bucket can hold a linked list of entries. If multiple objects hash to the same bucket, they are added to the list. HashMap, HashSet, and Hashtable use this approach.
Open Addressing: This method involves finding another slot within the array by probing, using techniques such as linear probing, quadratic probing, or double hashing. HashMap does not use it, but IdentityHashMap is a linear-probe hash table.
Treeification: In OpenJDK's HashMap (since Java 8), when a bucket's chain grows to 8 entries and the table has at least 64 buckets, the linked list is converted to a red-black tree, improving worst-case lookups in that bucket from O(n) to O(log n).
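A quick way to observe a collision is with the strings "Aa" and "BB", which have the same hashcode. The sketch below (CollisionDemo is a hypothetical name) shows that a HashMap still keeps both entries apart, because it falls back on equals within the bucket.

```java
import java.util.HashMap;
import java.util.Map;

class CollisionDemo {
    static Map<String, Integer> buildMap() {
        Map<String, Integer> map = new HashMap<>();
        map.put("Aa", 1);
        map.put("BB", 2); // same hashcode as "Aa", so it lands in the same bucket
        return map;
    }

    public static void main(String[] args) {
        System.out.println("Aa".hashCode() == "BB".hashCode()); // true: both are 2112
        Map<String, Integer> map = buildMap();
        System.out.println(map.size());    // 2
        System.out.println(map.get("Aa")); // 1
        System.out.println(map.get("BB")); // 2
    }
}
```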

Hashcode Distribution

The effectiveness of a hash function depends on how well it distributes hashcodes. A good hash function will spread objects uniformly across the available buckets, minimizing collisions and maintaining efficient performance.

Java’s standard library classes like String, Integer, and Double provide well-designed hashCode implementations that ensure good distribution. For user-defined classes, it’s essential to override hashCode and equals to maintain this distribution.
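To make the idea of distribution concrete, the sketch below counts how many buckets of a 16-bucket table are hit by two sets of hashcodes. It assumes the index formula `(n - 1) & (h ^ (h >>> 16))` used by OpenJDK's HashMap; the class and method names are hypothetical.

```java
import java.util.HashSet;
import java.util.Set;

class DistributionDemo {
    // Counts how many distinct buckets the given hashcodes map to,
    // assuming a power-of-two table length.
    static int bucketsUsed(int[] hashCodes, int tableLength) {
        Set<Integer> used = new HashSet<>();
        for (int h : hashCodes) {
            used.add((tableLength - 1) & (h ^ (h >>> 16)));
        }
        return used.size();
    }

    public static void main(String[] args) {
        int[] spreadOut = new int[16];
        int[] clustered = new int[16];
        for (int i = 0; i < 16; i++) {
            spreadOut[i] = i;      // 0, 1, 2, ... 15
            clustered[i] = i * 16; // 0, 16, 32, ... 240
        }
        System.out.println(bucketsUsed(spreadOut, 16)); // 16
        System.out.println(bucketsUsed(clustered, 16)); // 1
    }
}
```

All sixteen clustered hashcodes are distinct, yet they share their low four bits, so every one of them lands in bucket 0. Distinct hashcodes alone do not guarantee a good spread across buckets.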

How Java Computes Hashcodes

Java’s Object class provides a default implementation of the hashCode method, which is often overridden by subclasses to provide a meaningful hashcode that is consistent with the equals method. This section will cover the default implementation, how custom hashcodes are computed, and the best practices for overriding the hashCode method.

Default Implementation

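The default implementation of the hashCode method in the Object class returns an identity-based value. Older Javadoc describes it as typically converting the internal address of the object into an integer, but the specification does not require that technique, and a JVM may generate the value in other ways. This implementation is typically not useful for user-defined classes, as it does not consider the object's contents. Here is the signature of the default method: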

public native int hashCode();

This method is native, meaning its implementation is provided by the Java Virtual Machine (JVM). The value it returns is tied to the object's identity and stays the same for the object's lifetime, but it is not guaranteed to be unique and says nothing about the object's data, so it is not ideal for objects that need to be compared based on their contents.
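The identity-based value is also available through System.identityHashCode, which returns what Object.hashCode would return even when a class overrides hashCode. A minimal sketch, with a hypothetical Plain class that overrides nothing:

```java
class IdentityHashDemo {
    // A class that does not override hashCode or equals.
    static class Plain {
    }

    public static void main(String[] args) {
        Plain a = new Plain();
        Plain b = new Plain();

        // Without an override, hashCode() is the identity hashcode.
        System.out.println(a.hashCode() == System.identityHashCode(a)); // true
        // Distinct instances are never equal under the default equals.
        System.out.println(a.equals(b)); // false
    }
}
```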

Custom Implementation

Most user-defined classes override the hashCode method to provide a more meaningful implementation. A good hashcode function should distribute hashcodes uniformly across a range of values to minimize collisions. The hashcode must also be consistent with the equals method, meaning that equal objects must have the same hashcode.

One common approach to computing a hashcode is to combine the hashcodes of the object’s fields. Here’s an example of a custom hashCode method for a Person class:

import java.util.Objects;

public class Person {
    private String firstName;
    private String lastName;
    private int age;

    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + (firstName != null ? firstName.hashCode() : 0);
        result = 31 * result + (lastName != null ? lastName.hashCode() : 0);
        result = 31 * result + age;
        return result;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Person person = (Person) o;
        return age == person.age &&
                Objects.equals(firstName, person.firstName) &&
                Objects.equals(lastName, person.lastName);
    }
}

In this example, the hashCode method starts with a non-zero constant (17) and uses a prime number (31) to combine the hashcodes of the fields. This approach helps distribute the hashcodes more uniformly and reduces the likelihood of collisions.
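The standard library offers a shortcut for the same pattern: Objects.hash combines its arguments with the multiplier 31 and treats null as 0, although it starts from 1 rather than 17, so the resulting numbers differ from the hand-written version. The sketch below uses a hypothetical class name to keep it separate from the Person class above.

```java
import java.util.Objects;

// Hypothetical variant of Person that delegates to Objects.hash.
class PersonWithObjectsHash {
    private final String firstName;
    private final String lastName;
    private final int age;

    PersonWithObjectsHash(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    @Override
    public int hashCode() {
        // Null-safe; uses the same fields that equals compares.
        return Objects.hash(firstName, lastName, age);
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        PersonWithObjectsHash that = (PersonWithObjectsHash) o;
        return age == that.age
                && Objects.equals(firstName, that.firstName)
                && Objects.equals(lastName, that.lastName);
    }
}
```

Objects.hash takes a varargs array and boxes primitive arguments, so the hand-written form may be preferable in code where hashCode is called very frequently.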

Hashcode Calculation for Standard Java Classes

Many standard Java classes provide their own implementations of the hashCode method to ensure good distribution and consistency with equals. Here are a few examples:

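String: The String class computes the hashcode based on the characters in the string. Conceptually, the calculation looks like this (the real JDK implementation also caches the result):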

@Override
public int hashCode() {
    int hash = 0;
    for (int i = 0; i < length(); i++) {
        hash = 31 * hash + charAt(i);
    }
    return hash;
}

This implementation uses a polynomial accumulation of the string’s characters, which provides a good distribution of hashcodes.
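The polynomial can be checked against the real String.hashCode with a short sketch. The StringHashDemo class and its polynomialHash method are hypothetical names used for illustration.

```java
class StringHashDemo {
    // Re-implements s[0]*31^(n-1) + ... + s[n-1] using int arithmetic,
    // which wraps around on overflow just like String.hashCode().
    static int polynomialHash(String s) {
        int hash = 0;
        for (int i = 0; i < s.length(); i++) {
            hash = 31 * hash + s.charAt(i);
        }
        return hash;
    }

    public static void main(String[] args) {
        for (String s : new String[] {"", "Tom", "Smith", "hashcode calculations"}) {
            // Prints true for every string when the two computations agree.
            System.out.println("\"" + s + "\" -> " + (polynomialHash(s) == s.hashCode()));
        }
    }
}
```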

Integer: The Integer class returns the value itself as the hashcode.

@Override
public int hashCode() {
    return value; // the wrapped int field
}

This implementation is simple and effective: distinct int values always produce distinct hashcodes, so two unequal Integer objects never share a hashcode.

Double: The Double class converts the double value to a long bit representation and then computes the hashcode.

@Override
public int hashCode() {
    long bits = doubleToLongBits(value);
    return (int)(bits ^ (bits >>> 32));
}

This implementation makes sure that the hashcode reflects all 64 bits of the double value's representation by folding the upper 32 bits into the lower 32 bits.
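These wrapper-class rules are easy to verify. The sketch below uses the real Integer and Double APIs; the WrapperHashDemo class and its foldDouble helper are hypothetical names that repeat the folding shown above.

```java
class WrapperHashDemo {
    // Same folding as the Double example: XOR the upper and lower 32 bits.
    static int foldDouble(double value) {
        long bits = Double.doubleToLongBits(value);
        return (int) (bits ^ (bits >>> 32));
    }

    public static void main(String[] args) {
        System.out.println(Integer.valueOf(42).hashCode());                // 42
        System.out.println(Integer.valueOf(-7).hashCode());                // -7
        System.out.println(foldDouble(1.0) == Double.hashCode(1.0));       // true
        System.out.println(Double.hashCode(0.0) == Double.hashCode(-0.0)); // false
    }
}
```

The last line is consistent with equals: Double treats 0.0 and -0.0 as different values, so they are allowed to have different hashcodes.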

Best Practices for Overriding `hashCode`

When overriding the hashCode method, follow these best practices to ensure a strong and efficient implementation:

Consistent with equals: Make sure that equal objects have the same hashcode.
Use Prime Numbers: Combining field hashcodes with prime numbers (like 31) helps in achieving a uniform distribution.
Include Significant Fields: Use the fields that are relevant to equality comparisons in the hashCode method.
Handle Nulls: Check for null fields to avoid NullPointerException.
Cache Hashcode: For immutable objects, compute the hashcode once and cache it to improve performance.

Here is an example of an immutable class with a cached hashcode:

import java.util.Objects;

public class ImmutablePerson {
    private final String firstName;
    private final String lastName;
    private final int age;
    private final int hashCode;

    public ImmutablePerson(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
        this.hashCode = computeHashCode();
    }

    private int computeHashCode() {
        int result = 17;
        result = 31 * result + (firstName != null ? firstName.hashCode() : 0);
        result = 31 * result + (lastName != null ? lastName.hashCode() : 0);
        result = 31 * result + age;
        return result;
    }

    @Override
    public int hashCode() {
        return hashCode;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        ImmutablePerson that = (ImmutablePerson) o;
        return age == that.age &&
                Objects.equals(firstName, that.firstName) &&
                Objects.equals(lastName, that.lastName);
    }
}

In this example, the hashcode is computed once in the constructor and stored in a final field, ensuring consistency and efficiency. This approach is particularly useful for immutable objects where the state does not change after construction.
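An alternative is to compute the hashcode lazily on first use, which is the approach String itself takes. The following is a minimal sketch under that assumption; LazyHashPoint and its fields are hypothetical, and the value 0 is used to mean "not computed yet".

```java
// Hypothetical immutable class that caches its hashcode on first use.
final class LazyHashPoint {
    private final int x;
    private final int y;
    private int hash; // 0 means "not computed yet"

    LazyHashPoint(int x, int y) {
        this.x = x;
        this.y = y;
    }

    @Override
    public int hashCode() {
        int h = hash;
        if (h == 0) {
            h = 31 * (31 * 17 + x) + y;
            hash = h; // racing threads would all store the same value
        }
        return h;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        LazyHashPoint that = (LazyHashPoint) o;
        return x == that.x && y == that.y;
    }
}
```

Lazy caching avoids the cost for objects that are never hashed. The trade-off is that an object whose real hashcode happens to be 0 recomputes it on every call, which is harmless but not cached.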

Step-by-Step Hashcode Calculation Example

To fully understand how hashcodes are calculated in Java, let’s walk through a detailed, step-by-step example. We will create a Person class and implement the hashCode method, then see how the hashcode is computed for a specific instance of this class.

The `Person` Class

Here is our Person class with a custom hashCode method:

import java.util.Objects;

public class Person {
    private String firstName;
    private String lastName;
    private int age;

    public Person(String firstName, String lastName, int age) {
        this.firstName = firstName;
        this.lastName = lastName;
        this.age = age;
    }

    @Override
    public int hashCode() {
        int result = 17;
        result = 31 * result + (firstName != null ? firstName.hashCode() : 0);
        result = 31 * result + (lastName != null ? lastName.hashCode() : 0);
        result = 31 * result + age;
        return result;
    }

    @Override
    public boolean equals(Object o) {
        if (this == o) return true;
        if (o == null || getClass() != o.getClass()) return false;
        Person person = (Person) o;
        return age == person.age &&
                Objects.equals(firstName, person.firstName) &&
                Objects.equals(lastName, person.lastName);
    }
}

Creating a `Person` Instance

Let’s create a Person instance and calculate its hashcode step by step:

public class Main {
    public static void main(String[] args) {
        Person person = new Person("Tom", "Smith", 30);
        System.out.println("Hashcode: " + person.hashCode());
    }
}

Step-by-Step Hashcode Calculation

Initialize result with a Non-Zero Constant: We start with a non-zero constant to ensure a non-zero hashcode even if all fields are zero or null.

int result = 17;

Initial result: 17

Compute Hash for firstName: The firstName is "Tom". The String class’s hashCode method will compute its hashcode. Here’s how it’s done:

int hash = 0;
hash = 31 * hash + 'T';
hash = 31 * hash + 'o';
hash = 31 * hash + 'm';

Let’s calculate it step by step:

For ‘T’ (ASCII value 84): hash = 31 * 0 + 84 = 84
For ‘o’ (ASCII value 111): hash = 31 * 84 + 111 = 2715
For ‘m’ (ASCII value 109): hash = 31 * 2715 + 109 = 84274

So, the hashcode for “Tom” is 84274.

Now, update the result with this hashcode:

result = 31 * result + 84274;

Intermediate result: 31 * 17 + 84274 = 84801

Compute Hash for lastName: The lastName is "Smith". Similarly, calculate the hashcode for "Smith":

int hash = 0;
hash = 31 * hash + 'S';
hash = 31 * hash + 'm';
hash = 31 * hash + 'i';
hash = 31 * hash + 't';
hash = 31 * hash + 'h';

Let’s calculate it step by step:

For ‘S’ (ASCII value 83): hash = 31 * 0 + 83 = 83
For ‘m’ (ASCII value 109): hash = 31 * 83 + 109 = 2682
For ‘i’ (ASCII value 105): hash = 31 * 2682 + 105 = 83247
For ‘t’ (ASCII value 116): hash = 31 * 83247 + 116 = 2580773
For ‘h’ (ASCII value 104): hash = 31 * 2580773 + 104 = 80004067

So, the hashcode for “Smith” is 80004067.

Now, update the result with this hashcode:

result = 31 * result + 80004067;

Intermediate result: 31 * 84801 + 80004067 = 82632898

Include the age: The age is 30. Include this in the final hashcode calculation:

result = 31 * result + 30;

Mathematical result: 31 * 82632898 + 30 = 2561619868

This value is larger than Integer.MAX_VALUE (2147483647). Because result is an int, the arithmetic wraps around modulo 2^32, so the value actually stored is 2561619868 - 4294967296 = -1733347428.

Final Hashcode

The final hashcode for the Person instance with firstName "Tom", lastName "Smith", and age 30 is -1733347428. Negative hashcodes are perfectly valid; hash-based collections handle them like any other value.

public class Main {
    public static void main(String[] args) {
        Person person = new Person("Tom", "Smith", 30);
        System.out.println("Hashcode: " + person.hashCode());
    }
}

Running this code will output:

Hashcode: -1733347428

By breaking down the calculation step by step, you can see how each field contributes to the final hashcode and how combining the fields produces a well-distributed (though not guaranteed unique) hashcode for the Person object. This process helps in understanding the importance of each component in the hashcode calculation and reinforces the principles of good hashcode design.
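The arithmetic can also be checked mechanically. The sketch below repeats the same three steps but performs the last one in 64-bit arithmetic, so the value before int wrap-around is visible alongside the value that hashCode actually returns. StepByStepCheck and unwrappedHash are hypothetical names.

```java
class StepByStepCheck {
    // Mirrors Person.hashCode() but keeps the final step in a long
    // so the result before int overflow can be inspected.
    static long unwrappedHash(String firstName, String lastName, int age) {
        int result = 17;
        result = 31 * result + firstName.hashCode();
        result = 31 * result + lastName.hashCode();
        return 31L * result + age;
    }

    public static void main(String[] args) {
        long wide = unwrappedHash("Tom", "Smith", 30);
        System.out.println(wide);       // 2561619868, larger than Integer.MAX_VALUE
        System.out.println((int) wide); // -1733347428, the value hashCode() returns
    }
}
```

Casting the long back to int keeps only the low 32 bits, which is exactly what happens implicitly when the multiplication is done in int arithmetic.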

Conclusion

Understanding hashcode calculations in Java is fundamental for developers working with hash-based collections like HashMap and HashSet. A hashcode is a 32-bit integer that these collections use to decide where an object is stored; it is not a unique identifier, since different objects can share the same value. The hashCode method, often overridden in custom classes, is important for the efficient distribution and retrieval of objects in these collections.

This article has explained the purpose of hashcodes, how Java computes them, and best practices for implementing the hashCode method. By following these guidelines and ensuring consistency with the equals method, you can create strong and efficient hashcode implementations. The step-by-step example of computing a hashcode for a Person class shows how each field contributes to the final hashcode, including how int overflow shapes the final value.
