Dirty Checking Magic in Hibernate: How it Works and Why It’s Important

Published in Hibernate At the Gates of Mastery, 6 min read, Aug 19, 2023

Dirty checking is the mechanism Hibernate uses to determine whether any value of a managed entity has changed since it was loaded from the database. This lets Hibernate issue UPDATE statements only for entities that actually changed, and optionally only for the columns that changed.

Hi, this is Paul. Welcome to this article, where we explore how dirty checking works in Hibernate.

Here’s how it works:

Entity retrieval: When an entity is fetched from the database, Hibernate stores a snapshot of its loaded state in the persistence context (the first-level cache, tied to the session).
Entity modification: After retrieval, the application can change this entity through its setters.
State synchronization: At flush time (by default before the transaction commits, before a query that could be affected by pending changes, or on an explicit `flush()` call), Hibernate performs dirty checking, comparing the current state of each managed entity with the snapshot taken when it was loaded.
Database update: If Hibernate detects changes, it generates and executes an SQL UPDATE for that entity. By default the statement sets all mapped columns; with @DynamicUpdate it sets only the columns that changed.
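The four steps above can be modeled in a few lines of plain Java. This is only a simplified sketch of the snapshot-comparison idea, not Hibernate's actual implementation; `TrackedUser` and `SnapshotDirtyChecker` are hypothetical names invented for this illustration.

```java
import java.util.ArrayList;
import java.util.IdentityHashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Objects;

// Hypothetical entity used only for this sketch.
class TrackedUser {
    String name;
    String email;

    TrackedUser(String name, String email) {
        this.name = name;
        this.email = email;
    }

    // Current property values, in a stable order.
    Map<String, Object> currentState() {
        Map<String, Object> state = new LinkedHashMap<>();
        state.put("name", name);
        state.put("email", email);
        return state;
    }
}

// Toy stand-in for the persistence context: remembers the loaded state
// of each tracked entity and compares it with the current state on demand.
class SnapshotDirtyChecker {
    private final Map<TrackedUser, Map<String, Object>> snapshots = new IdentityHashMap<>();

    // Step 1: take a snapshot when the entity is "loaded".
    void track(TrackedUser user) {
        snapshots.put(user, user.currentState());
    }

    // Step 3: compare current state with the snapshot, property by property.
    List<String> dirtyProperties(TrackedUser user) {
        Map<String, Object> snapshot = snapshots.get(user);
        List<String> dirty = new ArrayList<>();
        if (snapshot == null) {
            return dirty; // entity is not tracked, so nothing is detected
        }
        for (Map.Entry<String, Object> entry : user.currentState().entrySet()) {
            if (!Objects.equals(entry.getValue(), snapshot.get(entry.getKey()))) {
                dirty.add(entry.getKey());
            }
        }
        return dirty;
    }
}
```

In this model, an entity that was never passed to `track` is never reported as dirty, which mirrors the fact that only managed entities are checked.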

Advantages of “dirty checking”:

- Optimization: Unchanged entities produce no UPDATE at all, and the statement can be limited to the modified columns (for example with @DynamicUpdate), so less data is transferred between the application and the database.

- Automation: Developers do not need to explicitly specify which fields have changed. Hibernate does this automatically.

Disadvantages or limitations:

- Overhead: At every flush, Hibernate compares each managed entity with its snapshot, so the cost grows with the number of entities in the persistence context. In most cases this cost is insignificant, but it can become a problem when a single session holds many thousands of entities.

- Transparency: Some developers may find this mechanism less transparent, because Hibernate decides on its own which updates to issue, so the SQL that runs is not visible in the code.

To manage the “dirty checking” process and optimize its operation, developers can use various strategies and annotations provided by Hibernate.
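One such annotation is Hibernate's @DynamicUpdate, which changes what the generated UPDATE looks like. The sketch below is a plain-Java model of that difference, not Hibernate code; the class `UpdateSqlSketch`, its method, and the `id` primary-key column are assumptions made for the illustration.

```java
import java.util.List;
import java.util.Optional;
import java.util.stream.Collectors;

// Toy model of UPDATE generation for one changed entity.
class UpdateSqlSketch {

    // Returns the UPDATE statement for an entity, or empty if nothing changed.
    // dynamicUpdate = false models the default: every mapped column is set.
    // dynamicUpdate = true models @DynamicUpdate: only dirty columns are set.
    static Optional<String> updateSql(String table,
                                      List<String> allColumns,
                                      List<String> dirtyColumns,
                                      boolean dynamicUpdate) {
        if (dirtyColumns.isEmpty()) {
            return Optional.empty(); // clean entity: no UPDATE is issued
        }
        List<String> columns = dynamicUpdate ? dirtyColumns : allColumns;
        String assignments = columns.stream()
                .map(column -> column + " = ?")
                .collect(Collectors.joining(", "));
        // Assumes a single primary-key column named "id".
        return Optional.of("UPDATE " + table + " SET " + assignments + " WHERE id = ?");
    }
}
```

Either way a changed entity results in one statement per flush; the two modes differ only in how many columns that statement carries.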

In Hibernate, the “dirty checking” mechanism tracks changes in entities that are in the “persistent” state (i.e., associated with a session). Hibernate’s approach to entity state management is based on the concept of object lifecycle states: transient, persistent, and detached.

We covered the possible entity states in JPA in a previous article, so we won’t go into that here.

In short, the persistent state is the only one in which dirty checking operates. As soon as an object becomes associated with a session (for example, after saving, fetching, or reattaching), it enters the persistent state. All changes to such objects are tracked automatically and synchronized with the database when the session is flushed.

Examples of when an entity will be tracked by the “dirty checking” mechanism:

1. After retrieving an entity: When you retrieve an entity using methods like session.get(), session.load() or an HQL query, the retrieved entity automatically becomes “persistent”.

User user = session.get(User.class, userId);
user.setEmail("example@gmail.com");
user.setName("UpdatedName"); // "dirty checking" will be applied to this entity.

It’s worth noting that when Hibernate performs dirty checking (usually before the transaction commits or when session.flush() is called explicitly), it determines that both of these fields (name and email) have been changed. Hibernate does not issue one statement per setter call: a single SQL UPDATE is executed for the entity, covering both changes.

2. After saving a new entity: When you save a new entity using session.save(), that entity becomes “persistent”.

User newUser = new User();
session.save(newUser);
newUser.setName("NewName"); // "dirty checking" will be applied to this entity.

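3. When transitioning from “detached” to a “persistent” state: If you have an entity in a detached state and you attach it back to the session, it is tracked again. Note that session.merge() does not make the passed instance persistent: it copies its state onto a managed instance and returns that instance, so later changes must be made on the returned object. session.update() reattaches the passed instance itself.

User managedUser = (User) session.merge(detachedUser); // the cast is only needed on Hibernate versions where merge() returns Object
managedUser.setName("AnotherName"); // "dirty checking" applies to managedUser; detachedUser itself stays detached.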

For the “dirty checking” mechanism to work, the session must remain open. If the session is closed, no changes will be tracked until the entity becomes “persistent” again in a new or the same session.
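The merge case is easy to get wrong, so here is a toy model of it in plain Java. It is a sketch under simplifying assumptions, not Hibernate's implementation: `Person` and `MiniSession` are hypothetical, and where a real session would load the row from the database, the toy simply creates an empty managed instance.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical entity for this sketch.
class Person {
    final Long id;
    String name;

    Person(Long id, String name) {
        this.id = id;
        this.name = name;
    }
}

// Toy persistence context: at most one managed instance per id.
class MiniSession {
    private final Map<Long, Person> managedById = new HashMap<>();

    // Copies the detached object's state onto the managed instance for the
    // same id and returns that managed instance. The argument is left as is.
    Person merge(Person detached) {
        Person managed = managedById.computeIfAbsent(
                detached.id, key -> new Person(key, null));
        managed.name = detached.name;
        return managed;
    }

    // True only for the exact instance this session is tracking.
    boolean isManaged(Person person) {
        return managedById.get(person.id) == person;
    }
}
```

After `Person managed = session.merge(detached)`, `isManaged(managed)` is true while `isManaged(detached)` is false, which is why changes made to the detached object after merging are not picked up.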

Spring Data

Now that we understand how it works at the Hibernate level, let’s look at examples with Spring Data JPA.
Here’s the full code of a service demonstrating various combinations of Spring Data JPA and transactions.

import java.util.List;

// On Spring Boot 2.x these three imports come from javax.persistence instead.
import jakarta.persistence.EntityManager;
import jakarta.persistence.EntityNotFoundException;
import jakarta.persistence.PersistenceContext;

import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.stereotype.Service;
import org.springframework.transaction.annotation.Transactional;

@Service
public class UserService {

    @Autowired
    private UserRepository userRepository;

    // Needed for example 6: JpaRepository has no detach() method.
    @PersistenceContext
    private EntityManager entityManager;

    // 1. Regular update within a transaction (without explicit saving)
    @Transactional
    public void updateName(Long userId, String newName) {
        User user = userRepository.findById(userId).orElseThrow(() -> new EntityNotFoundException("User not found"));
        user.setName(newName);
        // Thanks to dirty checking, changes will be saved automatically upon transaction completion.
    }

    // 2. Data retrieval without a transaction (changes won't be saved automatically)
    public void nonTransactionalUpdateName(Long userId, String newName) {
        User user = userRepository.findById(userId).orElseThrow(() -> new EntityNotFoundException("User not found"));
        user.setName(newName);
        // Changes will not be saved: without a surrounding transaction there is no commit-time flush to run dirty checking.
    }

    // 3. Explicitly saving changes within a transaction
    @Transactional
    public void explicitSaveAfterUpdate(Long userId, String newName) {
        User user = userRepository.findById(userId).orElseThrow(() -> new EntityNotFoundException("User not found"));
        user.setName(newName);
        userRepository.save(user);  // Explicitly saving changes, although it's not required in this context.
    }

    // 4. Creating a new entity and saving it
    @Transactional
    public void createUser(String name, String email) {
        User user = new User();
        user.setName(name);
        user.setEmail(email);
        userRepository.save(user);  // Save is necessary here as the entity is new.
    }

    // 5. Retrieving data in read-only mode
    @Transactional(readOnly = true)
    public List<User> getAllUsers() {
        // With readOnly = true, Spring switches the Hibernate session to manual flushing,
        // so changes made to entities within this method are not written to the DB.
        return userRepository.findAll();
    }

    // 6. Explicitly detaching an entity from the persistence context and then saving it
    @Transactional
    public void detachAndUpdate(Long userId, String newName) {
        User user = userRepository.findById(userId).orElseThrow(() -> new EntityNotFoundException("User not found"));
        entityManager.detach(user);  // Detaching the entity from the persistence context.
        user.setName(newName);
        userRepository.save(user);  // Now we need to explicitly save changes as the entity is detached.
    }
}

This service demonstrates various scenarios of interaction with the database in the context of Spring Data JPA and transactions. I hope this will help you better understand working with this technology.

Let’s take a look at Example 3 again.

@Transactional
public void explicitSaveAfterUpdate(Long userId, String newName) {
    User user = userRepository.findById(userId).orElseThrow(() -> new EntityNotFoundException("User not found"));
    user.setName(newName);
    userRepository.save(user);  // Explicit saving of changes, although it is not required in this context
}

In this example, explicit saving is not required because Hibernate’s dirty checking detects changes in managed entities and automatically synchronizes them with the database when the transaction commits.

If you still call userRepository.save(user); for a managed entity, in most cases, it won’t lead to any direct negative consequences, but there are a few things to note:

Performance: The save call can potentially trigger additional operations, such as merging entities, which may be less efficient than simply waiting for automatic saving of changes at the end of the transaction.

Code readability: Explicitly calling save for entities that are already in the Hibernate management context can confuse developers unfamiliar with the code context. They might wonder why explicit saving is happening here.

Save behavior: In practice, save in Spring Data JPA works as persist or merge depending on whether the entity is considered new. If the entity is new, it is persisted as a new record; if it already exists (for example, it was fetched from the database), it is merged. In most scenarios this won’t cause problems, but knowing this behavior is essential for understanding some more complex cases.
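The persist-or-merge decision can be sketched in plain Java as follows. This is a simplified model rather than Spring Data's source: `Account`, `RecordingEntityOps`, and `SaveSketch` are hypothetical, and the sketch assumes the common rule that an entity with a null id counts as new (Spring Data can also decide this from a version attribute or a `Persistable` implementation).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical entity for this sketch; a null id means "not yet stored".
class Account {
    Long id;
    String owner;

    Account(Long id, String owner) {
        this.id = id;
        this.owner = owner;
    }
}

// Stand-in for an EntityManager that only records which operation was used.
class RecordingEntityOps {
    final List<String> calls = new ArrayList<>();
    private long nextId = 1;

    void persist(Account account) {
        calls.add("persist");
        account.id = nextId++; // pretend the database generated an id
    }

    Account merge(Account account) {
        calls.add("merge");
        return account; // a real merge may return a different, managed instance
    }
}

class SaveSketch {
    // New entities are persisted; everything else is merged.
    static Account save(RecordingEntityOps ops, Account account) {
        if (account.id == null) {
            ops.persist(account);
            return account;
        }
        return ops.merge(account);
    }
}
```

Calling `save` on an already managed entity therefore goes down the merge branch, which is the extra work the performance note above refers to.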

Therefore, although explicitly saving managed entities is not an error, it is better to avoid it unless there is a specific need. Leaving it out simplifies the code and can improve its performance.

Conclusion

Wrapping a method in @Transactional is the easiest way to ensure the operation of the Hibernate “dirty checking” mechanism. When a method is wrapped in @Transactional, any entities retrieved or saved within that method automatically become part of the Hibernate Session (or JPA EntityManager). These entities are in a “managed” or “persistent” state.

Any changes made to managed entities within a transaction will be automatically tracked and synchronized with the database at the end of the transaction, thanks to the “dirty checking” mechanism.

See you in the next part of the guide!

Paul Ravvich
