Creating a Win-Win Scenario: Auditing & Entity Revisions in Spring Boot with Hibernate
A database with incorrect or outdated data is of no use to anyone, especially if you don’t have auditing and entity revisioning set up. To keep track of changes, it’s recommended that each change be recorded and a historical log kept. This then keeps the database up-to-date and error-free.
I’ve found the most luck doing this by utilizing Spring Boot 2.7.8 with Spring data JPA. In this blog post, I will demonstrate step by step how I accomplished the task at hand.
Getting started with tracking changes
The process of collecting changes to data within a database is called Change Data Capture (CDC). Through it, you can identify and capture changes, then forward them to a downstream process or system in real time. There are a couple of methods to build CDC. Hibernate Envers is one, and it uses an application-based trigger to track data changes.
Java version: 17
Hibernate Envers version: 5.6.14.Final
Database: PostgreSQL
Database-migration Tool: Flyway
First things first
To get started with Hibernate Envers, first we enable JPA auditing, because the auditing infrastructure needs to become aware of the application’s current auditor. To do so, we must configure an auditor aware to provide the infrastructure the current auditor by implementing theAuditorAware<T>
interface from Spring Data. <T>
is the type of the auditing instance, but it can be any type.
import org.springframework.data.domain.AuditorAware;
import org.springframework.stereotype.Component;
import java.util.Optional;
@Component
public class AuditorAwareImpl implements AuditorAware<String> {
@Override
public Optional<String> getCurrentAuditor() {
return Optional.of("Medium");
/*
Some basic examples to get the logged-in user's id
if you're using Spring Security.
-- Example 1:
return Optional.ofNullable(SecurityContextHolder.getContext())
.map(SecurityContext::getAuthentication)
.filter(Authentication::isAuthenticated)
.map(Authentication::getPrincipal)
.map(User.class::cast)
.map(User::getUsername);
-- Example 2:
final Authentication authentication = SecurityContextHolder.getContext().getAuthentication();
String auditor = "System";
if (Objects.nonNull(authentication)) {
auditor = authentication.getName();
if (auditor.equals("anonymousUser")) {
return Optional.of("System");
}
}
return Optional.of(auditor);
*/
}
}
In our example, we set Medium
for the current user. You can also take a look at the examples to get the logged-in user’s ID from Spring security.
After implementing an auditor aware, we now need to create another configuration class for persistence, because we need to provide our implementation to JPA.
import org.springframework.context.annotation.Bean;
import org.springframework.context.annotation.Configuration;
import org.springframework.data.domain.AuditorAware;
import org.springframework.data.jpa.repository.config.EnableJpaAuditing;
@Configuration
// Value must match with the name of the bean below.
@EnableJpaAuditing(auditorAwareRef = "auditorAware")
public class PersistenceConfiguration {
@Bean
public AuditorAware<String> auditorAware() {
return new AuditorAwareImpl();
}
}
Keep in mind that the reference of the auditor aware implementation that we created must match with the name of the bean we create.
Last but not least, we need to create columns in the table to store who created or modified the entity as well as the time of creation and modification. There are 4 columns/fields that we need to create:
- CreatedBy: Declares a field as the one representing the principal that created the entity containing the field;
- CreatedDate: Declares a field as the one representing the date the entity containing the field was created;
- LastModifiedBy: Declares a field as the one representing the principal that recently modified the entity containing the field;
- LastModifiedDate: Declares a field as the one representing the date the entity containing the field was recently modified.
The CreatedBy
and CreatedDate
fields will be set only when the record/entity is created, and they will not be updated each time the record gets updated, whileLastModifiedBy
and LastModifiedDate
fields will be updated each time the record/entity gets updated.
With that in mind, let’s create an abstract class that contains all the fields we want for auditing.
import org.springframework.data.annotation.CreatedBy;
import org.springframework.data.annotation.CreatedDate;
import org.springframework.data.annotation.LastModifiedBy;
import org.springframework.data.annotation.LastModifiedDate;
import org.springframework.data.jpa.domain.support.AuditingEntityListener;
import javax.persistence.Column;
import javax.persistence.EntityListeners;
import javax.persistence.MappedSuperclass;
import java.time.Instant;
import java.util.Objects;
@MappedSuperclass
@EntityListeners(AuditingEntityListener.class)
public abstract class LocalBaseAuditEntity {
@CreatedBy
@Column(name = "created_by", updatable = false)
private String createdBy;
@CreatedDate
@Column(name = "created_date", nullable = false, updatable = false)
// You can use any date-time object.
private Instant createDate = Instant.now();
@LastModifiedBy
@Column(name = "last_modified_by")
private String lastModifiedBy;
@LastModifiedDate
@Column(name = "last_modified_date", nullable = false)
// You can use any date-time object.
private Instant lastModifiedDate = Instant.now();
// ... getters, setters, equals & hashCode, and toString functions ommitted.
}
I recommend having an abstract class with EntityListener
for auditing and marked as @MappedSuperclass
so it will be designated as a class whose mapping information is applied to the entities that inherit from it. Keep in mind that a mapped superclass has no separate table defined for it.
If you don’t wish to have an abstract class, you can put all those fields in your entity. Make sure you also call @EntityListeners(AuditingEntityListener.class)
on the class level.
SQL script (PostgreSQL):
...,
created_by TEXT NOT NULL,
created_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
last_modified_by TEXT NOT NULL,
last_modified_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
SQL script (PostgreSQL) for the existing table:
ALTER TABLE users
ADD COLUMN created_by TEXT NOT NULL DEFAULT 'Medium', -- Any default value for the existing records.
ADD COLUMN created_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
ADD COLUMN last_modified_by TEXT NOT NULL DEFAULT 'Medium',
ADD COLUMN last_modified_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP;
Full script used in our example:
CREATE EXTENSION IF NOT EXISTS "uuid-ossp";
CREATE TABLE users
(
id UUID DEFAULT uuid_generate_v4() PRIMARY KEY,
name TEXT NOT NULL,
surname TEXT NOT NULL,
email_address TEXT NOT NULL,
-- Required columns for auditing.
created_by TEXT NOT NULL,
created_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP,
last_modified_by TEXT NOT NULL,
last_modified_date TIMESTAMP NOT NULL DEFAULT CURRENT_TIMESTAMP
);
If everything goes well, we will be able to see a record with the values of the audits.
final UserEntity userEntity = new UserEntity("Lorem", "Ipsum", "lorem@ipsum.com");
userRepository.saveAndFlush(userEntity);
+------------------------------------+-----+-------+---------------+----------+--------------------------+----------------+--------------------------+
|id |name |surname|email_address |created_by|created_date |last_modified_by|last_modified_date |
+------------------------------------+-----+-------+---------------+----------+--------------------------+----------------+--------------------------+
|cecb7147-6f12-4430-b3d5-d5931283894f|Lorem|Ipsum |lorem@ipsum.com|Medium |2023-01-30 15:37:25.225195|Medium |2023-01-30 15:37:25.225195|
+------------------------------------+-----+-------+---------------+----------+--------------------------+----------------+--------------------------+
Congrats! You can now give your QA a decent answer when they ask “Who edited my records?”
Hibernate Envers to the rescue
If keeping track of who created or changed an entity is not enough, let me introduce you Hibernate Envers.
Envers is a core module of the Hibernate model. It will keep track of all changes on the persisted entities. We can even keep the changes of a removed entity!
Enabling Envers requires more effort, so to start off, you need to install the dependency.
implementation 'org.hibernate:hibernate-envers:5.6.14.Final'
If you wish to use another version, you can check out the mvnrepository. Keep in mind that if you’re using Spring Boot version 3+, you should use Hibernate 6. Otherwise, stick with Hibernate 5. Using Hibernate 5 with Spring Boot version 3+ or vice versa will break the app, because all of the package and configuration parameter names were renamed fromjavax.persistence.*
to jakarta.persistence.*
for both Spring Boot and Hibernate. Source
There are a couple of configurations that we need to make.
Setting an Envers strategy
spring.jpa.properties.org.hibernate.envers.audit_strategy=org.hibernate.envers.strategy.ValidityAuditStrategy
spring.jpa.properties.org.hibernate.envers.audit_strategy_validity_store_revend_timestamp=true
spring.jpa.properties.org.hibernate.envers.store_data_at_delete=true
audit_strategy — ValidityAuditStrategy stores both the start revision and the end revision. It helps us track the revision of next changes.
audit_strategy_validity_store_revend_timestamp — true means that the timestamp of the end revision gets stored. This is usually when the entity is changed.
store_data_at_delete — true means the entity data to be stored in the revision when the entity gets deleted. If you want to keep changes of the removed entity, you must set this property to true. It is false by default.
You can find more configuration properties such as changing suffixes, prefixes, and certain field names here.
Enabling where to use Envers
The one and only annotation we need to use to apply Envers to a class or field is @Audited
. When applied to a class, it indicates that all of its properties should be audited. When applied to a field, it indicates that this field should be audited.
Class level:
import org.hibernate.envers.Audited;
import javax.persistence.Entity;
import javax.persistence.Table;
@Entity
@Audited
@Table(name = "users")
public class UserEntity {
... fields and functions omitted.
}
When used on a class level, all fields will be audited. So, make sure those fields exist in *_aud
table which we are about to create it. If your entity has a child collection, that should have an _aud
table too since it is a part of the parent entity. Also, keep in mind that if you use an abstract class to keep auditing fields, setting the auditing annotation on the class level does not cover the fields in the abstract class. You should still set the audited annotation to the fields or the class itself for abstract class.
Field level:
import org.hibernate.envers.Audited;
import org.springframework.data.annotation.LastModifiedBy;
import javax.persistence.Column;
@Audited
@LastModifiedBy
@Column(name = "last_modified_by")
private String lastModifiedBy;
Creating tables for revision and audits
Hibernate expects to find revinfo
and *_aud
tables in your database. Otherwise, the Envers will fail at startup. We must make sure the tables exist.
Table validation error:
[PersistenceUnit: default] Unable to build Hibernate SessionFactory; nested exception is org.hibernate.tool.schema.spi.SchemaManagementException: Schema-validation: missing table [revinfo]
If you are using a data-migration tool, I highly recommend you to validate hibernate. To do so, set spring.jpa.hibernate.ddl-auto
to validate
in application.properties
“The more validation I need, the less discernment I have.” — Kurt Hanks
spring.jpa.hibernate.ddl-auto=validate
The app will stop at startup if there is a mismatch between table names/columns and entity table names/fields. The best example is the one that we mentioned above (the validation error). Another example could be if you change the audit table suffix and forget to update your table name in the database, the app will crash and show the following error:
Caused by: org.hibernate.tool.schema.spi.SchemaManagementException: Schema-validation: missing table [users_my_lovely_suffix]
SQL scripts for revision and audits
In our example, we only have theusers
table . We enabled auditing on a class level, and last_modified_by
and last_modified_date
in theabstract class. Let’s create some tables.
CREATE SEQUENCE hibernate_sequence;
CREATE TABLE revinfo
(
rev INTEGER PRIMARY KEY NOT NULL, -- Version number.
revtstmp BIGINT -- Epoch timestamp of the version number.
);
CREATE TABLE users_aud
(
id UUID,
rev INTEGER REFERENCES revinfo (rev), -- The version number of the entity.
revend INTEGER REFERENCES revinfo (rev), -- The version of the next version number after entity gets updated.
revtype SMALLINT, -- The type of the revision.
revend_tstmp TIMESTAMP, -- The timestamp of the next version number after entity gets updated.
name TEXT,
surname TEXT,
email_address TEXT,
last_modified_by TEXT,
last_modified_date TIMESTAMP,
PRIMARY KEY (id, rev)
);
You may ask why we’re creating a sequence with the name hibernate_sequence
. Hibernate Envers expects a global sequence called hibernate_sequence
to be inserted into therevinfo
table. If such a sequence does not exist, the following error will be shown at startup — Schema-validation: missing sequence [hibernate_sequence].
Here is when thehibernate_sequence
is used:
Hibernate: select nextval ('hibernate_sequence')
Hibernate: insert into revinfo (revtstmp, rev) values (?, ?)
revinfo
is a reserved table name by Hibernate for Envers. It stores the revision number and its timestamp (epoch). If such a table does not exist, Hibernate will throw out an error — Schema-validation: missing table [revinfo].
rev
represents the version number while revtstmp
represents the epoch timestamp of the revision.
Each *_aud
table must have rev
, revend
, revtype
, and revend_tstmp
. All these column names and table name(s), prefixes and suffixes are reserved by Hibernate. If you wish to change any of them, check out the Envers configuration.
In each *_aud
table, rev
is the revision number, revend
is the revision number of the next revision after the entity gets updated, revend_tstmp
is the timestamp of the next revision, and revtype
is the revision type. There are 3 revision types.
- 0 means entity is created;
- 1 means entity is updated;
- 2 means entity is removed;
Once the desired tables are created, we’re ready to create some entities.
Implementation
Creating an entity:
final UserEntity userEntity = new UserEntity("Lorem", "Ipsum", "lorem@ipsum.com");
userRepository.saveAndFlush(userEntity);
After creation, we will be able to see Hibernate actions if show-sql
is enabled — spring.jpa.show-sql=true
in application.properties
.
Hibernate: insert into users (created_date, created_by, last_modified_by, last_modified_date, email_address, name, surname, id) values (?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: select nextval ('hibernate_sequence')
Hibernate: insert into revinfo (revtstmp, rev) values (?, ?)
Hibernate: insert into users_aud (revtype, revend, revend_tstmp, last_modified_by, last_modified_date, email_address, name, surname, id, rev) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
This is how the users_aud
table looks like for the created entity.
+------------------------------------+---+------+-------+------------+-----+-------+---------------+----------------+--------------------------+
|id |rev|revend|revtype|revend_tstmp|name |surname|email_address |last_modified_by|last_modified_date |
+------------------------------------+---+------+-------+------------+-----+-------+---------------+----------------+--------------------------+
|d573d872-a01e-40f8-9a3a-7e7482d66699|1 |null |0 |null |Lorem|Ipsum |lorem@ipsum.com|Medium |2023-01-30 17:36:47.561605|
+------------------------------------+---+------+-------+------------+-----+-------+---------------+----------------+--------------------------+
Keep in mind that the revtype
is 0
, meaning that the snapshot of this entity is insert
.
Let’s update our entity to see what is going to happen.
UserEntity userEntity = userRepository.findById(UUID.fromString("d573d872-a01e-40f8-9a3a-7e7482d66699")).orElseThrow();
userEntity.setName("Updated Lorem");
userEntity.setSurname("Updated Ipsum");
userRepository.saveAndFlush(userEntity);
Hibernate: update users set last_modified_by=?, last_modified_date=?, email_address=?, name=?, surname=? where id=?
Hibernate: select nextval ('hibernate_sequence')
Hibernate: insert into revinfo (revtstmp, rev) values (?, ?)
Hibernate: insert into users_aud (revtype, revend, revend_tstmp, last_modified_by, last_modified_date, email_address, name, surname, id, rev) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: update users_aud set revend=?, revend_tstmp=? where id=? and rev<> ? and revend is null
This is how users_aud
table looks like after updating an entity.
+------------------------------------+---+------+-------+--------------------------+-------------+-------------+---------------+----------------+--------------------------+
|id |rev|revend|revtype|revend_tstmp |name |surname |email_address |last_modified_by|last_modified_date |
+------------------------------------+---+------+-------+--------------------------+-------------+-------------+---------------+----------------+--------------------------+
|d573d872-a01e-40f8-9a3a-7e7482d66699|2 |null |1 |null |Updated Lorem|Updated Ipsum|lorem@ipsum.com|Medium |2023-01-30 17:41:31.676099|
|d573d872-a01e-40f8-9a3a-7e7482d66699|1 |2 |0 |2023-01-30 17:41:31.692000|Lorem |Ipsum |lorem@ipsum.com|Medium |2023-01-30 17:36:47.561605|
+------------------------------------+---+------+-------+--------------------------+-------------+-------------+---------------+----------------+--------------------------+
Did you notice changes? revtype
is now 1
, meaning that the snapshot of this entity is update
. The previous revision now has revend
and revend_tstmp
values. These indicate the revision number and the time of the next record for the updated entity. How cool is that!
Now, let’s delete our entity to see what’s going to happen.
UserEntity userEntity = userRepository.findById(UUID.fromString("d573d872-a01e-40f8-9a3a-7e7482d66699")).orElseThrow();
userRepository.delete(userEntity);
Hibernate: delete from users where id=?
Hibernate: select nextval ('hibernate_sequence')
Hibernate: insert into revinfo (revtstmp, rev) values (?, ?)
Hibernate: insert into users_aud (revtype, revend, revend_tstmp, last_modified_by, last_modified_date, email_address, name, surname, id, rev) values (?, ?, ?, ?, ?, ?, ?, ?, ?, ?)
Hibernate: update users_aud set revend=?, revend_tstmp=? where id=? and rev<> ? and revend is null
This is how users_aud
table looks like after deleting an entity.
+------------------------------------+---+------+-------+--------------------------+-------------+-------------+---------------+----------------+--------------------------+
|id |rev|revend|revtype|revend_tstmp |name |surname |email_address |last_modified_by|last_modified_date |
+------------------------------------+---+------+-------+--------------------------+-------------+-------------+---------------+----------------+--------------------------+
|d573d872-a01e-40f8-9a3a-7e7482d66699|1 |2 |0 |2023-01-30 17:41:31.692000|Lorem |Ipsum |lorem@ipsum.com|Medium |2023-01-30 17:36:47.561605|
|d573d872-a01e-40f8-9a3a-7e7482d66699|3 |null |2 |null |Updated Lorem|Updated Ipsum|lorem@ipsum.com|Medium |2023-01-30 17:41:31.676099|
|d573d872-a01e-40f8-9a3a-7e7482d66699|2 |3 |1 |2023-01-30 17:50:16.264000|Updated Lorem|Updated Ipsum|lorem@ipsum.com|Medium |2023-01-30 17:41:31.676099|
+------------------------------------+---+------+-------+--------------------------+-------------+-------------+---------------+----------------+--------------------------+
Notice the 2nd row whose revtype
is 2
.
You may ask what those numbers for revtype
are. Those are the ordinal of the RevisionType enum that encapsulates the following state modifications:
ADD (0): Indicates that the entity was added (persisted) at that revision.
MOD (1): Indicates that the entity was modified (one or more of its fields) at that revision.
DEL (2): Indicates that the entity was deleted (removed) at that revision.
Getting Entity Snapshot
With the help of AuditReaderFactory
, we can query the entity snapshots a.k.a history of a record.
AuditReaderFactory.get(entityManager).createQuery()
.forRevisionsOfEntity(UserEntity.class, true, false)
.add(AuditEntity.id().eq(userId))
// .add(AuditEntity.property("your-prop-name").eq(id))
.getResultList();
If the *_aud
table has id
column that keeps the id of the entity, we can safely use AuditEntity.id()
to point the id column. Otherwise, AuditEntity.property(“your-prop-name”)
can be used. .forRevisionsOfEntity()
takes three parameters.
c
— The class of entities for which to query.selectEntitiesOnly
— if true, instead of a list of three-element arrays, a list of entities will be returned as a result. If you want to get a list of the history of a record, set it to true. Setting it false will return a three-element array as shown below.selectDeletedEntities
— if true, the revisions where entities were deleted will be returned.
I recommend creating an audit repository so that you can call any entity revision anywhere:
import org.hibernate.envers.AuditReader;
import org.hibernate.envers.AuditReaderFactory;
import org.hibernate.envers.query.AuditEntity;
import org.springframework.stereotype.Repository;
import org.springframework.util.StringUtils;
import javax.persistence.EntityManager;
import javax.persistence.PersistenceContext;
import java.util.List;
import java.util.Objects;
@Repository
public class AuditRepository {
@PersistenceContext
private EntityManager entityManager;
private AuditReader getAuditReader() {
return AuditReaderFactory.get(entityManager);
}
public <T> List<T> getRevisions(final Class<T> tClass, final String prop, final Object propValue) {
if (Objects.isNull(tClass) || !StringUtils.hasText(prop) || Objects.isNull(propValue)) {
throw new IllegalArgumentException("Invalid params.");
}
try {
return getAuditReader().createQuery()
.forRevisionsOfEntity(tClass, true, false)
.add(AuditEntity.property(prop).eq(propValue))
.getResultList();
} finally {
entityManager.close();
}
}
}
There are many options for creating a query. You can check them out here.
If you use getRevisions to get a history of a user by their email address, you can call it auditRepository.getRevisions(UserEntity.class, “email”, "lorem@ipsum.com");
, and the result will be a list as shown below.
Seeding Existing Data
If you implemented envers for the tables that already have record and audit stragety was set to ValidityAuditStrategy
, you may get errors when Envers updates therevend
column — Cannot update previous revision for entity com.example.demo.entity.UserEntity_AUD
. Don’t worry! For the existing records, we can write a script to seed the audit tables.
Here is a sample script for our users_aud
table.
CREATE SEQUENCE IF NOT EXISTS hibernate_sequence;
DO
$$
DECLARE
u RECORD;
next_id BIGINT;
BEGIN
FOR u IN SELECT * FROM users
LOOP
SELECT NEXTVAL('hibernate_sequence')
INTO next_id;
INSERT INTO revinfo (rev, revtstmp)
VALUES (next_id,
(SELECT EXTRACT(EPOCH FROM NOW() AT TIME ZONE 'utc')) * 1000);
INSERT INTO users_aud
VALUES (u.id, next_id, NULL, 0, NULL, u.name, u.surname, u.email_address, u.last_modified_by, u.last_modified_date);
END LOOP;
END;
$$;
With this script, we make sure that the seeded data represents the current insert (ADD)
snapshot of all the audited data prior to future update (MOD)
changes being applied.
Congrats! You’ve now created a reliable system for tracking change data. Your team will be able to monitor data changes real-time and make amendments when needed. No matter how much data gets updated, it will be recorded and audited.
Interested in digging deeper? You can clone the GitHub repository, and debug the sample project to see how it works.
If you are also interested in how to write integration tests with testcontainers, make sure to read the next blog.