Making collection tables safe and accessible
Very often we work on applications which are built using Spring Boot and related libraries, one of the prominent library among them is Spring Data JPA, which adds a repository support to JPA (Java Persistence API specification). Common vendor implementation of JPA used is Hibernate. Their purpose is to provide ORM support. In a nutshell ORM tool/framework tries to solve Object Relational Impedance Mismatch problems.
I am sure everyone must have encountered the situation where they had to persist an Entity along with few value objects. (Here we are referring to Entity as an instance which has a unique Identifier within the system on the other hand Value objects don’t have a conceptual identity and they pose structural equality, as per Domain Driven Design. It’s a topic for another blog)
CollectionTable from JPA is used for the same, just to give an example There is an entity Employee which contains a list of that EmployeeAddress.
@Entity
@Getter
@Setter
@Table(name="Employee",schema = "org")
@AllArgsConstructor
@NoArgsConstructor
public class Employee {
@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
private Long id;
private String name;
@ElementCollection(targetClass = EmployeeAddress.class)
@CollectionTable(name = "employee_address", joinColumns = @JoinColumn(name = "employee_id"))
@LazyCollection(LazyCollectionOption.FALSE)
private List<EmployeeAddress> employeeAddressList;
} @LazyCollection(LazyCollectionOption.FALSE)
private List<EmployeeAddress> employeeAddressList;
}
EmployeeAddress.java
@Getter
@Setter
@Embeddable
public class EmployeeAddress {
private String city;
private String Country;
}
So whenever the Employee object is saved, the list of EmployeeAddress also gets overwritten completely.
public Employee saveOrUpdate(Employee employee) throws Exception {
return employeeRepository.save(employee);
}
Some problems with collection tables
- There is always a deletion and insertion of the embedded object list while saving the parent entity.
- There could possibly be an unintended duplication of records in embedded table.
Every time deletion and insertion of the embedded object
This will become an issue when the embedded object list is big. As there will be a large number delete and insert queries that will be run causing unnecessary I/O operations on DB. See the sql generated to save an employee with unmodified employee address.
Hibernate:
select
employee0_.id as id1_8_0_,
employee0_.name as name2_8_0_
from
cms.employee employee0_
where
employee0_.id=?
Hibernate:
select
employeead0_.employee_id as employee1_9_0_,
employeead0_.country as country2_9_0_,
employeead0_.city as city3_9_0_
from
cms.employee_address employeead0_
where
employeead0_.employee_id=?
Hibernate:
delete
from
cms.employee_address
where
employee_id=?
Hibernate:
insert
into
cms.employee_address
(employee_id, country, city)
values
(?, ?, ?)
Hibernate:
insert
into
cms.employee_address
(employee_id, country, city)
values
(?, ?, ?)
Hibernate:
insert
into
cms.employee_address
(employee_id, country, city)
values
(?, ?, ?)
The logs contain 3 inserts and 1 delete statement. This will happen on every save call for the Employee entity.
This insert/update of an employee can also be achieved by marking EmployeeAddress as an entity and removing the @ElementCollection and @CollectionTable annotations.
But this makes more sense when there is a service layer which directly interacts with the EmployeeAddress(e.g there is another service to directly save/fetch/delete EmployeeAddress) and it’s an Entity.
Unintended duplication of records
A situation could arise where a system reaches a dead lock while updating the parent table (Employee) and the collection table (EmployeeAddress).
Since the updates happen in a delete then insert manner for a collection table and because of a deadlock, there is a chances of duplicate entry of the same record. See the snapshot:
Remedy : Introduction of a primary key in collection table
We can address both the problems, if we find a way to associate a primary key with the collection table.
It can be achieved by hibernate’s @OrderColumn annotation.
@Entity
@Getter
@Setter
@Table(name="Employee",schema = "org")
@AllArgsConstructor
@NoArgsConstructor
public class Employee {
@Id
@GeneratedValue(strategy=GenerationType.IDENTITY)
private Long id;
private String name;
@ElementCollection(targetClass = EmployeeAddress.class)
@CollectionTable(name = "employee_address", joinColumns = @JoinColumn(name = "employee_id"))
@LazyCollection(LazyCollectionOption.FALSE)
@OrderColumn(name = "order_column_id", nullable = false)
private List<EmployeeAddress> employeeAddressList;
}
@OrderColumn(name = “order_column_id”, nullable = false) will add new column order_column_id to the employee_address table.
The combination of the join column i.e employee_id and order_column_id will form a primary key on the employee_address.
The order_column_id value for each record for an employee_id will be in the increasing order like 0,1,2,3. See the snapshot
Now let us see what happens in case we add a new address for the same employee_id: 2. The input json with previous two addresses still present is:
employee: {
id: 2,
name: "Rohit",
employeeAddressList : [{
city:"Noida",
country: "India"
},
{
city:"Delhi",
country: "India"
},
{
city:"Chennai",
country: "India"
}]
}
This is what the sql generated looks like when our application calls save() for EmployeeRepository.java for Employee entity.
Hibernate:
select
employee0_.id as id1_8_0_,
employee0_.name as name2_8_0_
from
cms.employee employee0_
where
employee0_.id=?
Hibernate:
select
employeead0_.employee_id as employee1_9_0_,
employeead0_.country as country2_9_0_,
employeead0_.city as city3_9_0_,
employeead0_.order_column_id as order_co4_0_
from
cms.employee_address employeead0_
where
employeead0_.employee_id=?
Hibernate:
insert
into
cms.employee_address
(employee_id, order_column_id, country, city)
values
(?, ?, ?, ?)
From the logs it is clear that the insert operation got executed only once.
To check for primary key constraint on employee_id and order_column_id try executing this query:
INSERT INTO cms.employee_address(
employee_id, country, city, order_column_id)
VALUES (2, 'India', 'Bangalore', 1);
You will see the following exception from the sql server:
ERROR: duplicate key value violates unique constraint "employee_address_pkey" DETAIL: Key (employee_id, order_column_id)=(2, 1) already exists. SQL state: 23505
Final Takeaway
Using the @OrderColumn annotation we can add a primary key to a collection table, thus helping in maintaining uniqueness of records in the collection table and also can prevent unnecessary insert and deletes in the collection table.
Himanshu Mishra is Senior Consultant at KodeYoga, and provide value consulting in Full stack development, Cloud Computing, Reactive programming following best practices to convert clients’ vision into reality.