Spring Boot + JPA — Clear Tests

Published in Javarevisited · 7 min read · Dec 17, 2021

I have written several articles about testing Spring Boot applications. We discussed integration testing with a database and, in particular, the Testcontainers library. But today I want to cover an even more important topic. You see, tests are essential to building robust software. Poorly written tests, however, do the opposite. They slow down the development process and make code hard to change and maintain.

When it comes to testing, one has to declare test data. It can be items in a collection, records in the database, or messages in a distributed queue. Behavior cannot be tested without data.

So, today we’re discussing patterns for creating test rows in the database with Spring Boot, Spring Data, and JPA.

You can find all code snippets in the repository.

Domain Model

Suppose we’re developing a Medium-like service. People can write posts and leave comments. Here is the database schema.

And here are the corresponding Hibernate entities.

Post

Comment

User

Business Case

Suppose we need to display the top N posts, together with their comments, sorted by rating in descending order.

The implementation details have been omitted for brevity because it’s not the topic of the article. You can find the complete working solution in the repository.

Time to write some tests. Suppose we have 10 posts. Each post has 2 comments. And we want to retrieve the top 3 posts. Here is how the test may look.

This test passes. It works correctly. Still, a few problems are worth mentioning.

The "given" phase is way too complex
The test has too many dependencies
Assertions are vague and unclear

Before performing the query we need to insert some records into the database. In this case, entities are being instantiated via constructors. Seems like a good design choice. All the required attributes can be passed as parameters. If another field appears later, it can be included in the constructor as well. So, we can guarantee that all non-null fields are present.

Well, the devil is in the details.

The test code becomes too coupled with the constructor definition. You see, there can be plenty of tests that need to create posts, comments, and users. If any of these entities’ constructors get enhanced with another attribute, all parts of the code that instantiate objects directly won’t compile.
Not all the entities’ fields are necessary in every test case. For example, while testing the findTopPosts method we're only interested in the rating value of the Post entity.
The arguments are not self-explanatory. The Post entity constructor has 5 positional arguments. Every time you look through the test suite, you have to stop and work out which value is assigned to which field.
The test case is imperative rather than declarative. Why is that a problem? Because tests are the API contracts. Well-written tests act as perfect class documentation.
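To make these problems concrete, here is a minimal sketch. The Post and User classes below are simplified plain-Java stand-ins (no JPA annotations), and their exact set of fields is an assumption made for illustration, not the entities from the repository.

```java
import java.time.OffsetDateTime;

// Simplified stand-ins for the entities (plain Java, no JPA annotations).
// The exact set of fields is an assumption made for illustration.
final class User {
    final String login;

    User(String login) {
        this.login = login;
    }
}

final class Post {
    final String name;
    final String content;
    final double rating;
    final User author;
    final OffsetDateTime createdAt;

    // Five positional arguments: adding a sixth breaks every call site.
    Post(String name, String content, double rating, User author, OffsetDateTime createdAt) {
        this.name = name;
        this.content = content;
        this.rating = rating;
        this.author = author;
        this.createdAt = createdAt;
    }
}

final class PositionalArgumentsDemo {
    private PositionalArgumentsDemo() {
    }

    // Only the rating matters to a "top posts" test, yet every argument
    // must be supplied, and nothing at the call site says which string
    // is the name and which one is the content.
    static Post postForRatingTest(double rating) {
        return new Post("name", "content", rating, new User("login"), OffsetDateTime.now());
    }
}
```

Every test that calls the constructor like this has to change as soon as the constructor signature changes.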

We can work around this with the Object Mother pattern: a simple factory with multiple creation policies. That is how we can instantiate new Post objects.
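A minimal sketch of such a factory could look like the following. The Post class is a simplified stand-in, and PostFactory with its default values is a hypothetical name chosen for illustration. Only the createPost method name comes from the text.

```java
// Simplified stand-in for the entity; the fields are an assumption.
final class Post {
    final String name;
    final String content;
    final double rating;

    Post(String name, String content, double rating) {
        this.name = name;
        this.content = content;
        this.rating = rating;
    }
}

// Object Mother: named factory methods hide the constructor from the tests.
final class PostFactory {
    private PostFactory() {
    }

    // For tests that do not care about any particular field.
    static Post createPost() {
        return createPost(0.0);
    }

    // For tests that only care about the rating.
    static Post createPost(double rating) {
        return createPost("default name", rating);
    }

    // For tests that care about both the name and the rating.
    static Post createPost(String name, double rating) {
        return new Post(name, "default content", rating);
    }
}
```

If the Post constructor gains a parameter, only the last factory method has to change. The tests keep compiling.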

Similar factories can be created for all entities. So, now the test looks like this.

Indeed, it’s easier to understand the flow. And it can work for simple scenarios. But let’s assume that there are several cases of creating new posts. That would require adding many overloaded methods.

What if the Post entity had more fields? What if some of these attributes were optional? Imagine how many createPost declarations we would need to cover all these cases.

The Object Mother pattern partially solves the problem of unnamed arguments by reducing their number. Still, the solution is far from perfect.

Test Data Builder

Test Data Builder is an alternative to the Object Mother pattern. It’s based on the classic GoF Builder pattern. But we’re enhancing it a bit for our test cases.

Take a look at PostTestBuilder.

It looks like a regular builder with a few slight differences.

Firstly, the class implements the Builder interface that provides the single build method. You may think that it's an overcomplication but soon you'll realise its benefits.

Secondly, the relational attributes. The builder holds Builder<User> instead of User itself (and likewise for other relations). It helps to make the client code less verbose.

And finally, mutators (withName, withContent, etc.) return a new builder instead of changing the state of the current one. It's a useful feature when we want to create many similar objects that only differ by specific arguments.
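The three points above can be sketched in plain Java as follows. This is not the PostTestBuilder from the repository. The domain classes are simplified stand-ins, and names such as aPost, aUser, withRating, withAuthor, and withLogin are assumptions introduced for illustration.

```java
// The single-method interface every test builder implements.
interface Builder<T> {
    T build();
}

// Simplified stand-ins for the entities; the fields are an assumption.
final class User {
    final String login;

    User(String login) {
        this.login = login;
    }
}

final class Post {
    final String name;
    final double rating;
    final User author;

    Post(String name, double rating, User author) {
        this.name = name;
        this.rating = rating;
        this.author = author;
    }
}

final class UserTestBuilder implements Builder<User> {
    private final String login;

    private UserTestBuilder(String login) {
        this.login = login;
    }

    static UserTestBuilder aUser() {
        return new UserTestBuilder("login");
    }

    UserTestBuilder withLogin(String login) {
        return new UserTestBuilder(login);
    }

    @Override
    public User build() {
        return new User(login);
    }
}

final class PostTestBuilder implements Builder<Post> {
    private final String name;
    private final double rating;
    private final Builder<User> author; // a builder, not a User

    private PostTestBuilder(String name, double rating, Builder<User> author) {
        this.name = name;
        this.rating = rating;
        this.author = author;
    }

    static PostTestBuilder aPost() {
        return new PostTestBuilder("name", 0.0, UserTestBuilder.aUser());
    }

    // Each mutator returns a new builder and leaves the current one untouched.
    PostTestBuilder withName(String name) {
        return new PostTestBuilder(name, rating, author);
    }

    PostTestBuilder withRating(double rating) {
        return new PostTestBuilder(name, rating, author);
    }

    PostTestBuilder withAuthor(Builder<User> author) {
        return new PostTestBuilder(name, rating, author);
    }

    // Related entities are only created at this point.
    @Override
    public Post build() {
        return new Post(name, rating, author.build());
    }
}
```

Because the mutators never modify the receiver, a base builder such as PostTestBuilder.aPost().withName("Base") can be reused to derive many posts that differ only in rating.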

The rewritten test looks like this.

Have you noticed how much cleaner the test looks? We’re creating 10 posts with a rating of i. Each post has 2 comments. Amazing!

But sadly, it does not work.

org.hibernate.TransientPropertyValueException: Not-null property references a transient value - transient instance must be saved before current operation

The save of Post.author is not propagated. It could be fixed by adding the CascadeType.PERSIST option to the relation. But perhaps the application itself has no need to cascade author persistence. So, changing the mapping just to make the test "green" is the wrong path.

Persisted Wrappers

The Post entity and all of its relations (author and comments) are being created on the build method invocation. So, we need to persist child objects when the corresponding builder processes the instantiation. Do you remember the additional Builder interface? It will help us now.

The persisted callback returns a builder that wraps the original one and persists the resulting entity as soon as it is built.
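The idea of such a wrapper can be sketched in plain Java. InMemoryDbFacade below is a hypothetical stand-in, not the TestDBFacade from the repository. It "persists" by recording entities in a list, where a real facade would delegate to the persistence layer.

```java
import java.util.ArrayList;
import java.util.List;

interface Builder<T> {
    T build();
}

// Hypothetical stand-in for a database facade.
// Saving is simulated by adding the entity to an in-memory list.
final class InMemoryDbFacade {
    private final List<Object> saved = new ArrayList<>();

    // Wraps a builder so that every build() call also saves the built entity.
    <T> Builder<T> persisted(Builder<T> builder) {
        return () -> {
            T entity = builder.build();
            saved.add(entity);
            return entity;
        };
    }

    int savedCount() {
        return saved.size();
    }
}
```

Nothing is saved when persisted is called. The save happens later, each time the wrapping builder's build method runs.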

Now we need to inject TestDBFacade into our test suite. This can be done with the @ContextConfiguration annotation.

We can do even better by declaring a custom annotation.

So, that’s the resulting test.

We do not have to wrap comments with the persisted callback because the Post.comments relation is already marked with the CascadeType.PERSIST attribute.

Now it works.

I think you’ve already guessed that not everything is tuned properly. If you turn on the SQL logging, you can notice it.

We expected to have a single user. But in reality, a new one was created for each post and each comment. That means we end up with 30 users instead (10 post authors plus 20 comment authors).

Thankfully, there is a workaround.

The entity is persisted only on the first call. Subsequent invocations return the already saved entity.

This method is not thread-safe. Nevertheless, it’s acceptable in most cases, because each test scenario is usually an independent operation that runs sequentially.
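A minimal sketch of this caching behaviour, assuming the wrapped builder saves the entity on every build call. The PersistOnce holder class is a hypothetical name, and the sketch is deliberately not thread-safe, as discussed above.

```java
interface Builder<T> {
    T build();
}

final class PersistOnce {
    private PersistOnce() {
    }

    // persistingBuilder is assumed to save the entity on every build() call.
    // The returned builder calls it at most once and caches the result.
    static <T> Builder<T> persistedOnce(Builder<T> persistingBuilder) {
        return new Builder<T>() {
            private T entity; // cached after the first call; not thread-safe

            @Override
            public T build() {
                if (entity == null) {
                    entity = persistingBuilder.build();
                }
                return entity;
            }
        };
    }
}
```

An anonymous class is used instead of a lambda because the wrapper needs a field to hold the cached entity.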

All we have to do is replace persisted with persistedOnce.

Now there is only one persisted user.

Another thing to refactor is PostRepository usage. We do not have to add additional dependencies on specific Spring Data repositories. We can add the save method directly to TestDBFacade.

Clear Assertions

The last thing that I want to discuss is assertions usage. Take a look at the current test again.

The size assertion is OK. But what about the sorting and the comments count checks? They are not transparent. When you look through the code, you have to pay extra attention to figure out the purpose of these assertions.

Anyway, that is not the only problem. What if they failed? We would get something like this in the log.

Posts should be sorted in by rating in descending order
Expected : [posts list]
Actual   : [posts list]

Each post should have 2 comments
Expected : true
Actual   : false

Not so descriptive. But there is a better alternative.

Hamcrest

Hamcrest is an assertion library that helps to build a declarative pipeline of matchers. For example, that’s how posts count can be validated.

Not so different from the initial attempt. What about the other two? Well, these assertions are domain-oriented. Hamcrest does not provide utilities for specific business cases. But what makes Hamcrest powerful is custom matchers. Here is one for comments count validation.

The matchesSafely method executes the assertion itself. In this case, we check that every post has the expected number of comments.
The describeTo method assigns a label to the assertion.
And the describeMismatchSafely method describes what went wrong in case of test failure.

The assertion usage is straightforward.

Suppose that the received posts do not have the expected count of comments. Here is the failed test output.

Expected: 2 comments in each post
     but: PostView[10] with 3 comments ; PostView[9] with 3 comments ; PostView[8] with 3 comments

The intent is obvious now. We expected that every post would have 2 comments, but there were 3. Much more expressive than a simple "expected true but was false".
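How such a mismatch message might be assembled can be sketched without Hamcrest. The PostView class below is a simplified stand-in whose fields are assumed, and CommentsCountCheck is a hypothetical helper. A real custom matcher would append this text to the mismatch description instead of returning a string.

```java
import java.util.List;
import java.util.stream.Collectors;

// Simplified stand-in for the view returned by the service; fields are assumed.
final class PostView {
    final long id;
    final int commentsCount;

    PostView(long id, int commentsCount) {
        this.id = id;
        this.commentsCount = commentsCount;
    }
}

final class CommentsCountCheck {
    private CommentsCountCheck() {
    }

    // Lists only the posts whose comments count differs from the expected one.
    static String describeMismatch(List<PostView> posts, int expected) {
        return posts.stream()
                .filter(post -> post.commentsCount != expected)
                .map(post -> "PostView[" + post.id + "] with " + post.commentsCount + " comments")
                .collect(Collectors.joining(" ; "));
    }
}
```

Reporting only the offending posts keeps the failure message short even when the result list is long.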

Time to write the matcher for posts sorting order validation.

And here is the assertion.

Matchers.is is a decorator that makes tests look more expressive, though it can be omitted.

And here is the failed test output.

Expected: is sorted posts by rating in desc order
     but: [7.0, 8.0, 9.0]

Again, the user intent is obvious. We expected to receive [9, 8, 7] but got [7, 8, 9].
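One way the core of such a sort-order check could work is to compare a sorted copy of the ratings with the original list. The sketch below is independent of Hamcrest, and SortOrderCheck is a hypothetical helper name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

final class SortOrderCheck {
    private SortOrderCheck() {
    }

    // The ratings are in descending order exactly when a descending-sorted
    // copy of the list equals the original list.
    static boolean isSortedDescending(List<Double> ratings) {
        List<Double> sorted = new ArrayList<>(ratings);
        sorted.sort(Comparator.reverseOrder());
        return sorted.equals(ratings);
    }
}
```

Sorting a copy leaves the list under test untouched, so the original values can still be printed in the failure message.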

Summary

We did some refactoring. Let’s compare the initial attempt and the final version.

The First Attempt

The Final Version

The first option is a group of commands. It’s hard to understand what exactly we are testing. The second approach, by contrast, tells us much more about the behaviour. It’s more like documentation that describes the contract of the class.

Conclusion

Once I heard a wise statement.

Tests are parts of the code that do not have tests.

It’s crucial to test business logic. But it’s even more important to make sure that tests won’t become a maintainability burden.

That’s all I wanted to tell you about writing clear tests. If you have any questions, please leave your comments down below. Thanks for reading!