Input Validation Made Simple in Spring Batch Applications

Stefano Stasuzzo
Dec 20, 2019 · 5 min read

Hello everyone! I’m Stefano and this is my first post on Medium. I am a Data Engineer at Quantyca, an Information Technology consulting company, and every day I am committed to facing new technological challenges in data management, system integration, Big Data architectures and software development.

In this post, I will explain the major features of Spring Batch, a widely used batch-processing framework, and show how the JSR-380 Bean Validation specification and Hibernate Validator can be integrated with it to build robust applications. In the last section, I will present a practical use case that combines these technologies.


Introduction

Let’s start by introducing two basic concepts: input validation and batch applications.

Input validation is the activity of ensuring that only properly formed data is allowed to enter an information system. It prevents malformed input from causing unexpected behavior in downstream components, which is why validation should be performed as soon as the data is received from the external party.

Batch applications are processes that can be executed without any user interaction because the set of execution steps and their execution order are specified in advance. The absence of human interaction makes the validation task critical: we have to ensure that improperly formed input is not processed, so that it causes neither the abort of the entire process nor the persistence of inconsistent data.

In the next sections, we will take a closer look at Spring Batch, JSR-380 and Hibernate Validator.


Spring Batch

Spring Batch is a Spring-based framework designed to facilitate the development of reliable, resilient, and robust batch applications.
The major features provided by this framework are:

  • Different input/output sources
  • Native management of Start/Stop/Restart and Retry/Skip
  • Possibility to choose different programming languages
  • Monitoring the execution status of active/completed batches
  • Job processing statistics
  • Many job launch options (e.g. scripts, REST API, JMS messages…)
  • Transaction management

Thanks to optimization and partitioning techniques, Spring Batch also provides features that enable extremely high-volume and high-performance batch jobs.

Spring Batch framework schema

A batch process is typically encapsulated by a Job that may be associated with many JobInstances, each of which is defined uniquely by its particular JobParameters that are used to start a batch job. Each run of a JobInstance is referred to as a JobExecution that tracks what happened during a run, such as current and exit statuses, start and end times.

A Step is an independent specific phase of a batch Job, such that every Job is composed of one or more Steps. Similar to a Job, a Step has an individual StepExecution that represents a single attempt to execute a Step.

A Job is executed by a JobLauncher, and metadata about the executed jobs are stored in a JobRepository that also provides CRUD operations for JobLauncher, Job, and Step instantiations. Once a Job is launched, a JobExecution is obtained from the repository and, during the execution, StepExecution and JobExecution instances are persisted to the repository.
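To make the relationship between these concepts concrete, here is a minimal plain-Java model. It is only a sketch and does not use the Spring Batch classes: ToyJobRepository, the job name cleanCsvJob and the inputFile parameter are hypothetical names chosen for illustration. It shows the one idea that tends to confuse newcomers: running a Job again with the same JobParameters creates a new JobExecution of the same JobInstance, not a new JobInstance.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Plain-Java model of the concepts above; these are NOT the Spring Batch classes.
final class ToyJobRepository {
    // A JobInstance is identified by the job name plus its parameters.
    // The list holds the exit status of each execution (run attempt).
    private final Map<String, List<String>> executionsByInstance = new HashMap<>();

    // Records one execution and returns the key of the instance it belongs to.
    String recordExecution(String jobName, Map<String, String> jobParameters, String exitStatus) {
        // A TreeMap gives a stable key regardless of parameter insertion order.
        String instanceKey = jobName + new TreeMap<String, String>(jobParameters).toString();
        executionsByInstance
                .computeIfAbsent(instanceKey, k -> new ArrayList<String>())
                .add(exitStatus);
        return instanceKey;
    }

    int instanceCount() {
        return executionsByInstance.size();
    }

    List<String> executionsOf(String instanceKey) {
        return executionsByInstance.getOrDefault(instanceKey, new ArrayList<String>());
    }
}

final class JobModelDemo {
    public static void main(String[] args) {
        ToyJobRepository repository = new ToyJobRepository();
        Map<String, String> params = new HashMap<>();
        params.put("inputFile", "customers.csv"); // hypothetical parameter

        // The first run fails, the second run of the SAME instance completes.
        String key = repository.recordExecution("cleanCsvJob", params, "FAILED");
        repository.recordExecution("cleanCsvJob", params, "COMPLETED");

        System.out.println(repository.instanceCount());   // prints 1
        System.out.println(repository.executionsOf(key)); // prints [FAILED, COMPLETED]
    }
}
```

In the real framework this bookkeeping is done for you by the JobRepository; the sketch only mirrors the identity rule.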


JSR-380 and Hibernate Validator

JSR-380 Bean Validation is a specification whose goal is to standardize the validation of Java beans through the use of annotations directly in a Java bean class. This feature allows validation rules to be specified directly in the class they are intended to validate, instead of creating validation rules in separate classes. This specification also allows you to:

  • Express constraints on object models via annotations
  • Write custom constraints in an extensible way
  • Validate objects and object graphs using the provided APIs
  • Validate parameters and return values of methods and constructors
  • Report the set of violations

Common annotations are:

  • @NotNull: specifies that a property must not be null
  • @NotEmpty: specifies that a property must be neither null nor empty
  • @Size: ensures that the size of a property is between the min and max attributes
  • @Email: specifies that a property must be a valid email address
  • @AssertTrue / @AssertFalse: ensures that a property value is true / false
  • @Positive / @Negative: ensures that a property is strictly positive / strictly negative
  • @Past / @Future: specifies that a date must be in the past / future
  • @Max / @Min: specifies that a property has a value not greater / not smaller than the value attribute

Each annotation accepts its own attributes, but the message attribute, whose text is reported when the value of the property fails validation, is common to all of them.
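To show how annotation-driven validation works in principle, here is a dependency-free toy version. Please note the assumptions: @ToyNotNull, @ToySize, ToyCustomer and ToyValidator are local stand-ins written for this sketch, not the javax.validation API. With the real API you would obtain a Validator from Validation.buildDefaultValidatorFactory().getValidator() and call validate(bean), which returns a set of ConstraintViolation objects carrying the message.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.util.ArrayList;
import java.util.List;

// Local stand-ins that imitate the shape of @NotNull and @Size.
// They are NOT the javax.validation annotations.
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ToyNotNull {
    String message() default "must not be null";
}

@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.FIELD)
@interface ToySize {
    int min() default 0;
    int max() default Integer.MAX_VALUE;
    String message() default "size out of bounds";
}

// Hypothetical bean: the rules live next to the fields they constrain.
final class ToyCustomer {
    @ToyNotNull(message = "name is mandatory")
    String name;

    @ToySize(min = 2, max = 2, message = "country must be a 2-letter code")
    String country;

    ToyCustomer(String name, String country) {
        this.name = name;
        this.country = country;
    }
}

final class ToyValidator {
    // Returns the message of every violated constraint; an empty list means valid.
    static List<String> validate(Object bean) {
        List<String> messages = new ArrayList<>();
        for (Field field : bean.getClass().getDeclaredFields()) {
            field.setAccessible(true);
            Object value;
            try {
                value = field.get(bean);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
            ToyNotNull notNull = field.getAnnotation(ToyNotNull.class);
            if (notNull != null && value == null) {
                messages.add(notNull.message());
            }
            ToySize size = field.getAnnotation(ToySize.class);
            // Like the real @Size, a null value is not treated as a size violation.
            if (size != null && value instanceof CharSequence) {
                int length = ((CharSequence) value).length();
                if (length < size.min() || length > size.max()) {
                    messages.add(size.message());
                }
            }
        }
        return messages;
    }
}

final class ToyValidationDemo {
    public static void main(String[] args) {
        // A valid bean yields no messages; an invalid one yields one message per violation.
        System.out.println(ToyValidator.validate(new ToyCustomer("Ada", "IT")));
        System.out.println(ToyValidator.validate(new ToyCustomer(null, "ITA")));
    }
}
```

The point of the sketch is the division of labour: the bean only declares the rules and their messages, while a generic validator discovers and applies them.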

You can use JSR-380 Bean Validation by adding this dependency to your Maven project:

<dependency>
    <groupId>javax.validation</groupId>
    <artifactId>validation-api</artifactId>
    <version>2.0.0.Final</version>
</dependency>

Hibernate Validator is the reference implementation of the Bean Validation specification and allows us to declare and validate application constraints. The default metadata source is annotations, but they can be overridden and extended through XML. It is not tied to a specific application tier or programming model and is available for both server and client application programming. It offers a configurable bootstrap API as well as a range of built-in constraints that can easily be extended by creating custom constraints.

You can use Hibernate Validator by adding this dependency to your Maven project:

<dependency>
    <groupId>org.hibernate.validator</groupId>
    <artifactId>hibernate-validator</artifactId>
    <version>6.0.17.Final</version>
</dependency>

Now we will see how easily we can implement a Step that validates an input file by combining the features of JSR-380 and Hibernate Validator.


How to integrate Spring Batch with JSR-380 and Hibernate Validator

Let’s implement a simple single-step Spring Batch application that removes invalid lines from a CSV file.

The step will:

  1. Read an input file
  2. Validate each line using JSR-380 and Hibernate Validator
  3. Write an output file that will contain only the valid lines
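The three actions above can be sketched in plain Java as follows. This is only an illustration of the read, validate, write flow, not the Spring Batch implementation: CsvCleaner and the rule "exactly three non-blank fields" are assumptions made for the example, and in the real step the check would be delegated to the Validator.

```java
import java.io.BufferedReader;
import java.io.BufferedWriter;
import java.io.IOException;
import java.io.Reader;
import java.io.StringReader;
import java.io.StringWriter;
import java.io.Writer;
import java.util.function.Predicate;

final class CsvCleaner {
    // Copies the valid lines from input to output and returns how many lines were rejected.
    static int clean(Reader input, Writer output, Predicate<String> isValid) throws IOException {
        int rejected = 0;
        BufferedReader reader = new BufferedReader(input);
        BufferedWriter writer = new BufferedWriter(output);
        String line;
        while ((line = reader.readLine()) != null) { // 1. read
            if (isValid.test(line)) {                // 2. validate
                writer.write(line);                  // 3. write
                writer.newLine();
            } else {
                rejected++;
            }
        }
        writer.flush(); // the caller owns the streams, so we flush but do not close
        return rejected;
    }

    // Hypothetical rule used only in this sketch: exactly three fields, none blank.
    static boolean hasThreeNonBlankFields(String line) {
        String[] fields = line.split(",", -1);
        if (fields.length != 3) {
            return false;
        }
        for (String field : fields) {
            if (field.trim().isEmpty()) {
                return false;
            }
        }
        return true;
    }
}

final class CsvCleanerDemo {
    public static void main(String[] args) throws IOException {
        StringWriter out = new StringWriter();
        int rejected = CsvCleaner.clean(
                new StringReader("1,Ada,IT\n2,,FR\n3,Bob\n4,Eve,DE\n"),
                out,
                CsvCleaner::hasThreeNonBlankFields);
        System.out.println(rejected); // prints 2
        System.out.print(out);        // the two valid lines: 1,Ada,IT and 4,Eve,DE
    }
}
```

Passing the rule as a Predicate keeps the I/O loop independent of the validation logic, which is the same separation the Validator gives us in the real application.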

Since the purpose of this post is to explain how to implement the validation step, the initial set-up of the Spring Batch application is not covered here, but you can find a useful guide on the official Spring website.

Let’s define the components that we need to develop this kind of batch application:

  1. Define the main class of the Spring Batch application.

  2. Write a configuration class in which the validation step is defined.

  3. Define the Bean class that represents an input line and annotate each field with the desired JSR-380 annotation.

  4. Define the Tasklet that implements the validation step and inject the Validator object made available by the Hibernate Validator library.
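As a rough, dependency-free sketch of how these four components could fit together, consider the code below. Everything in it is an assumption made for illustration: InputLine, ValidationTask, BatchWiring and BatchMain are hypothetical names, the id and email fields are invented, and a plain Function stands in for the injected javax.validation.Validator. In a real application the task would implement the Spring Batch Tasklet interface and the wiring would live in a @Configuration class.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Function;

// 3. Bean for one input line. The fields are hypothetical; in the real bean
//    they would carry JSR-380 annotations such as @NotNull or @Email.
final class InputLine {
    final String id;
    final String email;

    InputLine(String id, String email) {
        this.id = id;
        this.email = email;
    }

    static InputLine parse(String csvLine) {
        String[] fields = csvLine.split(",", -1);
        return new InputLine(fields[0].trim(), fields.length > 1 ? fields[1].trim() : "");
    }
}

// 4. Tasklet-like component. The validator is injected through the constructor;
//    here it is a plain function standing in for javax.validation.Validator.
final class ValidationTask {
    private final Function<InputLine, List<String>> validator;
    final List<String> rejections = new ArrayList<>();

    ValidationTask(Function<InputLine, List<String>> validator) {
        this.validator = validator;
    }

    // Returns the valid lines and records why each rejected line failed.
    List<String> execute(List<String> lines) {
        List<String> valid = new ArrayList<>();
        for (int i = 0; i < lines.size(); i++) {
            List<String> violations = validator.apply(InputLine.parse(lines.get(i)));
            if (violations.isEmpty()) {
                valid.add(lines.get(i));
            } else {
                rejections.add("line " + (i + 1) + ": " + String.join("; ", violations));
            }
        }
        return valid;
    }
}

// 2. Configuration-like wiring: the single place where the step gets its dependencies.
final class BatchWiring {
    static ValidationTask validationStep() {
        return new ValidationTask(line -> {
            List<String> violations = new ArrayList<>();
            if (line.id.isEmpty()) {
                violations.add("id is mandatory");
            }
            if (!line.email.contains("@")) {
                violations.add("email is not valid");
            }
            return violations;
        });
    }
}

// 1. Main class: launches the single step.
final class BatchMain {
    public static void main(String[] args) {
        ValidationTask step = BatchWiring.validationStep();
        List<String> valid = step.execute(
                Arrays.asList("1,ada@example.com", ",bob@example.com", "3,not-an-email"));
        System.out.println(valid);                    // prints [1,ada@example.com]
        step.rejections.forEach(System.out::println); // one report line per rejected input line
    }
}
```

Keeping a report of the rejected lines together with the violation messages is a design choice of this sketch: since nobody watches a batch run, the messages are the only trace of why a line was dropped.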

Conclusion

In this post, we have seen how easy it is to integrate the JSR-380 specification in the validation step of a Spring Batch application, but the surprises are not over! JSR-380 lets you define custom annotations, and Hibernate Validator lets you implement a custom validator for each custom constraint. These interesting features will be explained in a future post.

If you liked this post and you want to stay updated on the news, follow us on LinkedIn!
