Spring Batch Processing Example

Chamith Kodikara
8 min read · Sep 2, 2019

Hi, let's have a quick look at Spring Batch processing.

First let’s see what batch processing is…

Batch processing is a technique that processes data in large groups (chunks) instead of as single elements. It is used to handle high volumes of data and apply any required modifications with minimal human interaction. This has many advantages when dealing with large amounts of data, for example in report generation, billing, and log analysis systems.

That is the basic idea of batch processing. In this post we are going to take a look at Spring Batch.

Spring Batch Processing

Spring Batch is an open-source batch processing framework provided by Spring.

A lightweight, comprehensive batch framework designed to enable the development of robust batch applications vital for the daily operations of enterprise systems.

How does Spring Batch work?

The diagram in the Spring Batch reference documentation (https://docs.spring.io/spring-batch/trunk/reference/html/domain.html) shows the basic structure of Spring Batch, which follows a typical batch processing architecture. Let's see how each of these components works in Spring Batch.

JobRepository — Manages the state of Jobs and Steps. All of this management data is stored in the Spring Batch metadata tables in the database, whose schema is specified by Spring Batch.

JobLauncher — A simple interface for running a Job, including ad-hoc executions. JobLauncher can be used directly by the user, but it makes no guarantee about whether the Job executes synchronously or asynchronously; that depends on the implementation.

Job — A single execution unit that defines how the whole process works. A Job is an explicit abstraction representing the configuration of a job specified by a developer.

Step — A Step is the processing unit of a Job; a Job can contain one or more Steps depending on the logic we define, and each Step can follow either the chunk model or the tasklet model. As with a Job, a Step explicitly represents the configuration of a step by a developer.

ItemReader, ItemProcessor, ItemWriter — ItemReader and ItemWriter are the components that read and write data, converting files and records to Java objects and vice versa. The ItemProcessor sits between the read and write phases; this is where we can apply business logic, data conversions, and so on.

That was the basic background of how Spring Batch works and its main components. You can find a more detailed description in the Spring Batch documentation linked above.

Getting Started….

Now that we have a basic idea of what batch processing is, let's build a sample application to see how Spring Batch works. For this I'm going to use a Spring Boot application. To begin, let's create a simple Spring Boot application using Spring Initializr.

Now we need to add Spring Batch features to our app. For this, we are going to add spring-boot-starter-batch to the pom.xml.

<dependency>
    <groupId>org.springframework.boot</groupId>
    <artifactId>spring-boot-starter-batch</artifactId>
</dependency>

This pulls in the spring-batch-core and spring-batch-infrastructure versions most compatible with our Spring Boot version. Since I'm using the latest Spring Boot version (2.1.7.RELEASE), we get Spring Batch 4.1.2.RELEASE in our application.

Now that we have Spring Batch core and infrastructure in our application, we can start developing our sample batch application.

In this demo I'm going to read some sample data from a CSV file and move it into a MySQL database using Spring Batch. First, I'm going to create a sample CSV file called "orders.csv" with order_ref, amount, order_date and note fields in the resources path and add a few records.
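
For illustration, the file might look something like this; the header row matches the field names above and the records are just made-up sample values:

order_ref,amount,order_date,note
ORD-001,1500.00,2019-08-01,first order
ORD-002,250.50,2019-08-05,repeat customer
ORD-003,980.75,2019-08-12,priority delivery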

Next I’m going to add DB connection properties to the application.properties.

# Platform configs
spring.application.name=batch-demo
spring.jpa.database-platform=org.hibernate.dialect.MySQL5Dialect
spring.datasource.url=jdbc:mysql://localhost:3306/batch_demo_db
spring.datasource.username=root
spring.datasource.password=admin
spring.jpa.show-sql=true

Earlier I mentioned that Spring Batch uses a set of database tables to keep the management data for the batch process. Their schema is specified by Spring Batch, and the tables are named with the prefix "BATCH_" in our database.

There are two ways to initialize these tables. The first is to let Spring Batch create them automatically; for this we need to add the following property to our application.properties file.

spring.batch.initialize-schema=always

If we set it to "never" instead of "always", the initialization is switched off.

The other way to initialize the batch tables is to create them manually; in this sample app I have added the schema script in the dbscript directory.

Now that we have completed the groundwork needed to start our application, we can begin the implementation.

I'm going to create the entity class for orders so we can move this data into the DB. This is just a basic entity with id, order reference, amount, date and note fields.
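
A minimal sketch of what that entity could look like. The field names, the extra transient rawOrderDate field (used later by the ItemProcessor), and the JPA annotations are my assumptions; the spring.jpa.* properties above suggest the JPA starter is also on the classpath, but the actual class in the repository may differ:

import java.math.BigDecimal;
import java.time.LocalDateTime;
import java.util.Date;

import javax.persistence.Entity;
import javax.persistence.GeneratedValue;
import javax.persistence.GenerationType;
import javax.persistence.Id;
import javax.persistence.Table;
import javax.persistence.Transient;

@Entity
@Table(name = "orders")
public class Order {

    @Id
    @GeneratedValue(strategy = GenerationType.IDENTITY)
    private Long id;

    private String orderRef;
    private BigDecimal amount;

    // Raw value parsed from the CSV; converted by the ItemProcessor and not persisted
    @Transient
    private Date rawOrderDate;

    // Value that actually ends up in the database
    private LocalDateTime orderDate;

    private String note;

    // getters and setters omitted for brevity
}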

Let's enable batch processing in our application using the "@EnableBatchProcessing" annotation. This enables the Spring Batch features in our application. As a best practice, add all the "@Enable" annotations to the main class of your application; since all feature enabling then happens in one place, disabling a feature is just a matter of removing the corresponding annotation from the main class.
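
For example, on the main class (the class name here is just a placeholder for whatever Spring Initializr generated):

import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication
@EnableBatchProcessing // turns on Spring Batch auto-configuration (JobRepository, JobLauncher, ...)
public class BatchDemoApplication {

    public static void main(String[] args) {
        SpringApplication.run(BatchDemoApplication.class, args);
    }
}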

Now let's start our batch configuration. I'm creating a BatchConfig class in the app and autowiring "JobBuilderFactory" and "StepBuilderFactory" into it.

JobBuilderFactory and StepBuilderFactory

Spring Batch uses JobBuilderFactory to create job builders and StepBuilderFactory to create step builders; both are initialized with the JobRepository and PlatformTransactionManager automatically.
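
A minimal sketch of that configuration class (the class name is my placeholder; the reader, processor, writer, step and job beans shown in the following sections would live inside it):

import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.context.annotation.Configuration;

@Configuration
public class BatchConfig {

    // Both factories are provided by @EnableBatchProcessing and are already
    // wired to the JobRepository and PlatformTransactionManager
    @Autowired
    private JobBuilderFactory jobBuilderFactory;

    @Autowired
    private StepBuilderFactory stepBuilderFactory;

    // reader, processor, writer, step and job beans are defined below
}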

Config the ItemReader

Now we need to configure the ItemReader to read data from the orders.csv file. I'm creating a Spring bean using FlatFileItemReader<T>, an implementation of the ItemReader interface.

Here I create a FlatFileItemReaderBuilder for the Order entity, give it a name and the resource where our CSV file lives, and skip the first line so the CSV headers are not read as data.

The names identify the fields to be read, and the lineMapper is a custom LineMapper implementation that maps each file line to the domain object, as sketched below.
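
A minimal sketch of such a reader bean inside BatchConfig, assuming orders.csv sits on the classpath; the bean name, the reader name, and the orderLineMapper() method it references are my placeholders:

@Bean
public FlatFileItemReader<Order> orderItemReader() {
    return new FlatFileItemReaderBuilder<Order>()
            .name("orderItemReader")                        // reader name, used to store execution state
            .resource(new ClassPathResource("orders.csv"))  // the CSV file on the classpath
            .linesToSkip(1)                                 // skip the header row
            .lineMapper(orderLineMapper())                  // custom line mapping, defined below
            .build();
}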

LineMapper

We can implement a custom line mapper to map lines from the CSV file to the domain object using the LineMapper interface. A FieldSetMapper implementation maps each field to a domain object field, and we plug it into the line mapper together with the delimiter configuration, as sketched below.
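
A sketch of that line mapper bean in BatchConfig, assuming the column names from the CSV above; the method and field names are my assumptions:

@Bean
public LineMapper<Order> orderLineMapper() {
    DelimitedLineTokenizer tokenizer = new DelimitedLineTokenizer(); // comma is the default delimiter
    tokenizer.setNames("order_ref", "amount", "order_date", "note"); // field names used by the FieldSetMapper

    DefaultLineMapper<Order> lineMapper = new DefaultLineMapper<>();
    lineMapper.setLineTokenizer(tokenizer);
    lineMapper.setFieldSetMapper(new OrderFieldMapper()); // maps each tokenized line to an Order
    return lineMapper;
}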

FieldSetMapper

I have implemented a custom OrderFieldMapper using FieldSetMapper to map each field to the domain object, along these lines.
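
A minimal sketch of such a mapper, assuming the Order fields and CSV date format from the earlier sketches; the original implementation may differ:

import org.springframework.batch.item.file.mapping.FieldSetMapper;
import org.springframework.batch.item.file.transform.FieldSet;
import org.springframework.validation.BindException;

public class OrderFieldMapper implements FieldSetMapper<Order> {

    @Override
    public Order mapFieldSet(FieldSet fieldSet) throws BindException {
        Order order = new Order();
        order.setOrderRef(fieldSet.readString("order_ref"));
        order.setAmount(fieldSet.readBigDecimal("amount"));
        // readDate returns a java.util.Date; the ItemProcessor converts it later
        order.setRawOrderDate(fieldSet.readDate("order_date", "yyyy-MM-dd"));
        order.setNote(fieldSet.readString("note"));
        return order;
    }
}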

Config the ItemProcessor

Now that we have our ItemReader to read the data, we are going to create an ItemProcessor to process it before we write it into the database. For this we can create a custom processor implementing the ItemProcessor interface.

OrderProcessor

I have created a custom OrderProcessor to convert the order date from java.util.Date to java.time.LocalDateTime. This is just a simple example to show how an item processor works; you can use it to implement any custom logic that needs to run before the data is written to the database.
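
A minimal sketch of such a processor, following the assumed Order shape above (a transient rawOrderDate read from the CSV and a persisted LocalDateTime orderDate); the original class may be structured differently:

import java.time.LocalDateTime;
import java.time.ZoneId;

import org.springframework.batch.item.ItemProcessor;

public class OrderProcessor implements ItemProcessor<Order, Order> {

    @Override
    public Order process(Order order) {
        // Convert the java.util.Date parsed from the CSV into the LocalDateTime
        // field that is actually written to the database
        if (order.getRawOrderDate() != null) {
            order.setOrderDate(LocalDateTime.ofInstant(
                    order.getRawOrderDate().toInstant(), ZoneId.systemDefault()));
        }
        return order;
    }
}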

Now I'm going to define this as a Spring bean in our BatchConfig configuration class.
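
In BatchConfig that could be as simple as the following; the method name is a placeholder:

@Bean
public OrderProcessor orderProcessor() {
    return new OrderProcessor();
}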

Config the ItemWriter

Now let's configure the ItemWriter to write the processed data into our MySQL DB. We can use either JdbcBatchItemWriter, which inserts data through JDBC, or JpaItemWriter, which uses JPA and an EntityManager to insert data into the database. For this example I have used the JDBC one, but you can use whichever is the best match for your requirements.
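
A sketch of a JdbcBatchItemWriter bean in BatchConfig, assuming an orders table with order_ref, amount, order_date and note columns; the SQL statement and bean name are my assumptions:

@Bean
public JdbcBatchItemWriter<Order> orderItemWriter(DataSource dataSource) {
    return new JdbcBatchItemWriterBuilder<Order>()
            // map the named SQL parameters to the Order bean properties
            .itemSqlParameterSourceProvider(new BeanPropertyItemSqlParameterSourceProvider<>())
            .sql("INSERT INTO orders (order_ref, amount, order_date, note) "
                    + "VALUES (:orderRef, :amount, :orderDate, :note)")
            .dataSource(dataSource)
            .build();
}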

Config the Job and Step

Now that the read, process, and write implementations are complete, we need to define a Step that wires the reader, processor, and writer together using the StepBuilderFactory, so it can be injected into our batch Job.

Step

In the Step I set the reader, processor, and writer, and define the chunk size as 5, so each transaction reads, processes, and writes a chunk of 5 records, as sketched below.
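
A sketch of the Step bean in BatchConfig, wiring together the reader, processor, and writer beans sketched earlier; the step and bean names are placeholders:

@Bean
public Step orderStep(JdbcBatchItemWriter<Order> writer) {
    return stepBuilderFactory.get("orderStep")
            .<Order, Order>chunk(5)          // commit interval: 5 items per transaction
            .reader(orderItemReader())
            .processor(orderProcessor())
            .writer(writer)
            .build();
}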

Job

Finally, we need to configure our Job.

In the job configuration I inject a step and a listener into the Job, as sketched below. We can implement custom listeners using JobExecutionListenerSupport to listen and log results before or after job execution.
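
A sketch of the Job bean in BatchConfig; RunIdIncrementer simply gives each run a fresh job instance id, and the job, bean, and listener names are my placeholders:

@Bean
public Job importOrderJob(JobCompletionListener listener, Step orderStep) {
    return jobBuilderFactory.get("importOrderJob")
            .incrementer(new RunIdIncrementer()) // new job instance id for every run
            .listener(listener)
            .flow(orderStep)
            .end()
            .build();
}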

The listener's logs print before and after the Job, and we can put our custom implementation in those methods; a minimal example follows.
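
A minimal listener sketch; the class name, log messages, and @Component registration are my assumptions, and the original implementation may differ:

import org.slf4j.Logger;
import org.slf4j.LoggerFactory;
import org.springframework.batch.core.BatchStatus;
import org.springframework.batch.core.JobExecution;
import org.springframework.batch.core.listener.JobExecutionListenerSupport;
import org.springframework.stereotype.Component;

@Component
public class JobCompletionListener extends JobExecutionListenerSupport {

    private static final Logger log = LoggerFactory.getLogger(JobCompletionListener.class);

    @Override
    public void beforeJob(JobExecution jobExecution) {
        log.info("Job {} is starting", jobExecution.getJobInstance().getJobName());
    }

    @Override
    public void afterJob(JobExecution jobExecution) {
        if (jobExecution.getStatus() == BatchStatus.COMPLETED) {
            log.info("Job finished, verify the results in the orders table");
        }
    }
}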

This is a basic implementation example of Spring Batch processing. I have only used the chunk model here, but Spring Batch also provides a tasklet model; tasklets are meant to perform a single task within a step. You can read more about these two models in the Spring Batch documentation.

This is just a basic implementation you can use to get an idea of how Spring Batch works. From there you can build your own implementation and find the best way to use it for your requirements.

The full implementation and DB scripts can be found on GitHub.
