Spring Cloud Dataflow on K8s

Siddhant Sorann
MiQ Tech and Analytics
5 min read · Apr 15, 2020

Providing better visibility, easier debugging and improved performance for Spring Batch jobs.

What do we do?

In today’s fast-paced digital landscape, an effective campaign management solution ensures brands and enterprises deliver relevant, personalized experiences to customers across multiple channels and touchpoints. But managing marketing campaigns isn’t easy and it comes with its share of challenges: channel overload, ROI Attribution, automation, personalization and more.

To get the best out of our campaigns, Jarvis, MiQ's campaign management tool, enables traders to manage a campaign's setup, margin, reporting and optimizations from a single interface. Jarvis combines each step of the campaign management cycle into one interface, making life easier for the traders.

Why did we start searching for platforms to run our batch jobs?

To ensure that the traders have up-to-date information available before making any decisions, we ingest data from multiple DSPs and aggregate it. Being a programmatic media buying agency, we at MiQ work closely with DSPs to buy inventory from them. To make this possible, we run 100+ jobs every day. Due to the amount of processing being done, and sometimes due to unforced errors, some of our jobs fail, which results in inaccurate data for some or all of the DSPs. The stability of our jobs had always been a concern for us.

The traders manage their campaigns and create reports based on the aggregated data we ingest, so it is very important for us to make sure that the ingested and aggregated data is accurate. If any of the jobs in our pipelines fail or don't get triggered, it could result in partial or inaccurate data. The traders would then have to recheck the data and report the inaccuracy to us, leaving us asking ourselves questions like:

“Did the batch job fail? Why did it fail? Where are the logs?”

“Why did the scheduled job not run? Do we need more resources?”

Even though we got failure notifications over email and Slack, it was very difficult to find the failure log, which made it hard to figure out why a job had failed. We were using Spring Batch Admin, which has since been deprecated, to keep track of these jobs.

Taking into account the above issues and a few more, we decided to look for a better alternative. What we wanted was a framework that could handle our batch jobs in a better fashion. Segregated logs for each job, retrying of a job if it fails, and easier scheduling of jobs with a reliable scheduler were a few of the things we were looking for.

We did a short POC around this and looked at various products like Flux, Quartz, etc. Finally, we came across Spring Cloud Dataflow (SCDF). Spring Batch Admin had also been migrated to SCDF. Our services were already written using Spring and Java, which made it feel like a good fit for us, so we decided to explore it further.

SCDF has very good support for the Kubernetes environment and since we had a Kubernetes environment in our organization, we decided to give it a shot.

About Spring Cloud Dataflow

Spring Cloud Data Flow provides tools to create complex topologies for streaming and batch data pipelines. The data pipelines consist of Spring Boot apps, built using the Spring Cloud Stream or Spring Cloud Task microservice frameworks.
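
To make that concrete, here is a minimal sketch of the kind of Spring Cloud Task / Spring Batch app that SCDF registers and launches. The class, job and step names are hypothetical, and the builder-style API shown is the Spring Batch 4.x / Spring Boot 2.x idiom from around the time of writing.

```java
import org.springframework.batch.core.Job;
import org.springframework.batch.core.Step;
import org.springframework.batch.core.configuration.annotation.EnableBatchProcessing;
import org.springframework.batch.core.configuration.annotation.JobBuilderFactory;
import org.springframework.batch.core.configuration.annotation.StepBuilderFactory;
import org.springframework.batch.repeat.RepeatStatus;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;
import org.springframework.cloud.task.configuration.EnableTask;
import org.springframework.context.annotation.Bean;

// A single-job Spring Boot app: @EnableTask lets SCDF record its executions,
// @EnableBatchProcessing wires in the Spring Batch infrastructure.
@EnableTask
@EnableBatchProcessing
@SpringBootApplication
public class DspIngestTaskApplication {

    // Hypothetical job that ingests and aggregates report data from one DSP.
    @Bean
    public Job dspIngestJob(JobBuilderFactory jobs, StepBuilderFactory steps) {
        Step ingest = steps.get("ingestDspReports")
                .tasklet((contribution, chunkContext) -> {
                    // ... fetch and aggregate DSP data here ...
                    return RepeatStatus.FINISHED;
                })
                .build();
        return jobs.get("dspIngestJob").start(ingest).build();
    }

    public static void main(String[] args) {
        SpringApplication.run(DspIngestTaskApplication.class, args);
    }
}
```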

Some of the major features which caught our attention:

  • An intuitive UI that allows you to trigger, schedule and check the status of tasks and jobs.
  • Easy addition of new tasks.
  • Not having to worry about the tasks when a new version of the application is deployed.
  • A REST API that supports operations like launching, deleting and creating tasks (see the sketch after this list).
  • A shell tool that can do everything the UI and REST API can.
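
To give an idea of the REST API, the sketch below launches a previously defined task over plain HTTP using only the JDK's HttpClient. The server URL and task name are hypothetical, and the /tasks/executions endpoint and its name/arguments parameters are taken from the SCDF REST API documentation, so verify them against the SCDF version you deploy.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class LaunchTaskExample {

    public static void main(String[] args) throws Exception {
        // Hypothetical SCDF server and task definition name.
        String dataflowServer = "http://scdf-server.example.com:9393";
        String taskName = "dsp-ingest-task";
        // Command-line argument forwarded to the task's Boot app; this one
        // restricts the run to a single (hypothetical) batch job.
        String arguments = URLEncoder.encode(
                "--spring.batch.job.names=dspIngestJob", StandardCharsets.UTF_8);

        // POST /tasks/executions launches a new execution of the named task.
        HttpRequest launch = HttpRequest.newBuilder()
                .uri(URI.create(dataflowServer + "/tasks/executions?name=" + taskName
                        + "&arguments=" + arguments))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(launch, HttpResponse.BodyHandlers.ofString());

        // On success the response identifies the new task execution.
        System.out.println("Status: " + response.statusCode());
        System.out.println("Body: " + response.body());
    }
}
```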

For more information about SCDF, you can visit: https://dataflow.spring.io/

Advantages of running jobs on Kubernetes

Running batch jobs on Kubernetes means that each job runs as a separate pod with its own resources, which really helps when you have high-load jobs like ours. SCDF creates a new pod for each task execution, so each task gets as many resources as it needs and runs in isolation from the other tasks. Some of our jobs started taking less time on Kubernetes: the average duration of some of our major jobs came down by about 10%.
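
As an illustration of that resource isolation, CPU and memory requests and limits can be set per task via the Kubernetes deployer's deployment properties. The property names below follow the deployer.&lt;app&gt;.kubernetes.* convention from the SCDF Kubernetes docs, while the task name and values are hypothetical; treat this as a sketch rather than a drop-in config.

```java
import java.util.Map;

public class TaskResourceProperties {

    // Hypothetical per-task resource settings, passed as deployment properties
    // (e.g. via the "properties" parameter when launching "dsp-ingest-task")
    // so that its pod gets dedicated CPU and memory.
    static final Map<String, String> DEPLOYMENT_PROPERTIES = Map.of(
            "deployer.dsp-ingest-task.kubernetes.requests.cpu", "1",
            "deployer.dsp-ingest-task.kubernetes.requests.memory", "2048Mi",
            "deployer.dsp-ingest-task.kubernetes.limits.cpu", "2",
            "deployer.dsp-ingest-task.kubernetes.limits.memory", "4096Mi");
}
```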

What the Spring Cloud Dataflow UI looks like

Spring Cloud Dataflow has a very comprehensive UI which makes it easy to add apps and create customized tasks.

Checking the status of batch jobs, tasks and logs is made quite easy.

This shows the list of tasks that have been added. Each task is passed an argument which makes sure that only the specified job runs in that task. Tasks can be easily run or restarted from the UI. The status of each task, taken from its latest execution, is also displayed here.

Once inside the task, you can look at all the executions of the task. Clicking on the execution ID will take you to the page with the details for that particular execution.

Here you can see the arguments passed to the task in that particular execution. You can also see the batch jobs section, which shows the batch jobs that ran as part of this task execution. Clicking on a batch job ID will display the batch job execution details.

The hyperlink on the job execution IDs takes you further into the details of the job execution against the task.

The batch job execution page shows details of the batch job that was executed as part of the task. It includes all the job parameters that were passed to it, the status of the job, the exit code and an exit message if there is one. In the case of a failure like the one above, the exit message can help you work out why the job failed without a lot of hassle.

Another very convenient feature of SCDF is the scheduler.

Previously we were using the Spring scheduler to schedule our jobs. SCDF on Kubernetes takes advantage of the Kubernetes CronJob feature and allows you to add a schedule from the UI itself. This gave us a big advantage: we could add, remove or modify job schedules without changing code and releasing the service.
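
As a sketch of what the UI is doing under the hood, the snippet below creates a schedule for a task through the SCDF REST API. The server URL, task and schedule names are hypothetical, and the /tasks/schedules endpoint and the scheduler.cron.expression property are based on our reading of the SCDF scheduling docs for Kubernetes, so double-check them against your SCDF version.

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;

public class ScheduleTaskExample {

    public static void main(String[] args) throws Exception {
        String dataflowServer = "http://scdf-server.example.com:9393"; // hypothetical

        // The cron expression is passed as a scheduler property; on Kubernetes,
        // SCDF turns the schedule into a CronJob that launches the task pod.
        String properties = URLEncoder.encode(
                "scheduler.cron.expression=0 2 * * *", StandardCharsets.UTF_8);

        HttpRequest schedule = HttpRequest.newBuilder()
                .uri(URI.create(dataflowServer + "/tasks/schedules"
                        + "?scheduleName=nightly-dsp-ingest"
                        + "&taskDefinitionName=dsp-ingest-task"
                        + "&properties=" + properties))
                .POST(HttpRequest.BodyPublishers.noBody())
                .build();

        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(schedule, HttpResponse.BodyHandlers.ofString());
        System.out.println("Status: " + response.statusCode());
    }
}
```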

A quick summary

Spring Cloud Dataflow is a really good open-source platform for handling your Spring Batch jobs. It provides an intuitive UI, a shell and REST APIs for triggering, scheduling and monitoring jobs and tasks. It also deploys well on Kubernetes, which helped make our pipelines more efficient and stable. The team developing SCDF is very active and helpful: you can reach out to them via their Gitter channel or post questions on Stack Overflow if you have any questions or issues. Most of our doubts were resolved within a couple of days. You can also report bugs on their GitHub repo, which they are actively developing. So a big thanks to them for developing and maintaining SCDF.
