Apache Flink Job Deployer 1.0: Using the Apache Flink REST API

Marc Rooding
inganalytics.com/inganalytics
3 min read · Aug 27, 2018

Almost half a year ago, together with Niels Denissen, I open-sourced the Apache Flink job deployer, which we wrote for one of the projects at ING’s Wholesale Banking Advanced Analytics team.

At the time, we built the deployer on top of the official Apache Flink Docker image to gain access to the Flink CLI. This served its purpose, but it also came with some disadvantages:

  • If the CLI were to change in a new Apache Flink release, we would need to provide a separate container for each version of the job deployer, since we can only extend from one of the base Docker images.
  • We had to mount the Apache Flink configuration file to the container to be able to use the CLI with our Flink cluster.
  • It required parsing plain-text command-line output, which is more error-prone than parsing, for example, JSON responses from a REST API.
  • Even though the deployer executable is very small, extending from the base Flink Docker image results in a 750 MB Docker image for the deployer.

REST to the rescue

As an alternative to the native Flink CLI, Flink also ships a native REST API. After some investigation, it turned out that almost all of the features we wanted were offered by the API.

The only feature of version 0.3.0 of the deployer that isn't feasible with the Flink REST API is querying a job's state. Since querying state isn't core functionality for the deployer, and we didn't use it that often, we were fine with losing it when rewriting the deployer to use the REST API.

And so it began: we refactored the entire codebase to ditch the native CLI in favour of the Flink REST API.

The end result

The 1.0 release still supports deploying and updating Apache Flink jobs. However, we were able to switch the base Docker image to Alpine, decreasing the final Docker image from 750 MB to 5 MB!

Furthermore, by utilising the REST API, we not only achieve a cleaner separation between the deployer and the Flink cluster, but we will also be able to support multiple Apache Flink API versions in one Docker container if the need arises.

What’s next?

If you remember from our initial article, we need to mount the same Kubernetes persistent volume that stores all the job savepoints to our deployer container. We'd love to decouple our deployer even further. Unfortunately, the Apache Flink REST API currently doesn't offer an endpoint for retrieving all the savepoints for a specific job. We're playing with several ideas on how to tackle this, which we'll save for a future article.

Last but not least, we're really curious to hear how other people are using our deployer and which features would be of value to you. Don't hesitate to contact either me or Niels Denissen! We're also honoured to be presenting our solution at Flink Forward 2018 in Berlin on the 5th of September.
