Honey, I shrank the Java image!

Smaller Docker images with multi-stage builds and Jlink.

Daniel Albuquerque
The Hotels.com Technology Blog
5 min read · Oct 10, 2018

Have you ever wondered why deploying your application takes so long? Or why your disk is always full? Or why your Amazon data transfer bills are so high?

There are plenty of possible reasons, and while some are outside your control, the size of your Docker images is not!

Using Jlink to create a minimal custom runtime, we managed to reduce the size of the Docker image for a Spring Boot application by an impressive 250MB.

How did we do this?

To explain how we did this, we’ll start by creating a simple Java application: a sort of Hello World, but with a dependency on a third-party library (in this case, Google’s Guava).

After using Maven to package this into a jar, we run jdeps and we get the following output:

Jdeps is a dependency analyzer for Java. It processes .class files (or jars) and performs a static analysis of the dependencies between them. Because the tool is aware of the Java module system, we can use it to list the modules that our application needs in order to run.

In this example Jdeps is telling us that our application depends on the java.base module and also on a “not found” module.

This happens because we didn’t tell Jdeps where to look for the dependencies (remember we added Guava?).

If we try again, but this time we tell Jdeps where our dependencies are, and if we ask it to recursively go through all of them, we get this instead:

Things make a bit more sense now.

Our application depends on the java.base module, but also on java.logging, which is pulled in transitively through Guava.
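As a rough sketch, an invocation along these lines produces that kind of summary. The function name `analyze_deps` and the `target/...` paths are hypothetical placeholders; `-s`, `-R` and `-cp` are the Jdeps options for summary output, recursive traversal and the class path.

```shell
# Sketch only: analyze_deps and the paths in the example are hypothetical.
# Usage: analyze_deps <application jar> <directory holding the dependency jars>
analyze_deps() {
  app_jar="$1"
  lib_dir="$2"
  # -s  : print only the summary (one "source -> target" line per dependency)
  # -R  : follow dependencies recursively
  # -cp : where to look for the dependency jars; the wildcard is quoted so
  #       that jdeps, not the shell, expands it
  jdeps -s -R -cp "$lib_dir/*" "$app_jar"
}

# Example (assumes the dependency jars were copied into target/lib):
# analyze_deps target/hello-world.jar target/lib
```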

You can remove the summary flag (-s) to get a more detailed, package-level view of which module each dependency comes from.
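If you would rather not read the module names off the summary by hand, they can be filtered out mechanically. The helper here is a small sketch: `required_modules` is a made-up name, and it assumes that summary lines have the form `source -> target` and that JDK module names start with `java.` or `jdk.`.

```shell
# Sketch only: required_modules is a hypothetical helper.
# Reads a jdeps summary on stdin and prints the JDK modules it mentions
# as a sorted, de-duplicated, comma-separated list.
required_modules() {
  # Keep the right-hand side of "source -> target" lines when it looks like
  # a JDK module name; this drops "not found" and jar-to-jar entries.
  awk -F' -> ' 'NF == 2 && $2 ~ /^(java|jdk)\.[a-z0-9.]+$/ { print $2 }' \
    | sort -u \
    | paste -sd, -
}

# Example (paths are placeholders):
# jdeps -s -R -cp 'target/lib/*' target/hello-world.jar | required_modules
```

Recent JDKs also ship a `--print-module-deps` option for Jdeps that prints a comma-separated list directly; check that your version supports it before relying on it.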

Let’s get to work

Using Jlink

Now that we know which modules we actually need, we can use Jlink to create a smaller Docker image.

Jlink, available since Java 9, is a command-line tool that assembles a set of modules into a custom runtime image.
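In its simplest form, Jlink needs three things: where to find the JDK's packaged modules, which modules to include, and where to write the result. A minimal sketch, assuming a JDK 9+ with `jlink` on the `PATH` and a `jmods` directory under the JDK home passed as the first argument (`build_runtime` and the example paths are hypothetical):

```shell
# Sketch only: build_runtime and the example paths are hypothetical.
# Usage: build_runtime <JDK home> <comma-separated modules> <output directory>
build_runtime() {
  jdk_home="$1"
  modules="$2"
  output_dir="$3"
  # --module-path : directory containing the JDK's .jmod files
  # --add-modules : root modules to include in the runtime image
  # --output      : directory to create (jlink refuses to overwrite an existing one)
  jlink --module-path "$jdk_home/jmods" \
        --add-modules "$modules" \
        --output "$output_dir"
}

# Example:
# build_runtime /opt/jdk java.base,java.logging /opt/jre-minimal
```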

To show how much complexity and extra work this tool introduces, I’m first going to create a Docker image with a normal (full-size) Java runtime.

TL;DR: we start with Debian as our base image, grab Oracle’s OpenJDK build, install it, and finally copy our jars.

And unsurprisingly, our Docker image is massive.

Using Docker multi-stage builds

Some time back, Docker introduced multi-stage builds as a way to keep image sizes down.

In your Dockerfile you can have multiple FROM instructions, each using a different base image, and each will begin a new stage of the build. What this means is that you can have temporary stages using fat images with the entire JDK, and even Maven or other utilities to build your project, and then selectively copy just what you need to the next build stages.

Let’s look at the following example:

The first few lines are similar to the previous Dockerfile. We start with Debian again and install the full JDK.

However, this time we run Jlink to create a minimal runtime with just the two modules that we need.

We then start a new stage with a clean Debian image and copy in the minimal runtime instead of installing the full one.
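To make the shape of such a build concrete, here is a sketch that writes a two-stage Dockerfile from a shell heredoc. It is a simplified illustration rather than the exact file behind the numbers in this post: the `debian:stretch-slim` tag, the `openjdk.tar.gz` archive name, the `/opt/jdk` and `/opt/jre-minimal` paths and the `com.example.HelloWorld` main class are all assumptions.

```shell
# Sketch only: write_dockerfile and every name inside the heredoc are hypothetical.
# Usage: write_dockerfile <path of the Dockerfile to create>
write_dockerfile() {
  cat > "$1" <<'EOF'
# Stage 1: a throwaway stage with the full JDK, used only to run jlink
FROM debian:stretch-slim AS builder
# Assumption: a JDK 9+ archive sits next to the Dockerfile and unpacks
# into a top-level jdk/ directory (ADD extracts local tar archives)
ADD openjdk.tar.gz /opt/
RUN /opt/jdk/bin/jlink \
      --module-path /opt/jdk/jmods \
      --add-modules java.base,java.logging \
      --output /opt/jre-minimal

# Stage 2: a fresh base image that receives only the minimal runtime and the jars
FROM debian:stretch-slim
COPY --from=builder /opt/jre-minimal /opt/jre-minimal
# Assumption: the application jar and its dependency jars are in target/
COPY target/*.jar /opt/app/
ENV PATH="/opt/jre-minimal/bin:${PATH}"
CMD ["java", "-cp", "/opt/app/*", "com.example.HelloWorld"]
EOF
}

# Example:
# write_dockerfile Dockerfile && docker build -t hello-jlink .
```

Everything in the first stage, including the full JDK, is discarded once the build finishes; only what the second stage copies with `COPY --from=builder` ends up in the final image.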

Let’s see the difference:

This small change represents a massive saving of 280MB for this simple Hello World application.

Conclusion

Be careful when using this, though, as you may run into issues at runtime. If you miss a module, you will get a not-very-cool ClassNotFoundException. The number of things that can go wrong is proportional to the size and complexity of your application. However, some (or most?) of these steps can be automated by adding a few extra lines to the Dockerfile or by using existing Maven plugins (check the links below). And the best part is that you don’t even need to be using the new module system (i.e., module descriptors) in your project to get these benefits.
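One cheap safeguard is to ask the custom runtime which modules it actually contains before shipping it. The sketch here relies on the `java --list-modules` launcher option (Java 9+), which prints one `name@version` entry per line; `runtime_has_module` and the `/opt/jre-minimal` path are hypothetical.

```shell
# Sketch only: runtime_has_module and the example path are hypothetical.
# Usage: runtime_has_module <runtime directory> <module name>
# Succeeds (exit status 0) if the runtime image contains the module.
runtime_has_module() {
  runtime_dir="$1"
  module="$2"
  # Split each "name@version" entry on "@" and compare the name exactly.
  "$runtime_dir/bin/java" --list-modules \
    | awk -F'@' -v m="$module" '$1 == m { found = 1 } END { exit !found }'
}

# Example:
# runtime_has_module /opt/jre-minimal java.logging || echo "java.logging is missing"
```

This confirms what went into the image; it cannot tell you whether the module list itself was complete, so running the application's tests against the minimal runtime is still worthwhile.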

References

Docker multi-stage builds

Jdeps

Jlink

CodeFx

Maven jdeps plugin

Maven jlink plugin
