Docker With Spring Boot: Part 1 — Create Images Using Docker File

Published in

Javarevisited

8 min read · Jan 26, 2024

I am assuming you have a fundamental knowledge of Docker. The example project is available in this GitHub repository. I will discuss fundamental Docker concepts and best practices that are framework agnostic. Let's start with the Dockerfile approach.

Explore the docker file

This is the standard way of dockerizing applications

Docker Image and layers

A Docker image is made of layers, and Docker caches them: if nothing related to a layer has changed and the layer is already available, it is not rebuilt. Caching also reduces network traffic when pulling an image from a Docker registry, and the same layers can be reused by multiple images.
There is a direct relationship between the lines of a Dockerfile and the layers of the image. The image starts with the layers of the base image; after that, instructions that change the filesystem (RUN, COPY, ADD) each add a new layer, while metadata instructions such as LABEL only record configuration.
So the order of lines affects build performance.
These layers are immutable. Docker uses a technique called union mount, which creates a single virtual directory tree by merging the content of the different layers. Learn more in this article.
Docker uses a copy-on-write strategy: upper layers read existing files from the layers below, but when a file needs to be updated, it is copied up and the modification is made in the new layer.
Let's look at a production-grade Dockerfile for Spring Boot applications to continue the discussion.
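The Dockerfile that the following sections walk through is not reproduced in this text, so here is a minimal sketch of what such a multistage file can look like. It is a reconstruction under assumptions, not the exact file from the repository: the Maven image tag, the `spring` user and group names, and the `target/extracted` path are illustrative choices, it assumes Spring Boot 3.2 or later, and its line numbers may not match the ones quoted in the sections below.

```dockerfile
# Stage 1: build the fat jar with Maven (image tag is an assumption)
FROM maven:3.9.6-eclipse-temurin-21 AS builder
WORKDIR /app
# Copy only the build descriptor first so the dependency layer stays cached
COPY pom.xml .
RUN mvn dependency:go-offline
# Copy sources and build; a source change invalidates layers from here on
COPY src ./src
RUN mvn package
# Split the fat jar into Spring Boot layers using the layertools jar mode
RUN java -Djarmode=layertools -jar target/*.jar extract --destination target/extracted

# Stage 2: runtime image that contains only a JRE
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# Run as a non-root user (group and user names are hypothetical)
RUN addgroup -S spring && adduser -S spring -G spring
USER spring
# Least frequently changing layers first, application code last
COPY --from=builder /app/target/extracted/dependencies/ ./
COPY --from=builder /app/target/extracted/spring-boot-loader/ ./
COPY --from=builder /app/target/extracted/snapshot-dependencies/ ./
COPY --from=builder /app/target/extracted/application/ ./
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
```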

Examine a Dockerfile

Selecting base image

The Dockerfile starts by declaring a base image.
When choosing a base image, prefer official images with no known vulnerabilities. Official Docker images can be found on Docker Hub.
Each image has tags that represent different versions. Always use a fixed version: if no tag is given, Docker pulls the image tagged latest, which might break your build when a new release introduces breaking changes.
The FROM instruction defines the base image (lines 1 and 9).
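As a small illustration of the tagging advice, the fragment below contrasts an unpinned base image with a pinned one. The tag is the one this article uses for its runtime stage; keep in mind that tags can be moved by the publisher, so pinning by digest (`image@sha256:<digest>`) is the stricter option if you need fully reproducible builds.

```dockerfile
# Unpinned: resolves to whatever "latest" points at on the day of the build
# FROM eclipse-temurin

# Pinned to a specific tag, so the Java version and OS flavor are explicit
FROM eclipse-temurin:21-jre-alpine
```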

Multistage Build

But this file has two FROM statements, which makes it a multistage build. Separating the build into stages lets the final image contain only the files it needs, which reduces its size and makes it more secure. In this example Maven is only required to build the jar file, so it is not needed in the final image. Removing unnecessary dependencies also reduces the vulnerabilities of the final image. Docker (with BuildKit) can also process stages that do not depend on each other in parallel.
In the first stage I use the Maven base image, because Maven builds the application. Notice that the stage is given a name with the AS keyword, which later stages can refer to.

WORKDIR and COPY instructions

The WORKDIR instruction sets the working directory for the instructions that follow it.
The COPY instruction copies files from the local build context into an image layer.
In line 3, pom.xml is copied to the working directory (/app).
In line 4, all dependencies and plugins are downloaded using the Maven dependency plugin.

Optimize layers by separating contents into different layers based on the frequency of content change

Why didn't we copy the whole directory and then run mvn package? To make the best use of Docker layer caching. Dependencies change rarely, but application code changes frequently. If one layer holds all the content, any change invalidates that layer and every layer after it, which wastes time and resources when the dependencies have not changed.
After line 4 we have a layer that contains the required dependencies.
Line 5 copies the source directory, and line 6 creates a fat jar (a jar with all the transitive dependencies) using the Maven package phase.
Now the generated fat jar file is in the target directory.
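To make the difference concrete, here is a hedged side-by-side sketch of the two orderings as two stages in one file. The stage names `naive` and `cached` and the Maven image tag are assumptions for illustration only.

```dockerfile
# Variant A: everything is copied in one step.
# Any change under src/ invalidates the COPY layer, so Maven resolves
# and downloads all dependencies again on the next build.
FROM maven:3.9.6-eclipse-temurin-21 AS naive
WORKDIR /app
COPY . .
RUN mvn package

# Variant B: dependencies and sources live in separate layers.
# The go-offline layer is rebuilt only when pom.xml changes.
FROM maven:3.9.6-eclipse-temurin-21 AS cached
WORKDIR /app
COPY pom.xml .
RUN mvn dependency:go-offline
COPY src ./src
RUN mvn package
```

Building one variant at a time with `docker build --target cached .`, then editing a source file and building again, should show the dependency layer being reused from the cache.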
There is another feature called cache mounts, which mounts a persistent cache directory while a RUN instruction executes during the build.

RUN --mount=type=cache,target=/root/.m2 mvn package

Now the .m2 directory is cached and reused in subsequent builds. Cache mounts require BuildKit: on Docker versions where it is not the default builder, set the DOCKER_BUILDKIT environment variable to 1. This is another way to cache the dependencies and avoid downloading them again. But double-check the benefits when this is used in a distributed CI environment, because the cache is local to the builder that created it.
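In context, a builder stage that relies on a cache mount could look like the sketch below. The image tag is an assumption, and the `# syntax` directive on the first line is optional on recent Docker versions; it simply selects the current stable Dockerfile frontend.

```dockerfile
# syntax=docker/dockerfile:1
FROM maven:3.9.6-eclipse-temurin-21 AS builder
WORKDIR /app
COPY pom.xml .
COPY src ./src
# /root/.m2 is kept in a BuildKit cache between builds, not in an image layer
RUN --mount=type=cache,target=/root/.m2 mvn package
```

Because the local Maven repository never becomes part of a layer, a change in pom.xml only downloads the artifacts that are missing from the cache.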
We could take this jar file and run it directly in the next stage with the java -jar command. Instead, I apply another layer optimization: the Spring Boot layer index lets us extract the external dependencies and the application code from the fat jar separately, which is done in line 7.

Base image selection in the second stage

These extracted layers are used in the next stage, where the base image is eclipse-temurin:21-jre-alpine.
The official openjdk Docker image is deprecated, so you need an alternative like this one. Another popular base image is amazoncorretto.
Here I'm using a tag that contains only the JRE (Java Runtime Environment). If your code depends on JDK tools such as the compiler at runtime, you might need to choose a JDK tag instead.
This tag is also based on Alpine, a small Linux distribution. The reduced size comes with fewer bundled packages, so test it against your requirements.

Increase security by using a non-root user

Line 9 creates a new group and user, and line 10 instructs Docker to use this new user when running the instructions after this layer.
Avoiding the root user in Docker containers is a security best practice. In some cases the base image already defines a user that you can use directly; for example, check this gradle base image, where a gradle user is created.
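A minimal sketch of this step on an Alpine-based image is shown below. The `spring` group and user names are hypothetical, and the commented alternative for Debian or Ubuntu based images is an addition for comparison, not something taken from the article's file.

```dockerfile
FROM eclipse-temurin:21-jre-alpine
# BusyBox syntax on Alpine: -S creates a system group/user, -G sets the group
RUN addgroup -S spring && adduser -S spring -G spring
# Every later RUN, CMD and ENTRYPOINT runs as this user
USER spring:spring

# On Debian/Ubuntu-based images the equivalent would be:
# RUN groupadd --system spring && useradd --system --gid spring spring
```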

Multiple commands in one RUN instruction to optimize layers and increase security

I want to highlight that we can combine multiple commands in a single RUN instruction using &&. The important point is that this results in a single layer.
This can reduce your image size. For example, suppose you install something using the yum package manager. Package managers maintain a cache, which ends up in the final image and increases its size. Since layers are immutable, deleting this cache in a later instruction does not remove it from the previous layer. But if you delete it in the same instruction, right after the installation, both steps produce one layer and the cache folder is not included.
This can also improve security. For example, if you fetch secrets that the build process needs and remove them in the same RUN instruction, the secrets will not be stored in any layer.
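The sketch below shows the install-and-clean-up pattern in a single RUN instruction. It uses apt-get instead of yum because the non-Alpine eclipse-temurin image is Ubuntu-based; the `curl` package is only a placeholder for whatever you need to install. With yum the same idea would be `yum install -y <package> && yum clean all`.

```dockerfile
FROM eclipse-temurin:21-jre
# Install and clean up in ONE instruction so the package index
# downloaded by "apt-get update" never ends up in a layer
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```

Splitting the `rm -rf` into its own RUN line would still hide the files in the final filesystem, but the earlier layer would keep them and the image would not get smaller.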

Setup application to start when starting the container using ENTRYPOINT

In lines 13 to 16 the extracted jar contents are copied from the builder stage, and line 17 defines the entrypoint with the help of the Spring Boot JarLauncher class.
There is another instruction called CMD. The difference is that the command in CMD is overridden if any command is supplied to docker run. ENTRYPOINT, on the other hand, always executes, and any additional arguments passed to docker run are appended to the end of its command.
You can combine the two by using ENTRYPOINT for the command and CMD for the default parameters passed to it.
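As a hedged illustration of that combination, the fragment below fixes the command in ENTRYPOINT and supplies one default argument through CMD. The `--server.port` value is just an example argument, and the COPY steps for the application files are left out.

```dockerfile
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# (copying the extracted application layers is omitted in this sketch)
# Always executed when the container starts
ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
# Default argument appended to ENTRYPOINT; replaced by any
# arguments given after the image name in "docker run"
CMD ["--server.port=8080"]
```

With this setup, `docker run spring-multistage-maven` starts the launcher with `--server.port=8080`, while `docker run spring-multistage-maven --server.port=9090` replaces only the CMD part.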
This array form is also known as the exec form, which executes the command directly as a process. There is another form, the shell form, where we pass a string that runs in a shell.

ENTRYPOINT java org.springframework.boot.loader.launch.JarLauncher

This introduces an extra shell process and prevents UNIX signals from reaching your application, which are downsides, but now you can use shell features like pipes.
The recommended approach is to use the exec form.
Passing Java system properties is a bit tricky with the exec form. They must come right after the java command, before the class name, not at the end, so appending arguments with docker run does not work. The solution is to use an environment variable, but the exec form does not go through a shell and therefore does not expand environment variables, so the application has to be started through a shell.

ENTRYPOINT ["sh", "-c", "java ${JAVA_OPTS} org.springframework.boot.loader.launch.JarLauncher"]

-------
docker run -p 8080:8080 -e "JAVA_OPTS=-Ddebug -Xmx128m" spring-multistage-maven
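Two hedged variations on this fragment address the signal-handling downside mentioned earlier. They are suggestions rather than part of the article's file. The first adds `exec` so that the java process replaces the shell and receives signals such as SIGTERM directly. The second keeps the exec form and relies on `JAVA_TOOL_OPTIONS`, an environment variable that the JVM reads by itself at startup.

```dockerfile
# Option 1: keep the shell wrapper, but let java replace the shell process
ENTRYPOINT ["sh", "-c", "exec java ${JAVA_OPTS} org.springframework.boot.loader.launch.JarLauncher"]

# Option 2: stay in exec form; the JVM picks up JAVA_TOOL_OPTIONS on its own
# ENTRYPOINT ["java", "org.springframework.boot.loader.launch.JarLauncher"]
# docker run -p 8080:8080 -e "JAVA_TOOL_OPTIONS=-Ddebug -Xmx128m" spring-multistage-maven
```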

ENV and ARG

Two instructions that I didn't use but that are worth discussing are ARG and ENV.
ARG defines variables that can only be used at build time. Their values can be provided with the --build-arg option of the docker build command.
ENV sets environment variables that can be accessed at both build time and runtime.
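A short sketch of both instructions follows. The variable names `APP_VERSION` and `SPRING_PROFILES_ACTIVE` and their values are illustrative assumptions.

```dockerfile
FROM eclipse-temurin:21-jre-alpine
# Build-time only: not part of the running container's environment
ARG APP_VERSION=dev
# Available to later build steps and to the running container
ENV SPRING_PROFILES_ACTIVE=prod
# An ARG value can be persisted by copying it into a LABEL or an ENV
LABEL app.version="${APP_VERSION}"
```

The default can be overridden with `docker build --build-arg APP_VERSION=1.2.3 .`, and the ENV value with `docker run -e SPRING_PROFILES_ACTIVE=dev`. Note that ARG values show up in the image history, so they are not a safe place for secrets.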

COPY vs ADD

Two other instructions that are often compared are ADD and COPY. ADD has more capabilities, such as fetching files from a remote location and unpacking local tar archives, which are rarely needed. If you use ADD instead of COPY, use it with caution. COPY is the recommended instruction for copying local files into a Docker layer.
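The following fragment sketches the difference. All file names and the URL are placeholders invented for the example.

```dockerfile
FROM eclipse-temurin:21-jre-alpine
WORKDIR /app
# COPY: plain copy from the build context, nothing else happens
COPY target/app.jar app.jar
# ADD with a local tar archive unpacks it into the destination directory
ADD config.tar.gz /app/config/
# ADD with a URL downloads the file at build time (it is not unpacked)
ADD https://example.com/agent.jar /app/agent.jar
```

The implicit unpacking and downloading is exactly why COPY is the safer default: it does only what it says.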

Native build using GraalVM

Using GraalVM we can create native executable binaries that run without a JRE.
I will not discuss its mechanics, its benefits, or the details of Spring Boot support here. But we can learn a few Docker concepts with the help of the above Dockerfile.
In line 1 the base image name looks a bit different from what we saw previously. The name consists of three parts: the registry (in this case the GitHub container registry), the user name, and the image name followed by a tag.
We did not have to specify the registry in the previous Dockerfile because Docker searches Docker Hub (docker.io) by default, and the user name is missing because those are official images.
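The naming scheme can be sketched as follows. The GraalVM image name and tag are an assumption about what the missing file uses; the second FROM shows how Docker expands the short name of an official Docker Hub image, which lives in the `library` namespace.

```dockerfile
# registry / user name / image name : tag
FROM ghcr.io/graalvm/native-image-community:21 AS builder

# The short name used earlier in this article...
# FROM eclipse-temurin:21-jre-alpine
# ...is resolved by Docker to this fully qualified name
FROM docker.io/library/eclipse-temurin:21-jre-alpine
```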
The next question is why I am using the Oracle Linux Docker image in the second stage.
That's because the GraalVM image is based on Oracle Linux, and the binary it creates can only run on an Oracle Linux-compatible operating system.
One last thing: if you can build a binary that does not depend on OS specifics like shared libraries, use scratch as the base image, which does not add any layer. (Note that you can't pull scratch from Docker Hub.) Check this video if you are more concerned about the size of your Docker images.

Build and Run

I will not go deep into these commands and what they do; for the sake of completeness, here are the commands to build the image and run it in interactive mode.

docker build -f prod.maven.Dockerfile -t spring-multistage-maven .

docker run -p 8080:8080 -it spring-multistage-maven

There are many Dockerfile examples in this GitHub repository that you can look at, if interested.

I hope you learned at least one new thing, even if you are a developer who uses Docker day to day. My goal is to reduce your unknown unknowns, because now you can go and learn the concepts that interest you most in depth. Help me reduce my unknown unknowns by adding a comment if you think a piece is missing from this article.

In the next part, let's create images using the Spring Boot plugin, which uses Cloud Native Buildpacks internally to create images declaratively without writing a Dockerfile, and which is the more recommended way.

Resources

Getting Started | Spring Boot Docker (spring.io)
Docker Tip #63: Difference between an Array and String Based CMD (nickjanetakis.com)
Deep Dive into Docker Internals - Union Filesystem (martinheinz.dev)
Docker ARG, ENV and .env - a Complete Guide (vsupalov.com)