Dockerizing with Distroless

7 min read · Oct 8, 2020

There are many tutorials that effectively step through the process of Dockerizing an application. These are often wonderful resources that truly help improve understanding and move applications into the exciting world of containers.

However, when the goal is to enhance security and minimize container size by using Distroless images, the steps forward are not quite as clear. Keep on reading for a guide through the subtleties of optimization with Distroless images. We’ll look at a few languages of choice for examples: JavaScript (Node.js), Java, and Python.

Let’s Set the Scene

Dockerizing an application is the process of converting an application to run within a Docker container. The outcome is a much more portable and rapidly deployable application. Let’s link to a couple of good reads about the Dockerization process itself: (article1 | article2). Also, if Docker is a new technology to you, head on over to the comprehensive Docker-Curriculum tutorial, which teaches enough to be dangerous.

Distroless Docker images were pioneered by Google to improve security and container size. Typically, security scanning tools help to protect the image, and small Linux distributions help to hone container size and performance. Distroless images address these topics with images that contain only the application, its resources, and its language runtime dependencies, with no operating system distribution. This approach creates a smaller attack surface, reduces compliance scope, and results in a small, performant image. Google publishes these images for some popular languages. Check out this JFrog talk from a Google staff engineer on Distroless to learn more about the topic. Disclaimer: Google gets the full advantages of Distroless in tandem with Bazel, but that doesn’t mean Distroless isn’t for you.

Basic Dockerization

Let’s quickly summarize the normal process for application Dockerization. In essence, you’ll start with a base image (from Docker Hub), add your application code with its dependencies, and then configure a few elements like the exposed ports. For this example, let’s only use Node.js.

Let’s take this stupidly simple Express application that runs on port 8080 in server.js.

const express = require('express')

const app = express()
app.get('/', (req, res) => {
  res.send('Hello There')
})

app.listen(8080, '0.0.0.0')

With this package.json describing its dependencies.

{
  "name": "docker_web_app",
  "version": "1.0.0",
  "main": "server.js",
  "dependencies": {
    "express": "^4.16.1"
  }
}

Now, for Dockerization. 🥳 Let’s create a Dockerfile to containerize this app.

# Start with the node 12 base image from Docker Hub
FROM node:12

# Create app directory
WORKDIR /usr/src/app

# Bundle source, assuming the Dockerfile lives in the app's code
COPY . .

# Install app dependencies (npm ci requires a package-lock.json in the source)
RUN npm ci --only=production

# Expose the port the server runs on
EXPOSE 8080

# Define a default command to start the server
CMD [ "node", "server.js" ]

Finally, build and run the image (executed in the same directory as the Dockerfile)

docker build -t yourusername/repository-name .
docker run -p 3000:8080 yourusername/repository-name

Once more, with Distroless

Since I’ve now referenced both a Broadway play and a Jerry Lee Lewis song in an article about Docker containers, we can move on…

Now that we are back to business, let’s realize and admit that your ecosystem and approach will greatly impact Distroless use. However, the goal is to address enough topics and show many examples so you can adapt appropriately.

Because Distroless images have no operating system distribution, and therefore no package manager to install anything with, a multi-stage Docker build is used to perform some config work upfront and then selectively copy artifacts into the Distroless image.

In the first stage of the build, the application is typically copied into the build-env image. Next, perform some actions like dependency installation or certificate configuration. Finally, move the necessary items into the distroless image. Let’s look at some simple examples:

Node.js

 1) # Use general node image as builder and install dependencies
 2) FROM node:10.17.0 AS build-env
 3) ADD . /app
 4) WORKDIR /app
 5) RUN npm ci --only=production
 6)
 7) # Copy application with its dependencies into distroless image
 8) FROM gcr.io/distroless/nodejs
 9) COPY --from=build-env /app /app
10) WORKDIR /app
11) CMD ["server.js"]

In this case, the application that is brought into the build-env image (line 3) is plain source code rather than an archive file. So, once the application’s dependencies are installed, the whole application directory is simply copied into the distroless image.

Java

# Use openjdk image as builder and build a jar
FROM openjdk:11-jdk-slim AS build-env
ADD . /app/examples
WORKDIR /app
RUN javac examples/*.java
RUN jar cfe main.jar examples.HelloJava examples/*.class

# Copy the jar into the distroless image
FROM gcr.io/distroless/java:11
COPY --from=build-env /app /app
WORKDIR /app
CMD ["main.jar"]

Here, the app has no dependencies and is compiled into a .jar file. Peek at a few jar command arguments if needed for reference. Now, this example is simple: it just compiles some straight Java files into a .jar file. However, using Gradle or Maven would be very similar in approach.
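As a rough sketch of the Maven variant: this assumes the project’s pom.xml produces an executable jar (one with a Main-Class manifest entry), uses the maven:3-openjdk-11 builder tag from Docker Hub as an assumption, and uses target/app.jar as a hypothetical artifact path.

```dockerfile
# Builder: assumes a Maven project whose pom.xml builds an executable jar
FROM maven:3-openjdk-11 AS build-env
WORKDIR /app
COPY . .
# -B runs Maven in non-interactive (batch) mode
RUN mvn -B package

# Copy only the built jar into the distroless image
FROM gcr.io/distroless/java:11
# target/app.jar is a hypothetical artifact name; match it to your pom.xml
COPY --from=build-env /app/target/app.jar /app/main.jar
WORKDIR /app
CMD ["main.jar"]
```

Copying only the jar, rather than the whole /app directory, keeps sources and intermediate build output out of the final image.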

Python

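# Assumption: the builder's Python version (3.11 here) matches the distroless interpreter
FROM python:3.11-slim AS build-env

# Copy the source code into the builder
WORKDIR /app
COPY . .

# Set Virtual ENV in the builder, where a shell and pip are available
ENV VIRTUAL_ENV=/opt/venv
RUN python3 -m venv $VIRTUAL_ENV
ENV PATH="$VIRTUAL_ENV/bin:$PATH"

# Install dependencies into the virtual environment:
RUN pip install -r requirements.txt

# Now set up distroless and run the application:
FROM gcr.io/distroless/python3

# Copy the virtual environment and the source code into the distroless image
COPY --from=build-env /opt/venv /opt/venv
COPY --from=build-env /app /app
WORKDIR /app

# Set Virtual ENV; PYTHONPATH exposes its packages to the
# interpreter that the distroless entrypoint runs
ENV VIRTUAL_ENV=/opt/venv
ENV PATH="$VIRTUAL_ENV/bin:$PATH"
ENV PYTHONPATH="$VIRTUAL_ENV/lib/python3.11/site-packages"

CMD ["hello.py", "/etc"]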

In this Python example, we follow the same copy, install, and move to Distroless pattern. The Python virtual environment is activated with plain ENV instructions rather than an activate script, an elegant approach based on this resource. This approach was a big win for streamlining our Dockerfile.

Additionally, depending on the use case, the multi-stage build might not be a necessity. For example, copying in a ready-to-go .jar file might be all that’s needed. Let’s begin to look at some more details that are a little more real world (especially for the enterprise).
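For the ready-to-go .jar case, a single-stage Dockerfile could be as small as the sketch below. It assumes main.jar was already built elsewhere (for example by CI) and sits next to the Dockerfile.

```dockerfile
# Assumption: main.jar is an executable jar that already exists in the build context
FROM gcr.io/distroless/java:11
COPY main.jar /app/main.jar
WORKDIR /app
# The distroless Java entrypoint already runs `java -jar`, so only the jar name is passed
CMD ["main.jar"]
```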

Vendoring Dependencies

In the previous Docker examples, all of the dependencies are declared in the source code and installed during the Docker build. Vendoring dependencies shifts that concept a bit. Package vendoring is storing the application’s dependent packages within the project.

Now, let’s clarify: I would never deign to disrespect the second principle of the sacred 12-factor application. This vendoring concept is separate. The source code still explicitly declares the dependencies, and they live outside the source code repository. Vendoring is instead about the artifact that is created for deployment, not the source code.

This approach is needed in certain situations. For example, you might be operating within an ecosystem where compliance regulation dictates that the application artifact be centrally stored, with the same “frozen” artifact used across multiple environments or platforms. Audit is typically the driver for procedures of this nature.

In this same vein, the (likely automated) environment building the Docker image might only have access to internal company registries, or might have no outside internet access at all to download these dependencies. In this case, before building the image, the CI/CD process can download the dependencies for later use or installation. Let’s look at a few cross-language examples:

Node.js
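One way this could look for Node.js, as a minimal sketch: it assumes the CI/CD job has already run npm ci --only=production against the internal registry, so node_modules/ is part of the build context and the image build itself needs no network access.

```dockerfile
# Assumption: node_modules/ was installed by CI before `docker build`
# and is not excluded by a .dockerignore file
FROM gcr.io/distroless/nodejs
COPY . /app
WORKDIR /app
CMD ["server.js"]
```

Packages with native add-ons are compiled on the CI host in this setup, so that host’s operating system and architecture would need to be compatible with the image.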

Java
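A comparable sketch for Java, under a hypothetical layout: CI has produced main.jar, whose manifest Class-Path entries point at lib/*.jar, together with a lib/ folder holding the dependency jars.

```dockerfile
# Assumption: main.jar and lib/ were produced by CI and sit in the build context
FROM gcr.io/distroless/java:11
COPY main.jar /app/main.jar
# Manifest Class-Path entries resolve relative to the jar, so lib/ stays beside it
COPY lib/ /app/lib/
WORKDIR /app
CMD ["main.jar"]
```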

Python
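For Python, a sketch along the same lines: it assumes CI has already fetched wheels into a vendor/ folder (for instance with pip download -r requirements.txt -d vendor), and that the builder’s Python version matches the distroless interpreter. The vendor/ and deps/ folder names are hypothetical.

```dockerfile
# Assumption: python:3.11-slim matches the distroless interpreter version
FROM python:3.11-slim AS build-env
WORKDIR /app
COPY . .
# --no-index forbids network lookups; --find-links points pip at the vendored files
RUN pip install --no-index --find-links=vendor --target=/app/deps -r requirements.txt

FROM gcr.io/distroless/python3
COPY --from=build-env /app /app
WORKDIR /app
# Make the installed packages importable by the distroless interpreter
ENV PYTHONPATH=/app/deps
CMD ["hello.py", "/etc"]
```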

Certificates

When operating within an enterprise, you often need to account for and handle everyone’s favorite thing: certificates. Certificates are actually not too big of a deal: just place the cert in the appropriate place within the distroless image. Also, depending on the application’s language or framework, the application itself might need to be told where to find the cert.

Node.js
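A minimal sketch for Node.js, assuming the company CA lives at the hypothetical path certs/internal-ca.pem inside the application folder. NODE_EXTRA_CA_CERTS is the Node.js environment variable that adds extra CAs to the built-in trust list.

```dockerfile
FROM node:10.17.0 AS build-env
ADD . /app
WORKDIR /app
RUN npm ci --only=production

FROM gcr.io/distroless/nodejs
COPY --from=build-env /app /app
WORKDIR /app
# Assumption: certs/internal-ca.pem was part of the copied application folder
ENV NODE_EXTRA_CA_CERTS=/app/certs/internal-ca.pem
CMD ["server.js"]
```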

Java
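For Java, one possible approach is to add the cert to a copy of the JDK trust store in the builder and point the JVM at that copy. This is a sketch only: internal-ca.pem and main.jar are hypothetical files in the build context, and changeit is the JDK’s default trust store password.

```dockerfile
FROM openjdk:11-jdk-slim AS build-env
WORKDIR /app
# Hypothetical inputs: the company CA and a prebuilt executable jar
COPY internal-ca.pem main.jar ./
# Start from the JDK's default trust store and import the company CA into the copy
RUN cp "$JAVA_HOME/lib/security/cacerts" /app/cacerts \
 && keytool -importcert -noprompt -alias internal-ca \
      -file internal-ca.pem -keystore /app/cacerts -storepass changeit

FROM gcr.io/distroless/java:11
COPY --from=build-env /app/main.jar /app/main.jar
COPY --from=build-env /app/cacerts /app/cacerts
WORKDIR /app
# JAVA_TOOL_OPTIONS is picked up by the JVM at startup
ENV JAVA_TOOL_OPTIONS="-Djavax.net.ssl.trustStore=/app/cacerts -Djavax.net.ssl.trustStorePassword=changeit"
CMD ["main.jar"]
```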

Python
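For Python, a sketch that relies on environment variables: SSL_CERT_FILE is honored by OpenSSL’s default verify paths (which Python’s ssl module uses), and REQUESTS_CA_BUNDLE is read by the requests library. The certs/internal-ca.pem path is hypothetical.

```dockerfile
FROM gcr.io/distroless/python3
# Assumption: hello.py and certs/internal-ca.pem are in the build context
COPY . /app
WORKDIR /app
ENV SSL_CERT_FILE=/app/certs/internal-ca.pem
ENV REQUESTS_CA_BUNDLE=/app/certs/internal-ca.pem
CMD ["hello.py", "/etc"]
```

Both variables replace the default bundle rather than extend it, so the PEM file would need to include any public CAs the application also relies on.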

Debugging

Since Distroless images lack shell access, debugging can be quite the challenge. Fortunately, there is a corresponding debug image for each language that Distroless supports. The debug image provides a BusyBox shell to enter. If you’re not familiar with the Swiss army knife of Linux, learn more about BusyBox. At a minimum, know that you’ll be able to navigate the file system and use the vi editor. That ability alone should drastically improve your debugging process.

Add the :debug tag to change the final image in the multi-stage Dockerfile. Like so:

FROM gcr.io/distroless/python2.7:debug

Also, if the image already has a tag, append -debug to it. For example, java-debian10:11-debug.

Then build and launch with a shell entrypoint:

$ docker build -t my_fancy_image .
$ docker run --entrypoint=sh -ti my_fancy_image
/app # ls
BUILD       Dockerfile  hello.py

Don’t be afraid to hop into the image and take a peek at the structure or change files. This approach can be quite effective to quickly understand and resolve issues.
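To keep a debug build around without editing the Dockerfile each time, one option is to give both variants their own stage. In the sketch below the stage names debug and release are hypothetical; running docker build with --target debug would build the debug variant, while a plain docker build produces the last stage.

```dockerfile
FROM node:10.17.0 AS build-env
ADD . /app
WORKDIR /app
RUN npm ci --only=production

# Debug variant: same contents plus a BusyBox shell
FROM gcr.io/distroless/nodejs:debug AS debug
COPY --from=build-env /app /app
WORKDIR /app
CMD ["server.js"]

# Release variant: the last stage, so it is the default build target
FROM gcr.io/distroless/nodejs AS release
COPY --from=build-env /app /app
WORKDIR /app
CMD ["server.js"]
```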

When Distroless

Distroless images always bring the described benefits. However, there is a bit of a learning curve to start using them in an ecosystem or project. There is some overhead in using them too, such as the multi-stage build that is often needed. On that note, when deciding whether to use Distroless, evaluate the use case. Use the information and examples here to think about the pros and cons and choose the best path forward.

Summary

Well, you’ve done it! You made it all the way through this rambling explanation of Distroless image use. Now, don’t let anything “contain” your efforts. Aren’t you glad I saved the puns till the end of the article? I hope you won’t “dock” me for it. Anyway, take the examples here and benefit from my many hours of struggle and research. Adapt this information to fit your ecosystem and environment to benefit from Distroless images.

Written by Luke Perry