When to use Dockerfiles (and when not to…)
In this post, we discuss some best practices for using Dockerfiles, explore some caveats, and build apps using both Dockerfiles and Cloud Native Buildpacks. You’ll learn which jobs each of these tools is best at, and how to decide when to use them.
What are Dockerfiles?
A Dockerfile is a text file containing the instructions that Docker executes to build a container image. Apart from comments, parser directives, and ARG instructions, a Dockerfile starts with a FROM instruction that names the base image. Subsequent instructions build on top of and modify that base image.
Let’s get to know Dockerfiles a little better by using one to build a small “hello world”, one-file Go app. You don’t need to have Go installed to follow along with the tutorial; Docker will take care of the dependencies.
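A one-file app for this walkthrough could be as small as the sketch below. The file name main.go and the greeting helper are assumptions made for illustration, not part of any particular sample project.

```go
// main.go: a hypothetical one-file "hello world" app.
package main

import "fmt"

// greeting returns the message the program prints.
func greeting() string {
	return "Hello, world!"
}

func main() {
	fmt.Println(greeting())
}
```

Keeping the message in its own function is only a convenience here; it makes the program easy to check without capturing standard output.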
Now, let’s create a simple Dockerfile.
To get our container up and running, we need to set up Docker by installing the Docker CLI from docker.com. Then, run the following command to build the app.
docker build -t hello .
The size of our newly built image is 868.3 MB:
REPOSITORY   TAG      IMAGE ID       CREATED         SIZE
hello        latest   005c27e8cd40   7 minutes ago   868.3MB
Now we can run the image with the following command:
docker run -it hello
This is a good start, but the image isn’t optimized.
Writing a better Dockerfile
We started by using golang:1.16.5 as the base image for our Go app, but there are two leaner options to choose from:
golang:1.16.5-alpine is the Alpine variant of the Go base image. Alpine is a tiny Linux distribution that is widely used as a container base image, so Docker, Go, and Alpine work well together!
We can also add a FROM scratch line to our Dockerfile, which tells Docker to start again with a fresh, completely empty image (called a scratch image) and copy the compiled program into it. This is the image that we'll go on to run later.
Using a scratch image also saves a lot of space, because we don’t need the Go tools, or anything else, to run our compiled program, as long as the binary is statically linked. Using one stage for the build and another for the final image is called a multi-stage build.
Our better Dockerfile looks something like this
After we run docker build again, our image is much smaller: about 8 MB.
Leveraging build cache
Docker caches the layer produced by each instruction. On the next build it reuses cached layers until it reaches the first instruction whose inputs have changed, and rebuilds everything from that point on.
If the build contains several layers, order them from the least frequently changed to the most frequently changed, so that as much of the build cache as possible stays reusable.
A typical order is:
- Install the tools needed to build the application.
- Install and update the dependencies.
- Generate the application.
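To see why this ordering helps, here is a small sketch of cache-key chaining. It is a simplified model written for this post, not Docker's actual cache implementation: the step type, the cacheKeys and reusable functions, and the step names are all hypothetical. Each step's key depends on its parent's key, so a change invalidates that step and everything after it, but nothing before it.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// step is one build instruction plus a fingerprint of the files it reads.
type step struct {
	instruction string
	inputs      string
}

// cacheKeys chains each step's key to its parent's key, so a change in one
// step changes the key of every later step. Simplified model only.
func cacheKeys(steps []step) []string {
	keys := make([]string, len(steps))
	parent := ""
	for i, s := range steps {
		sum := sha256.Sum256([]byte(parent + "\x00" + s.instruction + "\x00" + s.inputs))
		keys[i] = fmt.Sprintf("%x", sum[:6])
		parent = keys[i]
	}
	return keys
}

// reusable counts how many leading steps keep the same key across two builds.
func reusable(before, after []string) int {
	n := 0
	for n < len(before) && n < len(after) && before[n] == after[n] {
		n++
	}
	return n
}

func main() {
	first := []step{
		{"install build tools", "toolchain v1"},
		{"download dependencies", "go.mod v1"},
		{"compile application", "main.go v1"},
	}
	// Only the source file changes in the second build.
	second := []step{first[0], first[1], {"compile application", "main.go v2"}}
	fmt.Println("reusable steps:", reusable(cacheKeys(first), cacheKeys(second)))
}
```

In this model, editing only the source file leaves the first two steps reusable. If the frequently edited source were read by the first step instead, no step could be reused.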
Multi-stage builds allow you to drastically reduce the size of your final image, without struggling to reduce the number of intermediate layers and files. Here is the example Dockerfile.
The Dockerfile cache is, however, fragile and you have to be careful how you write your Dockerfile. What if you didn’t need to write one?
Let’s use Buildpacks
A buildpack is a program that turns source code into a runnable container image. Usually, buildpacks encapsulate a single language ecosystem toolchain. There are buildpacks for Ruby, Go, Node.js, Java, Python, and more.
Building our Go app with buildpacks
To set up buildpacks, follow the installation instructions for the pack CLI. Then use the following command to build the app:
pack build hello --builder=paketobuildpacks/builder:tiny
The size of this image is approximately 30 MB.
pack uses buildpacks to help you easily create OCI images that you can run just about anywhere.
Buildpacks run the following set of processes to build an image of your app.
- The CLI detects the primary language of your project. For example, if your source code directory has a Gemfile, buildpacks will identify it as a Ruby project; a pom.xml file identifies it as a Java project, and so on.
- The execution environment then analyzes a previous build to determine if there are any steps which can be reused in a subsequent build.
- The buildpacks run the build, downloading any dependencies and preparing the application to run in production.
- Finally, the result of that build is exported as a Docker image.
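The detection step can be pictured with a short sketch. This is an illustration only, not how pack or any real buildpack is implemented: the markers table and the detect function are hypothetical, and the table lists just the two marker files mentioned above.

```go
package main

import "fmt"

// markers maps a marker file to the ecosystem it suggests. Only the two
// examples from the text are listed; real buildpacks apply richer rules.
var markers = []struct{ file, ecosystem string }{
	{"Gemfile", "Ruby"},
	{"pom.xml", "Java"},
}

// detect returns the first ecosystem whose marker file is present,
// or "unknown" when none of the known markers is found.
func detect(files []string) string {
	present := make(map[string]bool, len(files))
	for _, f := range files {
		present[f] = true
	}
	for _, m := range markers {
		if present[m.file] {
			return m.ecosystem
		}
	}
	return "unknown"
}

func main() {
	fmt.Println(detect([]string{"README.md", "pom.xml", "src"})) // prints "Java"
}
```

The function takes a list of file names rather than reading a directory so that the sketch stays self-contained.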
Along with building the image, pack also lets you generate a bill of materials for your container images. A software bill of materials (BOM) provides the information you need to know what’s inside your container and how it was constructed.
Let’s run the following for an image built with buildpacks.
pack inspect-image your-image-name --bom
Running it for our sample Go app image gives the following.
Cloud Native Buildpacks provide two forms of bill of materials:
- Information that buildpacks populate about the dependencies they have provided.
- A list of the buildpacks that were used to build the application.
Buildpacks create “reproducible builds” of container images. This means that whenever you run:
pack build hello --builder=paketobuildpacks/builder:tiny
it will produce an image with the exact same image ID (also referred to as a digest), assuming you have:
- the same source code
- the same builder image
- a buildpack and language toolchain that support reproducible builds (for example, Go binaries are reproducible by default)
Let’s demonstrate that for the image we just built.
Two images of the same Go app, built with the same builder image and buildpack, have the same hash value.
Why does this matter?
The SHA-256 digest takes into account the contents of the image layers, including metadata such as the date the image was produced. Reproducible builds can act as part of a chain of trust: the source code can be signed, and deterministic compilation can prove that the binary was compiled from trusted source code.
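The effect of metadata on the hash can be shown with a toy example. The imageID function below is a hypothetical stand-in that hashes layer contents together with a creation timestamp; real image digests are computed over a manifest and config rather than this simplified byte stream, and the layer names and dates are made up.

```go
package main

import (
	"crypto/sha256"
	"fmt"
)

// imageID is a toy stand-in for an image digest: it hashes the layer
// contents together with a creation timestamp.
func imageID(layers []string, created string) string {
	h := sha256.New()
	for _, layer := range layers {
		h.Write([]byte(layer))
		h.Write([]byte{0}) // separator so layer boundaries affect the hash
	}
	h.Write([]byte(created))
	return fmt.Sprintf("sha256:%x", h.Sum(nil))
}

func main() {
	layers := []string{"base image", "compiled hello binary"}

	// Identical content, but each build stamps its own wall-clock time.
	a := imageID(layers, "2021-06-01T10:00:00Z")
	b := imageID(layers, "2021-06-01T10:05:00Z")
	fmt.Println("wall-clock timestamps match:", a == b) // false

	// Identical content and a pinned timestamp.
	c := imageID(layers, "1980-01-01T00:00:00Z")
	d := imageID(layers, "1980-01-01T00:00:00Z")
	fmt.Println("pinned timestamps match:", c == d) // true
}
```

The sketch suggests why a reproducible build has to keep every input to the hash constant, timestamps included, and not just the source code.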
Now, try deploying the new image to your favorite cloud; here are some docs to help you out!
The right tool for each job
So far, we have talked about Cloud Native Buildpacks and Dockerfiles, and built applications using each of them. Dockerfiles shine because of their flexibility. The images you build are limited only by your ability to script a Dockerfile: you can install system packages, allow or limit root access, start from scratch, augment an existing image, or use any of Docker’s verified images. The sky is the limit! However, the real challenge lies in that same flexibility. Your Dockerfile becomes another piece of code that you must maintain. Over time, the OS or runtime configuration might require patches or updates. Any automation to standardize, maintain, and build images is entirely on you.
Cloud Native Buildpacks reduce the operational complexity of Dockerfiles and provide the structure needed for creating and maintaining images at scale, with a simple user experience. Buildpacks take care of choosing and maintaining the base image, providing the contents of the remaining layers, optimizing image size, layering, caching, and security, and applying the standards and optimizations particular to a given programming language. The resulting app images are enriched with metadata that makes them easy to inspect. You may also get a detailed Software Bill of Materials (SBOM), including the runtime version, application dependencies, and other details.
While Buildpacks provide solutions for most use cases, there are situations where you might need more flexibility. For example, if you are building apps in a language that the current ecosystem of Buildpacks does not support, you might have to write your own custom Buildpacks. And where Buildpacks cannot handle certain requirements, you might have to create a one-off Dockerfile.
Now it’s your turn to explore the tools and find out which one fits your needs best!