Best Practices for working with Dockerfiles
Dockerfiles allow users to define the exact actions needed to create a new container image. This allows users to write the execution environment as if it were code, storing it in version control if desired.
The same Dockerfile built in the same environment should produce an equivalent container image, provided its instructions do not pull in content that changes over time (for example, unpinned package versions). Dockerfiles help automate the building of container images and establish a repeatable process.
Some of the benefits Dockerfiles provide are:
- Easy versioning: Dockerfiles can be committed and maintained via version control to track changes and revert any mistakes.
- Predictability: Building images from a Dockerfile helps remove human error from the image creation process.
- Accountability: If you plan on sharing your images, it is often a good idea to provide the Dockerfile that created the image as a way for other users to audit the process.
- Flexibility: Creating images from a Dockerfile allows you to override the defaults that interactive builds are given. This means that you do not have to provide as many runtime options to get the image to function as intended.
Docker images have intermediate layers that increase reusability, decrease disk usage, and speed up docker build by allowing each step to be cached.
Best Practices for writing Dockerfiles:
Use a .dockerignore file
A good approach is to place the Dockerfile in an otherwise empty directory and then add only the application and configuration files required to build the image. To improve build performance, you can also exclude files and directories by adding a .dockerignore file to that directory.
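As a rough illustration of the layout described above, the sketch below assumes a hypothetical project whose build directory holds a .dockerignore file next to the Dockerfile. The ignore patterns, the base image tag, and the /app path are assumptions chosen for the example, not part of the original guidance.

```dockerfile
# Assumed contents of a .dockerignore file in the same directory
# (shown here as comments; the patterns are illustrative only):
#   .git
#   *.log
#   tmp/

# Hypothetical base image; substitute whatever your application needs.
FROM alpine:3.19

# Only files NOT matched by .dockerignore are sent to the Docker daemon
# as build context, so this COPY never sees .git, *.log or tmp/.
WORKDIR /app
COPY . /app
```

Keeping the context small matters most for `COPY . ...` style instructions, since everything in the context is a candidate for copying.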
Containers should be immutable & ephemeral
A container created from the image your Dockerfile produces should be ephemeral and immutable. In other words, it should be possible to stop and destroy the container and put a new one in its place with an absolute minimum of set-up and configuration.
Minimize the number of layers / Consolidate instructions
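In older versions of Docker, every instruction in the Dockerfile added a layer to the image; in current versions only RUN, COPY, and ADD create layers, while other instructions create temporary intermediate images. Either way, keep the number of instructions and layers to a minimum, as this affects build time and image size.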
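One common way to consolidate instructions is to chain related shell commands in a single RUN. The following is a minimal sketch under assumptions: the base image tag and the installed package (curl) are placeholders picked for illustration.

```dockerfile
# Hypothetical base image and package; adjust to your application.
FROM ubuntu:22.04

# Instead of three separate RUN instructions (update, install, clean up),
# a single chained RUN produces one layer. Removing the apt package
# lists in the same instruction keeps them out of the image entirely.
RUN apt-get update && \
    apt-get install -y --no-install-recommends curl && \
    rm -rf /var/lib/apt/lists/*
```

Cleaning up in a later, separate RUN would not shrink the image, because the files would still exist in the earlier layer.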
Avoid installing unnecessary packages
In order to reduce complexity, dependencies, file sizes, and build times, avoid installing unnecessary packages.
Sort multi-line arguments
Sorting multiline arguments alphanumerically will help avoid duplication of packages and make the list much easier to update.
RUN yum update -y && \
    yum install -y \
        git \
        httpd \
        java \
        python
Build cache
While building an image, Docker steps through the instructions in the Dockerfile, executing each one in the order specified. As each instruction is examined, Docker looks for an existing image layer in its cache that it can reuse, rather than creating a new image layer.
If you do not want to use the cache at all, then use the --no-cache=true option with the docker build command.
However, when Docker is allowed to use its cache, it follows these basic rules to find a matching image:
- Starting with a base image that is already in the cache, the next instruction is compared against all child images derived from that base image to see if one of them was built using the exact same instruction. If not, the cache is invalidated.
- For the ADD and COPY instructions, the contents of the file(s) in the image are examined and a checksum is calculated for each file. During the cache lookup, the checksum is compared against the checksum in the existing images. If anything has changed in the file(s), such as the contents and metadata, then the cache is invalidated.
- Aside from the ADD and COPY commands, cache checking will not look at the files in the container to determine a cache match. For example, when processing a RUN apt-get -y update command, the files updated in the container will not be examined to determine if a cache hit exists. In that case just the command string itself will be used to find a match.
Once the cache is invalidated, all subsequent Dockerfile commands will generate new images and the cache will not be used.
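Because invalidation cascades to every later instruction, the order of instructions affects how much of the cache survives a change. The sketch below shows one way to take advantage of this, assuming a hypothetical Python application; the image tag, requirements.txt, and app.py are illustrative names, not something the rules above prescribe.

```dockerfile
# Hypothetical Python application; image tag and file names are assumptions.
FROM python:3.12-slim
WORKDIR /app

# 1. Copy only the dependency manifest. The checksum rule for COPY means
#    this step is invalidated only when requirements.txt itself changes.
COPY requirements.txt .

# 2. RUN is matched on its command string, so this step stays cached
#    as long as the steps before it were cache hits.
RUN pip install --no-cache-dir -r requirements.txt

# 3. Source edits invalidate the cache from here on, not the steps above.
COPY . .

CMD ["python", "app.py"]
```

Placing the frequently changing `COPY . .` last means a source edit triggers only the final steps rather than a full dependency reinstall.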
Build every time
Building Docker images is usually fast because Docker reuses previously cached build steps by default. By building the image every time, you can use containers as reliable artifacts. For example, you can go back and run a container from a previous image to inspect a problem, or run long tests against the previous version of the image while editing the code.
Dockerfile for Development Environment
For a development environment, map your source code on the host into a container using a volume. This lets you use the editor of your choice on the host and test the application right away in the container. To do this, mount the application build folder as a volume rather than copying the build artifact with the ADD instruction in the Dockerfile.
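A development image along these lines might look like the sketch below. It is only an illustration under assumptions: the Python base image, the /app path, the image name myapp-dev, and app.py are all placeholders.

```dockerfile
# Hypothetical development image: the source code is NOT copied in.
FROM python:3.12-slim
WORKDIR /app

# /app is expected to be supplied at run time by a bind mount, e.g.:
#   docker build -t myapp-dev .
#   docker run --rm -it -v "$(pwd)":/app myapp-dev
# ("myapp-dev" and app.py are placeholder names.)
CMD ["python", "app.py"]
```

With this setup, edits made on the host are visible inside the container without rebuilding the image.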
Understand CMD and ENTRYPOINT
CMD simply sets a command to run in the image if no arguments are passed to docker run, while ENTRYPOINT is meant to make your image behave like a binary.
- If the Dockerfile uses only CMD, the specified command is executed if no arguments are passed to docker run.
- If the Dockerfile uses only ENTRYPOINT, the arguments passed to docker run are always passed to the entrypoint; the entrypoint is executed if no arguments are passed to docker run.
- If the Dockerfile declares both ENTRYPOINT and CMD and no arguments are passed to docker run, then the argument(s) to CMD are passed to the declared entrypoint.
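The combined case can be sketched as follows. This is a minimal example under assumptions: the alpine base image, the ping command, and the image name pinger are chosen purely for illustration. Both instructions use the exec (JSON array) form, which is what allows the CMD values to be appended to the entrypoint as arguments.

```dockerfile
FROM alpine:3.19

# ENTRYPOINT makes the image behave like the ping binary.
ENTRYPOINT ["ping"]

# CMD supplies default arguments, used only when docker run gets none.
CMD ["-c", "3", "localhost"]

# Assuming the image is tagged "pinger" (a placeholder name):
#   docker run pinger                    -> runs: ping -c 3 localhost
#   docker run pinger -c 1 example.com   -> runs: ping -c 1 example.com
```

Arguments given to docker run replace the CMD defaults but leave the entrypoint in place.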
Source: ~ Docker Docs
Disclaimer: Content and Image source has been mentioned. Special Credit to concerned folks.