Dockerfile Linting: Every time, Everywhere

Dave Elliott
6 min read · Mar 3, 2020


Recently I’ve been looking at strategies for ratcheting up quality in docker workflows. I’m a strong believer in building quality in as early as possible, and the earliest point in the docker container lifecycle is the Dockerfile.

Docker Inc. provides a documented set of best practices for writing Dockerfiles. These are a great foundation for building efficient images but rely on the engineer knowing those practices and having the discipline to implement them. Humans are fallible and that’s where a linter comes in handy. A Dockerfile linter is a tool that analyses and parses the Dockerfile and warns when it doesn’t match best practices or guidelines. This gives us an automated way of helping engineers to write Dockerfiles which always meet a reasonable standard. Incorporating a linter into the workflow ensures our Dockerfiles are always readable, understandable and maintainable.

There are several Dockerfile linters out there but one of the better ones is Hadolint. Hadolint not only lints the docker instructions, it also incorporates ShellCheck, which lints any bash or sh code used in the docker RUN instructions. This addition alone makes it a worthy choice: most engineers know how ugly and inscrutable shell commands can become if thrown haphazardly together with a few pipes, an awk, a grep and a bunch of cats.

Hadolint can be integrated into the docker workflow in a number of places: it is supported by a number of editors and IDEs, it can integrate with numerous modern CI tools, and it can also be integrated into code review platforms. This gives us plenty of options to play with when considering how to inject linting into the workflow. Whenever I look at implementing workflow improvements I try to focus on simplicity, portability, and repeatability. Ideally, I want any quality checks to be:

  • low-friction (ideally transparent to the engineer or developer)
  • automatic
  • portable

Any such validation checks should work in the same way whether I’m running them locally on my laptop, remotely on a cloud-hosted virtual machine, or triggering them within CI pipelines. They also should not rely on a human to trigger them. This is mainly for two reasons. First, I want engineers and developers focused on writing awesome code, designing solutions or solving problems … not wasting precious brainpower on making sure Dockerfiles are written to best practices. Second, not all engineers run IDEs or remember to lint Dockerfiles before check-in.

My initial effort focused on integrating hadolint into the Jenkins CI pipelines. This seemed a logical place as it didn’t rely on a particular IDE or editor being used, and it could be automated to run on every feature branch commit.

There are a couple of options for CI integration. One is to install hadolint onto the Jenkins slave, or in my case, bake it into the slave docker image. Once done, linting a Dockerfile is as simple as running the command:

hadolint Dockerfile

Another option, if the slave has access to a docker engine, is to run hadolint within a container itself and redirect the Dockerfile to it:

docker run --rm -i hadolint/hadolint < Dockerfile

Both options work fine and allow Dockerfiles to be automatically linted once committed to a source code manager and picked up by the CI server.
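The container-based option drops neatly into a Jenkins declarative pipeline as a single sh step. Here’s a minimal sketch, not a drop-in pipeline: the stage name is arbitrary, and it assumes the agent has access to a docker engine and that the stage sits inside the usual pipeline/stages wrapper:

```groovy
// Hypothetical Jenkinsfile fragment: lint the Dockerfile before any build stages.
// docker run exits non-zero on rule violations, which fails the sh step
// and halts the pipeline.
stage('Lint Dockerfile') {
    steps {
        sh 'docker run --rm -i hadolint/hadolint < Dockerfile'
    }
}
```

Because hadolint’s exit code propagates through docker run, no extra result parsing is needed — a failed lint simply fails the stage.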

The linter can also be configured via command-line options or a yaml configuration file to ignore certain rules, as well as to warn when images are pulled from untrusted registries. The latter option is a fuss-free way of ensuring that engineers aren’t pulling untrusted containers from public registries and that they pull containers only via the company’s repository manager.

ignored:
- DL3000
- DL3007
trustedRegistries:
- company.com:8080

If running locally, the configuration file can be passed directly via the “--config” option. When using docker run to execute the hadolint container it can be passed in by bind mounting a local volume. In the example below we bind mount the current working directory to /context in the container:

$ find .
.
./linter
./linter/hadolint.yaml
./Dockerfile
$ docker run --rm -i -v ${PWD}:/context hadolint/hadolint \
"/bin/hadolint" "--config" "/context/linter/hadolint.yaml" "/context/Dockerfile"

As a concept this is OK, but it’s not great that this all happens post-commit and relies on the CI tooling to implement. It just isn’t portable. I wanted a solution that would run on my laptop, or any other engineer’s laptop for that matter, just as well as it would in the CI pipeline. It also needed to “just work” … anytime anyone built a docker image, anywhere, on any host, I wanted it to be linted. That’s a lot of “anys”. Given engineers generally use a wide variety of local development environments in terms of IDEs, editors, even operating systems, it seemed an insurmountable challenge at first. But then Docker itself came to the rescue!

By leveraging multi-stage builds, introduced in Docker 17.05, it is possible to make linting a part of every docker build process. Multi-stage builds allow you to use multiple FROM statements in the Dockerfile. Each FROM statement can use a different base image and starts a new stage of the build; only the layers of the final stage end up in the resulting image.

When running docker build it uses a Dockerfile and a context, the context simply being a specified URL or PATH to a set of files that can be accessed during the build process. By running linting as the first stage of a multi-stage build and passing the Dockerfile and the hadolint config yaml in as part of the context, we can effectively get the build process to lint its own Dockerfile.

The below Dockerfile shows this in practice. The first stage of the build uses a hadolint base image and the RUN command to execute the linting, followed by a second stage which builds the image we actually want — in this case a centos container which simply does an echo “Hello World!” when run.

# First Stage - Linting
FROM hadolint/hadolint:v1.17.5-6-gbc8bab9-alpine as config
# Copy the dockerfile and linter config from the context
COPY config/hadolint.yaml /config/
COPY Dockerfile .
# Execute the linting process
RUN echo "### Linting Dockerfile ###" && /bin/hadolint --config /config/hadolint.yaml Dockerfile
# Second Stage - Our final image
FROM centos:centos7
ENTRYPOINT ["echo", "Hello World!"]

Given the following dir structure:

./Dockerfile
./config/hadolint.yaml

We run docker build from the top-level directory:

$ docker build -t hello-world .

And then confirm that the resultant container is what we expect:

$ docker run --rm hello-world
Hello World!

If the linter generates warnings then the build process halts. In the case below the config is set so that engineers can only pull images from the company’s private registry.

ignored:
- DL3000
trustedRegistries:
- my-company.com:5000
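With this configuration, any FROM that pulls from somewhere other than my-company.com:5000 trips hadolint’s rule DL3026 (“Use only an allowed registry in the FROM image”). A hypothetical offending line:

```dockerfile
# Pulls implicitly from Docker Hub rather than my-company.com:5000,
# so hadolint raises DL3026 and the lint stage fails the build
FROM centos:centos7
```

Prefixing the image with the trusted registry, e.g. FROM my-company.com:5000/centos:centos7, satisfies the rule.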

This results in warnings being generated for lines 2 and 10 of the Dockerfile respectively.

Similarly, it will stop the build if we use an ADD statement rather than COPY to add files or folders.
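This is hadolint rule DL3020 (“Use COPY instead of ADD for files and folders”); swapping ADD for COPY clears the warning. A hypothetical snippet (the file name is just for illustration):

```dockerfile
# hadolint flags this with DL3020: plain files should not use ADD
ADD app-config.json /app/

# The compliant equivalent passes cleanly
COPY app-config.json /app/
```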

Adding some dubious shell commands to the Dockerfile demonstrates the linter running ShellCheck against the RUN instructions and catching the “Useless Use of Cat”.

RUN echo "Hello" > /tmp/hello.txt
RUN cat /tmp/hello.txt | wc
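ShellCheck reports this as SC2002 (“Useless cat”); redirecting the file straight into wc does the same job and keeps the linter happy. A minimal fix:

```dockerfile
RUN echo "Hello" > /tmp/hello.txt
# Redirection replaces the needless cat-into-pipe, silencing SC2002
RUN wc < /tmp/hello.txt
```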

The full list of rules, error codes and rationale is available on the hadolint GitHub page.

So there we have it. An easy way to ensure that no matter where engineers are building images, they are always performing and passing linting. The true beauty of this method is that it is contained within the Docker build process itself. Wherever the Dockerfile is built is where it is linted. On my laptop, in the cloud, in CI pipelines, on other engineers’ laptops…

Linting: Every time, Everywhere.

As always, I appreciate any feedback or comments, especially if it’s to point out further improvements!
