Optimizing Your Dockerfile

The ordering of instructions in a Dockerfile really does matter. By being strategic about it, you can reduce your image size, reduce the number of layers, and take full advantage of Docker’s caching capabilities. If you don’t, you’ll end up with longer build times, longer deployment times, images that do not cache well, and unnecessary layers.

Why does this really matter? It depends, but if you are in the business of deploying your applications quickly and efficiently, downloading a 5 MB update that contains only the changes to your source code is faster than downloading 400 MB of updates, no matter how you look at it.

Order Matters

The order of instructions in a Dockerfile matters. Where each RUN instruction sits in relation to CMD, ENV, ARG, ADD, or COPY instructions matters because it affects Docker’s build cache, and thus the build time and image size.

For example, if you place an ADD instruction near the top, above a few RUN instructions, and anything in the added file or directory has changed, the cache for every subsequent layer is invalidated. Sometimes a RUN instruction has to go after an ADD, but that is not what I’m referring to.
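A minimal sketch of the problem described above, assuming a Debian-based image and a hypothetical ./src directory; the base image, package, and paths are illustrative only.

```dockerfile
FROM ubuntu:22.04

# Any change to a file under ./src changes this layer...
ADD ./src /opt/app/src

# ...so this step can no longer be served from cache and re-runs
# on every source change, even though it does not depend on ./src.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*
```

Moving the ADD below the RUN would let the package installation stay cached across source changes.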

Order Rules

#1 Place static instructions higher in the order. These are instructions like, but not limited to, EXPOSE, VOLUME, CMD, ENTRYPOINT, and WORKDIR, whose values are not going to change once they are set.

#2 Place dynamic instructions lower in the order. These are instructions like ENV (when using variable substitution), ARG, and ADD.

#3 Place dependency RUN instructions before ADD or COPY instructions

#4 Place ADD and COPY instructions after RUN instructions for installing dependencies but before dynamic instructions

Example of using the Order Rules
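A minimal sketch of the four rules applied together, assuming a hypothetical Node.js service. The base image tag, port, package, file names, and the BUILD_VERSION and APP_VERSION names are all illustrative assumptions.

```dockerfile
FROM node:20-slim

# Rule 1: static instructions first; these values rarely change.
WORKDIR /app
EXPOSE 3000
CMD ["node", "server.js"]

# Rule 3: install dependencies before copying any source.
RUN apt-get update \
    && apt-get install -y --no-install-recommends curl \
    && rm -rf /var/lib/apt/lists/*

# Rule 4: copy the application source after the dependency step.
COPY . .

# Rule 2: dynamic instructions last, so a new value only
# invalidates the layers from this point down.
ARG BUILD_VERSION=dev
ENV APP_VERSION=${BUILD_VERSION}
```

With this layout, a source-only change rebuilds from the COPY instruction down, and the package installation layer is reused from cache.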

Combine RUN Instructions

Try to combine as many RUN instructions as you can, when and where possible. Many Dockerfiles I come across these days look something like this.

Example of how to *not* construct your RUN instructions
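A sketch of the kind of Dockerfile meant here, assuming an Ubuntu base image and an illustrative set of packages. Each RUN produces its own layer.

```dockerfile
FROM ubuntu:22.04

# One layer per command.
RUN apt-get update
RUN apt-get install -y curl
RUN apt-get install -y git
RUN apt-get install -y build-essential

# Runs in its own layer, so the package lists written by the
# earlier layers still take up space in the final image.
RUN rm -rf /var/lib/apt/lists/*
```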

This isn’t necessarily bad, but it’s not good either. If the first RUN instruction changes, the cache for all subsequent layers is invalidated, leading to longer build times. Not combining the RUN instructions also creates more layers, which can make the image larger when files written in one layer are removed in a later one. The layer count isn’t as much of a problem as it once was, since the layer limit has been increased, but as a general rule of thumb, the fewer the better in my humble opinion. Here is an example of how to correct the previous Dockerfile.

Example of how to combine the RUN instructions
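A minimal sketch of a combined RUN instruction, assuming an Ubuntu base image and an illustrative set of packages. Updating, installing, and cleaning up all happen in a single layer.

```dockerfile
FROM ubuntu:22.04

# Update, install, and clean up in one layer, so the package
# lists removed at the end never persist in the image.
RUN apt-get update \
    && apt-get install -y --no-install-recommends \
        build-essential \
        ca-certificates \
        curl \
        git \
    && rm -rf /var/lib/apt/lists/*
```

Listing one package per line, in alphabetical order, keeps later changes easy to review.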

Now, if the need arises to invalidate the cache, either change the RUN instruction or pass the --no-cache flag to the docker build command.

Cache Your Dependencies

Yes, this can be done. Is it perfect? Absolutely not, but it definitely helps. My friend and coworker David Weinstein wrote an article on caching Node.js dependencies a few years ago; it’s worth a read. The same can be done for Python and Ruby, and I’m sure others, but I will not cover those in this article.
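One common way to cache Node.js dependencies is sketched below. It is not necessarily the approach taken in the article mentioned above. It assumes a hypothetical project with package.json, package-lock.json, and a server.js entry point, plus a .dockerignore that excludes node_modules.

```dockerfile
FROM node:20-slim

WORKDIR /app
CMD ["node", "server.js"]

# Copy only the dependency manifests first. This layer changes
# only when the manifests themselves change.
COPY package.json package-lock.json ./

# Served from cache as long as the manifests above are unchanged.
RUN npm ci

# Source changes invalidate the cache from this point down only,
# so the dependency installation above is not repeated.
COPY . .
```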


I’ve been working with Docker for years now. To me, these are necessities for smooth, repeatable builds and for making Docker images production ready.

As with any software system, you should never optimize too much too early, so I suggest you always get your application working with Docker and your Dockerfile first, then optimize it.