Reducing Docker build time using Docker layers

Jitendra Kumar
Engineering @ Housing/Proptiger/Makaan
5 min read · Mar 16, 2023

Docker caching can significantly reduce the time it takes to build an image. Before we go into the details of Docker caching, let’s understand a few concepts that matter for build performance.

  1. Docker image: A Docker image is a read-only template that contains a set of instructions for creating a container that can run on the Docker platform. These instructions are written in a file called a Dockerfile.
  2. Docker layer: Each instruction in a Dockerfile produces a layer. For example, the instruction RUN npm install -g n in the Dockerfile below is one layer, and the other instructions are other layers. Docker caches each layer independently: when you build an image for the first time, Docker creates a cache entry for each layer, and subsequent builds reuse the cached layers, reducing the amount of work required to build the image.

Here are some ways that Docker caching can improve build time:

  1. Reusing layers: When you build an image, Docker will only rebuild the layers that have changed since the last build. This means that if you make a small change to your application code, Docker will only need to rebuild the layers that are affected by that change, rather than rebuilding the entire image.
  2. Skipping dependencies: If you have already installed a package or library in a previous layer, Docker will skip the installation step and reuse the cached layer in subsequent builds. This can save a significant amount of time, especially if you have a large number of dependencies.

Before going into the demonstration, we should know that if any layer has even a small change, Docker rebuilds that layer and all the layers below it, even if none of those lower layers has changed.

dockerfile layer demonstration
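The demonstration above appears as an image in the original post. As a stand-in, here is a minimal sketch of a Dockerfile of that shape, where each instruction is one layer (the node:16 base image and the exact instructions are assumptions for illustration, not the article’s actual file):

```dockerfile
# Layer 1: base image (illustrative choice)
FROM node:16
# Layer 2: install the "n" version manager globally
RUN npm install -g n
# Layer 3: set the working directory
WORKDIR /app
# Layer 4: copy the project code into the image
COPY . /app
# Layer 5: install project dependencies
RUN npm install
```

If layer 2 changes, layers 2 through 5 are rebuilt on the next build, while layer 1 comes from the cache.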

In the above snippet, layer 1 is unchanged, layer 2 has some changes, and the layers below it are also unchanged. When Docker starts building the image, it sees that layer 1 has not changed, so it uses the previously built cache for layer 1. When it reaches layer 2 and finds changes, Docker rebuilds layer 2 and every layer below it without checking those layers, so everything from layer 2 onwards is rebuilt.

Here is a simple example of how we can use layer caching to skip downloading dependencies when there is no change in package.json and its lock file.

We have used the above concept of Docker layers to reduce the build time. In the snippet below, you can see that we copy the whole project codebase directly into the working directory and only then run npm install. So at the step COPY . /app, any small change to the regular code (anything except package.json and its lock file) causes all the layers below COPY . /app to be rebuilt.

snippet of current dockerfile
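The current Dockerfile is shown as an image in the original post. A minimal sketch of a Dockerfile with this problem, assuming a typical Node.js project (the base image and start command are illustrative):

```dockerfile
FROM node:16
WORKDIR /app
# Copies the entire codebase, including package.json; any code change
# invalidates this layer and every layer below it
COPY . /app
# Rebuilt on every code change, so dependencies are downloaded again
RUN npm install
CMD ["npm", "start"]
```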

Now imagine the above scenario, where you have made a single line of change in a working file (your package.json and its lock file are still the same). When you build this Dockerfile, even though you changed only one line of code, Docker will download all the dependencies again, which is itself quite a time-consuming step. This step can be skipped and the build time reduced.

snippet of above dockerfile build process

You can clearly see in the above snippet that the RUN npm install step is rebuilt again.

We have reduced the build time by skipping the npm install step, since most of the time is spent downloading a large number of dependencies and, in our case, package.json rarely changes. We use the npm install layer cache by adding one more layer before npm install, which copies only package.json and its lock file into the working directory. We have achieved this by segregating package.json and its lock file from the rest of the development files.

segregation of code in two different layers
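The segregated version is shown as an image in the original post. A minimal sketch of what it might look like, with the manifest copy placed before npm install (the base image and start command are again illustrative):

```dockerfile
FROM node:16
WORKDIR /app
# Copy only the dependency manifests first; this layer changes only
# when package.json or package-lock.json changes
COPY package.json package-lock.json /app/
# Cached as long as the layer above is unchanged, so dependencies
# are not downloaded again on ordinary code changes
RUN npm install
# Code changes invalidate only this layer and the ones below it
COPY . /app
CMD ["npm", "start"]
```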

In the above snippet, we copy package.json and its lock file prior to the RUN npm install step, so the RUN npm install layer can now reuse the cache from its previous successful build if we have not modified package.json or its lock file. With this Dockerfile, the npm install layer is skipped when package.json and package-lock.json are unmodified, and if there is any change in these JSON files, npm install runs again, as expected.

snippet of docker build process

In the above snippet, RUN npm install is not rebuilt; its cache has been used, so the whole npm install time is saved.

As you saw in the above example, we can optimize our build by taking advantage of Docker caching. To do so, it is important to structure your Dockerfile in a way that minimizes the number of layers that need to be rebuilt. Here are some tips:

  1. Order instructions by how often they change: instructions that are likely to change frequently, such as copying application code, should come after more stable instructions, such as package installations, so that a code change invalidates as few cached layers as possible.
  2. Use multi-stage builds: Multi-stage builds allow you to build your application in multiple stages, with each stage producing a separate image. This can help reduce the size of your final image and improve build time.
  3. Use the --no-cache option: If you want to force Docker to rebuild all layers from scratch, you can use the --no-cache option when running the docker build command. However, this will increase build time and should only be used when necessary, e.g. docker build --no-cache -t sample-image:sample-tag . When you execute this command, the daemon will not look for cached builds of existing image layers and will force a clean build of the Docker image from the Dockerfile.
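Tip 2 can be sketched with a minimal multi-stage Dockerfile. This is an illustration under assumptions, not the article’s actual setup: the node:16 base images, the npm run build step, and the dist/ output path are all hypothetical.

```dockerfile
# Stage 1: build the application with the full toolchain
FROM node:16 AS build
WORKDIR /app
COPY package.json package-lock.json /app/
RUN npm install
COPY . /app
# Assumes the project defines a "build" script producing dist/
RUN npm run build

# Stage 2: copy only what is needed at runtime into a slimmer image
FROM node:16-slim
WORKDIR /app
COPY --from=build /app/dist /app/dist
COPY --from=build /app/node_modules /app/node_modules
CMD ["node", "dist/index.js"]
```

The build toolchain and intermediate files stay in the first stage, so the final image carries only the runtime artifacts.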

Conclusion: In this way you can optimize and reduce your build time, and also reduce the image size, by properly structuring your Dockerfile. We have to structure our Dockerfile so that there is a good balance between image size and build time, since there is a trade-off between the two: adding layers may reduce the build time but can increase the image size, and vice versa.
