How you can save hours and Terabytes of bandwidth on GitlabCI

If you’re using GitlabCI with Docker in Docker, aka DinD, have you ever worried about the Docker layer cache?

Nicolas Valverde
8 min read · Dec 14, 2021

I recently joined a big project where everything has been in place for many years, you know, that so-called legacy app.

One day, our CI/CD suddenly started taking unusually long for no apparent reason. Our pipelines quickly went from a 30-minute average to an hour or more, with random failures everywhere, all day long.

It got to the point where feedback and delivery became a nightmare: imagine waiting an hour or more, retrying multiple jobs, only to discover a test fails, if you can even reach the end of the test stage. This is what happened to us for months.

The issue is even bigger when you know multiple teams of developers work on this repository, so they were all in trouble. My team being one of them, and the one in charge of the repository, I had to figure out what was going on.

So, where do we start?

First things first: our local stack heavily uses Docker. We’re talking about a big Symfony app, with Gitlab on premise and GitlabCI with dedicated runners on AWS. While we rely heavily on Docker locally, the app is not deployed in Docker.

The job log, in its very first lines, tells me we use something called a “docker+machine executor”, then the runner pulls and starts the service docker:dind, and only then does the job really start: cloning code and so on.
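A job relying on DinD is typically declared along these lines in .gitlab-ci.yml; this is a simplified sketch based on GitLab’s DinD documentation, not our actual pipeline, and the script lines are purely illustrative:

build:
  image: docker:20.10.9
  services:
    - docker:20.10.9-dind
  variables:
    # tell the docker CLI in the job container to talk to the dind service
    DOCKER_HOST: tcp://docker:2375
    DOCKER_TLS_CERTDIR: ""
  script:
    - docker-compose pull
    - docker-compose run --rm app vendor/bin/phpunit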

So what’s a “docker+machine executor”? It’s basically a GitlabCI executor meant to create one worker per job. In our case, these workers are created on AWS. The newly created instance has a working Docker environment out of the box to boot up the job container and run your job inside it.

docker+machine executor with AWS driver

It’s basically meant for auto-scaling your CI infrastructure, and it’s actually based on a blog post you can find here.
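For reference, the runner side of such a setup lives in the runner’s config.toml and looks roughly like this; a trimmed sketch with placeholder values, not our exact configuration:

[[runners]]
  name = "aws-autoscaler"
  url = "https://gitlab.example.com/"
  token = "RUNNER_TOKEN"
  executor = "docker+machine"
  [runners.docker]
    image = "docker:20.10.9"
    privileged = true
  [runners.machine]
    IdleCount = 2
    IdleTime = 3600                     # idle workers are destroyed after 1 hour
    MachineDriver = "amazonec2"
    MachineName = "gitlab-runner-%s"
    MachineOptions = [
      "amazonec2-instance-type=m5.large",
      "amazonec2-region=eu-west-1",
    ]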

Alright, we also mentioned the service docker:dind; dind stands for “Docker in Docker”. If you’re experienced enough with Docker, you already know you might have a huge issue right in front of you.

Let me explain. When we talk about Docker, there is one important part: downloading and/or building images. We do build images sometimes, but since we don’t ship in Docker our images don’t change very often, so for us it’s mainly a matter of downloading images.

Anyway in both cases, a Docker image is made of layers, and each of these layers is stored on your machine by Docker.

This is called the layer cache, and it just works out of the box on your local machine. Give it a try: pull the same image twice, and you’ll see Docker is smart enough not to re-download anything on the second run.

➜  ~ docker pull hello-world
latest: Pulling from library/hello-world
2db29710123e: Pull complete
Status: Downloaded newer image for hello-world:latest
➜  ~ docker pull hello-world
latest: Pulling from library/hello-world
Status: Image is up to date for hello-world:latest

Here we can see Docker downloaded the layer 2db29710123e on the first run only.

If you google this, you’ll find documentation and articles explaining that you can also use the --cache-from flag to reuse the layer cache from a previous image. While this is true, it still forces you to download those layers, so this flag really only helps build times: downloading layers is faster than rebuilding them. But as I said, I’m not really concerned with build times here.
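For the record, that approach looks something like this, with a placeholder image name:

# pull the previous image so its layers are available locally (ignore failure on the first build)
docker pull registry.example.com/app:latest || true
# reuse its layers instead of rebuilding the matching steps
docker build --cache-from registry.example.com/app:latest -t registry.example.com/app:latest .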

Can you see this huge issue just in front of us?

This is actually one of the main limitations of Docker in Docker. Docker stores everything in the /var/lib/docker directory, but when you run Docker in Docker, the inner /var/lib/docker is simply lost when the container hosting the inner daemon is removed.

This makes sense doesn’t it?

Now, in GitlabCI, we know the job container and the docker:dind service get destroyed after each job.

This is even documented by Gitlab as a hard limitation of Docker in Docker, which is only half the truth. It is a hard limitation for Gitlab shared runners, where there’s nothing you can do about it, but we have dedicated AWS runners.

Why is this such a big issue? We’re basically downloading and destroying the same thing again and again, all day long, across all jobs.

How much of the same thing are we downloading? About 2.5 GB per job

How many jobs are we running? Minimum 18 per pipeline

How many pipelines are we running? An average of 20 per day
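Do the math: 2.5 GB × 18 jobs × 20 pipelines ≈ 900 GB of downloads per day.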

That’s roughly 1 terabyte per day, and it’s still an underestimate!

This is obviously a lot. Even in a LAN context you wouldn’t want to download 1 terabyte per day, and we’re not even in a LAN context: we do have a registry mirror for this exact purpose, but a registry mirror cannot serve private registries like ours.
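For the curious, a pull-through mirror is configured on the daemon side, typically via /etc/docker/daemon.json (placeholder URL below); the catch is that the Docker daemon only consults mirrors for Docker Hub images, hence the private-registry limitation:

{
  "registry-mirrors": ["https://mirror.example.com"]
}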

The 1 terabyte question: how do we use the Docker layer cache in DinD?

This is one of these questions Google cannot answer for you!

We know we’re interested in the content of /var/lib/docker, and we have full control over the dedicated runner.

Starting from there, the idea is simple: we need to find a way to persist the content of the inner /var/lib/docker. But there is an issue: this directory lives inside the docker:dind service, which is a distinct container from our job container, and we don’t have much control over what happens in that container.

Some people have tried bind mounting this directory from the runner configuration (I can now say I’m one of them), but it just blows up.

runner_image_volumes = ["/var/lib/docker:/var/lib/docker:rw"]

Docker just doesn’t like multiple daemons accessing the same underlying directory, and the host of course also runs a Docker daemon to manage the job container and the DinD service.

What if we could get rid of the DinD service? This would give us more control over the context, and ultimately a way to persist this /var/lib/docker.

I’ll refer to this as a “DinD capable image”. Let’s try this!

Here is a simple Dockerfile, starting from the official docker:dind image (do not reinvent the wheel) and just copying the compose binary to get a full docker+compose environment in one container.

FROM docker:20.10.9-dind-alpine3.14
COPY entrypoint.sh /usr/local/bin/
RUN chmod +x /usr/local/bin/entrypoint.sh
COPY --from=docker/compose:alpine-1.29.2 /usr/local/bin/docker-compose /usr/local/bin/docker-compose
ENTRYPOINT ["entrypoint.sh"]

If you don’t know what COPY --from does, the name is quite self-explanatory: it copies files from another image. That can be an earlier stage in a multi-stage Dockerfile, or just another complete image.

So we take the official dind image as a base and copy the compose binary from the official compose image. Everything is official here, we’re not reinventing the wheel, this is on purpose.
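Then build and push it somewhere your runners can pull from; the registry and image name below are placeholders:

docker build -t registry.example.com/ci/dind-compose:20.10.9 .
docker push registry.example.com/ci/dind-compose:20.10.9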

And here is the entrypoint, just starting the Docker daemon silently in the background.

#!/bin/sh
dockerd > /dev/null 2>&1 &

if [ $# -ne 0 ]; then
    exec "$@"
fi

sh

We’re overriding the official entrypoint here, again on purpose: the official entrypoint is really big because it tries to guess how you will consume dind, from the socket or over HTTP(S). But at this point I already know I want to consume it through the socket only.

This just works: you can now run Docker in Docker from your own image. Sweet, isn’t it?
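A job can then use this image directly, with no docker:dind service at all. Here’s a sketch with a placeholder image name; since dockerd starts in the background, it’s worth waiting for it to answer before using it:

test:
  image: registry.example.com/ci/dind-compose:20.10.9
  script:
    # wait for the daemon started by our entrypoint to answer on the local socket
    - until docker info > /dev/null 2>&1; do sleep 1; done
    - docker-compose up -d
    - docker-compose run --rm app vendor/bin/phpunit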

Of course this container must run in privileged mode, but I won’t talk about that because ours were already running privileged.

We now need to persist the inner /var/lib/docker of this container, but it now lives in our job container, so it’s actually easy. Remember that runner_image_volumes line? It’s the one we used to create a bind mount, but it can also create plain volumes.

runner_image_volumes = ["/var/lib/docker"]

Easy! What’s happening here? The inner container’s /var/lib/docker gets persisted under the outer /var/lib/docker as an anonymous volume, so it doesn’t clash with the outer Docker daemon.

Our inner context will be stored somewhere under the outer /var/lib/docker/volumes/xxxx. This is just perfect!
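Outside of GitlabCI, this is the same mechanism as running a container with a volume that has no host path; the image name is hypothetical again:

# a -v flag with only a container path creates an anonymous volume,
# stored by the host daemon under /var/lib/docker/volumes/<random-id>/
docker run -it --privileged -v /var/lib/docker registry.example.com/ci/dind-compose:20.10.9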

Why? Because the outer context, the worker created by the docker+machine executor, can stay alive: it’s only killed after 1 hour of idleness, which just never happens during the workday, and we can tweak that if we need to.

So when our jobs end, we no longer destroy everything: each worker maintains its own Docker context as long as it stays alive, and of course that includes the layer cache.

What’s the result?

Remember that terabyte per day? It’s just gone: workers now download images only once, when they first boot up.

From a time point of view?

That traffic was an incompressible 3-minute download per job, no matter what the job did.

This alone made our pipelines at least 30% faster, a whole 10-minute boost on every pipeline.

Developers will love this of course, and we all want developers to be happy, right?

No more random failures, no more network timeouts, no more 1-hour pipelines: our pipelines now take only 17 minutes!

Most importantly, this is rock solid all day long.

And guess what: while this works, you don’t even need the “DinD capable image”. I later figured out that my volume also gets mounted into the docker:dind service.

Volumes are actually mounted everywhere by the runner: into the “predefined” container, into the job container, and also into the service containers. This makes sense after all: how would the DinD service work otherwise?

So this was ultimately a one-line solution in the runner config, but it took us roughly 6 months to sort it out.

runner_image_volumes = ["/var/lib/docker"]

There is one more thing we did to get down to 17 minutes, so I’ll share it here as a bonus: tmpfs.

tmpfs stands for temporary file system; it lets you create volumes backed by memory. You can use it directly on your machine if you want to, but Docker and Compose also support this feature.
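With plain Docker, that’s the --tmpfs flag. For example, assuming the official mysql image:

# mount MySQL's data directory as an in-memory filesystem
docker run --rm --tmpfs /var/lib/mysql -e MYSQL_ALLOW_EMPTY_PASSWORD=yes mysql:8.0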

If you have disk-intensive operations, like tests performing writes against a real database, you definitely don’t want them hitting a physical disk: doing it in memory is far more efficient.

Here’s how to easily define a tmpfs for mysql in a compose file for example:

services:
  mysql:
    tmpfs:
      - /var/lib/mysql

This alone brought another 10-minute boost to jobs performing write operations, finally bringing our pipelines down to 17 minutes.

We literally saved hours and terabytes per day. Better yet, we now have happy developers!

You can’t build great software without happy developers.

Let me know: have you ever worried about this?
