Speeding up a Docker build

Louise Yang
Published in KPCC Labs · Jan 5, 2018 · 3 min read

[Header image: Containers, photo by Sandor Volenszki]
As part of my work on Resound, I created Docker images to make it easy for people to deploy and run the different apps without having to manually install a bunch of libraries. Creating a Dockerfile may seem like a straightforward process, but anyone who has built images from one may run into the problem I hit when starting out: slow builds.

The backend of Resound is a Rails API with several endpoints that rely on FFMPEG for processing. FFMPEG is an extremely powerful framework for transcoding audio. Because I wanted to limit the size of my Docker image, I didn't go with any of the official FFMPEG images. Instead, I chose the official Rails base image and built FFMPEG on top of it. This gave me the flexibility to pick and choose only the bare-bones codecs and libraries Resound needs from FFMPEG:

./configure \
  --enable-gpl \
  --enable-postproc \
  --enable-swscale \
  --enable-avfilter \
  --enable-libvorbis \
  --enable-libx264 \
  --enable-shared \
  --enable-pthreads \
  --enable-libfdk-aac \
  --enable-libmp3lame \
  --enable-nonfree
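
In the Dockerfile, that configure step sits inside a RUN instruction along with downloading the source and running make. Here is a rough sketch only: the FFMPEG version, source URL, build packages, and the trimmed-down flag list are all assumptions rather than Resound's exact setup, and the external-codec flags above would each need their corresponding -dev packages installed first.

# Sketch: build FFMPEG inside the image. Version, URL, and packages are
# assumptions; the full flag list above also needs libx264-dev, libmp3lame-dev,
# libvorbis-dev, and libfdk-aac-dev present before ./configure will accept it.
RUN apt-get update && apt-get install -y build-essential curl yasm nasm && \
    curl -L https://ffmpeg.org/releases/ffmpeg-3.4.tar.gz | tar xzf - && \
    cd ffmpeg-3.4 && \
    ./configure --enable-gpl --enable-postproc --enable-swscale \
                --enable-avfilter --enable-shared --enable-pthreads && \
    make && make install && ldconfig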

Once I tweaked my Dockerfile enough to build a customized version of FFMPEG as well as the Rails API app I wrote, I thought I was all set to start developing.

During the early, iterative phase of development (adding more endpoints, gems, and bug fixes), I realized that rebuilding my Docker image was taking a long time. It didn't seem right for the build to take more than a few minutes.

My first step toward speeding up the docker build was to make better use of Docker's image caching. Each instruction in a Dockerfile creates a layer that can be cached: if, from one build to the next, an instruction and the files it pulls in haven't changed, Docker reuses the cached layer instead of re-running the instruction. To take advantage of this, I reordered my Dockerfile so that the stable parts (maintainer notes, system library installation, and the FFMPEG build) sit at the top. Because I knew the Rails code would be changing a lot, I moved all of those instructions to the bottom.
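
Roughly, the reordered Dockerfile ends up shaped like the sketch below; the base image tag, package names, and paths are placeholders rather than Resound's actual values, and the app-specific block shown in the next snippet goes at the very bottom.

# Stable instructions first, so their cached layers survive code changes.
FROM rails:5.0.1
LABEL maintainer="labs@example.com"
RUN apt-get update && apt-get install -y build-essential curl yasm nasm
# The FFMPEG download/configure/make RUN from the sketch above goes here,
# since it rarely changes once the flag list settles.

# Volatile instructions last: everything that touches the app code,
# i.e. the Gemfile / bundle install / COPY block shown next.
ENV APP_HOME /app
WORKDIR $APP_HOME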

COPY Gemfile $APP_HOME/
COPY Gemfile.lock $APP_HOME/
RUN bundle install
COPY . $APP_HOME

My Gemfile was less likely to change, and bundle install takes a significant amount of time, so it made sense to move it above the instruction that copies in the code changes. Docker could then reuse the cached gem layer instead of running bundle install on every build. I also added log files like development.log to my .dockerignore file so that simply running the app wouldn't invalidate the cache.
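
The relevant .dockerignore entries might look something like this; beyond the log files mentioned above, the entries are assumptions about a typical Rails checkout rather than Resound's actual list.

# Files that change while running the app locally but should never be
# copied into the image or invalidate the COPY . layer.
log/*.log
tmp/
.git/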

Once I got to the point where code changes only invalidated the Rails app layers, I was still dissatisfied with my Dockerfile. It was getting quite long, and the FFMPEG instructions really had nothing to do with the rest. I would also accidentally invalidate cached layers whenever I reordered instructions while experimenting.

My next step was to create a base image that included my customized FFMPEG and to build the actual Rails app off of that base image. I differentiated the two Dockerfiles as Dockerfile.ffmpeg and Dockerfile.api in my repo. Since making that change, almost all of my Dockerfile edits have gone into Dockerfile.api, and I haven't even had to touch Dockerfile.ffmpeg.
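
The split looks roughly like the sketch below. The image names and tags are placeholders for however the FFMPEG base image actually gets tagged, and each file is built with docker build -f.

# Dockerfile.ffmpeg: the Rails base image plus the custom FFMPEG build.
# Built and tagged only occasionally, e.g.:
#   docker build -f Dockerfile.ffmpeg -t resound-ffmpeg:latest .
FROM rails:5.0.1
# ...system libraries plus the FFMPEG configure/make steps from earlier...

# Dockerfile.api: the Rails app itself, built on top of that base image.
#   docker build -f Dockerfile.api -t resound-api:latest .
FROM resound-ffmpeg:latest
ENV APP_HOME /app
WORKDIR $APP_HOME
COPY Gemfile Gemfile.lock $APP_HOME/
RUN bundle install
COPY . $APP_HOME

With this split, a code change only rebuilds layers in Dockerfile.api, and the FFMPEG base image is rebuilt only when its own Dockerfile changes.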

An alternative to creating two separate images is to use Docker's multi-stage builds, but I'm still not convinced of the efficiency gains in this particular case. If I were satisfied with using an off-the-shelf FFMPEG image, multi-stage would make sense: I could just add another FROM statement for it and copy the binaries across. But since I have a specific FFMPEG configuration I want to build, that wouldn't make sense.
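
For reference, that multi-stage variant would look something like the sketch below. The source image name, tag, and binary paths are assumptions, and since a shared-library FFMPEG build would also need its libraries copied over, this is only a rough illustration of the pattern.

# Multi-stage sketch (not what Resound uses): pull ffmpeg binaries out of a
# prebuilt image instead of compiling a custom configuration.
FROM jrottenberg/ffmpeg:3.4 AS ffmpeg
FROM rails:5.0.1
COPY --from=ffmpeg /usr/local/bin/ffmpeg /usr/local/bin/ffprobe /usr/local/bin/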

Further reading:

https://www.ctl.io/developers/blog/post/caching-docker-images
