How to write faster, leaner Dockerfiles for Node with Yarn and Alpine

Ryan van Niekerk
Feb 23, 2017

TL;DR: Using Yarn in place of NPM for your Node dependency management is an easy win with Docker. Combine it with a smaller base image like Alpine and it becomes a no-brainer if you are looking to trim the fat from your Docker images. In some cases (as demonstrated below), you can cut close to 90% of the original image size and shave valuable time off your builds. In this simple example, a Yarn + Alpine build saves roughly 640MB of on-disk image size and over 30 seconds of cold build time compared to the standard Node + NPM image.

Summary of image sizes and build times

Overview

One of the goals of the Operations team at Lonely Planet is to define standards for our Docker development and build processes. Part of this goal is to determine how we balance the trade-off of size vs simplicity in our Docker images. Often there are significant savings in image artifact size with very little added complexity just by using a different base image. We prefer to deploy “slim” images to our Kubernetes clusters whenever possible, as they are faster to build, publish and run (and subsequently roll back if necessary).

There is no shortage of posts explaining how to slim your images down, and this post won’t necessarily be an exercise in that. Instead, it focuses specifically on Node applications, comparing two base images (the Alpine base image and the standard Node base image) and how using Yarn as opposed to NPM affects the overall size of the image artifact.

I will also analyze how Alpine and Yarn affect our overall image build times, as these are very relevant metrics.

All build time metrics should be read with my hardware in mind: I’m running these tests on a Late 2013, 15-inch MacBook Pro with 16 GB of RAM and a 2 GHz Intel Core i7. All tests were performed using Docker for Mac version 1.13.1-rc2-beta41.

Build time metrics also include a “cold build” and a “warm build”. Cold means the FROM image was not available on the host already. Warm means the FROM image was already fetched and available locally, without any additional layer caching.
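The post does not show the exact commands used for timing, so the following is only a minimal sketch of how a cold and a warm build could be measured. The helper names time_cmd, cold_build and warm_build are hypothetical; docker rmi, docker pull and docker build --no-cache are standard Docker CLI commands.

```shell
#!/bin/sh
# Hypothetical helpers for timing builds; not the author's actual method.

# time_cmd CMD [ARGS...]: run a command quietly and print the elapsed whole seconds.
time_cmd() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  rc=$?
  end=$(date +%s)
  echo "$((end - start))"
  return "$rc"
}

# cold_build BASE_IMAGE TAG: remove the FROM image first so the build has to fetch it.
# Caveat: if other local images share the same layers, the fetch is not fully cold.
cold_build() {
  base_image=$1
  tag=$2
  docker rmi -f "$base_image" >/dev/null 2>&1 || true
  time_cmd docker build --no-cache -t "$tag" .
}

# warm_build BASE_IMAGE TAG: make sure the FROM image is local, but skip the layer cache.
warm_build() {
  base_image=$1
  tag=$2
  docker pull "$base_image" >/dev/null 2>&1
  time_cmd docker build --no-cache -t "$tag" .
}
```

After sourcing these functions in the project directory, something like `cold_build node:6.9.5 thenayr/node-npm:6.9.5` followed by `warm_build node:6.9.5 thenayr/node-npm:6.9.5` would print one number of seconds per build.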

Setting up our application

Our kickstart application

Our sample Node app uses the react-kickstart starter kit (I found it here), as I believe it accurately reflects the base state of many modern Node web applications (Node + Babel + React). You can see the finished product at my forked repository. There is a feature branch for each image type (node-npm, node-npm-slim, node-yarn, node-yarn-slim).

For brevity’s sake, I will not be using docker-compose in this project. Our only goal is to determine the production image artifact size, so development niceties like volumes and onbuild commands will be purposefully omitted. The Dockerfiles are also constructed in a knowingly lazy manner, so do not copy them.

Bootstrapping our bootstrapper

This section is intentionally brief and won’t explain much of what is happening. In short, we add a shell script that runs the webpack build command to generate our production artifacts and then executes our Node server application.
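The original script is not reproduced here, so the following is a sketch of what such a bootstrap script might look like. The file name start.sh, the build command `npm run build` and the server entry point `server.js` are all assumptions; substitute whatever the project actually defines.

```shell
#!/bin/sh
# start.sh (hypothetical name): build the production bundle, then start the server.
# BUILD_CMD and SERVER_CMD defaults are assumptions, not taken from the repository.
BUILD_CMD=${BUILD_CMD:-"npm run build"}
SERVER_CMD=${SERVER_CMD:-"node server.js"}

bootstrap() {
  # Intentionally unquoted so the command string splits into words.
  # If the build fails, stop here instead of serving stale or missing assets.
  $BUILD_CMD || return 1
  # exec replaces the shell, so the server process receives signals directly.
  exec $SERVER_CMD
}

# Only run when invoked as: sh start.sh run
case "${1:-}" in
  run) bootstrap ;;
esac
```

Using exec for the final command is a common choice in container entry scripts because it avoids leaving a shell process sitting between Docker and the server.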

Docker image 1 — Node base image + NPM

Our first image will utilize the standard Node 6.9.5 base image.

Running docker build and docker run gets our app into a “production”-ready state: our image contains all of the dependencies it needs to run.

Here are the build times for this image:

  • Cold build: 119 seconds
  • Warm build: 89 seconds

Using Microbadger we can see a breakdown of our total (compressed) image size:

Microbadger for Node + NPM image

Compressed, the image is 282MB; uncompressed on disk, it is 742MB. I would consider this a fairly bloated image considering how little the application is doing.

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
thenayr/node-npm    6.9.5               3954ac331877        2 hours ago         742 MB

Docker image 2 — Alpine Node base image + NPM

Now let’s switch out the base image for an Alpine base and see how much space we save:

Dockerfile for Alpine Node
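The embedded Dockerfile is not reproduced here. As a hedged illustration of how small the change is, the sketch below uses a hypothetical helper, write_dockerfile, that emits a deliberately lazy Dockerfile in which only the FROM line depends on the chosen base image. The /app directory, the `npm run build` script, port 3000 and server.js are assumptions for illustration, not details from the repository.

```shell
#!/bin/sh
# write_dockerfile BASE_IMAGE [OUTPUT_PATH]
# Hypothetical helper: writes a minimal Node + NPM Dockerfile for the given base image.
write_dockerfile() {
  base_image=$1
  output_path=${2:-Dockerfile}
  cat > "$output_path" <<EOF
FROM ${base_image}
WORKDIR /app
COPY . /app
RUN npm install && npm run build && npm prune --production
EXPOSE 3000
CMD ["node", "server.js"]
EOF
}
```

Calling `write_dockerfile node:6.9.5` and `write_dockerfile node:6.9.5-alpine` would produce two files that differ only in their first line, which is the whole of the switch described above.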

Here are the build times for this image:

  • Cold build: 99 seconds
  • Warm build: 91 seconds

Let’s check out Microbadger again and see how this image stacks up:

Microbadger image layer size breakdown

It is obvious right away that the image we are inheriting from, node:6.9.5-alpine, is significantly smaller (16MB vs 245MB).

Our total compressed image size is now only 51.8MB, a reduction of roughly 80% from the 282MB of the first image.

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
thenayr/node-npm    6.9.5-alpine        4298630b62dc        About an hour ago   132 MB
thenayr/node-npm    6.9.5               3954ac331877        2 hours ago         742 MB

The total on-disk image size is also significantly smaller, 132MB compared to 742MB.

Docker image 3 — Node base image + Yarn

Yarn promises us three benefits over using standard NPM to install dependencies: consistency, speed and security. All three become even more important in a Dockerized environment. Let’s take a look at how it impacts our image size.

Dockerfile for Yarn install

A couple of things are worth noting here. It takes a bit more work with Yarn to replicate the behavior of npm prune --production. The end goal is to build our app in production mode and then remove all of the non-production dependencies from our image. For this we use a custom script that iterates over devDependencies and deletes each one. Yarn also caches modules in a directory *outside* of node_modules, so we need an extra step to remove those from our image: yarn cache clean.

Script to delete devDependencies in our yarn setup
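The original script is not reproduced here, so this is a sketch of the idea rather than the author's code. The helper names list_dev_deps and remove_modules are hypothetical. It assumes node is on the PATH and package.json is in the current directory, and it removes only the top-level directory of each devDependency, not anything those packages pulled in transitively.

```shell
#!/bin/sh
# list_dev_deps: print each devDependencies key from ./package.json, one per line.
list_dev_deps() {
  node -p 'Object.keys(require("./package.json").devDependencies || {}).join("\n")'
}

# remove_modules [MODULES_DIR]: read package names on stdin and delete each
# matching directory under MODULES_DIR (defaults to node_modules).
remove_modules() {
  modules_dir=${1:-node_modules}
  while IFS= read -r name; do
    # Skip blank lines so an empty devDependencies list deletes nothing.
    [ -n "$name" ] || continue
    # ${modules_dir:?} aborts rather than expanding to "/" if the variable is empty.
    rm -rf "${modules_dir:?}/$name"
  done
}

# Intended use during the image build, after the production bundle exists:
#   list_dev_deps | remove_modules node_modules && yarn cache clean
```

Splitting the listing from the deletion keeps the destructive part easy to try on a scratch directory before it is wired into a Dockerfile.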

Here are the build times for this image:

  • Cold build: 110 seconds
  • Warm build: 78 seconds

So let’s see how this image stacks up:

Microbadger for Node + Yarn

This one weighs in at 259MB compressed, compared to our first image (Node + NPM), which came in at 282MB, saving us around 8%. On disk they come in at 706MB (Yarn) and 742MB (NPM) respectively. Not too bad for such an easy switch.

REPOSITORY          TAG                 IMAGE ID            CREATED             SIZE
thenayr/node-npm    6.9.5-yarn          e0c6ba52878a        8 minutes ago       706 MB
thenayr/node-npm    6.9.5-alpine        4298630b62dc        18 hours ago        132 MB
thenayr/node-npm    6.9.5               3954ac331877        19 hours ago        742 MB

That is the breakdown of all of our images so far. Alpine + NPM is in the lead by far, weighing in at a paltry 132MB.

Docker image 4 — Alpine Node base image + Yarn

Our final image will be built on the Alpine base image from earlier, with Yarn in place of NPM for installing our packages.

Dockerfile for Alpine Node with Yarn

Our Dockerfile doesn’t change much from the previous Yarn image; we just inherit from the Alpine base image now.

Here are the build times for this image:

  • Cold build: 86 seconds
  • Warm build: 79 seconds

Our new compressed image size is a mere 29MB! That’s pretty impressive even in comparison to our first Alpine image, which totaled around 51MB. We shaved roughly 44% off an already small image.

Uncompressed, the image comes in at around 97MB, reasonable by any measure. Let’s tally up all of the uncompressed image sizes:

Tag                  Size
6.9.5-alpine-yarn    97.7 MB
6.9.5-yarn           706 MB
6.9.5-alpine         132 MB
6.9.5                742 MB

Our final image (alpine + yarn) is just 13% the size of our first image! Pretty impressive results.
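For readers who want to check the percentages quoted in this post, here is a small sketch using awk. The helper names pct_of and pct_saved are hypothetical; the sizes passed in are the ones reported above.

```shell
#!/bin/sh
# pct_of PART WHOLE: print PART as a percentage of WHOLE, to one decimal place.
pct_of() {
  awk -v part="$1" -v whole="$2" 'BEGIN { printf "%.1f\n", 100 * part / whole }'
}

# pct_saved OLD_SIZE NEW_SIZE: print the percentage reduction from OLD_SIZE to NEW_SIZE.
pct_saved() {
  awk -v old_size="$1" -v new_size="$2" 'BEGIN { printf "%.1f\n", 100 * (old_size - new_size) / old_size }'
}

# Uncompressed: Alpine + Yarn (97.7 MB) as a share of Node + NPM (742 MB)
#   pct_of 97.7 742      prints 13.2
# Compressed: Node + NPM (282 MB) down to Alpine + Yarn (29 MB)
#   pct_saved 282 29     prints 89.7
# Compressed: Alpine + NPM (51.8 MB) down to Alpine + Yarn (29 MB)
#   pct_saved 51.8 29    prints 44.0
```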

Summary

Yarn will give you faster Docker image builds. It seems like an obvious choice to start migrating over to Yarn: it is backwards compatible with most of the critical functions of NPM and is an easy win. In our testing it had the biggest effect on build time, cutting warm builds by 11 to 12 seconds, whereas switching between the Node and Node Alpine base images made little difference to warm build time.

In the words of the great Shia LaBeouf:

DO IT.

With increased support from many NPM modules, it also makes sense to start migrating your Node images over to Alpine. It’s worth mentioning that this really basic example doesn’t have a whole lot of factors in play, so a more complex application might run into some difficulty, for example missing build utilities.

Both Yarn and Alpine brought significant time savings for cold builds in every scenario we tested. The initial FROM image fetch is much quicker when that layer is tens of MB instead of hundreds.

Ryan van Niekerk

DevOps Engineer at Lonely Planet, Ketogenic freak. All views and opinions are strictly my own.