How to write faster, leaner Dockerfiles for Node with Yarn and Alpine
TL;DR: Using Yarn in place of NPM for your Node dependency management is an easy win with Docker. Combine it with a smaller base image like Alpine and it becomes a no-brainer if you are looking to trim the fat from your Docker images. In some cases (as demonstrated below), you can cut close to 90% of the original image size and shave valuable time off your builds. In this simple example, a Yarn + Alpine build saves about 640 MB of on-disk image size and over 30 seconds of cold build time compared with the standard Node + NPM image.
Overview
One of the goals the Operations team at Lonely Planet has is to define standards for our Docker development and build processes. Part of this goal is to determine how we balance the trade-off of size vs simplicity in terms of our Docker images. Often times there are significant savings in image artifact size with very little added complexity just by utilizing a different base image. We prefer to deploy “slim” images to our Kubernetes clusters whenever possible, as they are faster to build, publish and run (and subsequently roll-back if necessary).
There is no shortage of posts explaining how to slim your images down, and this post won’t necessarily be an exercise in that. Instead it focuses specifically on Node applications, comparing two base images (the Alpine base image and the standard Node base image) and how using Yarn instead of NPM affects the overall size of the image artifact.
I will also do an analysis on how utilizing Alpine and Yarn affects our overall image build times, as these are very relevant metrics.
All build times should be read with the test machine in mind: a Late 2013 15-inch MacBook Pro with 16 GB of RAM and a 2 GHz Intel Core i7. All tests were performed using Docker for Mac version 1.13.1-rc2-beta41.
Build time metrics include a “cold build” and a “warm build”. Cold means the FROM image was not already available on the host. Warm means the FROM image was already fetched and available locally, but without any additional layer caching.
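One way to take cold and warm measurements like these can be sketched in Node. This is not how the numbers in this post were collected; the helper names `timeCommand`, `coldBuild` and `warmBuild` are hypothetical. The sketch assumes the `docker` CLI is on the PATH, that a Dockerfile sits in the current directory, and that no other local image shares layers with the base image (otherwise removing the tag does not force a full re-download).

```javascript
const childProcess = require('child_process');

// Run a command to completion and return how long it took, in seconds.
function timeCommand(command, args) {
  const start = Date.now();
  const result = childProcess.spawnSync(command, args, { stdio: 'inherit' });
  if (result.error) throw result.error;
  if (result.status !== 0) {
    throw new Error(command + ' exited with status ' + result.status);
  }
  return (Date.now() - start) / 1000;
}

// Cold build: remove the base image first so the FROM image has to be pulled again.
// A failure of `docker rmi` (for example, image not present) is deliberately ignored.
function coldBuild(baseImage, tag) {
  childProcess.spawnSync('docker', ['rmi', '-f', baseImage], { stdio: 'ignore' });
  return timeCommand('docker', ['build', '--no-cache', '-t', tag, '.']);
}

// Warm build: the base image is already local, but layer caching is disabled.
function warmBuild(tag) {
  return timeCommand('docker', ['build', '--no-cache', '-t', tag, '.']);
}
```

For example, `coldBuild('node:6.9.5', 'thenayr/node-npm:6.9.5')` followed by `warmBuild('thenayr/node-npm:6.9.5')` would produce one cold and one warm measurement for the first image.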
Setting up our application
Our sample Node app uses the react-kickstart starter kit (I found it here), as I believe it accurately reflects the base state of many modern Node web applications (node + babel + react). You can see the finished product at my forked repository. There is a feature branch for each image type (node-npm, node-npm-slim, node-yarn, node-yarn-slim).
For brevity’s sake, I will not be using docker-compose in this project. Our only goal is to determine the production image artifact size, so development niceties like volumes and onbuild commands are purposefully omitted. The Dockerfiles are also constructed in a knowingly lazy manner, so do not copy them.
Bootstrapping our bootstrapper
This section is intentionally brief and won’t explain much of what is happening. The summary is that we add a shell script that will run the webpack build command to generate our production artifacts followed by executing our Node server application.
Docker image 1 — Node base image + NPM
Our first image will utilize the standard Node 6.9.5 base image.
Running docker build and docker run gets our app into a “production” ready state: our image contains all of the dependencies it needs to run.
Here are the build times for this image:
- Cold build: 119 seconds
- Warm build: 89 seconds
Using Microbadger we can see a breakdown of our total (compressed) image size:
Compressed, the image is 282MB; uncompressed on disk, it is 742 MB. I would consider this a fairly bloated image considering how little the application is doing.
REPOSITORY TAG IMAGE ID CREATED SIZE
thenayr/node-npm 6.9.5 3954ac331877 2 hours ago 742 MB
Docker image 2 — Alpine Node base image + NPM
Now let’s switch the base image to an Alpine base and see how much space we save:
Here are the build times for this image:
- Cold build: 99 seconds
- Warm build: 91 seconds
Let’s check out Microbadger again and see how this image stacks up:
It is obvious right away that the image we are inheriting from, node:6.9.5-alpine, is significantly smaller (16MB vs 245MB).
Our total compressed image size is now only 51.8MB, an 80% reduction.
REPOSITORY TAG IMAGE ID CREATED SIZE
thenayr/node-npm 6.9.5-alpine 4298630b62dc About an hour ago 132 MB
thenayr/node-npm 6.9.5 3954ac331877 2 hours ago 742 MB
The total on-disk image size is also significantly smaller, 132MB compared to 742MB.
Docker image 3 — Node base image + Yarn
Yarn promises us three benefits over using standard NPM to install dependencies: consistency, speed and security. All three become even more important in a Dockerized environment. Let’s take a look at how it impacts our image size.
A couple of things are worth noting here. It takes a bit more work with Yarn to replicate the behavior of npm prune --production. The end goal is to build our app in production mode and then remove all of the non-production dependencies from our image. For this we use a custom script to iterate over devDependencies and delete each one. Yarn also caches modules in a directory *outside* of node_modules, so we need to run an extra step to remove those from our image: yarn cache clean.
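The pruning step described above could be sketched roughly as follows. This is not the script from the repository; the function names `devDependencyNames`, `removeDir` and `removeDevDependencies` are hypothetical. The sketch assumes a flat node_modules layout where each devDependency has its own top-level folder, and it removes only the packages listed directly under devDependencies, not anything they pulled in transitively.

```javascript
const fs = require('fs');
const path = require('path');

// Return the package names listed under devDependencies in a parsed package.json.
function devDependencyNames(pkg) {
  return Object.keys(pkg.devDependencies || {});
}

// Recursively delete a directory (written by hand so it also runs on older Node versions).
function removeDir(dir) {
  if (!fs.existsSync(dir)) return;
  fs.readdirSync(dir).forEach(function (entry) {
    const full = path.join(dir, entry);
    if (fs.lstatSync(full).isDirectory()) {
      removeDir(full);
    } else {
      fs.unlinkSync(full);
    }
  });
  fs.rmdirSync(dir);
}

// Delete each devDependency's folder from node_modules and return the names removed.
function removeDevDependencies(projectRoot) {
  const pkgPath = path.join(projectRoot, 'package.json');
  const pkg = JSON.parse(fs.readFileSync(pkgPath, 'utf8'));
  const removed = [];
  devDependencyNames(pkg).forEach(function (name) {
    const target = path.join(projectRoot, 'node_modules', name);
    if (fs.existsSync(target)) {
      removeDir(target);
      removed.push(name);
    }
  });
  return removed;
}
```

Calling `removeDevDependencies(process.cwd())` after the production build has finished would strip the build-only packages before the image is committed.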
Here are the build times for this image:
- Cold build: 110 seconds
- Warm build: 78 seconds
So let’s see how this image stacks up:
This one weighs in at 259MB compressed, compared to our first image (Node + NPM), which came in at around 280MB, saving us around 8%. On disk they come in at 706MB (Yarn) and 742MB (NPM) respectively. Not too bad for such an easy switch.
REPOSITORY TAG IMAGE ID CREATED SIZE
thenayr/node-npm 6.9.5-yarn e0c6ba52878a 8 minutes ago 706 MB
thenayr/node-npm 6.9.5-alpine 4298630b62dc 18 hours ago 132 MB
thenayr/node-npm 6.9.5 3954ac331877 19 hours ago 742 MB
Here is a breakdown of all of our current images. Alpine + NPM is in the lead by far, weighing in at a paltry 132MB.
Docker image 4 — Alpine Node base image + Yarn
Our final image will be built on the Alpine base image from earlier, with Yarn in place of NPM for installing our packages.
Our Dockerfile doesn’t change much from the previous Yarn image; we just inherit from the Alpine base image now.
Here are the build times for this image:
- Cold build: 86 seconds
- Warm build: 79 seconds
Our new compressed image size is a mere 29MB! That’s pretty impressive even in comparison to our first Alpine image, which totaled around 51MB. We shaved roughly 44% off an already small image.
Uncompressed, the image comes in at around 97MB, reasonable by any measure. Let’s tally up all of the uncompressed image sizes:
TAG                  SIZE
6.9.5-alpine-yarn    97.7 MB
6.9.5-yarn           706 MB
6.9.5-alpine         132 MB
6.9.5                742 MB
Our final image (alpine + yarn) is just 13% the size of our first image! Pretty impressive results.
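The percentages quoted here follow directly from the uncompressed sizes in the tally. A small sketch of the arithmetic, using a hypothetical helper named `shareOfBaseline` and the standard Node + NPM image as the baseline:

```javascript
// Uncompressed sizes in MB, copied from the tally of image sizes.
const sizesMb = {
  '6.9.5': 742,
  '6.9.5-alpine': 132,
  '6.9.5-yarn': 706,
  '6.9.5-alpine-yarn': 97.7
};

// Size of an image as a percentage of the baseline, rounded to one decimal place.
function shareOfBaseline(sizeMb, baselineMb) {
  return Math.round((sizeMb / baselineMb) * 1000) / 10;
}

// Print each tag with its size and its share of the Node + NPM baseline.
Object.keys(sizesMb).forEach(function (tag) {
  const share = shareOfBaseline(sizesMb[tag], sizesMb['6.9.5']);
  console.log(tag + ': ' + sizesMb[tag] + ' MB (' + share + '% of baseline)');
});
```

The Alpine + Yarn image works out to 13.2% of the baseline, which is where the "close to 90%" saving comes from; Alpine + NPM lands at 17.8% and Node + Yarn at 95.1%.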
Summary
Yarn will give you faster Docker image builds. It seems like an obvious choice to start migrating over to Yarn. It’s backwards compatible with most of the critical functions of NPM and is an easy win. In our testing it had the biggest effect on warm build time, whereas switching between the Node and Node-Alpine base images made little difference there.
In the words of the great Shia LaBeouf:
DO IT.
With increased support by many NPM modules, it also makes sense to start migrating your Node images over to Alpine. It’s worth mentioning that in this really basic example there aren’t a whole lot of factors in play, so a more complex application might run into difficulties such as missing build utilities.
In both scenarios we tested, there are significant cold-build time savings from using Yarn and from using Alpine. The initial FROM image fetch is much quicker when that layer is tens of MB instead of hundreds.
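To make the build time comparison concrete, the measurements from the four sections can be lined up against the Node + NPM baseline. This is a minimal sketch; the helper name `savingsVsBaseline` is hypothetical and the numbers are simply the ones reported earlier in this post.

```javascript
// Build times in seconds, copied from the four image sections.
const builds = [
  { image: 'node + npm', cold: 119, warm: 89 },
  { image: 'alpine + npm', cold: 99, warm: 91 },
  { image: 'node + yarn', cold: 110, warm: 78 },
  { image: 'alpine + yarn', cold: 86, warm: 79 }
];

// Seconds saved by each image relative to the first entry (node + npm).
// A negative number means the build was slower than the baseline.
function savingsVsBaseline(rows) {
  const base = rows[0];
  return rows.map(function (row) {
    return {
      image: row.image,
      cold: base.cold - row.cold,
      warm: base.warm - row.warm
    };
  });
}

savingsVsBaseline(builds).forEach(function (row) {
  console.log(row.image + ': cold ' + row.cold + 's saved, warm ' + row.warm + 's saved');
});
```

Yarn accounts for 10 to 11 seconds of warm build savings, while Alpine on its own changes warm builds by only a second or two. On cold builds, Alpine + Yarn saves 33 seconds, much of it from the smaller base image fetch.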