Using Yarn with Docker

Published in

HackerNoon.com

5 min read · Oct 13, 2016

Facebook recently released Yarn, a new Node.js package manager built on top of the npm registry, massively reducing install times and shipping a deterministic build out of the box.

Determinism has always been a problem with npm, and solutions like npm shrinkwrap do not work well. This makes it hard to use an npm-based system across multiple developers and in continuous integration. In addition, npm is slow on complex package.json files, and the resulting long build times are a serious blocker when using Docker for local development.

This article discusses how to use Yarn with Docker for Node.js development and deployment.

TL;DR

Clone the boilerplate:

git clone https://github.com/mfornasa/DockerYarn.git

Enter the directory:

cd DockerYarn

Build the container:

./build.sh

Run it:

docker run yarn-demo node -e "console.log('Hello, World')"

The first time you build the container, Yarn fetches npm dependencies for you. After that, Yarn is executed only when you modify your package.json, and it uses the cache from previous executions. On top of that, you get determinism: the same dependency tree is installed every time and on every machine. And it’s blazing fast!

Let’s get started

The procedure works on Mac and Linux. We are going to use the RisingStack Node.js Docker image for Node 6. Please install Yarn on your machine before proceeding.

Download the Yarn installation package into a local folder:

wget https://yarnpkg.com/latest.tar.gz
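The Dockerfile in the next step runs mv /opt/dist /opt/yarn, so it assumes the archive unpacks into a top-level dist directory. A quick way to check that assumption before building is sketched below; archive_top_level is a hypothetical helper name, not part of Yarn or Docker.

```shell
#!/bin/bash
# Hypothetical helper: print the distinct top-level entries of a .tar.gz
# archive, so you can see which directory it unpacks into.
archive_top_level() {
  tar tzf "$1" | cut -d / -f 1 | sort -u
}
```

Run it as archive_top_level latest.tar.gz. If it prints a name other than dist, the mv line in the Dockerfile needs to be adjusted to match.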

Create a new Dockerfile:

FROM risingstack/alpine:3.4-v6.7.0-4.0.0

WORKDIR /opt/app

# Install yarn from the local .tgz
RUN mkdir -p /opt
ADD latest.tar.gz /opt/
RUN mv /opt/dist /opt/yarn
ENV PATH "$PATH:/opt/yarn/bin"

# Install packages using Yarn
ADD package.json /tmp/package.json
RUN cd /tmp && yarn
RUN mkdir -p /opt/app && cd /opt/app && ln -s /tmp/node_modules

This is based on a well-known trick that uses Docker layer caching to avoid reinstalling all your modules each time you build the container. Because package.json is added in its own layer before Yarn runs, Yarn is executed only when you change package.json (and the first time, of course).

Initialize package.json:
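One way to see the layer cache at work is to build twice and count how many steps were reused the second time. The sketch below assumes the build output marks reused steps with "Using cache" (classic builder) or "CACHED" (BuildKit); the exact wording depends on your Docker version, and count_cached_steps is a hypothetical helper name.

```shell
#!/bin/bash
# Hypothetical helper: read docker build output on stdin and print how many
# lines report a step served from the layer cache.
# Assumption: cached steps are marked "Using cache" or "CACHED".
count_cached_steps() {
  # grep -c exits non-zero when the count is 0, so mask that status.
  grep -c -E 'Using cache|CACHED' || true
}
```

For example, docker build . -t yarn-demo 2>&1 | count_cached_steps should print a larger number on a second build with an unchanged package.json than on the first.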

yarn init

Add your first package:

yarn add react

Build and run your new container:

docker build . -t yarn-demo
docker run yarn-demo node -e "console.log('Hello, World')"

Congratulations! You’re using Yarn with Docker.

Wait! What about yarn.lock?

Yarn stores the exact version of each package and sub-package in yarn.lock, so that it can reproduce exactly the same dependency tree on each run. Both package.json and yarn.lock must be checked into source control. Since we run Yarn inside the container, we need to retrieve yarn.lock from it. Luckily, it’s not hard to extract yarn.lock after each build. Simply replace the ADD package.json line in the Dockerfile with the following:

ADD package.json yarn.lock /tmp/

and build the container using the following command:

docker build . -t yarn-demo && docker run --rm --entrypoint cat yarn-demo:latest /tmp/yarn.lock > yarn.lock

After the build, yarn.lock is copied to your working directory, and it will be reused on the next Docker build, installing the same dependencies each time.
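Redirecting straight into yarn.lock means the file is emptied if the docker run command fails. A slightly more defensive sketch of the same extraction writes to a temporary file first; extract_lockfile is a hypothetical helper name, and the /tmp/yarn.lock path inside the image is the one used by the Dockerfile above.

```shell
#!/bin/bash
# Hypothetical helper: copy /tmp/yarn.lock out of an image.
# Usage: extract_lockfile IMAGE [DEST]   (DEST defaults to ./yarn.lock)
extract_lockfile() {
  local image="$1" dest="${2:-yarn.lock}" tmp
  tmp="$(mktemp)" || return 1
  # Write to a temporary file first, so a failed run leaves DEST untouched.
  if docker run --rm --entrypoint cat "$image" /tmp/yarn.lock > "$tmp"; then
    cat "$tmp" > "$dest"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

After a successful docker build, call it as extract_lockfile yarn-demo:latest.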

Congratulations! Now you have deterministic Yarn execution.

Wait! Now Yarn is executed at each container build

That is correct: we are now running Yarn at each build, even if package.json has not been modified. This is because yarn.lock is copied from the container to your working directory each time, even if it has not changed, thus invalidating Docker layer caching. To solve this, we need to copy yarn.lock only if it has really changed. To do so:

Create a build.sh file:

#!/bin/bash
docker build . -t yarn-demo
docker run --rm --entrypoint cat yarn-demo:latest /tmp/yarn.lock > /tmp/yarn.lock
if ! diff -q yarn.lock /tmp/yarn.lock > /dev/null 2>&1; then
  echo "We have a new yarn.lock"
  cp /tmp/yarn.lock yarn.lock
fi

Make it executable:

chmod +x build.sh

Use it to build the container:

./build.sh

Then run the container:

docker run yarn-demo node -e "console.log('Hello, World')"

Congratulations! You now have a deterministic Yarn execution, and Yarn is executed only when you change package.json.
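The compare-then-copy step in build.sh can also be factored into a small function, which makes it easier to reuse for other generated files. This is only a sketch: copy_if_changed is a hypothetical name, and it uses cmp -s instead of diff -q for the comparison.

```shell
#!/bin/bash
# Hypothetical helper: copy SRC over DEST only when the contents differ
# (or DEST does not exist yet), and report what happened.
copy_if_changed() {
  local src="$1" dest="$2"
  if [ -f "$dest" ] && cmp -s "$src" "$dest"; then
    echo "unchanged"
  else
    cp "$src" "$dest" && echo "updated"
  fi
}
```

In build.sh this would replace the if block, for example copy_if_changed /tmp/yarn.lock yarn.lock.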

What about Yarn package cache?

Another powerful feature of Yarn is its package cache, which is stored on the local filesystem to avoid downloading packages again. Our procedure so far does not preserve the cache across container builds. This could be an issue for big package.json files.

The following build.sh solves the issue by saving the Yarn cache in your working directory.

#!/bin/bash

# Init empty cache file
if [ ! -f .yarn-cache.tgz ]; then
  echo "Init empty .yarn-cache.tgz"
  tar cvzf .yarn-cache.tgz --files-from /dev/null
fi

docker build . -t yarn-demo

docker run --rm --entrypoint cat yarn-demo:latest /tmp/yarn.lock > /tmp/yarn.lock
if ! diff -q yarn.lock /tmp/yarn.lock > /dev/null 2>&1; then
  echo "Saving Yarn cache"
  docker run --rm --entrypoint tar yarn-demo:latest czf - /root/.yarn-cache/ > .yarn-cache.tgz
  echo "Saving yarn.lock"
  cp /tmp/yarn.lock yarn.lock
fi

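You also need to add this to your Dockerfile, after the ADD package.json... line. Docker's ADD unpacks a local tar archive automatically, so the saved cache lands back in /root/.yarn-cache inside the image: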

# Copy cache contents (if any) from local machine
ADD .yarn-cache.tgz /
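To check that the cache archive is actually being populated, you can count the entries it contains. A minimal sketch, where cache_entry_count is a hypothetical helper name:

```shell
#!/bin/bash
# Hypothetical helper: print the number of entries in a .tgz archive.
# The freshly initialised, empty cache archive yields 0.
cache_entry_count() {
  # tr strips the padding some wc implementations add around the number.
  tar tzf "$1" | wc -l | tr -d ' '
}
```

Run cache_entry_count .yarn-cache.tgz before and after a build that changes yarn.lock; the count should grow once Yarn has downloaded packages.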

The cache file is not meant to be pushed to the repo, so add it to your .gitignore file.
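If you want build.sh to take care of this itself, the entry can be appended only when it is missing. This is a sketch with a hypothetical helper name, ignore_once; it assumes an existing .gitignore ends with a newline.

```shell
#!/bin/bash
# Hypothetical helper: add PATTERN to a gitignore file unless an identical
# line is already there.
# Usage: ignore_once PATTERN [FILE]   (FILE defaults to ./.gitignore)
ignore_once() {
  local pattern="$1" file="${2:-.gitignore}"
  touch "$file"
  # -x matches whole lines, -F treats the pattern as a literal string.
  if ! grep -qxF -- "$pattern" "$file"; then
    echo "$pattern" >> "$file"
  fi
}
```

For the cache archive used here, that is ignore_once .yarn-cache.tgz.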

Congratulations, again! You now have a deterministic Yarn execution that runs only when you change package.json and that uses Yarn caching. Try this with a complex package.json file from a real project. You will be amazed!
