Using Yarn with Docker
Facebook recently released Yarn, a new Node.js package manager built on top of the npm registry. It massively reduces install times and ships deterministic builds out of the box.
Determinism has always been a problem with npm, and solutions like npm shrinkwrap do not work well. This makes it hard to use an npm-based system across multiple developers and on continuous integration. In addition, npm is slow with complex package.json files, which causes long build times and is a serious blocker when using Docker for local development.
This article discusses how to use Yarn with Docker for Node.js development and deployment.
TL;DR
- Clone the boilerplate:
git clone https://github.com/mfornasa/DockerYarn.git
- Enter the directory:
cd DockerYarn
- Build the container:
./build.sh
- Run it:
docker run yarn-demo node -e "console.log('Hello, World')"
The first time you build the container, Yarn fetches the npm dependencies for you. After that, Yarn is executed only when you modify your package.json, and it uses the cache from previous executions. On top of that, you get determinism: the same dependency tree is installed every time and on every machine. And it’s blazing fast!
Let’s get started
The procedure works on Mac and Linux. We are going to use the RisingStack Node.js Docker image for Node 6. Please install Yarn on your machine before proceeding.
- Download the Yarn installation package to a local folder:
wget https://yarnpkg.com/latest.tar.gz
- Create a new Dockerfile:
FROM risingstack/alpine:3.4-v6.7.0-4.0.0
WORKDIR /opt/app
# Install yarn from the local .tgz
RUN mkdir -p /opt
ADD latest.tar.gz /opt/
RUN mv /opt/dist /opt/yarn
ENV PATH "$PATH:/opt/yarn/bin"
# Install packages using Yarn
ADD package.json /tmp/package.json
RUN cd /tmp && yarn
RUN mkdir -p /opt/app && cd /opt/app && ln -s /tmp/node_modules
This is based on a well-known trick that uses Docker layer caching to avoid reinstalling all your modules each time you build the container. This way, Yarn is executed only when you change package.json (and the first time, of course).
- Initialize package.json:
yarn init
- Add your first package:
yarn add react
- Build and run your new container:
docker build . -t yarn-demo
docker run yarn-demo node -e "console.log('Hello, World')"
Congratulations! You’re using Yarn with Docker.
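The layer-caching trick in the Dockerfile above relies on Docker reusing the cached ADD layer, and every layer after it, as long as the added file is unchanged. The following is a minimal sketch that mimics that decision outside of Docker, assuming a checksum comparison is a fair stand-in. The function name deps_changed and the stamp file are hypothetical and are not part of Docker or Yarn.

```shell
#!/bin/bash
# Hypothetical helper: succeeds (exit 0) when the content of "$1" differs from
# the checksum recorded in stamp file "$2" on the previous call, and records
# the new checksum. Fails (exit 1) when the content is unchanged.
deps_changed() {
  local file="$1" stamp="$2" current
  # cksum is POSIX; reading from stdin keeps the file name out of the output.
  current="$(cksum < "$file")"
  if [ -f "$stamp" ] && [ "$(cat "$stamp")" = "$current" ]; then
    return 1   # unchanged: the cached layer would be reused
  fi
  printf '%s\n' "$current" > "$stamp"
  return 0     # changed, or first run: the yarn layer would be rebuilt
}
```

For example, calling deps_changed package.json .package.json.stamp twice in a row succeeds the first time and fails the second time, until package.json is edited again.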
Wait! What about yarn.lock?
Yarn stores the exact version of each package and sub-package in yarn.lock so that it can reproduce exactly the same dependency tree on each run. Both package.json and yarn.lock must be checked into source control. Because we run Yarn inside the container, we need to retrieve yarn.lock from it. Luckily, it’s not hard to extract yarn.lock after each run. Simply replace the ADD package.json line in the Dockerfile with the following:
ADD package.json yarn.lock /tmp/
and build the container using the following command:
docker build . -t yarn-demo; docker run --rm --entrypoint cat yarn-demo:latest /tmp/yarn.lock > yarn.lock
After the build, yarn.lock is copied to your working directory, and it will be reused on the next Docker build, installing the same dependencies each time.
Congratulations! Now you have deterministic Yarn execution.
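One caveat with redirecting straight into yarn.lock is that the shell truncates the file before docker run starts, so a failed run leaves an empty lockfile behind. The following is a minimal sketch of a safer variant that writes to a temporary file first. The function name extract_lockfile is hypothetical; the docker invocation is the same one used above.

```shell
#!/bin/bash
# Hypothetical helper: copy /tmp/yarn.lock out of image "$1" into local path
# "$2", leaving "$2" untouched if the container command fails.
extract_lockfile() {
  local image="$1" dest="$2" tmp
  tmp="$(mktemp)" || return 1
  # Same docker command as in the article, redirected to a temporary file.
  if docker run --rm --entrypoint cat "$image" /tmp/yarn.lock > "$tmp"; then
    # Overwrite the destination only after the extraction succeeded.
    cat "$tmp" > "$dest"
    rm -f "$tmp"
  else
    rm -f "$tmp"
    return 1
  fi
}
```

After docker build . -t yarn-demo, it would be called as extract_lockfile yarn-demo:latest yarn.lock.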
Wait! Now Yarn is executed at each container build
That is correct: we are now running Yarn at each build, even if package.json has not been modified. This is because yarn.lock is copied from the container to your working directory each time, even if it has not changed, which invalidates Docker layer caching. To solve this, we need to copy yarn.lock only if it has really changed. To do so:
- Create a build.sh file:
#!/bin/bash
docker build . -t yarn-demo
docker run --rm --entrypoint cat yarn-demo:latest /tmp/yarn.lock > /tmp/yarn.lock
if ! diff -q yarn.lock /tmp/yarn.lock > /dev/null 2>&1; then
  echo "We have a new yarn.lock"
  cp /tmp/yarn.lock yarn.lock
fi
- Make it executable:
chmod +x build.sh
- Use it to build the container:
./build.sh
- Then run the container:
docker run yarn-demo node -e "console.log('Hello, World')"
Congratulations! You now have a deterministic Yarn execution, and Yarn is executed only when you change package.json.
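The compare-then-copy step in build.sh can also be written as a small reusable function. This is a minimal sketch, not the script from the boilerplate: the name copy_if_changed is hypothetical, and it swaps diff -q for cmp -s, which does the same equality check without needing an output redirect.

```shell
#!/bin/bash
# Hypothetical helper: copy "$1" over "$2" only when their contents differ
# or when "$2" does not exist yet. Unchanged files are left untouched.
copy_if_changed() {
  local src="$1" dst="$2"
  # cmp -s prints nothing; it exits non-zero if the files differ or one is missing.
  if ! cmp -s "$src" "$dst" 2>/dev/null; then
    echo "We have a new $(basename "$dst")"
    cp "$src" "$dst"
  fi
}
```

In build.sh, the if block would then become a single call: copy_if_changed /tmp/yarn.lock yarn.lock.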
What about Yarn package cache?
Another powerful feature of Yarn is its package cache, which is stored on the local filesystem to avoid downloading packages again. Our procedure so far does not preserve the cache across container builds, which could be an issue for big package.json files.
The following build.sh solves the issue by saving the Yarn cache in your working directory.
#!/bin/bash
# Init empty cache file
if [ ! -f .yarn-cache.tgz ]; then
  echo "Init empty .yarn-cache.tgz"
  tar cvzf .yarn-cache.tgz --files-from /dev/null
fi
docker build . -t yarn-demo
docker run --rm --entrypoint cat yarn-demo:latest /tmp/yarn.lock > /tmp/yarn.lock
if ! diff -q yarn.lock /tmp/yarn.lock > /dev/null 2>&1; then
  echo "Saving Yarn cache"
  docker run --rm --entrypoint tar yarn-demo:latest czf - /root/.yarn-cache/ > .yarn-cache.tgz
  echo "Saving yarn.lock"
  cp /tmp/yarn.lock yarn.lock
fi
You also need to add this to your Dockerfile, after the ADD package.json... line:
# Copy cache contents (if any) from local machine
ADD .yarn-cache.tgz /
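The empty-archive step at the top of build.sh matters because ADD fails the build when its source file is missing, whereas an existing archive with no entries is simply unpacked to nothing. The following is a minimal sketch of that step as a function. The name ensure_cache_archive is hypothetical, and -T is the short form of the --files-from option used in build.sh.

```shell
#!/bin/bash
# Hypothetical helper: create an empty gzip-compressed tar archive at "$1"
# unless a file already exists there. An existing archive is never overwritten.
ensure_cache_archive() {
  local archive="$1"
  if [ ! -f "$archive" ]; then
    echo "Init empty $archive"
    # Reading the file list from /dev/null yields an archive with no entries.
    tar czf "$archive" -T /dev/null
  fi
}
```

Running ensure_cache_archive .yarn-cache.tgz before docker build guarantees that the ADD line always finds its source file, even on a fresh checkout.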
The cache file is not meant to be pushed to the repo, so it should be added to a .gitignore file.
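If you prefer to script that step, the entry can be appended without creating duplicates on repeated runs. This is a minimal sketch; the function name ignore_once is hypothetical, and it assumes any existing .gitignore ends with a newline.

```shell
#!/bin/bash
# Hypothetical helper: append pattern "$1" to ignore file "$2" unless an
# identical line is already present. Creates the file if it does not exist.
ignore_once() {
  local pattern="$1" file="$2"
  # -x matches whole lines, -F treats the pattern as a fixed string, -q is silent.
  if ! grep -qxF -- "$pattern" "$file" 2>/dev/null; then
    printf '%s\n' "$pattern" >> "$file"
  fi
}
```

For this project the call would be ignore_once .yarn-cache.tgz .gitignore.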
Congratulations, again! You now have a deterministic Yarn execution that runs only when you change package.json and that uses Yarn caching. Try this with a complex package.json file from a real project; you will be amazed!