Docker!? Shocker!! — Dockerizing for Development

I recently wanted to make changes to a friend’s project. This project essentially consists of an API layer backed by a database and an import script that initializes and populates the database.

The changes that I wanted to make were fairly large: I wanted to rewrite it from JavaScript to TypeScript and consolidate and organize the code a bit. I couldn't be confident in changes that big without a good, consistent way to test the project. I decided that using docker-compose would be the way to go, as this would allow me to essentially create my own replica of the live project and develop against that.

It took a lot of work and help from my friend and outside sources, but I’m surprised and happy to say that I actually managed to achieve my full goal of dockerizing the project for development. It runs in one step, and changes you make to the source are actively propagated during development.

While there are a ton of articles, and of course documentation, explaining how to dockerize node.js apps or perhaps WordPress apps backed by databases, none of them fully explained what I needed to do or kept me out of some of the pitfalls I fell into while working on the changes. I'll add this article to the fold and hopefully help a few people here or there — probably ones who are as impatient as I am.

A Brief Overview

The project is: https://github.com/Bjorn248/graphql_aws_pricing_api

The details of the project aren’t super important to this article. At a high level, this is a node.js Express app that connects to a MariaDB database to serve a GraphQL API. This also runs on AWS Lambda, but for local development the Express server does the job. There is also a python import script that downloads the data we need (it’s all in CSV format) and populates the database.

Note: I think that docker-compose is great for development, but I don’t see a way to use it as a production tool right now. If you want to deploy databases to production you have to keep in mind replication and backups. If you have other services such as the API they are likely to autoscale at different rates. Finally, it’s likely that you’ll use some managed services such as an API gateway that you can’t and don’t need to dockerize for local development.

A lot of the techniques I talk about in this article will not apply to production and it would likely be unsafe to use them in production. This article is purely for setting yourself up for development.

Step 1: Having a Database

Before anything else, I wanted to actually have a database to work with. Installing MySQL or MariaDB locally is not particularly difficult to do, but the nice thing about using Docker is that you can pick a specific version that you know will match your deployment, and the database is sandboxed in a way that if you needed to switch to another project they’d be isolated from one another. This isn’t a huge deal, but I just like how clean it feels.

Creating a one-off MySQL instance using Docker is easy. In fact the official MySQL image on Docker Hub includes instructions on how to do exactly that. Specifically, you can run:

docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=1 --name db --rm -d mysql

You may be happy to run this and see that it works… or rather, this specific command will just spit out the ID of the container it creates, but you do have a database running now. Let's break down this command a little bit since there is a lot going on here:

The docker run part tells Docker to create and run a container from an image. The image in this case comes at the very end: mysql. Docker is somewhat smart about how it handles this: if you don't have a local image tagged as mysql, it will attempt to use the one from Dockerhub, downloading (pulling) it for you if you don't already have it. If you want to specify a tag, which typically pins a specific version of MySQL or whatever you're dockerizing, use a colon. For example, this could be mysql:latest or mysql:8.0.3 or mysql:8 (all of which are the same as of my writing this). If you don't specify a tag, Docker will use latest.
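For instance, pinning the run command from above to a major version might look like this (a minimal sketch using one of the tags mentioned):

docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=1 --name db --rm -d mysql:8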

The MySQL container requires some default settings. You can set them with the -e flag, which is short for --env. Think "environment variable." Above I use a setting that allows you to log into MySQL with no password. For local development on a database whose integrity you don't care much about, that's kind of nice. Otherwise, don't do it! You can also set the root password explicitly instead.
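For example, if you'd rather set a root password than allow empty ones, the same run command might look like this (the password value here is just a placeholder):

docker run -e MYSQL_ROOT_PASSWORD=my-secret-pw --name db --rm -d mysql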

The --name flag allows us to reference our container by name. This will become important for when we want to connect to this container later. It is optional, though, and if you don’t provide a name a random one gets generated. You can also use the container’s ID which is a hash in place of the name. In this case I just chose db as the name since it’s nice and clean. You may want to choose something more specific depending on how many containers you plan to use at a time.

The --rm automatically removes this container once it stops running. This saves us a cleanup step.

Finally, the -d detaches the container when it starts running. If you exclude it you'll get all of the startup output of your MySQL database in your local shell. That's great if you want to see it, but because that process doesn't end on its own (you want to keep the database running), you won't get your local shell back. If you want to see the output of a detached container, you can use docker logs <name>… or docker logs db in this case. You can even follow the logs with -f.
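As a quick sketch of what that looks like with our db container:

docker logs db      # dump the database startup output
docker logs -f db   # follow the output as it happens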

Now we can run docker ps which should output something like:

CONTAINER ID   IMAGE   COMMAND                  CREATED         STATUS         PORTS      NAMES
559fb3f932d2   mysql   "docker-entrypoint.s…"   8 minutes ago   Up 8 minutes   3306/tcp   db
My friend who created the project that was the source for this blog post has pointed out that this looks like a dolphin sniffing a whale’s butt.

If we wanted to stop the container and shut down our database we could use docker stop 559fb3f932d2 or docker stop db. docker kill db would also work, but that is the equivalent of kill -9 and will ungracefully kill a container that's not responding; docker stop attempts a normal shutdown. If you do docker stop, the container will also be removed because of the --rm flag from above. If we omit that flag then we could start the container up again later with docker start db. Depending on your development needs you may want to leave off --rm for containers such as databases and keep it for one-off commands. Otherwise you'll have to manually find and remove the container later with docker rm. You will have to remove a container if you want to reuse its name, and a kept container also consumes disk space.
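To make the lifecycle concrete, here's roughly how those commands play out for our db container (assuming it was started without --rm):

docker stop db     # graceful shutdown
docker start db    # bring the same container back up later
docker rm db       # remove a stopped container, freeing the name and disk space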

Connecting Our Database — Reach Out and Touch

Now that our database is up and running we should connect to it so that we can make sure it's working properly and maybe do some initial setup. Since we're using MySQL specifically, another command is provided for us:

docker run -it --link db:mysql --rm mysql sh -c \
'exec mysql -h"$MYSQL_PORT_3306_TCP_ADDR"'

If you run it you’ll be greeted by the MySQL command prompt. Let’s break this command down too:

The docker run you already know… but can you spot which image is being run? It's actually the mysql in the middle. Note that this is not the same container that is running the database. Instead, Docker creates a different container from the MySQL image. However, this one doesn't start up a database because we pass a command to it, written just as if we were running it on our own command line. If you don't pass a command, Docker uses that image's entry point. You might have noticed the docker-entrypoint.sh COMMAND that our original db container was running above. The entry point is whatever the image maintainers want it to be. Usually it's something friendly… if it's a database image you're running, it should start up an instance of the database for you.

In this case we don't need a database, but we want to use the MySQL command line interface, or CLI. There are a lot of images that have the mysql client installed on them, and of course mysql is one of them. You could change mysql to mariadb above and it would still work because that image has the MySQL CLI installed as well.

Let's go back to the beginning of the command. After the familiar docker run we have -it. This is actually two flags and could be written as -i -t. These are important if you plan to connect to a shell or CLI on a container and interact with it. The -t flag allocates a pseudo-TTY. I'm not totally sure what that means, but it does allow you to type and send commands to the container and have it print stuff back to you. The -i keeps STDIN open on the container so you can continue to send it commands. If you try doing the same docker run command without using both -i and -t you'll notice that it doesn't quite work the way you want. You may only be able to send one command or even none at all. If you use -i by itself you may not even be able to exit the container and you'll have to kill it! Just remember to use -it if you want an interactive shell on the container you're running.
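For example, to poke around inside a throwaway container with an interactive shell, you could try something like this (assuming the image ships a bash shell, which the mysql image does):

docker run -it --rm mysql bash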

Next up is --link. We have the ability to expose our database container’s port to our own machine via -p 3306:3306 for example, but that would only help us if we were running mysql on our own machine. Instead we’re running it on another container, so we need to link the two containers together. According to Docker:

Docker sets an <alias>_NAME environment variable for each target container listed in the --link parameter.

Docker also defines a set of environment variables for each port exposed by the source container.

In summary, --link passes additional environment variables to our container from the requested container that give it connection information. So above, with db:mysql this creates an environment variable MYSQL_PORT_3306_TCP_ADDR that has the hostname of the db container and sets it on our cli container. The :mysql part is an optional alias. We could make this whatever we want — even :foo and then we could use FOO_PORT_3306.... We could also omit the alias in which case it would match the container name: DB_PORT_3306.... We could even use the container ID and omit the alias which would be something like 559fb3f932d2_PORT_3306….
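A quick way to see exactly what --link injects is to print the environment of a linked throwaway container (this just runs the standard env command, nothing MySQL-specific):

docker run --rm --link db:mysql mysql env | grep MYSQL_PORT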

Regardless, with the command we use above:

docker run -it --link db:mysql --rm mysql sh -c \
'exec mysql -h"$MYSQL_PORT_3306_TCP_ADDR"'

The MYSQL_PORT_3306_TCP_ADDR variable will contain the hostname of our db container.

Next up is --rm which we’ve already seen. This will remove the container once it stops. It will stop once we exit the MySQL CLI or just use something external to stop the container including docker stop. This is a very appropriate time to use --rm since we won’t need to use this container again. If we used docker start it would run the entry point command and spin up another database rather than just connect us to the MySQL CLI again.

We can't forget the very important part of the docker run command: the image name. We're using mysql here since it has the mysql command installed on it, but we could use any image that has mysql installed — for instance you could replace mysql with mariadb above. Don't be confused by the fact that the image name is the same as the image behind our db container and the same as the command we want to run to connect to the CLI. This is a separate container.

Finally, we have sh -c. This actually has nothing to do with Docker at all, and in fact it could be any command we want. You can try using echo hello instead of sh -c... if you want and see that it just runs that command. Anyway, what sh -c does on the container is run the sh command, which is a shell (the container's default shell, not necessarily bash). The -c flag takes a string of a command to run. We use that here so that the environment variable set on our container is interpolated by the container's shell rather than by ours. If we didn't wrap the command we wanted to run in quotes, the environment variable would be interpolated from our local shell and would be empty or not what we want. You can try this by doing exec mysql -h$BlahBlah instead, leaving off the sh -c and the string wrapping.

The exec isn't strictly required here; the command works and connects us to the MySQL CLI without it. What it does is replace the shell process with the mysql process instead of forking a child, so once we exit mysql there's no shell to go back to; since the container stops when mysql exits anyway, it doesn't make much practical difference in this case.

Finally the mysql command itself starts the mysql CLI. The -h flag is the hostname. We can’t use localhost because the container we’re using isn’t the container that our database is actually running on, so we use the host we get from the environment variable set by --link. There are many other arguments you could pass to the mysql command as well including the port and a password, but I set it to allow for empty passwords and we’re just using the default port.

Is There a Better Way than `link`?

One thing I really like about writing blog posts like this is the research I'm forced to do. I had no idea how --link worked, and after reading about it, it seems like all it does is set some environment variables on containers. That seems kind of funky and hacky. In fact, I had trouble finding documentation for --link at all; the first result Google gives is the "legacy container links" page.

At first I ignored this since I didn't think it had to do with --link specifically, but I should have paid attention to the bright red warning at the top of that page.

So yeah… using --link is a legacy feature of Docker and is not recommended. So what is recommended? Using user-defined networks! That advice is mainly aimed at production applications… if you're using --link locally for development and testing, what's the harm? Also, the documentation about user-defined networks seemed intimidating and didn't have examples for what I wanted to do.

Fortunately, I did some experimenting and using user-defined networks to connect containers is actually really easy. In fact it’s even easier than using --link and you don’t need to know the name of any environment variables.

First, you have to create the network:

docker network create connect-db

Simple enough. You can look at your networks with docker network ls, which should now list the network you created. You can also see details about the network such as subnet information using docker network inspect connect-db.

Now we need to attach our database container to the network. If you're starting a container with docker run you can attach it to a network right away using --network, but that isn't an option for our database since it's already running. Instead, we can use docker network connect connect-db db to connect the container to the network (specify the network first, then the container name or ID).
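For instance, starting the database fresh with the network attached from the beginning would look something like:

docker run -e MYSQL_ALLOW_EMPTY_PASSWORD=1 --name db --network connect-db --rm -d mysql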

Now we can use this network to connect our other container for the MySQL CLI. Our command actually becomes a lot simpler:

docker run -it --rm --network=connect-db mysql mysql -hdb

The docker run -it --rm part was already discussed and should be familiar. The --network specifies the network that we want to connect to — in this case it's the same network our running database is connected to. The first mysql is the image (we could also use mariadb, etc.). Finally we pass the command, which is just mysql -hdb. We could use sh -c 'exec mysql -hdb' too and it would do the same thing. The -h flag for the MySQL CLI specifies the host. Docker is nice enough to create host entries on the network that correspond to container names! So db in this case resolves to our db container on this network.

An Even Easier Way to Connect

Rather than create a separate container entirely with docker run, you can run a command on an existing container using docker exec. A specific example with our current container would be:

docker exec -it db mysql

The -it here works the same as it does for run, and you should use it whenever you need an interactive shell from your command. The first argument to docker exec is the container name — in our case db. After that is the command we want to run. You could also use mysql -hlocalhost, but localhost is the default. Since we’re running the command on the container itself we can use localhost. If you specified a user and/or password, make sure to specify them a la mysql -u$USER -p.
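For example, with the empty-password setup from earlier, a one-off query doesn't even need -it, while a password-protected setup would prompt interactively:

docker exec db mysql -e "SHOW DATABASES"   # run a single statement and exit
docker exec -it db mysql -uroot -p         # interactive CLI, prompting for a password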

I'm actually not sure why the documentation I found doesn't recommend docker exec in this case, since it's much simpler and makes more sense to me than using docker run with a separate container and linking them. There may be other issues with docker exec, but for development or just running commands against your local test database it's probably just fine. I believe the command is relatively new, whereas docker run --link has been around since the beginning.

So now we have a database up and running, and we’ve verified that we can connect to the database and execute queries and stuff like that. Next up let’s talk about importing some data so we have something to work with.

Step 2: Data Import

We have a running database instance, but it's probably not very useful if it doesn't have any data. The MySQL Docker image has a way to seed data through a volume of initialization files, but this only applies to static data.

As part of the project, my friend wrote a script in python to import the data dynamically. Essentially what this script does is download an index of the AWS pricing import files which gives links for downloading the CSV files that contain the data. These CSV files can then be parsed and imported into MySQL.

Rather than try to rewrite anything, I decided it would be best to use a script that was already working perfectly well to import the data into our MySQL database. It’s important to note that this script takes four environment variables: the MySQL database host, username, password, and port.

Attempt 1: Extending Containers

My initial thought was to use the database container to run the script and import the data. First of all, there is no way to extend Docker images per se; you can't inherit from two base images at once. Among other reasons, this just doesn't make sense for Docker's architecture: two images can be based on two different operating systems, so it would be impossible to consistently reuse commands between them.

The closest approximation I have found for extending containers is to create a Dockerfile from one of the containers and copy the content of the other container for your needs. For example:

FROM mysql
RUN apt-get update && apt-get install -y python # ...and much more!
COPY importer.py .
ENTRYPOINT ./importer.py

This was ill-conceived for at least a couple of reasons. Using the official Python image is much simpler than trying to install what we need for Python on an image designed for MySQL. You can see what the actual python Dockerfile does: https://github.com/docker-library/python/blob/2f73f58fb5ad731616109e0b8ed6367a0d474c52/3.7-rc/stretch/Dockerfile. I did manage to install Python and get the script running, but it took a lot of research to find the right packages to install for everything I needed.

This is all a moot point because the entry point command for the database container starts the database. This means that if I change the entry point as in the above example there’s nothing that will actually start the database. There will be nothing to import into. This doesn’t work at all.

I could add RUN ./importer.py above and copy the same entrypoint as is used by mysql, but RUN executes at build time, when the database hasn't even started yet, so we're no further along in our solution.

Attempt 2: The Right Container for the Right Purpose

Since you can connect containers together that are on the same network, we can start our database container and then run a separate container for our import script. Just to recap, we can run our database using

docker run --env MYSQL_ALLOW_EMPTY_PASSWORD=1 --name db -d mysql

Note that the -d flag will run the container in the background.

Once our database is up and running (you will have to wait a few seconds) we can run our import script. If you actually want to run it, it’s here: https://github.com/Bjorn248/graphql_aws_pricing_api/blob/master/importer/

This script also has a few dependencies that you can install via pip using a requirements.txt file. Instead of trying to do this exclusively on a running container through docker exec, etc., at this point it's a lot easier to create our own Dockerfile, build it as an image, and run the import script in a container based on it.

FROM python
COPY importer.py requirements.txt /scripts/
RUN pip install -r /scripts/requirements.txt

Note that COPY <file1> <file2> <directory> copies files from your local machine into the target directory on the image. You could make this directory whatever you want and Docker will create it for you. We're just using /scripts/ here since it's an import script that we want to run, but you can change it to whatever makes sense to you. Finally, pip install is our pythonic way to install our dependencies, which include a way to communicate with our database.

Now that we have our Dockerfile, we can build our image. We can do this simply with docker build .. The last argument is the build context: the path to the directory containing the Dockerfile, which is usually the directory we're already in, i.e. .. We also want to tag the image so that we can refer back to it more easily later. You can do this with -t. All together: docker build -t importer .. Tagging is optional, but it's much easier to refer to the image by tag.
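Put together, a quick sanity check looks something like:

docker build -t importer .   # build and tag the image from the Dockerfile in this directory
docker images importer       # confirm the tagged image exists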

Once we've built the image, we can run it and actually do the import. The command we want is similar to the docker run from the previous step, where we connected to the MySQL CLI. There are a lot of different ways to do this, such as:

docker run --rm importer /scripts/importer.py

This is unlikely to work since we’re not linking our database container and our importer script container together in any way. We can link them together in various ways such as using --link or user-defined networks.

docker run --link db --rm importer sh -c \
'MYSQL_DB="$DB_PORT_3306_TCP_ADDR" exec /scripts/importer.py'

A lot of this should look familiar, but just to recap:

  • the --link creates some environment variables on the target container (importer in this case) that link to the source container db.
  • --rm will remove the container when it exits. Since we don't need this container after it's done the import — the data will be in our database — we can just have it removed when it's done.
  • importer is the name of the image we want to run. We created this with docker build -t. If we did not use -t we could refer to the image by its ID, which we could find with docker images. Using a tag is a lot more convenient.
  • The sh -c is not specific to Docker. It’s just running a shell command.
  • The MYSQL_DB="$VAR" part is not specific to docker either. This is passing an environment variable to the exec command. We wrap the whole command in quotes, '' so that we can set the environment variables from the source container. If we didn’t have these quotes, the $ variable would be interpolated from our local shell and would probably be empty. Note that MYSQL_DB in this case is an imaginary environment variable required by the import script. This is just used as an example and has nothing to do with Docker.
  • Finally, the exec /scripts/importer.py will run our import script. Remember that we used our Dockerfile to copy importer.py on our local machine to the /scripts directory on our built Docker container.

Rather than use --link we can also use user-defined networks which takes more than one step but I think is simpler:

docker network create connect-db
docker network connect connect-db db
docker run --rm --network connect-db --env MYSQL_DB=db importer \
/scripts/importer.py
docker network rm connect-db

This continues with our example above. Our fabricated import script relies on an environment variable MYSQL_DB. We pass db as the value and this will be used as the hostname. Remember that when we create the network and add the containers to the same network host entries are created with the container names. db is actually a hostname for our database container on the network.

Connecting with docker-compose

Rather than use docker run it would be nice if we could compose these two containers together in some way. There’s an excellent tool for this called docker-compose. This is a very popular tool for development using docker containers to bring together dependent resources in a way that is agnostic to the local machine.

We already have two containers that we need to use. One is our database and the other is a python container that runs an import script for our database. I’ve created a docker-compose.yml file that specifies our needs for these containers:

version: '3'
services:
  db:
    image: mysql
    environment:
      MYSQL_ROOT_PASSWORD: foo
  importer:
    build: ./importer
    depends_on:
      - db
    environment:
      MYSQL_USER: root
      MYSQL_PASS: foo
      MYSQL_HOST: db
    command: ["/scripts/importer.py"]

version: '3' is boilerplate. services: is our list of services. The way that YAML works is that indented properties are sub-properties of objects as key-value pairs. A leading - indicates an array item.

We have two services, db and importer. db uses image, which pulls the MySQL image from the Docker registry. The environment object is key-value pairs for environment variables.

The importer service uses build instead of image. We give it the path to the directory containing the Dockerfile for the image we want to build. depends_on makes docker-compose start the db container before it starts the importer container. Note that this doesn't wait until our database is actually ready; it only waits until the container has started. More on that later.

command runs the specified command once the container starts. We can use this to override the image's own CMD or entry point from its original Dockerfile. Since this is our own image, we probably don't need command here and could define it in the importer Dockerfile instead. I'm using it here as an example.
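If you'd rather bake the command into the image itself, the importer Dockerfile could simply end with a CMD instead; a minimal sketch:

FROM python
COPY importer.py requirements.txt /scripts/
RUN pip install -r /scripts/requirements.txt
CMD ["/scripts/importer.py"]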

Now we can run docker-compose up. This builds the importer image if needed, then creates and starts both containers. docker-compose automatically puts the containers on a shared network, which allows us to use the hostname db in our importer script without any additional flags or properties. This is a nice feature of docker-compose.
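A few commands I find handy for checking on things (all standard docker-compose):

docker-compose up -d             # start everything in the background
docker-compose ps                # list the containers compose created
docker-compose logs -f importer  # follow the importer's output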

We can also access the MySQL CLI connected to our database with docker run as before, using --link or --network with the container and network names that docker-compose generates. Or we can simply use docker-compose exec db mysql.

Waiting for our Database

If you try to run this out of the box, one problem you'll see is that the database isn't actually up when our importer script starts, even though we have the depends_on. As stated before, depends_on only waits for the container to start, which happens before the MySQL server inside it is ready to accept connections.

You can read more about control startup order in the docker-compose documentation: https://docs.docker.com/compose/startup-order/

Docker doesn't provide any specific way to wait until services are ready. This is intentional: there's no consistent way to determine that a service is ready, and applications should be built to handle circumstances where services are not available yet or become unavailable due to some issue.

In our case, we want the import script to run once on database initialization. There is a convenient tool, wait-for-it.sh (along with a lighter variant, wait-for, which is what I use below), that you can find from the link above. It will essentially pause and periodically check access to a particular host:port to determine whether it's ready. Once db:3306 is accessible we can assume that we can connect to our database and run the import script.

importer:
  build: ./importer
  depends_on:
    - db
  environment:
    MYSQL_USER: root
    MYSQL_PASS: foo
    MYSQL_HOST: db
  command: ["/scripts/wait-for", "db:3306", "--", "/scripts/importer.py"]

Now we have our database up and running and our data imported. This is great, but it’s not very useful without an application. Let’s get that up and running next.

Step 3: Running the Application

Finally, we have the actual application serving the API. In this case it’s a node.js app with an express server serving a GraphQL API that talks to our database. In our example we already have the app — we just need to Dockerize it. This should be pretty simple to do… Adding another Dockerfile entry:

# Dockerfile
FROM node
WORKDIR /usr/src/app
COPY package.json yarn.lock ./
RUN yarn install
COPY src ./src
EXPOSE 4000
CMD ["npm", "start"]

We’re using node as our base container. This includes the node.js executable, npm, npx, and yarn.

WORKDIR selects a working directory for the docker container. It's almost like running cd /usr/src/app before every command on the container. As with COPY in our importer, it creates this directory on the container even if it doesn't exist.

COPY and RUN should be familiar from our importer script. COPY copies files from our host machine to the target directory on the docker container. Note that the relative directory provided is relative to WORKDIR, i.e. ./ expands to /usr/src/app in the Dockerfile above.

EXPOSE is actually not needed and is purely advisory. It tells users of your Dockerfile that port 4000 will be used by the app running in the container. It's up to the consumer of the image to publish the port when running the container. docker-compose makes this easy to do, so we'll go over it later.
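For comparison, publishing the port with plain docker run would look something like this (the server image tag here is hypothetical):

docker build -t server ./server
docker run --rm -p 4000:4000 server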

The npm start command starts our server and it listens on port 4000. We copy over the package installation files and run yarn install to install dependencies on the container.

# docker-compose.yml continued...
server:
  build: ./server
  ports:
    - "4000:4000"
  depends_on:
    - db
  environment:
    MYSQL_USER: root
    MYSQL_PASS: foo
    MYSQL_DB: db

The only new option is ports. This publishes ports, mapped host:container. In our example, port 4000 on the container will be accessible from port 4000 on our host machine (probably localhost for devs). Without this we wouldn't be able to reach the API in the container from our browser or anything else outside Docker, which would be inconvenient.

Note: we can leave off the :container part of ports. This makes docker-compose publish the port to an ephemeral port on the host instead. This gives more flexibility, but you'll have to look the port up before you can use it on the host machine. docker-compose port server 4000 will give us the port in our example if we used ports: - "4000".

Now when we run docker-compose up we get a running database with data imported and our app up and running as well. However, out of the box we have a couple of problems:

  1. The app crashes immediately since it’s expecting to be able to connect to our database. The app starts before the database has finished starting or completed importing.
  2. Changing our source code for the server doesn't propagate to the container. If we want those changes reflected, we'd have to rebuild the image and restart the container each time.

Attempt 1: Waiting in Real Time for an Inaccessible Database

Since this is for development, I initially thought that a bit of documentation could get me out of the jam of having to wait for the database.

You’ll have to wait for the Database to start up and initialize. This may take a few minutes. Once that is complete, you can start up the server by running docker-compose run server.

While this does get a server up and running you might notice some issues. Chief among them would be that http://localhost:4000 doesn’t work. A look at docker-compose ps helps explain why that is.

You can see that for the db container at the top, localhost:3306 on our host machine directs to 3306 on the container (note the ->). However this doesn’t show up for our run of the server. Actually there is no way to access the server through a port from our host machine. This makes the API kind of difficult to test properly.

docker-compose run does not publish ports that would be published by docker-compose up. This is apparently by design: https://github.com/docker/compose/issues/1256#issuecomment-90135857

Fortunately, docker-compose up does actually start up the container we need; it just exits since the app crashes. We can simply do docker-compose start server to start up our server again which will publish the ports.

Attempt 2: Handling an Inaccessible Database

Having to wait and manually start up a service is still pretty wonky. It would be nice if this could be handled in a more automatic and perhaps robust way.

There are a lot of different ways to do this, and this content is not specific to Docker. If you refer to the linked Docker documentation above, we want our application to handle situations where the database is inaccessible and fail gracefully.

One way to do this with our app is to use a node.js process manager such as pm2 or forever. I’ll choose pm2. We can use pm2 to keep restarting our app if there is a critical failure such as its inability to connect to the database. Not only will this help us wait for the database to start before attempting to start the app, it will also handle restarting properly if the database goes down temporarily for some reason.

You can update the Dockerfile to install pm2 globally with yarn global add pm2, or add pm2 as a project dependency (update package.json). pm2 is a great tool for managing multiple node processes, but since a Docker container should really only run one application, either approach works fine here.
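The global-install route is a one-line addition to the server Dockerfile, anywhere before the CMD:

# in server/Dockerfile
RUN yarn global add pm2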

Now we can update the command for our server container:

CMD ["npx", "pm2", "start", "src/server.js", "--no-daemon"]

npx is a node.js tool that runs an executable from a node.js package, searching a hierarchy of locations for it. This has nothing to do with Docker specifically. The above command should work as expected whether you choose to include pm2 as a dependency of the app or install it globally on the container.

The pm2 stuff is also not Docker specific. Feel free to look up the details in the pm2 documentation. This simply starts our app from src/server.js. The --no-daemon flag runs pm2 in the foreground so we can get the logs directly from docker-compose logs rather than having to run something like pm2 logs on the container to get them.

Now, the app container will attempt to restart for as long as it can’t connect to the database. Once it can connect to the database it will start on its own. We could also use wait-for like we did for the importer to help out a bit more.

Note: by default, pm2 will only retry so many times. If your database can't be accessed for a while, pm2 may burn through its retries very quickly, stop retrying, and then the app won't start even once the database is available. You can get around this in a couple of ways, such as using --restart-delay to reduce how frequently retries are attempted, or increasing --max-restarts. I had also suggested that the app simply not crash when the database can't be reached and instead provide a route the importer could call to signal that the database was ready, but my friend didn't want to implement this.

Restarting on Code Changes

So far what we have will work great if our application is already working, but if we want to test changes while we are developing we’ll have to update a bit further. Fortunately by using pm2 we’re already halfway there since it can watch for file changes. We just need a way to propagate changes from our host machine to the Docker container.

Docker makes this very easy to do with volumes. We can specify a volume in our docker-compose.yml to mount a local directory from our host machine to the container at the specified mount point.

server:
  volumes:
    - ./server/src:/usr/src/app/src

Remember above we had WORKDIR /usr/src/app, so all work done on the container will be relative to that path. The source code for our app on our local machine is located at ./server/src, relative to the docker-compose.yml file.
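Putting the pieces together, the server service in docker-compose.yml might now look roughly like this (values carried over from the earlier snippets):

server:
  build: ./server
  ports:
    - "4000:4000"
  depends_on:
    - db
  environment:
    MYSQL_USER: root
    MYSQL_PASS: foo
    MYSQL_DB: db
  volumes:
    - ./server/src:/usr/src/app/src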

I also want to switch over from using JavaScript to TypeScript. In order to run the app as TypeScript we can introduce an additional build step, but pm2 also provides a way to specify an interpreter. We can add the ts-node TypeScript interpreter to our app dependencies to allow pm2 to use it. We could also add it globally to the container.
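Adding the interpreter as a dev dependency is as simple as this, run inside the server directory (or you could yarn global add it in the server Dockerfile instead):

yarn add --dev typescript ts-node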

We also need to update the pm2 command to actually watch for file changes. All together the changes to the command become:

CMD ["npx", "pm2", "start", "src/server.ts", "--no-daemon",
"--watch", "--interpreter=node_modules/.bin/ts-node",
"--restart-delay=30000"]

The --restart-delay is optionally added from the previous section to prevent the app from crashing too many times too quickly. --watch is used to watch for file changes and restart when they are detected.

Now we can make any changes to ./server/src and pm2 will restart automatically on our container. We’ve achieved fully active development using a dockerized infrastructure!

Conclusion

See the repository to see the conclusion of everything: https://github.com/Bjorn248/graphql_aws_pricing_api/tree/215260f19ff882a32d703dc0238eb600b0f81ed5

For local development you should simply be able to clone this repository and run docker-compose up (or docker-compose up -d --build for a better dev experience).

Caveats

One issue that I've found is that if you need to add or otherwise change node_modules for the node app, there doesn't seem to be a way to do it that will automatically restart the app to reflect the changes.

I had originally proposed mounting node_modules onto the container from the source machine as well, but this has a couple of potential problems:

  1. The host and container are likely to be different operating systems. This may have an impact as to how packages are installed — especially for OS-dependent behavior.
  2. The package manager may be a different version between host and container which could lead to installation working differently.

You would also be forced into the extra step of running yarn install on your host machine to generate the node_modules directory to mount, and this would be an extra step outside of Docker’s ability to control.

There’s no great solution for this, but it’s simply the nature of how node.js packages are handled. Fortunately, adding node packages should be relatively uncommon. My suggestion for handling it is:

  1. docker-compose stop server — for whatever reason, docker-compose restart seems to break the volume link and pm2 will no longer restart after local changes.
  2. Update source with the new dependency
  3. docker-compose start server and you should be good to go.

For local development tools you may need to yarn install locally as well to get the dependencies, but this shouldn’t have any impact on the containers.

Tidbits and Lessons Learned

  • ADD and COPY are mostly the same, but you probably want COPY. ADD handles special names such as urls and archives differently.
  • EXPOSE seems purely advisory.
  • --link seems magical. I prefer working with user-defined networks which is actually pretty simple. It just requires additional cleanup.
  • docker run creates a new container. Don’t be confused by starting docker run --name db mysql and then trying to connect via docker run mysql. docker exec seems great for this.
  • You usually want docker run -it --rm. You want an interactive terminal and you won’t need the container once you’re done.
  • docker-compose run doesn’t publish ports. Use existing containers created through docker-compose up which you can stop, start, and restart.
  • Don’t mount node_modules. There’s no consistent, fully automated way to actively add a new module. You essentially have to restart the container.