Composing Microservices with Docker — Part 2
In part one we introduced our Microservices, dockerized them using
sbt-docker, created our
entry point, and finally configured them for Cassandra.
That's all well and good, but we can't even run them yet, so let's get cracking!
All the source code and a markdown version of this blog are available on my github repo:
When we want to run a docker container, we can simply use
docker run. We can even run a number of containers and have them communicate with each other by binding ports on the host machine. Something like this:
$ docker run --name cassandra -p 9042:9042 -v markglh-cassandra-node1-data:/var/lib/cassandra cassandra:3.9
$ docker run --name beacon -p 9001:80 beacon-service:1.0.0-SNAPSHOT
$ docker run --name tracking -p 9002:80 tracking-service:1.0.0-SNAPSHOT
$ docker run --name aggregator -p 9000:80 aggregator-service:1.0.0-SNAPSHOT
As you can see, it gets pretty gnarly, and that's with only one Cassandra node, no initialisation and no NGINX. It's also unlikely to be trivial to replicate this setup anywhere other than your laptop.
There are a growing number of ways to run containers: Mesos, Kubernetes, Amazon ECS, the list goes on. Unsurprisingly, given the name of the blog, we're gonna focus on using
docker-compose for this. Compose lets you define your environment, including dependencies, paths, and resources. This can then be spawned anywhere and combined with something like
Docker Swarm to scale things out… but one step at a time!
Within our repository, we've nested each service in its own directory, each of which is a fully contained
sbt project. At the top level we've got our
docker-compose.yml and our
compose-resources. Let's break this down and discuss each of these in detail.
This is where the various compose resources live. We've split this up into two directories: one for NGINX and one for Cassandra.
NGINX is a powerful HTTP server and reverse proxy which we're using to route requests to our services. Without it we'd have to know which port each container has
exposed to be able to hit it from outside the
docker-compose environment. The
nginx.conf isn't really important here; all it's doing is routing requests on port 80 to the appropriate service (via IP) depending on the path.
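To make that concrete, a minimal nginx.conf along these lines would do the job. This is a sketch only: the location paths and upstream IPs are invented for illustration, but the IPs would need to match the static addresses assigned to the services on the compose network (a 172.16.2.0/24 subnet in this setup).

```nginx
# Illustrative sketch only — the real nginx.conf lives in compose-resources.
# Upstream IPs are invented; they must match the services' static IPs.
events {}

http {
    server {
        listen 80;

        # Route by path prefix to the appropriate service container
        location /beacon {
            proxy_pass http://172.16.2.10:80;   # beacon-service
        }
        location /tracking {
            proxy_pass http://172.16.2.11:80;   # tracking-service
        }
        location / {
            proxy_pass http://172.16.2.12:80;   # aggregator-service
        }
    }
}
```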
The Cassandra resources are essential for our Microservices to function.
To re-iterate what we said in part 1, docker doesn't yet let you define what it means for a container to be "started". You can define dependencies between containers to control the startup order; however, sometimes we really want the application within the container to be running first, not just the container itself. This is where the
cassandra-init.sh script comes in.
/init/scripts/wait-for-it.sh -t 0 cassandra-node1:9042 -- echo "CASSANDRA Node1 started"
/init/scripts/wait-for-it.sh -t 0 cassandra-node2:9042 -- echo "CASSANDRA Node2 started"
/init/scripts/wait-for-it.sh -t 0 cassandra-node3:9042 -- echo "CASSANDRA Node3 started"
cqlsh -f /init/scripts/cassandra_keyspace_init.cql cassandra
echo "### CASSANDRA INITIALISED! ###"
This is what holds everything together. First it blocks using the
wait-for-it.sh script we discussed in part 1, waiting for each of the three Cassandra nodes to start and become available. It then runs
cassandra_keyspace_init.cql, which creates our tables and populates them with dummy data.
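The real schema lives in the repo; purely as an illustration (the keyspace, table, and column names below are invented), such a script typically looks like this:

```sql
-- Hypothetical sketch: the actual names live in the repo's
-- cassandra_keyspace_init.cql.
CREATE KEYSPACE IF NOT EXISTS demo
  WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 3};

CREATE TABLE IF NOT EXISTS demo.beacons (
  beacon_id   uuid PRIMARY KEY,
  beacon_name text
);

-- Dummy data so the services have something to serve
INSERT INTO demo.beacons (beacon_id, beacon_name) VALUES (uuid(), 'Room1 Beacon');
```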
When using docker-compose it's good practice to break common definitions out into smaller
yml files which we can reuse, thereby reducing the complexity and duplication within our scripts.
restart: always # Sometimes it starts too fast, cheap way of retrying...
What we’re doing above is defining the
cassandra-base container, which will be re-used for each of the Cassandra nodes later. We’re using the official Cassandra image from Docker Hub which allows us to override configuration values using environment variables.
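A base-containers.yml along these lines would capture that. The environment value below is a placeholder: the official image accepts a family of CASSANDRA_* environment variables for overriding configuration.

```yaml
# Illustrative sketch of base-containers.yml; CASSANDRA_CLUSTER_NAME is just
# one example of the official image's environment-variable overrides.
services:
  cassandra-base:
    image: cassandra:3.9
    environment:
      CASSANDRA_CLUSTER_NAME: "demo-cluster"
    restart: always # cheap retry if it starts too fast
```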
We also define a custom network. This is necessary so we can assign static IP addresses later; these will then match those defined in the
nginx.conf discussed earlier.
This is where the real action is. There's a lot going on, so let's break it down into bite-size chunks. nom.
- We've defined our nginx container.
- Exposed port 80 from within the docker-compose environment, binding it to port 9000 on the host machine.
- Defined a volume for the nginx.conf. This mounts the config into /etc/nginx within the running container, thereby providing our custom configuration to NGINX.
- Bound nginx to our custom dockernet network, which we discussed earlier.
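Pulling those bullet points together, the nginx service definition reads roughly like this. It's a sketch: the host path for the config volume is an assumption based on the repo layout described above.

```yaml
# Sketch of the nginx service in docker-compose.yml; the host path for the
# config volume is an assumption.
services:
  nginx:
    image: nginx
    ports:
      - "9000:80"          # host 9000 -> container 80
    volumes:
      - ./compose-resources/nginx:/etc/nginx   # our custom nginx.conf
    networks:
      - dockernet
```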
With nginx sorted, it’s time to define our Cassandra nodes…
- markglh-cassandra-node1-data:/var/lib/cassandra # This bypasses the union filesystem, in favour of the host = faster.
Above, we define the first Cassandra node, extending
cassandra-base in the
base-containers.yml file. This means we get all the configuration we discussed earlier for free. Pay special attention to the volume here: we're not mounting a volume from a specific host path as we did with NGINX. Instead we're using a
named volume and mounting it into the container. This volume doesn't contain any special data, but it does give us a few niceties:
- Volumes in docker bypass the
Union filesystem, which is the layered filesystem docker uses. This can improve performance when reading and writing lots of Cassandra data.
- Named volumes also allow us to preserve the data, not only between runs (which would happen even without the volume) but also between different compose environments. As an example, you may have another compose file which spawns different services but shares the same data; this wouldn't be possible without volumes.
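As a sketch, the node definition then reads something like this (the networks attachment is an assumption consistent with the rest of the setup):

```yaml
# Sketch of the first Cassandra node: extends the shared base definition and
# mounts the named volume into Cassandra's data directory.
services:
  cassandra-node1:
    extends:
      file: base-containers.yml
      service: cassandra-base
    volumes:
      - markglh-cassandra-node1-data:/var/lib/cassandra
    networks:
      - dockernet
```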
We’ll skip the definitions for the other two Cassandra nodes as the configuration is virtually identical. We do however have one special Cassandra entry worth discussing.
command: bash /init/scripts/cassandra-init.sh
restart: on-failure # Restart until we successfully run this script (it will fail until cassandra starts)
This is how we're kicking off the
cassandra-init.sh script we covered earlier. We're overriding the
init/scripts directory within the container, providing our own init script. We then use a
command to invoke this script when the container starts. This overrides the normal behaviour of the container (starting Cassandra) and allows us to run our scripts with everything we need available, namely
cqlsh, without having to install it manually in a custom image.
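Putting those pieces together, the init container looks roughly like this (a sketch; the host path for the scripts volume is an assumption):

```yaml
# Sketch of the one-shot init service: reuses the official image purely for
# cqlsh, mounts our scripts, and overrides the default command.
services:
  cassandra-init:
    image: cassandra:3.9
    volumes:
      - ./compose-resources/cassandra:/init/scripts
    command: bash /init/scripts/cassandra-init.sh
    restart: on-failure   # keep retrying until Cassandra is up and the script succeeds
    networks:
      - dockernet
```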
In all honesty it's encroaching on "hack" territory; however, it works pretty reliably. In a future blog post I'll walk through how we'd generally handle Cassandra schemas in a production environment; I like the schema definitions to live with the Microservice's code in GitHub.
We’ve finally made it to the service definitions!
- First off, we’re using the image we created in part 1.
- The service uses a few different ports, so we expose them and bind to port 80 on the host machine.
- stdin_open keeps stdin open, preventing any premature shutdowns.
- links lets us specify dependencies on other containers; this makes sure the other container starts as a prerequisite and lets us refer to it by name. We'll discuss this soon.
- We could have defined a
volume here and mounted a configuration file at the
APP_CONF path which we defined in part 1.
Bootstrap.scala looks for this configuration before loading the default one; this allows us to provide a different configuration file per environment.
- Finally, we make sure our service restarts if anything happens, and we bind it to a static IP on our
dockernet network, allowing NGINX to route requests to it.
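A sketch of the aggregator definition, pulling the bullets above together. The ipv4_address below is an invented example; whatever value is used must match the address the nginx.conf routes to.

```yaml
# Sketch of the aggregator service; the static IP is illustrative and must
# match the nginx.conf routing.
services:
  aggregator-service:
    image: aggregator-service:1.0.0-SNAPSHOT
    stdin_open: true
    links:
      - tracking-service
      - beacon-service
    restart: always
    networks:
      dockernet:
        ipv4_address: 172.16.2.12
```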
The other services all follow the same pattern with one exception.
There is a subtle difference here in that we're providing an
alias for cassandra-node1, allowing us to refer to it as
cassandra. This matches the configuration provided in the service's application.conf.
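In compose, that alias is just the second field of the link syntax; a sketch:

```yaml
# Sketch: link cassandra-node1 under the alias "cassandra"
links:
  - cassandra-node1:cassandra
```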
It's worth noting that these services would be good candidates for moving common definitions into the
base-containers.yml; I decided against that here for simplicity.
Before we move on, we'll quickly cover the final few bits in the docker-compose.yml.
# The custom bridge network allows us to specify static IPs for our services - useful for Nginx setup
- subnet: 172.16.2.0/24
# We use names volumes to store cassandra data, this allows our data to persist between different compose files
As we're using a custom network, we need to define it somewhere; this is where we specify the driver and the IP range. We're also defining the named volumes we discussed earlier. Note that you must manually create the volumes outside of this
yml. For example
docker volume create --name markglh-cassandra-node1-data. This will create the volume and make it available for our
docker-compose environment to use.
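A sketch of those top-level declarations; the subnet matches the one shown above, and external: true reflects the fact that the volumes are created manually outside the compose file.

```yaml
# Sketch of the top-level network and volume declarations.
networks:
  dockernet:
    driver: bridge
    ipam:
      config:
        - subnet: 172.16.2.0/24

volumes:
  markglh-cassandra-node1-data:
    external: true   # created manually with `docker volume create`
  markglh-cassandra-node2-data:
    external: true
  markglh-cassandra-node3-data:
    external: true
```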
Linking everything up
So, we've built our Microservices, we've dockerized them, and we've defined the environment using
docker-compose. What are we missing? How do the services communicate with each other?
Let’s take a quick look at the
application.conf of our Aggregation service.
host = "tracking-service"
host = "beacon-service"
We're referring to the other services using the names we've specified in the
docker-compose.yml, and we do the same for
cassandra, using the alias we specified. This is how
docker-compose makes it simple for our services to communicate with one another.
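For illustration, the relevant fragment of the Aggregation service's application.conf might look like this. The block names and ports are assumptions; the host values are the compose service names (and the cassandra alias) rather than real hostnames.

```hocon
# Sketch only: block and port values are assumptions; the hosts resolve via
# docker-compose's service names and the "cassandra" link alias.
tracking-service {
  host = "tracking-service"
  port = 80
}
beacon-service {
  host = "beacon-service"
  port = 80
}
cassandra {
  host = "cassandra"
  port = 9042
}
```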
Setup and run!
This part's easy. First, create the named volumes for Cassandra:
docker volume create --name markglh-cassandra-node1-data
docker volume create --name markglh-cassandra-node2-data
docker volume create --name markglh-cassandra-node3-data
Next, we want to run
sbt docker to build the image for each of our Microservices - I’ve added a script in the root directory to do this for you:
Finally, start the compose environment with docker-compose up:
Boom! We're done. Let's make a few requests with the data defined in cassandra_keyspace_init.cql.
First, can we hit the tracking and beacon services directly?
"beaconName": "Room1 Beacon"
"name": "Chris Goldbrook"
"name": "Andy Stevens"
"name": "Nev Best"
Above, we make two requests directly to the tracking and beacon services, which query Cassandra and respond with a JSON body. Note that we're using port 9000; this is the NGINX port, which forwards the requests to the appropriate service based on the URL.
Now let’s try the aggregator service:
"name": "Chris Goldbrook",
"beaconName": "Room1 Beacon"
"name": "Andy Stevens",
"beaconName": "Room1 Beacon"
"name": "Nev Best",
"beaconName": "Room1 Beacon"
This request is routed to the aggregator service via NGINX. The aggregator then hits the two other services (resulting in two Cassandra calls) before aggregating and returning the response. See the diagram below for an illustration of how this request is fulfilled.
So, we’ve created 3 Microservices, dockerized them, defined our environment, preloaded our data, routed requests using a proxy and linked it all together. I hope you found this a useful introduction to Docker and docker-compose. Feel free to comment below or follow me on twitter @markgl
— Mark Harrison