High Availability and Horizontal Scaling with Docker Swarm


Introduction

In the last year of my undergraduate degree, I started paying closer attention to the technologies and skill sets listed for Software Engineering and Systems Engineering jobs in my area. Docker, paired with some sort of container orchestration tool, showed up a lot. So I read the Docker docs, took the two Docker courses on Linux Academy, built a couple of custom images, made a swarm, and set up my own application stack with Docker Swarm. Let me show you.

Overview

My Technology Stack

Simple diagram of my technology stack

Components

  • Three Google Cloud VMs, each running the Docker Community Edition (CE) daemon. My Docker swarm was initialized with one manager and two workers, all on the same private network.
  • Inside my swarm I created an application stack called bookit that has services for a MongoDB replica set (one primary, two secondaries) and a custom-built Node.js application called Bookit, made with Express.js, Mongoose, Pug, and server-side Google Sign-In. The Bookit application simply lets users register college textbooks. (Note: Bookit was just meant to simulate basic website functionality inside the swarm.)

Demonstration

MongoDB Replica set, Service Scaling and High Availability with Docker Swarm

MongoDB Replica Set — Primary Node Failure. In my bookit stack, I have three MongoDB services, one primary and two secondaries, forming a replica set. I configured one Docker node with a label and gave the primary service a matching constraint so that my MongoDB primary would always run on the same Docker server. Because container storage is ephemeral, I had to figure out a way to give my MongoDB replica set a persistent “storage anchor.” I did this by giving my default primary MongoDB service a bind mount to a volume on the Docker server. When I deploy my stack to the swarm, the primary MongoDB node mounts the persistent storage on the server, and the two secondary MongoDB nodes replicate from the primary, so they don’t need any persistent storage of their own. This is pretty cool because no matter where the secondary MongoDB services get deployed, they’ll just pull a copy from the primary. (Note: In the real world, you’d probably want all database services to have a bind mount to some sort of NFS volume.)
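The deep dive below shows how this looks in my stack file, but as standalone commands the same idea would look something like this (the node ID, network name, replica set name, and host path are illustrative):

docker node update --label-add mongo=primary <manager-node-id>

docker service create --name mongo1 \
  --network bookit-net \
  --constraint 'node.labels.mongo == primary' \
  --mount type=bind,source=/data/mongo,target=/data/db \
  mongo mongod --replSet rs0 --bind_ip_all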

Database connection code snippet from my node.js application

So what happens if my MongoDB primary node fails? Take a look. In my bookit Node.js application, I tell Mongoose to connect to my replica set’s primary node. (Note: mongo1, mongo2, and mongo3 are the MongoDB service names, which resolve to the containers’ IPs thanks to Docker Swarm.)
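The snippet itself was shown as an image; here’s a minimal sketch of that connection, assuming a replica set named rs0 and a database named bookit (both names are illustrative):

const mongoose = require('mongoose');

// mongo1, mongo2, and mongo3 are Swarm service names resolved by Swarm's internal DNS.
// The driver discovers all replica set members and sends writes to the current primary.
mongoose.connect('mongodb://mongo1:27017,mongo2:27017,mongo3:27017/bookit?replicaSet=rs0');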

When I start my application, ah-ha, there’s all my data! All pulling from the persistent storage on my server.

Dashboard of bookit. All book data is pulled from the bind mount.

Now, let’s kill the primary MongoDB node. All of the data should be lost, right? Nah. Our application will render an error for about a second after we kill mongo1, but once the replica set election finishes, we’re back to normal and none of our data is lost. Why? Because the secondary nodes work hard to maintain a mirror image of the primary, and after the election, one of the mirrored secondaries becomes the new primary and handles read and write requests.

Error message from our application during the election
Docker Swarm visualizer before and after killing mongo1

So we killed our first primary MongoDB node, the one with the persistent bind mount. Won’t our data disappear? Recall that a stack has services, which can have replicas, which are containers, which have volumes that are created automatically. So while a secondary’s container is running in our swarm, its data lives in a volume stored locally on that server. The problem is, that volume could be deleted if we removed the container or if the service moved from one Docker node to another. We’re on thin ice… What we need to do is get our old primary MongoDB node back up so our data has a safe, persistent volume again. BUT FIRST, let’s see something. Let’s add a new book to our application and see what happens when the old primary MongoDB node comes back.

Snippet of the Add Book Form
Snippet of the Available Books table

Cool, our new book “I killed mongo1” has been created. Let’s bring back mongo1. If I refresh the page, boom, our data is still there. The old primary MongoDB node first syncs from the CURRENT primary, then gets re-elected as the NEW primary. So cool!

Docker Swarm visualizer before and after restoring mongo1
bookit main page after restoring mongo1

Swarm Scaling — Increase Replicas. Let’s say it’s the first day of class, a ton of people are using bookit, and they’re registering a lot of books. We’re swamped. In a normal situation, we’d just hope for the best, or maybe we planned ahead and made sure the server running bookit had sufficient resources for the busiest day of the year. That works, but what about the other 364 days of the year? The server just sits there near idle, wasting away. Not good! Luckily, we used Docker Swarm and can simply scale our application up for that day, then back down to save resources.

For this test, I used Siege to simulate 200 concurrent connections to our application. Let’s start Siege and then take a quick look at our swarm’s current state. We have one stack, bookit, with four services: three MongoDB services and one Node.js service running our bookit server.

siege -c 200 http://mybookit.site.here
Visualizer image of our swarm

Almost instantly, the single container running our bookit application spikes. In about five minutes, it maxed out at ~98% CPU utilization. So how can we help? Let’s scale our application! I tell Docker to create six more replicas of my Node.js container and deploy them across the swarm, then load-balance requests across all of the Node.js services running in the swarm.
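The exact command is covered in the deep dive below; it’s a one-liner:

docker service update --replicas=7 bookit_node1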

Visualizer image of the swarm after scaling

As soon as the health checks passed and the new Node.js replicas came online, they started handling requests. When I went back to look at my original Node.js container, its CPU utilization had dropped back to safe levels because it was no longer responding to every request alone.

So cool, we scaled our application stack in the swarm to handle more requests! But I still have one more thing to show off…

High Availability — Docker Node Failure. So, let’s say we’re unlucky and one of our interns completely takes down a Docker server (Worker 2). Recall that Worker 2 is a worker node in our swarm and currently has four running containers on it (scroll up to take a look). I left Siege running, so we’re still being hammered with requests.

To simulate the failure, I simply stopped the Docker daemon on that server. As soon as the daemon goes down, the swarm manager quickly adjusts and “moves” all the containers that were running on Worker 2 to itself and Worker 1. (Note: The containers don’t literally move; new containers are spun up in their place on other nodes. Remember, services are swarm-wide, containers are not.) You can see in the image below that Worker 2 is completely down and all of the containers that were running on it have been spread across Manager and Worker 1.

Visualizer image after stopping Worker 2

So we scaled live during the busiest time of the year and our intern took down a Docker worker node. We must have had downtime, right? Well, only 0.02 percent.

After stopping the Docker daemon on Worker 2 and letting all the containers “move” to Manager and Worker 1, I stopped Siege. These were the results: 99.98% availability with almost 200 concurrent connections. So cool.

Conclusion

So that’s it. I set up a Docker swarm and made an application to put in it. I tried to lose all my data by killing my MongoDB container, scaled my Node.js service to handle high traffic loads, and brought down one of my workers, all while running Siege. After all was said and done, I had 99.98% availability.

If you’re interested in Docker and Docker Swarm, check out their docs or head over to Linux Academy to learn more. Going forward, I want to explore another popular orchestration tool, Kubernetes, to see how it compares.

This article was just meant to give an overview of my sandbox project, but if you want to learn more specifics about my Docker set up and how I did it, keep reading. Otherwise, I’ll see you next time!

Special thanks to Matt Blaul and Dan Milmot

(Optional) Technical Deep Dive

Installation and swarm initialization

I’ve created three Google Cloud CentOS 7 VMs ahead of time, all on the same private network. I need to install Docker Community Edition to access the swarm features.

yum install -y yum-utils device-mapper-persistent-data lvm2
yum-config-manager --add-repo https://download.docker.com/linux/centos/docker-ce.repo
yum update
yum install docker-ce

To initialize the swarm, I simply run the swarm init command on the server that will become the manager, specifying the IP address to advertise to the other members of the swarm. The output of the command includes the command used to join workers to the swarm. If we wanted to add more managers, we could run docker swarm join-token manager to print a manager join token.
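On the manager, that looks something like this (the private IP is illustrative):

docker swarm init --advertise-addr 10.128.0.2

Each worker then joins by pasting the docker swarm join command (token included) that the init command prints.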

I can verify that swarm mode is active on the manager:

docker system info | grep -i swarm -A 5
output of the docker system info command

We can view basic information about our nodes with

docker node ls
output from docker node ls

Create a Service

So, what’s a service? A service encapsulates an application function just like a container does, but a service can run tasks across all nodes in the swarm, whereas a container runs on a single Docker daemon. Services have other cool features too, like the ability to scale with more replicas.

docker service create --name webserver --publish 80:80 httpd

Above, we created a single service named webserver from the httpd image, publishing port 80 on the swarm to port 80 in the container. For more information about images, view the image docs.

When you create a service, it starts with one replica, which corresponds to one container somewhere in the swarm. My bookit stack (more on this later) has multiple services, each with one replica by default.

If I want to see the “mapping” of containers to services, I can run docker service ps. (I just made a second replica of node1.) You can see that a single service has two replica tasks, bookit_node1.1 and bookit_node1.2. Each replica task invokes exactly one container somewhere in the swarm. The bookit_node1.1 task is running on docker-manager and the bookit_node1.2 task is running on docker-worker2.
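The output looks roughly like this (IDs abbreviated; the image name is illustrative):

docker service ps bookit_node1

ID           NAME            IMAGE       NODE            DESIRED STATE  CURRENT STATE
pl81jcxa...  bookit_node1.1  bookit:1.0  docker-manager  Running        Running 10 minutes ago
w9qfsme3...  bookit_node1.2  bookit:1.0  docker-worker2  Running        Running 2 minutes ago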

Mesh Routing and Ingress Load Balancing

“Docker Engine swarm mode makes it easy to publish ports for services to make them available to resources outside the swarm. All nodes participate in an ingress routing mesh. The routing mesh enables each node in the swarm to accept connections on published ports for any service running in the swarm, even if there’s no task running on the node. The routing mesh routes all incoming requests to published ports on available nodes to an active container.” — docs.

Basically, any node can accept a request for a published service and make sure it gets to the right place. The Docker docs also show how to put an external load balancer in front of your swarm.

Picture from Docker docs
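You can see the mesh for yourself: curl any node in the swarm on a published port, even a node that isn’t running a task for that service, and you still get a response (the IP is illustrative).

curl http://10.128.0.3:80/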

Docker Swarm also has a nifty built-in round-robin load balancer to spread requests across the appropriate tasks in the swarm.

“Swarm mode has an internal DNS component that automatically assigns each service in the swarm a DNS entry. The swarm manager uses internal load balancing to distribute requests among services within the cluster based upon the DNS name of the service.” — Docs

In my bookit application, I display the ID of the container that responded to my request. Watch what happens when I refresh the page.
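Here’s a minimal sketch of one way to do this, assuming Express and Pug as in my stack: Docker sets each container’s hostname to its short container ID by default, so os.hostname() does the trick.

const express = require('express');
const os = require('os');

const app = express();
app.set('view engine', 'pug');

app.get('/', (req, res) => {
  // Inside a Docker container, os.hostname() returns the container ID,
  // so each refresh shows which replica served the request.
  res.render('index', { containerId: os.hostname() });
});

app.listen(3000);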

Update Services

I can scale a running service using the docker service update command. You can update many parameters of a running service, like its name, mounts, and so on, but we’re just going to update the number of replicas.

docker service update --replicas=7 bookit_node1

This creates six more (seven total) replica tasks of the bookit_node1 service. If we wanted to scale multiple services at once, we could use the docker service scale command.

docker service scale bookit_mongo1=2 bookit_node1=5

Labels and Constraints

By default, when you create a service, the manager schedules its tasks on whatever nodes it chooses, so a task isn’t guaranteed to land on the same node every time. We can use constraints or placement preferences to make sure our tasks go where we want.

docker node update --label-add mongo=primary iqhkuqtehjcv23nl1s7trer25
Our label has now been added to the manager node

docker node update allows us to update the properties of a node, including adding a label to it. Now that we’ve set up a label, we can use a constraint when we create a service to control where its tasks are placed.

docker service create --name webserver --publish 80:80 --constraint 'node.labels.mongo == primary' httpd

Application Stack

Containers are stateless, and best practice is to have one container perform one function: one for a database, maybe a couple for different Node.js microservices, and so on. But how can we make those containers (services) work together? The answer is an application stack.

“A stack is a collection of services that make up an application in a specific environment. A stack file is a file in YAML format, similar to a docker-compose.yml file, that defines one or more services. Stacks are a convenient way to automatically deploy multiple services that are linked to each other, without needing to define each one separately.” — Docs

So to create our stack, we need a stack file. Stack files follow a special YAML syntax, and you can specify EVERY part of a service in them: how many replicas, which images to use, what volumes to bind, which network to use, and so on. Let’s take a look at my bookit YAML file.
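The original file was shown as an image, so here’s a minimal sketch of its shape as described below (image names, ports, the host volume path, the network name, and the replica set name are illustrative):

version: "3"

services:
  mongo1:
    image: mongo
    command: mongod --replSet rs0 --bind_ip_all
    networks:
      - bookit-net
    volumes:
      - /data/mongo:/data/db    # persistent "storage anchor" on the labeled node
    deploy:
      placement:
        constraints:
          - node.labels.mongo == primary

  mongo2:
    image: mongo
    command: mongod --replSet rs0 --bind_ip_all
    networks:
      - bookit-net

  mongo3:
    image: mongo
    command: mongod --replSet rs0 --bind_ip_all
    networks:
      - bookit-net

  node1:
    image: bookit:1.0
    ports:
      - "80:3000"
    networks:
      - bookit-net

networks:
  bookit-net:
    driver: overlay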

We start by specifying the compose file format version we’re using, in this case version 3. Next, we define all of our services, four in this case. Let’s talk about some of the basic properties.

image: Specify the image to start the container from. It can either be a repository/tag or a partial image ID.

networks: Networks to join, referencing entries under the top-level networks key (defined at the end of the file).

volumes: Mount host paths or named volumes.

deploy: Specify configuration related to deploying and running a service. This only takes effect when deploying to a swarm with docker stack deploy. (Notice I use a placement constraint for mongo1.)

Because I specified a custom overlay network and all my services are on it, they can reference each other and are effectively linked. Service names are DNS-resolvable, meaning from inside any service I can talk to another simply by using its service name. SO COOL!

I can even do this inside my node.js code.

Database connection code snippet from my node.js application

Finally, let’s deploy our stack to the swarm. I use the docker stack deploy command, specifying my YAML file (the path is relative to the current directory) and the name of my stack.

docker stack deploy --compose-file compose.yml bookit

Conclusion

Awesome, that’s how I set up my swarm, created an application stack, and deployed it. I can scale it up and down depending on traffic. There’s still a ton more we can do with Docker, and if you want to see more of what I learned, check out my notes.