Creating a Docker Swarm

Katie Sheridan
May 6, 2023


In my last post, I talked about how Docker containers are now a pretty standard way to package applications together with all the components necessary to operate them. One of the ways Docker builds on this and achieves a degree of scalability is through Docker Swarm.

Let’s build some context in case you have never used Swarm before. Imagine for a moment a group of whales, called a Pod (which is ironically a term used for Kubernetes and not Docker), swimming together to get from Ocean A to Ocean B. Suddenly, our pod needs to accomplish more than it planned on, and it needs more help to get where it’s going. So the pod decides to delegate tasks in service of the greater goal of the trip: it assigns roles like lead swimmer, scout on the prowl for a better meal than krill, and defender against attacking fish. (Cue Finding Nemo’s Bruce: “fish are friends, not food.”) To pull that off, the pod has to recruit more whales, and by forming this ‘swarm’ the big goal gets broken down into achievable pieces.

Docker Swarm works in a similar way. Stepping back from the whale metaphor, Swarm assigns roles, managers 👨‍💼 and workers 👷‍♀️, to its nodes to accomplish tasks sent through the Docker API. The managers dispatch jobs to the worker nodes, and you can have many nodes working simultaneously. As demand increases, we can create replicas of a service in order to scale out.

So why Swarm?

Benefits Include:

  • Easy Installation and Environment Set-Up
  • Docker’s existing tooling, like the CLI and Docker Compose, works well with Swarm
  • Scaling on demand by adjusting service replicas (see the example after this list)
  • Comes with an internal load balancer
  • It is a FAST way to deploy and easy to manage
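As a quick taste of that on-demand scaling: once a service is running, scaling out (or back in) is a single command on a manager node. The service name here is purely illustrative:

#scale a hypothetical service named "web" to 5 replicas
docker service scale web=5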

A lot of companies are quick to pull the trigger and go straight to a service like Kubernetes, but for the reasons above, if your project is relatively small, Docker Swarm might be the better fit. Kubernetes is great for larger projects with needs like full CI/CD pipelines, but when that isn’t necessary, Swarm can be a really great alternative. This is especially true if the workload happens to be built from the ground up.

How can we achieve a Docker Swarm?

FOUNDATIONAL GOALS:
Using AWS, create a Docker Swarm that consists of one manager and three worker nodes.

Verify the cluster is working by deploying the following tiered architecture:

  • a service based on the Redis docker image with 4 replicas
  • a service based on the Apache docker image with 10 replicas
  • a service based on the Postgres docker image with 1 replica

ADVANCED
Create a Docker Stack using the Basic project requirements. Ensure no stacks run on the Manager (administrative) node.

Part 1: Create a Docker Swarm

Step 1: Create the Manager and Worker Nodes

First, I opened up EC2 > Security Groups to create custom groups for all the virtual machines (aka EC2 instances) we will need in order to create the nodes. The inbound rules will be different for the managers and the workers.

Here’s a quick explanation of why we’re using certain ports.

  • Port 2377 — TCP for cluster management communication between nodes
  • Port 7946 — TCP/UDP for overlay network node discovery
  • Port 4789 — UDP for overlay network traffic
Create a custom name for the group.
Manager Security Group Specs: open for cluster management, node discovery, and overlay traffic, respectively. (Disclaimer — this will need to be debugged later.)

Rinse/repeat:

Worker Security Group Specs.
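If you prefer the CLI to the console, here is a rough sketch of the same inbound rules using the AWS CLI (the group name, VPC ID, security group ID, and CIDR are placeholders; repeat the same idea for the worker group, minus the manager-only port 2377 rule):

#create the security group (returns a GroupId; substitute your VPC)
aws ec2 create-security-group --group-name swarm-manager-sg --description "Docker Swarm manager" --vpc-id <vpc-id>

#cluster management traffic (manager)
aws ec2 authorize-security-group-ingress --group-id <sg-id> --protocol tcp --port 2377 --cidr 172.31.0.0/16

#overlay network node discovery (TCP and UDP)
aws ec2 authorize-security-group-ingress --group-id <sg-id> --protocol tcp --port 7946 --cidr 172.31.0.0/16
aws ec2 authorize-security-group-ingress --group-id <sg-id> --protocol udp --port 7946 --cidr 172.31.0.0/16

#overlay network traffic
aws ec2 authorize-security-group-ingress --group-id <sg-id> --protocol udp --port 4789 --cidr 172.31.0.0/16

#SSH, so you can still reach the instance (this matters later, trust me)
aws ec2 authorize-security-group-ingress --group-id <sg-id> --protocol tcp --port 22 --cidr 0.0.0.0/0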

I went into EC2 and launched 4 instances with the security group I had just created. In the “Number of instances” field, I changed it to 4 and kept the other defaults.

Edit the names✏️ in order to “tag” them.

To save myself a step, I used the worker security group on all 4 instances in the initial setup. I then went back into the manager EC2 instance’s security settings and switched it over to the manager security group we made first. These port differences are what make the nodes able to communicate with each other.

Another step is to configure user data for each node. This should have been done during the initial setup, so I had to stop the instances, go back in, and add it. This can be done under the EC2 instance page > Actions > Instance settings > Edit user data.

#!/bin/bash

#Update all repos
yum update -y

#Install Docker
yum install -y docker

#Start Docker service
systemctl start docker.service

#Enable Docker service automatically on boot up
systemctl enable docker.service

I booted the instances back up, waited for the status checks, and then SSH’d in. For the sake of ease, I decided to run this via AWS EC2 Instance Connect and not my Windows PowerShell terminal.

I switched to root for ease, then ran the commands to pull any updates, install Docker, enable and start the service, and check the version. I also added the default user to the docker group on each instance. (Alternatively, you could run a chmod command on the Docker socket for a similar effect, as we’ll see later.)

I ran the following commands:

sudo su
yum update -y
yum install docker -y
systemctl enable docker
systemctl start docker
#add the default user to the docker group ($USER expands to root inside `sudo su`, so name the user explicitly)
usermod -aG docker ec2-user
docker version

And I repeated those steps on each instance.

Step 2: Initialize the Swarm and Join the Workers

With the manager instance open in another tab, I switched back over to it and ran:

docker swarm init

This initializes the swarm on that instance.
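For reference, the output looks something like this (the node ID shown is a placeholder; the token and IP will be your own):

Swarm initialized: current node (dxn1zf6l61qsb1josjja83ngz) is now a manager.

To add a worker to this swarm, run the following command:

    docker swarm join --token SWMTKN-1-<token> <manager-ip>:2377

To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.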

There is extra information in that output (the join command) that we will want for connecting the workers later, but for now, I verified that the swarm was active by running:

docker system info 
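In the output, look for the Swarm section. On the manager it should read something like this (abbreviated):

Swarm: active
 NodeID: <node-id>
 Is Manager: true
 Managers: 1
 Nodes: 1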

Now I used the join command it gave a moment ago in each of the worker terminals to join the workers to the Swarm.

#Join worker command "outline"
docker swarm join --token <token> <manager-ip>:2377

#The version I ran in the CLI:
docker swarm join --token SWMTKN-1-41ljbgyjvsur3x8i3dsrjtyr8oypyth6asmvn023fyf22wbznc-4s2cqs87v4y3f6xu979d933qf 172.31.92.49:2377
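If you have already lost that output by the time you need it, you can reprint the worker join command at any time from the manager:

docker swarm join-token worker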

The first one scared me for a minute because it returned a “timeout” error. So, instinctively, I tried it again. Then I tried the other workers. Still no luck.

When in doubt…try permissions.

# disconnect the node from swarm
docker swarm leave

#change daemon permissions to read/write
sudo chmod 666 /var/run/docker.sock

#connect worker to manager
docker swarm join --token SWMTKN-1-41ljbgyjvsur3x8i3dsrjtyr8oypyth6asmvn023fyf22wbznc-4s2cqs87v4y3f6xu979d933qf 172.31.92.49:2377
You want it to read “This node joined a swarm as a worker”

I also made sure the security group was set correctly, and found the manager was still using my worker security group. Apparently I hadn’t “applied” the settings when I went to switch it over earlier. I fixed this and moved on.

To make sure they’re connected, I attempted to run docker node ls on the Swarm manager to see if they actually linked up. 🔗

docker node ls

However, when I clicked back over to the manager, I could no longer connect to the instance.

RIP. 💀

I checked my “manager” security group. It was missing an SSH inbound rule on port 22, as well as an ESP (protocol 50) rule for the VPN’s encrypted traffic.

That got me back to connectivity, and then I ran docker node ls and got the following:

SUCCESS! We have three worker nodes and one manager, denoted by “Leader” under MANAGER STATUS.
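The output has roughly this shape (the IDs, worker hostnames, and engine versions here are illustrative):

ID                  HOSTNAME          STATUS   AVAILABILITY   MANAGER STATUS   ENGINE VERSION
abc123… *           ip-172-31-92-49   Ready    Active         Leader           20.10.17
def456…             ip-172-31-80-11   Ready    Active                          20.10.17
ghi789…             ip-172-31-85-22   Ready    Active                          20.10.17
jkl012…             ip-172-31-90-33   Ready    Active                          20.10.17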

Step 3: Scale the Swarm Out with Replicas

The new goal is to create a YAML file listing all of the specs we need, which will deploy as a stack on top of the Docker Swarm in a tiered structure.

Make a new directory on the manager instance with mkdir <directory-name>, then create and open the file with touch <filename> and vim <filename>.
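For example (the directory name is just a placeholder; the filename matches the deploy command used later):

mkdir swarm-stack
cd swarm-stack
touch WK17-DocSwarm.yml
vim WK17-DocSwarm.yml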

I then created the file using the code below. Be sure to save it with a .yml extension. It basically treats each service as its own block, and we just specify the replicas we need for each. I also set a custom password for accessing Postgres.

version: "3.9"

services:
  redis:
    image: redis:latest
    deploy:
      replicas: 4
      placement:
        constraints:
          - node.role == worker

  apache:
    image: httpd:latest
    deploy:
      replicas: 10
      placement:
        constraints:
          - node.role == worker

  postgres:
    image: postgres:latest
    deploy:
      replicas: 1
      placement:
        constraints:
          - node.role == worker
    environment:
      POSTGRES_PASSWORD: "12345"

Alright, time to deploy. Go to the manager instance and run:

#deploy outline
docker stack deploy --compose-file <yaml-filename> <stack-name>

#my deployment command:
docker stack deploy --compose-file WK17-DocSwarm.yml Week17Stack

Let’s take a quick look at what we’ve built…

docker service ls

The replicas column shows 10/10 running for apache, 4/4 for redis, and 1/1 for postgres.

docker stack ps <stack-name>
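For reference, docker service ls output has this shape (the IDs are illustrative; note how Swarm prefixes each service with the stack name):

ID             NAME                   MODE         REPLICAS   IMAGE
abc123         Week17Stack_apache     replicated   10/10      httpd:latest
def456         Week17Stack_postgres   replicated   1/1        postgres:latest
ghi789         Week17Stack_redis      replicated   4/4        redis:latest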

And that’s it! The stack is deployed!

Part 2: Verify Stack/Service is NOT on the Manager node.

This step is relatively easy. We can run a few commands to verify that nothing is running on the manager. Cuz hey, ideally managers don’t work. They just delegate.

docker ps -a
docker images

Run these on the manager and you can verify the absence of containers and images.

There is nothing listed under our manager, so we’re good to go!
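Another way to double-check from the manager is to print every task next to the node it landed on; none of the lines should show the manager’s hostname:

#list each task and the node it is running on
docker stack ps Week17Stack --format "{{.Name}} {{.Node}}"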

To clean up your workspace, run the following. The first command tears down the stack, and the second removes all the images that were pulled; you will need to run the second one on each node.

docker stack rm <stack_name>
docker image rm $(docker image ls -aq)
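If you also want to dissolve the swarm itself, each worker can leave the swarm, and the manager can follow with the --force flag since it is the last node standing:

#on each worker
docker swarm leave

#on the manager
docker swarm leave --force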

Thank you for making it all the way to the end, and also working through the troubleshooting with me. If you have any tips or tricks, please let me know and if you found this demo helpful, please give it a clap and follow. Feel free to connect with me on LinkedIn as well!
