Docker’s Voting App on Swarm, Kubernetes and Nomad

TL;DR

When you work in tech, you have to be curious; it's essential to keep learning and stay up to date, because things are moving damn fast in this field.

Container orchestration is such a hot topic that even if you have your favorite tool (my heart goes to Docker Swarm), it's always interesting to see how the other ones work and to learn from them as well.

In this article, we will use Docker’s Voting App and deploy it on Swarm, Kubernetes and HashiCorp’s Nomad. I hope you’ll have as much fun reading this as I had experimenting with those tools.

The Voting App

I’ve used (and abused) the Voting App in previous articles. This application follows a micro-services architecture. It is made of 5 services as illustrated below.

Docker’s voting app architecture (https://github.com/docker/example-voting-app)
  • vote: front-end that enables a user to choose between a cat and a dog
  • redis: database where votes are stored
  • worker: service that gets votes from redis and stores the results in a Postgres database
  • db: the Postgres database in which the voting results are stored
  • result: front-end displaying the results of the vote

The Voting App has several Compose files, as we can see in the GitHub repository.

docker-stack.yml is the production-ready representation of the application. The content of this file is the following.

version: "3"
services:

redis:
image: redis:alpine
ports:
- "6379"
networks:
- frontend
deploy:
replicas: 1
update_config:
parallelism: 2
delay: 10s
restart_policy:
condition: on-failure
db:
image: postgres:9.4
volumes:
- db-data:/var/lib/postgresql/data
networks:
- backend
deploy:
placement:
constraints: [node.role == manager]
vote:
image: dockersamples/examplevotingapp_vote:before
ports:
- 5000:80
networks:
- frontend
depends_on:
- redis
deploy:
replicas: 2
update_config:
parallelism: 2
restart_policy:
condition: on-failure
result:
image: dockersamples/examplevotingapp_result:before
ports:
- 5001:80
networks:
- backend
depends_on:
- db
deploy:
replicas: 1
update_config:
parallelism: 2
delay: 10s
restart_policy:
condition: on-failure

worker:
image: dockersamples/examplevotingapp_worker
networks:
- frontend
- backend
deploy:
mode: replicated
replicas: 1
labels: [APP=VOTING]
restart_policy:
condition: on-failure
delay: 10s
max_attempts: 3
window: 120s
placement:
constraints: [node.role == manager]

visualizer:
image: dockersamples/visualizer:stable
ports:
- "8080:8080"
stop_grace_period: 1m30s
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
deploy:
placement:
constraints: [node.role == manager]

networks:
frontend:
backend:

volumes:
db-data:

Basically, 6 services are defined in this file, while only 5 appear in the Voting App architecture. The additional one is the visualizer, a great tool that provides a clean web interface showing where the services’ tasks have been deployed.

Docker Swarm

Docker Swarm is a clustering and scheduling tool for Docker containers. With Swarm, IT administrators and developers can establish and manage a cluster of Docker nodes as a single virtual system.

Swarm’s concepts

A Swarm cluster is composed of several nodes, some of them acting as managers, the others as workers:

  • manager nodes are the ones in charge of the cluster’s internal state
  • worker nodes are the ones executing the tasks (= running the containers)
Architecture of a Swarm cluster

As we can see, the managers share an internal distributed store in order to maintain a consistent state of the cluster. This is ensured through the logs of the Raft distributed consensus algorithm.

Note: if you want to know more about Raft logs usage in a Swarm, you might find the following article interesting (well… I hope so).

On a Swarm, a service defines how a part of the application must be run and deployed in containers.
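
As a quick illustration (nginx is just an arbitrary image here, not part of the Voting App), creating a service and checking its tasks looks like this:

# Run 2 replicas of an nginx service and publish port 80 of the containers
# on port 8000 of every node of the Swarm
$ docker service create --name web --replicas 2 --publish 8000:80 nginx

# List the services and the tasks (containers) scheduled for this one
$ docker service ls
$ docker service ps web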

Installation of the Docker platform

In case you do not have Docker installed on your machine, you can download the Community Edition for your OS and install it from the following location.

Creation of a Swarm

Once Docker is installed you are only a single command away from a working Swarm.

$ docker swarm init

Yes! That is all it takes to get a Swarm cluster: a one-node cluster, but still a Swarm cluster with all its associated processes.
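
Should you want to grow the cluster later on, the join command to run on additional Docker hosts can be retrieved as shown below (not needed for the rest of this article).

# Print the command (including the token) another Docker host must run to join as a worker
$ docker swarm join-token worker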

Deployment of the application

Among the Compose files available in the Voting App’s GitHub repository, docker-stack.yml is the one which needs to be used to deploy the application on a Swarm.

$ docker stack deploy -c docker-stack.yml app
Creating network app_backend
Creating network app_default
Creating network app_frontend
Creating service app_visualizer
Creating service app_redis
Creating service app_db
Creating service app_vote
Creating service app_result
Creating service app_worker

As I run the stack on Docker for Mac, I can access the application directly from localhost. It’s possible to select CATS or DOGS from the vote interface (port 5000) and to see the result on port 5001.
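
To double check that all the replicas are up and to see where the tasks have been scheduled, the stack can be inspected with a couple of commands.

# List the services of the stack and the number of running replicas
$ docker stack services app

# Show on which nodes the tasks have been scheduled
$ docker stack ps app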

I will not go into the details here; I just wanted to show how easily this application can be deployed on a Swarm.

In case you want a more in-depth guide on how to deploy this same application on a multi-node Swarm, you can check the following article.

Kubernetes

Kubernetes is an open-source system for automating deployment, scaling, and management of containerized applications.

Kubernetes’ concepts

A Kubernetes cluster is composed of one or several Masters and Nodes.

  • The Masters handle the cluster’s control plane (managing the cluster’s state, scheduling tasks, reacting to cluster events, …)
  • The Nodes (previously called Minions, yes, like in Despicable Me) provide the runtime to execute the application containers (through Pods)
Architecture of a Kubernetes cluster

In order to run commands against a Kubernetes cluster, the kubectl command line tool is used. We will see several examples of its usage below.

There are several high-level Kubernetes objects we need to know in order to understand how to deploy an application:

  • A Pod is the smallest unit that can be deployed on a Node. It’s a group of containers which must run together. Quite often, a Pod only contains one container though.
  • A ReplicaSet ensures that a specified number of pod replicas are running at any given time
  • A Deployment manages ReplicaSets and handles rolling updates, blue/green deployments, canary testing, …
  • A Service defines a logical set of Pods and a policy by which to access them

In this chapter, we will use a Deployment and a Service object for each service of the Voting App.

Installing kubectl

kubectl is the command line tool used to deploy and manage applications on Kubernetes.

It can easily be installed following the official documentation. For instance, to install it on macOS, the following commands need to be run.

$ curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/darwin/amd64/kubectl
$ chmod +x ./kubectl
$ sudo mv ./kubectl /usr/local/bin/kubectl

Installing Minikube

Minikube is an all-in-one Kubernetes setup. It creates a local VM, for instance on VirtualBox, and runs a single-node cluster containing all the Kubernetes processes. It’s obviously not a tool meant to set up a production cluster, but it’s really convenient for development and testing purposes.

Creation of a one node cluster

Once Minikube is installed, we just need to issue the start command to set up our one-node Kubernetes cluster.

$ minikube start
Starting local Kubernetes v1.7.0 cluster…
Starting VM…
Downloading Minikube ISO
97.80 MB / 97.80 MB [==============================================] 100.00% 0s
Getting VM IP address…
Moving files into cluster…
Setting up certs…
Starting cluster components…
Connecting to cluster…
Setting up kubeconfig…
Kubectl is now configured to use the cluster.
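
A quick sanity check (the output will obviously depend on your versions) confirms that kubectl can talk to the Minikube node.

# The single Minikube node should be listed with the Ready status
$ kubectl get nodes

# Show the address of the cluster's API server
$ kubectl cluster-info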

Kubernetes descriptors

On Kubernetes, containers are not run directly but through a ReplicaSet managed by a Deployment.

Below is an example of a .yml file describing a Deployment. Its ReplicaSet will ensure that 2 replicas of a Pod running Nginx are up.

# nginx-deployment.yml
apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: nginx-deployment
spec:
  replicas: 2 # tells deployment to run 2 pods matching the template
  template: # create pods using pod definition in this template
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx:1.7.9
        ports:
        - containerPort: 80

As we will see below, in order to create a Deployment we need to use the kubectl command line tool.
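
As a small detour, and assuming the file above is saved as nginx-deployment.yml, this Deployment would be created and inspected as follows.

# Create the Deployment described above
$ kubectl create -f nginx-deployment.yml

# The Deployment creates a ReplicaSet, which in turn starts 2 nginx Pods
$ kubectl get deployments
$ kubectl get pods -l app=nginx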

To define a whole micro-services application in Kubernetes we need to create a Deployment file for each service. We can do this manually or we can use Kompose to help us in this task as we will see now.

Using Kompose to create deployments and services

Kompose is a great tool which converts Docker Compose files into the descriptor files (Deployments and Services) used by Kubernetes. It is very convenient and really accelerates the migration process.

Notes:

  • Kompose does not have to be used, as descriptor files can be written manually, but it surely speeds up the deployment when it is
  • Kompose does not take into account all the options used in a Docker Compose file

The following commands install Kompose version 1.0.0 on Linux or MacOS.

# Linux
$ curl -L https://github.com/kubernetes/kompose/releases/download/v1.0.0/kompose-linux-amd64 -o kompose
# macOS
$ curl -L https://github.com/kubernetes/kompose/releases/download/v1.0.0/kompose-darwin-amd64 -o kompose

$ chmod +x kompose
$ sudo mv ./kompose /usr/local/bin/kompose
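
We can quickly make sure the binary is correctly installed; it should report version 1.0.0 here.

# Print the installed Kompose version
$ kompose version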

Before running Kompose on the original docker-stack.yml file, we will modify it and remove the deploy key of each service. This key is not taken into account and can raise errors when generating the descriptor files. We can also remove the information regarding the networks. We will then feed Kompose with the following file, renamed docker-stack-k8s.yml.

version: "3"
services:
redis:
image: redis:alpine
ports:
- "6379"
db:
image: postgres:9.4
volumes:
- db-data:/var/lib/postgresql/data
vote:
image: dockersamples/examplevotingapp_vote:before
ports:
- 5000:80
depends_on:
- redis
result:
image: dockersamples/examplevotingapp_result:before
ports:
- 5001:80
depends_on:
- db
worker:
image: dockersamples/examplevotingapp_worker
visualizer:
image: dockersamples/visualizer:stable
ports:
- "8080:8080"
stop_grace_period: 1m30s
volumes:
- "/var/run/docker.sock:/var/run/docker.sock"
volumes:
db-data:

From the docker-stack-k8s.yml file, we can generate the descriptors of the Voting App using the following command.

$ kompose convert --file docker-stack-k8s.yml
WARN Volume mount on the host "/var/run/docker.sock" isn't supported - ignoring path on the host
INFO Kubernetes file "db-service.yaml" created
INFO Kubernetes file "redis-service.yaml" created
INFO Kubernetes file "result-service.yaml" created
INFO Kubernetes file "visualizer-service.yaml" created
INFO Kubernetes file "vote-service.yaml" created
INFO Kubernetes file "worker-service.yaml" created
INFO Kubernetes file "db-deployment.yaml" created
INFO Kubernetes file "db-data-persistentvolumeclaim.yaml" created
INFO Kubernetes file "redis-deployment.yaml" created
INFO Kubernetes file "result-deployment.yaml" created
INFO Kubernetes file "visualizer-deployment.yaml" created
INFO Kubernetes file "visualizer-claim0-persistentvolumeclaim.yaml" created
INFO Kubernetes file "vote-deployment.yaml" created
INFO Kubernetes file "worker-deployment.yaml" created

We can see that for each service, a deployment and a service file are created.

We only got one warning, linked to the visualizer service, as the Docker socket cannot be mounted. We will not try to run this service and will focus on the other ones.

Deployment of the application

Using kubectl, we will create all the components defined in the descriptor files. We indicate that the files are located in the current folder.

$ kubectl create -f .
persistentvolumeclaim "db-data" created
deployment "db" created
service "db" created
deployment "redis" created
service "redis" created
deployment "result" created
service "result" created
persistentvolumeclaim "visualizer-claim0" created
deployment "visualizer" created
service "visualizer" created
deployment "vote" created
service "vote" created
deployment "worker" created
service "worker" created
unable to decode "docker-stack-k8s.yml":...

Note: as we left the modified Compose file in the current folder, we get an error because this one cannot be parsed. This error can be ignored without any risk.

The commands below show the services and deployments created.

$ kubectl get services
NAME         CLUSTER-IP   EXTERNAL-IP   PORT(S)     AGE
db           None         <none>        55555/TCP   3m
kubernetes   10.0.0.1     <none>        443/TCP     4m
redis        10.0.0.64    <none>        6379/TCP    3m
result       10.0.0.121   <none>        5001/TCP    3m
visualizer   10.0.0.110   <none>        8080/TCP    3m
vote         10.0.0.142   <none>        5000/TCP    3m
worker       None         <none>        55555/TCP   3m

$ kubectl get deployment
NAME         DESIRED   CURRENT   UP-TO-DATE   AVAILABLE   AGE
db           1         1         1            1           3m
redis        1         1         1            1           3m
result       1         1         1            1           3m
visualizer   1         1         1            1           3m
vote         1         1         1            1           3m
worker       1         1         1            1           3m

Expose the application to the outside

In order to access the vote and result interfaces, we need to slightly modify the services created for them.

The file below is the descriptor generated for vote.

apiVersion: v1
kind: Service
metadata:
  creationTimestamp: null
  labels:
    io.kompose.service: vote
  name: vote
spec:
  ports:
  - name: "5000"
    port: 5000
    targetPort: 80
  selector:
    io.kompose.service: vote
status:
  loadBalancer: {}

We will modify the type of the service and replace the default type, ClusterIP, with NodePort. While ClusterIP makes a service accessible internally, NodePort publishes a port on each node of the cluster and makes the service available to the outside world. We will do the same for result, as we want both vote and result to be accessible from the outside.

apiVersion: v1
kind: Service
metadata:
  labels:
    io.kompose.service: vote
  name: vote
spec:
  type: NodePort
  ports:
  - name: "5000"
    port: 5000
    targetPort: 80
  selector:
    io.kompose.service: vote

Once the modification is done for both services (vote and result), we can recreate them.

$ kubectl delete svc vote
$ kubectl delete svc result
$ kubectl create -f vote-service.yaml
service "vote" created
$ kubectl create -f result-service.yaml
service "result" created

Access the application

Let’s now get the details of the vote and result services and retrieve the port each one exposes.

$ kubectl get svc vote result
NAME     CLUSTER-IP   EXTERNAL-IP   PORT(S)          AGE
vote     10.0.0.215   <nodes>       5000:30069/TCP   15m
result   10.0.0.49    <nodes>       5001:31873/TCP   8m

vote is available on port 30069 and result on port 31873. We can now vote and see the result.
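
Since the cluster runs inside the Minikube VM, those NodePorts are exposed on the VM’s IP address rather than on localhost. A quick way to get the URLs (relying on the service names used above):

# IP address of the Minikube VM
$ minikube ip

# Let Minikube build the URLs of the NodePort services directly
$ minikube service vote --url
$ minikube service result --url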

Once we had a basic understanding of Kubernetes’ components, we managed to deploy the Voting App very easily. Kompose really helped us in the process.

Hashicorp’s Nomad

Nomad is a tool for managing a cluster of machines and running applications on them. Nomad abstracts away machines and the location of applications, and instead enables users to declare what they want to run and Nomad handles where they should run and how to run them.

Nomad’s concepts

A Nomad cluster is composed of agents which can run in Server or Client mode.

  • Servers take on the responsibility of being part of the consensus protocol which allows the servers to perform leader election and state replication
  • Client nodes are very lightweight as they interface with the server nodes and maintain very little state of their own. Client nodes are where tasks are run.
Architecture of a Nomad cluster

Several types of tasks can run on a Nomad cluster. Docker workloads run using the docker driver, which is the one we will use to run the Voting App.

There are several concepts (stanzas in Nomad vocabulary) we first need to understand in order to deploy an application on Nomad:

  • A job is a declarative specification of tasks that Nomad should run. It is defined in a job file (a text file in HCL, the HashiCorp Configuration Language). A job can have one or many groups of tasks. Jobs are submitted by users and represent a desired state.
  • A group contains a set of tasks that are co-located on a machine
  • A task is a running process, a Docker container in our example
  • The mapping of tasks in a job to clients is done using Allocations. An allocation is used to declare that a set of tasks in a job should be run on a particular node

There are many more stanzas described in Nomad’s documentation.

The setup

In this example, we will run the application on a Docker host created with Docker Machine. Its local IP is 192.168.1.100. We will start by running Consul, used for service registration and discovery. We’ll then start Nomad and deploy the services of the Voting App as Nomad jobs.

Getting Consul for service registration and discovery

In order to ensure service registration and discovery, it is recommended to use a tool such as Consul, which will run outside of Nomad (not as a Nomad job).

Consul can be downloaded from the following location.

The following command launches a Consul server locally.

$ consul agent -dev -client=0.0.0.0 -dns-port=53 -recursor=8.8.8.8

Let’s get some more details on the options used:

  • -dev is a convenient flag which sets up a Consul cluster with a server and a client. This option must not be used except for dev and testing purposes
  • -client=0.0.0.0 makes Consul’s services (API and DNS server) reachable on any interface of the host. This is needed as Nomad will connect to Consul on the localhost interface while containers will connect through the Docker bridge (often something like 172.17.x.x).
  • -dns-port=53 specifies the port used by Consul’s DNS server (it defaults to 8600). We set it to the standard port 53 so Consul’s DNS can be used from within the containers
  • -recursor=8.8.8.8 specifies another DNS server which will serve requests that cannot be handled by Consul
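
As the dev agent runs in the foreground, we can run a couple of checks from another terminal to confirm Consul and its DNS server are up (the output will obviously differ on your setup).

# The dev agent should be listed as alive
$ consul members

# Consul registers itself in its own DNS, so this should return the agent's address
$ dig @localhost consul.service.consul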

Getting Nomad

Nomad is a single binary, written in Go, which can be downloaded from the following location.

Creation of a one node cluster

Once Nomad is downloaded, we can run an agent with the following configuration.

// nomad.hcl
bind_addr = "0.0.0.0"
data_dir  = "/var/lib/nomad"

server {
  enabled          = true
  bootstrap_expect = 1
}

client {
  enabled       = true
  network_speed = 100
}

The agent will run both as a server and a client. We specify bind_addr to listen on any interface so that tasks can be accessed from the outside. Let’s start a Nomad agent with this configuration:

$ nomad agent -config=nomad.hcl
==> WARNING: Bootstrap mode enabled! Potentially unsafe operation.
Loaded configuration from nomad.hcl
==> Starting Nomad agent...
==> Nomad agent configuration:
Client: true
Log Level: INFO
Region: global (DC: dc1)
Server: true
Version: 0.6.0
==> Nomad agent started! Log data will stream in below:

Note: by default Nomad connects to the local Consul instance.

We have just set up a one-node cluster. The information on its single member is listed below.

$ nomad server-members
Name                  Address        Port  Status  Leader  Protocol  Build  Datacenter  Region
neptune.local.global  192.168.1.100  4648  alive   true    2         0.6.0  dc1         global
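
The same agent also acts as a client, so it should show up as a ready node (node-status is the syntax of the Nomad 0.6 CLI used in this article).

# List the client nodes of the cluster; our single node should be ready
$ nomad node-status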

Deployment of the application

From the previous examples, we saw that in order to deploy the Voting App on a Swarm the Compose file can be used directly. When deploying the application on Kubernetes, descriptor files can be created from this same Compose file. Let’s see now how our Voting App can be deployed on Nomad.

First, there is no tool like Kompose in the HashiCorp world that can smooth the migration from a Docker Compose application to Nomad (that might be an idea for an open source project…). The files describing jobs, groups, tasks (and other Nomad stanzas) therefore need to be written manually.

We will go into the details when defining jobs for the redis and the vote services of our application. The process will be quite similar for the other services.

Definition of the redis job

The following file defines the redis part of the application.

// redis.nomad
job "nomad-redis" {
  datacenters = ["dc1"]
  type        = "service"

  group "cache" {
    task "redis" {
      driver = "docker"

      config {
        image = "redis:3.2"
        port_map {
          db = 6379
        }
      }

      resources {
        cpu    = 500 # 500 MHz
        memory = 256 # 256MB
        network {
          mbits = 10
          port "db" {}
        }
      }

      service {
        name         = "redis"
        address_mode = "driver"
        port         = "db"
        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}

Let’s explain a little bit what is defined here:

  • the name of the job is nomad-redis
  • the job is of type service (which means a long running task)
  • a group is defined, with an arbitrary name; it contains a single task
  • the task named redis uses the docker driver, meaning it will run in a container
  • the redis task is configured to use the redis:3.2 Docker image and exposes port 6379, labeled db, within the cluster
  • the resources block defines some CPU and memory constraints
  • in the network block, we specify that the db port should be dynamically allocated
  • the service block defines how the registration will be done in Consul: the service name, the IP address that should be registered (the container’s IP, because of address_mode = "driver"), and the definition of the health check

To check if this job can be run correctly, we first use the plan command.

$ nomad plan redis.nomad
+ Job: "nomad-redis"
+ Task Group: "cache" (1 create)
+ Task: "redis" (forces create)
Scheduler dry-run:
- All tasks successfully allocated.
Job Modify Index: 0
To submit the job with version verification run:
nomad run -check-index 0 redis.nomad
When running the job with the check-index flag, the job will only be run if the server side version matches the job modify index returned. If the index has changed, another user has modified the job and the plan's results are potentially invalid.

Everything seems fine; let’s now run the job to deploy the task.

$ nomad run redis.nomad
==> Monitoring evaluation "1e729627"
Evaluation triggered by job "nomad-redis"
Allocation "bf3fc4b2" created: node "b0d927cd", group "cache"
Evaluation status changed: "pending" -> "complete"
==> Evaluation "1e729627" finished with status "complete"

From this output, we can see that an allocation has been created. Let’s check its status.

$ nomad alloc-status bf3fc4b2
ID = bf3fc4b2
Eval ID = 1e729627
Name = nomad-redis.cache[0]
Node ID = b0d927cd
Job ID = nomad-redis
Job Version = 0
Client Status = running
Client Description = <none>
Desired Status = run
Desired Description = <none>
Created At = 08/23/17 21:52:03 CEST
Task "redis" is "running"
Task Resources
CPU        Memory           Disk     IOPS  Addresses
1/500 MHz  6.3 MiB/256 MiB  300 MiB  0     db: 192.168.1.100:21886
Task Events:
Started At = 08/23/17 19:52:03 UTC
Finished At = N/A
Total Restarts = 0
Last Restart = N/A
Recent Events:
Time                    Type        Description
08/23/17 21:52:03 CEST  Started     Task started by client
08/23/17 21:52:03 CEST  Task Setup  Building Task Directory
08/23/17 21:52:03 CEST  Received    Task received by client

The redis task (= the container) seems to be running correctly. Let’s check Consul’s DNS server and make sure the service is correctly registered.

$ dig @localhost SRV redis.service.consul
; <<>> DiG 9.10.3-P4-Ubuntu <<>> @localhost SRV redis.service.consul
; (1 server found)
;; global options: +cmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 35884
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 0, ADDITIONAL: 2
;; OPT PSEUDOSECTION:
; EDNS: version: 0, flags:; udp: 4096
;; QUESTION SECTION:
;redis.service.consul. IN SRV
;; ANSWER SECTION:
redis.service.consul. 0 IN SRV 1 1 6379 ac110002.addr.dc1.consul.
;; ADDITIONAL SECTION:
ac110002.addr.dc1.consul. 0 IN A 172.17.0.2
;; Query time: 0 msec
;; SERVER: 127.0.0.1#53(127.0.0.1)
;; WHEN: Wed Aug 23 23:08:36 CEST 2017
;; MSG SIZE rcvd: 103

We can see that the task was allocated the IP 172.17.0.2 (on Docker’s bridge) and that its port is 6379, as we defined.

Definition of the vote job

Let’s now define the job for the vote service. We will use the following job file.

// job.nomad
job "vote-nomad" {
  datacenters = ["dc1"]
  type        = "service"

  group "vote-group" {
    task "vote" {
      driver = "docker"

      config {
        image              = "dockersamples/examplevotingapp_vote:before"
        dns_search_domains = ["service.dc1.consul"]
        dns_servers        = ["172.17.0.1", "8.8.8.8"]

        port_map {
          http = 80
        }
      }

      service {
        name = "vote"
        port = "http"
        check {
          name     = "vote interface running on 80"
          interval = "10s"
          timeout  = "5s"
          type     = "http"
          protocol = "http"
          path     = "/"
        }
      }

      resources {
        cpu    = 500 # 500 MHz
        memory = 256 # 256MB
        network {
          port "http" {
            static = 5000
          }
        }
      }
    }
  }
}

There are a couple of differences from the job file we used for redis:

  • the vote task connects to redis using only the name of the task. The example below is an excerpt of the app.py file used in the vote service.
# app.py
def get_redis():
    if not hasattr(g, 'redis'):
        g.redis = Redis(host="redis", db=0, socket_timeout=5)
    return g.redis

In this case, the vote container needs to use Consul’s DNS to get the IP of the redis container. DNS requests from a container go through the Docker bridge (172.17.0.1). The dns_search_domains option is also specified, as a service X is registered as X.service.dc1.consul within Consul.

  • we defined a static port so that the vote service can be accessed on port 5000 from outside the cluster

We can do pretty much the same configuration for the other services: worker, postgres and result.
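
Once those job files are written (postgres.nomad, result.nomad and worker.nomad are just hypothetical file names here, they are not part of the repository), launching them follows the exact same pattern as for redis and vote.

# Hypothetical file names; each job file follows the same structure as the ones above
$ nomad run postgres.nomad
$ nomad run result.nomad
$ nomad run worker.nomad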

Access the application

Once all the jobs have been launched, we can check the status and should see all of them running.

$ nomad status
ID              Type     Priority  Status   Submit Date
nomad-postgres  service  50        running  08/23/17 22:12:04 CEST
nomad-redis     service  50        running  08/23/17 22:11:46 CEST
result-nomad    service  50        running  08/23/17 22:12:10 CEST
vote-nomad      service  50        running  08/23/17 22:11:54 CEST
worker-nomad    service  50        running  08/23/17 22:13:19 CEST

We can also see all the services registered and healthy in Consul’s interface.

From the node IP (192.168.1.100 in this example) we can access the vote and result interfaces.

Summary

Docker’s Voting App is a great application for demo purposes. I was curious to see if it could be deployed, without changes in the code, on some of the main orchestration tools. The answer is yes… and without too many tweaks.

I hope this article helped in understanding the very basics of Swarm, Kubernetes and Nomad. I’d love to hear how you run your Docker workloads and which orchestration tool you are using.