Docker containers in swarm mode using Consul

marco cipri
9 min read · Aug 4, 2017


Brief intro :

Ever since I read about microservices as a way of designing software applications as suites of independently deployable services, I have been intrigued by Docker, and in the last six months I have wondered what comes out when we mix Consul with the latest native swarm mode.
Can we have a reliable system with less effort and less maintenance?
How could we have centralized key configuration and service discovery? Can the system self-manage fault conditions and load variations?
I’ll try to show my point of view with a practical (and implemented) solution for an extensible architecture. This post is intended to be the logbook of this journey.

hope you enjoy the recipe…

Ingredients :

  • docker : v1.12, the container swarm
  • consul : service discovery, health checks and Key/Value repository
  • nodejs : fast and easy for demo microservices
  • haproxy : software load balancer
  • EnvConsul and Consul-Template : read configuration and render templates from Consul
  • soundtrack : https://en.wikipedia.org/wiki/Algiers_(band)

stirred, not shaken…

Desiderata :

what I want is an extensible framework that is easy to scale, then:

  • centralized service configuration
  • add new service workers dynamically at “run time”
  • centralized health checks
  • automatic worker removal in case of fault
  • transparent load balancing

High level Description:

Just as an example, the service exposed by the system is a token generator (very basic and trivial, it implements an Access Token pattern). The token service could involve JWT, but I prefer to write a few lines of Node code and show how we integrate service discovery, the Key/Value repository and fault management. For this round the token service is written in JavaScript and runs on Node.js; the microservice’s configuration is stored on a Consul (KV) cluster which also acts as dynamic discovery, DNS service and basic failover. The same service will be delivered by an appropriate number of independent swarms; when needed, a new swarm can be added. HAproxy provides the load-balanced entry points from the outside and forwards the requests to the swarms. Security matters are out of the scope of this post. For now, focus on reliability, balancing and adaptability….

below, the system view from a high-level perspective :

The Code :

in order to follow the recipe you must install Docker and VirtualBox on your machine. All servers will be deployed locally.

Scripts are stored on GitHub at (https://github.com/marcocipri/docker-framework-I)

they are docker-machine based, therefore quite easy to adjust in order to deploy on Google, Amazon, etc.

the microservice code is at (https://github.com/marcocipri/tokenmanager)

The Core :

this is the heart of the system. In this case I have chosen Consul; the scripts provide 3 nodes, but the number could be increased as needed. Below is a brief description of the scripts devoted to the creation of the core (see on git)

create_core_vms.sh : creates the VMs for the Consul servers. For testing, the local VirtualBox is enough. The script creates just 3 very light VMs using the docker-machine tool : core-vm1, core-vm2, core-vm3. Extend as you like. The VMs are based on “boot2docker.iso” and aren’t intended for a production environment. At the end of this script you can see the three machines listed by docker-machine

and in the VirtualBox UI as well
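For reference, creating machines like these with docker-machine boils down to something like the following sketch (the actual script may use additional options):

# create three lightweight boot2docker VMs on the local VirtualBox
for vm in core-vm1 core-vm2 core-vm3; do
  docker-machine create --driver virtualbox "$vm"
done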

deploy_consul_srv.sh : deploys the Consul servers. A Consul cluster requires at least three nodes (2N+1), declared with the bootstrap-expect parameter in accordance with the consensus protocol specs. The Consul servers are deployed as Docker containers using the official image from hub.docker.com. Services are exposed on the default ports, but the DNS service is exposed on the standard port (53) instead of (8600). Now we have our living DNS. To achieve this port binding, the usage of the host’s network is required. At the end of the script you can see the running containers

and the consul user interface is available on each node on port 8500

all three nodes are available and are shown as part of the cluster.
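For reference, starting one of these Consul servers with the official image looks roughly like this (a sketch: the IP addresses and the container name are assumptions from my local setup, and the script may use different options):

# run a Consul server on the host network, exposing DNS on port 53
# 192.168.99.100 is the (assumed) IP of this VM, 192.168.99.101 a peer to join;
# binding the privileged port 53 may require extra privileges on some setups
docker run -d --net=host --name=consul-srv consul agent -server \
  -bootstrap-expect=3 -ui -client=0.0.0.0 \
  -bind=192.168.99.100 -retry-join=192.168.99.101 -dns-port=53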

set_consul_env.sh : sets the first KV entries on the Consul cluster, very trivially via curl. Now the KV entries are visible on the Consul UI :

KV values can be updated via the UI or the REST API and play, in this architecture, the role of centralized configuration. Next we will see how the microservices read the KV over REST.
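For example, a key can be written and read back with the KV HTTP API like this (the key name and value here are purely illustrative; the real entries are the ones set by set_consul_env.sh):

# write a (hypothetical) configuration key
curl -X PUT -d '5000' http://192.168.99.100:8500/v1/kv/tokenmanager/refresh-millis
# read it back as a raw value
curl http://192.168.99.100:8500/v1/kv/tokenmanager/refresh-millis?raw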

The git repository also contains delete and clean scripts, devoted to restoring the environment.

The SWARMs :

the idea behind the swarm is to deploy a bunch of servers which share an overlay network. Some of these servers play as Managers and the others as Workers. Managers receive the requests and redirect them to the Workers in accordance with the load-balancing policy. Once the consumers’ throughput is not enough anymore, just deploy a new swarm with the same microservices deployed. Let’s explore the scripts :

create_swarm_vms.sh : creates the VMs, quite similar to what we have seen in the Core. When

bash create_swarm_vms.sh a

is invoked, all the VMs needed for the swarm “a” will be created. The script creates just three VMs, one for the manager and two for the workers, but there isn’t any limit on the number of managers and workers. Refine as you need. Now you can see something like :

create_swarm.sh : creates the swarm, defining the Manager, the Workers and the overlay network

bash create_swarm.sh a
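Under the hood, building a swarm like this comes down to a few standard docker commands run through docker-machine ssh; a sketch (the worker VM names and the network name are assumptions, the script’s details may differ):

# initialize swarm mode on the manager
docker-machine ssh swarm-a-mng-1 "docker swarm init --advertise-addr $(docker-machine ip swarm-a-mng-1)"
# obtain the worker join token and join the workers
TOKEN=$(docker-machine ssh swarm-a-mng-1 "docker swarm join-token -q worker")
docker-machine ssh swarm-a-wrk-1 "docker swarm join --token $TOKEN $(docker-machine ip swarm-a-mng-1):2377"
docker-machine ssh swarm-a-wrk-2 "docker swarm join --token $TOKEN $(docker-machine ip swarm-a-mng-1):2377"
# create the shared overlay network
docker-machine ssh swarm-a-mng-1 "docker network create --driver overlay swarm-a-net"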

once the script ends we can inspect the status of the cluster following these instructions :

docker-machine ssh swarm-a-mng-1

connects to the manager node by a shell

docker node ls

asks the swarm to show how it is composed; you can see :

swarm-a-mng-1 is the leader. Now we have the swarm infrastructure… let’s deploy some services.

deploy_consul_cli.sh : creates a Consul client which interacts with the cluster, resolves names, stores KV entries and manages the service discovery. With this approach we can avoid DNS management. After the invocation of

bash deploy_consul_cli.sh a

we have the cluster client only on the swarm managers; on the workers it isn’t needed. On the manager run

docker ps

and see the Consul client container running.
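To double-check that this client has actually joined the core cluster, you can list the members it sees (a sketch; the container name is an assumption):

# list all Consul agents known to this client (the 3 core servers plus the clients)
docker exec -it consul-cli consul members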

now run :

deploy_visualizer.sh : deploys a container which shows the swarm status; it is useful just for a test environment, not for production purposes

bash deploy_visualizer.sh a

at this URL (update the IP address)

http://192.168.99.103:5001/

the swarm composition and the deployment status are shown
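The script presumably wraps the usual visualizer deployment, something like this sketch (assuming the dockersamples/visualizer image; the published port 5001 matches the URL above):

# run the visualizer on a manager node, published on port 5001
docker service create --name=visualizer --publish=5001:8080 \
  --constraint=node.role==manager \
  --mount=type=bind,src=/var/run/docker.sock,dst=/var/run/docker.sock \
  dockersamples/visualizer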

at this point of the story we have only the visualizer. Now we can deploy the microservice :

deploy_token_manager.sh : deploys the microservice packaged as a Docker image; all the code and the Dockerfile are on git (https://github.com/marcocipri/tokenmanager). Run :

bash deploy_token_manager.sh a

and see on the swarm visualizer what happens

two instances of tokenmanager are deployed, as requested by the parameter

--replicas=2
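In other words, the script launches something along these lines (a sketch; the image name is an assumption, while the service name and replica count match what is shown above):

# deploy two replicas of the token manager, publishing port 8080 on the swarm
docker service create --name token-manager --replicas=2 \
  --publish 8080:8080 marcocipri/tokenmanager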

briefly : the container uses the Consul client on the same host (service registration, KV service), exposes its service on port 8080 and runs service.js. Let’s see what happens on the swarm. Access the manager and run :

docker@swarm-a-mng-1:~$ docker service logs token-manager

Now you can see what the microservice writes to the console (Docker logging is not in the scope of this post).

through the Consul REST API the service retrieves the configuration stored on the Consul cluster, as seen before in the Core installation. The refresh frequency is part of the configuration itself. Next, the service registers itself as the token-manager service, again on the Consul cluster, as seen below :

the Consul cluster checks the health of the node with an HTTP GET on /healt/

this function can be evolved as needed; for now it returns only “OK”
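Registration plus health check can be expressed against the local Consul client with a single call to the agent API, roughly like this (a sketch: the address and the check interval are assumptions, and the check path is the /healt/ endpoint mentioned above):

# register the tokenmanager service with an HTTP health check
curl -X PUT -d '{
  "Name": "tokenmanager",
  "Port": 8080,
  "Check": { "HTTP": "http://192.168.99.103:8080/healt/", "Interval": "10s" }
}' http://localhost:8500/v1/agent/service/register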

but what is the purpose of the service? To generate and verify an authentication token. The endpoint devoted to providing this service is on the manager host, reachable with an HTTP POST on:

http://192.168.99.103:8080/generate

header :

Content-Type: application/json

an example payload :

{
"username": "123456789012",
"token": "-",
"millis": "-",
"gametype": "othello",
"reseller": "12345",
"clientversion": "ver01"
}
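For instance, from the host machine the request can be sent with curl (the IP address is the manager’s address in my local setup):

curl -s -H "Content-Type: application/json" -X POST -d '{ "username": "123456789012", "token": "-", "millis": "-", "gametype": "othello", "reseller": "12345", "clientversion": "ver01" }' "http://192.168.99.103:8080/generate"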

see the response :

{
"sessionID": "aec816e4f83ef3dca658a11705c7a2c03b91ecbc8b0fc4e2468f086e0f4919b5cf4a98b4d35da2ff67b1207e51ae03cffa59a4176b915857b6d703ab906a76944f61458950d010e72ba797783d381e84f0f9ea7f75eaaabcb10b22fcce5d172f414293a824a5530df06db2b6556cf54b4d88a6e0982748419e2fc7cdc68957e9",
"baker": "c03798ef8511"
}

the baker is the worker which returned the response; after a few hits you can see it change as expected. In this configuration the “baker” field assumes only two values, since we have just two workers.

In the end, what do we have? A balanced microservice, deployed as a Docker container, subscribing itself as a public service and using a remote/centralized configuration.

The Load Balancing :

In my plan the default access point isn’t the swarm’s manager. For this reason a load balancer will be created in order to expose an external access point to the clients. This component is based on HAproxy and Consul-Template. In detail, HAproxy provides the HTTP service and Consul-Template updates the balancing policy following the Consul cluster service subscriptions, see below.
First create the VMs for the first load balancer (lb-01)

bash create_load_balancer_vms.sh 01

then deploy the Consul client with name consul-lb-01

bash deploy_consul.sh 01

the Consul client subscribes itself to the Consul cluster, automatically…

bash deploy_haproxy.sh 01

this image has haproxy and consul-template installed.
Consul-template watches for any change in the tokenmanager service registrations, as shown in in.tpl

{{ range service "tokenmanager" }} server {{ .Name }} {{ .Address }}:{{ .Port }}{{ end }}

when a new tokenmanager is registered (or deregistered because of a fault), the configuration file of HAproxy is rewritten and the HTTP server is restarted. This approach is easy and fast in case of basic HTTP traffic. The rules to rewrite the configuration file are defined in haproxy.template
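Consul-template is typically started with a template/destination/command triple, roughly like this (a sketch; the paths, the consul address and the reload command are assumptions):

# re-render the HAproxy config and reload it whenever the service list changes
consul-template -consul-addr=127.0.0.1:8500 \
  -template "/etc/haproxy/haproxy.template:/etc/haproxy/haproxy.cfg:service haproxy restart"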

what have we achieved? Try to send an HTTP POST to the load balancer node on port 1080 (as defined in the docker run), assuming that the load balancer is reachable at the address below:

curl -S -s -H "Content-Type: application/json" -X POST -d '{ "username": "123456789012", "token": "-", "millis": "-", "gametype": "othello", "reseller": "12345", "clientversion": "ver01" }' "http://192.168.99.107:1080/generate"

the baker (the worker node which generates the response) changes and is one of the available workers.

now we are able to add a new swarm or a new load balancer; the system manages these variations, exposing the services while maintaining the policies of load balancing and reliability.

Play with the environment :

let’s try adding a new swarm.
In the current situation we have one swarm composed of one manager and one worker; running this script (which contains the same operations described before)

bash create_new_swarm.sh b

now we have a new swarm

the new swarm has subscribed itself to the services available on the Consul cluster

and also as nodes

assuming that, in my local installation, the load balancer node has IP address 192.168.99.107 and the service port is 1080, the following script

bash check_the_baker.sh 192.168.99.107 1080

shows which baker answers…. there are now four different IDs.
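check_the_baker.sh essentially loops over the same POST and extracts the baker field; a minimal sketch could look like this (the payload is the same one used above):

# usage: bash check_the_baker.sh <lb-ip> <port>
for i in $(seq 1 10); do
  curl -s -H "Content-Type: application/json" -X POST \
    -d '{ "username": "123456789012", "token": "-", "millis": "-", "gametype": "othello", "reseller": "12345", "clientversion": "ver01" }' \
    "http://$1:$2/generate" | grep -o '"baker":[^,}]*'
done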

gently stopping one swarm, we have :

and, more brutally, stopping one of the manager hosts via VirtualBox :

the system takes 20 seconds to restore itself (see consul.hcl)

Have fun! :-)
