Deploy Docker Engine in swarm mode to Google Cloud Platform

Daz Wilkin
Jul 9, 2016


At DockerCon, Docker announced the addition of a swarm mode to Docker Engine. Swarm mode requires Docker Engine v1.12.0-rc1 and is best experienced when run on a cluster of nodes in a so-called Swarm.

I thought it’d be useful to deploy a Swarm to Google Compute Engine (GCE) in order to follow along with the tutorial provided by Docker. So I created some simple Bash scripts to do this before realizing this was a good opportunity for me to teach myself the basics of Google Cloud Deployment Manager (CDM). CDM is Google Cloud Platform’s native deployment tool. Like similar tools (Chef, Puppet etc.), its syntax is declarative rather than imperative and therefore facilitates composability (reuse).

The result of my adventure is a basic CDM Python script and YAML configuration that should permit you to deploy a Docker Swarm to GCE, comprising one manager and as many workers as you choose. You may use the Swarm to work through Docker’s Swarm tutorial in order to become familiar with Swarm, or as the runtime for your own Swarm deployments.

If you simply want to get a Swarm deployed, please jump ahead to “Deploying the Swarm using CDM”.

If you’re interested in understanding how I built it, please continue reading here.

How to install Docker v1.12.x on Ubuntu 16.04 LTS

Docker provides a Bash script to install Docker releases. To aid my understanding and debugging, and because I chose to deploy only to Ubuntu 16.04 LTS, I simplified the script:

#! /bin/bash
# Receive the Docker public key and add it to the keystore
# so the package manager can install signed packages from
# Docker
sudo apt-key adv \
--keyserver hkp://p80.pool.sks-keyservers.net:80 \
--recv-keys 58118E89F3A912897C070ADBF76221572C52609D
# add the main Docker repo to apt's package sources
echo "deb [arch=amd64] https://apt.dockerproject.org/repo ubuntu-xenial testing" | sudo tee /etc/apt/sources.list.d/docker.list
# update the OS and install Docker engine
sudo apt-get update && \
sudo apt-get -y install linux-image-extra-4.4.0-28-generic && \
sudo apt-get -y install docker-engine && \
sudo apt-get -y upgrade && \
sudo apt-get clean
# start the Docker service
sudo systemctl start docker

To create a Swarm, one of the nodes must:

sudo docker swarm init

Other nodes then join this swarm with:

sudo docker swarm join [SWARM-MANAGER]:2377

For simplicity, I create a single manager node with a hostname of “swarm-master”.

You can run these commands for yourself on multiple nodes to spin up your own Swarm, but I wanted to run my Swarm on Compute Engine (GCE).

How to create Docker swarm mode nodes on Compute Engine

The simplest way to create a manager node and worker node(s) on Compute Engine is to create instances that run a version of the Bash script outlined in “How to install Docker v1.12.x on Ubuntu 16.04 LTS” as a startup script.

If you’ve not used Google Cloud Platform before, start here. Otherwise, assuming you have a project [PROJECT] and have the gcloud command-line tool installed, a script to create the manager node is:

#! /bin/bash
# Edit PROJECT to be your project ID and
# ZONE to be your preferred zone
export PROJECT=[PROJECT]
export ZONE=[ZONE]
export MASTER="swarm-master"
gcloud compute instances create $MASTER \
--project=$PROJECT \
--zone=$ZONE \
--image=/ubuntu-os-cloud/ubuntu-1604-xenial-v20160610 \
--metadata startup-script="
#! /bin/bash
sudo apt-key adv \
--keyserver hkp://p80.pool.sks-keyservers.net:80 \
--recv-keys 58118E89F3A912897C070ADBF76221572C52609D
echo \"deb [arch=amd64] https://apt.dockerproject.org/repo ubuntu-xenial testing\" | sudo tee /etc/apt/sources.list.d/docker.list
sudo apt-get update && \
sudo apt-get -y install linux-image-extra-4.4.0-28-generic && \
sudo apt-get -y install docker-engine && \
sudo apt-get -y upgrade && \
sudo apt-get clean
sudo systemctl start docker
sudo docker swarm init
"

Replace [PROJECT] with your project’s ID. You may find your current project’s ID here. You may choose to use a different zone. If you wish to use a different name for the master, you must not only edit the value of the MASTER environment variable but also change the scripts where you see ‘swarm-master’.

The only difference between the startup script used by the manager and the workers is that the manager initializes the swarm.
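That single difference can be captured in one parameterized script. Here is a minimal, illustrative sketch (the docker_bootstrap function and its role parameter are my own invention, not part of the scripts above); it only echoes the Docker commands so it can be read and run safely:

```shell
#!/bin/bash
# Illustrative sketch: one startup-script body shared by both roles.
# "role" is a hypothetical parameter; the real scripts hard-code the behavior.
docker_bootstrap() {
  local role="$1"
  if [ "$role" = "manager" ]; then
    # The manager initializes the swarm.
    echo "sudo docker swarm init"
  else
    # Workers join the swarm created by the manager.
    echo "sudo docker swarm join swarm-master:2377"
  fi
}

docker_bootstrap manager
docker_bootstrap worker
```

In a real startup script you would execute the commands instead of echoing them.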

It takes GCE under 2 minutes to provision the VM, install Docker and start it.

You may check progress by SSH’ing into the MASTER and tailing the startup script’s log:

gcloud compute ssh $MASTER
tail -f /var/log/startupscript.log

When everything has completed satisfactorily, you will see:

Swarm initialized: current node (7sd5v2at8srm0i4es2w1b8jly) is now a manager.
Finished running startup script /var/run/google.startup.script

The command to create the worker nodes is:

#! /bin/bash
# Edit PROJECT to be your project ID and
# ZONE to be your preferred zone
export PROJECT=[PROJECT]
export ZONE=[ZONE]
export TEMPLATE="docker-1-12"
export WORKERS="swarm-worker"
gcloud compute instance-templates create $TEMPLATE \
--project=$PROJECT \
--machine-type="n1-standard-1" \
--image=/ubuntu-os-cloud/ubuntu-1604-xenial-v20160610 \
--metadata startup-script="
#! /bin/bash
sudo apt-key adv \
--keyserver hkp://p80.pool.sks-keyservers.net:80 \
--recv-keys 58118E89F3A912897C070ADBF76221572C52609D
echo \"deb [arch=amd64] https://apt.dockerproject.org/repo ubuntu-xenial testing\" | sudo tee /etc/apt/sources.list.d/docker.list
sudo apt-get update && \
sudo apt-get -y install linux-image-extra-4.4.0-28-generic && \
sudo apt-get -y install docker-engine && \
sudo apt-get -y upgrade && \
sudo apt-get clean
sudo systemctl start docker
sudo docker swarm join swarm-master:2377
"
gcloud compute instance-groups managed create $WORKERS \
--project=$PROJECT \
--zone=$ZONE \
--template=$TEMPLATE \
--size=3

As before, please replace [PROJECT] with your project’s ID. The only difference between the startup script used for the workers and that used for the manager is that the workers join the swarm that was created by the manager. The script has no dependency checking for the existence of the manager (‘swarm-master’) so you should run the workers script after the manager script.

Beyond the startup script, the workers script differs in how the VMs are created. It creates a Compute Engine Instance Template, and that template is passed to the command that creates a Managed Instance Group (MIG). A MIG is Compute Engine’s construct for creating an arbitrary number of VM clones. In this script, we create 3 VMs but you may change this as you wish.

It is common to apply an Autoscaler to a MIG in order to scale the number of clones on demand by CPU, some monitoring metric or by the incoming traffic load from a load-balancer. For our purposes, a static number of VMs is sufficient.

Optional: If you would like to add an autoscaler, the following command will create one that will scale the MIG to a maximum of 5 instances and attempt to maintain an average CPU utilization of 80%:

gcloud compute instance-groups managed set-autoscaling $WORKERS \
--project=$PROJECT \
--zone=$ZONE \
--max-num-replicas=5 \
--target-cpu-utilization=0.8
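The utilization target drives a simple sizing rule: roughly, the autoscaler recommends enough replicas to bring average CPU back down to the target, capped at the maximum. The exact algorithm (with its stabilization windows) is internal to Compute Engine, so treat this Python sketch as an approximation only:

```python
import math

def recommended_replicas(current_replicas, current_utilization,
                         target_utilization=0.8, max_replicas=5):
    """Approximate the autoscaler's sizing rule: scale the group so that
    average CPU utilization falls back to the target, capped at the maximum."""
    desired = math.ceil(current_replicas * current_utilization / target_utilization)
    return min(max(desired, 1), max_replicas)

# 3 workers averaging 95% CPU -> grow toward the 80% target
print(recommended_replicas(3, 0.95))  # 4
```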

You should not need to do so, but you could SSH into any of the worker VMs and tail the startup script’s log to see that the worker has started correctly. Each worker VM is ready, with Docker installed, within approximately 2 minutes.

To test that the Swarm is ready, you can SSH into the master (!) and enumerate its nodes:

gcloud compute ssh $MASTER
sudo docker node ls

When only the manager is running, the command lists a single node: the manager itself, marked as the leader. As the other nodes come online, the workers appear in the list as well.

You may then proceed with the Docker tutorial for Swarm from the step Deploy a Service. To avoid having to ‘sudo’ every Docker command, you can add your $USER to the Docker group with:

sudo usermod -a -G docker ${USER}
exec sudo su ${USER}

When you are done, you may delete everything using the following commands:

gcloud compute instance-groups managed delete $WORKERS \
--project=$PROJECT \
--quiet
gcloud compute instance-templates delete $TEMPLATE \
--project=$PROJECT \
--quiet
gcloud compute instances delete $MASTER \
--project=$PROJECT \
--quiet

Putting it all together using Cloud Deployment Manager (CDM)

Using Deployment Manager is mostly a process of converting the gcloud commands that you used previously into their equivalent REST API calls and scripting them, in this case using Python. Deployment Manager also supports the use of Jinja templates, but I found Python to be more effective.

CDM uses YAML for its configuration files. Where I used environment variables with the Bash scripts, I converted these into CDM properties. I took advantage of CDM configuration and pulled a few more variables out from the Python script too, including the names of the swarm master and workers.
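As an illustration, a dockerswarm.yaml along these lines would carry those properties. The property names below are my own guesses at a reasonable layout, not necessarily those in the actual file:

```yaml
# Hypothetical sketch of dockerswarm.yaml; property names are illustrative.
imports:
- path: dockerswarm.py

resources:
- name: docker-swarm
  type: dockerswarm.py
  properties:
    zone: us-central1-f
    machineType: n1-standard-1
    swarm-master: swarm-master
    swarm-worker-template: swarm-worker-template
    swarm-worker-mig: swarm-worker-mig
    numberOfWorkers: 3
```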

CDM expects the Python script to have a method called GenerateConfig that takes a context (which includes the YAML configuration properties) and returns a Python dictionary containing the resources to be created on GCP. As with the Bash scripts, we need a master VM and an arbitrary number of worker VMs managed by a MIG and built from a Template. So, here’s the skeleton of the Python script:

def GenerateConfig(context):
  return {
      'resources': [
          swarm_master,
          swarm_worker_template,
          swarm_worker_mig
      ],
  }

All that’s left to do is define each of the constituent resources to CDM. Let’s take the Template, as it’s a good example:

swarm_worker_template = {
    'type': 'compute.v1.instanceTemplate',
    'name': context.properties['swarm-worker-template'],
    'properties': {
        'properties': {
            'machineType': context.properties['machineType'],
            'disks': disks,
            'networkInterfaces': network_interfaces,
            'metadata': metadata,
            'serviceAccounts': service_accounts
        }
    }
}
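To make the skeleton concrete, here is a minimal, self-contained sketch of GenerateConfig with all three resources stubbed out. The resource bodies are abbreviated (the real script fills in disks, networks, metadata and so on, as shown above), and property names such as numberOfWorkers are my assumptions:

```python
# Minimal sketch of a CDM template; resource bodies are abbreviated.
def GenerateConfig(context):
    props = context.properties

    swarm_master = {
        'type': 'compute.v1.instance',
        'name': props['swarm-master'],
        'properties': {'machineType': props['machineType']},
    }
    swarm_worker_template = {
        'type': 'compute.v1.instanceTemplate',
        'name': props['swarm-worker-template'],
        'properties': {
            # instanceTemplates nest the instance settings in their own
            # 'properties' field, hence properties-within-properties.
            'properties': {'machineType': props['machineType']},
        },
    }
    swarm_worker_mig = {
        'type': 'compute.v1.instanceGroupManager',
        'name': props['swarm-worker-mig'],
        'properties': {
            # $(ref...) lets CDM resolve the template's selfLink at deploy time.
            'instanceTemplate': '$(ref.%s.selfLink)' % props['swarm-worker-template'],
            'targetSize': props['numberOfWorkers'],
        },
    }
    return {'resources': [swarm_master, swarm_worker_template, swarm_worker_mig]}
```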

It’s clear that we need a consistent way to define resources to CDM so that it may build those resources on Compute Engine. The Compute Engine API is the definitive way to interact with Compute Engine to build resources; gcloud commands are converted into REST calls. So it should come as no surprise that CDM effectively represents this API’s methods and request objects as the way to define resources. How does the developer know that the Template has a type of ‘compute.v1.instanceTemplate’? That it needs a name, properties and so on? There are two ways to determine this. The easiest is to use the Cloud Console to do the hard work for you and then have it show the REST equivalent:

An alternative and more precise approach is to review the Compute Engine API documentation. This approach shows you the definitive required parameters and helps you drill down as necessary. For Instance Templates, the link is:

https://cloud.google.com/compute/docs/reference/latest/instanceTemplates/insert

Once the Python script and the YAML configuration are ready, we can deploy the template to Google Cloud Platform (GCP) and explore our Docker Swarm.

Deploying the Swarm using Cloud Deployment Manager

At this point we have a Python script (dockerswarm.py) that receives contextual information (e.g. project and zones details) from the CDM configuration file (dockerswarm.yaml) when deployed to the CDM service. The result is a series of orchestrated REST API calls to Google Cloud Platform (GCP) that create:

  1. A Swarm manager node called ‘swarm-master’
  2. An Instance Template called ‘swarm-worker-template’ that is used by
  3. A Managed Instance Group called ‘swarm-worker-mig’ to create x swarm workers

Assuming both files keep their names and exist in the same directory, the command to deploy the result to Google Cloud Platform is:

gcloud deployment-manager deployments create docker-swarm \
--config dockerswarm.yaml

‘docker-swarm’ is the name we want to give the deployment and dockerswarm.yaml refers to our configuration template. This should result in:

Your operation name will be different. You may view the result using Cloud Console:

Cloud Deployment Manager

And, if you click on ‘docker-swarm’:

Summary of ‘docker-swarm’ deployment

And, if you click (for example) on swarm-worker-mig, you will see:

Summary of ‘swarm-worker-mig’

And, if you then click on the right hand side “MANAGE RESOURCE”, you will see:

Summary of ‘swarm-worker-mig’ showing the swarm-worker VMs

As before, you can then access the Swarm by SSH’ing into the manager and running commands against it:

gcloud compute ssh swarm-master
sudo docker node ls
sudo docker service create \
--replicas 5 \
--name hellogoogle \
alpine ping google.com
sudo docker service inspect --pretty hellogoogle
ID: 6igblk6vz262d7xsmj5eipvyt
Name: hellogoogle
Mode: Replicated
Replicas: 5
Placement:
Strategy: Spread
UpdateConfig:
Parallelism: 0
ContainerSpec:
Image: alpine
Args: ping google.com
Resources:
Reservations:
Limits:
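Without --pretty, docker service inspect emits JSON, which is handier for scripting. A small Python sketch that pulls the replica count out of that JSON; the sample below is trimmed to the fields of interest (the full document carries far more):

```python
import json

# A trimmed sample of `docker service inspect hellogoogle` output (JSON mode).
sample = '''
[
  {
    "ID": "6igblk6vz262d7xsmj5eipvyt",
    "Spec": {
      "Name": "hellogoogle",
      "Mode": {"Replicated": {"Replicas": 5}}
    }
  }
]
'''

def replica_count(inspect_json):
    """Return the replica count of the first service in the inspect output."""
    service = json.loads(inspect_json)[0]
    return service['Spec']['Mode']['Replicated']['Replicas']

print(replica_count(sample))  # 5
```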

And, to tear everything down, simply delete the deployment:

gcloud deployment-manager deployments delete docker-swarm \
--quiet

Conclusion

My original goal was to investigate the new ‘swarm mode’ in Docker Engine. I detoured through teaching myself Cloud Deployment Manager, and I hope this account of my experience is of interest to you.

Next Steps

Next steps include creating a MIG to autoscale the Swarm managers and learning how to use the CDM Runtime Configurator in order to spin up the Swarm nodes more elegantly.

Files

dockerswarm.yaml
dockerswarm.py

References

https://docs.docker.com/engine/swarm
https://github.com/docker/docker/releases
https://experimental.docker.com
https://cloud.google.com/
https://cloud.google.com/compute/
https://cloud.google.com/deployment-manager/
https://cloud.google.com/free-trial/
https://cloud.google.com/compute/docs/autoscaler/
https://cloud.google.com/compute/docs/instance-groups/
https://cloud.google.com/compute/docs/instance-templates
https://cloud.google.com/sdk/gcloud/
