Raft logs in Swarm mode

TL;DR

I’ve always been intrigued by the inner workings of the Raft algorithm in Docker Swarm. This article aims to provide information to better understand what is happening under the hood.

Note: this article is not as long as it looks, much of it is log output :)

Raft

Raft is a protocol for implementing distributed consensus. What does that mean? Consensus is a problem that must be solved in fault-tolerant distributed systems: multiple servers have to agree on values.

As a picture is always better than a long plain-text description, here is an even better illustration: an animation that shows Raft in action for its 2 main processes:

  • Leader election
  • Log replication

Raft in Swarm mode

From Docker’s documentation:

using a Raft implementation, the managers maintain a consistent internal state of the entire swarm and all the services running on it

The overall view of a swarm is presented in the picture below.

Docker Swarm overall architecture

Manager nodes handle the cluster state while worker nodes are the ones that execute the workload. By default a manager is also a worker.

Among the managers, the leader node is the one that logs all the actions that are done in the cluster (node added/removed, creation of a service, …). Swarm then ensures that the leader’s logs are replicated to each manager, so that one of them can take over the leader role if the current leader becomes unavailable.
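Whether another manager can actually take over depends on Raft’s majority rule: a leader can only be elected while more than half of the managers are reachable. A minimal sketch of that arithmetic, in shell like the rest of this article (the function names quorum and tolerated_failures are mine, not part of Docker):

```shell
# Majority needed among N managers for Raft to make progress
quorum() {
  echo $(( $1 / 2 + 1 ))
}

# Number of managers that can be lost while a majority remains
tolerated_failures() {
  echo $(( ($1 - 1) / 2 ))
}

# Print the figures for a few common manager counts
for n in 1 2 3 5 7; do
  echo "managers=$n quorum=$(quorum "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

With 2 managers the quorum is 2, so the logs are replicated but losing either manager leaves the other one without a majority. This is why an odd number of managers (3, 5, 7) is usually recommended.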

Each manager has the same version of the logs, and on each manager the logs are encrypted. We will use swarm-rafttool, a tool from SwarmKit, to decrypt them and make them human readable.

Note: thanks to Andrea Luzzardi and Aaron Lehmann from Docker for showing me this tool and helping me play with the logs.

Why are Raft logs encrypted in Swarm?

Secret management, introduced in Docker 1.13, makes it possible to securely provide sensitive information to containers running on a Swarm. Basically, an operator creates a secret (usually containing credentials, certificates, or other private information) and then grants a service access to it. The secret is saved in the Raft logs and then made accessible to each container of the service through a temporary filesystem (/run/secrets/SECRET_NAME). As the secret is stored in clear text in the Raft logs, encrypting the logs prevents an attacker from reading it straight from the log files if a manager is compromised.

Alongside the encrypted logs are the public/private keys used for the encryption. The private key (/var/lib/docker/swarm/certificates/swarm-node.key) is used to encrypt the Raft logs and to secure TLS communication between the nodes.

Lock a Swarm for even more security

If a manager is compromised and the logs (and the encryption keys next to them) are disclosed, it’s easy for an attacker to decrypt the logs and get access to sensitive information. To prevent this, the swarm can be locked. When it is, an unlock key is generated and used to encrypt the public/private keys. This unlock key must be saved offline and provided manually whenever the Docker daemon restarts (and also to decrypt the logs, as we will see later on).

Learn more about the Swarm locking feature in Docker’s documentation.

How does swarm-rafttool decrypt the logs?

As we said above, each manager has the swarm’s encrypted Raft logs and the keys used to encrypt/decrypt them. swarm-rafttool uses one of those keys to decrypt the logs. If the Swarm is locked, the logs can still be decrypted by providing the unlock key to the tool.

In the following, we will set up a Swarm and inspect the logs while performing some operations (add a second manager node, create a service, create a secret).

Create 2 DigitalOcean droplets

Using Docker Machine, I create 2 DigitalOcean Droplets based on Ubuntu 16.04. I set the $MACHINE variable to node1 and then to node2 to name each node.

$ docker-machine create \
--driver digitalocean \
--digitalocean-access-token=${TOKEN} \
--digitalocean-image=ubuntu-16-04-x64 \
--digitalocean-region=lon1 \
$MACHINE
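Since the same command runs once per node, it can be wrapped in a loop. The following is only a sketch: create_nodes and the DRY_RUN switch are names I made up for illustration, and TOKEN is assumed to hold a DigitalOcean API token as in the command above.

```shell
# Create one droplet per name given as argument.
# With DRY_RUN set (hypothetical switch), the commands are printed, not run.
create_nodes() {
  for MACHINE in "$@"; do
    ${DRY_RUN:+echo} docker-machine create \
      --driver digitalocean \
      --digitalocean-access-token="${TOKEN}" \
      --digitalocean-image=ubuntu-16-04-x64 \
      --digitalocean-region=lon1 \
      "$MACHINE"
  done
}

# Preview the two commands without creating anything
DRY_RUN=1 create_nodes node1 node2
```

Running create_nodes node1 node2 without DRY_RUN would execute the two docker-machine commands for real.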

Getting SwarmKit

On both nodes, we will install Go and compile swarm-rafttool, which lives in the SwarmKit GitHub repository. This tool is used to decrypt the Raft logs of a swarm manager.

# Installing Go runtime
$ add-apt-repository ppa:longsleep/golang-backports
$ apt-get update
$ apt-get install golang-go
# Check Go version
$ go version

go version go1.8.1 linux/amd64
# Set GOPATH
$ export GOPATH=$HOME/go

We can now install swarm-rafttool. This can be done directly with the following command.

$ go get github.com/docker/swarmkit/cmd/swarm-rafttool

Let’s run the binary created.

$ $GOPATH/bin/swarm-rafttool
Usage:
/root/go/bin/swarm-rafttool [command]
Available Commands:
decrypt Decrypt a swarm manager’s raft logs to an optional directory
dump-wal Display entries from the Raft log
dump-snapshot Display entries from the latest Raft snapshot
dump-object Display an object from the Raft snapshot/WAL
Flags:
-h, --help help for /root/go/bin/swarm-rafttool
-d, --state-dir string State directory (default "/var/lib/swarmd")
--unlock-key string Unlock key, if raft logs are encrypted
Use "/root/go/bin/swarm-rafttool [command] --help" for more information about a command.

swarm-rafttool must be given a swarm directory, plus the unlock key if the swarm is locked, as we will see later on.

Set up a Swarm

From node1, we initialize the Swarm with the now common docker swarm init command. As several IPs are available on the default interface, we select the one that exposes the current instance.

$ docker swarm init --advertise-addr $NODE1_IP
Swarm initialized: current node (lirh0kaycnqx95466z3puso89) is now a manager.
To add a worker to this swarm, run the following command:
docker swarm join \
--token SWMTKN-1-0t1umu0aihk80palcon5ezhykh6cpg253g6op3hta1mf98zr87-4ck517l19iyhz3s9oasn3npzy \
46.101.44.158:2377
To add a manager to this swarm, run 'docker swarm join-token manager' and follow the instructions.
$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
lirh0ka...uso89 * node1 Ready Active Leader

The current instance is the leader among the managers; for now it is the only one, and it is responsible for handling the cluster’s state.

Now that the instance is in Swarm mode, a new folder has been created within the Docker installation folder, /var/lib/docker, to store all Swarm-related data.

$ find /var/lib/docker/swarm/
/var/lib/docker/swarm/
/var/lib/docker/swarm/docker-state.json
/var/lib/docker/swarm/raft
/var/lib/docker/swarm/raft/wal-v3-encrypted
/var/lib/docker/swarm/raft/wal-v3-encrypted/0000000000000000-0000000000000000.wal
/var/lib/docker/swarm/raft/wal-v3-encrypted/0.tmp
/var/lib/docker/swarm/raft/snap-v3-encrypted
/var/lib/docker/swarm/certificates
/var/lib/docker/swarm/certificates/swarm-node.crt
/var/lib/docker/swarm/certificates/swarm-root-ca.crt
/var/lib/docker/swarm/certificates/swarm-node.key
/var/lib/docker/swarm/worker
/var/lib/docker/swarm/worker/tasks.db
/var/lib/docker/swarm/state.json

As we can see from the output above, the encrypted logs are stored in the raft subfolder.

The folder /var/lib/docker/swarm/certificates contains the public and private keys as well as the public key of the certificate authority.

Decrypt and dump the logs

To make things easier, we create a shell script that first copies the swarm directory to a temporary location, suffixing the folder name with the current timestamp, and then uses the swarm-rafttool binary to dump the logs.

Note: without this copy step, we would get an error saying that the log file is locked.

# dump.sh
# Timestamp used to give each copy and each dump a unique name
d=$(date "+%Y%m%dT%H%M%S")
SWARM_DIR=/var/lib/docker/swarm
WORK_DIR=/tmp
DUMP_FILE="$WORK_DIR/dump-$d"
STATE_DIR="$WORK_DIR/swarm-$d"
# Copy swarm folder to another location (the live log file is locked)
cp -r "$SWARM_DIR" "$STATE_DIR"
# Get human readable version of wal logs
"$GOPATH/bin/swarm-rafttool" dump-wal --state-dir "$STATE_DIR" > "$DUMP_FILE"
# Print the path of the dump so it can be inspected
echo "$DUMP_FILE"

If we run this script, we get the following output that shows the setup of the swarm: creation of Raft store, setup of the certificate, creation of the manager and worker’s join token, election of the leader, setup of the network, …

The output is quite long (660+ lines of logs), but it’s interesting to see everything it contains at this stage, even if it’s hard to understand all the details of what is going on under the hood.

$ ./dump.sh
/tmp/dump-20170517T130624
Raft logs created when the swarm is initialized

Those logs are composed of 10 different entries (lines starting with Entry Index=…), each one with a dedicated Index value. The entry with Index:3 shows a STORE_ACTION_CREATE action, whereas all the following entries relate to a STORE_ACTION_UPDATE action.
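Counting the entries by hand in a long dump is tedious. A small sketch that relies only on the fact stated above, namely that each entry starts a line with Entry Index= (count_entries is a helper name of my own):

```shell
# Count the entries in a dump file (lines starting with "Entry Index=")
count_entries() {
  grep -c '^Entry Index=' "$1"
}

# Usage, with the dump generated above:
# count_entries /tmp/dump-20170517T130624
```

Running it again after each operation gives a quick idea of how many entries that operation added.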

Add a second manager

Before adding node2 as another Swarm manager, we can check that at this stage its /var/lib/docker/swarm folder is empty (no swarm-related data in there yet).

$ find /var/lib/docker/swarm/
/var/lib/docker/swarm/

We now add node2 to our Swarm as a manager.

$ docker swarm join \
--token SWMTKN-1-0t1umu0aihk80palcon5ezhykh6cpg253g6op3hta1mf98zr87-c4g1htxe8bvxka5etkife86o2 \
$NODE1_IP:2377

This node joined a swarm as a manager.
$ docker node ls
ID HOSTNAME STATUS AVAILABILITY MANAGER STATUS
lirh0kaycnqx954... * node1 Ready Active Leader
nvajg1fm52nd2hx... node2 Ready Active Reachable

If we dump the logs from node1 and then from node2, we should get exactly the same content, as they are constantly replicated from node1 to node2.

# On node1
$ ./dump.sh
/tmp/dump-20170517T131442
$ md5sum /tmp/dump-20170517T131442
617c05cf0c41ae9505a645c5a2be47a6 /tmp/dump-20170517T131442
# On node2
$ ./dump.sh
/tmp/dump-20170517T131502
$ md5sum /tmp/dump-20170517T131502
617c05cf0c41ae9505a645c5a2be47a6 /tmp/dump-20170517T131502

As we can see, the md5sum of the logs on node1 has the same value as the one calculated on node2, which implies that (normally) both logs are the same. Log replication ensures that another manager has all the information it needs to manage the cluster’s state if the current leader goes down.

From node1, by comparing the 2 dumps we have done so far, we can display only the part of the logs that shows the addition of node2.
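One possible way to do this comparison is with diff, keeping only the lines that the newer dump adds. This is a sketch of the idea rather than the exact command used for this article, and new_lines is a name of my own:

```shell
# Print only the lines present in the newer dump ($2) but not in the older one ($1).
# In diff's default output, lines coming from the second file are prefixed with "> ".
new_lines() {
  diff "$1" "$2" | sed -n 's/^> //p'
}

# Usage, with the two dumps taken on node1 so far:
# new_lines /tmp/dump-20170517T130624 /tmp/dump-20170517T131442
```

The same helper can be reused after each of the operations that follow.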

Logs created when the second manager is added to the swarm

Create a service

We now create a service from node1. This service is a simple HTTP server based on nginx, with its port 80 published on port 8080 of the Swarm routing mesh.

$ docker service create --publish 8080:80 nginx
5v6q08sfnnny4g6wb5ydrslna
Since --detach=false was not specified, tasks will be created in the background.
In a future release, --detach=false will become the default.

We can then create a dump of the logs and only check the part that is linked to the service we have created. The output below contains 8 more entries.

$ ./dump.sh
/tmp/dump-20170517T144048
Raft logs added when the service was created

From this output we can also see, among a lot of other information, that the service’s task goes through different states:

  • PENDING
  • ASSIGNED
  • PREPARING
  • RUNNING
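To pull this sequence out of a dump without reading every entry, something like the following can help. It is only a sketch: it assumes the state names appear literally in the dump text, and task_states is a name of my own.

```shell
# List the states in the order they first appear in a dump file.
# Assumption: the names PENDING, ASSIGNED, PREPARING and RUNNING show up
# as plain text in the dump.
task_states() {
  grep -o -E 'PENDING|ASSIGNED|PREPARING|RUNNING' "$1" | awk '!seen[$0]++'
}

# Usage, with the dump generated above:
# task_states /tmp/dump-20170517T144048
```

The awk filter removes repeated names while keeping the order of first appearance, which is what makes the progression readable.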

Create a secret

When a secret is created, it is encoded in the Raft logs. Let’s see that in action.

We create a secret named PASS with the value “mypass”.

$ echo "mypass" | docker secret create PASS -
3kgqwqakmf256vf8s9t1gxtlj
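One detail worth knowing: echo appends a newline, so the value stored by the command above is “mypass” followed by a newline character. A quick way to check this, with printf as a newline-free alternative (the last command is a hypothetical variant, shown as a comment and not run for this article):

```shell
# echo appends a trailing newline: 7 bytes instead of 6
echo "mypass" | wc -c

# printf '%s' writes exactly the 6 characters of the value
printf '%s' "mypass" | wc -c

# Hypothetical newline-free variant of the secret creation:
# printf '%s' "mypass" | docker secret create PASS -
```

This matters when an application compares the content of /run/secrets/PASS byte for byte with an expected password.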

Let’s dump the logs and only check the last part of the logs.

$ ./dump.sh
/tmp/dump-20170517T144859
Raft logs added when the secret was created

As the logs are encrypted in a Swarm, the secret seems to be securely stored. If specified in a service declaration, this secret will be provided to the service’s instances (that is, its containers) through a temporary filesystem (tmpfs) and will never be saved on disk. But what if a manager gets compromised and both the logs and the encryption keys are disclosed?

Lock the Swarm

In order to add an additional level of security, a swarm can be locked. This creates a new key used to encrypt the public/private keys used to encrypt/decrypt the logs.

Let’s look at the public and private keys before locking the swarm.

$ cat /var/lib/docker/swarm/certificates/swarm-node.crt
-----BEGIN CERTIFICATE-----
MIICNDCCAdugAwIBAgIUQgOUiFU3PoQZDdHn2saYwqtk/C4wCgYIKoZIzj0EAwIw
EzERMA8GA1UEAxMIc3dhcm0tY2EwHhcNMTcwNTE3MTIwMDAwWhcNMTcwODE1MTMw
MDAwWjBgMSIwIAYDVQQKExlyOTk1dTk4NGoxbmdzNHJxcTA1dTlzbDQ2MRYwFAYD
VQQLEw1zd2FybS1tYW5hZ2VyMSIwIAYDVQQDExlsaXJoMGtheWNucXg5NTQ2Nnoz
cHVzbzg5MFkwEwYHKoZIzj0CAQYIKoZIzj0DAQcDQgAEFRFmn2susRlbZR4H68vz
8WinZWLcCf/5P3ZvNL3vVqWDdC0LfkxE+RToMa58Drjc+S6dwXkos8O72K2MI4Sy
NqOBvzCBvDAOBgNVHQ8BAf8EBAMCBaAwHQYDVR0lBBYwFAYIKwYBBQUHAwEGCCsG
AQUFBwMCMAwGA1UdEwEB/wQCMAAwHQYDVR0OBBYEFPtaOjTpdD7U6m9njBjpYTA3
Qew0MB8GA1UdIwQYMBaAFJQnpAFYg8vpnuLSNgyBCuDJ9G6YMD0GA1UdEQQ2MDSC
DXN3YXJtLW1hbmFnZXKCGWxpcmgwa2F5Y25xeDk1NDY2ejNwdXNvODmCCHN3YXJt
LWNhMAoGCCqGSM49BAMCA0cAMEQCIHCRmgdBE7wHzLKZcFHVlTXxBmxkU8F0l1xP
sfcjPvZXAiARA6S3IhJTuC0DzCgATuAczTLkhp4opezDO6oGJppHiw==
-----END CERTIFICATE-----
$ cat /var/lib/docker/swarm/certificates/swarm-node.key
-----BEGIN EC PRIVATE KEY-----
kek-version: 0
raft-dek: EiDd4YVkRR2rZtV5ZL6XV/F8YGJNbnfhKbnyCuXe2Q08pw==
MHcCAQEEIK1pH0I9dBkzWS8zIPSxRj/4LxzLdsCM+4ZsttjOxxxaoAoGCCqGSM49
AwEHoUQDQgAEFRFmn2susRlbZR4H68vz8WinZWLcCf/5P3ZvNL3vVqWDdC0LfkxE
+RToMa58Drjc+S6dwXkos8O72K2MI4SyNg==
-----END EC PRIVATE KEY-----

With a Swarm already running, we can activate the lock feature with the following command.

$ docker swarm update --autolock=true
Swarm updated.
To unlock a swarm manager after it restarts, run the `docker swarm unlock`
command and provide the following key:
SWMKEY-1-y6NfvtRiP9e2BXuVSarZvaYKLfQc0qruFasI6LFjPS4
Please remember to store this key in a password manager, since without it you
will not be able to restart the manager.

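If we now look once again at the public and private keys with the swarm locked, we can see that the private key is now encrypted (note the Proc-Type and DEK-Info headers) and that the certificate has changed as well.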

$ cat /var/lib/docker/swarm/certificates/swarm-node.crt
-----BEGIN CERTIFICATE-----
MIICNDCCAdqgAwIBAgITD9n1D59LI2GAuzoevKR0dfcTkTAKBggqhkjOPQQDAjAT
MREwDwYDVQQDEwhzd2FybS1jYTAeFw0xNzA1MTgxOTM5MDBaFw0xNzA4MTYyMDM5
MDBaMGAxIjAgBgNVBAoTGXI5OTV1OTg0ajFuZ3M0cnFxMDV1OXNsNDYxFjAUBgNV
BAsTDXN3YXJtLW1hbmFnZXIxIjAgBgNVBAMTGWxpcmgwa2F5Y25xeDk1NDY2ejNw
dXNvODkwWTATBgcqhkjOPQIBBggqhkjOPQMBBwNCAARl7ylpk1eKEXXlYx4biAQP
Pd2il6/iR+rWgFFL9n7KWid008mFdL1ag7uf3twIDpBOWnczr617q0k6m4jXCt8x
o4G/MIG8MA4GA1UdDwEB/wQEAwIFoDAdBgNVHSUEFjAUBggrBgEFBQcDAQYIKwYB
BQUHAwIwDAYDVR0TAQH/BAIwADAdBgNVHQ4EFgQUAIjvjfK6bPU51yQ2/3fGy+qE
Rz8wHwYDVR0jBBgwFoAUlCekAViDy+me4tI2DIEK4Mn0bpgwPQYDVR0RBDYwNIIN
c3dhcm0tbWFuYWdlcoIZbGlyaDBrYXljbnF4OTU0NjZ6M3B1c284OYIIc3dhcm0t
Y2EwCgYIKoZIzj0EAwIDSAAwRQIhAN2SgxGwYGIDyDZnme33RGZRoZ4h3R2gpGEK
77+BqGLxAiB6UZvjlGDr8NW0dY9m0lGWDRe3ajyEIXz4tN3gA5yHig==
-----END CERTIFICATE-----
$ cat /var/lib/docker/swarm/certificates/swarm-node.key
-----BEGIN EC PRIVATE KEY-----
Proc-Type: 4,ENCRYPTED
DEK-Info: AES-256-CBC,068c8611cc82d72c10deec7c314d0d7c
kek-version: 27
raft-dek: CAESMIG4f26DpITRPiNALp1qQ9RzeYmvD7nTSTOk7vh4cB0L5B2wS8kNtbSZ3IDKL7qw1xoY48cUHslsIrfiu0m4HW+USoFA3xCJqRZk
AQna/0kSzsolBJmMyTtZFF0VV0FfktUaBDz/J1iaQEcjQsaOVhuscp9GUWHHr0pi
tRSAollVp/cOJ6MkPy+pKQEJqIFtzh2EJmfmX9b1FjNO43AtgDruYtnQaN6hn47z
DjgUyRW8Nj3DaQaRemqBVpdej/VZI+LO3Yw9CxwdLZk=
-----END EC PRIVATE KEY-----

From this point on, we need to provide the unlock key to swarm-rafttool each time we want to decrypt the logs, as locking adds an additional level of encryption. The dump no longer works unless the --unlock-key option is passed to swarm-rafttool.

$ ./dump.sh
Error: x509: decryption password incorrect
Usage:
/root/go/bin/swarm-rafttool dump-wal [flags]
Flags:
--end uint End of index range to dump
--start uint Start of index range to dump
Global Flags:
-d, --state-dir string State directory (default "/var/lib/swarmd")
--unlock-key string Unlock key, if raft logs are encrypted

To decrypt the logs, we change the dump.sh file so it looks like the following:

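# dump.sh
# Timestamp used to give each copy and each dump a unique name
d=$(date "+%Y%m%dT%H%M%S")
SWARM_DIR=/var/lib/docker/swarm
WORK_DIR=/tmp
DUMP_FILE="$WORK_DIR/dump-$d"
STATE_DIR="$WORK_DIR/swarm-$d"
# Unlock key printed when the swarm was locked (hard-coded for the example only)
UNLOCK_KEY="SWMKEY-1-y6NfvtRiP9e2BXuVSarZvaYKLfQc0qruFasI6LFjPS4"
# Copy swarm folder to another location (the live log file is locked)
cp -r "$SWARM_DIR" "$STATE_DIR"
# Get human readable version of wal logs
"$GOPATH/bin/swarm-rafttool" dump-wal --state-dir "$STATE_DIR" --unlock-key "$UNLOCK_KEY" > "$DUMP_FILE"
# Print the path of the dump so it can be inspected
echo "$DUMP_FILE"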

The logs can then be decrypted without any error message.

$ ./dump.sh
/tmp/dump-20170518T153543

Of course, this is just for the example: the unlock key should not be hard-coded in a script. It must be kept offline and entered manually when unlocking a swarm or for any other manipulation of the logs.
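A safer variant is to have the script ask for the key each time it runs. The helper below is a minimal sketch of that idea (read_unlock_key is a name of my own, not something provided by Docker or SwarmKit):

```shell
# Read the unlock key from standard input instead of storing it in the script.
# The prompt is shown and echo is turned off only when input comes from a terminal.
read_unlock_key() {
  if [ -t 0 ]; then
    printf 'Unlock key: ' >&2
    stty -echo
  fi
  IFS= read -r key
  if [ -t 0 ]; then
    stty echo
    echo >&2
  fi
  printf '%s\n' "$key"
}

# In dump.sh, the hard-coded assignment would then become:
# UNLOCK_KEY=$(read_unlock_key)
```

The key still ends up on the swarm-rafttool command line for the duration of the dump, so this only avoids leaving it on disk.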

Restart the manager

Each time the Docker daemon starts, the unlock key needs to be provided.

$ service docker stop
$ service docker start
$ docker swarm unlock
Please enter unlock key:
$

Summary

Quite a lot of logs in this post, but it’s interesting to have a decrypted version in order to try to understand the steps involved under the hood.

Hopefully, this helps to make Raft a little bit less obscure :)