Secure Credential Management on a Budget: DC/OS with HashiCorp’s Vault — Part 1 of 3

This tutorial is a basic guide on how to manually set up a production-level prototype of HashiCorp’s Vault (version 0.7.0) on your Debian-based DC/OS Community cluster. Bonus if you run Docker containers on your cluster!

What are you doing running DC/OS Community version? Maybe your organisation is an NGO that delivers critical services to its service demographics, but can’t really afford an Enterprise license (hint hint). Maybe your company finds the Community version good enough for its needs. Maybe you’re running your own pet cluster, which you don’t share with anyone.

This tutorial series is targeted at non-SRE people with a moderate degree of Linux knowledge. It is intended to help you with prototyping Vault on your system to see if it’s right for you. It uses Zookeeper as the storage backend. This is by no means the most secure or robust setup — do not actually use this in production. If you’re demonstrating this to your team at work, make sure you also have a production roadmap that actually addresses the miscellaneous security and maintenance concerns of the prototyping stage. Provided all goes well, your Vault architecture should look similar to the below diagram:

Figure 1: High-level architecture of basic Vault setup on DC/OS

If, by the time you are reading this paragraph, you have no idea what Vault is, how it works, or why you want to use it, make yourself a drink of your choice and read the intro and startup guide to Vault on the official site.

Lastly, these commands are tested on Debian Jessie servers. Although you should be able to adapt most (if not all) of these commands to your distro, consider this fair warning for the instructions that lie ahead. Do not mail me your boogers if they crash your Gentoo server.

Let’s get started.

Make Some Decisions

First, you’ll need to choose where you’d like to run your Vault server, and how many Vault nodes you’d like to run. Personally, I am uncomfortable running Vault as a container on the cluster because if you use Vault to manage the credentials of your cluster frameworks (e.g. Marathon, Mesos), you may run into an unpleasant deadlock situation if that cluster framework goes down. If Marathon relies on credentials managed by Vault, yet Marathon is the framework that actually runs Vault on the cluster…you get the point.

If you can guarantee to yourself that you’ll never encounter use cases like this, you can launch the Vault server as a cluster container.

How many Vault nodes do you want to run? Vault nodes are unsealed instances of Vault. Regardless of how many Vault nodes you’ve set up, only one will be active at any given time. The rest of the Vault nodes are standbys that will proxy any HTTP requests to the active node. If the active node fails, one of the standby nodes is chosen to become the new active node. Note that this does not actually help with the scalability of your storage backend (more on that later).

[Unless otherwise specified, the sudo command is used for any commands that require root privileges; i.e. you’ll need root permission on your target machine to follow the instructions here.]

Setting up Vault and Zookeeper

We’ve chosen Zookeeper as our Vault storage backend. The advantages of Zookeeper are that it is a supported component of the DC/OS ecosystem, and that it provides highly-available storage through redundancy. It’s also the storage backend used in DC/OS Enterprise’s own secret management service, which is based on Vault.

Install Required Packages on Vault Nodes

Download the latest Vault package for your system into a temporary location:

$ cd /tmp
$ curl -O https://releases.hashicorp.com/vault/0.7.0/vault_0.7.0_linux_amd64.zip
$ curl -O https://releases.hashicorp.com/vault/0.7.0/vault_0.7.0_SHA256SUMS
$ curl -O https://releases.hashicorp.com/vault/0.7.0/vault_0.7.0_SHA256SUMS.sig

The above instructions are for 64-bit Linux systems. As of time of writing, the latest Vault release was version 0.7.0. Head to the Vault downloads page to locate the correct package for your needs. Replace all package names and checksum names in any following commands as needed.
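
If you expect to swap versions or architectures, it may help to build the three download URLs from one place instead of editing each command by hand. The following is only a sketch: the function name vault_release_urls is made up for this tutorial, and it assumes the release site keeps the URL layout used in the commands above.

```shell
# vault_release_urls VERSION [ARCH]
# Prints the package, checksum and signature URLs for one Vault release.
# The function name is hypothetical; the URL layout is copied from the
# curl commands above and is assumed to hold for other versions too.
vault_release_urls() {
    version="$1"
    arch="${2:-linux_amd64}"
    base="https://releases.hashicorp.com/vault/${version}"
    printf '%s\n' \
        "${base}/vault_${version}_${arch}.zip" \
        "${base}/vault_${version}_SHA256SUMS" \
        "${base}/vault_${version}_SHA256SUMS.sig"
}
```

Once the function is defined in your shell, running vault_release_urls 0.7.0 prints the same three URLs that are fetched above, one per line, ready to be passed to curl -O.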

Import HashiCorp’s RSA public key into your GPG keyring:

$ gpg --keyserver ha.pool.sks-keyservers.net --recv-keys 91A6E7F85D05C65630BEF18951852D87348FFC4C

Now verify your downloaded package:

$ gpg --verify vault_0.7.0_SHA256SUMS.sig
$ gpg --verify vault_0.7.0_SHA256SUMS.sig vault_0.7.0_SHA256SUMS
$ sha256sum -c <(grep vault_0.7.0_linux_amd64.zip vault_0.7.0_SHA256SUMS)

After you’ve downloaded and verified the Vault package, extract it and move the vault binary to /usr/local/bin. Create a symlink to the binary in /usr/bin for your convenience:

$ unzip vault_0.7.0_linux_amd64.zip
$ sudo mv vault /usr/local/bin
$ sudo ln -s /usr/local/bin/vault /usr/bin

Next, install LetsEncrypt’s Certbot. Certbot is a LetsEncrypt (LE) utility that helps you fetch and deploy TLS certificates on your webservers:

#Add the Jessie-backports repository to your sources.list
$ echo 'deb http://ftp.debian.org/debian jessie-backports main' | sudo tee /etc/apt/sources.list.d/backports.list
#Update your local package index
$ sudo apt-get update
#Install LE's certbot
$ sudo apt-get install python-certbot -t jessie-backports

Generate Your TLS Certificates

You’ll notice in the Basic Vault Tutorial that it tells you to disable TLS for the Vault API. In a production scenario, you would ideally serve connections to the Vault API over HTTPS, since you’ll be handling sensitive requests that could contain access tokens, credentials, and other secrets.

Here, we’ll use Let’s Encrypt as our CA and ask it for some certificates. Unfortunately, this does present a couple of security issues (see Miscellaneous Security Concerns section in Part 3), but the point of this tutorial is to demonstrate the basic workflow for a budget Vault setup. If you have a cluster-internal CA, use that instead.

Run Certbot in standalone mode to fetch the certificates:

$ sudo certbot certonly --standalone -d [domain name for your host]

Note that you may need to open port 80/443 to the public internet in order for the LE server to reach you. Make sure that these ports are firewalled to only allow ingress from internal hosts plus the LE server.

You should end up with a handful of files in the default LE path for certificates: /etc/letsencrypt/live/[your domain]/. Double-check that you have all four of the following: cert.pem, chain.pem, fullchain.pem, privkey.pem.
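
The double-check can be scripted if you are repeating it on several nodes. This is a minimal sketch: check_le_files is a hypothetical helper name, and it only tests that the four files exist, not that their contents are valid.

```shell
# check_le_files DIR
# Returns 0 if all four expected Let's Encrypt files exist in DIR,
# otherwise prints each missing path to stderr and returns 1.
# (Hypothetical helper; it does not validate the certificates themselves.)
check_le_files() {
    dir="$1"
    missing=0
    for name in cert.pem chain.pem fullchain.pem privkey.pem; do
        if [ ! -e "$dir/$name" ]; then
            echo "missing: $dir/$name" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

The live directory is typically readable by root only, so you would define and call this from a root shell, passing /etc/letsencrypt/live/[your domain] as the argument.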

Make symlinks of these files in your SSL Vault directory for convenience:

$ sudo mkdir /etc/ssl/vault
$ sudo sh -c 'ln -s /etc/letsencrypt/live/[your domain]/*pem /etc/ssl/vault'

Make a combined PEM certificate in your Vault SSL directory:

$ sudo sh -c 'cat /etc/ssl/vault/fullchain.pem /etc/ssl/vault/privkey.pem > /etc/ssl/vault/fullcert.pem'

This combined certificate can be used to verify the issuer of the certificate and encrypt HTTP connections in one fell swoop. Note that you may have to recreate this combined certificate whenever your certificate is renewed.
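
Since the combined file has to be rebuilt after each renewal, it can be worth wrapping the step in a small function that you call from whichever renewal hook your Certbot version supports. The sketch below assumes the combined file is the full chain followed by the private key; build_fullcert is a hypothetical name, and the temporary file is there so that a failed rebuild does not clobber the existing fullcert.pem.

```shell
# build_fullcert SSL_DIR
# Concatenates SSL_DIR/fullchain.pem and SSL_DIR/privkey.pem (in that order)
# into SSL_DIR/fullcert.pem. Writes to a temporary file first, so the old
# combined file is left untouched if either input is missing.
build_fullcert() {
    ssl_dir="$1"
    tmp="$ssl_dir/fullcert.pem.tmp"
    # The subshell keeps the restrictive umask from leaking into the caller.
    if ( umask 077; cat "$ssl_dir/fullchain.pem" "$ssl_dir/privkey.pem" > "$tmp" ); then
        mv "$tmp" "$ssl_dir/fullcert.pem"
    else
        rm -f "$tmp"
        return 1
    fi
}
```

On the setup in this tutorial you would run build_fullcert /etc/ssl/vault as root. Vault reads its certificate files at startup, so expect to restart or reload the server before the renewed certificate is served.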

You’ll need to update your system to recognise LetsEncrypt as a trusted Certificate Authority (CA) if it doesn’t already trust a CA in its certificate chain. To do this, create a new directory to store the LetsEncrypt root certificate:

$ sudo mkdir /usr/share/ca-certificates/letsencrypt.org

Now download LE’s root certificate into the newly-created directory:

$ sudo curl -o /usr/share/ca-certificates/letsencrypt.org/isrgrootx1.pem "https://letsencrypt.org/certs/isrgrootx1.pem"

Add a reference to the new certificate in /etc/ca-certificates.conf by adding the line letsencrypt.org/isrgrootx1.pem to the bottom of the file:

$ echo 'letsencrypt.org/isrgrootx1.pem' | sudo tee -a /etc/ca-certificates.conf

Finally, update your root CA certificates:

$ sudo update-ca-certificates

You will need to do this (i.e. set up Vault, install certificates) for every Vault node you want to run in your Vault instance. If you’re following this tutorial as an SRE, congratulations on your added advantage of being able to automate these setup steps with Puppet.

Zookeeper Access Control

You’ll need to set up some basic access controls on your Zookeeper instances. To prevent confusion, zNodes will be used to refer to Zookeeper storage paths, and Zookeeper Nodes will be used to refer to the Zookeeper server nodes themselves.

Choose a username and password for your Vault user on Zookeeper. Store this in a safe place. We will use the built-in Zookeeper auth method digest. According to the Zookeeper programmer’s guide:

digest uses a username:password string to generate MD5 hash which is then used as an ACL ID identity. Authentication is done by sending the username:password in clear text. When used in the ACL the expression will be the username:base64 encoded SHA1 password digest.

As you can tell, this really isn’t the safest authentication method out there. I find Zookeeper ACL’s security models quite perplexing. Feel free to swap this out with a stronger authentication method if you’re comfortable, but the rest of this section will proceed with this method.

Generate the SHA1 hash in Zookeeper’s desired format:

$ echo -n [username]:[password] | openssl dgst -binary -sha1 | openssl base64

Keep this in a safe place.
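
The ACL and the Vault config later on both want the value in the form username:digest, so a tiny wrapper that prints the pair in one go can save some copy-paste mistakes. This is a sketch built on the same OpenSSL pipeline as above; zk_digest is a hypothetical name, and printf is used instead of echo -n only because its behaviour is more consistent across shells.

```shell
# zk_digest USERNAME PASSWORD
# Prints USERNAME:<base64 of SHA1("USERNAME:PASSWORD")>, which is the
# identity part of a Zookeeper digest ACL. Hypothetical helper name.
zk_digest() {
    user="$1"
    pass="$2"
    hash="$(printf '%s' "$user:$pass" | openssl dgst -binary -sha1 | openssl base64)"
    printf '%s:%s\n' "$user" "$hash"
}
```

Bear in mind that passing the password as an argument puts it in your shell history and process list, so treat this as a prototyping convenience only.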

Now, SSH into any host that has access to any Zookeeper node, and connect to the Zookeeper CLI:

$ sudo docker run --rm -it zookeeper zkCli.sh -server [any Zookeeper server IP]

You can use any Zookeeper server IP, as any requests to Zookeeper follower nodes will be proxied to the leader. Note that Zookeeper will typically reside on your cluster controllers.

Now that you’re connected to the Zookeeper CLI, create the persistent zNode that will be used to store your Vault data:

[zk] create /vault

Now, add the authentication information for your planned Vault user to your current Zookeeper CLI session. The addauth command takes only a scheme and the credentials, not a zNode path:

[zk] addauth digest [username]:[password]

Note that you’re using the plaintext version of your username and password in this command.

Apply the ACL to your zNode. Here, we are allowing the Vault user read, write, create, delete, and admin permissions on the /vault zNode:

[zk] setAcl /vault digest:[username]:[base64-encoded SHA1 digest of username:password]:rwcda

I don’t feel the Zookeeper programmer’s guide does a good job of explaining what exactly you’re supposed to put in your arguments when setting a digest ACL. To clarify, if your username is vault and your password is password, you would have generated the base64-encoded SHA1 hash Rehuy9OJQRY041dWCywMg/IaGwg= using the OpenSSL command earlier in this section. Your Zookeeper CLI command to set the ACL for that user would be:

[zk] setAcl /vault digest:vault:Rehuy9OJQRY041dWCywMg/IaGwg=:rwcda

Please don’t use these example credentials in a production system.

Writing Your Vault Server Config(s)

Congratulations! Your initial setup is pretty much done. Now it’s time to configure your actual Vault nodes.

The environment variable that needs to be set on each Vault node is VAULT_ADDR. Set it to the (hostname-based) address for your Vault instance on each node. As far as I know, the VAULT_ADDR environment variable will override any file-based Vault configs for that parameter.

$ echo "VAULT_ADDR='https://your-vault-server[1-2].org:8200'" | sudo tee -a /etc/environment

The environment changes will be effective on your next tty session.

You can configure most of the other parameters for Vault server in environment variables as well, or you can do this in a configuration file in Vault’s JSON-compatible HCL format. Environment variables will take precedence over a configuration file. Below, I’ve provided the configuration files for a two-node Vault instance with a Zookeeper backend, without a load balancer. These should be saved as init.hcl:

Node 1

storage "zookeeper" {
  address = "[comma-separated list of Zookeeper nodes]"
  path = "/vault"
  znode_owner = "digest:[znode user]:[SHA1 hash of <username:password> string]"
  auth_info = "digest:[znode user]:[cleartext password]"
  redirect_addr = "https://your-vault-server1.org:8200"
}
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_cert_file = "/etc/ssl/vault/fullcert.pem"
  tls_key_file = "/etc/ssl/vault/privkey.pem"
}

Node 2

storage "zookeeper" {
  address = "[comma-separated list of Zookeeper nodes]"
  path = "/vault"
  znode_owner = "digest:[znode user]:[SHA1 hash of <username:password> string]"
  auth_info = "digest:[znode user]:[cleartext password]"
  redirect_addr = "https://your-vault-server2.org:8200"
}
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_cert_file = "/etc/ssl/vault/fullcert.pem"
  tls_key_file = "/etc/ssl/vault/privkey.pem"
}

As is fairly obvious, the only difference between the two configuration files is the redirect_addr argument. This parameter is the address that each Vault node advertises to other Vault nodes for the purpose of proxying requests to the active node.

Because both Vault nodes share the same storage backend, the addresses and authentication to the Vault zNode stay the same.
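
Because only one value changes per node, you could render both files from a single template rather than maintaining two copies by hand. The following is a rough sketch: render_vault_config and the ZK_ADDRESS, ZK_OWNER and ZK_AUTH variables are names invented for this example, and the stanzas simply mirror the configuration shown above.

```shell
# render_vault_config REDIRECT_ADDR
# Prints an init.hcl body to stdout. Only redirect_addr varies per node.
# ZK_ADDRESS, ZK_OWNER and ZK_AUTH are hypothetical variables that must be
# set by the caller to the values described in the config above.
render_vault_config() {
    redirect_addr="$1"
    cat <<EOF
storage "zookeeper" {
  address = "${ZK_ADDRESS}"
  path = "/vault"
  znode_owner = "${ZK_OWNER}"
  auth_info = "${ZK_AUTH}"
  redirect_addr = "${redirect_addr}"
}
listener "tcp" {
  address = "0.0.0.0:8200"
  tls_cert_file = "/etc/ssl/vault/fullcert.pem"
  tls_key_file = "/etc/ssl/vault/privkey.pem"
}
EOF
}
```

With the three variables set, render_vault_config https://your-vault-server1.org:8200 prints the Node 1 file and the same call with the second hostname prints the Node 2 file. The output contains the cleartext Zookeeper password, so redirect it into a file with restrictive permissions.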

These configuration files are compatible with the setup described in the previous section. I choose to store these in the /etc/vault directories on each Vault node.

Side note: You can prevent sensitive shit from being saved in your bash history by setting the HISTIGNORE variable in your user(s)’ ~/.bashrc file, like so:

$ echo "HISTIGNORE='vault *'" >> ~/.bashrc

This should be in the ~/.bashrc file of a user if you intend to use that account to connect to the Vault CLI — so ideally, all your Vault unseal keyholders’ accounts (see the section Unsealing Your Vault Nodes).

Running Vault

Start Your Vault Nodes

Start your Vault server and detach it from your tty session:

$ sudo bash -c 'vault server --config=[path/to/init.hcl] & disown'

Congratulations! Your Vault nodes have been started. You can query their status by typing $ vault status .

If you’re keen, you’ll notice that we’re running Vault as a root-equivalent user. This is, unfortunately, something that is required if you want to run Vault without leaking unsealed Vault data to swap. I will update this article with more information as soon as it’s possible to set individual syscall capabilities to specific users. For now, you can read more about this slightly unhappy situation here.

If this is the first time you run your Vault instance, you will need to initialise your instance. The initialisation step will generate the root token for your Vault instance, as well as the unseal keys. The below command will generate 5 unseal keys, any 3 of which are needed to unseal your Vault instance:

$ vault init -key-shares=5 -key-threshold=3

Feel free to tweak these numbers to your liking. This will spit out the plaintext unseal keys and the root token. Note that this is the ONLY time you get to see those pieces of information in cleartext, so make sure you store them in a safe location for later use.

If you prefer to integrate your unseal key generation with PGP encryption, you can do it the easy way by ensuring your keyholders are registered on Keybase and running this command instead:

$ vault init -key-shares=5 -key-threshold=3 \
-pgp-keys="keybase:[user1],keybase:[user2],keybase:[user3],keybase:[user4],keybase:[user5]"

This command will spit out unseal keys that are encrypted with the specified Keybase users’ PGP public keys. Give them to the target keyholders.

Alternatively, if you want to do this the hard way, you can manually import the target keyholders’ keys into the GPG client on your Vault host(s), export the keys as .asc files and run the following command:

$ vault init -key-shares=5 -key-threshold=3 \
-pgp-keys="[key1].asc,[key2].asc,[key3].asc,[key4].asc,[key5].asc"

For more information on using Vault with PGP, visit their tutorial here.

Unsealing Your Vault Nodes

Every Vault node in your Vault instance starts off in a sealed state. That includes the first time you launch and initialise your Vault instance: it stays sealed after init until enough unseal keys have been supplied.

Unsealing is straightforward: simply get your keyholders to run the $ vault unseal [decrypted unseal key] command on the Vault hosts with the unseal keys in their possession. Vault will keep track of how many successful unseal attempts have been made and automatically unseal once the threshold number has been reached.

Authenticating Your Vault Server Session

Vault supports several means of authenticating to its CLI. The below diagram shows where the authentication backend is located in Vault’s core architecture, and provides a list of currently available auth backends:

Figure 2: Vault authentication backends

The official Vault architecture diagram refers to the auth backend as the credential backend. I have changed it in my own version of the diagram, as the term auth backend is used in the text-based tutorials for the project.

The token backend is the only one that is enabled by default. We’ll authenticate to Vault with the root token that was provided to us during init:

$ vault auth [root token value]

Note that this command will cache the raw token value to ~/.vault-token , which should be only accessible to the user from which the auth command was issued. Future Vault sessions will use this token to authenticate any Vault CLI commands. If you do not like this behaviour, you will need to manually delete the ~/.vault-token file when you’re done with your session. See here for a discussion of this behaviour by the Vault developer group.
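
If you would rather not rely on remembering the manual cleanup, a small helper can do it for you. This is a sketch that assumes the token lives at the default ~/.vault-token path mentioned above; forget_vault_token is a hypothetical name.

```shell
# forget_vault_token
# Removes the token cached by `vault auth` for the current user.
# Assumes the default location ~/.vault-token; safe to call when the
# file does not exist.
forget_vault_token() {
    rm -f "${HOME}/.vault-token"
}
```

You can call it by hand at the end of a session, or register it with trap forget_vault_token EXIT in the shell you use for Vault work so that it runs when that shell exits.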

Switch On Syslog Audit Backend

Once your Vault instance is fully up and running, you can switch on your preferred audit backend. Vault allows you to enable multiple audit backends. Once any audit backend is enabled, a request will fail if Vault is unable to log it to at least one of them. Enable the Syslog backend via the CLI:

$ vault audit-enable syslog tag="vault" facility="AUTH"

Kindly stay tuned for the next part, where we’ll use some manual commands to set up Vault’s PostgreSQL secret backend for automated credential management.
