Deploy a Production Ready Vault Cluster on AWS in ~5 Minutes

Sean Carolan
10 min read · May 1, 2020


What is this Vault thing, anyway?

HashiCorp Vault is a multi-cloud, API-driven secrets management system. You can use it to store passwords, keys and certificates. Vault can also handle many kinds of encryption and credentials management. You can use Vault to dispense temporary cloud credentials, encrypt sensitive data like credit card numbers, or to manage SSL certificates for your applications. You can find a complete list of Vault Secrets Engines on the Vault Project website.

Think of it as a Swiss Army knife for secrets management.

The Wenger Giant comes with a whopping *87* different tools

Sounds cool, right? But how do I get this nifty multipurpose secrets engine installed into my cloud account? This is not a simple process because there are quite a few moving parts. Vault is a highly available application, meaning that it runs on a cluster of machines. If one or two of those machines fail, the cluster is designed to stay up and running. A properly configured Vault cluster should be able to withstand a natural disaster like a tornado or meteor strike. Well, maybe a small meteor strike.

Highly Available and Disaster Resistant

Here’s an example from our reference architecture that shows a typical garden variety Vault cluster running across three separate zones:

Open Source HashiCorp Vault Cluster with Consul Storage Backend

You might be curious why we have eight separate machines in the Vault cluster. Three of them are Vault servers, and there are five Consul servers on the backend that serve as our storage device. You can think of those Consul servers as a low-latency distributed storage disk. A little bit like a SAN or RAID array. The basic idea is to distribute the data across multiple locations so that if any two of them fail, your cluster will still remain operational. All the data is encrypted, but we store it in five separate locations just in case part of the cluster becomes unavailable.
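The "any two of them" figure follows from how Consul servers agree on writes: they use a consensus protocol that needs a majority of servers to be reachable. Here is a minimal sketch of that arithmetic. The function names are ours, purely for illustration.

```python
def quorum_size(servers: int) -> int:
    """Smallest majority of a cluster with `servers` voting members."""
    return servers // 2 + 1


def failure_tolerance(servers: int) -> int:
    """How many servers can fail while a majority is still reachable."""
    return servers - quorum_size(servers)


if __name__ == "__main__":
    for n in (3, 5):
        print(f"{n} servers: quorum {quorum_size(n)}, "
              f"tolerates {failure_tolerance(n)} failure(s)")
```

With five servers the majority is three, which is why two can be lost. Note that adding a sixth server would not buy any extra tolerance, which is one reason odd cluster sizes are the norm.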

Besides the eight cloud instances required to run the standard production architecture Vault cluster, there are several other parts that need to be configured. A valid SSL certificate is required for secure communication, and we also need a load balancer in front of the Vault cluster to route our traffic to the cluster nodes.

NOTE: If you want a more compact cluster that still offers high availability, check out the new integrated storage option for Vault. This allows you to run a three-node Vault cluster that can tolerate the loss of a single node. This feature is in beta at the time of this writing.

Complex Doesn’t Have to be Complicated

That’s a lot of complexity. When you’re solving complex problems (such as managing your secrets safely on someone else’s network), sometimes you need a complex solution. That doesn’t mean installing Vault has to be complicated. You can use Infrastructure as Code to distill all the build steps into a simple document that defines the entire environment.

You’re probably thinking to yourself…why aren’t we using Terraform to do this? Terraform is an infrastructure as code tool and language that allows you to build cloud infrastructure on any platform. You can even install Vault with Terraform. But not every shop uses Terraform. You might already be using CloudFormation for your other infrastructure. The great thing about HashiCorp tools is you can use them separately or together. In other words, you don’t need any Terraform expertise to get up and running with Vault.

Easy Automated Deployment of Vault

This blog post is for new and intermediate AWS users who want to get up and running quickly and securely with HashiCorp Vault, with minimal effort and setup time. We have built an AWS CloudFormation template that builds a reference architecture Vault cluster from start to finish with only a few inputs required by the user. Wherever possible we have used AWS native services such as Route 53 DNS, Key Management Service (KMS), Secrets Manager and AWS Certificate Manager. The template and Packer scripts for building your AMIs can be found here:

https://github.com/scarolan/vault-aws-cf/

Here’s a quick overview of what gets built:

  • VPC with 3 public and 3 private subnets
  • Operating system for Vault and Consul is CentOS 7
  • Operating system for the Bastion host is Amazon Linux (latest)
  • 3 Vault servers and 5 Consul servers distributed across the private subnets
  • A bastion host for connecting to the other servers, which are not directly accessible from the Internet
  • A real SSL certificate tied to your FQDN, managed by AWS Certificate Manager
  • Automatic unsealing of Vault using AWS Key Management Service to protect the unseal key
  • The Vault cluster will be ready in 10–15 minutes. The cluster comes up in an uninitialized state. The API listens on port 8200 and is accessible from the Internet.
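To get a feel for why the servers are spread across three private subnets, here is a small sketch that places them round-robin and checks what happens if one whole zone goes away. The subnet labels and the round-robin placement are assumptions made for illustration. Check the template itself for the actual layout.

```python
from collections import Counter
from itertools import cycle

# Hypothetical subnet labels; the real ones come from the three
# availability zones you choose when launching the stack.
PRIVATE_SUBNETS = ["private-a", "private-b", "private-c"]


def spread(count, subnets):
    """Assign `count` servers to subnets round-robin and tally per subnet."""
    zones = cycle(subnets)
    return Counter(next(zones) for _ in range(count))


def survives_zone_loss(placement):
    """True if losing the busiest subnet still leaves a strict majority."""
    total = sum(placement.values())
    remaining = total - max(placement.values())
    return remaining > total // 2


consul = spread(5, PRIVATE_SUBNETS)  # two, two, and one server per subnet
vault = spread(3, PRIVATE_SUBNETS)   # one Vault server per subnet

if __name__ == "__main__":
    print("Consul placement:", dict(consul))
    print("Vault placement:", dict(vault))
    print("Consul keeps a majority after losing a zone:",
          survives_zone_loss(consul))
```

Under this placement no zone holds more than two Consul servers, so losing any single zone still leaves at least three of the five.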

Installation Prerequisites

There are a couple of prerequisites that you’ll need to use this template. The first is building your source AMIs. HashiCorp has another great tool called Packer that allows you to easily build custom AMIs with your own software and configuration on them. If you’ve never used Packer before, go take it for a spin and build your first image. The Packer templates will build one AMI for Vault, and a second AMI for your Consul storage backend. We won’t cover Packer details in this blog post.

The second prerequisite is a DNS zone hosted in AWS Route 53. If you purchase your domain name from AWS, they will also handle renewals automatically for you. The Route 53 zone is what allows you to automatically generate DNS host names and SSL certificates. In this blog post we’ll use fto.hashidemos.io as our DNS zone in all the examples.

Once you’ve got your AMIs built with Packer, and a domain or subdomain in a Route 53 Zone, you can use the CloudFormation template. Make sure your AMIs are configured in the Mappings section of the template. If you intend to build Vault clusters in different regions, you’ll need to build Packer AMIs for each region. For our example below we’ll be using the us-east-1 region.
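If the Mappings section is new to you, it is essentially a nested lookup table keyed by region. The sketch below mimics that lookup in Python so you can see why a missing region fails. The mapping name, key names, and AMI IDs are placeholders, not the values from the real template.

```python
# Shape of a CloudFormation "Mappings" section, expressed as a Python dict.
# "RegionMap", "VaultAMI", "ConsulAMI" and the AMI IDs are placeholders.
MAPPINGS = {
    "RegionMap": {
        "us-east-1": {
            "VaultAMI": "ami-VAULT-PLACEHOLDER",
            "ConsulAMI": "ami-CONSUL-PLACEHOLDER",
        },
    }
}


def find_in_map(mappings, map_name, top_key, second_key):
    """Imitate a two-level map lookup and fail loudly on a missing entry."""
    try:
        return mappings[map_name][top_key][second_key]
    except KeyError as missing:
        raise LookupError(
            f"no entry for {map_name}/{top_key}/{second_key}; "
            "build an AMI for this region and add it to the mapping"
        ) from missing


if __name__ == "__main__":
    print(find_in_map(MAPPINGS, "RegionMap", "us-east-1", "VaultAMI"))
```

Asking for a region that has no entry, say us-west-2, raises an error here, which is the same reason the stack cannot launch in a region you have not built AMIs for.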

Build the Cluster via the AWS Console

The rest of the steps can be completed on the AWS Console. Let’s walk through them. First you’ll log onto the AWS console and browse to the Route 53 controls. Find the Hosted Zone ID for the zone you want to use with your Vault cluster. Make note of it as you’ll need it in a moment.

Save the Hosted Zone ID for later

Next, head over to the CloudFormation settings and click on Create Stack. We already have a template so you can select Template is Ready. You can store your template in Amazon S3, or you can upload it from your machine. If you’re uploading from your local machine, browse to the aws_vault_cf.yml file and upload it into the console:

Upload Your CloudFormation Template

Click Next and fill in the required parameters. First you’ll need to give your CloudFormation stack a name. This name shows up in the console and can only have letters, numbers, and dashes.
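If you generate stack names from a script, a quick check like the one below can catch a bad name before the console does. Besides the letters, numbers, and dashes rule, it also assumes the documented AWS constraints that the name starts with a letter and is at most 128 characters long. The function name is ours.

```python
import re

# One leading letter, then up to 127 letters, digits, or dashes.
_STACK_NAME = re.compile(r"[A-Za-z][A-Za-z0-9-]{0,127}")


def is_valid_stack_name(name: str) -> bool:
    """Return True if `name` looks acceptable as a stack name."""
    return _STACK_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("vault-demo", "vault_demo", "1vault"):
        print(candidate, is_valid_stack_name(candidate))
```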

Set a hostname for your Vault cluster. This must be the entire, fully-qualified domain name. Example: vaultdemo.fto.hashidemos.io

Next, choose three availability zones from the drop-down list. It doesn’t matter which three you choose, but you have to pick three, no more, no less.

Select an SSH key from the drop-down list. This is in case you need to SSH into the bastion host, which is the only way to remotely connect to your backend machines.

Finally select the correct Route 53 zone from the drop-down list. This must match the FQDN you used for your Vault cluster!

Example Configuration for Vault Cluster.
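The two easiest mistakes on this form are picking the wrong number of zones and choosing a Route 53 zone that doesn’t contain your hostname. Here is a rough sketch that checks both and returns the values in the key/value list shape the CloudFormation API uses for stack parameters. The parameter names, the zone ID, and the key pair name are made up for illustration. Use the names defined in the template.

```python
def build_parameters(hostname, zone_name, zone_id, availability_zones, ssh_key_name):
    """Validate the form inputs and return them as a stack parameter list."""
    host = hostname.rstrip(".").lower()
    zone = zone_name.rstrip(".").lower()
    # The FQDN must live inside the chosen Route 53 zone.
    if not host.endswith("." + zone):
        raise ValueError(f"{hostname} is not inside the zone {zone_name}")
    # Exactly three distinct zones: no more, no less.
    if len(availability_zones) != 3 or len(set(availability_zones)) != 3:
        raise ValueError("pick exactly three distinct availability zones")
    values = {
        "VaultHostname": host,          # hypothetical parameter name
        "HostedZoneId": zone_id,        # hypothetical parameter name
        "AvailabilityZones": ",".join(availability_zones),  # hypothetical
        "KeyName": ssh_key_name,        # hypothetical parameter name
    }
    return [{"ParameterKey": k, "ParameterValue": v} for k, v in values.items()]


if __name__ == "__main__":
    for parameter in build_parameters(
        "vaultdemo.fto.hashidemos.io",
        "fto.hashidemos.io",
        "Z0000000EXAMPLE",  # placeholder Hosted Zone ID
        ["us-east-1a", "us-east-1b", "us-east-1c"],
        "my-ssh-key",       # placeholder key pair name
    ):
        print(parameter)
```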

That’s it! All the main setup steps are done. Click Next to continue and add any optional tags to your stack. The rest of the settings may be left at their defaults.

Before you hit the Create Stack button you’ll need to check this box to agree that this template will be allowed to create IAM resources. This is because our template creates roles that allow your Vault and Consul instances to talk to AWS services such as Secrets Manager and KMS. You can inspect the roles in the CloudFormation template if you want to see what types of permissions are granted.

Check that box and click “Create Stack” to start building your Vault cluster
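If you are curious which resources trigger that checkbox, you can scan a template for resource types in the AWS::IAM namespace. The sketch below works on a template that has already been loaded into a dict. The logical resource names in the sample are invented and are not the ones in the real template.

```python
# A tiny stand-in for a parsed template; the logical names are invented.
SAMPLE_TEMPLATE = {
    "Resources": {
        "VaultInstanceRole": {"Type": "AWS::IAM::Role"},
        "VaultInstanceProfile": {"Type": "AWS::IAM::InstanceProfile"},
        "VaultLoadBalancer": {"Type": "AWS::ElasticLoadBalancingV2::LoadBalancer"},
    }
}


def iam_resources(template):
    """Return the logical names of every IAM resource in the template."""
    resources = template.get("Resources", {})
    return sorted(
        name
        for name, resource in resources.items()
        if resource.get("Type", "").startswith("AWS::IAM::")
    )


if __name__ == "__main__":
    print(iam_resources(SAMPLE_TEMPLATE))
```

When a stack is created through the API or CLI instead of the console, the same acknowledgement is given by passing a capability such as CAPABILITY_IAM, or CAPABILITY_NAMED_IAM when the IAM resources have custom names.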

There’s one last step you’ll need to do to finish building the Vault cluster. It only has to be done once per domain name. Head on over to the AWS Certificate Manager page and you’ll see your new domain name with a Pending Validation status. Click the small arrows to reveal the blue Create record in Route 53 button. Click the button to create a DNS record to verify your SSL certificate. Subsequent rebuilds that use the same domain name will go faster because they automatically reuse the already verified DNS record.

Verify Ownership of your Domain Name

Now you can go back to the CloudFormation page and watch your build finish. It can take up to 30 minutes to complete. Now, I know what you are thinking…“You said 5 minutes!”

Before you grab your torches and pitchforks, this only happens on the first build with a new DNS name and ACM certificate. Once the DNS record for your SSL certificate has been created and verified, subsequent rebuilds of the cluster go much faster as long as you use the same name. There’s nothing we can do about this initial setup delay. You may wish to do this step in advance for clusters that need to be up and running fast, for example as part of a disaster recovery plan. During testing we timed the CloudFormation build, which took just a little over five minutes:

Ok, so we rounded down a little bit.

If you click on the Resources tab for your CloudFormation stack you can watch each of the different parts of your Vault cluster being built. There are 72 separate components defined in the template and they are all built in order. First CloudFormation lays down the network infrastructure, then it builds out your certificates, virtual machines, DNS record and load balancer.

When the stack is done building you’ll see the status change to CREATE_COMPLETE. This means that all the core infrastructure has been built, but your Vault cluster may not be ready for traffic yet. You can click on the Outputs tab to see the public URL of your Vault cluster:

A Wild Vault Cluster Appears!
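If you would rather script this check than watch the console, the same information is available from the CloudFormation API. The sketch below interprets a response shaped like the one the DescribeStacks call returns. It makes no AWS calls itself, and the output key and URL in the sample are assumptions.

```python
def stack_phase(status: str) -> str:
    """Classify the status of a stack that is being created for the first time."""
    if status == "CREATE_COMPLETE":
        return "done"
    if status == "CREATE_IN_PROGRESS":
        return "building"
    # CREATE_FAILED, ROLLBACK_IN_PROGRESS, ROLLBACK_COMPLETE, and so on.
    return "needs attention"


def stack_output(description: dict, key: str):
    """Return the value of one stack output, or None if it is not present."""
    stack = description["Stacks"][0]
    for output in stack.get("Outputs", []):
        if output["OutputKey"] == key:
            return output["OutputValue"]
    return None


# Hand-written sample; "VaultURL" is a hypothetical output key.
SAMPLE_DESCRIPTION = {
    "Stacks": [
        {
            "StackStatus": "CREATE_COMPLETE",
            "Outputs": [
                {
                    "OutputKey": "VaultURL",
                    "OutputValue": "https://vaultdemo.fto.hashidemos.io:8200",
                }
            ],
        }
    ]
}

if __name__ == "__main__":
    status = SAMPLE_DESCRIPTION["Stacks"][0]["StackStatus"]
    print(stack_phase(status), stack_output(SAMPLE_DESCRIPTION, "VaultURL"))
```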

Click on that link to launch the Vault UI. You should see the initial Vault setup screen:

Next you’ll create the master key that will be used to unseal your Vault. You can enter 1 for both Key shares and for Key threshold. This simply means that we don’t want to split the key into multiple parts. Click on the Initialize button.
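To make the two numbers a little less abstract: shares is how many pieces the key is split into, and threshold is how many of those pieces must be brought together to use it. Here is a small sketch of the basic sanity rules, using our own function name. It is not Vault’s validation code.

```python
def spare_shares(shares: int, threshold: int) -> int:
    """Check a shares/threshold pair and return how many shares can be
    lost while the key can still be reassembled."""
    if shares < 1:
        raise ValueError("shares must be at least one")
    if threshold < 1:
        raise ValueError("threshold must be at least one")
    if threshold > shares:
        raise ValueError("threshold cannot be larger than shares")
    return shares - threshold


if __name__ == "__main__":
    # 1 and 1: a single, unsplit key with no spare copies.
    print(spare_shares(1, 1))
    # 5 and 3: five key holders, any three of whom can act together.
    print(spare_shares(5, 3))
```

Entering 1 and 1 is the simplest possible setup, which is why the single key you get back needs to be stored so carefully.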

Next you’ll want to copy your root token and unseal key and store them in a safe place. We recommend printing these out or saving them on a USB key, and storing them in a physical safe. These are important for disaster recovery or emergency situations. An encrypted copy of the master key is also saved in the storage backend, and it can only be decrypted with the correct AWS KMS key.

It can take two to three minutes after initialization for your Vault cluster to settle and start receiving requests. Click on Continue to Authenticate and use the root token to log in. If you get an error message just wait a minute or so, reload the page and log in again. Once the load balancer has found the primary Vault node, the cluster stabilizes and is ready for API traffic. You’ll know it’s working when you are able to log in and see this screen:

Voila! A production-grade, highly available Vault cluster with auto-renewing SSL
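The settling period described above can also be watched from a script instead of reloading the page. Vault exposes a health endpoint at /v1/sys/health whose HTTP status code reflects the state of the node that answered. The sketch below maps the default status codes to a short description. It assumes the API is reachable over HTTPS on port 8200 at your cluster’s hostname.

```python
import urllib.error
import urllib.request

# Default status codes returned by Vault's /v1/sys/health endpoint.
HEALTH_STATES = {
    200: "initialized, unsealed, and active",
    429: "unsealed standby node",
    472: "disaster recovery secondary",
    473: "performance standby",
    501: "not initialized yet",
    503: "sealed",
}


def health_url(hostname: str, port: int = 8200) -> str:
    """Build the health endpoint URL for a Vault hostname."""
    return f"https://{hostname}:{port}/v1/sys/health"


def describe_health(status_code: int) -> str:
    """Translate a health endpoint status code into plain English."""
    return HEALTH_STATES.get(status_code, f"unexpected HTTP status {status_code}")


def check_vault(hostname: str, timeout: float = 5.0) -> str:
    """Query the health endpoint once and describe what came back."""
    try:
        with urllib.request.urlopen(health_url(hostname), timeout=timeout) as response:
            return describe_health(response.status)
    except urllib.error.HTTPError as error:
        # Non-2xx answers (429, 501, 503, ...) arrive as HTTPError.
        return describe_health(error.code)
    except OSError as error:
        # Covers DNS failures, refused connections, and timeouts.
        return f"unreachable ({error})"
```

Calling check_vault("vaultdemo.fto.hashidemos.io") every few seconds until it reports an active node is a reasonable stand-in for reloading the login page.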

Congratulations! Your Vault cluster is ready for initial configuration and setup. We’ve got a whole bunch of tutorials on how to configure your Vault cluster with different secrets engines. You might want to try our Secrets Management learning track to see what your new cluster can do.

Disclaimer: All of the software used in this blog post is open source. While you can use it in production, you need to be prepared to support it yourself. If you require commercial support for Vault please reach out to your local HashiCorp representative, or drop us a line via our contact page:
