Hashicorp Vault on AWS with Auto Unseal and DynamoDB Backend Built with Terraform

Nick Lunt
Version 1 · 11 min read · Dec 22, 2021

About Myself

I am a DevOps engineer working for Version 1. I joined Version 1 in 2008 as a Linux administrator and an Oracle Applications DBA.

Version 1 gives all employees the option to move to other internal roles, and in 2019 I decided to try and get into the Version 1 Digital and Cloud team as a DevOps engineer.

To facilitate that I started learning some primary DevOps tools such as Terraform and Docker. I also got the AWS Certified Solutions Architect certification.

So in January 2020 I moved from Linux and Oracle to DevOps, and have never looked back.

Purpose

The aim of this document is to show how to run a single node Hashicorp Vault instance in Amazon Web Services, with an AWS DynamoDB backend and using AWS Key Management Service to automatically unseal Vault when it is installed or restarted.

Why

When I needed to implement Vault in AWS with auto unseal enabled, it took me some time to work out, and there was not much information out there that I could find. Now that I have it working, why not share it? Hopefully it will help someone with their own Vault install on AWS.

What Will We Cover

This will cover a single node Vault instance on AWS with DynamoDB backend. In production, there would very likely be a Vault cluster using an AWS Auto Scaling Group with an AWS Load Balancer, but that would make this more complicated than it needs to be for getting Vault running in a test environment.

So this guide will cover:

  1. Creating an EC2 instance to run Vault.
  2. Writing EC2 user data to install and configure Vault.
  3. Setting up the DynamoDB backend for Vault storage.
  4. Logging in to Vault from the Terraform workstation, the Vault EC2 instance, and the Vault web console.

Terraform Version

This code has been tested with Terraform versions 0.13, 0.14, 0.15 and 1.0; for this guide I will be using Terraform 1.0 only.

Let’s Get Started

I will assume you already have Terraform installed and are using a Linux machine for your development.

First we will download the source code, then go through how it works.

Create a directory and clone the source code for this Vault build from GitHub — https://github.com/nicklunt/vault.git

Change into the new vault directory and create an SSH public key file that AWS will use to allow SSH access to the instance. Here is the method I used:

ssh-keygen -y -f ~/.ssh/id_rsa > public_key

This public_key will be referenced in main.tf.

Create the Instance — main.tf

Create the AWS key pair to allow us to access the Vault instance once it’s online.

This uses the public_key we just created.

resource "aws_key_pair" "ssh-keypair" {
  key_name   = "ssh-keypair"
  public_key = file(var.ssh-key)
}

The main.tf file will also create the instance, nothing fancy here, but the user_data is what we will be interested in shortly.

resource "aws_instance" "vault" {
  ami                         = var.ami
  key_name                    = aws_key_pair.ssh-keypair.id
  instance_type               = var.aws_instance_type
  subnet_id                   = aws_subnet.public-subnet.id
  vpc_security_group_ids      = [aws_security_group.sg-vault.id]
  associate_public_ip_address = true
  iam_instance_profile        = aws_iam_instance_profile.vault-kms-unseal.name
  user_data                   = data.template_file.userdata.rendered

  root_block_device {
    volume_size = var.root_volume_size
  }

  tags = {
    Name = "Vault Server"
  }

  depends_on = [aws_kms_key.vault-unseal-key, aws_dynamodb_table.vault-table]
}

The depends_on statement in main.tf tells Terraform to create the unseal key and the DynamoDB table before the instance. We specify this to ensure both are ready before we install Vault; otherwise the Vault install would fail, as it needs DynamoDB to store its data and the KMS key to unseal itself.

The aws_instance resource references the aws_key_pair also in main.tf, so we can SSH to the instance.

Before getting into the user_data and the iam_instance_profile, let's get the DynamoDB table online.

Note — The DynamoDB plugin for Vault is from the community and not officially supported by Hashicorp.

DynamoDB Backend — dynamodb.tf

resource "aws_dynamodb_table" "vault-table" {
  name           = var.dynamodb-table
  read_capacity  = var.dynamo-read-write
  write_capacity = var.dynamo-read-write
  hash_key       = "Path"
  range_key      = "Key"

  attribute {
    name = "Path"
    type = "S"
  }

  attribute {
    name = "Key"
    type = "S"
  }

  tags = {
    Name = "vault-table"
  }
}

This is as simple as creating a DynamoDB table can get for Vault, and it is all Vault needs to work with DynamoDB as its backend.

The hash_key, range_key and attributes are all required by Vault itself.
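If you are curious what Vault actually writes into the table, every item is a Path/Key pair. Here is a sketch of how to peek at it (assuming the AWS CLI with credentials, jq installed, and the default table name vault-table; the jq filter is split into a helper so it can be tried offline):

```shell
# Extract the Path attribute from a DynamoDB scan result (JSON on stdin).
# DynamoDB wraps each string attribute in an {"S": ...} type marker.
list_vault_paths() {
  jq -r '.Items[].Path.S'
}

# Real call (needs AWS credentials):
#   aws dynamodb scan --table-name vault-table --max-items 5 | list_vault_paths
```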

The KMS Key to Unseal Vault — kms.tf

Simply create a key

resource "aws_kms_key" "vault-unseal-key" {
  description             = "KMS Key to unseal Vault"
  deletion_window_in_days = 7
}

Vault Root Token and Recovery Key — secrets-manager.tf

When Vault starts up for the first time, it initialises and emits a root token and recovery key.
We will want to keep these safe and secure, so we put them into AWS Secrets Manager.
The recovery_window_in_days is set to zero here so we can recreate the secrets as often as required for testing.

resource "aws_secretsmanager_secret" "vault-root-token" {
  name        = var.vault-root-token
  description = "Vault root token"

  # recovery set to 0 so we can recreate the secret as required for testing.
  recovery_window_in_days = 0

  tags = {
    Name = var.vault-root-token
  }
}

resource "aws_secretsmanager_secret" "vault-recovery-key" {
  name        = var.vault-recovery-key
  description = "Vault recovery key"

  # recovery set to 0 so we can recreate the secret as required for testing.
  recovery_window_in_days = 0

  tags = {
    Name = var.vault-recovery-key
  }
}
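Once stored, the root token can be pulled back out with the AWS CLI. A sketch (assuming jq, configured credentials, and that var.vault-root-token defaults to the name vault-root-token; the JSON parsing is split out so it can be tried without AWS access):

```shell
# get-secret-value returns JSON with the secret in a SecretString field
extract_secret_string() {
  jq -r '.SecretString'
}

# Real call (needs AWS credentials):
#   aws secretsmanager get-secret-value --secret-id vault-root-token \
#     --region eu-west-2 | extract_secret_string
```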

User Data — template/vault.sh.tpl

The instance user_data is the meat and potatoes of getting Vault working, and is referenced as a Terraform template_file in data.tf to populate the ${variable}s.

Below is described everything required in the instance user_data.

# Pull Vault from Hashicorp releases and unzip it ready for configuration
wget https://releases.hashicorp.com/vault/1.7.3/vault_1.7.3_linux_amd64.zip -O /tmp/vault.zip
unzip /tmp/vault.zip -d /usr/bin/ && rm -f /tmp/vault.zip
# Pull jq for parsing JSON output
wget https://github.com/stedolan/jq/releases/download/jq-1.6/jq-linux64
chmod +x jq-linux64
mv jq-linux64 /usr/bin/jq
# Create a vault user and group
groupadd --force --system vault
if ! getent passwd vault >/dev/null; then
adduser --system --gid vault --no-create-home --comment "vault owner" --shell /bin/false vault >/dev/null
fi
# A default profile for the local vault address
cat << EOF > /etc/profile.d/vault.sh
export VAULT_ADDR=http://127.0.0.1:8200
EOF
# Create a systemd service for Vault on Amazon Linux
cat > /usr/lib/systemd/system/vault.service <<-EOF
[Unit]
Description=Vault Service
Requires=network-online.target
After=network-online.target
[Service]
Restart=on-failure
PermissionsStartOnly=true
ExecStartPre=/sbin/setcap 'cap_ipc_lock=+ep' /bin/vault
ExecStart=/bin/vault server -config /etc/vault/vault.conf
ExecReload=/bin/kill -HUP $MAINPID
KillSignal=SIGTERM
User=vault
Group=vault
[Install]
WantedBy=multi-user.target
EOF
# Reload systemd and enable vault
systemctl daemon-reload
systemctl enable vault
# Vault requires a config file, so here we create that config file, set
# ownership and permissions, then start the Vault service.
mkdir /etc/vault
cat > /etc/vault/vault.conf <<-EOF
listener "tcp" {
address = "0.0.0.0:8200"
tls_disable = 1
}
ui = true
storage "dynamodb" {
region = "${region}"
table = "${dynamodb-table}"
read_capacity = 3
write_capacity = 3
}
seal "awskms" {
region = "${region}"
kms_key_id = "${unseal-key}"
}
EOF
chown -R vault:vault /etc/vault
chmod -R 0644 /etc/vault/*
mkdir /var/log/vault
chown vault:vault /var/log/vault
systemctl start vault

The config file specifies the DynamoDB table we have created as the backend, and the AWS KMS Key we created to unseal Vault.

The next part of the user data checks whether Vault is already initialised; if so, it simply checks that the instance can log in with its instance profile, then the user data script exits.

Otherwise, if Vault is not initialised we initialise it and save the root token and recovery key to AWS Secrets Manager.

Continuing with the user data.

export VAULT_ADDR=http://127.0.0.1:8200
# Check if vault is already initialised
INITIALIZED=$(curl $VAULT_ADDR/v1/sys/init | jq '.initialized')
if [ "$${INITIALIZED}" != "true" ]; then
echo "[] Vault DB not initialised, initialising now"
## Initialise vault and save token and unseal key
vault operator init -recovery-shares=1 -recovery-threshold=1 2>&1 | tee ~/vault-init-out.txt
echo "[] Vault status output"
vault status | tee -a ~/vault-status.txt
# Get the VAULT_TOKEN so we can interact with vault
export VAULT_TOKEN=$(grep '^Initial Root Token:' ~/vault-init-out.txt | awk '{print $NF}')
# Get the unseal key
export RECOVERY_KEY=$(grep '^Recovery Key' ~/vault-init-out.txt | awk '{print $NF}')
# Save the root token and recovery key to aws secrets manager, then we can delete ~/vault-init-out.txt
# The secret resources have already been created by terraform (secrets-manager.tf)
aws secretsmanager update-secret --secret-id ${secret_token_id} --secret-string "$${VAULT_TOKEN}" --region ${region}
aws secretsmanager update-secret --secret-id ${secret_recovery_id} --secret-string "$${RECOVERY_KEY}" --region ${region}
# Remove the temp file which has the root token details
rm -f ~/vault-init-out.txt
else
# Vault already initialised, which means the db is up which has our role, so login with that role, then exit this script.
echo "[] Vault DB already initialised. Check we can login with aws method and exit"
vault login -method=aws role=admin
echo "[] Userdata finished."
exit
fi
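The grep/awk extraction above can be exercised offline against sample init output (the values below are fabricated; the line format matches Vault 1.7 with -recovery-shares=1):

```shell
# Simulate ~/vault-init-out.txt with made-up values
sample=$(mktemp)
cat > "$sample" <<'EOF'
Recovery Key 1: 2vQdSIKcZ3EXAMPLEKEY
Initial Root Token: s.EXAMPLEROOTTOKEN
Success! Vault is initialized
EOF

# Same extraction as the user data script
VAULT_TOKEN=$(grep '^Initial Root Token:' "$sample" | awk '{print $NF}')
RECOVERY_KEY=$(grep '^Recovery Key' "$sample" | awk '{print $NF}')
rm -f "$sample"
```

awk '{print $NF}' takes the last whitespace-separated field, so the extraction works whether or not the output is column aligned.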

If Vault is not already initialised, the if statement above does not exit and user data continues.
It enables the Vault audit file, enables aws auth and a kv secrets engine.
Then it pulls an admin policy we have uploaded to S3, and writes that policy to Vault assigning the policy to the Vault instance role.

Continuing with the user data.

# Enable vault logging
vault audit enable file file_path=/var/log/vault/vault.log
# Enable the vault AWS and kv engine
vault auth enable aws
vault secrets enable -path=secret -version=2 kv
# Get the vault-admin-policy.hcl file that was uploaded to S3 in s3.tf
aws s3 cp s3://${vault_bucket}/vault-admin-policy.hcl /var/tmp/
# Create the admin policy in vault
vault policy write "admin-policy" /var/tmp/vault-admin-policy.hcl
# Give this instance admin privileges to vault, tied to this instance's vault_instance_role.
vault write \
auth/aws/role/admin \
auth_type=iam \
policies=admin-policy \
max_ttl=1h \
bound_iam_principal_arn=${vault_instance_role}
echo "[] Userdata finished."
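The vault-admin-policy.hcl file itself is part of the S3 upload in the repository and is not shown in this guide. For orientation only, a minimal admin-style Vault policy might look like the following (a hypothetical sketch, not the repository's actual policy):

```hcl
# Hypothetical admin policy: full rights on the kv mount and sys paths.
path "secret/*" {
  capabilities = ["create", "read", "update", "delete", "list"]
}

path "sys/*" {
  capabilities = ["create", "read", "update", "delete", "list", "sudo"]
}
```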

That is the user_data for Vault complete. Look at data.tf for the ${variables} passed into the file.

User Data Template File — data.tf

data "template_file" "userdata" {
  template = file("${path.module}/templates/vault.sh.tpl")

  vars = {
    region         = var.region
    dynamodb-table = var.dynamodb-table
    unseal-key     = aws_kms_key.vault-unseal-key.id
    instance-role  = aws_iam_role.vault-kms-unseal.name
    # IAM ARNs have an empty region field, as IAM is a global service
    vault_instance_role = "arn:aws:iam::${var.account_id}:role/${aws_iam_role.vault-kms-unseal.name}"
    vault_bucket        = var.bucket_name
    secret_token_id     = aws_secretsmanager_secret.vault-root-token.id
    secret_recovery_id  = aws_secretsmanager_secret.vault-recovery-key.id
  }
}
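A note in passing: the template_file data source comes from the separate hashicorp/template provider, which is archived; on Terraform 0.12 and later the built-in templatefile() function does the same job without an extra provider. A sketch of the equivalent (note that templatefile() requires variable names that are valid identifiers, so the hyphenated keys are renamed here, and the template's ${...} references would need the same rename):

```hcl
locals {
  userdata = templatefile("${path.module}/templates/vault.sh.tpl", {
    region              = var.region
    dynamodb_table      = var.dynamodb-table
    unseal_key          = aws_kms_key.vault-unseal-key.id
    instance_role       = aws_iam_role.vault-kms-unseal.name
    vault_instance_role = "arn:aws:iam::${var.account_id}:role/${aws_iam_role.vault-kms-unseal.name}"
    vault_bucket        = var.bucket_name
    secret_token_id     = aws_secretsmanager_secret.vault-root-token.id
    secret_recovery_id  = aws_secretsmanager_secret.vault-recovery-key.id
  })
}

# In main.tf the instance would then use:  user_data = local.userdata
```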

Instance Profile — instance-profile.tf

As seen in the user data for Vault, the instance itself needs access to S3, Secrets Manager, KMS and DynamoDB. We create an IAM role granting those permissions, which is then attached to the instance via the iam_instance_profile in main.tf.

data "aws_iam_policy_document" "assume_role" {
  statement {
    effect  = "Allow"
    actions = ["sts:AssumeRole"]

    principals {
      type        = "Service"
      identifiers = ["ec2.amazonaws.com"]
    }
  }
}

data "aws_iam_policy_document" "vault-kms-unseal" {
  statement {
    sid       = "VaultKMSUnseal"
    effect    = "Allow"
    resources = ["*"]
    actions = [
      "kms:Encrypt",
      "kms:Decrypt",
      "kms:DescribeKey"
    ]
  }

  statement {
    sid       = "VaultDynamoDB"
    effect    = "Allow"
    resources = [aws_dynamodb_table.vault-table.arn]
    actions = [
      "dynamodb:DescribeLimits",
      "dynamodb:DescribeTimeToLive",
      "dynamodb:ListTagsOfResource",
      "dynamodb:DescribeReservedCapacityOfferings",
      "dynamodb:DescribeReservedCapacity",
      "dynamodb:ListTables",
      "dynamodb:BatchGetItem",
      "dynamodb:BatchWriteItem",
      "dynamodb:CreateTable",
      "dynamodb:DeleteItem",
      "dynamodb:GetItem",
      "dynamodb:GetRecords",
      "dynamodb:PutItem",
      "dynamodb:Query",
      "dynamodb:UpdateItem",
      "dynamodb:Scan",
      "dynamodb:DescribeTable"
    ]
  }

  statement {
    sid       = "IAM"
    effect    = "Allow"
    resources = ["*"]
    actions = [
      "iam:GetInstanceProfile",
      "iam:GetRole",
      "iam:CreateAccessKey",
      "iam:DeleteAccessKey",
      "iam:GetAccessKeyLastUsed",
      "iam:GetUser",
      "iam:ListAccessKeys",
      "iam:UpdateAccessKey",
      "sts:AssumeRole"
    ]
  }

  statement {
    sid       = "S3"
    effect    = "Allow"
    resources = ["arn:aws:s3:::${var.bucket_name}/*"]
    actions = [
      "s3:GetObject"
    ]
  }

  statement {
    sid    = "SecretsManager"
    effect = "Allow"
    resources = [
      aws_secretsmanager_secret.vault-root-token.id,
      aws_secretsmanager_secret.vault-recovery-key.id
    ]
    actions = [
      "secretsmanager:UpdateSecret",
      "secretsmanager:GetSecretValue"
    ]
  }
}

resource "aws_iam_role" "vault-kms-unseal" {
  name               = var.instance-role
  assume_role_policy = data.aws_iam_policy_document.assume_role.json
}

resource "aws_iam_role_policy" "vault-kms-unseal" {
  name   = var.instance-role-policy
  role   = aws_iam_role.vault-kms-unseal.id
  policy = data.aws_iam_policy_document.vault-kms-unseal.json
}

resource "aws_iam_instance_profile" "vault-kms-unseal" {
  name = var.instance-profile
  role = aws_iam_role.vault-kms-unseal.name
}

Not Included

I have not included here the VPC setup, Security Groups, Internet Gateway and variables.tf as this post is about Vault with Terraform.

To get the full code please download it from https://github.com/nicklunt/vault.git and change the below variables in variables.tf to suit your environment.

# Change region for your env
variable "region" {
  type    = string
  default = "eu-west-2"
}

# Change this to your aws account
variable "account_id" {
  description = "AWS Account ID"
  type        = string
  default     = "01234567890"
}

# Change this to your external IP
variable "my_ip" {
  description = "my external IP address"
  type        = string
  default     = "aa.bb.cc.dd/32"
}

# May need to change the AMI if not in eu-west-2
variable "ami" {
  description = "eu-west-2 Amazon Linux 2 AMI"
  type        = string
  default     = "ami-0a669382ea0feb73a"
}

# I'm using a small instance type for testing
variable "aws_instance_type" {
  description = "aws instance type"
  type        = string
  default     = "t2.micro"
}

# Size of the OS volume
variable "root_volume_size" {
  description = "size of the os volume in GB"
  type        = string
  default     = "50"
}

# May need to change the bucket to something unique
variable "bucket_name" {
  description = "Bucket to upload any required files"
  type        = string
  default     = "vault-conf-bucket-01010101"
}

# If running terraform from linux, generate the ssh-key with
# $ cd <vault terraform directory>
# $ ssh-keygen -y -f ~/.ssh/id_rsa > public_key
variable "ssh-key" {
  type        = string
  description = "File holding my public ssh key: ssh-keygen -y -f ~/.ssh/id_rsa > public_key"
  # default matches the public_key filename generated by the command above
  default     = "public_key"
}

Running the Terraform

Before running Terraform, remember to create the key in the same directory as the Terraform files so you can SSH to the instance once it’s built.

$ ssh-keygen -y -f ~/.ssh/id_rsa > public_key
$ terraform plan
Terraform used the selected providers to generate the following execution plan. Resource actions are indicated with the following symbols:
+ create
<= read (data resources)
Plan: 18 to add, 0 to change, 0 to destroy.

Run terraform apply to create the resources

$ terraform apply -auto-approve

Apply complete! Resources: 18 added, 0 changed, 0 destroyed.

Outputs:
Authenticate_to_vault = "vault login -method=aws role=admin"
Connect_to_vault_UI = "http://123.123.123.123:8200"
Connect_to_vault_instance = "ssh ec2-user@123.123.123.123"

So now we have Vault running in AWS, using DynamoDB as the backend, and it is automatically unsealed.

Let’s try connecting to it.

Connecting to Vault

The Terraform outputs (outputs.tf) show us how to connect.

Testing Vault from our development station

$ export VAULT_ADDR=http://123.123.123.123:8200
$ vault status
Key                      Value
---                      -----
Recovery Seal Type       shamir
Initialized              true
Sealed                   false
Total Recovery Shares    1
Threshold                1
Version                  1.7.3
Cluster Name             vault-cluster-3a7ac91b
Cluster ID               12f3f95e-4be3-75bb-735e-f4a485317b1d
HA Enabled               false

Success, we can see Vault is initialised and unsealed.
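The same check works against Vault's HTTP API without needing the vault CLI installed locally. A sketch (assuming jq; the IP is the example address from the outputs, and the jq filter is separated so it can be tried offline):

```shell
# /v1/sys/seal-status reports whether the node is sealed
parse_sealed() {
  jq -r '.sealed'
}

# Real call (expects "false" once KMS auto unseal has run):
#   curl -s http://123.123.123.123:8200/v1/sys/seal-status | parse_sealed
```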

Next, test that we can access the Vault UI at http://123.123.123.123:8200 from our development station.

If you go to AWS Secrets Manager and retrieve the 'vault-root-token' secret, you can use its value to log into the Vault UI.

Finally, let's SSH to the Vault instance and log into Vault using the instance profile:

$ ssh ec2-user@123.123.123.123
[ec2-user ~]$ vault login -method=aws role=admin
Success! You are now authenticated. The token information displayed below is already stored in the token helper. You do NOT need to run "vault login" again. Future Vault requests will automatically use this token.

Key                   Value
---                   -----
token                 s.BqtXOSb59vbBi9qyn52zEwRt
token_accessor        UY7sX672lOgZaxdZqYpJTKLO
token_duration        1h
token_renewable       true
token_policies        ["admin-policy" "default"]
identity_policies     []
policies              ["admin-policy" "default"]
token_meta_account_id 01234567890
token_meta_auth_type  iam
token_meta_role_id    ae364a87-95t7-84n7-5835-73h74b875

More success! This proves the Vault instance itself can log into Vault due to the instance profile.

In Closing

Hopefully this has shown how to auto unseal Vault when deploying it with Terraform. While there is quite a bit of code in this guide, the actual unsealing is just a stanza in the Vault config file referencing the KMS key to unseal with.

In production, we would have Vault clustered in an Auto Scaling Group fronted by a load balancer, and of course run it over HTTPS instead of HTTP.

If you are interested in testing all this out, the full code is available on GitHub, https://github.com/nicklunt/vault.git, which includes the VPC, security groups etc. Just change the well-commented variables at the top of variables.tf to suit.

Thanks for reading this far, and good luck.

About The Author
Nick Lunt is a DevOps Engineer here at Version 1.
