Managing cloud secrets the right way: Hashicorp Vault on k8s

Maxime MATHERON
Kuranda Labs Engineering
11 min read · Apr 17, 2019

When it comes to security management within an organization, no matter its size, you have to admit that a crucial question is how you set up strong credentials and how you protect them.

The topic becomes even harder with the explosion of third-party cloud-based services (Slack, Gmail, Notion, etc.), and above all the cloud providers (AWS, Azure, etc.): they all need their own credentials.
What makes the cloud hard is that, most of the time, credentials are used by an application to connect to a given service.

It means two things:

  • It is not a person’s responsibility to remember credentials
  • The application still needs those credentials to access the service

As we deploy workloads to the cloud more frequently, sometimes in an automated way with CI/CD pipelines, the struggle of managing secrets increases.

How do companies deal with this topic today?


Designing a good solution to the problem of secret management is key as it can help companies deploy faster and better with a higher level of confidence.

They write credentials in the code

Sounds easy, right? Credentials are tracked by git: simple, fast and shareable with everyone in the same repository. Besides, it makes deploying and going to production easier!

Never do that, no matter your company’s stage!

The main issue with this solution is that if one single person with access to the repository gets their account compromised, you have already lost the battle. Imagine you have database credentials with elevated privileges written in the code (say, the postgres user in PostgreSQL). If I get hold of them, I can dump your database in a matter of minutes! Afterwards, if I am really mean, I can also drop your database (you had better have backups)! Worst case scenario, I now have all your data and you have nothing left: you can call your lawyer and shut down your business the next week.

Uber just got fined 400k€: they wrote credentials in the code, and were hacked

They use an internal server

This was our approach at the very beginning of Kuranda Labs. We bought a NAS server and plugged it into our internet box. That way, only computers connected to our internal network could log into the server. We set up a git server on the NAS over SSH so that we could have a repository, isolated from the internet, holding all the different credentials.

We would configure our applications to load credentials from the file system: there were no credentials in the code. Because we deploy on kubernetes, we would use the Secret object to directly load credentials from the NAS server. On each deployment, we would mount those secrets as volumes and containers would load what they needed from those volumes.

We decided last month to drop this approach for several reasons:

  • Cannot access the NAS from another network
    We did not want to set up a VPN and deal with the system administration pain that comes with it.
  • Pressure on physical security
    You could see the NAS as a vault protecting all our secrets, but who was protecting the NAS? That’s tricky, especially when your server has a reset button!
    The exact same issue existed with the internet box. Forget about co-working spaces if you have to manage the physical security of the network!
  • Key rotation
    The NAS stored the data we wanted it to store, and that was all. We still had to create heavy manual processes to rotate the keys, which put pressure on the team and on the deployments.
  • No lease time
    This is related to the previous point: we had to rotate credentials manually. If we forgot, the security level dropped.
  • Coupled to deployments
    We needed to load credentials from the NAS server into our kubernetes cluster. This was manual work that took time and was error-prone. A modification on the NAS server added heavy extra work in the production environment (e.g., updating the Secret in kubernetes).

Enter HashiCorp Vault

HashiCorp Vault

HashiCorp Vault is a server designed to store and serve secrets programmatically with a very high level of trust. The main feature we use Vault for is its ability to create dynamic secrets with a given lease time.

As said before, Vault is a web server, but it ships with a massive amount of built-in features, all designed for great secret management workflows:

  • It can be sealed/unsealed using multiple shares of an encryption key, thanks to Shamir’s secret sharing algorithm. In the sealed state, the vault cannot be accessed or read!
  • It ships with a great authentication system with integrations to external services. For example, you can log into the server using a GitHub personal access token.
  • It has deep cloud provider integrations, so it becomes really easy to issue new secrets for a given service (e.g., AWS S3). Those secrets are highly customizable at runtime (e.g., lease time!).
  • It encrypts all the data in the storage backend. Again, it is deeply integrated (you can use Google Cloud Storage as a storage backend) and can decouple computation from storage.
  • It can be deployed in high availability mode, given the right storage backend.
  • It exposes a great dashboard to easily maintain your system.

You can easily see that it addresses all the concerns listed for the internal server solution. Let’s dive in!

Deploy a containerized HashiCorp Vault on k8s

In this part we’ll learn how to deploy a highly available HashiCorp Vault server on kubernetes.

Choose a backend storage

You can find the documentation about storage backends here. This choice is important for several reasons:

  • Some backends cannot start in high availability (HA) mode.
  • Some backends are not production-ready (testing purposes only): you should never use the in-memory storage in production!
  • Some backends are complicated to master in the context of a kubernetes deployment.

We decided to store Vault’s data in Google Cloud Storage because we already used the platform on a daily basis and because it supports HA mode!

Note that this decouples storage from computation: we could take down the server and restart it, and the data would still exist. Just remember that the vault always restarts in sealed mode.

Safely create a bucket in Google Cloud

In this example, we create a bucket named kurandalabs-vault. Once configured, all the vault’s data lives in this bucket. Do not worry: the data is encrypted by design.

As the storage is not coupled to the computation, the single point of failure is the storage. If you delete the bucket, you lose all your secrets. This is why you should be extra careful about how you set up the bucket.

  1. Create a new service account to get access from the k8s cluster
    Under the IAM & admin tab, create a new service account but do not assign any role to it! We named it kurandalabs-vault.
  2. Issue the related service account key
    Under the APIs & services tab, click Create credentials and select the Service account key option. Of course, create it for the kurandalabs-vault service account. Keep this JSON file safe on your machine.
  3. Create the bucket
    Under the Storage tab, create a new bucket and choose the Bucket Policy Only option.
  4. Grant access to your service account
    So far, you cannot access this bucket with the service account key. Use the following command, which grants the storage.objectAdmin role:
Grant role to your service account.
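A sketch of that command, assuming the bucket and service account names above and a placeholder project ID:

```shell
# [PROJECT_ID] is a placeholder for your own Google Cloud project ID.
gsutil iam ch \
  "serviceAccount:kurandalabs-vault@[PROJECT_ID].iam.gserviceaccount.com:roles/storage.objectAdmin" \
  gs://kurandalabs-vault
```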

You are now set up to access your bucket with the service account key. In the Permissions tab, verify that only two members have access to the bucket:

As a user, you might be the owner of the entire Google Cloud project. That also means you can delete the bucket, so be careful! This is why you see my personal email address.

Create the vault configuration file

Vault reads a configuration file when it starts. This file sets several options, including the storage backend, TLS, etc.

Configure the storage
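A sketch of the storage stanza for Google Cloud Storage, using the bucket created above (the ha_enabled flag turns on HA mode):

```hcl
storage "gcs" {
  bucket     = "kurandalabs-vault"
  ha_enabled = "true"
}
```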

Configure the listener

Here we disable TLS because Kuranda Labs uses a service mesh (Istio) that already configures the TLS policy for us.
If you do not use a service mesh, you absolutely must configure TLS in the listener; otherwise, your data will not be encrypted in transit.
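With TLS handled by the mesh, the listener stanza can be sketched as:

```hcl
listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 1
}
```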

Final configuration

Your configuration should look like this. We enable the UI and point api_addr to our load balancer URL (check here for the docs).
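Putting the pieces together, a sketch of the full file (the api_addr value is a placeholder for your own load balancer URL):

```hcl
ui       = true
api_addr = "https://vault.example.com" # placeholder: your load balancer URL

storage "gcs" {
  bucket     = "kurandalabs-vault"
  ha_enabled = "true"
}

listener "tcp" {
  address     = "0.0.0.0:8200"
  tls_disable = 1
}
```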

Create secrets from the service account key / vault configuration

Secrets in kubernetes are used to store sensitive information within the cluster (and can be encrypted at rest). We use them to load the service account key and the vault configuration. Run the following command within the directory containing both files (here, named sa-creds.json and vault-config.hcl):
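The commands could look like this (the secret names are our choice, not fixed by kubernetes):

```shell
kubectl create secret generic vault-sa-creds --from-file=sa-creds.json
kubectl create secret generic vault-config --from-file=vault-config.hcl
```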

Create the k8s deployment

HashiCorp released an official docker image that you can use for the deployment. As you can see, we mount the two secrets as volumes. We also specify the path to the google credentials using the GOOGLE_APPLICATION_CREDENTIALS env variable.
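A minimal sketch of such a deployment; the image tag, secret names and mount paths are our assumptions, not the exact Kuranda Labs manifest:

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: vault
spec:
  replicas: 2
  selector:
    matchLabels:
      app: vault
  template:
    metadata:
      labels:
        app: vault
    spec:
      containers:
        - name: vault
          image: vault:1.1.0  # official HashiCorp image; pin your own version
          args: ["server", "-config=/etc/vault/vault-config.hcl"]
          env:
            # Tell the Google Cloud client library where the key lives.
            - name: GOOGLE_APPLICATION_CREDENTIALS
              value: /etc/gcp/sa-creds.json
          securityContext:
            capabilities:
              add: ["IPC_LOCK"]  # lets Vault lock memory to avoid swapping secrets
          volumeMounts:
            - name: vault-config
              mountPath: /etc/vault
            - name: vault-sa-creds
              mountPath: /etc/gcp
      volumes:
        # Both secrets created in the previous step, mounted as files.
        - name: vault-config
          secret:
            secretName: vault-config
        - name: vault-sa-creds
          secret:
            secretName: vault-sa-creds
```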

On top of this deployment, we create a service to access the containers. Because we use Istio, we do not need a LoadBalancer service: we use Istio’s SNI routing and specify in the gateway that we accept HTTPS traffic for the vault. You might instead use a service of type LoadBalancer to directly expose your vault server with an IP.

If everything went right, printing the container logs should display something like this in your shell:

Initialize your vault server

In the current state, your server is not initialized. This is exactly what we are going to do now. This is a crucial step because you receive the cryptographic keys to seal/unseal your vault.

Make sure that you have the vault CLI installed on your computer. If not, follow their great installation tutorial.

When using the CLI, the vault client reads the VAULT_ADDR env variable and hits the corresponding URL. Set it accordingly:

export VAULT_ADDR='[SERVER_URL]'

After having exported the server url, initialize the server by running the following command:

vault operator init

This command is run once at the very beginning and is not used afterwards

As a response, your terminal should return the 5 key shares to seal/unseal the vault and the root token:

Those 5 key shares are used to unseal your vault. You should distribute them within your organization (one key share per person is the ideal setup). As explained by vault, you need at least 3 shares to unseal it. If you lose more than 2 shares, your vault is lost!

Note that you can change the number of shares and the quorum to seal/unseal

The root token is a special token with full rights on the vault. It is mainly used to issue new tokens with more specific, lower privileges. Keep it safe!

Unseal the vault

The vault always starts in sealed mode, at the very beginning but also at every server restart. You can check this by running:

vault status

Vault is in sealed mode and needs 3 distinct shares to be unsealed. Let’s unseal it! Run the following command 3 times, providing a valid share each time:

vault operator unseal

Run the status command and check the response:

Great! Vault is unsealed, HA mode is active, everything looks perfect.

What does Vault do during the unsealing process? It combines the 3 shares to rebuild the master key, then uses that key to decrypt the encryption key that actually encrypts all the data!

Understanding vault with an example: PostgreSQL

So far, we have a server that can be sealed/unsealed. What’s next?

Let’s say you just started a PostgreSQL instance in the cloud. You have a single user postgres with a strong password and the instance url. In the next sections, we demonstrate how you can setup a great secret management workflow for your new database.

Authenticate to the vault server

You must first authenticate to the vault server. Here, we use the root token; in practice, you should not use this token directly (check the docs):

vault login [YOUR_ROOT_TOKEN]

Great! Now you are able to interact with the vault API.

Enable the database engine on a given path

The vault server is actually a dynamic API gateway. It exposes endpoints to interact with secret engines. For simplicity, consider a secret engine as an integration for a given service. Here, the service is a database which turns out to be PostgreSQL. Let’s enable it:

vault secrets enable -path=postgresql database

By doing so, the server exposes a set of endpoints from the base path postgresql. You can have an overview of those endpoints in the documentation of the database secret engine (the only difference is that the base path is /database in the docs instead of /postgresql).

Configure the secret engine using the API

Let’s use the endpoints to configure the database connection:
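A sketch of that configuration call, assuming env.bash exports PG_URL, PG_DB and PG_PASSWORD (those variable names are ours):

```shell
# Load the connection details; env.bash and its variable names are assumptions.
source env.bash

vault write postgresql/config/my-newly-created-db \
    plugin_name=postgresql-database-plugin \
    allowed_roles="readonly" \
    connection_url="postgresql://{{username}}:{{password}}@${PG_URL}:5432/${PG_DB}" \
    username="postgres" \
    password="${PG_PASSWORD}"
```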

We write a new configuration object named my-newly-created-db, providing the postgres password, the database name and the URL directly from the env.bash file. This information does not change when generating new credentials. However, the username and password will be created at runtime by Vault, so we need to use templates.

Vault is now able to connect to the database instance! We also passed the allowed_roles flag to restrict which roles can be used to create new credentials. A role is another resource exposed by the Vault API (very similar to the PostgreSQL role object), so we also need to create a readonly role:
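The role creation can be sketched as follows, with the TTL values from the discussion below (readonly.sql is a hypothetical filename):

```shell
vault write postgresql/roles/readonly \
    db_name=my-newly-created-db \
    creation_statements=@readonly.sql \
    default_ttl="1h" \
    max_ttl="24h"
```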

This creates the readonly role for the my-newly-created-db instance. Let’s explain the arguments used here:

  • default_ttl is the default lease time for the PostgreSQL credentials that can be created from this role.
  • max_ttl is the maximum lease time that can be assigned to credentials created from this role. Credentials will be valid for maximum a day.
  • creation_statements points to a file containing the SQL statements that the database instance runs to create the PostgreSQL credentials (which, by the way, are also called a ROLE in PostgreSQL). Again, we use the template engine, as it is Vault’s responsibility to set the password, etc.
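The creation statements themselves could be sketched as follows (readonly.sql is a hypothetical filename; the {{name}}, {{password}} and {{expiration}} placeholders are filled in by Vault at runtime):

```sql
-- Run by PostgreSQL each time Vault issues new readonly credentials.
CREATE ROLE "{{name}}" WITH LOGIN PASSWORD '{{password}}' VALID UNTIL '{{expiration}}';
GRANT SELECT ON ALL TABLES IN SCHEMA public TO "{{name}}";
```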

Create new credentials

We are almost done! Now that everything is configured, you can run a single command to issue new credentials:

vault read postgresql/creds/readonly

It creates a new set of PostgreSQL credentials for you, with a lease time of 1h (remember the default_ttl flag?). Grab a coffee, wait an hour, and check that the credentials have indeed expired :)

About Kuranda Labs

Kuranda is a platform to collect and manage health data with built-in security and compliance. Using our API, you can secure health data in a matter of minutes, not months. Check our developer documentation. If you are launching a digital health business, let’s get in touch, we would love to hear from you!
