Methods to Incorporate Vault into your Cloud Environment — Pros and Cons

Raunak Shrestha
AI+ Enterprise Engineering
Jul 12, 2021 · 10 min read

Introduction

Over the course of my time working in the Cloud Engagement Hub (CEH), I’ve seen and heard of multiple clients that leverage Vault as the solution to secrets management in their cloud environments. Vault, an offering by HashiCorp, is a tool that allows you to securely store and access your organization’s secrets within one central location. More information about Vault and secrets management in general can be found at their website here.

I’ve found that there are two main ways to leverage Vault in a cloud environment — making changes to your deployment configuration or making changes to your source code. This article dives into the pros and cons of each method and will hopefully give you a better understanding of which one is right for your use case. All of these are personal insights I’ve gained while learning about and incorporating Vault into the CEH cloud environment, which you can read about here.

Table of Contents:

Changing Deployment Configurations

  • Pro: Secrets aren’t stored in YAML files.
  • Pro: Can fit into existing applications that use environment variables.

Kubernetes Secrets (Sub-Method 1):

  • Pro: Create non-pod Kubernetes resources.
  • Con: Kubernetes Secrets are stored as Base64 encoded values.

Vault Agent Injector (Sub-Method 2):

  • Pro: Secrets are only stored in necessary pods.
  • Pro: Ability to refresh secrets without creating a new pod.
  • Con: Secrets in containers are stored in plaintext.
  • Con: Extra overhead needed to inject and refresh secrets in pods (init container + sidecar).

Changing Source Code

  • Pro: Most secure way to consume secrets.
  • Con: More developer overhead.
  • Con: Vendor Lock-in.

Overview of the Vault Model

Changing Deployment Configurations

The method of changing deployment configurations refers to any changes made in the YAML files that create resources such as deployments, services, etc. This is separate from any of the application’s source code. Some general advantages of using this method of Vault integration are as follows:

Pro: Secrets aren’t stored in YAML files.

  1. This greatly reduces risk by decreasing the human error involved in storing secrets on disk and, especially, in sending YAML files to others on your team. Sharing plaintext secrets across the internet opens up multiple attack vectors for bad actors who could obtain and read those secrets. An example is the CustomResource (CR) YAML file for the Cloud Engagement Hub’s IBM Stock Trader demo application. There are multiple sections for secrets (e.g., database credentials and API keys), but incorporating Vault means those secrets no longer need to be specified in the CR file.
  2. This makes it easier to update and deploy your application through a CI/CD pipeline. Again, I’ll use the aforementioned IBM Stock Trader application as an example. The CR file contained secrets, meaning we couldn’t put it directly in a public GitHub repository. Otherwise, anyone could view our secrets and access our services and data. We took the approach of masking the secrets with placeholder values, but this meant we had to manually create a Kubernetes Secret so that the application could still connect to downstream services. An alternative would have been to use a private GitHub repository, but this adds the overhead of creating and managing a GitHub access token with the authority to read the private repository. In both approaches, a team would waste time with extraneous steps. Moving the credentials from the CR file into Vault made the process easier because we could safely store the CR in a public GitHub repository.

Pro: Can fit into existing applications that use environment variables. Pulling from environment variables is a very common method that applications use to consume secrets. As such, the transition to using Vault wouldn’t require any source code changes for any application which already uses environment variables. This helps in cutting down transition time if you plan to use this type of Vault integration method.
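
For example, an application that already reads credentials from environment variables needs no source code changes at all; only where the variable's value comes from changes. A minimal Python sketch (the DB_PASSWORD variable name is illustrative):

```python
import os

def get_db_password() -> str:
    """Read the database password from the environment.

    The application doesn't care whether the variable was populated
    from a Kubernetes Secret, Vault, or a local shell export.
    """
    password = os.environ.get("DB_PASSWORD")
    if password is None:
        raise RuntimeError("DB_PASSWORD is not set")
    return password
```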

There are two sub-methods under this umbrella, either of which you can use to integrate Vault into your cloud environment — Kubernetes Secrets or the Vault Agent Injector. Each of these sub-methods has the advantages listed above, as well as its own specific pros and cons.

Kubernetes Secrets (Sub-Method 1)

This is the simplest method of integrating Vault into your cloud environment. Specifically, the secrets stored in Vault will be read, Base64 encoded, and stored in a Kubernetes Secret. From there, deployments can pull in the secrets defined in the Kubernetes Secret as environment variables, which a developer can then reference in their code. Multiple tools can be used to pull the secrets from Vault and store them in your cloud environment, and the choice of tool comes down to team preference. For example, if you wanted the tool to run in your continuous integration and continuous delivery (CI/CD) pipeline, you could use a tool like the argocd-vault-plugin. On the other hand, if you wanted to deploy and monitor the tool in-cluster, you could use a Helm deployment like kubernetes-external-secrets.

Pro: Create non-pod Kubernetes resources. Since this method creates Kubernetes Secrets, any Kubernetes resource can natively reference them: a container can pull them in as environment variables through the secretKeyRef field in its deployment YAML, and an Ingress can reference a TLS Secret by name. This is useful for non-pod resources like Ingresses, which can’t spawn sidecars but still need to use the secrets stored in Vault.
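
As a sketch of what this looks like in a deployment, assuming a Secret named app-credentials has already been populated from Vault (all names here are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: stock-trader          # illustrative name
spec:
  replicas: 1
  selector:
    matchLabels:
      app: stock-trader
  template:
    metadata:
      labels:
        app: stock-trader
    spec:
      containers:
      - name: app
        image: example/stock-trader:latest   # illustrative image
        env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: app-credentials          # Secret populated from Vault
              key: db-password
```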

Con: Kubernetes Secrets are stored as Base64 encoded values. Even though Kubernetes Secrets should be secured through RBAC/IAM access in a cloud environment, it is still not good practice to keep secrets as Base64 encoded strings. Base64 is an encoding, not encryption: an encoded value is essentially plaintext because anyone can instantly decode it. There are multiple websites online that decode Base64 strings on the spot, so it would be trivial for a malicious individual to read your secrets if they obtained the encoded values.
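
To make the point concrete, decoding a Base64 value takes one line in any language; here in Python:

```python
import base64

# A value as it might appear in a Kubernetes Secret manifest
encoded = "cGFzc3dvcmQ="

# "Decrypting" it requires no key at all
decoded = base64.b64decode(encoded).decode("utf-8")
print(decoded)  # -> password
```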

Vault Agent Injector (Sub-Method 2)

The Vault Agent Injector is a complementary tool offered by HashiCorp which injects an init container and sidecar into an application’s pods to mount a shared volume of secrets from Vault and keep them updated. More information about the tool can be found here.

Overview of Vault Agent Injection
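
In practice, injection is driven by annotations on the pod template, which the injector's mutating webhook reads at pod creation time. A minimal sketch (the role name, secret name, and Vault path are illustrative):

```yaml
spec:
  template:
    metadata:
      annotations:
        # Enable the injector for this pod
        vault.hashicorp.com/agent-inject: "true"
        # Vault Kubernetes-auth role the agent authenticates as (illustrative)
        vault.hashicorp.com/role: "stock-trader"
        # Render the secret at foo/bar into /vault/secrets/db-creds
        vault.hashicorp.com/agent-inject-secret-db-creds: "foo/bar"
```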

Pro: Secrets are only stored in necessary pods. This is an advantage on two levels: scoping and mitigation. To illustrate scope, imagine a microservices-based application where each microservice is developed by a separate team. There’s no reason for Team B to have access to Team A’s database credentials, and there’s no reason for Team A to have access to credentials for a service (e.g., IBM Event Streams) that only Team B uses. Exposing credentials to more people than necessary just creates more avenues for potential data leaks. This falls in line with the (more formal) Principle of Least Privilege, which states that…

In a particular abstraction layer of a computing environment, every module (such as a process, a user, or a program, depending on the subject) must be able to access only the information and resources that are necessary for its legitimate purpose.

On the mitigation side, this is an advantage because it means only a subset of your secrets is exposed at once if a container is exploited. More importantly, only a fraction of your sensitive data would be exposed, instead of every bit of data used by your application. Another small advantage is the time saved when revoking leaked secrets and generating new ones: rotating 5 secrets is far faster than rotating 500, and a faster rotation limits how much data is exposed in the meantime.

Pro: Ability to refresh secrets without creating a new pod. The traditional Kubernetes Secrets method of updating secrets in a pod looks something like this:

  1. Edit the Kubernetes Secret with updated credentials
  2. Terminate all pods which use the updated secret
  3. Wait for all the terminated pods to be re-created and ready

There are some glaring issues here, the main one being downtime. If all the pods of a particular service need to be terminated because they need updated secrets to work, some portion of the application won’t be able to receive or serve requests for a period of time, which is unacceptable. Another issue is scale: depending on which secrets need to be updated, there could be hundreds of pods that need to be terminated. Keeping track of which pods require which secrets can quickly become a tangled mess in an environment with a large volume of pods. With the Vault Agent Injector method, the problems above become irrelevant because the secrets in the pods can be refreshed without having to terminate any pods in the first place. This is possible through the Vault sidecar, which updates the secrets in the shared Vault volume mounted in each container’s file system.
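
Because the sidecar rewrites the rendered file in place, an application can pick up rotated credentials by re-reading the file each time it needs them, rather than caching the value at startup. A minimal Python sketch (the /vault/secrets/db-creds path assumes the injector's default mount and an illustrative secret name):

```python
from pathlib import Path

# Default injector mount point; file name depends on your annotations
SECRET_FILE = Path("/vault/secrets/db-creds")

def current_db_password(secret_file: Path = SECRET_FILE) -> str:
    """Re-read the injected secret on every call so updates written
    by the Vault sidecar are picked up without a pod restart."""
    return secret_file.read_text().strip()
```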

Con: Secrets in containers are stored in plaintext. Even though containers should be secured by administrators through RBAC/IAM, it is still not completely secure for container secrets to be mounted in plaintext. One reason is that bad actors who gain access to the container itself have easy access to the secrets by simply combing through the filesystem. Another reason is that backups, such as etcd snapshots or node-level filesystem backups, can capture those plaintext secrets, which adds another point of vulnerability that an attacker could exploit to obtain them.

Con: Extra overhead needed to inject and refresh secrets in pods (init container + sidecar). The Vault Agent Injector uses an init container to pull the secrets from Vault and mount them in a shared volume in all containers in the pod before creation. Afterwards, a sidecar container is created which monitors Vault and updates the shared volume if the relevant secret is updated in Vault. Both the init container and the sidecar consume cluster resources when they are created and (in the case of the sidecar) maintained. The default resource limits for these containers are 500m of CPU and 128Mi of memory. These might seem small initially, but the total consumed resources can grow out of hand extremely quickly if your application scales out to large volumes of pods. Another point to keep in mind is that the sidecar might be completely irrelevant, i.e. if your application only uses secrets at creation time and doesn’t leverage the sidecar’s ability to refresh secrets while keeping the pod alive.

Changing Source Code

An alternative method of incorporating Vault would be to use it directly in your application source code. Vault instances expose a REST API that developers can call to authenticate clients, read/write secrets, create new secret paths, etc. For example, executing the following cURL command would read the secret stored at the foo/bar Vault path.

curl -H "X-Vault-Token: <vault_auth_token>" \
-X GET \
http://<vault_address>/v1/foo/bar

For those who don’t want to deal with refreshing authentication tokens, parsing API responses, and the other side effects of coding directly to a REST API, Vault provides a list of Official (read: maintained by HashiCorp) and Community client libraries which you can find here.
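
As a sketch of what coding to the API involves, here is the same read built with only the Python standard library. The response shape assumed below matches Vault's KV version 1 engine, where the secret's key/value pairs sit under the top-level data field:

```python
import json
import urllib.request

def build_vault_request(addr: str, token: str, path: str) -> urllib.request.Request:
    """Build the GET request equivalent to the cURL command above."""
    return urllib.request.Request(
        url=f"{addr}/v1/{path}",
        headers={"X-Vault-Token": token},
        method="GET",
    )

def extract_secret(response_body: str) -> dict:
    """Pull the key/value pairs out of a KV v1 read response."""
    return json.loads(response_body)["data"]

# Usage (requires a reachable Vault server):
#   req = build_vault_request("http://127.0.0.1:8200", "<vault_auth_token>", "foo/bar")
#   with urllib.request.urlopen(req) as resp:
#       secrets = extract_secret(resp.read().decode())
```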

Pro: Most secure way to consume secrets. By coding directly to the Vault REST API or using a client library in your source code, none of your secrets will ever be exposed in Kubernetes resources. The secrets also won’t be found anywhere in the container’s filesystem, meaning they won’t end up in etcd or any other cluster backup storage. Instead, they are requested and consumed in-memory inside the container only. To illustrate: a bad actor could gain access to your Kubernetes cluster, and even to the containers themselves, and it would still be extremely difficult for them to extract your secrets from memory. This drastically reduces the attack vectors that bad actors have to steal your secrets and gain access to your sensitive data. There are avenues to make your secrets even more secure in-memory, such as hardware security modules (HSMs); however, those are more hardware focused and outside the scope of this article.

Con: More developer overhead. When referencing overhead, most people mean hardware overhead like CPU and memory usage, file storage, etc. However, there is also the difficult-to-measure overhead of time. Although the HashiCorp Vault client libraries have multiple use case examples, there will still be developer time spent learning the full library specs. Additionally, the use case examples provided by HashiCorp don’t encompass all the functionality that you can leverage from a client library, some examples being dynamic credentials and Encryption-as-a-Service. If your team ever wants to use these Vault features, it would mean more developer time away from the final product. Another point to mention is that there aren’t many official libraries offered by HashiCorp; most of them are open-source community libraries. With security in mind, it might be necessary to spend the time to vet the source code of the libraries you want to use, or even create an in-house Vault library that suits your needs.

Con: Vendor Lock-in. Although Vault is a very common solution to secrets management, there are alternatives you could adopt in the future. Some examples are IBM Cloud Secrets Manager, AWS Secrets Manager, and Azure Key Vault. Other than IBM Cloud Secrets Manager (which has Vault API integration), these secrets management solutions don’t interface with the standard Vault API. This means you would need to make significant code changes and use their specific libraries if you wanted to switch to one of the managed secrets managers listed above (excluding IBM Cloud SM). Note that this is not a direct disadvantage of using Vault but more of a future-proofing consideration. If you are fine with managing your own Vault instance, this won’t pose a problem.

Conclusion

This article only scratches the surface of Vault’s functionality and secrets management in general, but I hope that it has helped you become more educated about the topic and better prepared to make your cloud environment more secure. For more information about Vault and its use cases, head over to Vault’s website. Thanks for reading!

Special thanks to Greg Hintermeister and Grzegorz Smolko for reviewing this article!
