Securing udaan with HashiCorp Vault

Sai Sharan Tangeda
Published in engineering-udaan
6 min read · May 17, 2023

At udaan, we had always relied on Kubernetes Secrets for storing application secrets, which were accessible by our Infrastructure Gatekeepers. But as the use-cases and the number of engineers grew, it became increasingly challenging to continue with this process.

We started observing scaling challenges that affected the productivity and maintainability of our systems, such as:

  • Handling cloud secrets expiry
  • Manual rotation of secrets
  • Granting database accesses to engineers
  • Local secrets sharing
  • Updating secrets across all micro-services

These shortcomings prompted us to explore and adopt HashiCorp Vault. Vault’s documentation (https://developer.hashicorp.com/vault/docs) is descriptive and detailed enough to help anyone get started. Therefore, in this blog we will skip the standard Vault setup and focus on how we operationalised Vault at scale at udaan, along with our practices for securing Vault.

Vault usage architecture at udaan

1. Authentication & Authorization

For authentication and access control, we classified the different ways of accessing Vault into the following categories:

a. User involved login

b. Containerised Services & Jobs

1.1. User Involved Login

Vault supports many authentication methods such as OIDC, GitHub, LDAP, tokens etc. We used OIDC backed by Azure AD to enable authentication and authorization.

1.1.1. Access Control

Once the Azure Service Principal was integrated, we were easily able to bring Azure user groups into Vault and configure each one as an External Group. To further simplify the creation and integration of an external group, we wrote a shell script that uses Vault’s CLI to auto-generate the External Group in Vault.
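As a rough illustration of what such automation involves, the Python sketch below builds the two Vault CLI commands needed for an external group: one that creates the group and one that aliases it to the Azure group. It is not the script we run. The function name, group name, policy names, Azure group object ID and mount accessor are all hypothetical placeholders; only the `identity/group` and `identity/group-alias` paths come from Vault itself.

```python
import shlex


def external_group_commands(group_name, policies, azure_group_object_id, oidc_mount_accessor):
    """Build the Vault CLI commands that create an external group and its alias.

    The group ID printed by the first command must replace the <group-id>
    placeholder before the second command is run.
    """
    create_group = [
        "vault", "write", "-field=id", "identity/group",
        f"name={group_name}",
        "type=external",
        f"policies={','.join(policies)}",
    ]
    create_alias = [
        "vault", "write", "identity/group-alias",
        f"name={azure_group_object_id}",         # hypothetical Azure AD group object ID
        f"mount_accessor={oidc_mount_accessor}",  # accessor of the OIDC auth mount
        "canonical_id=<group-id>",                # placeholder, filled in by the caller
    ]
    # shlex.join quotes any argument that is not shell-safe.
    return [shlex.join(create_group), shlex.join(create_alias)]
```

The commands are returned as strings rather than executed, so they can be reviewed or logged before anything touches Vault.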

1.2. Containerised Services & Jobs

Vault provides multiple ways to integrate with Kubernetes, containerised systems, Databricks jobs or traditional VM-based deployments.

We wanted to adopt a single, environment-agnostic solution, which led us to the AppRole authentication method. Through this, we are now able to completely isolate secret access for micro-services from the type of compute they use.
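For readers unfamiliar with AppRole, a login is a single HTTP call that exchanges a role ID and secret ID for a client token. The sketch below only builds the request and parses a response; it does not contact a server. The function names and the example address are hypothetical, while the `/v1/auth/approle/login` path and the `auth.client_token` response field are part of Vault's HTTP API.

```python
import json
import urllib.request


def build_approle_login_request(vault_addr, role_id, secret_id):
    """Build (but do not send) the HTTP request for an AppRole login."""
    body = json.dumps({"role_id": role_id, "secret_id": secret_id}).encode("utf-8")
    return urllib.request.Request(
        url=f"{vault_addr.rstrip('/')}/v1/auth/approle/login",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def extract_client_token(login_response):
    """Pull the client token out of an already-parsed login response."""
    return login_response["auth"]["client_token"]
```

Because the same call works from a Kubernetes pod, a Databricks job or a VM, nothing in the login path depends on where the service runs.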

1.2.1. Vault for Kotlin & Python

Once the auth method was locked in, we wanted the integration into our application framework to be very straightforward for our engineering teams. Hence we built custom client SDKs for Vault which could:

a. Cache secrets based on the use-case to avoid repeated API calls. Unless real-time secret updates are needed, this is recommended for all use-cases.

b. Seamlessly fit into different environments like Local, Dev, Stage & Production without any code change.
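The caching behaviour in point (a) can be sketched as a small time-based cache wrapped around whatever function actually calls Vault. This is a minimal illustration, not our SDK: the class name, the `fetch` callable, the default TTL and the injectable `clock` are all assumptions made for the example.

```python
import time


class CachedSecretReader:
    """Serve secrets from memory until their cache entry expires."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch        # callable that reads one secret path from Vault
        self._ttl = ttl_seconds    # how long a cached secret is considered fresh
        self._clock = clock        # injectable so the expiry logic can be tested
        self._cache = {}           # path -> (expires_at, secret)

    def read(self, path):
        now = self._clock()
        entry = self._cache.get(path)
        if entry is not None and entry[0] > now:
            return entry[1]        # still fresh: no API call
        secret = self._fetch(path)
        self._cache[path] = (now + self._ttl, secret)
        return secret
```

A use-case that needs real-time updates can pass `ttl_seconds=0`, which makes every read go back to Vault.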

1.2.2. Secret Engines Setup:

For isolating secrets and permissions across environments, we created four secret engines, namely kv-local, kv-dev, kv-stage and kv-prod. This can change based on individual organisational needs.

Once this was done, the next step was isolating permissions and providing space for the plethora of services hosted within udaan. For this, we wrote another shell script as part of our Vault IaC that automates the creation of secrets and the corresponding AppRoles for every new micro-service that is built.
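One way a client could map an environment name to the matching engine is sketched below. It assumes the engines are KV version 2 mounts, where reads go through `<mount>/data/<path>`; the post does not state the KV version, and the function name and example address are hypothetical.

```python
# Environment name -> secret engine mount, as named in this post.
KV_MOUNTS = {"local": "kv-local", "dev": "kv-dev", "stage": "kv-stage", "prod": "kv-prod"}


def secret_read_url(vault_addr, environment, secret_path):
    """Return the KV v2 read URL for a secret in the given environment."""
    try:
        mount = KV_MOUNTS[environment]
    except KeyError:
        raise ValueError(f"unknown environment: {environment!r}") from None
    # KV v2 (assumed here) exposes secret data under <mount>/data/<path>.
    return f"{vault_addr.rstrip('/')}/v1/{mount}/data/{secret_path.strip('/')}"
```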

Based on an individual organisation’s policy, we can configure which Azure groups can access which secrets in each of the four secret engines. For example, kv-prod may contain sensitive secrets that are shared only on demand, whereas the other environments can be configured more leniently.

We also set up a common secret in each of these engines, containing keys that are used by multiple services, such as the keys to a private Maven repository or PyPI server.

Now that the secret engines are set up, it’s just a matter of having a client that allows the dev team to seamlessly integrate Vault into their applications while remaining infrastructure and environment agnostic.
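The per-service isolation described above boils down to a policy that can read exactly two paths, the service's own secret and the common secret, plus an AppRole bound to that policy. The sketch below generates both. It assumes KV version 2 paths, a common secret literally named `common`, and a `<mount>-<service>` naming scheme for policies; none of these details are taken from our actual IaC script.

```python
def service_policy_hcl(mount, service_name, common_secret="common"):
    """Render a read-only policy for one service's secret plus the shared secret."""
    paths = [f"{mount}/data/{service_name}", f"{mount}/data/{common_secret}"]
    blocks = ['path "%s" {\n  capabilities = ["read"]\n}' % path for path in paths]
    return "\n\n".join(blocks) + "\n"


def onboarding_commands(mount, service_name):
    """Build the CLI commands that register the policy and the service's AppRole."""
    policy_name = f"{mount}-{service_name}"  # hypothetical naming scheme
    return [
        f"vault policy write {policy_name} {policy_name}.hcl",
        f"vault write auth/approle/role/{service_name} token_policies={policy_name}",
    ]
```

Keeping the policy limited to `read` means a compromised service token cannot alter or list secrets belonging to other services.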

1.2.3. Vault Clients:

To achieve this, we built our own Vault clients that abstract the process away from the development team. The Vault client object is created in the same way in every environment.

In the local environment, when VAULT_APP_ROLE_TOKEN is not available, the user’s Vault token kicks in automatically, allowing the dev team to run applications in test/local environments without any configuration or code change.
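The fallback just described can be sketched as a small token-resolution function. This is an illustration under assumptions, not our client code: it assumes the user token is the one the Vault CLI stores in `~/.vault-token` after `vault login`, and the function name and return shape are hypothetical. Only the `VAULT_APP_ROLE_TOKEN` variable name comes from our setup.

```python
import os
from pathlib import Path


def resolve_vault_token(env=None, user_token_file=None):
    """Return (token, source), preferring the service token over the user token."""
    env = os.environ if env is None else env
    token = env.get("VAULT_APP_ROLE_TOKEN")
    if token:
        return token, "approle"
    # Assumed fallback: the token file written by the Vault CLI on `vault login`.
    token_file = Path(user_token_file) if user_token_file else Path.home() / ".vault-token"
    if token_file.is_file():
        user_token = token_file.read_text(encoding="utf-8").strip()
        if user_token:
            return user_token, "user"
    raise RuntimeError("no Vault token found: set VAULT_APP_ROLE_TOKEN or run `vault login`")
```

Application code calls the same function everywhere; which branch is taken depends only on what the environment provides.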

2. Credentials Management & Password Sharing

To avoid password sharing between employees over different channels, we moved responsibility for creating and sharing all secrets to Vault.

2.1. Postgres DB Management:

We leverage Vault’s Database Secrets Engine capability to manage our Postgres Databases.

Vault ensures dynamic secret creation/revocation while providing us the capability to manage TTLs and rotation periods.

Example of Dynamic Role

For dev databases, we allow a more lenient 90-day secret rotation period, whereas production databases have read-only and read-write access profiles with 1-2 day secret rotation.

Even with secret rotation, it is important and recommended to set up firewall rules on your databases so that connections are allowed through trusted networks only.
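To make the access profiles concrete, the sketch below builds the payload for a dynamic role in Vault's Database Secrets Engine. The `db_name`, `creation_statements`, `default_ttl` and `max_ttl` fields are real parameters of the `database/roles/<name>` endpoint, and the SQL follows the PostgreSQL example in Vault's documentation. The function name, the grants on the `public` schema, and the mapping of our rotation periods to TTL values are assumptions made for illustration.

```python
def postgres_role_payload(db_name, read_only, ttl):
    """Build the request body for a dynamic PostgreSQL role."""
    grant = "SELECT" if read_only else "SELECT, INSERT, UPDATE, DELETE"
    return {
        "db_name": db_name,  # name of the configured database connection
        "creation_statements": [
            # Vault substitutes {{name}}, {{password}} and {{expiration}} per lease.
            "CREATE ROLE \"{{name}}\" WITH LOGIN PASSWORD '{{password}}' "
            "VALID UNTIL '{{expiration}}';",
            "GRANT " + grant + ' ON ALL TABLES IN SCHEMA public TO "{{name}}";',
        ],
        "default_ttl": ttl,
        "max_ttl": ttl,
    }
```

Under these assumptions, a dev profile might use `postgres_role_payload("orders-dev", read_only=False, ttl="2160h")` for 90 days, and a production read-only profile `postgres_role_payload("orders-prod", read_only=True, ttl="24h")`. Engineers then read short-lived credentials from `database/creds/<role-name>`.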

This process has allowed us to maintain an audit trail of all database accesses and helped us move away from the insecure practice of sharing passwords over chat threads.

2.2. Service Principals & Azure Credentials:

Creating Azure Service Principals and keeping up with their timely secret rotation is a huge hassle; even a single miss could cause a service disruption across systems. Hence we now leverage Vault’s Azure Secrets Engine, which allows our applications to lease permissions for accessing Azure resources while managing key rotation and dynamic access revocation at the micro-service level.

We configured each service/job to have its own AppRole, which is configurable with our IaC script, and we manage the permissions needed by this AppRole via Vault. Once this setup is done, Vault seamlessly manages and leases credentials for accessing Azure, as illustrated in the HashiCorp tutorial diagram credited below.

Image Credit: https://developer.hashicorp.com/vault/tutorials/secrets-management/azure-secrets
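The leasing flow described above starts with a role definition in the Azure Secrets Engine. The sketch below builds such a role payload. The `azure_roles`, `ttl` and `max_ttl` fields belong to Vault's `azure/roles/<name>` endpoint, which expects `azure_roles` as a JSON-encoded string. The function name, the default TTL values, the Azure role and the scope are hypothetical examples.

```python
import json


def azure_role_payload(azure_role_name, scope, ttl="1h", max_ttl="24h"):
    """Build the request body for a Vault Azure secrets engine role."""
    return {
        # Vault expects the role assignments as a JSON string, not a nested object.
        "azure_roles": json.dumps([{"role_name": azure_role_name, "scope": scope}]),
        "ttl": ttl,          # illustrative lease duration
        "max_ttl": max_ttl,  # illustrative upper bound on renewals
    }
```

A service whose AppRole policy allows reading `azure/creds/<role-name>` receives a client ID and client secret that Vault revokes when the lease ends.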

3. Reliability & Security of Vault

Now that Vault has streamlined and is managing all our operations, the most important problem left to solve is securing Vault itself against abuse and vulnerabilities.

3.1. High Availability & Storage

We configured Vault in High Availability mode (https://developer.hashicorp.com/vault/docs/concepts/ha) with enough redundant deployments across geo-regions to ensure availability even during major network or datacenter failures.

Unseal keys, which are required to unlock a Vault instance, should be distributed strategically as well. They should be kept safely with the gatekeepers so that Vault can be unsealed again in case of a disaster.

3.2. Firewall & Network Security

One can never be too safe, so we have taken the important extra step of ensuring Vault is only accessible from within our private and trusted networks.

This adds a layer of protection against unauthorised access even if the unseal keys or tokens are leaked.

4. Conclusion

There are several ways to adopt HashiCorp Vault. Adoption strategies vary with organisational needs, and what we have shared above is one such approach. As Vault and our use-cases evolve, we will continue to improve and further optimise our processes.

Feel free to reach out to us with any questions or to discuss our HashiCorp Vault implementation in further detail.
