Tenant-aware spring-boot application having shared PostgreSQL DB secured with HashiCorp Vault

5 min readOct 12, 2021

In this article, I will try to cover two parts:

How do we implement a spring-boot application that stores tenant data into a database shared by multiple tenants?
How do we secure the shared database using the credentials rotation mechanism offered by HashiCorp Vault?

The code is available on this GitHub repository containing instructions to run the application step-by-step. These two cases are from the references mentioned at the end of this article. Let us get into some details.

Imagine you want to build a SaaS application that aims to serve 200+ tenants. So, how do we go about architecting such an application from a database point of view? Well, there are two ways to handle this problem of serving 200+ customers. Here is the first approach.

At some point in time, most of us have implemented the single-tenant application where there is no sharing of resources. While the isolation is baked right into the design, there are lots of issues with this approach. Depending on the customers, we might have to support 2–3 application servers, 2–3 DB vendors and 1–2 OS. Also, we have to maintain the old version of the code since not all customers would shift to the latest version. Since the database is not shared, we will have to maintain 200+ databases which is a nightmare. Here is the other approach to serving multiple tenants.

Now, this picture is a lot simpler. Here we have just a single database to maintain, and other resources are also shared. But there is a visible challenge of tenant data isolation. How do we ensure the query performance for one tenant is not affected by other tenants uploading lots of data? For one tenant, how do we ensure that query performance is not affected by the cold data? I will address query performance-related solutions at the end of the article.

In this section, we will see one of the approaches for data isolation on the shared PostgreSQL database. The PostgreSQL database comes with a feature of row-level security (RLS). Here is a link to an article by EDB that showcases the RLS feature of PostgreSQL. Essentially,

For all tables in our database, we have to add a tenant-discriminator column containing tenant-id representing the owner of the row.
Then we have to define a policy that uses this discriminator column to filter the rows based on the current setting of tenant-id for the DB session.
Then we have to enable the RLS for the tables.

RLS is an important feature available in most DB vendors that offer data isolation out of the box. Now, let us see how we can use it for our spring-boot application. First, we need to define some of the classes and interfaces for the base entity.

All of the entities have to extend the AbstractBaseEntity class. TenantListener class automatically sets the tenant-id for the entity whenever update, deletion, and creation happens. Next, we have to define a data-source that sets the value of the tenant-id for the database session.

Notice that DataSourceConfiguration class uses the DB credentials from the environment variables. The spring vault is responsible for setting these variables. We will check this in the next section. The TenantAwareDataSource class sets the value for tenant-id on line 41. Next, we have to create the tables with RLS enabled.

That’s it for the data isolation of the tenants! With this mechanism in place, the accidental spillover of the data won’t be an issue.

Now, let us come to the periodical credential rotation mechanism, one of the best security practices. One of the challenges here is related to updating the data source credential whenever the credentials expire.

First, we have to set up the Hashicorp Vault, create the database connection, a database role, a policy to allow the database role to read the credentials, a user with this role and policy, and finally provide an app role for this user and put it in the application.properties file. The unseal-vault-enable-approle-databases.sh and setup-spring-vault-approle-postgresql.sh files performs these steps. With these things in place, our minimal properties file looks as follows.

Notice that we have not put the DB credentials anywhere in the file. Next, we have to put the configuration to read the DB credentials from the vault and update the data source credentials as they expire.

With this mechanism in place, the HashiCorp Vault periodically updates the DB credentials, later consumed by our spring-boot application with a tenant-aware data source. That’s it for the vault related configurations!

Now, let us address the query performance that I mentioned earlier for the multi-tenant database. PostgreSQL has a feature of the multilevel partitioning of the database table. Here is another article by EDB that showcases this feature. Essentially,

We can first partition the tables by tenant-id. A typical approach would be to hash partition the parent table. A table-partition in PostgreSQL is essentially a table.
We can further partition this table-partition by the creation date of the record.

The first point addresses the query performance issues related to noisy tenants. The second point addresses the query performance issues of the cold data within the same tenant. The query performance on the partitioned table has been improved dramatically since version 12 of PostgreSQL. Here is the link by 2ndquadrant.com showcasing the partitioning performance.

Summary:

To summarise, we have discussed the RLS, multilevel partitioning features of PostgreSQL for isolation, query performance respectively, and automatic credential rotation of PostgreSQL DB using the HashiCorp vault. The working spring-boot application showcasing the RLS and Vault related part is available on https://github.com/shashankkhasare/effective-fiesta.

References:

Tenant-aware spring-boot application having shared PostgreSQL DB secured with HashiCorp Vault

Written by Shashank Khasare