Multi-Tenancy with JPA

Gregor Tudan
Aug 8, 2017

Multi-tenancy support was a big topic for Java EE 7, and many were disappointed when it got dropped pretty much at the last minute. With containers, IT went in a different direction, so multi-tenancy is no longer such a big deal: you can just spin up an instance (e.g. a Docker container) for each tenant and you’re good to go.

In a project of mine, we did exactly that, but later noticed that all those containers wasted quite a lot of resources on less active tenants, so we wanted to evaluate whether the “big-iron” approach was feasible with Java EE. One of the bigger challenges was JPA, as all persistent data had to be clearly and securely separated.

Another goal was to change as little as possible, as we had an existing application and wanted to keep running it in full separation mode (no application level multi-tenancy) for bigger clients.

The code discussed below is available on GitHub, along with a small example project: Cofinpro/jpa-multitenancy (Proof of Concept for JPA multi-tenancy)

Choosing a separation mode

Hibernate (our preferred persistence provider) has supported multi-tenancy for quite a while. There are three modes to choose from:

  • Discriminator column: every table has a tenant column
  • Schema separation: all tenants use the same database, but each tenant has its own schema
  • Datasource separation: there is a datasource for every tenant, each connecting to a different database

The first option was a no-go for us, as the separation was not strict enough for our purposes. Tenants usually don’t like their data lying right next to that of others, and it’s easy to make mistakes that result in security issues. Option 3, on the other hand, might cause quite a lot of connections to the database server, as each tenant requires a separate connection pool. So option 2 promised the best balance between level of separation and performance for us.

Since JPA has no built-in support for multi-tenancy, we took a look at how Hibernate addresses the issue: Hibernate User Guide: Multi-Tenancy

There are two parts to the schema approach:

  • Determining the current tenant
  • Selecting the right connection or schema to use

Determining the current tenant

The easiest way to select the tenant would be to specify it when creating the EntityManager. In pure Hibernate, you can do the equivalent when opening a session: sessionFactory.withOptions().tenantIdentifier(...).openSession().

But while it’s quite easy to get a session or sessionFactory from an EntityManagerFactory, there is no easy way to do this in reverse.

The second strategy is to implement a CurrentTenantIdentifierResolver and configure it in the persistence.xml. The interface is quite simple: it’s basically a method that returns the current tenant as a string. Where you get the tenant identifier from is up to you. A very simple implementation for an HTTP-based service can read it from the current request.

Looking up the HttpServletRequest became very easy with CDI 1.1: it can simply be injected. From there, we just grab the tenant from a URL query parameter. This is horribly insecure and must not be done in production; it is just a proof of concept! One proper way to do this would be to derive the tenant from the authenticated user, using httpServletRequest.getUserPrincipal().
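Whatever the source of the identifier, it later ends up selecting a schema, so it seems prudent to validate it before handing it to Hibernate. The following is a minimal sketch of that step, not code from the example project: the class name TenantNames, the allowed character set and the fallback behaviour are all assumptions made for illustration.

```java
import java.util.regex.Pattern;

/** Hypothetical helper; not part of the original example project. */
final class TenantNames {

    // Assumption: tenant identifiers double as schema names, so only a
    // conservative set of characters is accepted.
    private static final Pattern VALID = Pattern.compile("[a-z][a-z0-9_]{0,62}");

    private TenantNames() {}

    /**
     * Returns the tenant to use for a raw request value.
     * Missing values fall back to defaultTenant; malformed values are rejected.
     */
    static String resolve(String raw, String defaultTenant) {
        if (raw == null || raw.isEmpty()) {
            return defaultTenant;
        }
        if (!VALID.matcher(raw).matches()) {
            throw new IllegalArgumentException("Invalid tenant identifier");
        }
        return raw;
    }
}
```

A resolver’s resolveCurrentTenantIdentifier() method could then return something like TenantNames.resolve(request.getParameter("tenant"), "public"), where the parameter name and the default schema are again placeholders.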

Getting a database connection

Next up is the connection provider: Hibernate’s default is to give every tenant its own datasource, which it derives from the tenant name. When using schema separation, the connections can be shared from a single connection pool, which makes better use of the available resources. There are at least two ways this could be done:

  1. Specify the database credentials when getting a connection from the datasource. JCA will match the connection if there is already one available for this user, or create a new one.
  2. Use a single user and set the schema search path accordingly

For the sake of simplicity, I’m going with the second option in this tutorial. First, we need to create our own connection provider. There’s an AbstractMultiTenantConnectionProvider which we can use as a base class.

I used Hibernate’s own injection mechanism to get the configuration and grab the datasource from there, so no extra configuration is necessary for the provider, and it integrates nicely with JPA’s persistence.xml.

We need to implement one method for getting a connection for a specific tenant and one for getting any connection. Hibernate will use the latter for internal tasks (e.g. figuring out the database dialect). Since we use schema separation, we can just go ahead and pass the same connection provider in both cases.

Here’s the tricky part: Hibernate needs a method for getting a connection for a tenant. I assume that the schema name matches the tenant name and configure the search path for the current connection. This part is database-specific, so the statement might vary depending on your setup. The releaseConnection method is the counterpart: it resets the search path back to the default schema (if configured).
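The schema switch itself might be sketched as follows. This is an illustration under assumptions rather than the provider from the repository: it assumes PostgreSQL (SET search_path), and the class and method names are hypothetical. The idea is that the provider’s getConnection(tenantIdentifier) would call useSchema on the borrowed connection, and releaseConnection(tenantIdentifier, connection) would call resetSchema before the connection goes back to the pool.

```java
import java.sql.Connection;
import java.sql.SQLException;
import java.sql.Statement;

/** Hypothetical helper; the names are illustrative, not from the original project. */
final class SchemaSwitcher {

    private SchemaSwitcher() {}

    /**
     * Quotes a schema name as a SQL identifier, doubling embedded quotes.
     * Note: quoted identifiers are case-sensitive in PostgreSQL.
     */
    static String quoteIdentifier(String name) {
        return "\"" + name.replace("\"", "\"\"") + "\"";
    }

    /** Builds the PostgreSQL statement that points the session at one schema. */
    static String searchPathStatement(String schema) {
        return "SET search_path TO " + quoteIdentifier(schema);
    }

    /** Call after borrowing a connection for a tenant. */
    static void useSchema(Connection connection, String schema) throws SQLException {
        try (Statement statement = connection.createStatement()) {
            statement.execute(searchPathStatement(schema));
        }
    }

    /** Call before handing the connection back to the shared pool. */
    static void resetSchema(Connection connection, String defaultSchema) throws SQLException {
        useSchema(connection, defaultSchema);
    }
}
```

Quoting the identifier is a second line of defence only; the tenant name should already have been validated by the time it reaches this point. Other databases need a different statement, or may work with the standard JDBC call Connection.setSchema.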

Persistence XML

That’s it for the coding part. Let’s look at the configuration in persistence.xml.

We need to configure our two multi-tenancy classes and tell Hibernate which separation strategy to use. The rest is straightforward JPA config.
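The relevant part of such a persistence.xml might look like the fragment below. The three hibernate.* property names are the ones documented in the Hibernate 5 user guide, so check them against the version you are running; the persistence unit name, the datasource JNDI name and both class names are placeholders, not the values used in the repository.

```xml
<persistence-unit name="example">
  <!-- Placeholder: the Java EE 7 default datasource -->
  <jta-data-source>java:comp/DefaultDataSource</jta-data-source>
  <properties>
    <!-- Separation strategy: one schema per tenant -->
    <property name="hibernate.multiTenancy" value="SCHEMA"/>
    <!-- Hypothetical class names for the two multi-tenancy classes -->
    <property name="hibernate.tenant_identifier_resolver"
              value="com.example.TenantResolver"/>
    <property name="hibernate.multi_tenant_connection_provider"
              value="com.example.SchemaConnectionProvider"/>
    <!-- Standard JPA 2.1 switch for turning schema generation off -->
    <property name="javax.persistence.schema-generation.database.action"
              value="none"/>
  </properties>
</persistence-unit>
```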

Schema generation has been disabled, as Hibernate does not support it when running in multi-tenant mode. This is probably fine, as multi-tenancy is something found in more complex applications, which seldom rely on this feature. I guess it shouldn’t be too hard to come up with a way to do schema generation with Liquibase or Flyway in such environments. I’ll blog about it once I figure it out :-) (Update: here it is: Multi-tenancy support for Liquibase)

Test Run

Let’s do a quick check. The GitHub repository also contains a small example project. Its database is populated with one user per tenant schema: ernie for tenant1 and bert for tenant2.

Querying http://localhost:8080/user?tenant=tenant1 will return:

[{"id":1,"name":"ernie"}]

If we use “tenant2” we get:

[{"id":1,"name":"bert"}]

As you can see, both entities have the ID 1, since they are stored in separate tables, one per tenant schema. Hibernate’s second-level cache is able to deal with this by appending the tenant to the identifier, so we don’t have to worry about it. I haven’t checked whether the query cache handles this as well.

Finally

This setup allows you to have multi-tenancy in your JPA setup without changing any application code, as long as you have a way to figure out the current tenant (e.g. by security realm, URL or hostname). There are limits to portability, as this setup relies heavily on Hibernate, and the statements for changing the schema search path are database-specific as well. But it’s close enough to “real” multi-tenancy support in JPA :-)

If you’re reading this and have figured out how to select the tenant on the entity manager itself, I would love to hear about it!

Originally published at gist.github.com.
