Transparent datasource routing in Spring

Sep 3, 2019

Level: medium / advanced

The Spring framework is said to be extremely flexible. But what does that mean? Its authors claim that Spring lets you adjust almost any piece of behavior to your needs and make the framework work for you. It’s no different with the DataSource interface, which you can implement in any way you like. The question is: how do you manage multiple datasources when the endpoint depends on the user?

Use case

A while back I was working on a medium-size SaaS B2B system for clients like banks or call centers. One of our key concerns was data privacy — our users were crazy about it. Most of them didn’t accept the idea of mixing their and other clients’ data within the same database, so logical separation was not an option. We thus decided to go for physical data separation. Each client had their own discrete database, totally separated from the others.

In this case, three challenges were identified:

How to deal with routed datasources?
How to manage them dynamically at runtime for each new client?
How to make it transparent in code?

Databases that can be added or removed by the user are not a problem developers face often, but handling them well is essential when dealing with multiple DBs. What is more, we wanted to get rid of the human factor, so that no developer would need to think about per-client datasource routing while implementing new features. How did I solve that? Let’s dig deeper.

Solution

To route between datasources, every client has to be stored along with its database URL. Because this lookup has to be extremely fast, we decided to go for DynamoDB with a local cache. DB instances for clients are created in advance. New clients are stored as they register in the app, in the following steps:

Get a random database from the databases pool
Store the client with a database endpoint
Send an event to the router with the new client
Add a brand new DB to the pool
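
The registration steps above can be sketched in plain Java. This is only an in-memory illustration, not the original implementation: ClientRegistry, register, the provisioner and the router listener are hypothetical names, a map stands in for DynamoDB, and the pool hands out the next free database rather than a random one.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.BiConsumer;
import java.util.function.Supplier;

// Hypothetical sketch of the registration flow; a map stands in for the client store.
class ClientRegistry {
    private final Deque<String> freeDatabases = new ArrayDeque<>();      // endpoints created in advance
    private final Map<String, String> endpointByClient = new ConcurrentHashMap<>();
    private final Supplier<String> provisioner;                // creates a new database, returns its endpoint
    private final BiConsumer<String, String> routerListener;   // receives (clientDomain, endpoint)

    ClientRegistry(Iterable<String> initialPool,
                   Supplier<String> provisioner,
                   BiConsumer<String, String> routerListener) {
        initialPool.forEach(freeDatabases::add);
        this.provisioner = provisioner;
        this.routerListener = routerListener;
    }

    synchronized String register(String clientDomain) {
        if (endpointByClient.containsKey(clientDomain)) {
            throw new IllegalStateException("Client already registered: " + clientDomain);
        }
        String endpoint = freeDatabases.poll();              // 1. take a database from the pool
        if (endpoint == null) {
            throw new IllegalStateException("No free database in the pool");
        }
        endpointByClient.put(clientDomain, endpoint);        // 2. store the client with its endpoint
        routerListener.accept(clientDomain, endpoint);       // 3. tell the router about the new client
        freeDatabases.add(provisioner.get());                // 4. refill the pool with a new database
        return endpoint;
    }

    synchronized int freeCount() {
        return freeDatabases.size();
    }

    String endpointOf(String clientDomain) {
        return endpointByClient.get(clientDomain);
    }
}
```

Refilling the pool at registration time keeps its size constant, so the next client never has to wait for a database to be provisioned.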

Then, every request is processed as follows:

Add the client key (e.g. the name) to the request header
Read the client key before Spring controller and store it in thread-local
Process the request as usual
Get the current client domain, retrieve the DB connection string and connect to the DB in RoutableDatasource, which implements the standard DataSource interface

With this approach, the developer does not need to know which client is currently being served, except in asynchronous calls or in actions that span more than one client. Let’s now see how it’s coded.

Transparency

A servlet container processes each request in a single thread. Spring uses ThreadLocal to store information about the request, and so did I. Using @ControllerAdvice and @InitBinder makes it easy to save client data.

@ControllerAdvice
public class DomainBinder {    
    @InitBinder
    public void bindDomain(@RequestHeader Optional<String> domain) {
       domain.ifPresent(ThreadLocalClientResolver::setClientDomain);
    }
}

This example shows how to get the client domain from a request header, but we can extract this data from the path, a cookie or any other part of the request as well. An @InitBinder method can also take an HttpServletRequest parameter and read any data from it.

To store a client domain, I use ThreadLocalClientResolver, which is a simple wrapper for ThreadLocal:

public class ThreadLocalClientResolver {

    private static final ThreadLocal<String> CLIENT_DOMAIN = new ThreadLocal<>();

    // Returns the domain bound to the current thread, or empty if none is set
    public Optional<String> resolve() {
        String domain = CLIENT_DOMAIN.get();
        if (StringUtils.isBlank(domain)) return Optional.empty();
        return Optional.of(domain);
    }

    public static void setClientDomain(String clientDomain) {
        CLIENT_DOMAIN.set(clientDomain);
    }

    public static String getClientDomain() {
        return CLIENT_DOMAIN.get();
    }

    public static void cleanClientDomain() {
        CLIENT_DOMAIN.remove();
    }
}

Now we can get the domain in almost any part of the code, but not in @Async code. An asynchronous method runs on a different thread, and ThreadLocal variables are not copied to that thread.
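
The effect is easy to reproduce without Spring. The sketch below uses a plain ExecutorService as a stand-in for the executor behind @Async; ThreadLocalVisibilityDemo and its method names are made up for this illustration.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

// Illustration only: a value set on one thread is invisible to another thread.
class ThreadLocalVisibilityDemo {
    private static final ThreadLocal<String> CLIENT_DOMAIN = new ThreadLocal<>();

    // Sets the domain on the calling thread and reads it back on the same thread.
    static String readOnCallerThread(String domain) {
        CLIENT_DOMAIN.set(domain);
        try {
            return CLIENT_DOMAIN.get();
        } finally {
            CLIENT_DOMAIN.remove();
        }
    }

    // Sets the domain on the calling thread, then reads it from a worker thread.
    static String readOnWorkerThread(String domain) throws Exception {
        CLIENT_DOMAIN.set(domain);
        ExecutorService executor = Executors.newSingleThreadExecutor();
        try {
            Callable<String> task = CLIENT_DOMAIN::get;   // runs on the worker thread
            return executor.submit(task).get();
        } finally {
            executor.shutdown();
            CLIENT_DOMAIN.remove();
        }
    }
}
```

Here readOnCallerThread("acme") returns "acme", while readOnWorkerThread("acme") returns null, because the worker thread has its own, empty copy of the variable.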

Async functions

To use ThreadLocal variables in async code, we need to pass their values to the async function explicitly and set them again as thread-local variables on the new thread. To simplify this, create a function that takes the action as a parameter:

public static <T> T call(String domain, Callable<T> action) throws Exception {
    // Remember the domain currently bound to this thread so it can be restored afterwards
    String old = ThreadLocalClientResolver.getClientDomain();
    try {
        ThreadLocalClientResolver.setClientDomain(domain);
        return action.call();
    } finally {
        ThreadLocalClientResolver.setClientDomain(old);
    }
}

This approach can be used while executing actions for all clients or for changing the context for a while. For example:

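public void runForAllClients(Callable<?> task) throws Exception {
    // call(...) declares a checked Exception, so a plain loop is used instead of forEach
    for (Client c : clientDao.findAll()) {
        call(c.getDomain(), task);
    }
}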
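
The same idea can be packaged as a wrapper that captures the domain on the submitting thread and restores it on whichever thread ends up running the task. This is a minimal sketch under my own naming: DomainContext and propagating are hypothetical, and the class carries its own ThreadLocal so that the example is self-contained.

```java
import java.util.concurrent.Callable;

// Hypothetical helper: carries the caller's client domain over to the executing thread.
class DomainContext {
    private static final ThreadLocal<String> CLIENT_DOMAIN = new ThreadLocal<>();

    static void set(String domain) {
        CLIENT_DOMAIN.set(domain);
    }

    static String get() {
        return CLIENT_DOMAIN.get();
    }

    static void clear() {
        CLIENT_DOMAIN.remove();
    }

    // Captures the domain now, on the submitting thread; applies it later, around the task.
    static <T> Callable<T> propagating(Callable<T> task) {
        String captured = CLIENT_DOMAIN.get();
        return () -> {
            String previous = CLIENT_DOMAIN.get();
            CLIENT_DOMAIN.set(captured);
            try {
                return task.call();
            } finally {
                // Restore the worker thread's state so pooled threads do not leak a domain
                if (previous == null) {
                    CLIENT_DOMAIN.remove();
                } else {
                    CLIENT_DOMAIN.set(previous);
                }
            }
        };
    }
}
```

A task wrapped with DomainContext.propagating(...) before it is submitted to an executor sees the submitter's domain, and the worker thread is left clean afterwards. In a Spring application the same capture-and-restore logic could live in a TaskDecorator registered on the task executor, so that @Async methods get the context without any changes at the call site.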

Dynamic datasource routing

We know how to set the client before the controller method runs. To make the code fully transparent, it’s crucial to select the client datasource just before the database call. My solution is to use AbstractRoutingDataSource from the org.springframework.jdbc.datasource.lookup package. Initialising it is as simple as:

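public class DataSourceRouter extends AbstractRoutingDataSource {

    public DataSourceRouter(Map<String, DataSource> clientsDataSources) {
        // setTargetDataSources expects a Map<Object, Object>, hence the raw cast
        setTargetDataSources((Map) clientsDataSources);
    }

    @Override
    protected String determineCurrentLookupKey() {
        // The key is the client domain bound to the current thread
        return ThreadLocalClientResolver.getClientDomain();
    }
}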

This class extends AbstractRoutingDataSource, which extends AbstractDataSource, which in turn implements javax.sql.DataSource. It can therefore be passed to Hibernate or jOOQ like any ordinary datasource, which gives you full transparency. The target database is determined just before the actual call, when a connection is requested.
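
To make the lookup idea concrete without pulling in Spring, here is a simplified stand-in written in plain Java. It is not Spring's implementation: KeyedRouter, register and current are hypothetical names, and the targets can be any type. It also shows one way new clients could be registered at runtime, which this sketch supports through a concurrent map.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Supplier;

// Simplified stand-in for key-based routing; not Spring's AbstractRoutingDataSource.
class KeyedRouter<T> {
    private final Map<String, T> targets = new ConcurrentHashMap<>();
    private final Supplier<String> currentKey;   // e.g. reads the thread-local client domain

    KeyedRouter(Supplier<String> currentKey) {
        this.currentKey = currentKey;
    }

    // Targets can be added while the application is running.
    void register(String key, T target) {
        targets.put(key, target);
    }

    // Resolves the target at the moment it is needed, not when the router is created.
    T current() {
        String key = currentKey.get();
        if (key == null) {
            throw new IllegalStateException("No client domain bound to the current thread");
        }
        T target = targets.get(key);
        if (target == null) {
            throw new IllegalStateException("No target registered for client: " + key);
        }
        return target;
    }
}
```

Failing loudly on a missing or unknown key is a deliberate choice in this sketch: in a system built around physical data separation, silently falling back to some default database would be worse than rejecting the call.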

Initialize Datasource

Before you route to any database, you need to create a router at the application start like any other datasource. It can be done as follows:

public DataSourceRouter create(List<Client> clientsData) {
    // Build one pooled datasource per client, keyed by the client domain
    Map<String, DataSource> clientDataSources = clientsData.stream()
            .collect(toMap(Client::getDomain, c -> create(c)));
    return new DataSourceRouter(clientDataSources);
}

This approach does not affect the connection pool or any other datasource features. The create function uses the Apache DBCP2 datasource with connection pooling:

public static DataSource create(Client c) {
    PoolableConnectionFactory factory = new PoolableConnectionFactory(
            new DriverManagerConnectionFactory(c.endpoint, c.user, c.password), null);
    GenericObjectPool<PoolableConnection> connectionPool = new GenericObjectPool<>(factory);
    connectionPool.setConfig(poolConfig(maxTotalConnections, maxIdleConnections));
    connectionPool.setAbandonedConfig(abandonedConfig());
    factory.setPool(connectionPool);
    factory.setMaxOpenPrepatedStatements(maxTotalConnections);
    factory.setConnectionInitSql(singletonList(INIT_SQL));
    return new PoolingDataSource<>(connectionPool);
}

Summary

Routing datasources is a feature that comes with Spring out of the box thanks to the AbstractRoutingDataSource class, which implements DataSource. To make it transparent for developers, store a client identifier in a thread-local variable and read it in the determineCurrentLookupKey method.