Querying data across microservices

John Freeman
12 min read · Sep 10, 2018

TL;DR

Many of the benefits of microservices depend on their being developed and operated independently. Dependencies between microservices can undermine the benefits of adopting a microservice architecture.

The more microservices query each other directly, the more dependent they become on one another. These dependencies make microservices more exposed to change imposed by other services, more effort to write, and less resilient in production.

Selective data replication allows you to use data from another microservice whilst minimizing the scope of the dependency. Microservices using this approach depend on the schema of their microservice dependencies (and on data replication support); they are not dependent on the remote API of those microservices. This minimizes the cost of the microservice dependency in development and production.

Introduction

When writing monolithic applications, we take for granted how easy it is to query a single relational database for all our data.

With a microservice architecture, your data is spread across multiple databases, and each microservice can only access its own database. There’s also no out-of-the-box solution for joining data from multiple microservices.

The following covers some approaches for querying and joining data across microservices.

Selective data replication

With this approach, we replicate the data needed from other microservices into the database of our microservice. The only coupling between microservices is in the data replication configuration.

Consider using a publish-subscribe replication approach; this will result in loose-coupling between the database schemas. Loose-coupling provides a degree of insulation when dependencies change.
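
As a rough sketch of the consuming side (the table and event shape here are illustrative, not prescribed by any particular replication tool): a subscriber applies published change events to a local replica table, using idempotent upserts so that redelivered events are harmless under at-least-once delivery:

```python
import sqlite3

# Local replica of the "customer" data owned by another microservice.
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE customer_replica (id INTEGER PRIMARY KEY, name TEXT)")

def on_customer_event(event):
    """Apply one published change event to the local replica.

    The upsert is idempotent, so a redelivered event leaves the
    replica in the same state as delivering it once.
    """
    if event["op"] == "delete":
        db.execute("DELETE FROM customer_replica WHERE id = ?", (event["id"],))
    else:  # create / update
        db.execute(
            "INSERT OR REPLACE INTO customer_replica (id, name) VALUES (?, ?)",
            (event["id"], event["name"]),
        )
    db.commit()

# Simulated events arriving from the message broker.
on_customer_event({"op": "create", "id": 1, "name": "Acme"})
on_customer_event({"op": "create", "id": 1, "name": "Acme"})  # redelivery is safe
```

Only the event schema couples the two services here; the subscriber never calls the publisher’s remote API.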

Advantages:

  • Simplifies implementation
    Provides a uniform method of querying data (e.g. you’re not mixing queries to the database with REST calls to other microservices).
    You can also use the database’s native method of joining data (e.g. an SQL join or map-reduce).
  • Simplifies testing
    You can test the data replication separately from the microservice functionality.
    The other microservices don’t need to be online while you are testing the microservice API.
  • More resilient
    Other microservices don’t need to be online to query their data (as long as the data is already replicated).
  • Separation of concerns
    The data integration with other services is separate from the implementation of the microservice API.
  • Internal queries are not part of remote APIs
    As data access between microservices is done through data replication, these queries don’t need to be part of your remote (e.g. REST) API.
  • Query criteria are applied by the microservice consuming the data
    You’re no longer dependent on the microservice supplying the data to provide a remote API with the query criteria you need. The consuming microservices implement their own database queries (with whatever criteria they need).
    This eliminates the need to change the microservice supplying the data when the requirements of the consuming microservices change.
    You also won’t have to track and remove remote query APIs (from the supplying microservice) that are no longer required by consuming microservices.
  • Lower latency
    You can get the data directly from the database rather than calling other microservices.
    There are normally fewer network calls at query time.
  • Avoids unnecessary I/O at query time
    REST APIs of microservices typically return complete entities when you query them. Using selective data replication, and your database’s query API, you can avoid querying more data than you need.
  • More resistant to denial-of-service attacks
    The microservice under attack doesn’t pass the load on to its dependencies.
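
For example, with customer data replicated locally, a hypothetical orders service can join its own table against the replica in one native SQL query, selecting only the columns it needs (table and column names are illustrative):

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Owned by this microservice.
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    -- Replicated from the customer microservice.
    CREATE TABLE customer_replica (id INTEGER PRIMARY KEY, name TEXT);

    INSERT INTO orders VALUES (1, 10, 99.50), (2, 10, 12.00);
    INSERT INTO customer_replica VALUES (10, 'Acme');
""")

# A native SQL join: no REST calls at query time, and only the
# columns this service actually needs are fetched.
rows = db.execute("""
    SELECT o.id, c.name, o.total
    FROM orders o
    JOIN customer_replica c ON c.id = o.customer_id
    ORDER BY o.id
""").fetchall()
```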

Disadvantages:

  • Not all database management systems are supported for replication
    Even for a given database management system, you may be limited in what versions are supported.
  • Does not work with all database-as-a-service solutions
    You may need to install database extensions, or make configuration changes, to get near real-time replication.
  • Risk of data getting out of sync
    At a minimum, you need to monitor the status of the data replication process. It’s best to schedule a recurring task to detect/fix synchronization errors. You want to detect synchronization problems before they affect your customers.
    The replication will likely be eventually consistent; for each change, there will be a short window where the data is out of sync between microservices.
  • It’s more tech to learn
    Most other options reuse technology you’re already using. Data replication introduces technology most developers aren’t familiar with.
  • It’s another thing to configure, run, test and maintain
    This is non-trivial, particularly as you need to run and test it at all levels (from development to production).
  • It requires an extra step in the deployment process
    When the database schema changes, it may be necessary to update the configuration for data replication.
  • Initial setup effort
    Between the new tech to learn and all the configuration, it takes a reasonable amount of effort to introduce data replication.
  • Overhead of duplicated data
    The more data is duplicated between services, the higher the maintenance overhead, database backup costs, and so on.
  • Increases your data security and privacy concerns
    You’ll have duplicate copies of data stored in multiple databases, and likely on your messaging server as well.
    This approach may not be suitable for the most sensitive information (e.g. credit card details).
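
As a sketch of the kind of recurring synchronization check mentioned above (the digest scheme is illustrative, not a standard): compare an order-independent digest of the source table against the replica’s, and alert or repair when they diverge:

```python
import hashlib
import sqlite3

def table_digest(conn, table):
    """Order-independent digest of a table's rows, for replica comparison."""
    digest = 0
    for row in conn.execute(f"SELECT * FROM {table}"):
        # XOR per-row hashes so row order doesn't matter.
        digest ^= int.from_bytes(
            hashlib.sha256(repr(row).encode()).digest()[:8], "big"
        )
    return digest

# Simulated source database and replica, initially in sync.
source = sqlite3.connect(":memory:")
replica = sqlite3.connect(":memory:")
for conn in (source, replica):
    conn.execute("CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT)")
    conn.execute("INSERT INTO customer VALUES (1, 'Acme')")

in_sync = table_digest(source, "customer") == table_digest(replica, "customer")

# A missed event leaves the replica stale; the digests now differ.
source.execute("UPDATE customer SET name = 'Acme Ltd' WHERE id = 1")
drifted = table_digest(source, "customer") != table_digest(replica, "customer")
```

In practice you would run a check like this on a schedule, and page or re-sync when it reports drift.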

Direct queries between microservices

Each microservice queries other microservices for data when it’s needed.

This is the instinctive choice for developers from a monolithic application background. The developer replaces local method calls with remote calls (e.g. REST).
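
A minimal sketch of this style, with hypothetical service and endpoint names: what was a local method call becomes a remote REST call. The transport is injectable here so the example runs without a live service:

```python
import json
import urllib.request

def fetch_customer(customer_id, transport=None):
    """What was a local method call is now a remote REST call.

    `transport` is injectable so the sketch can run without a
    live customer service; by default it makes a real HTTP call.
    """
    url = f"http://customer-service/customers/{customer_id}"  # hypothetical endpoint
    if transport is None:
        def transport(u):
            with urllib.request.urlopen(u) as resp:
                return resp.read().decode()
    return json.loads(transport(url))

# Stand-in for the remote customer microservice.
def fake_customer_service(url):
    customer_id = int(url.rsplit("/", 1)[1])
    return json.dumps({"id": customer_id, "name": "Acme"})

customer = fetch_customer(10, transport=fake_customer_service)
```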

Advantages:

  • Live data
    The data you get back is the current state of the supplying microservice. However, you still have to deal with eventual consistency and the lack of transaction isolation between microservices.
  • Either it works or it doesn’t
    Failures are associated with a given request; this makes them easier to identify and resolve. Unfortunately, stack traces are not passed between microservices. You have to set up distributed tracing to get easy access to the root cause (see OpenTracing).
  • There are no additional artifacts to release/deploy/run
    The integration is contained within the existing microservices.
  • Code generation support
    You can generally define your remote API specification (e.g. as an OpenAPI document) and then generate the model, server, and client code from it.

Disadvantages:

  • Microservices are tightly-coupled
    This makes them very susceptible to change imposed on them by API changes in the services they call. It’s also easy to fall into the trap of making your microservices very domain/application specific. This undermines many of the benefits of the microservice approach.
  • Remote services need bulk query APIs for performance
    If you’re joining multiple rows of data across microservices, you’ll run into the n+1 problem. The n+1 problem is where each row in the results leads to an additional query (in this case to another microservice). One solution for this is to add a bulk query API, so you can fetch the additional data for all the rows at once.
  • Handwritten joins
    You can’t leverage the database to do the joins for you; you have to write the joins and the associated tests yourself. This may lead to bugs and performance issues.
  • Requires dependencies to be online
    Each microservice needs its dependencies to be online for it to function. This can lead to a cascade, where one microservice going offline takes down many others with it.
    Even when testing locally, you’ll need the dependencies running. You may be able to use simulators instead of the actual microservices. However, simulators need development effort, and are often different enough to cause problems.
    Wherever possible you should integration test against real microservices rather than simulators.
    Due to transitive dependencies, you may have to run most of your microservices to do development/testing.
  • Querying more than you need
    It’s common to query an entire entity just to get the entity name associated with an identifier. This wastes I/O and serialization/deserialization effort.
  • Latency
    Every additional network hop (e.g. to another microservice/database) adds significant latency to the request.
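
The n+1 problem and its bulk-API fix can be illustrated with hypothetical customer lookups. The naive join makes one simulated remote call per order row, while the bulk version makes one call in total:

```python
calls = []  # records each simulated remote call

def get_customer(customer_id):
    calls.append(f"GET /customers/{customer_id}")
    return {"id": customer_id, "name": f"Customer {customer_id}"}

def get_customers_bulk(customer_ids):
    calls.append(f"GET /customers?ids={','.join(map(str, customer_ids))}")
    return {i: {"id": i, "name": f"Customer {i}"} for i in customer_ids}

orders = [{"id": n, "customer_id": n % 3} for n in range(9)]

# Naive join: one extra remote call per row (the "+n" in n+1).
calls.clear()
naive = [{**o, "customer": get_customer(o["customer_id"])} for o in orders]
naive_calls = len(calls)

# Bulk join: one remote call fetches all the distinct customers at once.
calls.clear()
by_id = get_customers_bulk(sorted({o["customer_id"] for o in orders}))
bulk = [{**o, "customer": by_id[o["customer_id"]]} for o in orders]
bulk_calls = len(calls)
```

Both versions produce the same joined rows; only the number of remote calls differs.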

Composite service layer

With this approach, you introduce composite services that aggregate data from lower-level microservices.
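
A minimal sketch, with the low-level services stubbed out as plain functions (all names hypothetical): the composite service calls each low-level microservice and merges the results into one response:

```python
# Stand-ins for two low-level microservices (hypothetical endpoints).
def order_service_get(order_id):
    return {"id": order_id, "customer_id": 10, "total": 99.5}

def customer_service_get(customer_id):
    return {"id": customer_id, "name": "Acme"}

def composite_get_order(order_id):
    """Composite endpoint: aggregates data from both low-level services."""
    order = order_service_get(order_id)
    customer = customer_service_get(order["customer_id"])
    return {**order, "customer_name": customer["name"]}

view = composite_get_order(1)
```

Note that the tight coupling hasn’t gone away; it has moved into `composite_get_order`, which must change whenever either low-level API does.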

Advantages:

  • Low-level microservices aren’t coupled to each other
    In theory, this makes them more generic, stable and reusable. In practice, they may not be that usable without the composite service layer (which is tightly-coupled).
  • Live data
    The data you get back is the current state of the supplying microservice. However, you still have to deal with eventual consistency and the lack of transaction isolation between microservices.
  • Either it works or it doesn’t
    Failures are associated with a given request; this makes them easier to identify and resolve. Unfortunately, stack traces are not passed between microservices. You have to set up distributed tracing to get easy access to the root cause (see OpenTracing).
  • Code generation support
    You can generally define your remote API specification (e.g. as an OpenAPI document) and then generate the model, server, and client code from it.

Disadvantages:

  • Not a solution if the low-level microservice needs to use the data internally
    If you need to do this, use another solution that doesn’t suffer from the same problem.
    It’s tempting to pass implementation-specific data in the request to low-level microservices. This breaks API encapsulation and shows poor separation of concerns. It pushes some of the microservice implementation onto the caller. It also makes the microservice implementation harder to change (if the change requires a request API change).
  • The composite services are tightly-coupled to the underlying microservices
    You still have as many places in the code that are tightly-coupled, you’ve just moved these to the composite services. You may have reduced the number of coupled microservices, but also made the composite services a bottleneck for change.
  • Poor cohesion
    For a given remote API endpoint, you have some code in the composite service, and more spread between at least two low-level microservices. You can end up with 3 (or more) pull requests, code reviews, releases and deployments, to implement each endpoint.
    As the composite services change, some remote API endpoints (on the low-level microservices) may no longer be required. As unused endpoints are difficult to identify, they are often kept and unnecessarily maintained.
  • Remote services need bulk query APIs for performance
    If you’re joining multiple rows of data across microservices, you’ll run into the n+1 problem. The n+1 problem is where each row in the results leads to an additional query (in this case to another microservice). One solution for this is to add a bulk query API, so you can fetch the additional data for all the rows at once.
  • Handwritten joins
    You can’t leverage the database to do the joins for you; you have to write the joins and the associated tests yourself. This may lead to bugs and performance issues.
  • Requires dependencies to be online
    Each microservice needs its dependencies to be online for it to function. This can lead to a cascade, where one microservice going offline takes down many others with it. The composite service is another layer that needs to be online.
    Even when testing locally, you’ll need the dependencies running. You may be able to use simulators instead of the actual microservices. However, simulators need development effort, and are often different enough to cause problems.
    Wherever possible you should integration test against real microservices rather than simulators.
    Due to transitive dependencies, you may have to run most of your microservices to do development/testing.
  • Querying more than you need
    It’s common to query an entire entity just to get the entity name associated with an identifier. This wastes I/O and serialization/deserialization effort.
  • Even more latency
    It’s an additional network layer each request has to go through. Every additional network hop (e.g. to another microservice/database) adds significant latency to the request.
  • You have another remote API to develop and maintain
    This approach turns what could have been an internal API (within a microservice) into a remote API (between two layers of microservices).
  • Largely duplicate specifications
    The composite service will duplicate large parts of the API of low-level microservices. The duplicated parts of these specifications may need to be kept in sync.
  • More difficult to debug requests
    Code debuggers can’t step across microservice layers.
  • Increased hosting costs
    You have another layer of microservices to host.
  • Another round of artifacts to version, release and deploy
    API changes will often involve a change to both the composite service and the low-level microservices.

This approach typically requires more effort than the others. It’s debatable whether it reduces the problem of microservice coupling or just moves it.

Joining data in the user interface

When the user interface (UI) can’t get all the data it needs from a single endpoint, the UI calls additional endpoints to get the related data.

This approach works well for reference data. For example, suppose you have an API that returns a mapping of country codes to country names, and another API that returns saved address information. The UI already uses the country API to populate a drop-down for selecting a country, but the address API returns only the country code, not the name. It’s easy for the UI to reuse the country mapping to display the name for that code.
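
That client-side join can be sketched as follows (in Python for brevity; a real UI would typically do this in JavaScript):

```python
# Response from the country reference-data endpoint (hypothetical GET /countries).
countries = {"GB": "United Kingdom", "FR": "France", "DE": "Germany"}

# Response from the address endpoint: a country code, but no name.
address = {"street": "1 High Street", "country_code": "GB"}

# The UI joins the two responses client-side for display.
display_address = {**address, "country_name": countries[address["country_code"]]}
```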

Advantages:

  • Microservices aren’t coupled
    However, the coupling has moved up to the UI layer.
  • Live data
    The data you get back is the current state of the supplying microservice. However, you still have to deal with eventual consistency and the lack of transaction isolation between microservices.
  • Code generation support
    You can generally define your remote API specification (e.g. as an OpenAPI document) and then generate the model, server, and client code from it.
  • HTTP caching
    If you use this for relatively static reference data, it caches really well.

Disadvantages:

  • Not a solution if the low-level microservice needs to use the data internally
    If you need to do this, use another solution that doesn’t suffer from the same problem.
  • Remote services need bulk query APIs for performance
    If you’re joining multiple rows of data across microservices, you’ll run into the n+1 problem. The n+1 problem is where each row in the results leads to an additional query (in this case to another microservice). One solution for this is to add a bulk query API, so you can fetch the additional data for all the rows at once.
  • Handwritten joins
    You can’t leverage the database to do the joins for you; you have to write the joins and the associated tests yourself. This may lead to bugs and performance issues.
  • Higher latency
    Calls from the front-end to the back-end have higher latency than calls between microservices.
  • Moves work to the front-end developers
    Front-end developers are often a scarcer resource than back-end developers.
  • Less control over the version of the front-end
    Particularly a problem if you have native mobile/desktop applications. If a user can put off upgrading the application, you may have to maintain backward compatibility with several versions. This can make it very difficult to change your APIs once they’re used by the front-end. It also increases the amount of testing required.
  • Richer API not available to other microservices
    You may find one of the other microservices needs this combination of data as well. At that point, you enhance the back-end to provide this additional data. If you end up adding the additional data to the back-end anyway, it makes performing the join in the front-end a waste of effort.

Views between database schemas

Instead of entirely separate databases, each microservice has a separate schema in the same database. Database views are used to expose data between schemas (i.e. microservices).

This is a good approach for migrating from a monolith to microservices. It’s also relatively easy to move from this approach to the data replication approach later.
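
A rough sketch of the idea (SQLite has no separate per-service schemas, so this approximates it within one database; names are illustrative): the owning service exposes a view containing only the stable, non-sensitive columns for other services to query:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.executescript("""
    -- Owned by the customer microservice.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, ssn TEXT);
    INSERT INTO customer VALUES (10, 'Acme', 'secret');

    -- The view other microservices are allowed to query: it exposes
    -- only the columns the owning service commits to keeping stable.
    CREATE VIEW customer_public AS SELECT id, name FROM customer;
""")

# Another microservice's schema queries the view, not the table.
rows = db.execute("SELECT id, name FROM customer_public").fetchall()
```

The consuming service never sees `ssn`, and the owning service can restructure the underlying table as long as the view still produces `id` and `name`.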

Advantages:

  • Low implementation effort
    It’s quick and easy to add database views. There are no extra services to run.
  • Data can’t get out of sync
    Everything is within the same database and the data is only stored once.
  • Protection from breaking schema changes
    Many database management systems will protect you from making schema changes that break a database view.

Disadvantages:

  • Requires support for multiple schemas and views between schemas
    Not all database management systems support these features.
  • Database schemas are tightly-coupled
    You’re reliant on getting the data from a specific schema and the structure of that schema.
  • Coordinated database changes
    For breaking changes, you’ll have to update the associated database views in the same upgrade script as you make schema changes.
  • The database is a single point of failure
    If this database goes offline, all the microservices that use it go offline too.
  • One big database makes integration testing slower
    It takes longer to rebuild/reset the database between tests.
  • Single database management system
    Some microservices are more suited to a relational database and others are more suited to a document database. With this approach, you’re limited to a single database management system (though some support more than one database model).
  • Less secure
    You can’t use network rules to limit which database a microservice can access. You’re limited to using database permissions, which are easier to get wrong.

Conclusions

Having data dependencies between microservices leads to additional configuration, process and runtime dependencies. All of these have a cost and they differ with the approaches you use.

With selective data replication, the dependency between services is smaller than with the other approaches. Generally, this makes selective data replication a better fit for the microservice architecture.

While the above approaches work, they’re all more work than a monolith with a single database. If your domain model has a high degree of coupling, you may want to consider whether it’s well suited to a microservice architecture.

Where to go from here

Watch Jimmy Bogard’s presentation Avoiding Microservice Megadisasters. It’s a cautionary tale of microservices directly querying other microservices.

Watch Ben Stopford’s presentation Building Event Driven Services with Apache Kafka and Kafka Streams. He’ll show you how data replication can fit into the wider picture of microservice architecture.

Watch Frank Lyaruu’s presentation Embracing Database Diversity with Kafka and Debezium. He discusses replicating data between databases with different database models using Debezium.
