Querying Microservices with the CQRS and Materialized View Pattern
How to speed up and scale-out inter-service queries using a dedicated materialized view database that caches queries.
The Microservices architecture mandates keeping service’s data private, promoting the use of database per service pattern.
That brings up challenges when one service tries to access the data owned by another service. There are two patterns to implement that; the API composition pattern and the CQRS based materialized view pattern.
This post digs into the details of using materialized views to scale out the inter-service query operations.
Background
Before we dive deep, let me set the stage for our discussion.
Imagine we have the following customer loyalty platform built on top of Kafka.
The TransactionPerformed events coming from POS and E-Commerce channels are written to the Transactions topic in Kafka. Those events may or may not contain the customer’s loyalty membership ID.
LoyaltyService maintains customer membership profiles in its local database, which keeps track of the membership ID and the current point balance for a given member. Then, it consumes TransactionPerformed events from the Transactions topic to determine the loyalty points earned for a given transaction. The outcome could be earning points (PointsEarned) or a redemption (PointsRedeemed) based on the transaction.
The LoyaltyService then writes either PointsEarned or PointsRedeemed events to the LoyaltyActivities topic. At the same time, it updates the relevant membership account to reflect the point balance.
The StatementService consumes events from the LoyaltyActivities topic to build an ordered history of loyalty activities in its local database. These activities are later materialized into a loyalty statement.
Updates made to the customer’s master account are captured as ProfileUpdated events and then written to the Profile topic. Those events carry change information of first and last names, email, and other related attributes.
The Membership portal UI
The company decides to introduce a customer-facing web application to show their loyalty status and past loyalty activities. As you can see below, the UI suppose to pull out certain information from the Microservices we discussed above.
Implementation choices
Now, what are the options we have for building the above UI?
Distributed queries
The easiest way is to run a distributed query to join databases of CRM system, loyalty, and statement services and expose it as an operation from LoyaltyService.
But that’ll break the encapsulation.
Microservices principles mandate us to keep a database per service to store the data owned by that service. So one service can not directly access the database of another service. That makes distributed database queries impossible.
API composition pattern
Another option is to build a new service, MembershipView, to orchestrate CRM, Loyalty, and Statement services invocations. It performs an in-memory join from the invocation result and delivers it to the UI. That follows the API Composition pattern.
Although that is simple to get started, there can be challenges when the system scales over time.
- Latency — Synchronous invocations of multiple services introduce latency. It could be a sub-seconds range to minutes, based on each service’s response time.
- Availability — Synchronous service invocations force you to deal with the availability of downstream services. Also, you’ll have to introduce additional measures like circuit breakers and exponential backoff to the code.
- Scalability and performance — When the number of records grows in each database table, querying takes more time. Apart from that, the UI access pattern forces you to run the query whenever the UI is refreshed. You’ll have to scale the MemberShipView and the downstream services proportionally to meet the increasing demand.
- Integration — Some packaged systems like CRMs may not have APIs exposed with the format that MemberShipView is interested in. So the service has to rely on integration middleware to get the data out and perform inefficient in-memory joins.
So it is clear that on-demand querying of these services to render the UI does not scale well.
What if we maintain the data set required to populate the UI local to the MembershipView service?
That way, we don’t need to make expensive API invocations every time a user loads the UI. Instead, data will be served directly from the local database.
Let’s see how we can use materialized views and CQRS pattern to that done.
Querying with CQRS and materialized views
Instead of fetching data from services in real-time, the MembershipView service can maintain a local materialized view to feed the queries that render the UI. That would be far more effective than making synchronous API calls over the network.
The challenge is to ensure that materialized view is always fresh and consistent with the source databases. That can be solved by the MembershipView service subscribing to the domain events raised by services and updating the materialized view accordingly.
Well, all that sounds relatively straightforward. Let’s get to the implementation and see what the hidden challenges are.
The implementation requires three things:
- Building the materialized view
- Event-driven update of the materialized view
- Serving queries
1. Building the materialized view
The main goal here is to keep the data required by the UI in a denormalized format to enable fast retrieval. Millions of customers might have concurrent access to the portal. Hence, query performance is crucial for a better user experience.
The result of the query can be in a denormalized form to enable fast retrieval. A materialized record for a given member will have the following format.
If we use customer id as the key, then the UI can fetch it with just one trip to the database. No joins will be required.
The materialized view is owned and maintained by the MembershipView service. So the team members are given a choice to pick up the view storage and the schema. As far as storage is concerned, any storage with a good read performance would be ideal. It could be either a document database such as MongoDB, a key-value store like Redis, or a search index like Elasticsearch.
2. Event-driven update of materialized view
The most challenging task is keeping the materialized view up to date to reflect the changes in the source services. For example, if the point balance has been updated in the Loyalty service, the local membership view must be updated.
The strategy is to use query projectors to update the view in an event-driven style. We can use two query projectors here, one to subscribe to ProfileChanged events and another to subscribe to Points* events. Just with these three events, you can build the UI that powers the membership view.
Each query projector consumes domain events from relevant Kafka topics. The Event-carried state transfer pattern may be applicable here as domain events like PointsEarned carry the state changes in their respective services.
For example, assuming the PointsEarned event has the following format, the relevant projector can determine the exact record to update in the view, along with new data.
Serving the queries
The final task is to feed the queries coming from the UI. The MembershipView service can expose a simple API method to return a membership by customer id. The query will perform a primary key lookup on the materialized view to make it efficient.
Once we are done with the above, the final architecture would like below.
Benefits
Performance is the key benefit you gain by not performing on-demand API calls to fetch data. The data required to render the UI is cached locally. The MembershipView service will use a fast primary-key-based lookup to load the cached membership view from the materialized view.
Availability of the architecture will be increased as the MembershipView service continues to function amidst the outages of upstream services. In the event of a failure of Profile, Loyalty, and Statement services, the UI will function with locally cached data.
The autonomy of MembershipView services is improved as its data model is no longer driven by the schemas of upstream schemas. It can build and maintain a rich view much closer to the business use case.
Extensibility plays a key role here. Adding a new UI/query to the application is about updating the appropriate projector and the query service. They can leverage the view storage and the update mechanism already established.
Challenges
Eventually-consistent materialized views
Due to the asynchronous, event-driven nature of the view update mechanism, the membership view data may not sync with source systems immediately. For example, it might take some time to show up loyalty points on the UI after a transaction.
Handling concurrent view updates
Multiple event projectors may try to update a record in the view at the same time. For example, consider the handling of PointsEarned and PointsRedeemed events for the same customer ID.
In a situation like that, the projector must be written to apply updates in an orderly fashion.
Handling the duplicate delivery of change events
Query projectors read change events from Kafka topics, ensuring that events are received in the order they are written. But there can be situations where the same event can be delivered to a projector more than once. You’ll never know.
So the projector must be capable of handling events idempotently.
Takeaways
Although the Materialized view approach appears clean, it comes with an additional complexity and maintenance overhead as a trade-off. It will cost you a new view database, several query projectors modules, and a query service.
Therefore, when it comes to Microservices queries, always strive to go with the API composition pattern if the requirements are simple.
Choose the materialized view if you think that the benefits would outweigh its challenges.