CQRS & Event Sourcing: Sounds cool, but is it worth it?

14 min read · May 5, 2022

With the rise of Microservice Architecture and systems becoming more and more complex, our old perceptions about storing and managing data might be insufficient to fulfill all the business requirements in an elegant fashion and to ensure fast delivery. Choosing the right patterns and having a clear view over the solution architecture from the beginning is non-negotiable, in particular when dealing with more intricate applications.

This article is about the symbiosis of CQRS, a pattern well known in the DDD community, and Event Sourcing, a way of storing data as a sequence of changes. It starts with the ‘Why’ and ‘When’ of this approach, continues with a brief description of both patterns, and ends with a simple proof of concept: an application that implements CQRS and Event Sourcing.

Spoiler alert: CQRS and Event Sourcing are not a panacea. Do not implement them unless the nature of the business requires it.

“Sometimes, using CQRS/ES is like learning Java — It’s interesting and fun, but you don’t have to.” — Confucius, .NET Developer.

Why and When to use CQRS and Event Sourcing?

Generally speaking, CQRS & Event Sourcing is a combination of best practices and well known patterns. This architectural approach is deeply rooted in a number of DDD patterns that have proven efficient and valuable over time. Developers and architects started to consider CQRS/ES because it promises fast, continuous feature delivery and a higher degree of scalability by keeping components loosely coupled and operations as atomic as possible.

CQRS and Event Sourcing each have their own use cases and can exist independently. They work well together when the use case lies in the intersection of the two.

The following paragraph is probably the most important one in this article, so I’ll highlight it. Please take notes:

It is worth considering CQRS and Event Sourcing when some data is, by its nature, better represented as a chronological sequence of events (changes) than as its final state, while the final state still matters and you’d like to offer reads over it.

E.g.: Your bank account balance. What do you think: should your balance be a simple numeric value representing your current amount of money, or would it be nicer to keep all the incomes and expenses that have occurred up to the present moment and calculate the balance on the fly? The second option definitely fits this scenario better. Well, that’s Event Sourcing.

Now imagine that your account contains millions of transactions. Should the server go through the whole transaction history and recalculate your current balance just to show it on the home screen of your mobile app? Not necessarily: the current balance can be cached for reading purposes. Well, that’s CQRS.
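The bank account example can be sketched in a few lines of Python. This is only an illustration under assumptions of my own: the names `Transaction`, `Account`, `record`, `replay_balance` and `cached_balance` are hypothetical and do not come from any real banking system.

```python
from dataclasses import dataclass, field
from decimal import Decimal
from typing import List


@dataclass(frozen=True)
class Transaction:
    """One immutable entry in the account history (hypothetical shape)."""
    amount: Decimal  # positive = income, negative = expense


@dataclass
class Account:
    history: List[Transaction] = field(default_factory=list)
    cached_balance: Decimal = Decimal("0")  # read-side copy, refreshed on every write

    def record(self, amount: str) -> None:
        """Write side: append to the history, then refresh the cached value."""
        tx = Transaction(Decimal(amount))
        self.history.append(tx)
        self.cached_balance += tx.amount

    def replay_balance(self) -> Decimal:
        """Event Sourcing view: derive the balance from the full history."""
        return sum((tx.amount for tx in self.history), Decimal("0"))


account = Account()
account.record("100.00")
account.record("-30.50")
account.record("12.25")

# Both views agree; the cached one does not have to walk the history.
print(account.replay_balance(), account.cached_balance)
```

The history stays the source of truth, while the cached value is what a home screen would read.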

Note: Event sourcing is about storing data and is not directly related to Event driven architecture.

“Event sourcing and Event driven is like moon and honeymoon.” — Aristotle.

CQRS in a nutshell

CQRS stands for Command Query Responsibility Segregation. Are you still with me? Even if it sounds like it was taken straight from Elon’s Falcon 9 design sheets, in essence it’s just separating the reads from the writes. By ‘reads’ we mean operations that retrieve data (get, get by id), whilst writes are those that change state, such as create, update or delete.

The CQRS pattern builds on CQS (Command Query Separation), where each operation (usually referring to CRUD) is defined as either a command or a query:

It is a command when it alters the state. For instance, create, update and delete operations are considered to be commands because they all change the state of the system. Ideally, a command should not return anything. If you ask me, I’m OK with the fact that a command sometimes can return some data, and I’ve got a strong and unbeatable justification for that: Even Microsoft violated this rule when they wrote the Stack.Pop() method.

It is a query when (surprise, surprise) it does not alter the state and has a return value. Usually, read operations take the form of a query. Note that when a read is accompanied by some writes, the operation should rather be treated as a command. (E.g.: getting some data from an external API and persisting it in our system before returning it to the client is not a query, because it alters the state.)

Commands can call other commands and queries, whilst queries can call only other queries. If a query calls a command, then it’s not a query anymore because it mutates the state.

OK, but why separate them?

The code becomes simpler to understand and to work with when this separation is enforced. It also lets us be confident that a query is free of side effects, so it can be called millions of times without breaking or modifying anything.
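The command/query distinction can be illustrated with a tiny in-memory class. This is a sketch under my own assumptions: `OrderBook` and its methods are hypothetical names, and `pop_order` is included only to mirror the Stack.Pop() remark above.

```python
from typing import Dict, Optional


class OrderBook:
    """Hypothetical in-memory store, used only to illustrate CQS."""

    def __init__(self) -> None:
        self._orders: Dict[int, str] = {}

    # Command: changes state, returns nothing.
    def create_order(self, order_id: int, description: str) -> None:
        self._orders[order_id] = description

    # Query: returns data, changes nothing.
    def get_order(self, order_id: int) -> Optional[str]:
        return self._orders.get(order_id)

    # Returns a value AND mutates, much like Stack.Pop().
    # Because it alters state, it counts as a command.
    def pop_order(self, order_id: int) -> Optional[str]:
        return self._orders.pop(order_id, None)


book = OrderBook()
book.create_order(1, "Two boxes of books")
first = book.get_order(1)
second = book.get_order(1)   # same answer, state untouched
popped = book.pop_order(1)   # returns the value and removes it
```

Calling `get_order` any number of times leaves the store as it was, which is exactly what makes queries safe to repeat.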

The idea with CQRS is to separate the paths for operations that change the system from those that simply request data and to have them separated into different conceptual models for update and display, instead of having a single compromised model that handles both of them (and none of them properly). If separated, reads and writes can be scaled independently, this being the main advantage of CQRS.

Starting from the idea that queries perform best when the data is stored in a rich, read-friendly format, for example in a NoSQL database with great support for queries, having two separate databases for reads and writes starts to sound tempting, doesn’t it? If a system uses a relational database with complex dependencies and relational schemas, where each get request results in a tremendous SQL query with a bunch of joins, then it might be worth adding a second, non-relational database that holds the data in a digestible form, and keeping the two in sync. This approach adds some complexity because of the database synchronization mechanism, so the pros and cons have to be weighed before deciding whether to implement this separation.

CQRS can be applied on two levels:

Intra-service CQRS: When this separation is enforced within the service. In this case, we might end up with separate services for handling queries and commands, or with separate command and query models, each handled by its own handler.

E.g: We have a traditional order management application. When someone places an order, the request goes through Controller — OrderService — OrderRepository. The same path is triggered when someone issues a get request in order to view all the registered orders. Commands and queries are coupled to a single conceptual model — OrderService.

Figure 1 — Traditional separation of concerns.

Following CQRS principles, the easiest way to decouple the reads from the writes would be to take that shiny axe and cut the OrderService into an OrderCommandService (responsible for creating, updating and removing orders) and an OrderQueryService (responsible only for retrieving information about existing records).

Figure 2 — Separating requests into commands and queries

A more elegant way to solve this problem would be to have separate request models for commands and queries, each with its own handler, and to apply the mediator pattern to route each request to its handler. We end up with more granular commands and queries like CreateOrderCommand, UpdateOrderCommand, GetOrderByIdQuery, GetOrdersQuery etc. The request itself serves both to show the intention (create, update, delete or get something) and to encapsulate the data that has to be delivered to the handler. For instance, CreateOrderCommand encapsulates the data needed to create an order (author, description, address), and the CreateOrderCommandHandler does the actual heavy lifting: it creates the order model from the data in the command and calls the repository to persist it.
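The request/handler/mediator arrangement can be sketched as follows. This is a hand-rolled illustration, not the code of any particular mediator library: the `Mediator` class, the handler functions and the list standing in for the repository are all assumptions; only the request names and the author/description/address fields come from the text above.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Type


@dataclass(frozen=True)
class CreateOrderCommand:
    author: str
    description: str
    address: str


@dataclass(frozen=True)
class GetOrdersQuery:
    pass


class Mediator:
    """Routes each request to the single handler registered for its type."""

    def __init__(self) -> None:
        self._handlers: Dict[Type, Callable[[Any], Any]] = {}

    def register(self, request_type: Type, handler: Callable[[Any], Any]) -> None:
        self._handlers[request_type] = handler

    def send(self, request: Any) -> Any:
        return self._handlers[type(request)](request)


# A plain list stands in for the repository (an assumption for this sketch).
orders: List[dict] = []


def handle_create_order(command: CreateOrderCommand) -> None:
    # Build the order model from the command data and persist it.
    orders.append({
        "id": len(orders) + 1,
        "author": command.author,
        "description": command.description,
        "address": command.address,
    })


def handle_get_orders(query: GetOrdersQuery) -> List[dict]:
    # Return copies so callers cannot mutate stored state through a query.
    return [dict(order) for order in orders]


mediator = Mediator()
mediator.register(CreateOrderCommand, handle_create_order)
mediator.register(GetOrdersQuery, handle_get_orders)

mediator.send(CreateOrderCommand("Ana", "Two boxes of books", "Chisinau"))
print(mediator.send(GetOrdersQuery()))
```

The controller only ever talks to `mediator.send`, so adding a new command or query means adding a request type and a handler, without touching existing ones.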

Figure 3 — Using the mediator pattern to separate requests into command & queries and correlating them with their handlers

Most probably, CQRS won’t bring that much value to an order management application. It’s very unlikely for such an application to be that overloaded, so there is no need to make heavy weather of it. If we replace this example with Google’s YouTube platform, where most of the users are consumers, scaling the reads becomes crucial. By the way, Google indeed uses CQRS and Event Sourcing for some of its products.

“If Google does it, then it’s legit.” — Socrates

Inter-service CQRS: When read and write operations are directed through different microservices. E.g.: commands are handled by a command service, which mutates its internal state, and queries are handled by a separate query service, which has its own database (usually a NoSQL one) with denormalized read models. When a change occurs, the data in the query service has to be synchronized with the data in the command service. With this separation in place, whenever there is a spike in reads we can increase the number of query service instances.

Event Sourcing in a nutshell

If you are not familiar with the Event Sourcing pattern (ES) at the moment of reading this, here is a disclaimer for you: at the beginning it will sound a bit strange and unusual, and your reaction will most probably be something like “Hold up, what?” But bear with me: things will become clear and we’ll recognize the true value of ES as we progress through this section. Just accept the fact that Event Sourcing is nothing more than an alternative to the traditional state-based approach. Just another way of storing data.

In a stateful system, we always store the last state of something. E.g.: If an order record is created, it is inserted into the database. If it gets updated afterwards, we find the record and update it. If we update it a second time, it goes through the same process. If we delete it, it’s completely wiped out from our data store. Note that we always deal with the last state of the order. Once it is updated, the original state is lost for good (unless we have some kind of audit mechanism in place).

Almost always, a stateful system is more than enough. For some use cases, however, it simply cannot fulfill all the requirements by itself, especially when the history matters or when an entity is better represented as a sequence of events than as its final state. This problem can be solved either with a stateful system accompanied by an auditing mechanism, or with the event sourcing technique.

These are some advantages of ES:

Data does not get lost,
Flexibility over designing read models,
High degree of observability.

And these are some disadvantages:

Data retrieval is slower, but this can be attenuated with snapshots,
The read store is only eventually consistent with the event store, so strong (immediate) consistency cannot be guaranteed,
The code becomes more complex by default due to the synchronization mechanism between event and read store.

The fundamental idea behind ES is to ensure that every change to the state of an application is captured in an event object. This offers history tracking out of the box and makes it possible to reconstruct the state of the system at any point in its lifetime. The concept of state becomes transient: you can replay the events and get the object’s representation at any given time, and the stream of events is the single source of truth for an entity’s state. It is similar to how the Git version control system works, where each commit is an event and the chronology of commits defines the project’s final state.

Do I always have to go through all the events of an entity in order to get its current state?

Well, yes and no.

Yes because the history of changes is the most reliable source of truth and this way you’ll get the most accurate current state of the entity.

No because an additional data store can be added to hold the entities’ final state, something that acts like a cache and is often referred to as the Read Store (do you smell CQRS in the air?). As the name states, the read store should be used exclusively for reading data. Consistency between the event store and the read store has to be ensured. Usually this is done by notifying the read store about the latest events that occurred within the system.

It is absolutely OK to apply event sourcing only to a specific bounded context of an application. In other words, we can have both state-based and event sourced data in our system, but with strongly defined boundaries between them.
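Keeping a read store in sync by notification can be sketched like this. Everything here is an assumption made for illustration: the `EventStore` and `OrderReadStore` classes, the dictionary shape of an event, and the synchronous in-process callback. In a real system the notification would typically travel over a message bus or a service call and arrive with some delay.

```python
from typing import Callable, Dict, List

Event = dict  # hypothetical shape: {"type": ..., "order_id": ..., "data": {...}}


class EventStore:
    """Append-only store that notifies subscribers about each new event."""

    def __init__(self) -> None:
        self._events: List[Event] = []
        self._subscribers: List[Callable[[Event], None]] = []

    def subscribe(self, callback: Callable[[Event], None]) -> None:
        self._subscribers.append(callback)

    def append(self, event: Event) -> None:
        self._events.append(event)
        for notify in self._subscribers:
            notify(event)


class OrderReadStore:
    """Holds the latest state per order; used only for reads."""

    def __init__(self) -> None:
        self.orders: Dict[str, dict] = {}

    def on_event(self, event: Event) -> None:
        order_id = event["order_id"]
        if event["type"] == "OrderCreated":
            self.orders[order_id] = dict(event["data"])
        elif event["type"] == "OrderUpdated":
            self.orders[order_id].update(event["data"])
        elif event["type"] == "OrderRemoved":
            self.orders.pop(order_id, None)


event_store = EventStore()
read_store = OrderReadStore()
event_store.subscribe(read_store.on_event)

event_store.append({"type": "OrderCreated", "order_id": "o-1",
                    "data": {"description": "Books", "weight_kg": 4}})
event_store.append({"type": "OrderUpdated", "order_id": "o-1",
                    "data": {"weight_kg": 5}})

print(read_store.orders["o-1"])
```

Reads now hit `read_store.orders` directly and never replay the stream.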

Event sourcing is made out of events, event streams, snapshots, projections and aggregates.

Events

An event is something that happened in the past and represents a small and granular transition from state A to state B. There are some general truths about events:

Events should be named in the past tense (OrderCreated, OrderRemoved),
Events are immutable (once created, an event cannot be deleted or modified),
Events are broadcast (they do not have a specified listener or handler).

In most of the cases, an event has the following structure:

Id: events have to be uniquely identified,
Type or Name: identifies the event’s intent, like OrderCreated,
Timestamp: the date of creation,
Stream Id: the sequence to where the event belongs,
Stream position: the event’s position within the sequence, also referred to as event version,
Data: the data carried by the event, most often formatted as JSON or XML,
Metadata: optional, but might come handy in more complex systems. It stores information that is not directly related to the event, such as correlation ID or other data used for logging or tracking.
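The fields listed above map naturally onto a small record type. The sketch below is one possible shape, not a standard: the field names, the UUID identifier and the ISO 8601 timestamp are my assumptions.

```python
import json
import uuid
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Any, Dict


@dataclass(frozen=True)  # frozen: fields cannot be reassigned after creation
class Event:
    type: str                      # intent, named in the past tense
    stream_id: str                 # the sequence the event belongs to
    stream_position: int           # position within the stream (event version)
    data: Dict[str, Any]           # payload carried by the event
    metadata: Dict[str, Any] = field(default_factory=dict)  # e.g. correlation ID
    id: str = field(default_factory=lambda: str(uuid.uuid4()))
    timestamp: str = field(
        default_factory=lambda: datetime.now(timezone.utc).isoformat()
    )


event = Event(
    type="OrderCreated",
    stream_id="order-42",
    stream_position=1,
    data={"description": "Two boxes of books"},
    metadata={"correlation_id": "req-7"},
)

# Serialized as JSON, the way it might be written to an event store.
print(json.dumps(asdict(event), indent=2))
```

Note that `frozen=True` only prevents reassigning fields; the `data` dictionary itself could still be mutated, so a stricter implementation would copy or freeze it as well.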

Streams

A stream is the sequence of events belonging to one domain entity; it starts with the very first event and ends with the most recent one. The order of events within an event stream is extremely important, and each event has to carry its own position. The stream position attribute helps both to define the order of events and to detect concurrency issues. If two people update an order at the same time, both events will carry the same stream position. We have control over how to resolve this: accept both events (but increase the position of the second one), accept only the first one, or hook in some conflict-resolving mechanism to handle it.
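The “accept only the first one” strategy is often implemented as an expected-position check on append. The following is a minimal in-memory sketch; `InMemoryStreams`, `ConcurrencyError` and the `expected_position` parameter are hypothetical names chosen for this example.

```python
from typing import Dict, List


class ConcurrencyError(Exception):
    """Raised when a writer's view of the stream is out of date."""


class InMemoryStreams:
    def __init__(self) -> None:
        self._streams: Dict[str, List[dict]] = {}

    def append(self, stream_id: str, event_type: str, data: dict,
               expected_position: int) -> int:
        """Append only if the stream is still at the position the writer saw."""
        stream = self._streams.setdefault(stream_id, [])
        current_position = len(stream)
        if expected_position != current_position:
            raise ConcurrencyError(
                f"expected position {expected_position}, "
                f"but stream is at {current_position}"
            )
        position = current_position + 1
        stream.append({"type": event_type, "data": data,
                       "stream_position": position})
        return position


streams = InMemoryStreams()
streams.append("order-42", "OrderPlaced", {"weight_kg": 4}, expected_position=0)

# Two users both loaded the order when the stream was at position 1.
streams.append("order-42", "OrderUpdated", {"weight_kg": 5}, expected_position=1)
try:
    streams.append("order-42", "OrderUpdated", {"weight_kg": 9}, expected_position=1)
    second_writer_rejected = False
except ConcurrencyError:
    second_writer_rejected = True  # the late writer must reload and retry
```

The other two strategies from the paragraph above would replace the `raise` with either a position bump or a call into a conflict resolver.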

A stream always has an identifier.

Snapshots

A snapshot is the state of an entity at a specific point in time. Snapshots are a must-have when dealing with streams that can have thousands or even millions of events. A snapshot captures the state up to the nth event. The next time you need a projection of the stream, you can start from that snapshot and apply only the events that come after the nth one, up to the most recent. It is an optimization measure. It is considered good practice to store snapshots in a separate collection, without altering the original stream of events.
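Here is a small sketch of loading state from a snapshot plus the events that follow it. It reuses the bank balance idea from earlier; the `Snapshot` type, the `Deposited`/`Withdrawn` event names and the helper functions are assumptions made for this example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Snapshot:
    position: int  # number of events already folded into `balance`
    balance: int


def apply_event(balance: int, event: dict) -> int:
    if event["type"] == "Deposited":
        return balance + event["amount"]
    if event["type"] == "Withdrawn":
        return balance - event["amount"]
    return balance  # unknown event types leave the state unchanged


def load_balance(events: List[dict], snapshot: Optional[Snapshot] = None) -> int:
    """Start from the snapshot (if any) and apply only the later events."""
    start = snapshot.position if snapshot else 0
    balance = snapshot.balance if snapshot else 0
    for event in events[start:]:
        balance = apply_event(balance, event)
    return balance


events = [
    {"type": "Deposited", "amount": 100},
    {"type": "Withdrawn", "amount": 30},
    {"type": "Deposited", "amount": 5},
]

# Snapshot taken after the first two events; stored apart from the stream.
snapshot = Snapshot(position=2, balance=load_balance(events[:2]))

events.append({"type": "Withdrawn", "amount": 25})

from_snapshot = load_balance(events, snapshot)  # applies only 2 events
from_scratch = load_balance(events)             # applies all 4 events
```

Both paths yield the same balance; the snapshot only shortens the replay.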

Projections

An event stream can be projected to several read models. A projection is created by iterating through the stream and calculating the object’s state by applying the events one after another, in order of appearance. Projections are needed to generate read models for the read store, or to get the last state of a domain object before applying the next event, in order to ensure consistency. E.g.: If the last event is OrderDeleted, the next one cannot be OrderUpdated, because an order that has been deleted cannot be subject to an update operation.

Figure 7 — Project OrderDetails and OrderSimplified read models out of an event stream
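Projecting one stream into two read models could look like the sketch below. The read model names follow the figure caption, but their fields, the event payloads and the `status` values are assumptions for illustration only.

```python
from typing import List, Optional

stream = [
    {"type": "OrderCreated", "data": {"id": "o-1", "description": "Books",
                                      "weight_kg": 4, "address": "Chisinau"}},
    {"type": "OrderUpdated", "data": {"weight_kg": 5}},
    {"type": "OrderDelivered", "data": {}},
]


def project_order_details(events: List[dict]) -> Optional[dict]:
    """Full read model: every field plus the current status."""
    state: Optional[dict] = None
    for event in events:
        if event["type"] == "OrderCreated":
            state = {**event["data"], "status": "created"}
        elif state is None:
            continue  # nothing to change before creation or after deletion
        elif event["type"] == "OrderUpdated":
            state.update(event["data"])
        elif event["type"] == "OrderDelivered":
            state["status"] = "delivered"
        elif event["type"] == "OrderDeleted":
            state = None
    return state


def project_order_simplified(events: List[dict]) -> Optional[dict]:
    """Lightweight read model for list views: id and status only."""
    summary: Optional[dict] = None
    for event in events:
        if event["type"] == "OrderCreated":
            summary = {"id": event["data"]["id"], "status": "created"}
        elif summary is not None and event["type"] == "OrderDelivered":
            summary["status"] = "delivered"
        elif event["type"] == "OrderDeleted":
            summary = None
    return summary


details = project_order_details(stream)
simplified = project_order_simplified(stream)
as_of_creation = project_order_details(stream[:1])  # state after the first event
```

Passing a prefix of the stream, as in the last line, gives the state as it was at that point in the history.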

Aggregates

An aggregate reconstitutes the current state from a single event stream. It contains the current state and the history of changes, and it knows how to apply events to the current state. In other words, an aggregate is an instrument that generates projections by folding the events together, and it offers an interface for working with the final state.
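A compact sketch of an aggregate for orders follows. `OrderAggregate`, its method names and its status values are hypothetical; the rule that a deleted order cannot be updated is the one described under Projections.

```python
from typing import Iterable, List


class OrderAggregate:
    """Hypothetical aggregate for a single order stream."""

    def __init__(self) -> None:
        self.state: dict = {}
        self.status = "new"            # "new" -> "active" -> "deleted"
        self.version = 0               # number of events applied so far
        self.changes: List[dict] = []  # new events not yet saved to the stream

    @classmethod
    def from_history(cls, events: Iterable[dict]) -> "OrderAggregate":
        """Rebuild the current state by replaying the stream."""
        aggregate = cls()
        for event in events:
            aggregate._apply(event)
        return aggregate

    def update(self, **fields) -> None:
        self._require_active()
        self._record({"type": "OrderUpdated", "data": fields})

    def delete(self) -> None:
        self._require_active()
        self._record({"type": "OrderDeleted", "data": {}})

    def _require_active(self) -> None:
        if self.status != "active":
            raise ValueError(f"order is {self.status}, operation not allowed")

    def _record(self, event: dict) -> None:
        self._apply(event)
        self.changes.append(event)

    def _apply(self, event: dict) -> None:
        if event["type"] == "OrderCreated":
            self.state, self.status = dict(event["data"]), "active"
        elif event["type"] == "OrderUpdated":
            self.state.update(event["data"])
        elif event["type"] == "OrderDeleted":
            self.status = "deleted"
        self.version += 1


history = [
    {"type": "OrderCreated", "data": {"description": "Books", "weight_kg": 4}},
    {"type": "OrderUpdated", "data": {"weight_kg": 5}},
]
order = OrderAggregate.from_history(history)
order.update(description="Books and magazines")
order.delete()

try:
    order.update(weight_kg=6)  # rejected: the order was deleted
    rejected = False
except ValueError:
    rejected = True
```

After the business methods run, `order.changes` holds exactly the new events that still need to be appended to the stream.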

Proof of concept

Now that we are a bit more familiar with CQRS and Event Sourcing, let’s put them into practice and get some hands-on experience with this architectural approach. As a proof of concept I built a simple application for managing shipping orders called Logistify (very original). The following are the business requirements of Logistify:

Logistify allows both companies and individuals to create shipping inquiries for their goods that have to be moved from point A to point B which, in consequence, can be picked up by other companies or individuals that are available to offer this service.
Any user can either place or deliver a shipping order.
This platform is going to be used world-wide, therefore the system should be always up and available to display the nearest shipping orders.
Initially, Logistify will operate only in Moldova, but the system has to be designed in a way to facilitate further expansion to other countries.
A shipping order’s lifetime is composed out of several possible events:
Placing the order: when a user creates the shipping inquiry itself, along with other relevant details about the order such as description, weight, volume etc.
Updating order details: when the details of an existing shipping order get updated by the author.
Canceling the order: when a user decides to cancel their order for any reason. This event occurs when the order was placed but hasn’t been picked up yet.
Marking the order as delivered: when the company or individual finishes the delivery and the order has arrived at its final destination.

Judging by the business requirements, there will be far more reads than writes, so CQRS might be a good choice here. Since the application is intended to go world-wide, the solution has to be highly scalable. Also, because a shipping order is better represented by its history than by its final state, it is worth implementing the Event Sourcing pattern. The application will have a REST API gateway talking to instances of the order query service and the order command service. gRPC will be used for inter-service communication.

Ideally, this is what the system would look like:

Figure 8 — The ideal architecture for Logistify

For the sake of simplicity and to remove the service bus part, the command service notifies query service via calling a gRPC endpoint directly when an event occurs. Also, goodbye UserService.

Figure 9 — The over-simplified architecture for the proof of concept

The code for Logistify can be found on GitHub.

End note

CQRS and Event Sourcing are indeed amazing patterns, especially when combined. For some types of businesses CQRS/ES applies well, whilst for others it simply doesn’t, and we have to accept and come to terms with this. Make sure to do your own research before implementing these patterns in a real-world application. Like any other technology, they have their own advantages and disadvantages.

If this article has brought any value to you, you know what to do ❤

P.S: Sorry for my handwriting, it’s the first time I use a drawing board.