Doing CQRS + Event Sourcing without building a spaceship
In most projects built on web frameworks, where model state is managed through the framework’s own ORM and CRUD operations, managing the state of complex models (usually aggregate roots) can get difficult, especially if different people or apps update the model at the same time. One way to tackle this complexity is to separate how you persist these models from how you retrieve them from the datastore, as the CQRS pattern does.
A big misconception about CQRS is that people immediately associate it with the way the Java and C# communities skillfully combined it with DDD into a more complex solution. That solution is great, by the way, but it’s not the only way to do it.
As Martin Fowler wrote in his post about CQRS: “At its heart is the notion that you can use a different model to update information than the model you use to read information.”
The same goes for Event Sourcing. As Microsoft’s CQRS and ES Deep Dive puts it: “If you capture changes in your write model as events, you can save all of your changes simply by appending those events to your database or data store on the write side using only Insert operations.”
So the DDD Java/C# solution is great when working with a reactive microservice architecture where events are fired from all over the place and you need Sagas and so on. Calculating state in this scenario can become a nightmare indeed. But this is not the case I want to discuss. The objective of this post is to tame a difficult aggregate root in a single app using CQRS + Event Sourcing.
CQRS + ES is very appealing. What can we learn from it?
- Read and write separation: OK, no way around it. How you read and how you write can differ enough to pull the two apart.
- Having a stream of events where events are logged and can be used to calculate the current state of the model: also OK, it’s a nice concept.
- Using CRON jobs, replaying events all the time to calculate the current state: NOT going to happen.
- Having to convert events if you need to introduce changes to ensure backward compatibility: NOT going to happen.
The ideal solution, in my opinion, should not require supporting systems and should be friendly to modification. It should have a gentle learning curve and should not be hard to implement (weeks, not months).
Am I asking too much? Maybe, but what’s the problem in trying to solve it?
JS + Flux + Redux have taught us all some good stuff about state management. I’m pretty sure we can simplify event sourcing quite a bit and bring it down from the over-engineering end of the spectrum.
Why Binocular as project name?
Binocular vision happens when two separate images from two eyes are successfully combined into one image in the brain. CQRS also has two eyes: the read eye and the write eye.
What is the problem I’m trying to solve?
It’s always better to have a real example for context. I work at Weengs, a shipping company, and in our domain we have an aggregate root called `Shipment`. Logically speaking, a shipment gets passed around between many apps (ours and third-party). It also gets passed around physically: from the seller of a product to our drivers, then to our warehouse, then to a carrier, and finally to the buyer/recipient.
We’re talking about same-day or next-day delivery here, so it’s heavy stuff. To make it possible to work with shipments we do a lot of logging and monitoring. It’s time-consuming, and a sign that we struggle to manage the state of shipments.
Apart from the obvious information about the Shipment itself, there is also a lot of what you could call metadata from the operations department and third-party services. Shipments hold loads of information, and it changes all the time.
How I want to solve it
- I want to write in an immutable, append-only fashion.
- I also want to read the current state fast.
- If I bump into any issue I need to fix it fast.
- I need all events logged.
- I want to save some metadata with the events so I can debug with some luxury.
Some important guidelines to follow:
- Calculating state should be done ONCE, perhaps before events get appended/saved?
- There has to be a safety net so outdated events don’t get appended causing inconsistent state.
- It HAS to be change friendly. If events need to change it should be no problem and we should know from which point the change happened.
- And if you ever need to replay all events to recalculate the current state it should be easy to do, even with changes in the data structure.
- Serialising/deserialising should be a breeze and done by built-in functions.
The “inconsistent state” problem
Before each write we ask for the version of the last event, then save the new event with that version incremented. The same version should never exist twice for the same entity.
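To make the safety net concrete, here is a minimal Python sketch of that version check. It is an illustration only, not how the Binocular library does it: `InMemoryEventLog`, `StaleVersionError` and the `expected_version` parameter are names I made up for this example.

```python
class StaleVersionError(Exception):
    """Raised when a write is based on an outdated view of the event stream."""


class InMemoryEventLog:
    """Hypothetical append-only log keyed by entity ID (not Binocular's API)."""

    def __init__(self):
        self._streams = {}  # entity_id -> list of (version, payload) tuples

    def last_version(self, entity_id):
        stream = self._streams.get(entity_id, [])
        return stream[-1][0] if stream else 0

    def append(self, entity_id, expected_version, payload):
        current = self.last_version(entity_id)
        if current != expected_version:
            # Someone else appended since the caller last looked.
            raise StaleVersionError(
                f"{entity_id}: expected version {expected_version}, found {current}"
            )
        new_version = current + 1
        self._streams.setdefault(entity_id, []).append((new_version, payload))
        return new_version


log = InMemoryEventLog()
seen = log.last_version("shipment-1")  # 0: nothing written yet
log.append("shipment-1", seen, {"status": "collected"})  # stored as version 1
try:
    # Still based on version 0, so this write is rejected.
    log.append("shipment-1", seen, {"status": "cancelled"})
except StaleVersionError as error:
    print(error)
```

In a real database, a unique index on the entity ID and version columns would probably be the simplest way to enforce the same rule when two writers race.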
The “replaying of events to create snapshots” problem
We should go a bit reducer here: previous state + action returns a new state. If the last event holds the current state it’s a way of caching, isn’t it? Reading will be easy and fast.
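A rough Python sketch of the idea, assuming events are plain dicts and using invented action names (`collected`, `delivered`). The state is reduced once at write time and stored with the event, so reading is just a lookup of the last event.

```python
def shipment_reducer(previous_state, action):
    """Return a new state dict; the previous state is never mutated."""
    if action["name"] == "collected":
        return {**previous_state, "status": "collected", "driver": action["data"]["driver"]}
    if action["name"] == "delivered":
        return {**previous_state, "status": "delivered"}
    return dict(previous_state)  # unknown actions leave the state as it was


def append_event(events, action):
    """Reduce once at write time and store the result alongside the action."""
    previous_state = events[-1]["state"] if events else {}
    events.append({"action": action, "state": shipment_reducer(previous_state, action)})


def current_state(events):
    """Reading is a lookup of the last event, with no replay involved."""
    return events[-1]["state"] if events else {}


events = []
append_event(events, {"name": "collected", "data": {"driver": "Ana"}})
append_event(events, {"name": "delivered", "data": {}})
print(current_state(events))
```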
The “upgrading events” problem
Let’s go real immutable about it. If the reducers are versioned and never change, we don’t need to deal with changes at read time.
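Here is one way versioned reducers could look, sketched in Python. The `weight_set` action and its payload shapes are hypothetical. The point is that a payload change gets a new reducer version while the old one stays frozen, so old events still replay without any conversion.

```python
from functools import reduce


def set_weight_v1(state, data):
    # Version 1 payloads carried the weight in kilograms.
    return {**state, "weight_kg": data["weight_kg"]}


def set_weight_v2(state, data):
    # Version 2 payloads carry grams; v1 stays untouched for old events.
    return {**state, "weight_kg": data["weight_g"] / 1000}


# Reducers are looked up by (action name, action version) and never edited.
REDUCERS = {
    ("weight_set", 1): set_weight_v1,
    ("weight_set", 2): set_weight_v2,
}


def apply_action(state, action):
    reducer = REDUCERS[(action["name"], action["version"])]
    return reducer(state, action["data"])


def replay(actions):
    """Recalculate the state from scratch using only actions and reducers."""
    return reduce(apply_action, actions, {})


history = [
    {"name": "weight_set", "version": 1, "data": {"weight_kg": 2}},
    {"name": "weight_set", "version": 2, "data": {"weight_g": 2500}},
]
print(replay(history))
```

Because each action records its version, the history itself shows the point at which the change happened.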
- Event: has the current state, the ID of the entity and the action being applied. Saving the current state attracts some criticism: some people argue events should be pure and only represent an action taken. Just take into consideration that the current state will be used as the previous state for the next event, and that in the end it’s always possible to recalculate the current state from actions and reducers alone. So saving the current state is just for convenience and ease of use.
- Action: has a name, version and the data necessary for the reducers to do the change.
- Store: persists and retrieves the events. Internally it uses the events to match reducers with actions.
- Reducers: basically an array of versioned callables given to the stores.
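To show how these four pieces might fit together, here is a small Python sketch. It is not a port of the PHP library: the class and method names (`Store.save`, `Store.last_event`, `Store.replay`) are my own assumptions, and a list of JSON strings stands in for the database table. Serialising relies only on built-ins (`dataclasses.asdict` and `json`).

```python
import json
from dataclasses import asdict, dataclass
from typing import Any, Callable, Dict, List, Optional, Tuple

State = Dict[str, Any]
Reducer = Callable[[State, Dict[str, Any]], State]


@dataclass(frozen=True)
class Action:
    name: str
    version: int
    data: Dict[str, Any]


@dataclass(frozen=True)
class Event:
    entity_id: str
    version: int
    action: Action
    state: State  # snapshot of the state after the action was applied


class Store:
    def __init__(self, reducers: Dict[Tuple[str, int], Reducer]):
        self._reducers = reducers
        self._rows: List[str] = []  # JSON strings, standing in for a table

    def _events(self, entity_id: str) -> List[Event]:
        rows = [json.loads(row) for row in self._rows]
        return [
            Event(r["entity_id"], r["version"], Action(**r["action"]), r["state"])
            for r in rows
            if r["entity_id"] == entity_id
        ]

    def last_event(self, entity_id: str) -> Optional[Event]:
        events = self._events(entity_id)
        return events[-1] if events else None

    def save(self, entity_id: str, action: Action) -> Event:
        last = self.last_event(entity_id)
        previous_state = last.state if last else {}
        reducer = self._reducers[(action.name, action.version)]
        event = Event(
            entity_id,
            (last.version if last else 0) + 1,
            action,
            reducer(previous_state, action.data),
        )
        self._rows.append(json.dumps(asdict(event)))
        return event

    def replay(self, entity_id: str) -> State:
        """Ignore the stored snapshots and rebuild the state from actions."""
        state: State = {}
        for event in self._events(entity_id):
            reducer = self._reducers[(event.action.name, event.action.version)]
            state = reducer(state, event.action.data)
        return state


def set_status(state: State, data: Dict[str, Any]) -> State:
    return {**state, "status": data["status"]}


store = Store({("status_changed", 1): set_status})
store.save("shipment-1", Action("status_changed", 1, {"status": "collected"}))
store.save("shipment-1", Action("status_changed", 1, {"status": "delivered"}))
print(store.last_event("shipment-1").state)
```

`replay` is there for the rare case where the snapshots need to be recalculated. Day-to-day reads would only touch `last_event`.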
I implemented the idea in PHP here: https://github.com/thiagomarini/binocular
Check out the tests, which use an in-memory implementation.
I think the correct way to use it is to have a factory method in the aggregate root to hydrate the object with the current state data.
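As a sketch of what such a factory method could look like (in Python rather than PHP, with made-up fields `status` and `carrier`), the aggregate root is built straight from the state held by the last event:

```python
from dataclasses import dataclass
from typing import Any, Dict, Optional


@dataclass
class Shipment:
    id: str
    status: str
    carrier: Optional[str] = None

    @classmethod
    def from_state(cls, shipment_id: str, state: Dict[str, Any]) -> "Shipment":
        """Factory method: build the aggregate from the latest stored state."""
        return cls(
            id=shipment_id,
            status=state.get("status", "unknown"),
            carrier=state.get("carrier"),
        )


# In practice `state` would come from the last event in the store.
shipment = Shipment.from_state(
    "shipment-1", {"status": "in_warehouse", "carrier": "Example Carrier"}
)
print(shipment.status)
```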