Event Sourcing Part III: Handling Different Access Patterns

Mario Bittencourt
Published in SSENSE-TECH · Jul 17, 2020

In Part II of this SSENSE-TECH series, I provided more details and a sample implementation of an application that uses Event Sourcing. With that, you should have the knowledge and tooling to start your project, as you now have the basis to model your entities and events, as well as to manage their creation, retrieval, and update cycle. But… there is much more to Event Sourcing, and this article will present additional topics that you will likely encounter when developing your application.

Read Access Patterns

So far, I have implemented the access pattern of retrieving one entity at a time, always by its unique identifier. This already allows us to interface with our domain, but it will rarely be enough for a real application.

Since reconstructing the state of an entity happens by replaying its events in order, we do not have a direct way to perform ad-hoc queries like you would for a non-Event Sourced application. However, before you rule it out, remember that we use the CQRS pattern, which lets us create as many read models as needed from our stream of events, each optimized for the use case it has to serve.

Let’s go back to our example domain. After an order has been placed and the payment created, the business has determined that we should provide a way for the customer to check the status of the order, including the payment. To streamline the customer experience, filters have been added to select only payments that have been settled, refunded, or declined, for example.

Figure 1. Application components

Clearly, we can’t satisfy these requirements with what we have so far. To allow the filter, we will have to create a read model of the payments. This read model is also known as a projection.

Figure 2. Write model (left) and the corresponding read model (right)

The first thing to notice about the read model is that it is optimized to serve the client, which means it can be denormalized and use a different persistence technology from the one used by the Event Store.
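
As an illustration, a denormalized read model backing the payment status page could be as simple as one flat record per payment; the field names below are assumptions for the sketch, not an actual schema:

```typescript
// One flat row per payment: everything the status page filters on, with no
// joins and no event replay needed at query time. Field names are illustrative.
interface PaymentReadModel {
  orderId: string;
  paymentId: string;
  status: "settled" | "refunded" | "declined" | "pending";
  amount: number;     // assumed to be in minor units (e.g. cents)
  updatedAt: string;  // ISO-8601 timestamp of the last event applied
}
```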

Building a projection

To create a projection in its simplest form, you have to consume the event stream, manipulate the events to extract the information needed, and store it for later retrieval. This is illustrated in Figure 3.

Figure 3. Simplified view of the creation of a projection

In practice, it is common to think about the following parts involved:

  1. The projectionist: connects to the event stream, receives the events, passes them to the projector, and updates the ledger;
  2. The ledger: maintains the position, within the stream, of the last event received by the projectionist and processed by the projector;
  3. The projector: receives the events, decides which ones are relevant to the projection, extracts the information it needs from them, and passes it to the projection;
  4. The projection: receives the information it needs to store and persists it according to the persistence technology used.

Figure 4. The components found in the generation of a projection

You would run a continuous task like the one shown in the snippet below.
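
A minimal sketch of what that task could look like, assuming simple DomainEvent, Ledger, Projector, and EventStream interfaces; all names and signatures here are illustrative, not the exact implementation:

```typescript
// Minimal sketch of the continuous projection task; the interfaces are
// assumptions for illustration.
interface DomainEvent {
  type: string;
  data: Record<string, unknown>;
  position: number; // position of this event within the stream
}

interface Ledger {
  lastPosition(): Promise<number>;
  moveTo(position: number): Promise<void>;
}

interface Projector {
  handle(event: DomainEvent): Promise<void>;
}

interface EventStream {
  // A volatile subscription: delivers every event from a position onward,
  // leaving position tracking to the subscriber.
  subscribeFrom(
    position: number,
    onEvent: (event: DomainEvent) => Promise<void>
  ): void;
}

class PaymentProjectionist {
  constructor(
    private readonly stream: EventStream,
    private readonly ledger: Ledger,
    private readonly projector: Projector
  ) {}

  async run(): Promise<void> {
    // Resume from the last position recorded in the ledger.
    const from = await this.ledger.lastPosition();
    this.stream.subscribeFrom(from + 1, async (event) => {
      // Pass the event, as is, to the projector...
      await this.projector.handle(event);
      // ...and only then advance the ledger, so a crash replays the event.
      await this.ledger.moveTo(event.position);
    });
  }
}
```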

You have a PaymentProjectionist that receives the event stream, a ledger, and the projector.

The projectionist subscribes to the stream from the last known position recorded in the ledger. The example code uses volatile subscriptions to the EventStore, so it is our responsibility to keep track of the last position. EventStore also supports the concept of persistent subscriptions; if you decide to use those, the EventStore itself will maintain the last position.

For each event received, the projectionist will pass it, as is, to the projector and, upon its return, move the ledger to the next position.

The projector specifies which events it wants to handle, manipulates their content, and passes the result to the projection. In our example, we can see that the projector simply drops any transaction details and decline codes from the events it receives.

Finally, the projection takes the information and persists it using whatever technology was chosen. In our example, we used SQLite for simplicity.
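
To make this concrete, here is a sketch of what such a projector and projection pair could look like, using the better-sqlite3 package as an assumed dependency; the event types and field names are illustrative:

```typescript
import Database from "better-sqlite3"; // assumed dependency for this sketch

interface DomainEvent {
  type: string;
  data: Record<string, unknown>;
}

// The projection persists the extracted information; SQLite, as in the article.
class PaymentStatusProjection {
  private readonly db = new Database("payments-read-model.db");

  constructor() {
    this.db
      .prepare(
        "CREATE TABLE IF NOT EXISTS payment_status " +
          "(payment_id TEXT PRIMARY KEY, status TEXT NOT NULL)"
      )
      .run();
  }

  store(row: { paymentId: string; status: string }): void {
    this.db
      .prepare(
        "INSERT INTO payment_status (payment_id, status) VALUES (?, ?) " +
          "ON CONFLICT(payment_id) DO UPDATE SET status = excluded.status"
      )
      .run(row.paymentId, row.status);
  }
}

// The projector selects the relevant events and drops everything the read
// model does not need (transaction details, decline codes, ...).
class PaymentStatusProjector {
  constructor(private readonly projection: PaymentStatusProjection) {}

  async handle(event: DomainEvent): Promise<void> {
    const relevant = ["PaymentSettled", "PaymentRefunded", "PaymentDeclined"];
    if (!relevant.includes(event.type)) return; // not for this projection

    this.projection.store({
      paymentId: String(event.data.paymentId),
      // e.g. "PaymentSettled" -> "settled"
      status: event.type.replace("Payment", "").toLowerCase(),
    });
  }
}
```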

One aspect that can’t be stressed enough: because projections are used for read-only purposes, you should feel comfortable having more than one of them.

If we take our example, imagine that our business stakeholders decide to use the bank and decline code information present in all declined transactions, with the goal of optimizing the selection of the bank that will settle the payments. This requires a completely different access pattern and different information than the existing projection provides.

Figure 5. Write model (left) and the corresponding read model (right)

As you can see in the code snippet, we only care about the PaymentDeclined events, taking the bank name and decline code and ignoring all other events and data.
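
A sketch of what this second projector could look like; the event type follows the article, while the field names and the store callback are assumptions:

```typescript
// The second projector only handles PaymentDeclined and keeps just the bank
// name and decline code; everything else is ignored.
interface DomainEvent {
  type: string;
  data: Record<string, unknown>;
}

class DeclineAnalysisProjector {
  constructor(
    private readonly store: (row: { bank: string; declineCode: string }) => void
  ) {}

  handle(event: DomainEvent): void {
    if (event.type !== "PaymentDeclined") return; // drop all other events

    this.store({
      bank: String(event.data.bank),
      declineCode: String(event.data.declineCode),
    });
  }
}
```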

This provides great flexibility for development and business alike.

Some of the benefits are:

  1. No need to build all possible read models upfront: if a new use case appears, create a new read model, or adjust an existing one;
  2. Freedom to choose the best persistence technology and data structure: use a key-value store for one use case, NoSQL for another, a column-based store for a third;
  3. Reduced risk: found an error in your projection? Create a new one and, once it is up to date, drop the old one without business disruption. Since you are not changing the write models, creating new projections does not affect the workflow that produces the events.

At this point, we are able to provide all the expected functionality with the aforementioned benefits. However, Event Sourcing does come with some challenges due to its nature. Let’s explore some of them, along with common solutions.

(Very) Long Streams

As mentioned in Part I of this Event Sourcing series, when you start working with Event Sourcing, you realize that it is very important to understand how the state of your entities evolves over time, even more so than the data structures that will hold it.

Some domains will find themselves with many events for each entity, which naturally raises the question of performance when reconstructing the current state of an entity. After all, you would have to start from the beginning and apply the events one after the other. If your entity has only a handful of events, that is usually not an issue, as retrieving and replaying them will likely be inexpensive.

If that is not your case and you find yourself with hundreds of events for a single entity, this can be problematic for reconstructing the current state, or even for maintaining and creating projections.

Before applying any of the following solutions, I recommend you look closely at your model. Sometimes this is an indication that you ended up creating an entity that is too big and should actually be broken down into smaller ones. While you would not necessarily cut down the number of events emitted, you would retrieve just the ones needed to reconstruct the new, smaller entity.

Snapshots

Snapshots, like the name implies, are “photographs” that capture the state of an entity at a given point in time. Much like a projection, snapshots are read-only artifacts and can be generated, thrown away, and re-generated as needed.

Figure 6. Obtaining an entity using a snapshot to skip replaying the entire stream

As depicted in Figure 6, if you decide to use snapshots, your repository would first try to locate a snapshot for the desired entity. If none is found, you retrieve all events from the beginning of the stream for that entity. If a snapshot exists, you load it, set the entity state from it, and then retrieve only the events from the stream starting at position N+1.
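
A sketch of this snapshot-aware loading, assuming hypothetical SnapshotStore and EventStore interfaces and a simplified Payment entity:

```typescript
// Snapshot-aware repository loading; all shapes here are assumptions.
interface DomainEvent {
  type: string;
  data: Record<string, unknown>;
}

interface PaymentState {
  id: string;
  status: string;
}

class Payment {
  private constructor(private state: PaymentState) {}

  static blank(id: string): Payment {
    return new Payment({ id, status: "unknown" });
  }

  static fromState(state: PaymentState): Payment {
    return new Payment({ ...state });
  }

  apply(event: DomainEvent): void {
    // Fold the event into the state; the real logic depends on the domain.
    if (typeof event.data.status === "string") {
      this.state = { ...this.state, status: event.data.status };
    }
  }
}

interface Snapshot {
  state: PaymentState;
  version: number; // position N of the last event folded into the snapshot
}

interface SnapshotStore {
  load(entityId: string): Promise<Snapshot | null>;
}

interface EventStore {
  // Reads the entity's events starting at a position (inclusive).
  readFrom(entityId: string, fromPosition: number): Promise<DomainEvent[]>;
}

async function loadPayment(
  id: string,
  snapshots: SnapshotStore,
  events: EventStore
): Promise<Payment> {
  const snapshot = await snapshots.load(id);

  // Start from the snapshot state if one exists, otherwise from scratch.
  const payment = snapshot
    ? Payment.fromState(snapshot.state)
    : Payment.blank(id);

  // Replay only the events after the snapshot (N + 1), or all of them.
  const from = snapshot ? snapshot.version + 1 : 0;
  for (const event of await events.readFrom(id, from)) {
    payment.apply(event);
  }

  return payment;
}
```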

If your entities have streams of widely varying lengths, you should use an adaptive approach to avoid generating snapshots for entities with just a few events. You establish a threshold and, every time an entity’s stream grows beyond it, you generate or update the snapshot to reflect the state up to that moment.
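
As a sketch, the adaptive decision can be as simple as comparing stream growth against a threshold; the value below is an arbitrary assumption:

```typescript
// Snapshot only when the stream has grown enough since the last snapshot.
const SNAPSHOT_THRESHOLD = 100; // arbitrary illustrative value

function shouldSnapshot(
  streamLength: number,
  lastSnapshotVersion: number
): boolean {
  return streamLength - lastSnapshotVersion >= SNAPSHOT_THRESHOLD;
}
```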

If you decide that you have to use snapshots, check if your Event Store already provides native support to generate them for you. If not, you will have to decide the strategy and implement it yourself.

Create New Aggregated Streams

The previous solution helps us to speed up reconstruction of the state but still leaves us with one issue: the stream is still very long.

Imagine you have been running your system for some time and your entities now have thousands of events each. Your streams will be very long and take up premium space on your persistence medium. After all, accessing and retrieving the events must be fast.

In a system like this, the chances of having a use case that forces you to access the state from the beginning of the stream are low. A good example is a banking application, if you consider each transaction in your account to be an event. How many of us need to access a transaction that happened 5 years ago?

To address this issue, you can leverage the creation of new streams that, similar to snapshots, have their first event as a representation of the state at that given time.

Figure 7. Breaking long streams

When you use this approach, you can archive the old stream or move it to a less expensive — and likely slower — persistence medium.

Using our banking example, this translates into having the first event represent the balance at the beginning of the statement cycle, and it also explains why you can only go back a limited number of days or months. Need to retrieve a longer period? No problem! Request it and it will be processed offline. Once ready, you receive a notification and gain access to it.
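
As a sketch, rolling over to a new stream could look like the following, where the first event carries the opening balance; the names and shapes are assumptions:

```typescript
// Stream roll-over: open a new stream whose first event summarizes the
// aggregated state of the previous, archived stream.
interface DomainEvent {
  type: string;
  data: Record<string, unknown>;
}

function openNewStatementStream(
  accountId: string,
  cycle: string, // e.g. "2020-07"
  closingBalance: number // state aggregated from the previous stream
): { streamId: string; firstEvent: DomainEvent } {
  return {
    streamId: `account-${accountId}-${cycle}`,
    firstEvent: {
      type: "StatementOpened",
      // The opening balance stands in for the archived stream's history.
      data: { accountId, cycle, openingBalance: closingBalance },
    },
  };
}
```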

Consider using this solution if your streams are very long, your use cases do not require constant access to the entire history, and you can establish a steady temporal cutoff for each stream (e.g., every day, every month).

What is next?

In the fourth and final part of our Event Sourcing series, I will discuss the alternatives you can use when you need to evolve your application and therefore have to evolve the events as well. Additionally, Part IV will cover the particularities of handling GDPR when the user exercises their right to be forgotten.

Editorial reviews by Deanna Chow & Liela Touré.

