Event Streaming: The Revolut way

César Luis Alvargonzález · Revolut Tech · Jul 22, 2020

We told you all about EventStore in our first installment of this series, but if you didn’t get a chance to read it, let’s just have a quick recap before we launch into part two.

EventStore is our bespoke event-streaming platform. It’s one of the backbones of Revolut, processing more than 37 billion streamed events each month. Unlike other companies, we didn’t just grab a solution off the shelf, but created a custom platform from scratch, tailored to our every need. Applications can stream all 4 years of available data, which is growing by an order of magnitude year over year. Imagine the possibilities!

Its main purpose is to decouple applications, but we also use it for fraud detection, FX-risk calculation, marketing and promotion, and building derived systems (i.e. creating materialised views from a stream of events).

One of our major design principles has been speed: minimising the latency from the moment an event is created in the publisher app until it is delivered to the consumer app. Our target is around 20 milliseconds, and less than 10 milliseconds for delivery from EventStore to the client.

In this article, we’ll be exploring EventStream, the other part in the event streaming puzzle at Revolut. We’ll take a look at how EventStream and EventStore come together to help us manage all of the events that occur within the app, and how we’ve created this bespoke architecture to help us do it better. Read on to find out more!

Architecture

Specifically, we'll walk through one of the flows processed by EventStream. Before we get into it, here's a quick overview of the EventStore and EventStream architecture.

The EventStore/EventStream architecture consists of two main systems, one responsible for publishing events, and the other for streaming events.

EventStore

The publishers are the different applications that generate events; in Revolut's architecture, virtually every app is a publisher. When a publisher performs an action, it persists the resulting mutations and generates the corresponding events in a single atomic operation. Once an event has been persisted in the publisher's database for durability, the publisher attempts to publish it on a best-effort basis. If this process fails, or the publisher dies, a reconciler process delivers any unpublished events still present in the publisher's database.

The EventStore server is responsible for receiving the events sent by the publishers and persisting them into its own database.

Bonus points if you noticed that this is the same flow from the previous article!
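For readers who skipped part one, here's a minimal sketch of that publish flow. The repositories, the transactions helper and the eventStoreClient are hypothetical names for illustration, not the actual Revolut SDK; only the pattern (persist mutation and event atomically, publish best-effort, let the reconciler re-deliver the rest) comes from the article.

```java
// Illustrative sketch only: every identifier below is an assumption, not the real EventStore SDK.
void performAction(Command command) {
    Event event = transactions.atomically(() -> {
        State newState = stateRepository.apply(command);   // persist the business mutation
        Event generated = Event.from(newState);            // generate the corresponding event
        eventRepository.saveUnpublished(generated);        // persisted in the same transaction
        return generated;
    });
    try {
        eventStoreClient.publish(event);                   // best-effort publish to EventStore
        eventRepository.markPublished(event.id());
    } catch (RuntimeException e) {
        // nothing to do: the reconciler periodically re-publishes events still marked unpublished
    }
}
```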

EventStream

Once the events have been successfully persisted into EventStore's database, a new flow starts, focused on delivering them to the processor-applications whose subscriptions are eligible for those events.

The EventStream server is responsible for receiving subscription-requests and delivering a stream of events as soon as they are persisted. The event-processor also plays an important role, which you'll learn about shortly.

Let’s deep dive into this specific flow. By the end of this article, you’ll know all the ins and outs of the EventStream streaming platform.

Database Topology

EventStream relies on Postgres 12 as the server-side persistence layer. As you can see in the image below, the database consists of two different Postgres clusters: database-events acts as the main cluster, and database-events-archive as its foreign cluster. Each cluster is composed of a master and a set of read replicas in standby. Reads and writes are properly segregated in order to reduce the load on the master nodes.

The database holds more than 12 billion records and adds about a billion more each month, which raises the question of how to operate an ever-growing system. That's why we rely on a foreign cluster, which lets us operate smaller, more manageable, independent database systems.

The main cluster contains the last 12 months' worth of heavily indexed data and serves the majority of the subscriptions, while the archive cluster contains all the historical partitions before that point. The archive cluster is rarely touched: we only use it when applications need to stream events from the very beginning.

Database Schema

In order to hold the constantly growing dataset, the database relies on Postgres native partitioning and pg_partman to create more manageable tables, partitioned by date on a monthly basis. This helps to increase query performance, since lookups will only be performed on the partitions that actually hold the query’s time-frame.

Although the tables have a hierarchical structure from a relational point of view, the system is designed as append-only from an application perspective. Only the most recent partition is mutated, and older partitions are treated as immutable blocks, similar to how SSTables work in systems like Cassandra or Bigtable.

Another advantage of partitioning is that it eases maintenance operations such as indexing, since operations can be performed in parallel on smaller tables.

Subscription

Processor-applications can create subscriptions that receive a stream of events based on the subscription-request criteria. Each subscription is a sequential stream of events, which can be formed by a mixture of different model-types and event-types.

A subscription starts from a point in time, specified either by a monotonically increasing offset, or its timestamp counterpart. The stream of events is divided into two substreams: the offline stream contains past historical data, while the live stream contains real-time (i.e. current and future) events.
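To make this concrete, here is a rough sketch of what a subscription-request could carry. The field names and types are assumptions for illustration; only the concepts (model-types, event-types, and an offset or timestamp starting point) come from the article.

```java
// Hypothetical shape of a subscription-request, not the actual EventStream API.
import java.time.Instant;
import java.util.Optional;
import java.util.Set;

record SubscriptionRequest(
        String subscriptionName,
        Set<String> modelTypes,            // e.g. "Transaction", "Balance"
        Set<String> eventTypes,            // e.g. "TransactionCompletedEvent"
        Optional<Long> fromOffset,         // monotonically increasing offset to start from...
        Optional<Instant> fromTimestamp    // ...or its timestamp counterpart
) {}
```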

Partitioning

Subscriptions are meant to be processed in a sequential fashion, executing one event at a time, in order to avoid undesirable race conditions.

In order to increase the processing throughput, subscriptions can be partitioned into sub-streams. Partitions allow parallelization of the stream by splitting the data into sub-streams that can be executed independently while still maintaining the sequential execution. In the example below, you can see the original subscription on the left and the same subscription after being split into 3 partitions by a given partition key on the right.

An application can configure the number of partitions and the partition key by which each event is hashed and assigned to a specific partition. There are two partitioning strategies, which can be mixed and matched:

  • Scale up: Several partitions can be executed within a single event-processor, allowing an increase in processing throughput per event-processor.
  • Scale out: Partitions can be distributed across multiple instances in order to evenly distribute the processing-load across a cluster of event-processors.

One key element that separates EventStore's architecture from other streaming platforms is that partitioning is applied on the fly, per subscription. That's why repartitioning is a trivial operation that can be executed whenever an event-processor needs to scale up or scale down the number of partitions, or change the partitioning key.
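A minimal sketch of this kind of on-the-fly partition assignment follows. The class and method names are illustrative; only the idea of hashing a configurable key into one of N sub-streams, per subscription, comes from the article.

```java
// Illustrative partition assignment: hash the partition-key value into [0, partitionCount).
import java.nio.charset.StandardCharsets;
import java.util.zip.CRC32;

final class PartitionAssigner {

    /** Maps a partition-key value (e.g. an accountId) to a partition index in [0, partitionCount). */
    static int partitionOf(String partitionKeyValue, int partitionCount) {
        CRC32 crc = new CRC32();                                 // stable, well-distributed hash of the key
        crc.update(partitionKeyValue.getBytes(StandardCharsets.UTF_8));
        return (int) (crc.getValue() % partitionCount);
    }
}

// Because the assignment is computed per subscription at read time, "repartitioning" just means
// recomputing partitionOf with a different partitionCount (or a different key) on the next subscribe:
// int partition = PartitionAssigner.partitionOf(event.accountId(), 3);   // accountId() is hypothetical
```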

Client Side

The EventStream architecture comes with a Java SDK, providing a set of high-level components that make integrating event-processing into applications a simple task, resulting in faster development and integration times.

Consumers

The EventStream SDK provides the SingleEventConsumer which allows us to configure three properties: the model-type, the event-type that the consumer is interested in and, optionally, a payload-filter that can filter down based on the event’s payload-attributes.

In the example below, the BalanceChangedEventConsumer will be executed whenever a BalanceChangedEvent with no balance in either euros or pounds arrives at the event-processor.
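The original post showed this as a code snippet; here is a rough reconstruction. The SingleEventConsumer interface and the BalanceChangedEvent accessors are assumed shapes, not the real SDK; only the described behaviour (model-type, event-type, optional payload-filter on EUR/GBP balances) comes from the article.

```java
// Assumed SDK shape, reconstructed for illustration only (the real interface may differ).
interface SingleEventConsumer<E> {
    String modelType();
    Class<E> eventType();
    default boolean payloadFilter(E event) { return true; }   // optional payload-filter
    void accept(E event);
}

// BalanceChangedEvent is assumed to be an event payload class provided elsewhere by the SDK.
class BalanceChangedEventConsumer implements SingleEventConsumer<BalanceChangedEvent> {

    @Override
    public String modelType() {
        return "Balance";                                      // the model-type of interest
    }

    @Override
    public Class<BalanceChangedEvent> eventType() {
        return BalanceChangedEvent.class;                      // the event-type of interest
    }

    @Override
    public boolean payloadFilter(BalanceChangedEvent event) {
        // only events with no balance in either euros or pounds reach accept()
        return event.balance("EUR").signum() == 0
                && event.balance("GBP").signum() == 0;
    }

    @Override
    public void accept(BalanceChangedEvent event) {
        // business logic triggered by the filtered event
    }
}
```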

Similarly, the MultiEventConsumer allows users to subscribe to several events from the same model-type. For example, the TransactionNotificationEventConsumer will execute its lambda when either a TransactionCompletedEvent or a TransactionRevertedEvent is received by the event-processor.
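Again, a rough sketch with an assumed builder-style API; only the two event-types and the single lambda handler come from the article.

```java
// Hypothetical API shape: two event-types of the same model dispatched to one handler.
MultiEventConsumer transactionNotificationEventConsumer = MultiEventConsumer
        .forModelType("Transaction")
        .onEventTypes(TransactionCompletedEvent.class, TransactionRevertedEvent.class)
        .handle(event -> notificationService.sendFor(event));   // executed for either event-type
```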

Event-Processor

Every application that consumes events relies on the event-processor. This component allows the execution of several consumers at the same time. It is responsible for composing the subscription-request and dispatching the subscription’s events to the consumers.

The Event-Processor requires persistence, primarily to store the subscription's offsets, which keep track of the consumers' positions within the subscription. Additionally, it persists other metadata, like the leases and locks used for partition mutual exclusion.

At the time of writing, the Event-Processor relies on Redis for this, but other appropriate key-value stores may be considered in the future.

Execution

Each partition is executed in a parallel-sequential model: the partition itself is processed sequentially, receiving one event at a time, executing all the suitable consumers in parallel, and finally checkpointing on Redis once every consumer has processed the event.
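In pseudo-Java, the loop for a single partition could look roughly like this. Partition, Event, consumersFor and checkpointStore are illustrative names, not the actual SDK; the parallel-sequential structure is the point.

```java
// Sketch of the parallel-sequential model: sequential per partition, parallel per event.
import java.util.List;
import java.util.concurrent.CompletableFuture;

void runPartition(Partition partition) {
    for (Event event : partition.events()) {                         // sequential within the partition
        List<CompletableFuture<Void>> executions = consumersFor(event).stream()
                .map(consumer -> CompletableFuture.runAsync(() -> consumer.accept(event)))
                .toList();                                           // all suitable consumers in parallel
        CompletableFuture.allOf(executions.toArray(new CompletableFuture[0])).join();
        checkpointStore.save(partition.id(), event.offset());        // checkpoint (e.g. on Redis) afterwards
    }
}
```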

Anatomy of a stream

Focusing now on the server side, we need to understand how the event-subscriptions are actually generated. For this purpose, let's review the offline and the live stream.

Offline Stream

The offline stream is a finite stream that retrieves all the historical data, starting from an arbitrary point in the past until the present. In order to do that, the event-stream simply opens a cursor.

Such a simple operation brings some unwanted complexity, since EventStore's database uses asynchronous replication, which is prone to replica lag, albeit of only a few milliseconds. If an offline stream is started against a lagging replica, the stream may miss events, because the query can complete before the data has reached the replica. Instead of relying on the DBOps team to keep replica lag within optimal values, we decided to build a system that is resilient to lag. So we had a problem to solve.

Originally, we overcame the issue by connecting to the master node for the offline stream, but this put too much pressure on it. That's why we came up with a better proposal: one that would not depend on the master node.

Nowadays, the offline stream checks the replica before opening the cursor, verifying that the subscription's timeframe is already present on the replica we are connecting to. The query is executed only when there is a guarantee that all the data is present on the replica, so even if there is replica lag, it should not affect the outcome of the query. Otherwise, the master node is used as a last resort.
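A minimal sketch of that check, with assumed helpers (replicaPool, masterConnection): the only Postgres-specific detail used is pg_last_xact_replay_timestamp(), which reports how far a standby has replayed.

```java
// Sketch: pick a replica that already holds the requested timeframe, else fall back to the master.
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.sql.Statement;
import java.time.Instant;

Connection chooseOfflineStreamConnection(Instant timeframeUpperBound) throws SQLException {
    for (Connection replica : replicaPool.connections()) {
        if (!lastReplayedTimestamp(replica).isBefore(timeframeUpperBound)) {
            return replica;                        // all requested data is guaranteed to be present
        }
    }
    return masterConnection();                     // last resort: the master never lags behind itself
}

Instant lastReplayedTimestamp(Connection replica) throws SQLException {
    try (Statement statement = replica.createStatement();
         ResultSet rs = statement.executeQuery("SELECT pg_last_xact_replay_timestamp()")) {
        rs.next();
        // NULL on a primary or before any replay; null-handling omitted for brevity in this sketch
        return rs.getTimestamp(1).toInstant();
    }
}
```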

Live Stream

The live stream is in charge of retrieving real-time events: the events that are currently being persisted into the EventStore system. It is an infinite stream that starts from the present.

Every time an event is persisted into the database, an AFTER INSERT trigger publishes the event to a Postgres channel, which acts as a FIFO queue, maintaining insertion order.

Each Raw-Live-Stream-Publisher opens a connection against the master node, listening to the events channel and receiving every single event that is inserted into the EventStore database.

Since subscriptions are not usually interested in every event, the live streams of different subscriptions can subscribe to the same Raw-Live-Stream-Publisher and apply a set of filters, avoiding unnecessary computation. These filters check the event's partition, model-type, event-type and even specific properties of the payload before the event is sent to the client.
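A Raw-Live-Stream-Publisher can be sketched with the PostgreSQL JDBC driver's LISTEN/NOTIFY support. The channel name, payload parsing and the LiveStream filter API below are assumptions for illustration; only the trigger-to-channel mechanism and the per-subscription filtering come from the article.

```java
// Sketch: listen on the channel filled by the AFTER INSERT trigger and fan out to filtered live streams.
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.Statement;
import java.util.List;
import org.postgresql.PGConnection;
import org.postgresql.PGNotification;

void publishRawLiveStream(String masterJdbcUrl, List<LiveStream> liveStreams) throws Exception {
    try (Connection connection = DriverManager.getConnection(masterJdbcUrl)) {
        try (Statement statement = connection.createStatement()) {
            statement.execute("LISTEN events");                  // channel name is an assumption
        }
        PGConnection pgConnection = connection.unwrap(PGConnection.class);
        while (true) {
            PGNotification[] notifications = pgConnection.getNotifications(1_000);  // wait up to 1s
            if (notifications == null || notifications.length == 0) {
                continue;
            }
            for (PGNotification notification : notifications) {
                Event event = parse(notification.getParameter());    // payload attached by the trigger
                liveStreams.stream()
                        .filter(stream -> stream.accepts(event))     // partition/model/event/payload filters
                        .forEach(stream -> stream.push(event));
            }
        }
    }
}
```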

Reactive Streams

The EventStream architecture relies heavily on Reactive Streams. Both the offline and the live stream are Project Reactor Fluxes which, concatenated together, create the final stream that is sent to the client.

The client and server are connected by an RSocket on top of a TCP socket, which extends the Reactive Streams semantics to the network protocol. In a way, the client opens a stream that extends all the way to the server side, with the RSocket protocol acting as the link between the client-side and server-side reactive streams.

The server side does work only under client demand, which allows the client side to control the pace of consumption.
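In Reactor terms this composition is essentially a Flux.concat; the two source factories below are placeholders for the server-side streams described above.

```java
// Flux.concat is a real Reactor operator; offlineStream/liveStream are hypothetical factories.
import reactor.core.publisher.Flux;

Flux<Event> subscriptionStream(SubscriptionRequest request) {
    Flux<Event> offline = offlineStream(request);   // finite: cursor over the historical partitions
    Flux<Event> live = liveStream(request);         // infinite: fed by the Raw-Live-Stream-Publisher
    return Flux.concat(offline, live);              // live is only subscribed once offline completes
}
```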

Putting it in motion

Each client defines a number of events to keep in flight, e.g. 4k. The client requests those 4k events and the offline stream starts traversing the cursor until it either completes (no more events) or produces the desired number of events. If the offline stream is depleted, the live stream takes over, producing events until the requested 4k is eventually reached.

Once the client has consumed 75% of the 4k events, it asynchronously requests another 3k, and this flow continues until either the client or the server eventually disconnects.
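This demand pattern maps closely onto Reactor's limitRate operator, whose default replenishing threshold is 75% of the prefetch. A client-side sketch, where eventStreamClient.subscribe is a placeholder for the SDK call and 4k is the figure from the article:

```java
// limitRate(4000) requests 4000 events upfront and, once 75% have been consumed,
// asynchronously requests roughly 3000 more, matching the flow described above.
import reactor.core.publisher.Flux;

Flux<Event> pacedStream = eventStreamClient.subscribe(request)   // placeholder client call
        .limitRate(4000);
```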

Where to next?

As we mentioned last time, this platform is always evolving and in active development. We're always looking at how we can create something even better, perfectly tailored to our needs. One of the current limitations of our architecture is its high dependency on a single master, and to overcome this we're planning to study new database topologies. Today the live stream can only open listen-connections to the master, because with streaming replication triggers can only be executed on the master node. By changing the topology so that some nodes use logical replication, triggers could run on those nodes as well, allowing the LISTEN/NOTIFY connections to be distributed across the cluster and reducing the pressure on the master node.

Conclusion

Now you know all about how EventStream works at Revolut. Across these two articles you’ve learned about how we store events, and how we manage them. We built this bespoke service to be able to have full freedom over how we manage these services, because freedom gives us control. This platform will continue to evolve over time — stay in the loop to discover new developments at Revolut!

Want to work with me?

Revolut is a global community with offices in London, Krakow, New York, Berlin, Vilnius and other cities. The number of our customers has already topped 12 million. We’re looking for engineering talent to help us build the best money management solution. Check out opportunities on the Careers page to join us.
