Anatomy of an Event Streaming Platform — Part 1
Understand the concepts, architecture, and the ecosystem of a real-time event streaming platform
Businesses today are dealing with a massive influx of real-time data. Financial transactions, application logs, website clickstreams, and IoT telemetry data are a few examples.
As a business, you can’t simply ignore the importance of real-time data. Even if you do, your competitors won’t. Modern businesses use this data to gain meaningful insights in a timely fashion and stay ahead of the competition.
Crude oil is a fitting analogy for real-time data: it has to be extracted, transported, and refined before it yields meaningful value. The question is, how do we put together a proper system architecture to harness the power of real-time data?
Event streaming platforms play a central role in ingesting, storing, and processing real-time data in a scalable and resilient manner. Before using one, you must first understand the architecture and ecosystem of a typical event streaming platform. In this article series, we’ll take a deep dive into the architecture of a modern event streaming platform and how you can leverage it to your competitive advantage.
Before jumping straight into the architecture, this article sets the stage by introducing the concept of an Event Streaming Platform, its potential use cases, and some implementations that exist in the market.
A primer on events and event streams
Events are notifications of a change in state. Notifications are issued or published, and interested parties can subscribe and act on them. Typically, the issuer of the notification does not know what action is taken and receives no feedback that the notification has been processed.
An event stream is a continuous unbounded series of events. The start of the stream may have occurred before we started to process the stream. The end of the stream is at some unknown point in the future.
Each event in a stream carries a timestamp that denotes the point when the event occurred. Events are ordered based on this timestamp.
In general, we can think of a stream as related events in motion.
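The ideas above can be sketched in a few lines of plain Python. This is a minimal illustration, not any particular ESP’s API; the `Event` class, the sensor names, and the readings are all hypothetical:

```python
import time
from dataclasses import dataclass, field

@dataclass(frozen=True)
class Event:
    """A notification of a change in state, stamped with when it occurred."""
    key: str
    payload: dict
    timestamp: float = field(default_factory=time.time)

# A (conceptually unbounded, here truncated) stream of temperature readings.
events = [
    Event("sensor-1", {"temp_c": 21.5}, timestamp=1000.0),
    Event("sensor-2", {"temp_c": 19.0}, timestamp=1002.0),
    Event("sensor-1", {"temp_c": 22.1}, timestamp=1001.0),
]

# Events in a stream are ordered by the timestamp of occurrence.
stream = sorted(events, key=lambda e: e.timestamp)
print([e.timestamp for e in stream])  # [1000.0, 1001.0, 1002.0]
```

Note that the stream is ordered by when each event occurred, not by when it arrived; real platforms make the same distinction between event time and arrival time.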
What is an Event Streaming Platform (ESP)?
An Event Streaming Platform (ESP) is a highly scalable and durable system capable of continuously ingesting gigabytes of events per second from various sources. The data collected is available in milliseconds for intelligent applications that can react to events as they happen.
The ultimate goal of an ESP is to capture business events as they happen and react to them in real-time to deliver responsive and personalized customer experiences.
What type of data can it ingest?
An ESP can ingest data from various data sources such as website clickstreams, database event streams, financial transactions, social media feeds, server logs, and events coming from IoT devices.
What can you do with ingested data?
Ingesting and storing real-time data alone doesn’t serve the purpose of an ESP to its fullest. The captured data must be made available to applications that know how to process it to gain real-time, actionable insights.
Some examples include:
Stream processing applications
These applications query an incoming stream of data to detect interesting patterns and take action on them. Real-time anomaly detection and infrastructure monitoring are a few examples.
Others perform computations on incoming streams of data to derive metrics in real time. Real-time counters, moving averages, and sliding-window operations are typical use cases. Applications in the financial trading and advertising fields are good examples.
Real-time dashboards
These applications present incoming stream data on interactive real-time dashboards, which are well suited for monitoring and making informed decisions. Real-time sales leaderboards, traffic monitoring, and incident reports are some examples.
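To make the sliding-window idea concrete, here is a minimal sketch of a moving average over a stream of values in plain Python. The window size and the price values are illustrative only:

```python
from collections import deque

def moving_average(values, window=3):
    """Yield the running average over a sliding window of the stream."""
    win = deque(maxlen=window)  # old values fall off automatically
    out = []
    for v in values:
        win.append(v)
        out.append(sum(win) / len(win))
    return out

# A (truncated) stream of trade prices.
prices = [10.0, 11.0, 12.0, 13.0]
print(moving_average(prices))  # [10.0, 10.5, 11.0, 12.0]
```

A real stream processor (Kafka Streams, Flink, and the like) applies the same logic continuously and incrementally rather than over a finite list.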
Real-time data pipelines
These applications move ingested data from the ESP to multiple targets such as data lakes, data warehouses, and OLAP databases in real-time. Those systems are best suited for offline analytics and performing interactive queries. Often, the moved data is fed into batch analytics engines such as Apache Spark.
ESP and Messaging Systems — are they the same?
No, streaming and messaging are two different things. Although they share some similarities, they serve different purposes. For example, unlike traditional messaging systems, ESPs don’t delete a message after it has been consumed.
You can refer to my following article for a deeper comparison.
Comparing Enterprise Messaging and Event Streaming
They are complementary technologies instead of competing technologies
Why do I need an ESP?
If you have a real-time application use case similar to the above, you should consider adopting an ESP in your organization.
The following reasons justify that decision.
Decouple data producers and consumers
By putting an ESP in between your data producers and consumers, you can make them loosely coupled. Producers are not aware of who’s going to consume the data that they are producing. Conversely, consumers are not aware of who produced the data they are consuming. For example, an IoT sensor emitting temperature readings might not know who’s going to process them.
This allows components to be added to or removed from the architecture without significant changes, enabling an agile business.
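The decoupling can be sketched with a tiny in-memory stand-in for an ESP; the `Broker` class and the topic name below are hypothetical, not any real platform’s API:

```python
from collections import defaultdict

class Broker:
    """A minimal in-memory stand-in for an ESP: producers and consumers
    only know the topic name, never each other."""
    def __init__(self):
        self.topics = defaultdict(list)

    def publish(self, topic, event):
        """Producer side: append an event to a named topic."""
        self.topics[topic].append(event)

    def read(self, topic, offset=0):
        """Consumer side: read events from a topic starting at an offset."""
        return self.topics[topic][offset:]

broker = Broker()
# The sensor just publishes; it has no idea who will consume.
broker.publish("temperature", {"sensor": "s1", "temp_c": 21.5})
# A new consumer can be added later without touching the producer.
print(broker.read("temperature"))
```

Because both sides depend only on the topic, you can swap out the producer or bolt on a second consumer without either party noticing.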
Provide scalable and fault-tolerant storage for real-time data
Databases are not an obvious choice for capturing millions of data items per second. Even if they can, you might spend a fortune to own and maintain them.
An ESP is a distributed system purpose-built to ingest millions of data records per second. The data is then kept in fault-tolerant storage to keep it safe from loss.
Act as a shock absorber for incoming data
Most of the time, data arrives faster than it can be consumed. If downstream consumers are not performant enough to keep up with the production rate, they’ll crash.
In this case, the ESP acts as a buffer, absorbing incoming data and letting consumers process it at a comfortable rate.
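The shock-absorber role can be sketched with a plain queue sitting between a bursty producer and a consumer that drains at its own pace. This is a toy model, not production code:

```python
import queue
import threading

buf = queue.Queue()  # stands in for the ESP buffering between the two sides

def producer():
    """Bursty producer: emits 100 events as fast as it can."""
    for i in range(100):
        buf.put(i)

def consumer(out):
    """Slow consumer: drains the buffer at its own comfortable pace."""
    for _ in range(100):
        out.append(buf.get())

results = []
t1 = threading.Thread(target=producer)
t2 = threading.Thread(target=consumer, args=(results,))
t1.start(); t2.start()
t1.join(); t2.join()
print(len(results))  # 100 — nothing lost despite the burst
```

The burst lands in the buffer instead of overwhelming the consumer, and every event is eventually processed in order.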
Provide a single source of truth for business data
Unlike with traditional messaging middleware, data stored in an ESP is not deleted once consumed. You can configure an ESP to retain data up to a point in time you decide.
That way, an ESP can be the single source of truth for your business. Consuming applications can read data from the ESP whenever they see fit.
This also greatly simplifies debugging production incidents that happened in the past: a consumer simply replays the data from the time of the incident to diagnose any issues in the code.
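The retain-and-replay behavior can be sketched as an append-only log addressed by offsets. The `RetainedLog` class and the order IDs are illustrative, not a real ESP API:

```python
class RetainedLog:
    """Events are appended, never deleted on consumption; any consumer
    can replay from an earlier offset (e.g. the time of an incident)."""
    def __init__(self):
        self.log = []

    def append(self, event):
        self.log.append(event)
        return len(self.log) - 1      # offset at which the event was stored

    def replay(self, from_offset=0):
        """Reading never mutates the log, so replays are repeatable."""
        return self.log[from_offset:]

log = RetainedLog()
for order_id in ("o-1", "o-2", "o-3"):
    log.append({"order": order_id})

# A first consumer reads everything...
first_pass = log.replay()
# ...and a second consumer can later replay from offset 1 to debug.
second_pass = log.replay(from_offset=1)
print(len(first_pass), len(second_pass))  # 3 2
```

Real platforms bound this log with a retention policy (by time or size) rather than keeping it forever.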
Commercial and Open Source implementations
Building and maintaining an ESP is often a difficult task. It requires a specialized skill set to operate these distributed systems in production.
An ESP should scale on demand to cater to sudden spikes in incoming and outgoing event traffic. It must also store events reliably across multiple data centers or geographic regions. Considering all this, you’ll almost always resort to a managed ESP offering if one exists.
Fortunately, major IaaS vendors are providing ESP as a managed service. That takes away the ever-increasing management burden from organizations.
Following are some key players in the market:
- AWS Kinesis
- Azure Event Hubs
- IBM Event Streams (managed Apache Kafka as a service)
- Confluent Cloud (managed Apache Kafka as a service)
There are also open-source ESPs. Some organizations prefer to build and manage ESPs themselves, having hired top talent to do so.
Here are some open-source alternatives you can deploy on-premise or in the cloud.
The high-level architecture of an ESP
The following figure illustrates the high-level architecture of an ESP.
We can break down the above architecture into three layers based on the runtime behavior.
The ingestion layer is responsible for receiving events from event sources at very high throughput.
It often accepts event data over multiple transport protocols such as HTTPS, AMQP, MQTT, and Kafka’s wire protocol. That enables it to cater to a wide range of event sources.
The ingested data is then handed over to the storage layer for durable, fault-tolerant storage. The storage layer organizes related events into topics. Topics are further broken down into partitions, the smallest storage unit of an ESP.
The stored data is then consumed by consumer applications through the consumption layer. Each consumer consumes one or more partitions of a topic as part of a logical group called a consumer group. That enables parallelized event consumption, improving throughput on the consumer side.
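The relationship between topics, partitions, and consumer groups can be sketched as follows. The partition count, the key-to-partition function, and the consumer-group assignment here are all simplified assumptions (for instance, Kafka’s default producer partitions by a murmur2 hash, and group assignment is negotiated dynamically):

```python
import zlib

# A topic split into partitions; events with the same key always land in
# the same partition, preserving per-key ordering.
NUM_PARTITIONS = 4
partitions = {p: [] for p in range(NUM_PARTITIONS)}

def publish(key, event):
    """Route an event to a partition by a stable hash of its key (sketch)."""
    p = zlib.crc32(key.encode()) % NUM_PARTITIONS
    partitions[p].append((key, event))
    return p

for i in range(8):
    publish(f"sensor-{i % 2}", {"reading": i})

# A consumer group of two consumers divides the partitions between them,
# so the topic is consumed in parallel.
group = {"consumer-a": [0, 1], "consumer-b": [2, 3]}
for consumer, owned in group.items():
    count = sum(len(partitions[p]) for p in owned)
    print(f"{consumer} owns partitions {owned} with {count} events")
```

Since only two keys appear, only up to two partitions hold data, and each key’s events stay in one partition in order; adding consumers to the group (up to the partition count) increases parallelism.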
If you don’t understand these concepts, such as topics, partitions, and consumer groups, don’t worry. We’ll take a deep dive into them in the following article with examples taken from ESP implementations.
For now, keep this high-level architecture in mind. It is universal to almost all the ESPs out there.
Event Streaming Platforms (ESP) are getting popular in the industry as the central piece for building real-time event-driven applications that are distributed, massively scalable, and fault-tolerant.
Getting to know the architecture and the concepts around an ESP is essential to making informed decisions about them. This article discussed ESP concepts, why they exist, and some industrial implementations. Finally, we glanced through the high-level architecture.
In the following articles, we’ll take a deep dive into each ESP component with examples taken from well-known implementations.