Many Faces of Event-driven Architecture

Looking at event-driven architecture from multiple perspectives to understand potential use cases and the tools to build them.

Dunith Danushka
Tributary Data
8 min read · Aug 28, 2021



The event-driven architecture (EDA) is a popular choice for building distributed applications that demand a great deal of scalability, performance, and maintainability.

EDA is a loaded term these days. So in this article, I will discuss the different variations of EDA, when to use each, and the industrial tools available to build them.

In the end, you’ll have a clear idea of the entire EDA landscape.

The overview

A typical event-driven architecture consists of an event producer, an event consumer, and an event broker.

An event is a primitive construct that travels from producer to consumer via an event broker. It carries a state change that happened in the producer. For example, “the order has been placed” or “the temperature reading is 27 degrees Celsius”.

When an event is produced, the producer is not aware of the consumers who will receive it. Likewise, the event consumer doesn't need to know who produced it. The event broker in the middle provides that decoupling, allowing both parties to scale and evolve in a loosely coupled manner.

That is why EDA remains so popular today.

The event broker

The event broker accepts events from producers, stores them durably, and dispatches them to consumers. Brokers are meant to be scalable, highly performant, and fault-tolerant.

Depending on the application use case, we can categorize event brokers into three types.

Message brokers

Message brokers implement the publish-subscribe pattern where multiple consumers consume the same event. Protocols such as JMS, AMQP, and MQTT make that possible with their topics and subscription semantics. The broker takes care of the subscription management, retrying, and consumer acknowledgment handling.

Message brokers delete the messages once consumers have consumed them. If a new consumer joins a topic later, it can’t see the past events.

Technology choices — RabbitMQ, ActiveMQ, Azure Service Bus Topics, AWS SNS, IBM MQ, Google Cloud PubSub
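
As a concrete illustration, here is a minimal publish-subscribe sketch using RabbitMQ with the pika Python client. It is a sketch under assumptions: a broker running on localhost and an illustrative "orders" fanout exchange.

```python
# Minimal pub-sub sketch with RabbitMQ via pika (exchange/queue names are illustrative).
import pika

connection = pika.BlockingConnection(pika.ConnectionParameters("localhost"))
channel = connection.channel()

# A fanout exchange copies every message to all bound queues,
# so each subscriber receives its own copy of the event.
channel.exchange_declare(exchange="orders", exchange_type="fanout")

# Each consumer binds its own exclusive queue to the exchange.
result = channel.queue_declare(queue="", exclusive=True)
channel.queue_bind(exchange="orders", queue=result.method.queue)

# Publish an event; the broker handles routing and delivery.
channel.basic_publish(exchange="orders", routing_key="", body=b'{"orderId": 42}')

def on_event(ch, method, properties, body):
    print("received:", body)
    ch.basic_ack(delivery_tag=method.delivery_tag)  # consumer acknowledgment

channel.basic_consume(queue=result.method.queue, on_message_callback=on_event)
channel.start_consuming()
```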

Event streaming platforms

You can think of an event streaming platform as a giant append-only log distributed across multiple machines. It appends incoming events to topics, the logical containers that group related events together. A topic is partitioned across multiple machines to ensure high availability, scalability, and fault tolerance.

Producers append events to partitioned topics; consumers read the partitions.

Streaming platforms don't delete events after they are consumed. That is the significant difference from message brokers. A consumer can start reading from any point in a topic partition, including the very beginning, and the same partition can be read by multiple consumers without stepping on each other's toes.

Technology choices — Apache Kafka, AWS Kinesis, Azure Event Hubs
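
To make that replay behavior concrete, here is a small consumer sketch with the kafka-python client; the broker address, topic, and group id are assumptions for illustration.

```python
# Sketch: replaying a Kafka topic from the earliest retained event.
from kafka import KafkaConsumer

consumer = KafkaConsumer(
    "orders",
    bootstrap_servers="localhost:9092",
    group_id="shipping-service",
    auto_offset_reset="earliest",  # new groups start from the oldest event
)

# Because the log is durable, a consumer added much later can still
# rebuild its state from the full event history.
for record in consumer:
    print(record.partition, record.offset, record.value)
```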

Web servers

Here, events are produced to a "channel" on a regular web server. Consumers receive them over a protocol like WebSockets, Server-Sent Events (SSE), or classic webhooks. Unlike the previous two categories, no event storage is involved.

Technology choices — Pretty much every programming language supports implementing WebSockets natively. Some notable serverless platforms, such as Ably and PubNub, serve events over WebSockets, SSE, etc.
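
As a rough sketch, here is a minimal SSE endpoint built with Flask; the endpoint path and payload fields are hypothetical, and note that nothing is persisted.

```python
# Sketch: pushing events to browsers over Server-Sent Events (SSE).
import json
import time

from flask import Flask, Response

app = Flask(__name__)

@app.route("/events")
def events():
    def stream():
        reading = 25.0
        while True:
            # Each SSE message is a "data: ..." line followed by a blank line.
            yield f"data: {json.dumps({'temperature': reading})}\n\n"
            reading += 0.1
            time.sleep(1)

    return Response(stream(), mimetype="text/event-stream")

if __name__ == "__main__":
    app.run()
```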

Event production

Event production is the process of detecting an "event-worthy" incident, constructing the event payload, and publishing it to the broker. For example, after processing an order, the Order service produces an "OrderProcessed" event.

Modern programming languages provide ample SDKs and interfaces for producing events to a wide variety of event brokers. As long as you have access to the existing application code, you can modify it to embed the event production logic.
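
For instance, embedded event production might look like the following sketch, here using the kafka-python client; the topic name, payload fields, and JSON serialization are assumptions.

```python
# Sketch: an Order service producing an "OrderProcessed" event after its business logic.
import json
from datetime import datetime, timezone

from kafka import KafkaProducer

producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def process_order(order_id):
    # ... the business logic that actually processes the order ...
    event = {
        "type": "OrderProcessed",
        "orderId": order_id,
        "occurredAt": datetime.now(timezone.utc).isoformat(),
    }
    # Keying by order id keeps all events for one order in the same partition.
    producer.send("orders", key=str(order_id).encode("utf-8"), value=event)
    producer.flush()
```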

However, there are situations where you are not allowed to modify existing systems to produce events. And even if you could, it would cost you a fortune. The classic example is a legacy system such as a mainframe.

How would you ask a mainframe to produce an event after processing an order?

Well, in that case, we need to “event-enable” these systems.

Event-enablement of applications and databases

When an existing application can't produce events on its own, an "event enabler" must extract the relevant information from it and produce the events instead. Classic enterprise integration comes to the rescue at this point.

Imagine a mainframe writes a file to an FTP server after processing a sales order. An integration middleware periodically polls that FTP server to fetch newly created files, transforms them, and publishes them to the event broker as "OrderProcessed" events. The shipping application then consumes those events to prepare the shipments. In this case, the integration middleware has "event enabled" the mainframe.
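
A conceptual sketch of such polling middleware is below; the directory, file format, and publish function are hypothetical stand-ins for the FTP server and a real broker client.

```python
# Sketch: poll for new order files, transform each into an event, publish it.
import time
from pathlib import Path

INBOX = Path("/data/mainframe-orders")  # stand-in for the polled FTP location
seen = set()

def publish(event):
    print("publishing:", event)  # stand-in for a real broker client

while True:
    for f in INBOX.glob("*.csv"):
        if f.name in seen:
            continue
        order_id, total = f.read_text().strip().split(",")
        publish({"type": "OrderProcessed", "orderId": order_id, "total": float(total)})
        seen.add(f.name)
    time.sleep(30)  # poll interval
```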

Another possibility is that you have a proprietary application with a database. You can't make changes to the application, but you have read access to the database. In that case, you can use change data capture (CDC) to capture row-level changes committed to the database and extract them as a stream of change events. With CDC, you can publish events as changes are made to the database, sparing you from building complex ETL pipelines that run in batch mode.

Putting it all together: event enablement with integration middleware and change data capture (CDC)

Technology choices: For enterprise integration, you can consider tools like WSO2 Enterprise Integrator, Mulesoft, Apache Camel, and Dell Boomi. Debezium is the ideal choice for implementing CDC in production.
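
For illustration, a Debezium connector is registered through the Kafka Connect REST API; in the sketch below, the MySQL source, hostnames, credentials, and table names are placeholders.

```python
# Sketch: registering a Debezium MySQL connector with Kafka Connect.
import requests

connector = {
    "name": "orders-cdc",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql",
        "database.port": "3306",
        "database.user": "debezium",
        "database.password": "secret",
        "database.server.id": "1",
        "database.server.name": "shop",        # prefix for change-event topics
        "table.include.list": "shop.orders",   # capture only the orders table
        "database.history.kafka.bootstrap.servers": "kafka:9092",
        "database.history.kafka.topic": "schema-changes.shop",
    },
}

# Kafka Connect exposes connector management on port 8083 by default.
resp = requests.post("http://localhost:8083/connectors", json=connector)
resp.raise_for_status()
```

Once registered, every committed row change in the captured tables becomes a change event on a Kafka topic, ready for downstream consumers.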

Event consumption

Event consumption is complicated as there are many options for you to consider. Let’s unpack them one by one.

Simple event processing — processing one event at a time

In this case, you process one event at a time. Past events will not influence the processing of the current event.

That is the most straightforward form of event consumption. For example, the broker hands you an event; you can filter, transform, and route it to a downstream consumer. Or you can update a database to reflect upstream state changes. Another possibility is to trigger an event-driven workflow with the received event.

Stateless event processing is the simplest form of event consumption.

Another option is to process the event in a serverless manner. That way, you are freed from provisioning and maintaining the machinery required to process the event.

Technology choices — Serverless functions/FaaS (AWS Lambda/EventBridge, Google Cloud Functions, Azure Functions), integration middleware, event-driven microservices, workflow orchestration systems (Google Workflows)
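
As a sketch, a stateless handler following the AWS Lambda Python convention might look like this; the event fields and routing target are assumptions.

```python
# Sketch: filter, transform, and route a single event; no past state consulted.
def handler(event, context):
    # Filter: ignore event types we don't care about.
    if event.get("type") != "OrderProcessed":
        return {"skipped": True}

    # Transform: derive the downstream message from this event alone.
    shipment = {"orderId": event["orderId"], "status": "READY_TO_SHIP"}

    # Route: hand off to the next stage (e.g., a queue or a database).
    print("routing:", shipment)
    return shipment
```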

Event stream processing — Stateful event processing

Here, you read one or more partitions from an event streaming platform and process the events as an unbounded stream. The key difference is that state accumulated from past events heavily influences the processing of the current one.

For example, you can join multiple streams together to produce a single stream, or apply aggregate operations on a stream to calculate a running count, average, or sum. You can also perform advanced windowing operations like sliding and tumbling windows.

Stream processing is ideal for processing an unbounded stream of events to uncover patterns and respond to them with low-latency actions. Practical examples include real-time fraud detection, infrastructure monitoring and alerting, and updating low-latency dashboards.

Stateful stream processing

Technology choices — Apache Flink, AWS Kinesis Streams
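
To show what "stateful" means here, the following conceptual sketch hand-rolls a tumbling-window count in plain Python; a real stream processor such as Flink gives you the same logic plus fault tolerance, scaling, and event-time handling.

```python
# Sketch: count events per key in fixed (tumbling) time windows.
from collections import defaultdict

WINDOW_SECONDS = 60

def tumbling_counts(events):
    """events: iterable of (epoch_seconds, key) pairs, in time order."""
    counts = defaultdict(int)
    current_window = None
    for ts, key in events:
        window = ts - (ts % WINDOW_SECONDS)  # align to a window boundary
        if current_window is not None and window != current_window:
            yield current_window, dict(counts)  # emit the completed window
            counts.clear()
        current_window = window
        counts[key] += 1  # state carried across events
    if current_window is not None:
        yield current_window, dict(counts)

# e.g., counting card swipes per account per minute for fraud detection
for window, counts in tumbling_counts([(0, "acct1"), (10, "acct1"), (70, "acct2")]):
    print(window, counts)
```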

Real-time analytics — Streaming databases

Another variation is to ingest an event stream in real time and immediately make it available for querying. During the past few years, a new breed of databases has emerged to serve real-time, low-latency, user-facing OLAP queries to the masses.

They store incoming data in columnar storage and heavily use advanced indexing techniques to answer queries fast. They are also good at maintaining materialized views that are incrementally updated as new data arrives. Since the query results are pre-computed, clients can read them with sub-second latencies.

Materialized views in ksqlDB

These databases are becoming the next stage of stateful stream processing, and they are ideal for building user-facing analytics dashboards, materialized views for microservices, and real-time stateful applications.

Technology choices — Materialize, Apache Pinot, Apache Druid, ClickHouse, Rockset, and ksqlDB
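
For illustration, here is a sketch of defining an incrementally maintained materialized view through ksqlDB's REST endpoint; the stream, table, and column names are hypothetical.

```python
# Sketch: a ksqlDB table that is updated incrementally as order events arrive.
import requests

statement = """
CREATE TABLE order_totals AS
  SELECT customer_id,
         COUNT(*) AS order_count,
         SUM(amount) AS total_spent
  FROM orders_stream
  GROUP BY customer_id
  EMIT CHANGES;
"""

# ksqlDB accepts DDL/DML statements on its /ksql endpoint (port 8088 by default).
resp = requests.post(
    "http://localhost:8088/ksql",
    json={"ksql": statement, "streamsProperties": {}},
)
resp.raise_for_status()
```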

Streaming ETL pipelines

These applications continuously listen for incoming events, transform them in flight, and deliver them to their destinations in real time. They differ from traditional batch ETL pipelines, which are scheduled to run periodically.

These pipelines are ideal for delivering data from the sources to their destinations with minimal latency. For example, you can read events from a Kafka topic and move them to a data warehouse for further analysis.

Frequently, CDC works hand in hand with streaming data pipelines, as it often forms the first stage of the pipeline.

Technology choices — Apache Airflow, Apache Spark, Apache Beam
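
As a sketch, a streaming ETL job built with Spark Structured Streaming could read from Kafka and continuously deliver to a warehouse location; the broker, topic, and output paths below are assumptions.

```python
# Sketch: Kafka -> transform in flight -> continuously written Parquet output.
from pyspark.sql import SparkSession
from pyspark.sql.functions import col

spark = SparkSession.builder.appName("orders-etl").getOrCreate()

events = (
    spark.readStream.format("kafka")
    .option("kafka.bootstrap.servers", "localhost:9092")
    .option("subscribe", "orders")
    .load()
)

# Kafka keys/values arrive as bytes; cast them to strings in flight.
transformed = events.select(col("key").cast("string"), col("value").cast("string"))

# Parquet files here stand in for a warehouse table as the destination.
query = (
    transformed.writeStream.format("parquet")
    .option("path", "/tmp/warehouse/orders")
    .option("checkpointLocation", "/tmp/checkpoints/orders")
    .start()
)
query.awaitTermination()
```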

Documenting the EDA — Async APIs

Service-oriented architectures (SOA) and RESTful APIs feature a rich ecosystem of standards and formalization. Simply put, you can document web services and REST APIs with standards like WSDL and the OpenAPI specification.

What about event-driven applications?

Specifications like AsyncAPI serve that purpose today. You can document your event-driven architecture with an AsyncAPI spec file. For example, you can specify where to find the event broker, the security scheme it uses, the event channels, and most importantly, the format of the events that go into and come out of the broker.

A sample AsyncAPI spec that describes an application emitting user signup events.
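
A minimal illustrative AsyncAPI 2.0 document along those lines might look like this; the server, channel, and field names are hypothetical.

```yaml
# Sketch: an AsyncAPI spec for an application that emits user signup events.
asyncapi: '2.0.0'
info:
  title: User Signup Service
  version: '1.0.0'
servers:
  production:
    url: broker.example.com:9092   # placeholder broker address
    protocol: kafka
channels:
  user/signedup:
    subscribe:                     # consumers subscribe to signup events
      message:
        payload:
          type: object
          properties:
            userId:
              type: string
            signedUpAt:
              type: string
              format: date-time
```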

Once you have that in place, you can generate an event-driven API out of it. These APIs stream events to their consumers asynchronously. They are ideal for building mobile and web applications that provide a real-time user experience.

To learn more about asynchronous APIs, see my previous blog posts.

EDA challenges and takeaways

EDA is not a silver bullet for every architectural problem. Although it gives you unparalleled scalability, performance, and loose coupling, all of that comes at a price.

Operating an event broker in production demands specialized skills. EDA has many moving parts and is often criticized for being hard to test in isolation. Furthermore, it isn’t easy to trace the path of events inside the application.

But modern technology has been making EDA much more straightforward to adopt, even for small organizations. Serverless platforms take away the operational burden of managing event consumers. Modern distributed tracing tools like Jaeger simplify tracing and debugging across components. Specifications like CloudEvents and AsyncAPI attempt to standardize and govern events across multiple domains.

Every powerful tool comes with significant responsibilities, and EDA is no exception. If you understand it and use it well, your application architecture will reach for the stars.
