Event Driven Architecture — Part 1

Deepak Chauhan
8 min read · Aug 31, 2023


Event-Driven Architecture (EDA) is a software architectural pattern that focuses on the communication and coordination of various components or services within a system through the use of events.

An event is an immutable message that signifies that something happened at a particular point in time. It is a fact that records the true, immutable state at the moment it occurred.

EDA is a style of Message-Driven Architecture (MDA) that follows the publish-subscribe model. MDA is heavily used in SOA, where it is known as Enterprise Application Integration (EAI). EAI utilizes Enterprise Integration Patterns (EIP) for communication over an Enterprise Service Bus (ESB).

ESB provides a central hub for routing messages between different services and applications. Messages can be transformed, filtered, and directed to their intended destinations based on defined routing rules. There are two major differences between EDA and MDA:

  1. EDA is strictly unidirectional, whereas MDA uses both unidirectional and bidirectional communication. The request-response integration pattern, for example, is heavily used for bidirectional asynchronous communication in SOA.
  2. EDA emits only events, to a specific topic. This topic is owned by the same component that emits the event, and multiple components can subscribe to it to process the events. MDA, by contrast, can also send commands, or any arbitrary message, over the ESB/queue for asynchronous processing.
Event Driven Architecture depicting communication from service A to service B and C

In the above diagram, Service A publishes (directly or indirectly) to a topic, and the topic has two subscriptions, S1 and S2. Multiple consumers of Service B are subscribed on S1, and multiple consumers of Service C on S2. We will discuss the whole concept in subsequent sections.

Publishers

In EDA, events can be published using the following approaches.

Application layer

In this approach, the application itself pushes the change, as part of an event, to the topic after the transaction/main function is completed. This is the most common and favorable approach, and it works for any kind of event-driven use case: since the event is prepared at the application layer, you can enrich it with as much information as the subscribing components/services need for processing.
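A minimal sketch of this approach, assuming Kafka as the broker and the kafka-python client; the topic name, event shape, and the save_order domain function are illustrative assumptions, not part of the original design.

```python
import json
import time
from kafka import KafkaProducer  # pip install kafka-python

# Illustrative producer setup; broker address and topic name are assumptions.
producer = KafkaProducer(
    bootstrap_servers="localhost:9092",
    value_serializer=lambda v: json.dumps(v).encode("utf-8"),
)

def place_order(order):
    save_order(order)  # hypothetical domain function: the DB transaction commits here

    # Publish only after the transaction completes, enriching the event with
    # everything subscribers need so they don't have to call back into this service.
    event = {
        "type": "OrderPlaced",
        "occurred_at": int(time.time() * 1000),
        "payload": {"order_id": order["id"], "items": order["items"]},
    }
    producer.send("order-events", value=event)
    producer.flush()
```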

There is one anti-pattern here, where the event data is enriched by calling other applications' APIs. Such applications are hard to scale because you cannot estimate the load efficiently, and they incur a huge infrastructure cost. Such enrichments, if required, should be done at the subscriber end rather than on the publisher side.

Event publishing is a cross-cutting concern and the last step of any domain function, and it should be kept outside DB transaction boundaries. Applications sometimes face the challenge of event loss for the following reasons:

  1. The system was not able to connect to the broker to publish the event.
  2. The system crashed just as it was about to publish the event.

These challenges are better handled by building resiliency into the application. A resilient system makes sure it recovers from the step at which it failed. Resiliency can only be achieved if your application persists every data-oriented step somewhere else. Implementing such a solution requires great engineering. Fortunately, a few highly scalable platforms like Uber Cadence, Netflix Conductor and Temporal are available in the market to solve this purpose.
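As a simplified illustration of that idea (not how Cadence, Conductor or Temporal work internally), the sketch below persists each pending event before attempting to publish it and replays anything still unpublished after a crash. The table, file and function names are assumptions.

```python
import json
import sqlite3

# Illustrative durable "pending events" store; a real system would use its
# application database or a workflow engine such as Temporal/Cadence/Conductor.
db = sqlite3.connect("pending_events.db")
db.execute(
    "CREATE TABLE IF NOT EXISTS pending_events "
    "(id INTEGER PRIMARY KEY, body TEXT, published INTEGER DEFAULT 0)"
)

def record_pending(event):
    # Persist the event before publishing so it survives a crash or broker outage.
    cur = db.execute("INSERT INTO pending_events (body) VALUES (?)", (json.dumps(event),))
    db.commit()
    return cur.lastrowid

def mark_published(event_id):
    db.execute("UPDATE pending_events SET published = 1 WHERE id = ?", (event_id,))
    db.commit()

def replay_unpublished(publish):
    # On startup (or on a schedule), resume from the step that failed:
    # republish everything that never reached the broker.
    for event_id, body in db.execute("SELECT id, body FROM pending_events WHERE published = 0"):
        publish(json.loads(body))  # `publish` is the broker call, e.g. the producer above
        mark_published(event_id)
```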

Change Data Capture

CDC captures changes made in the data source, such as INSERT, UPDATE, and DELETE operations, and replicates them to a target system, typically in real time or near real time. This approach is generally used for data migration, streaming, and real-time analytics; however, it is also widely used in EDA because of its strictly ordered behavior. Instead of pushing the event directly from the application layer, CDC connects to the data source and uses a push or pull approach to stream the events to a topic. Other components then subscribe to this topic to consume and process the events.

A. Pull-Based CDC / ETL Pipeline

This is a well-known approach for data migration; however, it can be used for EDA in many cases. In this approach, the relevant tables/collections get an additional timestamp column which captures the record change time. ETL pipelines execute an incremental, time-bound query to fetch the data at specific intervals (generally kept to a minimum of one second for near-real-time sync). However, there are a few limitations with this approach.

As this is an interval-based pull approach, you may not consume all the changes. Consider the following example.

In this example, there is a record R whose property name has changed 8 times, at timestamps T1 to T8. A cron executed at times X1, X2 and X3. The cron run at X1 captured the state at T1 (name = a); the run at X2 captured the state at T4 (name = d) and therefore missed the two changes at T2 (name = b) and T3 (name = c).

Another problem: you can't hard delete a record, because the DB query can only fetch rows that still exist. The application should soft delete instead, so that the deletion is captured as a change.
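For illustration, a soft delete would look something like the statement below; the connection object and the is_deleted/updated_at column names are assumptions.

```python
import time

# Soft delete: mark the row and bump the change timestamp so that the
# incremental pull query still picks the deletion up as a change.
# `conn` and `order_id` are illustrative stand-ins.
conn.execute(
    "UPDATE orders SET is_deleted = 1, updated_at = ? WHERE id = ?",
    (int(time.time() * 1000), order_id),
)
conn.commit()
```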

How to implement

  1. Define a timestamp column on the table which must capture the record change time. This could be the time of insert, update or soft delete. This column should be indexed in ascending order.
  2. Define a persistent variable (e.g. last_record_timestamp) which maintains the last record timestamp. For the first run it is set to 0.
  3. Use your favorite ETL tool and define the data fetch query. Keep in mind that the query should fetch a bounded number of entries; otherwise a huge data size could cause timeouts or delays, and you may end up in an infinite failure loop.
  4. Transform every record received in the result set and update last_record_timestamp on every push to the broker.
  5. Always define two intervals. The first governs how long you keep re-executing the query until you get an empty row set; it helps drain the backlog faster if the job was shut down for a long time for any reason. The second defines the fixed frequency at which the job runs.
  6. Always read from a replica rather than the application's write database. (A sketch of these steps follows this list.)
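A minimal sketch of that pull loop, assuming a sqlite3-style DB-API connection to a read replica; the table, columns, checkpoint helpers and publish function are illustrative stand-ins.

```python
import time

BATCH_SIZE = 500           # bounded fetch so a huge backlog cannot cause timeouts (step 3)
POLL_INTERVAL_SECONDS = 1  # fixed frequency between runs (step 5)

last_record_timestamp = load_checkpoint()  # hypothetical: persisted variable, 0 on the first run (step 2)

def drain_backlog(conn, publish):
    """Keep querying until an empty row set comes back (first interval in step 5)."""
    global last_record_timestamp
    while True:
        # Incremental, time-bound query against the read replica (steps 1, 3, 6).
        rows = conn.execute(
            "SELECT id, name, updated_at FROM orders "
            "WHERE updated_at > ? ORDER BY updated_at ASC LIMIT ?",
            (last_record_timestamp, BATCH_SIZE),
        ).fetchall()
        if not rows:
            break
        for row_id, name, updated_at in rows:
            publish({"id": row_id, "name": name, "updated_at": updated_at})  # transform + push (step 4)
            last_record_timestamp = updated_at
            save_checkpoint(last_record_timestamp)  # hypothetical checkpoint persistence

while True:
    drain_backlog(replica_conn, publish_to_topic)  # `replica_conn` and `publish_to_topic` are assumptions
    time.sleep(POLL_INTERVAL_SECONDS)
```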

Query-based (pull) CDC may not serve the purpose if the application modifies multiple tuples in different tables as part of a single transaction and you need all the changed tuples in a single event. Take the example of an e-commerce application, where a single transaction makes the following changes:

  1. Order Table: the order tuple with order ID XYZ is modified
  2. Line Item Table: one more line item with order ID XYZ is added

To solve this problem, both tables must have the record change timestamp column. You may write a multi-step ETL which involves the following steps:

  1. Capture the changed orders using the pull query
  2. Fetch the changed line items for every order in the result set
  3. Merge the order and line items and push the combined event to the topic (see the sketch below)
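A rough sketch of this multi-step pipeline, reusing the same illustrative connection and schema as above; it is meant only to show the fetch-then-merge shape, not a definitive implementation.

```python
def merge_and_publish_orders(conn, publish, last_ts, batch_size=200):
    # Step 1: capture the orders changed since the last checkpoint.
    orders = conn.execute(
        "SELECT id, status, updated_at FROM orders "
        "WHERE updated_at > ? ORDER BY updated_at ASC LIMIT ?",
        (last_ts, batch_size),
    ).fetchall()
    for order_id, status, updated_at in orders:
        # Step 2: fetch the line items belonging to this order.
        items = conn.execute(
            "SELECT sku, quantity, price FROM line_items WHERE order_id = ?",
            (order_id,),
        ).fetchall()
        # Step 3: merge order and line items into a single full-content event and push it.
        publish({
            "order_id": order_id,
            "status": status,
            "line_items": [{"sku": s, "quantity": q, "price": p} for s, q, p in items],
            "updated_at": updated_at,
        })
        last_ts = updated_at
    return last_ts
```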

This may not be a good approach if the given database has scaling challenges. In that case you should look at alternatives like application-layer publishing or log-based CDC to meet the right expectations.

B. Log-Based CDC (the modern approach to CDC)

This is a modern and advanced approach to data migration and is frequently used for EDA. OSS tools like Debezium and cloud-specific CDC technologies allow you to sync data from various databases through their logs.

Log-based CDC tools essentially create a stream over the database's logs: for MySQL it is the binlog, for MongoDB the oplog, and for Cassandra the commit log. The implementation varies from one database to another, so the choice of enabling EDA through CDC should take the source database's limitations into account.
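For example, with Debezium running on Kafka Connect, a MySQL source connector is typically registered through the Connect REST API. The sketch below is only illustrative: hostnames and credentials are placeholders, and the exact property names depend on your Debezium version and database.

```python
import json
import requests  # pip install requests

# Illustrative Debezium MySQL connector registration via the Kafka Connect REST API.
# Property names vary across Debezium versions; treat this as an example, not a reference.
connector = {
    "name": "orders-mysql-cdc",
    "config": {
        "connector.class": "io.debezium.connector.mysql.MySqlConnector",
        "database.hostname": "mysql-replica.internal",
        "database.port": "3306",
        "database.user": "cdc_user",
        "database.password": "********",
        "database.server.id": "184054",
        "topic.prefix": "shop",  # change events land on topics such as shop.inventory.orders
        "table.include.list": "inventory.orders,inventory.line_items",
    },
}

resp = requests.post(
    "http://kafka-connect.internal:8083/connectors",
    headers={"Content-Type": "application/json"},
    data=json.dumps(connector),
)
resp.raise_for_status()
```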

Relational databases (e.g. MySQL / Postgres) store the full record (including the before and after images) in the log and allow the CDC tool to capture the full record information as part of the event.

An RDBMS is also a good choice for use cases where a single transaction consists of multiple tuple changes across multiple tables. It maintains a single log entry for the whole transaction and stores all changed tuples in the log, so the CDC tool can capture the full information in a single event.

MongoDB doesn't store the full record information in the oplog; however, it provides a change stream on which you can enable full-document lookup to receive the complete record as part of the event.
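For instance, with PyMongo you can opt into the full-document lookup when opening the change stream; the connection string, namespace and publish function below are illustrative.

```python
from pymongo import MongoClient  # pip install pymongo

client = MongoClient("mongodb://localhost:27017")  # illustrative connection string
orders = client["shop"]["orders"]

# full_document="updateLookup" asks MongoDB to attach the complete current document
# to each update event, instead of only the changed fields.
with orders.watch(full_document="updateLookup") as stream:
    for change in stream:
        publish_to_topic({                            # hypothetical publish function
            "operation": change["operationType"],
            "document": change.get("fullDocument"),   # complete record, when available
        })
```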

Cassandra commit logs do not record the value of every column in the changed row; they only record the values of the columns that were modified (except for partition key columns, which are always recorded as they are required in Cassandra DML commands). Many other databases likewise maintain only partial record information in their logs and may not be the best candidates for enabling EDA through CDC.

The following approaches deal with this partial-record-capture limitation.

  1. You can opt for application-layer event publishing or a pull-based pipeline to overcome the limitation. However, you should consider all the factors listed for both approaches.
  2. Create a bridge service which consumes the CDC events, enriches them further, and publishes a full-content event to the target topic. This enrichment can be done using event sourcing or ID lookup, as described below.

Event Sourcing: Instead of persisting only the current state of an object or entity in a traditional database, event sourcing captures and stores every change or event that occurs over time. This provides a historical record of how the state of the application has evolved.

The bridge uses an event store database which stores all events along with the record ID and event timestamp. It acts as a subscriber to the CDC tool's output and builds the event store from it. Whenever an event is received:

  1. Step 1: The event is first stored in the event store.
  2. Step 2: All events for the received ID are fetched from the event store.
  3. Step 3: These events are sorted by timestamp and merged to prepare the full content, which is then pushed to another topic (see the sketch below).
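A rough sketch of such a bridge, using an in-memory dict as the event store purely for illustration; a real bridge would use a durable store, and the event field names are assumptions.

```python
from collections import defaultdict

# Illustrative in-memory event store: record ID -> list of (timestamp, partial change).
event_store = defaultdict(list)

def handle_cdc_event(partial_event, publish):
    record_id = partial_event["id"]
    ts = partial_event["timestamp"]

    # Step 1: store the incoming partial event.
    event_store[record_id].append((ts, partial_event["changed_fields"]))

    # Step 2: fetch all events for this record ID, then
    # Step 3: sort them by timestamp and merge them into the full content.
    full_record = {}
    for _, changed_fields in sorted(event_store[record_id], key=lambda e: e[0]):
        full_record.update(changed_fields)

    # Push the enriched, full-content event to the target topic.
    publish({"id": record_id, "timestamp": ts, "record": full_record})
```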

The Kafka ecosystem offers KSQL (ksqlDB), which can serve as an event store. If you are using Kafka, then instead of implementing the bridge approach yourself you can explore KSQL as the event store DB.

ID Lookup: Once the bridge receives the event, it looks up the whole record by ID in the source database. This may not be an efficient approach if the source database cannot scale to serve reads at a high frequency.
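A minimal sketch of the ID-lookup variant; fetch_by_id stands in for whatever read the source database (or its replica) supports, and the event field names are assumptions.

```python
def handle_cdc_event_with_lookup(partial_event, fetch_by_id, publish):
    # Look up the full record in the source database using the ID carried by the CDC event.
    record = fetch_by_id(partial_event["id"])  # hypothetical read against the source DB / replica
    if record is None:
        return  # the record was hard deleted in the meantime; handle as your use case requires
    publish({
        "id": partial_event["id"],
        "timestamp": partial_event["timestamp"],
        "record": record,
    })
```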

Part 2: https://medium.com/@foreverdeepak/event-driven-architecture-part-2-94d3ba05c2eb
