Resilient Eventing in Microservices Using the Outbox Pattern

Sohan Ganapathy
8 min read · Oct 12, 2019


The Problem Statement

Microservices often publish events after performing a database transaction. Writing to the database and publishing an event are two separate operations, yet they must be atomic: a failure to publish an event can mean a critical failure of the business process.

To explain the problem better, consider a Student microservice that enrolls students. After enrollment, a ‘Course Catalog’ service emails the student all the available courses. In an event-driven application, the Student microservice enrolls the student by inserting a record into the database and then publishes an event stating that the enrollment is complete. The ‘Course Catalog’ service listens to this event and performs its actions. In a failure scenario, if the Student microservice goes down after inserting the record but before publishing the event, the system is left in an inconsistent state.

Image 1. Failure while publishing an event after updating / inserting the database.

The Outbox Pattern

This pattern provides an effective solution for publishing events reliably. The idea is to have an “Outbox” table in the service’s database. When a request for an enrollment is received, a record is inserted into the Student table and a record representing the event is inserted into the Outbox table, with both inserts done as part of the same database transaction.

An asynchronous process monitors the Outbox table for new entries and publishes any it finds to the event bus. The pattern splits the single operation into two local transactions handled by different components, increasing reliability.

Image 2. Two separate transactions using the outbox pattern.

A description of this pattern can be found on Chris Richardson’s excellent microservices.io site. As described on the site there are two approaches to implementing the Outbox pattern (Transaction log tailing and Polling publisher). We will be using the log tailing approach in the solution below.

Transaction log tailing can be implemented in a very elegant and efficient way using Change Data Capture (CDC) with Debezium and Kafka-Connect.

Outbox Pattern with Kafka Connect

Solution Design

The Student microservice exposes endpoints to perform database operations on the domain. It uses a Postgres database, which houses two tables, “Student” and “Outbox”. Each transactional operation modifies or inserts into the “Student” table and adds a record to the “Outbox” table.

The Kafka-Connect framework runs as a separate service alongside the Kafka broker. The Debezium connector for Postgres is deployed on the Kafka-Connect runtime to capture the changes on the database. In our example a custom transformer is also deployed within Kafka-Connect to route each event to the right Kafka topic.

The Debezium connector tails the database transaction log (the write-ahead log) for changes to the ‘Outbox’ table and publishes an event to the topic determined by the custom transformer.

Image 3. Solution Design

This solution guarantees at-least-once delivery, since the Kafka-Connect service ensures that each connector is always running. However, the solution can publish the same event more than once, for example when a connector goes down and restarts partway through the log. To achieve effectively exactly-once processing, the consuming client must be idempotent, making sure duplicate events aren’t processed again.
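One common way to make a consumer idempotent is to remember the IDs of events it has already handled and skip repeats. The sketch below is a minimal, framework-free illustration of that idea; the class and method names are mine, not from the accompanying repository, and a real implementation would keep the processed-ID set in a durable store rather than in memory.

```java
import java.util.HashSet;
import java.util.Set;
import java.util.function.Consumer;

// Illustrative idempotent consumer: it tracks processed event IDs so that
// at-least-once delivery behaves like exactly-once from the application's
// point of view.
public class IdempotentConsumer {
    // In production this set would live in a durable store (e.g. a table
    // keyed by event ID), not in memory.
    private final Set<String> processedIds = new HashSet<>();

    /** Returns true if the event was processed, false if it was a duplicate. */
    public boolean handle(String eventId, String payload, Consumer<String> action) {
        if (!processedIds.add(eventId)) {
            return false; // already seen: skip the side effect
        }
        action.accept(payload);
        return true;
    }
}
```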

Understanding the code

You can find the code here. But I would encourage you to read through the story, since I walk through some key implementation details and the limitations of this pattern.

Student Microservice

This is a simple Spring-Boot microservice that exposes three endpoints via a REST controller and uses Spring-JPA for database actions: a GET for fetching student information, a POST for creating or enrolling a student, and a PUT for updating the student’s email address. The POST and the PUT generate the events ‘Student Enrolled’ and ‘Student Email Changed’.

Invoking the database actions and inserting the event is handled in the service class.

The method needs the @Transactional annotation so that the database action and the event write are bound by a single transaction. enrollStudent() creates a new record in the Student table and then fires an event using Spring’s ApplicationEventPublisherAware support. The method createEnrollEvent() builds the data to be inserted into the Outbox.
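The shape of that logic can be sketched without the Spring plumbing. All names below are illustrative rather than the repository’s exact code; in the real service the method is annotated with @Transactional and the event is fired through Spring’s ApplicationEventPublisher.

```java
import java.util.UUID;

// Framework-free sketch of the enrollment flow: we model only the data that
// ends up in the Outbox table alongside the Student insert.
public class StudentService {

    /** Minimal stand-in for the Outbox row written in the same transaction. */
    public static class OutboxEvent {
        public final String id = UUID.randomUUID().toString();
        public final String eventType;
        public final String payload;

        public OutboxEvent(String eventType, String payload) {
            this.eventType = eventType;
            this.payload = payload;
        }
    }

    /** Builds the event record inserted into the Outbox (cf. createEnrollEvent()). */
    public OutboxEvent createEnrollEvent(long studentId, String name, String email) {
        String payload = String.format(
            "{\"id\":%d,\"name\":\"%s\",\"email\":\"%s\"}", studentId, name, email);
        return new OutboxEvent("STUDENT_ENROLLED", payload);
    }
}
```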

Inserting the event into the ‘Outbox’ table is handled in the EventService class which uses a Spring-JPA Repository to handle the database interactions.

A key thing to note here is that the code deletes the record from the ‘Outbox’ table as soon as it has been written, so that the table doesn’t grow. This works because Debezium doesn’t examine the actual contents of the table; it tails the write-ahead transaction log. The calls to save() and delete() produce a CREATE and a DELETE entry in the log once the transaction commits, and the Kafka-Connect custom transformer can be programmed to take no action on the DELETE entry.
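The save-then-delete trick can be illustrated with a small, self-contained sketch (the classes and the in-memory “WAL” below are stand-ins of my own, not the repository’s JPA code): the table itself stays empty, but the log still records both operations, which is all Debezium needs.

```java
import java.util.ArrayList;
import java.util.List;

// Sketch of EventService's save-then-delete behaviour, modelling the Outbox
// table and the write-ahead log as plain lists.
public class EventService {
    private final List<String> table = new ArrayList<>();      // stand-in for the Outbox table
    private final List<String> walEntries = new ArrayList<>(); // stand-in for the WAL

    public void persist(String event) {
        // save(): insert the event row; the commit appends a CREATE entry
        table.add(event);
        walEntries.add("CREATE:" + event);
        // delete(): remove the row immediately so the table never grows;
        // the commit appends a DELETE entry that the transformer ignores
        table.remove(event);
        walEntries.add("DELETE:" + event);
    }

    public int tableSize() { return table.size(); }
    public List<String> wal() { return walEntries; }
}
```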

Custom Debezium Transformer

This component determines the Kafka topic to which the event needs to be published. This is done by using the EVENT_TYPE column of the payload from the ‘Outbox’ table. The component is built as a JAR and will be placed in the Kafka-Connect runtime.

The setup of placing the JAR in the Kafka-Connect runtime is handled by the Dockerfile.

We use the image debezium/connect, since it comes preloaded with all available connectors. For installing a particular connector you can refer to the documentation here.

The component consists of just one class that helps determine the topic before the message is published.

The transformer implements the Kafka-Connect Transformation interface. The apply() method filters for the CREATE operation (‘c’), skipping the DELETE, as explained above.

For every CREATE, the topic name is determined and the payload is returned. For simplicity, in this example the topic name is the lowercase value of the EVENT_TYPE column inserted into the “Outbox” table by the Student microservice.
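Stripped of the Kafka-Connect plumbing, the routing rule is small. The sketch below is illustrative (the class and method names are mine, not the repository’s exact code): keep only CREATE change events and derive the topic from EVENT_TYPE.

```java
// Core of the custom transformation, without the Kafka-Connect boilerplate:
// keep only CREATE ('c') change events and derive the topic name from the
// EVENT_TYPE column. Returning null stands in for dropping the record.
public class OutboxRouter {

    /** Returns the target topic, or null if the change event should be dropped. */
    public static String route(String operation, String eventType) {
        if (!"c".equals(operation)) {
            return null; // skip DELETEs ('d') and anything else
        }
        // e.g. "STUDENT_ENROLLED" -> "student_enrolled"
        return eventType.toLowerCase();
    }
}
```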

Installation of the needed images and frameworks

The guide assumes the user has docker pre-installed. You can follow the steps for installation here.

Creating the Debezium Connect image is done by triggering a Maven build on the custom-debezium-connect project and then building the Docker image:

$ mvn clean install
$ docker build -t custom-debezium-connect .

Running Docker Compose under the project folder installs all the prerequisites: Zookeeper, Kafka, Postgres, and Kafka-Connect. The command to run the Docker Compose file is:

$ docker-compose up -d

We use the image debezium/postgres because it comes prebuilt with the logical decoding feature, a mechanism that allows extraction of the changes committed to the transaction log, making CDC possible. The documentation for installing the plugin to Postgres can be found here.

Setting up the Kafka topics

Execute the commands below to create the two Kafka topics: “student_enrolled” and “student_email_changed”.

$ docker exec -t kafka /usr/bin/kafka-topics \
--create --bootstrap-server :9092 \
--topic student_email_changed \
--partitions 1 \
--replication-factor 1
$ docker exec -t kafka /usr/bin/kafka-topics \
--create --bootstrap-server :9092 \
--topic student_enrolled \
--partitions 1 \
--replication-factor 1

Linking the Debezium Kafka-Connect with the Outbox table

Execute the curl command below to create a connector on the Kafka-Connect server. The connector points to the Postgres installation and specifies both the table to capture and the custom transformer class we built earlier.

$ curl -X POST \
http://localhost:8083/connectors/ \
-H 'content-type: application/json' \
-d '{
"name": "student-outbox-connector",
"config": {
"connector.class": "io.debezium.connector.postgresql.PostgresConnector",
"tasks.max": "1",
"database.hostname": "postgres",
"database.port": "5432",
"database.user": "user",
"database.password": "password",
"database.dbname": "studentdb",
"database.server.name": "pg-outbox-server",
"tombstones.on.delete": "false",
"table.whitelist": "public.outbox",
"transforms": "outbox",
"transforms.outbox.type": "com.sohan.transform.CustomTransformation"
}
}'

That completes the setup: Zookeeper runs on port 2181; Kafka on port 9092 with all the needed topics; Postgres on port 5432 with the ‘studentdb’ database pre-created; and finally Kafka-Connect, with Debezium and our custom transformer, on port 8083.

Running the Solution

Once the Student Microservice is started we can see the pattern in action.

To simulate a student enrollment, we can execute the curl command below.

$ curl -X POST \
'http://localhost:8080/students/~/enroll' \
-H 'content-type: application/json' \
-d '{
"name": "Megan Clark",
"email": "mclark@gmail.com",
"address": "Toronto, ON"
}'

We see that a new student record is inserted into the database for ‘Megan Clark’.

Image 4. Student Enrolled inserted into the database

And we see an event published into the topic student_enrolled, notifying the downstream systems that ‘Megan Clark’ has enrolled.

Image 5. Console consumer to verify data being published to Kafka

To simulate a student updating their email address, we can execute the curl command below.

$ curl -X PUT \
http://localhost:8080/students/1/update-email \
-H 'content-type: application/json' \
-d '{
"email": "jsmith@gmail.com"
}'

We can see that the email has been changed to ‘jsmith@gmail.com’.

Image 6. Student Email changed in the database

And we see an event published to the topic student_email_changed, notifying the downstream systems that the student with ID ‘1’ has changed their email address.

Image 7. Console consumer to verify data being published to Kafka

If we comment out the line of code in the EventService that deletes outbox events after writing them (outBoxRepository.delete(entity)), we can view the events inserted into the outbox table.

Image 8. Events in the Outbox table.

Summary

In a microservice architecture, system failure is inevitable. Adopting this architecture style forces us to design for failure. The Outbox pattern gives us a robust method of reliable messaging in the face of failure.

The solution above keeps the implementation of the pattern simple. But to make the system highly available, we must run multiple instances (clusters) of Zookeeper, Apache Kafka, and Kafka-Connect.

Finally, I would like to point out that this isn’t the only way to tackle the problem of reliable messaging. But it is an invaluable pattern to have at your disposal.
