How to Build a Reliable Live Data Replication Architecture (Part 1): Track Events With SQL Triggers

Live event tracking using SQL triggers

tong eric
Bina Nusantara IT Division
4 min readJul 13, 2021

--

Photo by Joshua Sortino on Unsplash

This article is a two-part series about how to engineer live data replication architectures using SQL triggers, and the outbox pattern. You can read part two here.

Murphy’s Law states, “Nothing is as easy as it looks. Everything takes longer than you expect. If anything can go wrong, it will, at the worse possible moment”. Replicating data over multiple nodes is useful for us to keep the database up-to-date, but it is difficult to guarantee the reliability of the service. You need a reliable replication architecture that can help you process data and help audit errors when it fails.

For that purpose, this article will provide and explain the basic data flow architecture for replication.

Data Flow Overview

Image by Author

The diagram above is a visualization of the architecture of our data flow. Our goal is to achieve live data replication by copying events into the outbox table using SQL triggers. Then, an event scheduler will read the table periodically and will publish the event to RabbitMQ. There will be a consumer listening for events from RabbitMQ that will update data on the target table.

Architecting the Data Flow

The required components of the data flow are as follows:

  • SQL Server for SQL triggers.
  • Event table for storing updated events.

Creating the Event table

The event table is used to store data replication events. For example, there is a Student table and an Exercise table, in this case, the Student table will be the master and the Exercise table will be the target. Why? Because the Exercise table has a foreign key attached to the StudentID column on the Student table.

+---------------++------------------+
| Student || Exercise |
+---------------++------------------+
| StudentID PK || ExerciseID PK |
| Name || StudentID FK |
| Age || Name |
| Address || Content |
| NIM || |
+---------------++------------------+

You want to replicate the Name column on the Student table to the Exercise table. To find and replace the name to you need to store the StudentID. Base on this analysis our event table will look like this.

+--------------+
| EventTable |
+--------------+
| EventID PK |
| StudentID |
| Name |
| TableName |
| UpdateTime |
+--------------+

The EventID will act as the primary key, the StudentID will be the target identifier. The TableName and UpdateTime columns are only for auditing purposes. Here is SQL Query to make this table.

Creating the SQL Trigger

After you have prepared the event table, next you will need to create a trigger to an update to the event table. There are some conditions that you need to fulfill, mainly :

  • The trigger must be only fire when on an update event.
  • The value that is input to the outbox table is only when the Name column is changed.

Here is an example of an SQL Trigger:

First, the query starts from a casual CREATE TRIGGER query.

Then, you will specify that only an update will fire the trigger.

After that, you have to declare a variable that you need to use in our trigger. studentId, studentName, tableName, and updateTime is used to store value from events. insertCount and loopCount will go through each variable and each condition variable used.

To retrieve the table name you can query sys.objects. The select query above can produce the name of the table that triggered the event.

You can also use the GETDATE() function to retrieve the updateTime variable.

Next, you want to know how many rows of updates have changed in the Name column. To get each affected row you will need to store the insertCount variable and based on that you can produce 3 conditions that lead to a different query:

The first condition will query through each row using OFFSET FETCH and insert it to the event table, here is an example:

Finally, if the insertCount equals 1 you just need to insert one row to our event table.

Otherwise, if the insertCount is equal to 0 the trigger will do nothing.

Conclusion

To summarize, making live event triggers can be complicated and you must plan your data flow carefully. Incorrect logic can cause massive problems in the data architecture.

Again, there are a few components to our solution, mainly:

  • SQL Server for SQL triggers.
  • Event table for storing updated events.

Anything that can go wrong will go wrong, don’t forget to always double-check.

--

--