Transactions and Failover using Saga Pattern in Microservices Architecture
Summary:
In this article, I’ll introduce the Saga Pattern for distributed transactions and show how it can help you build a robust business transaction flow in a microservices architecture.
Introduction
I’m sure you have heard about Two-Phase Commit. It’s a very popular protocol for making several operations commit atomically: a coordinator first asks every participant to prepare, and the commit happens only if all of them agree; otherwise everything is rolled back. Inside a single application with a single database it’s even simpler, because one local ACID transaction can update multiple entities at the same time, like confirming an order and updating the stock together.
But when it comes to microservices, things become more complicated. Most of the application’s parts are distributed among different services, every service has its own data storage, and you can no longer rely on the simplicity of a single local transaction to maintain the consistency of your whole system.
Let’s take an example. Say we’re building a travel and booking website, and we start with a very simple architecture.

In the example above, one can’t just place an order, charge the customer, confirm the booking with the supplier, and send a confirmation email/SMS to the customer all in a single ACID transaction. To execute this entire flow consistently, we need a distributed transaction.
Problem
Building a distributed transaction across multiple services is a complex and tricky task. We have to consider many issues that may occur: services that are temporarily unavailable or in a transient state, eventual consistency between services, isolation, and rollbacks. All of these scenarios should be considered carefully during the design phase.
Solution: The Saga Pattern
A Saga is a sequence of local transactions where each transaction updates a single service. The first transaction is initiated by an external request corresponding to the system operation, and each subsequent step is triggered by the completion of the previous one. If a step fails, the saga runs compensating transactions that undo the steps already completed, which is how it handles rollback for the whole sequence.
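To make the definition concrete, here is a minimal sketch in Python. It assumes each step is just a pair of functions (the action and its compensation) running in one process; `SagaStep` and `run_saga` are hypothetical names for illustration, not part of any library.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # local transaction in one service
    compensation: Callable[[], None]  # undoes the action if a later step fails


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()
        except Exception:
            for done in reversed(completed):
                done.compensation()
            return False
        completed.append(step)
    return True


log: List[str] = []


def fail_booking() -> None:
    # Simulated failure in the second step.
    raise RuntimeError("supplier rejected the booking")


steps = [
    SagaStep("payment", lambda: log.append("charge"), lambda: log.append("refund")),
    SagaStep("booking", fail_booking, lambda: log.append("cancel booking")),
]
result = run_saga(steps)
```

In this run the payment step succeeds and the booking step fails, so only the payment is compensated. In a real system each action and compensation would be a call or message to a separate service rather than a local function.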
Using our previous example, a high-level view of a Saga implementation would look like the following:

There are a couple of different ways to implement a saga transaction, but the two most popular are:
- Command/Orchestration: There’s an orchestrator which is responsible for centralizing the saga’s decision making and sequencing the business logic.
- Events/Choreography: There’s no central coordinator or orchestrator. Each service listens to the other services’ events and decides whether an action should be taken or not.
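As a rough illustration of the choreography style, the sketch below wires services together through events only. The `EventBus` class is an in-memory stand-in for a real message broker, and the event names are my own assumptions based on the travel booking example.

```python
from collections import defaultdict
from typing import Callable, DefaultDict, Dict, List

Handler = Callable[[Dict], None]


class EventBus:
    """In-memory stand-in for a message broker (assumption for illustration)."""

    def __init__(self) -> None:
        self._handlers: DefaultDict[str, List[Handler]] = defaultdict(list)
        self.history: List[str] = []

    def subscribe(self, event_type: str, handler: Handler) -> None:
        self._handlers[event_type].append(handler)

    def publish(self, event_type: str, payload: Dict) -> None:
        self.history.append(event_type)
        for handler in self._handlers[event_type]:
            handler(payload)


bus = EventBus()

# Each service reacts to another service's event and emits its own.
# Payment Service listens for new orders:
bus.subscribe("OrderCreated", lambda e: bus.publish("PaymentExecuted", e))
# Booking Service listens for successful payments:
bus.subscribe("PaymentExecuted", lambda e: bus.publish("BookingConfirmed", e))
# Order Service listens for confirmed bookings:
bus.subscribe("BookingConfirmed", lambda e: bus.publish("OrderConfirmed", e))

bus.publish("OrderCreated", {"order_id": 1})
```

Notice that no single place in this code describes the whole flow. You have to read every subscription to reconstruct it, which is the dependency problem that orchestration tries to avoid.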
Command/Orchestration is my favorite, for these reasons:
- It avoids messy dependencies between services, as the Saga orchestrator is the one that invokes the saga participants.
- It reduces complexity for the participants, as they only need to execute commands and reply.
- It is easier to implement and test.
- The transaction complexity remains linear when new steps are added.
- Rollbacks are easier to manage.
Let’s dive deeper
In the orchestration approach, we’ll create a new service that takes responsibility for telling each participant what to do and when. The saga orchestrator communicates with each service in a command/reply style, telling it which operation should be performed, and is responsible for firing rollbacks if needed.

- Order Service saves a pending order and asks Order Saga Orchestrator to start a create order transaction.
- Orchestrator sends an execute payment command to Payment Service and waits for feedback on the orchestrator queue channel.
- Orchestrator sends a confirm booking command to Booking Service and waits for feedback on the orchestrator queue channel.
- Orchestrator sends a send notification command to Notification Service.
- Orchestrator sends a confirm order command to Order Service.
In the case above, Order Saga Orchestrator knows the flow needed to execute a “create order” transaction. If anything fails, it is also responsible for coordinating the rollback by sending commands to each participant to undo its previous operation.
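The happy path of this flow can be sketched as a small state machine. This is only an illustration under my own assumptions: the command names, the `outbox` queue standing in for the message channels, and the `OrderSagaOrchestrator` class are hypothetical, and for simplicity the sketch waits for a reply after every command, including the notification.

```python
from collections import deque
from typing import Deque, Dict, List, Tuple

# Command sequence for the "create order" saga, in the order listed above.
FLOW: List[Tuple[str, str]] = [
    ("payment", "ExecutePayment"),
    ("booking", "ConfirmBooking"),
    ("notification", "SendNotification"),
    ("order", "ConfirmOrder"),
]


class OrderSagaOrchestrator:
    def __init__(self) -> None:
        # Outgoing commands as (service, command, order_id).
        self.outbox: Deque[Tuple[str, str, int]] = deque()
        # order_id -> index in FLOW of the command currently in flight.
        self.position: Dict[int, int] = {}

    def start(self, order_id: int) -> None:
        self.position[order_id] = 0
        self._send(order_id)

    def on_reply(self, order_id: int, success: bool) -> str:
        """Handle one reply read from the orchestrator's reply channel."""
        if not success:
            return "ROLLBACK_NEEDED"
        self.position[order_id] += 1
        if self.position[order_id] == len(FLOW):
            return "COMPLETED"
        self._send(order_id)
        return "IN_PROGRESS"

    def _send(self, order_id: int) -> None:
        service, command = FLOW[self.position[order_id]]
        self.outbox.append((service, command, order_id))


orchestrator = OrderSagaOrchestrator()
orchestrator.start(order_id=42)
# Simulate every participant replying with success.
states = [orchestrator.on_reply(42, success=True) for _ in FLOW]
sent = [command for _, command, _ in orchestrator.outbox]
```

The whole sequence lives in one place (`FLOW`), so adding a step means adding one entry there instead of rewiring several services.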
Note: Operations that can’t be rolled back, like notifications/SMS, should be moved as late as possible in the transaction flow, because we cannot revert them once they have been executed.
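This ordering rule is simple enough to check mechanically. The sketch below assumes each step is tagged with a flag saying whether it can be compensated; the function name and the flags are hypothetical and only illustrate the idea.

```python
from typing import List, Tuple

# (step name, can it be compensated?): the flags are illustrative assumptions.
Step = Tuple[str, bool]


def non_compensable_steps_are_last(steps: List[Step]) -> bool:
    """True if no compensable step appears after a non-compensable one."""
    seen_non_compensable = False
    for _, compensable in steps:
        if not compensable:
            seen_non_compensable = True
        elif seen_non_compensable:
            # A step that might need undoing comes after one that cannot be undone.
            return False
    return True


good = [("ExecutePayment", True), ("ConfirmBooking", True), ("SendNotification", False)]
bad = [("SendNotification", False), ("ExecutePayment", True), ("ConfirmBooking", True)]
good_ok = non_compensable_steps_are_last(good)
bad_ok = non_compensable_steps_are_last(bad)
```

In the `bad` ordering the customer could receive a confirmation SMS for an order whose payment later fails, which is exactly the situation the rule is meant to prevent.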
Rolling Back in Saga’s Command/Orchestration
Rollbacks are a lot easier now: when something fails, the Orchestrator sends a compensation/rollback command to each service that has already completed its step.
Example: If the supplier fails the booking for any reason after we have taken money from the customer, we should refund the money to the customer.
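One simple way to model this, sketched below, is a table that maps every forward command to its compensating command. The command names are my own assumptions for this example; the orchestrator would send the resulting commands to the corresponding services, most recent step first.

```python
from typing import Dict, List

# Forward command -> compensating command (names are illustrative).
COMPENSATIONS: Dict[str, str] = {
    "CreatePendingOrder": "RejectOrder",
    "ExecutePayment": "RefundPayment",
    "ConfirmBooking": "CancelBooking",
}


def rollback_commands(completed: List[str]) -> List[str]:
    """Compensating commands for the completed steps, most recent first."""
    return [COMPENSATIONS[c] for c in reversed(completed) if c in COMPENSATIONS]


# Payment succeeded, then the supplier rejected the booking.
completed_before_failure = ["CreatePendingOrder", "ExecutePayment"]
to_send = rollback_commands(completed_before_failure)
```

Here the orchestrator would first ask Payment Service for a refund and then ask Order Service to reject the pending order. The failed booking itself needs no compensation, because it never completed.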

However, this approach still has some drawbacks. One of them is the risk of concentrating too much logic in the orchestrator and ending up with an architecture where a smart orchestrator tells dumb services what to do.
