Saga Architecture Pattern: Pros, Cons and Use Cases

Jul 11, 2023

Introduction:

Saga is an architectural pattern that defines how a distributed transaction should work across multiple loosely coupled services. It is an old concept, originally published in a 1987 paper from Princeton University ("Sagas" by Hector Garcia-Molina and Kenneth Salem). The main idea can be summarised by this statement from the paper:

"A LLT (Long Lived Transaction) is a saga if it can be written as a sequence of transactions that can be interleaved with other transactions."

Therefore, the Saga pattern is used for long lived transactions that run across loosely coupled services. Each service has its own local ACID transaction, and Saga chains these local transactions together with well defined states and steps. Each Saga step is followed by another step until an end state is reached.

For an example implementation of Saga pattern you may check this course.

Implementation:

There are two ways to implement Saga: Choreography, which uses async events, or an Orchestrator, which controls the flow of the saga from a single place.

In the Orchestrator approach, an orchestrator service runs a saga step by sending a command to the target backend service. Once it gets a successful result, it proceeds to the next step by sending another command to another service. If a failure occurs at any point, the orchestrator stops processing and sends rollback commands to the services that have already completed their steps. Each rollback is performed by a compensating transaction, which is a new local transaction that semantically undoes the completed step. The Orchestrator approach usually implies a synchronous operation: the orchestrator sends a command and waits for the response, so each step is a blocking call, as opposed to the Choreography approach, which relies on events and async communication.
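
The orchestrator flow described above can be sketched in a few lines of Python. This is a minimal, in-process illustration under stated assumptions, not a production implementation: `SagaStep`, `run_saga`, and the order, stock, and payment steps are hypothetical names, and plain function calls stand in for the commands that would be sent to remote services.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class SagaStep:
    name: str
    action: Callable[[], None]        # stands in for a command sent to a service
    compensation: Callable[[], None]  # compensating transaction for this step


def run_saga(steps: List[SagaStep]) -> bool:
    """Run steps in order; on failure, compensate completed steps in reverse."""
    completed: List[SagaStep] = []
    for step in steps:
        try:
            step.action()  # blocking call: wait for the result before moving on
        except Exception:
            # The failed step never completed, so only earlier steps are undone.
            for done in reversed(completed):
                done.compensation()
            return False
        completed.append(step)
    return True


log: List[str] = []


def fail_payment() -> None:
    # Hypothetical failure used to trigger the rollback path.
    raise RuntimeError("payment declined")


order_saga = [
    SagaStep("order", lambda: log.append("order created"),
             lambda: log.append("order cancelled")),
    SagaStep("stock", lambda: log.append("stock reserved"),
             lambda: log.append("stock released")),
    SagaStep("payment", fail_payment,
             lambda: log.append("payment refunded")),
]

succeeded = run_saga(order_saga)
print(succeeded, log)
```

In this sketch the payment step fails, so the stock and order steps are compensated in reverse order of completion. A real orchestrator would also need to persist the saga state so that it can resume compensation after a crash.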

In the Choreography approach, I prefer to use a coordinator service to make the event handling easier. The coordinator service runs a saga step and sends async events to an event store. The next saga step is handled by another backend service that listens for the events sent by the coordinator service. If a failure occurs at any point, the coordinator stops processing and sends rollback commands to the services that have already completed their steps. As in the Orchestrator approach, each rollback is performed by a compensating transaction. This coordinator is an addition to the plain Choreography approach that gives some control over the async event processing.

The Choreography approach is event-based and is therefore eventually consistent. This means that a result you read may have changed by the second read, because the process is still running. Eventually, though, the system reaches a final state, whether the saga succeeds or fails.

Figure: Order Saga Process with Choreography and Async Events with a Coordinator Service
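
The coordinator-based choreography described above could look roughly like the following sketch. Everything here is an assumption made for illustration: the `EventBus` class is an in-memory stand-in for an event store, and the event names (`OrderCreated`, `PaymentCompleted`, `StockRejected`, and so on), the `OUT_OF_STOCK` set, and the status values are hypothetical.

```python
from collections import defaultdict, deque
from typing import Callable, DefaultDict, Deque, Dict, List, Tuple


class EventBus:
    """In-memory stand-in for an event store or message broker."""

    def __init__(self) -> None:
        self.handlers: DefaultDict[str, List[Callable[[str], None]]] = defaultdict(list)
        self.queue: Deque[Tuple[str, str]] = deque()

    def subscribe(self, event_type: str, handler: Callable[[str], None]) -> None:
        self.handlers[event_type].append(handler)

    def publish(self, event_type: str, order_id: str) -> None:
        # Fire and forget: the publisher does not wait for any consumer.
        self.queue.append((event_type, order_id))

    def drain(self) -> None:
        # Deliver queued events (and any events they trigger) until none remain.
        while self.queue:
            event_type, order_id = self.queue.popleft()
            for handler in self.handlers[event_type]:
                handler(order_id)


bus = EventBus()
order_status: Dict[str, str] = {}
OUT_OF_STOCK = {"order-2"}  # hypothetical orders that the stock service rejects


def start_order(order_id: str) -> None:
    # Coordinator: run the first step locally, then publish an event.
    order_status[order_id] = "PENDING"
    bus.publish("OrderCreated", order_id)


def set_status(status: str) -> Callable[[str], None]:
    def handler(order_id: str) -> None:
        order_status[order_id] = status
    return handler


def on_stock_rejected(order_id: str) -> None:
    # Coordinator: a later step failed, so ask the payment service to compensate.
    order_status[order_id] = "CANCELLING"
    bus.publish("RefundPayment", order_id)


def reserve_stock(order_id: str) -> None:
    rejected = order_id in OUT_OF_STOCK
    bus.publish("StockRejected" if rejected else "StockReserved", order_id)


# Payment service: takes payment, and refunds it when asked to compensate.
bus.subscribe("OrderCreated", lambda oid: bus.publish("PaymentCompleted", oid))
bus.subscribe("RefundPayment", lambda oid: bus.publish("PaymentRefunded", oid))
# Stock service: reacts to completed payments.
bus.subscribe("PaymentCompleted", reserve_stock)
# Coordinator: tracks the saga state from the resulting events.
bus.subscribe("StockReserved", set_status("APPROVED"))
bus.subscribe("StockRejected", on_stock_rejected)
bus.subscribe("PaymentRefunded", set_status("CANCELLED"))

start_order("order-1")
start_order("order-2")
status_before = dict(order_status)  # read while both sagas are still in flight
bus.drain()
status_after = dict(order_status)   # read after all events have been handled
print(status_before, status_after)
```

Reading the status before the events are processed returns `PENDING` for both orders, while reading it afterwards returns the final states. That difference between the two reads is the eventual consistency mentioned earlier.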

Saga is a complex pattern to implement, especially when the number of steps is large and the communication is asynchronous. When a step fails, it takes considerable effort to handle the failure and roll back all previous operations to a known good state with compensating transactions. So let’s summarise the pros and cons.

Pros:

Enables distributed transaction management among loosely coupled services
Well suited for use cases with a small number of steps
Enables rollback to a previous known state by using compensating transactions
Each service's local transaction is isolated from other services, which provides better fault tolerance. Multiple services can run their local transactions in parallel

Cons:

Implementation can be complex, especially when a large number of steps is required
Requires considerable effort to cover all scenarios, including a compensating transaction for each step
Difficult to debug because the work is spread across distributed services
There is no read isolation with the Saga pattern. A client can read a state that is later reverted to a previous state by a compensating transaction.
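
The missing read isolation from the last point can be shown with a tiny sketch. The account name, the amounts, and the `debit` and `credit` functions are hypothetical, and a dictionary stands in for one service's database.

```python
from typing import Dict, List

balance: Dict[str, int] = {"alice": 100}
client_reads: List[int] = []


def debit(account: str, amount: int) -> None:
    # Saga step: commits immediately as a local transaction.
    balance[account] -= amount


def credit(account: str, amount: int) -> None:
    # Compensating transaction for debit.
    balance[account] += amount


debit("alice", 30)                      # an early saga step commits
client_reads.append(balance["alice"])   # a client reads the intermediate state
credit("alice", 30)                     # a later step failed, so the debit is undone
client_reads.append(balance["alice"])   # the same client now sees the old value again
print(client_reads)
```

The first read returns a value that the saga later takes back, so any decision the client based on it may have been made on data that never became final.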

Since Saga has both advantages and disadvantages, it is important to apply it only when it really suits the problem. The main problem that Saga solves is enabling distributed long running transactions. Therefore, it should be used in use cases that require distributed transactions. If the system can be redesigned so that two separate local transactions live in a single service, there is no need to apply the Saga pattern.

We may consider using Saga when the use case has the following requirements:

A chain of local transactions that belong to loosely coupled services
Eventual data consistency across a distributed system
Compensating transactions to roll back previous operations if a step fails

Conclusion:

Managing transactions across loosely coupled services is a difficult job, especially when the number of steps in a transaction is large and the communication involves asynchronous operations. The Saga pattern addresses these difficulties and provides a road map for handling distributed transactions safely, with well defined steps and states along with compensating transactions to roll back when any step fails. Since the implementation can be complex, it is important to analyse the use case and apply Saga only when distributed transactions are required.

Written by Ali Gelenler