Data consistency among microservices: is it possible?

Published in Oracle Developers, Dec 14, 2018

Microservices are a trend: they're nice, they're cool… even awesome! But if you are considering them, or have already moved to this approach, have you properly considered how to deal with data? Did you even realize that you should think about it?

First: why microservices?

Why should you even consider microservices in your project? Why the heck should developers care about them?

The answer can be found in Conway’s Law:

“Organizations which design systems … are constrained to produce designs which are copies of the communication structures of these organizations.”

In other words, the software that you and/or your company deliver is structured in the same way that you folks communicate with each other internally.

There are some good examples of Conway’s Law in action in a famous article by Martin Fowler (URL 1).

First he shows how the “siloed functional teams” structure their applications:

Then how the “cross functional teams” do it:

Which brings us to an important concept: microservices are meant to scale people first, not software.

That is, it doesn’t matter whether you split your monolith into dozens, hundreds or even thousands of services. If breaking it down is not preceded by a matching split in the way your teams are organized, the outcome of your project can be a disaster.

Another well-known law about organizing teams is “the 2 pizzas law”. It states that no team or meeting in the organization should be so big that it can’t be fed by two pizzas. By following it, the company keeps all teams small, independent and, if everything works well, agile.

Put the two laws together (Conway and 2 pizzas) and you see that you should break your teams down and keep them small in order to scale them, before even considering scaling the software.

If the teams have autonomy, they can choose the technologies used in their services. They can define the service API. And they can choose, of course, the service’s database.

Databases & Microservices

Since our point here is specifically how to deal with data in microservices, it helps that Fowler’s article also addresses it. According to Conway’s Law, the natural database outcome for “siloed” and “cross functional” teams looks like this:

On the left side, we have the monolith database: one single database for the whole application.

On the right side, one database per service, aka the “database per service” pattern.

If you want to learn how to get from the left side to the right side, check out this amazing book by my friend Edson Yanaga (URL 2).

The database per service pattern helps with many aspects of microservices development, but it also creates a problem: consistency.

Data consistency

How would you say “consistency” in a single image? I’d do it this way:

When we speak about data consistency in microservices, we mean what we should do to avoid the owl among the cats…

With a monolith database, it is easy and natural to use ACID transactions to guarantee consistency. The ACID acronym means:

Atomicity: no matter whether a transaction has one, two or a hundred steps, all of them must complete successfully; otherwise the whole transaction is rolled back;
Consistency: all data in the database must be consistent at the end of the transaction, that is, consistent with integrity constraints, business rules, etc.;
Isolation: concurrent transactions must not interfere with each other; one transaction can’t touch data that another transaction is touching at the same time;
Durability: relates to persistence. Once the transaction ends, its data must be persisted in the database.

Ok, but… how would you do it when the databases are separate from each other? You cannot even guarantee that they use the same technology. How could you perform transactions among them?

They can be on different networks, cloud vendors, servers, frameworks… take a look at the image below:

By now I hope you are convinced that you can’t simply run a transaction across the databases of a microservices architecture, so you can’t rely on transactions to guarantee consistency. But you still need the results that ACID transactions would bring to your application, right?

Sagas to the rescue

To help us with this challenge there’s a great design pattern called Sagas. For a quick background, it was first described in a paper written by Hector Garcia-Molina and Kenneth Salem in 1987, published by Princeton University.

The paper is great, and you can read a copy of it at URL 3.

One of the most important takeaways of this paper for our issue is a concept called “compensation transaction”.

So for each step of your transaction, you have a side step that will be called in case of failure. Important: the compensation transaction will not roll back what was done, but will… well… compensate for it. You can see it in the image below:

Imagine that you’ve used your credit card for a $100 purchase. But there was some problem with the product and you decided to return it right away. The credit card company will not roll back or delete the transaction: they will create another one to revert it. To compensate for it.

You will end up with both a +$100 and a -$100 transaction on your credit card statement.
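The statement idea can be sketched in Java as an append-only ledger. This is only an illustration: the `Ledger` and `Entry` names are hypothetical, and amounts are kept in cents as an assumption to avoid floating point. The point is that a refund never deletes the original charge; it appends the opposite entry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

// Hypothetical append-only ledger: entries are never updated or deleted.
final class Ledger {
    // One line of the statement; the amount is stored in cents.
    static final class Entry {
        final String description;
        final long amountCents;

        Entry(String description, long amountCents) {
            this.description = description;
            this.amountCents = amountCents;
        }
    }

    private final List<Entry> entries = new ArrayList<>();

    // Records a purchase as a new entry.
    Entry charge(String description, long amountCents) {
        Entry entry = new Entry(description, amountCents);
        entries.add(entry);
        return entry;
    }

    // Compensates a previous entry by appending its opposite.
    Entry compensate(Entry original) {
        Entry reversal = new Entry("Reversal of: " + original.description,
                -original.amountCents);
        entries.add(reversal);
        return reversal;
    }

    // Sum of all entries, original charges and reversals alike.
    long balanceCents() {
        long total = 0;
        for (Entry entry : entries) {
            total += entry.amountCents;
        }
        return total;
    }

    // Read-only view of everything that ever happened.
    List<Entry> statement() {
        return Collections.unmodifiableList(entries);
    }

    public static void main(String[] args) {
        Ledger ledger = new Ledger();
        Entry purchase = ledger.charge("Shopping", 10_000); // +$100
        ledger.compensate(purchase);                         // -$100
        System.out.println("entries=" + ledger.statement().size()
                + " balance=" + ledger.balanceCents());
    }
}
```

Both entries stay on the statement, and the balance returns to where it started.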

So when you have a chain of services being called and one of them fails, you can call the compensation transactions of the previous ones in order to return the data to its starting state.
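A minimal in-process sketch of that chain, in plain Java, might look like the following. The `SimpleSaga` class, its `step` and `run` methods, and the order-related step names are all hypothetical. It also assumes that a step which fails leaves nothing behind, so only the steps that already completed are compensated, in reverse order.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Hypothetical saga runner: every step carries its own compensation.
final class SimpleSaga {
    static final class Step {
        final String name;
        final Runnable action;
        final Runnable compensation;

        Step(String name, Runnable action, Runnable compensation) {
            this.name = name;
            this.action = action;
            this.compensation = compensation;
        }
    }

    private final List<Step> steps = new ArrayList<>();

    // Registers a step together with the call that compensates it.
    SimpleSaga step(String name, Runnable action, Runnable compensation) {
        steps.add(new Step(name, action, compensation));
        return this;
    }

    // Runs the steps in order. If one fails, the steps that already
    // completed are compensated in reverse order and false is returned.
    boolean run() {
        Deque<Step> completed = new ArrayDeque<>();
        for (Step step : steps) {
            try {
                step.action.run();
                completed.push(step); // most recent step ends up on top
            } catch (RuntimeException failure) {
                while (!completed.isEmpty()) {
                    completed.pop().compensation.run();
                }
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        List<String> log = new ArrayList<>();
        boolean ok = new SimpleSaga()
                .step("stock", () -> log.add("reserve stock"), () -> log.add("release stock"))
                .step("payment", () -> log.add("charge card"), () -> log.add("refund card"))
                .step("shipping",
                        () -> { throw new IllegalStateException("carrier unavailable"); },
                        () -> log.add("cancel shipping"))
                .run();
        System.out.println(ok + " " + log);
    }
}
```

Here the third step fails, so the card is refunded first and the stock released second: the reverse of the order in which they were done.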

How to manage the compensation calls

Ok, great. Compensation transactions everywhere and everything will work. Problem solved!

Well..

There are two ways of calling the compensations:

Self-managed: the service knows what should be done if something goes wrong. It knows which compensation must be called;
Orchestrated: there’s a service used just to orchestrate the chain of calls. It knows the order in which to call the services and what must be done when any of them fails.

The first one couples the services to each other. It also increases each service’s complexity and decreases its reusability. So, avoid it.

The second one is loosely coupled: the complexity moves to the orchestrator service and the services keep their reusability. Prefer it.

Imagine that a credit transaction could be used in different scenarios. One for shopping, another for items returned, another for online services… Each one of them could have different compensations to be done. So you can have one orchestrator for each scenario and let it manage all tricky parts.

Great! You now have a pattern, and an approach to that pattern, to help you win your battle against inconsistency. But have you noticed that your application’s complexity is increasing? How can you keep it manageable and still get the best out of Sagas?

A hand from Fn Flow

There’s a great open source project called Fn. I won’t go into too much detail about it here, but you can get more information at URL 4.

Fn is a serverless platform where you can run your application as functions. And to help you manage your function executions, Fn has a great tool called Fn Flow.

Fn Flow not only manages, monitors and logs all the executions of your functions (if you use it to call them), but also gives you a way to orchestrate your sagas.

The code below can be found in the Fn tutorial at URL 5.

It simulates a travel agency, where you have functions for booking and canceling different travel services: hotel, car rental and flights. At the end of a request, you should receive a confirmation e-mail.

The service orchestrator code will look like this:

To understand the code:

First you get an instance of Fn Flow from “.currentFlow()”;
With this instance, you create the invocation of each service endpoint with “invokeFunction(…)”;
These invocations are instances of Future, so you get all the benefits of async calls;
With “thenCompose”, you manage the order of the calls and what should be called when any of them fails;
For each “exceptionallyCompose”, a “cancel” method is called, pointing to an endpoint.
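To get a feel for this kind of chaining without an Fn server, here is a minimal sketch that uses the JDK’s `java.util.concurrent.CompletableFuture` as a stand-in for Fn Flow’s futures. It is not the Fn Flow API and not the tutorial’s code: the `TripSagaSketch` class, the `book`, `cancel` and `recoverWith` helpers, and the simulated services are all assumptions made for illustration. `CompletableFuture` has had its own `exceptionallyCompose` since Java 12; the `recoverWith` helper emulates it so the sketch also runs on Java 8. Unlike the steps listed above, the bookings here are started one after another rather than up front, which keeps the call order predictable.

```java
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.function.Function;

// Hypothetical in-process stand-in for the travel agency saga.
final class TripSagaSketch {
    // Records every simulated call so the order can be inspected.
    final List<String> calls = new CopyOnWriteArrayList<>();

    // Simulated booking function: completes exceptionally when succeed is false.
    CompletableFuture<String> book(String service, boolean succeed) {
        return CompletableFuture.supplyAsync(() -> {
            if (!succeed) {
                throw new IllegalStateException(service + " booking failed");
            }
            calls.add("book " + service);
            return service + "-confirmation";
        });
    }

    // Simulated cancel call: records the cancellation, then returns a failed
    // future so the error keeps propagating to the outer steps.
    CompletableFuture<String> cancel(String service, Throwable cause) {
        calls.add("cancel " + service);
        CompletableFuture<String> failed = new CompletableFuture<>();
        failed.completeExceptionally(cause);
        return failed;
    }

    // Java 8 friendly equivalent of exceptionallyCompose: on failure,
    // continue with the future produced by the fallback.
    static <T> CompletableFuture<T> recoverWith(
            CompletableFuture<T> future,
            Function<Throwable, CompletableFuture<T>> fallback) {
        return future
                .<CompletableFuture<T>>handle((value, error) -> {
                    if (error == null) {
                        return CompletableFuture.completedFuture(value);
                    }
                    return fallback.apply(error);
                })
                .thenCompose(next -> next);
    }

    // Books flight, hotel and car in sequence; each level has its own cancel.
    CompletableFuture<String> bookTrip(boolean flightOk, boolean hotelOk, boolean carOk) {
        CompletableFuture<String> saga =
                book("flight", flightOk).thenCompose(flight ->
                        recoverWith(
                                book("hotel", hotelOk).thenCompose(hotel ->
                                        recoverWith(
                                                book("car", carOk).thenApply(car -> "trip booked"),
                                                error -> cancel("car", error))),
                                error -> cancel("hotel", error)));
        return recoverWith(saga, error -> cancel("flight", error))
                .exceptionally(error -> "trip failed");
    }

    public static void main(String[] args) {
        TripSagaSketch sketch = new TripSagaSketch();
        String result = sketch.bookTrip(true, true, false).join();
        System.out.println(result + " " + sketch.calls);
    }
}
```

In this sketch a failed car booking triggers the car, hotel and flight cancellations in that order, because each `cancel` hands the failure on to the level that wraps it.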

The “cancel” method code is something like this:

These calls are tracked by Fn Flow and can be monitored through its dashboard. Check this example:

Awesome! But we are not “there” yet… we are “almost there”…

This approach handles the failure of each service by calling another service (a “cancel” service). But what if the “cancel” service also fails?

You can’t do a saga of a saga, because the possibilities of error are endless and this code would be really bad. So, what should you do?

Business aspects

Deciding what to do when the compensation itself fails is much more a business question than a technical one.

For this example the choice was to add retry logic. So instead of calling a “cancel” method, a “retryCancel” method was created:

The code behind the Retry class doesn’t matter much here; it’s enough to know that it schedules some retries against the failed endpoint.
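For readers who want something concrete anyway, here is a minimal sketch of a bounded retry with exponential backoff. It is an assumption about what such a helper could look like, not the tutorial’s Retry class: the `RetrySketch` name, the `withRetries` signature and the blocking `Thread.sleep` are all hypothetical choices made to keep the example self-contained.

```java
import java.util.function.Supplier;

// Hypothetical retry helper; the tutorial's own Retry class may differ.
final class RetrySketch {
    // Calls the operation until it succeeds or maxAttempts is reached,
    // doubling the wait between attempts (exponential backoff).
    static <T> T withRetries(Supplier<T> operation, int maxAttempts, long initialDelayMillis) {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        long delay = initialDelayMillis;
        RuntimeException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return operation.get();
            } catch (RuntimeException failure) {
                last = failure;
                if (attempt < maxAttempts) {
                    pause(delay);
                    delay *= 2;
                }
            }
        }
        // Every attempt failed: hand the last error to the caller, who
        // applies whatever the business rule says should happen next.
        throw last;
    }

    // Sleeps without leaking the checked InterruptedException.
    private static void pause(long millis) {
        try {
            Thread.sleep(millis);
        } catch (InterruptedException interrupted) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("retry interrupted", interrupted);
        }
    }

    public static void main(String[] args) {
        int[] calls = {0};
        String result = withRetries(() -> {
            // Simulated cancel endpoint that fails on the first two calls.
            if (++calls[0] < 3) {
                throw new IllegalStateException("cancel endpoint unavailable");
            }
            return "cancelled";
        }, 5, 10);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The cap on attempts matters: once it is exhausted, the failure has to be handed over to one of the business-level options rather than retried forever.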

As said before, deciding what to do about a failed compensation depends heavily on the business. It could be:

Open a ticket to a help desk
Change application behavior until the problem is solved
Send an email to someone/somewhere
You name others…

Conclusion

Microservices is a great approach to tackle many situations, but only if it’s used in the right way and for the right reason. Otherwise it will only create complications that you didn’t have in your project and in your team.

Your turn! If you try any of these approaches, share your thoughts and results about it.