Resiliency and fault tolerance are important concepts in microservices architecture. In such an architecture, services are distributed across several nodes and interact with one another over a network. This means failures can happen anywhere in the system and can affect the reliability and availability of the system as a whole.

A system’s resilience is its capacity to withstand failures and recover from them. Fault tolerance, on the other hand, is the ability of a system to keep functioning even in the face of errors. Circuit breakers, retries, and timeouts are a few of the strategies that can be used to achieve fault tolerance.

https://github.com/App-vNext/Polly

In SuuCat, resiliency and fault tolerance are implemented using Polly, a .NET library that provides several policies for this purpose in a microservices architecture. Polly is best known for retrying HTTP requests when the desired response is not received, for example because of a timeout.

However, it is bad practice in a microservice architecture for services to be tightly coupled to each other via HTTP requests, except in extreme cases such as a final price check of the items in the cart during checkout. Therefore, here we will consider an error scenario that may occur during database creation while the application is starting up.

Now let’s see how we use it in our project. The following code calls the MigrateDatabaseAndSeed() method to create the database and seed it with data.

https://github.com/ebubekirdinc/SuuCat/blob/master/src/Services/Assessment/src/WebUI/Program.cs

The MigrateDatabaseAndSeed() method uses Polly to apply a retry policy to the database seeding operation. The policy is first built with the Handle() method of the Policy class, which specifies the kind of exception to handle. In this instance, the policy is configured to handle any exception thrown while seeding the database.

https://github.com/ebubekirdinc/SuuCat/blob/master/src/Services/Assessment/src/Infrastructure/Persistence/ApplicationDbContextInitialiser.cs

The policy is then configured with the WaitAndRetry() method to retry the action up to five times. The wait time between retries is specified through the sleepDurationProvider option, which here uses Math.Pow() to calculate 2 raised to the power of the retry attempt number. As a result, the first retry waits 2 seconds, the second 4 seconds, the third 8 seconds, the fourth 16 seconds, and the fifth 32 seconds.
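The retry policy described above could be sketched roughly as follows. This is a minimal illustration, not the code from the SuuCat repository: it assumes the Polly NuGet package with its v7-style Policy API, and the local MigrateAndSeed function and the attempts counter are hypothetical stand-ins that simulate a database which only becomes reachable on the third attempt.

```csharp
// Sketch only: requires the Polly NuGet package (v7-style Policy API).
using System;
using Polly;

// Hypothetical stand-in for the real migration and seeding work:
// it fails twice, as if the database were still starting, then succeeds.
int attempts = 0;
void MigrateAndSeed()
{
    attempts++;
    if (attempts < 3)
        throw new InvalidOperationException("Failed to connect (simulated).");
    Console.WriteLine($"Database migrated and seeded on attempt {attempts}.");
}

// Handle any exception and retry up to five times,
// waiting 2^retryAttempt seconds before each retry (2, 4, 8, 16, 32).
var retryPolicy = Policy
    .Handle<Exception>()
    .WaitAndRetry(
        retryCount: 5,
        sleepDurationProvider: retryAttempt => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt)),
        onRetry: (exception, delay, context) =>
            Console.WriteLine($"Retrying after {delay} due to: {exception.Message}"));

// Execute runs the delegate and re-runs it according to the policy.
retryPolicy.Execute(() => MigrateAndSeed());
```

Because the delays are real, this sketch pauses for about 6 seconds (2 + 4) before the simulated operation succeeds. If all five retries fail, Execute rethrows the last exception to the caller.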

[Image: Docker log. Retrying 3 times with failure.]

As the Docker log above shows, the database seeding operation fails and is retried 3 times before it succeeds. Each failed attempt logs an error like this:

Retrying MigrateDatabaseAndSeed 00:00:02 of RetryPolicy -4face1c8 at null, due to: Npgsql.NpgsqlException(0x80004005): Failed to connect….

The first retry happens after 2 seconds, the second after 4 seconds, and the third after 8 seconds, after which the operation succeeds. This can be tested by stopping the PostgreSQL AssessmentDB container in Docker and then starting it again while the Assessment API is starting up.
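The timing can be checked with a little arithmetic. The sketch below uses plain C# without Polly and applies the same 2-to-the-power-of-attempt formula to list the delay before each retry and the total time spent waiting. The names BackoffDelay, delays, waitedForThree, and waitedForFive are illustrative and do not come from the project.

```csharp
using System;
using System.Linq;

// Same formula as the sleepDurationProvider described earlier: 2^attempt seconds.
TimeSpan BackoffDelay(int retryAttempt) => TimeSpan.FromSeconds(Math.Pow(2, retryAttempt));

// Delays before retries 1 through 5.
var delays = Enumerable.Range(1, 5).Select(n => BackoffDelay(n)).ToList();
for (int i = 0; i < delays.Count; i++)
    Console.WriteLine($"Retry {i + 1}: wait {delays[i].TotalSeconds} s");

// Time spent waiting when three retries are needed: 2 + 4 + 8.
var waitedForThree = TimeSpan.FromSeconds(delays.Take(3).Sum(d => d.TotalSeconds));
// Worst case, all five retries: 2 + 4 + 8 + 16 + 32.
var waitedForFive = TimeSpan.FromSeconds(delays.Sum(d => d.TotalSeconds));
Console.WriteLine($"Waited {waitedForThree.TotalSeconds} s for three retries, at most {waitedForFive.TotalSeconds} s for five.");
```

In the scenario above, the application waits 14 seconds in total across the three retries. With the five-retry limit, the database has a little over a minute (62 seconds of waiting, plus the time each attempt takes) to become available before the policy gives up.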

[Image: Docker log. Success after failures.]

Here you can see that the database tables are created successfully after 3 retries. This shows how Polly provides resiliency when an error occurs. Polly also offers other strategies, such as Circuit Breaker, Fallback, Hedging, Timeout, and Rate Limiter.
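As one example of these other strategies, a circuit breaker might look like the sketch below. SuuCat is not shown using this; it is only an illustration, again assuming the Polly NuGet package with its v7-style Policy API. The CallDependency function and the calls counter are hypothetical and simulate a dependency that always fails.

```csharp
// Sketch only: requires the Polly NuGet package (v7-style Policy API).
using System;
using Polly;
using Polly.CircuitBreaker;

// Open the circuit after 2 consecutive handled exceptions and keep it open for 30 seconds.
var breaker = Policy
    .Handle<Exception>()
    .CircuitBreaker(2, TimeSpan.FromSeconds(30));

// Hypothetical dependency that is always unavailable.
int calls = 0;
void CallDependency()
{
    calls++;
    throw new InvalidOperationException("Dependency unavailable (simulated).");
}

for (int i = 1; i <= 4; i++)
{
    try
    {
        breaker.Execute(() => CallDependency());
    }
    catch (BrokenCircuitException)
    {
        // The circuit is open: Polly rejects the call without invoking the dependency.
        Console.WriteLine($"Call {i}: rejected, circuit is open.");
    }
    catch (InvalidOperationException ex)
    {
        Console.WriteLine($"Call {i}: failed with '{ex.Message}'");
    }
}
```

Unlike a retry, which keeps trying a failing operation, a circuit breaker stops calling a dependency that keeps failing and gives it time to recover. Here the first two calls reach the dependency, and the remaining two are rejected immediately.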

More information can be found in the Polly docs and the SuuCat GitHub repository.
