Designing applications at scale — Part 2: Handling requests

Jose Antonio Diaz Mata
Published in Adevinta Tech Blog
10 min read · Mar 17, 2022

In the previous post of this series, we talked about microservices and events as isolated components: what they are and how to implement them in a design for a scalable application. However, for the application to work, these components must interact with one another.

For our distributed application to be healthy and performant, the system should be capable of gracefully handling all the requests that we throw at it. In this post, I’ll go over some of the most common design patterns and some best practices we follow at Adevinta to design communication systems for scalable microservices that can handle a lot of traffic.

But before that, let's start by understanding the limitations of a distributed application.

Design limitations: the CAP theorem

The CAP theorem is a staple of system design for distributed applications. It describes the limitations of any system handling data requests by establishing three main guarantees:

  • Consistency: every read request receives the most recent data available.
  • Availability: every request receives a fast, non-error response.
  • Partition tolerance: the system keeps working even when it is partitioned, i.e. when components fail or messages between them are dropped.
Venn diagram of the CAP theorem

By definition, any system can fulfil at most two of these guarantees. If the system is not partitioned, it can handle all requests locally, satisfying both availability and consistency like a traditional RDBMS. Otherwise, when a request is dropped due to network issues, the system must decide between two possible outcomes: either drop the request, sacrificing availability, or return the latest non-error data it has, sacrificing consistency.

As we are designing an application with microservices and events in a distributed system, we must also choose between consistency and availability when managing requests in our application. To help make the decision, we can focus on the experience we want for the application.

  • Focus on availability: when we want users to have a fast and responsive experience, even if the data returned may be a bit out of sync or a process gets cancelled down the line. E.g. retrieving a catalogue of products.
  • Focus on consistency: when we want users' requests to be handled reliably, and the data retrieved to always be correct and relevant. E.g. adding a product to a cart or handling a purchase or payment.

A compromise: eventual consistency

Even when focusing on availability, we should never send bad data or risk desynchronisation between services. To maintain healthy and stable application data and state, we can use the eventual consistency model. This model specifies that all data updates are eventually relayed to all the services in the application.

Essentially, when we query a service that follows the eventual consistency model, we retrieve the latest version of the data that the service in question knows about, which is usually slightly out of date (or very out of date if the system is recovering from a crash). If this model is implemented across our application, we are guaranteed that all services will eventually converge on the same version of the data once there are no more updates.

Eventual consistency example between two microservices

There is a risk involved when implementing this model: if a service receives multiple requests from different sources in quick succession, some updates may be lost, overwritten by more recent requests, or committed in the wrong order.

A good way to minimise this issue is to make sure that the application satisfies three conditions for its data models:

  • There is a single source of truth for each model
  • All operations on the models are atomic and final
  • Data operations are state-independent or are always applied in the same order

Extra tip: when handling eventual consistency, it is common practice to partition the data using a hash of each object's unique identifier, which helps fulfil the conditions above and avoid desynchronisation between services. This way, we ensure that all the relevant data for the same object is in the same partition, in order, and readily available. For example, Kafka handles this by partitioning events by message key by default.
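
To make the tip concrete, here is a minimal sketch using the confluent-kafka Python client. The broker address, topic name and payload are assumptions for illustration; the point is that keying every event by the object's unique identifier routes all updates for that object to the same partition, in order.

import json

from confluent_kafka import Producer

producer = Producer({"bootstrap.servers": "localhost:9092"})

def publish_product_update(product_id: str, update: dict) -> None:
    # Kafka's default partitioner hashes the message key, so every update
    # for the same product_id lands in the same partition, in order.
    producer.produce(
        "product-updates",  # hypothetical topic name
        key=product_id,
        value=json.dumps(update).encode("utf-8"),
    )
    producer.flush()

publish_product_update("product-42", {"price": 19.99, "stock": 3})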

Synchronous interactions between microservices

When the application must handle the latest data for a request, or when we require explicit confirmation from all microservices involved in an operation, we use synchronous calls. The microservice will make its own requests to other relevant microservices and will wait until they are finished before returning a response to the original request.

If any of the components of the application handling the request aren’t available, the request itself should fail. In other words, for the request to be successful all of the microservices involved must be running and healthy.

Nowadays, having synchronous requests with very high uptime is quite feasible if the application is hosted on modern infrastructure with replication and self-healing services. For example, a system like Kubernetes orchestrates and load-balances a set of replicas for each microservice, keeping them healthy and restarting them when they fail.

Direct queries

The simplest way to make a synchronous request between microservices is to query the other microservice directly. However, a microservice application can have multiple replicas with dynamic hosts and ports. To account for this, we can use the server-side discovery pattern.

When we create a new microservice, we register it as a resource in the service registry, along with its dependencies on other resources. All instances of that microservice will then be tagged with that resource name. When a microservice makes requests, a router distributes them among the registered instances carrying the target resource tag.

Server-Side Discovery example between two microservices.

On top of that functionality, service discovery routers can also act as load balancers, by defining a strategy for distributing requests, and as circuit breakers, by not forwarding requests to downed or high-latency instances.
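
To make the moving parts concrete, here is a toy, in-memory sketch of the pattern in Python. All names are hypothetical, and a real deployment would rely on something like Kubernetes Services, Consul or Eureka instead.

import itertools

class ServiceRegistry:
    """Maps a resource name to the registered instances carrying its tag."""

    def __init__(self):
        self._instances: dict[str, list[str]] = {}

    def register(self, resource: str, host: str) -> None:
        self._instances.setdefault(resource, []).append(host)

    def instances(self, resource: str) -> list[str]:
        return self._instances.get(resource, [])

class Router:
    """Round-robin load balancer over the instances of a resource."""

    def __init__(self, registry: ServiceRegistry):
        self._registry = registry
        self._cursors = {}

    def route(self, resource: str) -> str:
        # A real router would also skip unhealthy instances (circuit breaking).
        if resource not in self._cursors:
            self._cursors[resource] = itertools.cycle(self._registry.instances(resource))
        return next(self._cursors[resource])

registry = ServiceRegistry()
registry.register("catalog-service", "10.0.0.1:8080")
registry.register("catalog-service", "10.0.0.2:8080")

router = Router(registry)
print(router.route("catalog-service"))  # 10.0.0.1:8080
print(router.route("catalog-service"))  # 10.0.0.2:8080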

Extra tip: when defining an API between services, instead of using HTTP/REST you might want to consider an RPC protocol such as gRPC. It has three main benefits for server-to-server communication: it is faster than HTTP/REST, strongly typed, and can also handle data streams. Here is a good article going more in-depth into using gRPC in microservices.
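
As a hedged sketch of what such a call can look like in Python: the service definition and generated modules below are hypothetical, and would come from running protoc on a catalog.proto containing something like "service Catalog { rpc GetProduct(ProductRequest) returns (ProductReply); }".

import grpc

# Hypothetical modules generated by protoc from catalog.proto.
import catalog_pb2
import catalog_pb2_grpc

channel = grpc.insecure_channel("catalog-service:50051")
stub = catalog_pb2_grpc.CatalogStub(channel)

# A strongly typed, binary-encoded request instead of hand-built JSON.
reply = stub.GetProduct(catalog_pb2.ProductRequest(product_id="product-42"))
print(reply)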

Sagas

There is also a pattern that uses events for synchronous requests between microservices: the saga pattern. When a microservice receives a request, it starts a saga by creating an event with the request data. Interested microservices read the event, handle it, and update it in turn. After all the microservices are done, the originating microservice reads the result and closes the saga.

Saga example for the creation of an order in an eCommerce app

Although the saga itself is synchronous, the interaction with the user does not have to be. As the status of the transaction is reflected in the event, the frontend can poll the microservice for updates or use a webhook to view the real-time status of the operation.

Using this pattern allows for the decoupling of the synchronous operation from the user request and enables event sourcing on the operation with all of its benefits. However, this is more complicated to implement and coordinate than direct queries.

For errors, there should be a "cancelled" event that microservices can produce and consume to indicate that they should roll back any changes made in the saga. For service downtime or high latency, the microservice that started the saga may also want to implement a timeout to cancel long-running or blocked executions.
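
As a rough sketch, here is what one saga participant (a stock service in the eCommerce example) might look like in Python. The event names, payload fields and stock logic are all assumptions for illustration; the important part is that the participant either advances the saga or emits the compensating "cancelled" event.

class OutOfStockError(Exception):
    pass

_STOCK = {"sku-1": 5}
_RESERVED: dict[str, dict] = {}

def reserve_stock(order_id: str, items: dict) -> None:
    for sku, qty in items.items():
        if _STOCK.get(sku, 0) < qty:
            raise OutOfStockError(sku)
    for sku, qty in items.items():
        _STOCK[sku] -= qty
    _RESERVED[order_id] = items

def release_reserved_stock(order_id: str) -> None:
    # Compensating action: undo whatever this service did for the saga.
    for sku, qty in _RESERVED.pop(order_id, {}).items():
        _STOCK[sku] += qty

def handle_order_saga_event(event: dict, publish) -> None:
    if event["status"] == "cancelled":
        release_reserved_stock(event["order_id"])  # roll back our changes
        return
    try:
        reserve_stock(event["order_id"], event["items"])
        publish({**event, "status": "stock_reserved"})  # our step is done
    except OutOfStockError:
        # Tell every participant to undo what they have done so far.
        publish({**event, "status": "cancelled", "reason": "out_of_stock"})

handle_order_saga_event(
    {"order_id": "order-1", "items": {"sku-1": 2}, "status": "order_created"},
    publish=print,
)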

Extra tip: the delay of synchronous operations is not always a negative, as immediate responses are not always the best decision for UX. As humans, we feel that operations have more weight and importance if there is a delay of a couple of seconds. If our users make an order or a payment and the application responds instantly, they might not trust that the operation even happened. This UX article explores the concept further.

Asynchronous interactions between microservices

When the application must return an immediate, non-error response, we compromise on consistency for speed by sending the latest data available. We achieve this by querying data between microservices asynchronously.

Even if part of the system fails, the request itself will not fail. The microservice that receives the request simply returns the latest data available, and the system will eventually catch up once the failing components are active and healthy again.

For the most part, users prefer a responsive and fast interface over a slow interface, even if the data is older. Because of this, most requests in a distributed application are usually handled asynchronously, especially reads.

Messages and orchestrators

Messaging is one of the simplest ways to handle asynchronous interactions between microservices. The initial microservice puts a message in a queue for the recipient to pick up and act on. That action can be a simple write to a database, processing some data, or executing an operation in an external system.

Messages are different from events: rather than describing a change in state that other services react to, a message signifies an action or execution sent explicitly to one or more recipients.

An orchestrator is a microservice that organises, starts and controls the flow of operations in many microservices. An orchestrator is usually defined by a set of workflows that handle business logic and interactions with other microservices. A workflow is dynamic, and can also be designed to parallelise actions and choose what action to execute based on data or execution results.
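
As a rough illustration in Python, an orchestrator workflow for the restocking example might look like the sketch below. The step functions are stubs standing in for messages or RPCs to other microservices, and all names are made up.

def check_warehouse_capacity(delivery: dict) -> bool:
    return delivery["units"] <= 100  # stub decision rule

def update_inventory(delivery: dict) -> None:
    print(f"inventory += {delivery['units']} x {delivery['sku']}")

def schedule_overflow_storage(delivery: dict) -> None:
    print(f"overflow storage scheduled for {delivery['sku']}")

def notify_catalog_service(delivery: dict) -> None:
    print(f"catalog notified: {delivery['sku']} restocked")

def restock_workflow(delivery: dict) -> None:
    # The flow branches on intermediate results instead of being a fixed pipeline.
    if check_warehouse_capacity(delivery):
        update_inventory(delivery)
    else:
        schedule_overflow_storage(delivery)
    notify_catalog_service(delivery)

restock_workflow({"sku": "sku-1", "units": 40})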

Orchestrator example for restocking in an eCommerce app

Orchestrators are usually reserved for complex operations that require interacting with multiple microservices and/or external systems. They are especially useful for dynamic tasks started by a user or an external event, like handling new product deliveries to an eCommerce application or the GDPR deletion of user data.

Extra tip: the value orchestrators provide as microservices is the huge amount of flexibility they offer and the ability to control the execution flow of a job dynamically. However, designing and implementing one is not a simple task, and it carries a maintenance cost. If the execution flow is simple, consider embedding it in the originating microservice. If it's a static or recurring task, it might be a better fit for other automation tools, like Kubernetes Jobs and CronJobs.

Materialised views

If a microservice can read data from other microservices, one of the best ways to enable fast asynchronous reads is through a local materialised view. This pattern is based on the event sourcing pattern, the database per service pattern and the eventual consistency model, and takes advantage of their benefits and guarantees.

To create the view, we start with an empty local data store. We then listen to the events produced by the relevant microservices and update the local store with their data. If the application handles event sourcing correctly, the view built from the events will be a complete replica of the relevant application state and data. The microservice can then serve requests directly from this local store without fetching from other microservices every time.
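
A minimal sketch of building such a view with a plain Kafka consumer in Python follows. The topic, consumer group and payload shape are assumptions; in practice, Kafka Streams or Kafka Connect would replace most of this loop.

import json

from confluent_kafka import Consumer

view: dict[str, dict] = {}  # the local materialised view, keyed by object id

consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "catalog-view-builder",  # hypothetical consumer group
    "auto.offset.reset": "earliest",     # replay the topic to rebuild the view
})
consumer.subscribe(["product-updates"])  # hypothetical topic

while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error() or msg.key() is None:
        continue
    # Last write wins per key, so the view converges on the latest state.
    view[msg.key().decode("utf-8")] = json.loads(msg.value())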

Materialised view example between two microservices

Although this pattern requires a database and an event system, and depends on all the patterns and models above being enforced across the application, it is one of the simplest, cleanest and most performant solutions for asynchronous data reads. There are also tools like Kafka Streams or Kafka Connect that facilitate its implementation. Additionally, it has benefits beyond handling requests, like presenting a data view for easy debugging and audits, and enabling data rollbacks and reprocessing.

As I mentioned in the previous post, enforcing schemas in event-driven asynchronous operations is important, and even more so with this pattern. Knowing that the data will always be consumed correctly is fundamental for good disaster recovery. To do that, we can use tools like Avro, Protobuf or JSON-Schema to validate events before they are committed.
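
For example, here is a short sketch of gating events on a JSON Schema before they are produced. The schema and event shape are made-up examples; Avro or Protobuf with a schema registry would play the same role in a Kafka setup.

from jsonschema import ValidationError, validate

PRODUCT_UPDATE_SCHEMA = {
    "type": "object",
    "properties": {
        "product_id": {"type": "string"},
        "price": {"type": "number", "minimum": 0},
        "stock": {"type": "integer", "minimum": 0},
    },
    "required": ["product_id", "price", "stock"],
    "additionalProperties": False,
}

def publish_if_valid(event: dict, publish) -> None:
    try:
        validate(instance=event, schema=PRODUCT_UPDATE_SCHEMA)
    except ValidationError as err:
        # Reject the event before it is committed to the event system.
        raise ValueError(f"rejected malformed event: {err.message}") from err
    publish(event)

publish_if_valid({"product_id": "product-42", "price": 19.99, "stock": 3}, print)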

Extra tip: one of the complexities of event sourcing is data retention and deletion: events are very good at remembering, but not very good at forgetting. This is usually fine, but it quickly becomes a problem when implementing security or privacy requirements such as GDPR in event systems. For example, this article goes a bit more in-depth into handling GDPR with Kafka. When implementing one of these features, always remember to propagate changes to the views as well.

Conclusion

In this two-part blog post, I gave a general overview of what microservices and events are, and described some of the patterns they use to interact with one another in an application. This is not a comprehensive list, but I hope you found it informative and interesting to read. The microservices world is extensive and ever-evolving, so I invite you to dive deep, keep learning and make cool things.
