Designing applications at scale — Part 1: Microservices and Events

Jose Antonio Diaz Mata
Published in Adevinta Tech Blog · 7 min read · Feb 24, 2022

You might have a working prototype application that you want to deliver to a wide audience, or a legacy system that has trouble handling traffic. Either way, the question is the same: how do you prepare a system that can scale that application up to millions of users?

In this post, I share some design tips and guidelines we use at Adevinta that will help you get a clean and performant distributed application up and running quickly. In this first article, we’ll start with vertical vs horizontal scaling, then go through microservices and events. In the second article, I’ll explain how to handle requests and share data between microservices.

Vertical scaling vs horizontal scaling

Vertical scaling means adding more resources to the hardware that runs the application, and it’s the first solution you should consider. It’s the most common approach for small and medium-sized projects, as it’s relatively easy to achieve using cloud services and usually doesn’t involve development costs or culture changes.

If vertical scaling isn’t cost-effective or if high availability of the service is important, horizontal scaling is another option. Horizontal scaling involves adding new processing units to the application that share the workload, creating a distributed system.

Vertical scaling vs Horizontal scaling

Once implemented, horizontal scaling simplifies future scaling: you just increase the number of machines in the system. Distributed systems also offer high availability, as the system can keep working even if some processing units fail, which in turn enables zero-downtime deployments when releasing new versions of the application’s components.

Divide and conquer with microservices

The simplest distributed system runs multiple instances of a single service component that can handle concurrency. This is the cheapest way to implement horizontal scaling, and it works in the short term for simple to medium-sized applications.

However, this solution quickly becomes cumbersome. Because we’re just replicating a monolith, we still have to deal with all the downsides of a single component and a complex codebase: changes and fixes are hard to implement and deploy, scalability can’t be tuned per feature, and adopting new technologies is blocked by the existing stack. These issues arise because all the application logic lives in a single component where all features are tightly coupled with one another.

Splitting the monolith into microservices

The better long-term solution is to split the logic of the application into microservices, which are dynamic, easier to understand, and can be worked on individually by developers. This makes implementation, testing, deployment and maintenance of the application easier.

One of the first steps when migrating a monolith to microservices is identifying which areas can be broken into isolated units, and that isn’t a trivial task.

Breaking down the monolith

In my opinion, the best way to define and separate the microservices of an application depends on the complexity of its business logic. A simple streamlined design with few components can fit a simple application, whilst another system with a lot of features and interactions may require a more complex design.

If the application is simple:

First, define a single source of truth component as the centre of the system. Then, separate the microservices by type of interaction or vertical. This separation ensures that each microservice can have a defined API and behaviour, independent of its specific implementation.

For example, this could be a design for an IoT application that allows users to schedule and book meetings in a co-working space:
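As a rough sketch (the service names here are my own illustrative assumptions, not the actual design), the split could look like this:

```python
# Hypothetical service layout for the co-working IoT example.
# The booking service is the single source of truth at the centre.
SOURCE_OF_TRUTH = "booking-service"   # owns rooms, schedules and bookings

# One microservice per type of interaction, or vertical.
VERTICALS = {
    "web":     "web-api-service",       # browser traffic
    "mobile":  "mobile-api-service",    # app traffic
    "devices": "iot-gateway-service",   # room sensors and door panels
    "email":   "notification-service",  # booking confirmations and reminders
}
```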

If the application is complex:

You first need to define domains that map to each part of the business logic within the application, and to define how they interact with one another. Each domain should be owned by a development team, so that the team can focus on its domain’s design and implementation independently. Each domain should also have a single source of truth for its entities, owned by the team (even if they all exist on the same infrastructure), and exposed through an API to any other domain that needs access to it.

Inside each domain, separate the microservices based on business-logic entities or processes, making sure that each component has a single, defined purpose. This pattern makes each component in a complex system easier to understand and simpler to maintain.

For example, an eCommerce application could be divided into the following domains, each with its own microservices:
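As a hedged sketch (the domain and service names are illustrative assumptions, not a reference architecture), the breakdown might look like this:

```python
# Hypothetical eCommerce domains, each owned by one team with its own
# source of truth and its own microservices.
DOMAINS = {
    "catalogue": ["product-service", "search-service"],
    "orders":    ["cart-service", "checkout-service", "payment-service"],
    "customers": ["account-service", "auth-service"],
    "shipping":  ["fulfilment-service", "tracking-service"],
}
```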

Extra tip: Don’t fall into the trap of systematically creating a new microservice for every new feature or interface. Although it’s fast and simple, the system will grow to a point where it becomes impossibly hard to maintain because of low observability and high complexity. For the system to stay healthy, the definition of what each microservice does should be able to change and adapt to business requirements. Always keep the bigger picture in mind when deciding how and where a feature should be implemented.

Distribute application data with events

Having a single source of truth for the data of the application is almost always a necessity. A lot of problems can arise if the application handles multiple sources of truth, including lack of visibility, lack of clarity, out-of-date data, desynchronisation and version-control issues.

However, to avoid coupling issues in a distributed system, each microservice should handle its own data and use an API to share it with the rest of the application. One of the most popular ways to organise and centralise data sharing between microservices in a distributed system is through events and event streaming.

An event is an isolated, atomic record of a change in state for a given entity. It’s produced by the microservices of the distributed system that carry out or register the state change, managed and shared by the event streaming platform, and then consumed by the relevant microservices.
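As a minimal sketch in Python (the field names are assumptions, not a fixed schema), an event record could look like this:

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any

# Minimal sketch of an event record; field names are illustrative.
@dataclass(frozen=True)
class Event:
    entity_id: str            # the entity whose state changed
    event_type: str           # e.g. "TableBooked", "OrderServed"
    payload: dict[str, Any]   # data describing the state change
    timestamp: datetime = field(
        default_factory=lambda: datetime.now(timezone.utc)
    )
```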

Example of a restaurant working with events (icons by Vecteezy)

Asynchronous state with event sourcing

If we want the events to be the main single source of truth for the application, we can use the event sourcing pattern. With this pattern, all the application’s state changes should be comprehensively recorded with events, including the creation, modification and deletion of all the application’s relevant data.

This makes the events themselves act as an asynchronous, timestamped record of the state of the application. With this record, new microservices can be onboarded using past event data, the status of the application can be determined at any given point in time, and errors can be recovered from by replaying events.
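For instance, a consumer could rebuild its view of the state by folding past events in order. Here is a minimal sketch, reusing the hypothetical Event record from above:

```python
# Rebuild application state up to a point in time by replaying events in
# order, merging each event's changed fields into the entity's state.
def replay(events, until=None):
    state = {}
    for event in sorted(events, key=lambda e: e.timestamp):
        if until is not None and event.timestamp > until:
            break
        if event.event_type.endswith("Deleted"):
            state.pop(event.entity_id, None)  # deletion removes the entity
        else:
            current = state.get(event.entity_id, {})
            state[event.entity_id] = {**current, **event.payload}
    return state
```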

The simplest way to implement this is to embed the entity’s data, with all its attributes, inside the event itself. That way, to get all the information they need about an entity, microservices only have to consume the latest event produced for it.
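Continuing the hypothetical schema above, such a “fat” event carries the full entity snapshot in its payload, so consuming the most recent event per entity is enough:

```python
# A "fat" event: the payload holds the entity's full state, not just the
# fields that changed, so the latest event per entity tells the whole story.
event = Event(
    entity_id="booking-42",
    event_type="BookingUpdated",
    payload={
        "room": "A-101",
        "owner": "jose",
        "start": "2022-02-25T09:00:00Z",
        "end": "2022-02-25T10:00:00Z",
        "status": "confirmed",
    },
)
```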

Event sourcing example

To avoid inconsistency issues with event sourcing, it’s a good idea to enforce schemas and versioning on the topics of the messaging system, just as you would for a normal API. Events can morph and change, but they should keep using the same streaming queue as long as the changes are compatible with all the other events. If a new version introduces a breaking change, its events should be produced to a new queue. Consumers can then keep using the deprecated queue while they migrate to the new one, and once the migration is complete, the old queue can be removed safely.
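One way to enforce this (a sketch assuming JSON events validated with the jsonschema library; Avro or Protobuf with a schema registry is a common alternative) is to validate against a versioned schema before producing, and to reserve a new topic for breaking changes:

```python
import jsonschema  # assumes the jsonschema package is available

# v1 schema for booking events; backwards-compatible additions stay here.
BOOKING_EVENT_V1 = {
    "type": "object",
    "required": ["entity_id", "event_type", "payload"],
    "properties": {
        "entity_id": {"type": "string"},
        "event_type": {"type": "string"},
        "payload": {"type": "object"},
    },
}

TOPIC_V1 = "bookings.events.v1"  # deprecated once v2 exists
TOPIC_V2 = "bookings.events.v2"  # breaking schema changes go here

def validate_for_v1(event: dict) -> str:
    """Raises jsonschema.ValidationError if the event breaks the v1 schema."""
    jsonschema.validate(instance=event, schema=BOOKING_EVENT_V1)
    return TOPIC_V1
```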

To avoid synchronisation issues, treat the streaming platform as authoritative: if an event is present in it, the change of state has happened, and all components of the system must ensure that the change is reflected. Conversely, if a microservice fails and the event is not produced, it should roll back all changes locally so that no component in the system changes.

Extra tip: A simple way to ensure that changes in a microservice are not committed unless the corresponding event is produced is to split the operation into two parts with the Listen to Yourself pattern. First, create and produce the event with the registered changes. Then, listen to the event you just produced and commit the local changes it requires. If the microservice fails during the local commit, it will simply retry the operation when it restarts, until the event is consumed. This ensures that the changes stay consistent.
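A minimal sketch of the pattern, assuming Kafka through the confluent_kafka client (the topic, group and apply_to_local_db names are hypothetical):

```python
import json
from confluent_kafka import Producer, Consumer

TOPIC = "bookings.events.v1"  # hypothetical topic name

# Step 1: produce the event first; nothing is committed locally yet.
producer = Producer({"bootstrap.servers": "localhost:9092"})
event = {"entity_id": "booking-42", "event_type": "BookingCreated",
         "payload": {"room": "A-101", "status": "confirmed"}}
producer.produce(TOPIC, key=b"booking-42", value=json.dumps(event).encode())
producer.flush()

# Step 2: listen to yourself, committing the local change only on consumption.
consumer = Consumer({
    "bootstrap.servers": "localhost:9092",
    "group.id": "booking-service",    # the producing service's own group
    "auto.offset.reset": "earliest",
    "enable.auto.commit": False,      # only acknowledge after the local write
})
consumer.subscribe([TOPIC])
while True:
    msg = consumer.poll(1.0)
    if msg is None or msg.error():
        continue
    apply_to_local_db(json.loads(msg.value()))  # hypothetical local commit
    consumer.commit(msg)  # a crash before this line means retry on restart
```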

Conclusion

Applying microservices and events requires not only a development cost, but also a change in culture, organisation, processes and policies. However, through using these techniques, you can deliver reliable software more frequently and at a faster pace. I’d say that adopting microservices and events is a change that is worth making in the long run.

This first post covers only microservices and events in an isolated way, but there are also multiple design decisions to make when defining how these elements interact with one another within an application. In part two of this blog, I’ll dive a little deeper into how to handle requests and share data between microservices.

