Mind-mapping Microservices Design Patterns

Ruchi Tayal
14 min read · Aug 25, 2020

Microservices-based architectures are among the most popular solution approaches these days. There are several considerations and design options possible when it comes to implementing a solution using a microservices-based approach, and these multiple design choices can make it a bit overwhelming. In this write-up, I’ve taken the important considerations one needs to make and the options available to address those problems. Some design options create the need for additional decisions. I’ve put this co-relation into a simple mind-map, so you can view it all in a single frame of reference. If you are starting to build solutions using microservices, or looking at moving your monolithic application to a microservices-based design, then read on!

1. How to decompose the application into microservices?

The first step towards creating a microservice-based application is to decompose the application into smaller components. The challenge here is to define the boundaries of the microservices.

To create cohesive and loosely coupled applications, it is helpful to follow these guidelines (some of them apply to traditional object-oriented design as well):

  • Single Responsibility Principle (SRP) — a class should have a single responsibility, and hence a single reason to change.
  • Common Closure Principle (CCP) — all classes that need to change for the same business reason should be part of the same package.

Hence, in the microservice architecture, services should be designed around business capabilities, not horizontal layers such as data access or the messaging bus.

One of the following microservices design patterns can help decompose the application into easily manageable services.

Decompose by Subdomain

This means defining services as per Domain-Driven Design (DDD): modeling the software services on the domains and sub-domains of the business, and evolving the software as the understanding of the domain evolves and changes. In this process, business domains and software services tend to share a common language. It can result in a reasonably stable architecture, as business capabilities are relatively stable, and the services designed this way would be cohesive and loosely coupled. The challenge lies in defining the subdomains, which requires a good understanding of the business.

Decompose by Business capabilities

This means defining services corresponding to the business capabilities of the organization, i.e. modeling the services on the capabilities supported by the business groups. It can also result in a reasonably stable architecture, as business capabilities are relatively stable, and the services designed this way would be cohesive and loosely coupled. It is similar to grouping by sub-domains, but as the business grows and adds more and more capabilities, you could end up adding classes that are common to multiple services, the so-called ‘god classes’.

2. What’s the database architecture in a microservices application?

The next step would be to look at how the data would be stored and accessed by these microservices. In the case of a monolithic application, the data layer would contain all the tables, and data would be accessed by the application using SQL queries, a DAO (Data Access Object) layer, or ORM tools like Hibernate.

In the case of microservices, where each service is designed to be cohesive and loosely-coupled, each service would need to be able to access the data it needs to work with.

Shared Database

As in monolithic applications, microservices can continue to use a single shared database for all their data. Transactions can then rely on the well-known ACID properties to keep the data consistent across operations from different services. A single database is simpler to manage and sufficient if the application is small. Also, multiple services can follow the same schema/model without too many changes.

Database per service

Services are loosely coupled and need to be developed and changed independently. Given this, one of the patterns is to keep each service’s data in its own database. The service accesses data only from its own database, and the data in that database is private, accessible only through that service’s APIs. To keep a service’s persistent data private, the entire database server could be specific to the service, or a particular schema or set of tables (in the case of SQL databases) could be specific to the service. This allows each service to modify its schema at will, and also allows each service to choose the database that suits its needs the most. However, business transactions that span services, or combining data from multiple services for the end-user, can be challenging.

Certain business processes will require communication between multiple services. In the case of monolithic applications, distributed transactions could be achieved using two-phase commit (2PC). This mechanism does not work for microservices, as it would introduce run-time coupling between the applications, an anti-pattern per the principles of microservices.

Also, since each service has its own database, we cannot use ACID transactions that span services. Moreover, distributed transactions aren’t supported by modern message brokers and some NoSQL databases.

How do we then maintain data consistency?

Saga Pattern

A well-known pattern comes in handy if you need to maintain data consistency across services without using distributed transactions — the Saga pattern. In general terms, a saga is a long, involved story with a series of events. In this pattern, a series of local transactions is involved, hence the name “Saga”.

A transaction in one service updates the database within that service and coordinates with the other services to trigger the next transaction in the saga. As the transactions happen, there needs to be a way to coordinate each of these transactions between services. The following two ways are the most popular to support this coordination:

  • Choreography/Events — In general terms, the choreography is a sequence of moves, usually as in a dance form. In this pattern, one transaction in a local service publishes events that trigger the transaction(s) in the next service(s). Each service produces and listens to the other service’s events and decides to take appropriate action. In summary, the saga’s decision making and sequencing of business logic are distributed across services.
  • Orchestrator/Command — As a conductor arranges the music in an orchestral performance; an orchestrator communicates with the Saga participants and tells them what to do and persists their state in the database. In summary, a coordinator service is responsible for centralizing the saga’s decision making and sequencing of business logic.

This helps in keeping the data consistent across services without using distributed transactions. However, if a transaction fails, the saga executes a series of compensating transactions to roll back the changes made by the previous transactions, which makes the implementation complicated.
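An orchestrator-based saga with compensating transactions can be sketched as below. This is a minimal, in-process illustration; the step names and the way failures surface (plain exceptions) are hypothetical, and a real orchestrator would persist saga state and communicate with participants over messaging.

```python
class SagaStep:
    """One saga participant: a forward transaction plus its undo."""
    def __init__(self, action, compensation):
        self.action = action                # forward local transaction
        self.compensation = compensation    # compensating transaction

class SagaOrchestrator:
    """Runs steps in order; on failure, compensates completed steps in reverse."""
    def __init__(self, steps):
        self.steps = steps

    def execute(self):
        completed = []
        for step in self.steps:
            try:
                step.action()
                completed.append(step)
            except Exception:
                # A step failed: roll back by running the compensating
                # transactions of already-completed steps, newest first.
                for done in reversed(completed):
                    done.compensation()
                return "ROLLED_BACK"
        return "COMPLETED"
```

For example, if “create order” succeeds but “reserve credit” fails, the orchestrator runs the “cancel order” compensation, leaving the system consistent without a distributed transaction.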

How does a service in a choreography-based saga publish an event when it updates its data? One approach is to decompose the microservices based on DDD and organize the business logic of a service as a collection of DDD aggregates which emit domain events. These domain events are published by the service and consumed by other services.

In this approach, a graph of business objects is treated as a single unit for data changes, commonly referred to as the Aggregate pattern; each aggregate emits the domain events for its state changes.

The success of a saga requires the update to the database and the publishing of the event/message to be an atomic operation, in order to avoid inconsistencies. It is not possible to use a distributed transaction to update the database and the message broker atomically. There are 2 well-established patterns to achieve this: Transactional Outbox & Event Sourcing.

Transactional Outbox

Instead of directly sending the message to the message broker, the message is saved in the service’s database as part of the current transaction. This achieves internal consistency within the service. If the transaction is unsuccessful and rolled back, then no message is saved in the outbox table, thereby avoiding any “ghost” messages being sent.

In the case of a relational database, this is an additional “outbox” table, whereas in the case of NoSQL databases an additional attribute can be appended to the record being updated to track the message.

A separate message relay can be used to forward the messages stored in the outbox by publishing them as events to the message broker. This process can run asynchronously and read the messages in the outbox in the same order in which they were created. Once a message is successfully sent to the message broker, it can be deleted from the database or marked as processed.
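The write side of the pattern can be sketched as below, using SQLite for brevity. The table and column names (`orders`, `outbox`, `event_type`, `payload`) are hypothetical; the point is that the business update and the outbox insert share one local transaction.

```python
import json
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE orders (id INTEGER PRIMARY KEY, status TEXT)")
conn.execute("CREATE TABLE outbox (id INTEGER PRIMARY KEY AUTOINCREMENT, "
             "event_type TEXT, payload TEXT, processed INTEGER DEFAULT 0)")

def create_order(conn, order_id):
    # The business update and the outbox insert commit (or roll back)
    # together, so a failed transaction leaves no "ghost" message.
    with conn:  # one atomic local transaction
        conn.execute("INSERT INTO orders (id, status) VALUES (?, 'CREATED')",
                     (order_id,))
        conn.execute("INSERT INTO outbox (event_type, payload) VALUES (?, ?)",
                     ("OrderCreated", json.dumps({"order_id": order_id})))
```

The relay described above then reads unprocessed `outbox` rows and publishes them to the broker.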

There are 2 mechanisms to build the message relay that will publish the messages/events from the outbox table to the message broker.

Transaction log tailing

One option is to tail the database transaction log (also called the commit log). Committed inserts into the database OUTBOX table are recorded in the transaction log, which can be tailed/read, and each change is published as an event to the message broker. The mechanism of tailing the transaction log is specific to each database.

Polling Publisher

This solution publishes messages by polling the database’s outbox table. This option works with any SQL database but may not work with all NoSQL databases where the query pattern is more complex.
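A single polling pass can be sketched as below, against the same hypothetical `outbox` schema (`id`, `event_type`, `payload`, `processed`); a real relay would run this in a loop on a timer.

```python
import sqlite3

def poll_once(conn, publish, batch_size=10):
    """Poll the outbox table once and publish unprocessed rows in order."""
    rows = conn.execute(
        "SELECT id, event_type, payload FROM outbox "
        "WHERE processed = 0 ORDER BY id LIMIT ?", (batch_size,)).fetchall()
    for row_id, event_type, payload in rows:
        publish(event_type, payload)  # hand the message to the broker
        # Mark as processed only after a successful publish. If the relay
        # crashes in between, the row is re-sent on the next poll, so
        # delivery is at-least-once and consumers must be idempotent.
        with conn:
            conn.execute("UPDATE outbox SET processed = 1 WHERE id = ?",
                         (row_id,))
    return len(rows)
```

Marking rows after publishing trades duplicate delivery for never losing a message, which is the usual choice for this pattern.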

Event Sourcing

This helps solve the problem of atomically updating the database and publishing an event. The basis of this pattern is that whenever something happens to a domain entity, it can be treated as a domain event. Event sourcing persists the state of each domain entity as a sequence of state changes/domain events. Every time the state changes, a new event is appended to the list of events. This list can be replayed to reconstruct the current state of the entity. These events are persisted in an event store, which acts as a database of events. This is a reliable audit log of the changes made to a business entity.

The event store also acts as a message broker and provides APIs for other services to subscribe to events; subscribers are notified whenever an event is added.
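The replay idea can be sketched for a hypothetical account entity: state is never stored directly, only the events are, and the current balance is rebuilt by replaying them. The event names and the in-memory list standing in for a durable event store are assumptions for illustration.

```python
event_store = []  # in a real system: a durable, append-only event log

def append_event(entity_id, event_type, amount):
    # Every state change is recorded as an immutable event.
    event_store.append({"id": entity_id, "type": event_type, "amount": amount})

def current_balance(entity_id):
    # Replay the entity's events in order to reconstruct its state.
    balance = 0
    for e in event_store:
        if e["id"] == entity_id:
            if e["type"] == "Deposited":
                balance += e["amount"]
            elif e["type"] == "Withdrawn":
                balance -= e["amount"]
    return balance
```

Because the log is append-only, it doubles as the audit trail mentioned above.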

How do we implement queries?

In the case of the database-per-service model, queries that need data joined from multiple services are likely to be complex. Implementing queries that span services is supported by 2 known approaches.

Command Query Responsibility Segregation (CQRS)

As the name suggests, this model separates the responsibility for reads and writes into different models: commands are used to update the data, and queries are used to read the data. Commands (write operations) should ideally be task-based and can be placed in a queue for asynchronous processing. Queries (read operations) do not modify the database; they just read the data from a consolidated view. For greater isolation, the read and write data can be physically separated, which requires the write model to publish an event to update the read store. When this model is used with the event sourcing pattern, the event store is the write model, and a replay of those events serves the read side as well, where applications use queries to reconstruct the state of the entity.
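The separation can be sketched as below: a command mutates the write model and emits an event, an event handler keeps a denormalized read view in sync, and queries touch only that view. The entity fields and handler names are hypothetical, and the event delivery is a direct call here rather than a real queue.

```python
write_store = {}   # authoritative write model, keyed by order id
read_view = []     # denormalized view optimized for queries

def handle_create_order(order_id, customer):
    # Command side: validate and update the write model, then emit an event.
    write_store[order_id] = {"customer": customer, "status": "CREATED"}
    on_order_created({"order_id": order_id, "customer": customer})

def on_order_created(event):
    # Event handler keeping the read model in sync with the write model.
    read_view.append({"order_id": event["order_id"],
                      "customer": event["customer"]})

def query_orders_for(customer):
    # Query side: read-only access to the consolidated view.
    return [o for o in read_view if o["customer"] == customer]
```

With physical separation, `on_order_created` would run in a different service, consuming the event from a broker, which is where the eventual consistency of CQRS comes from.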

API composition

Another way to implement queries that combine data from multiple services is to define an API composer. This invokes the individual services that own the data and performs an in-memory join of the results to form the end result. The API Gateway (described below) is a classic example of this pattern.
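The in-memory join can be sketched as below. The two service functions are stubs standing in for real HTTP calls to the owning services; the endpoints and field names are hypothetical.

```python
def customer_service(customer_id):
    # Stub for a call such as GET /customers/{id} on the customer service.
    return {"customer_id": customer_id, "name": "Alice"}

def order_service(customer_id):
    # Stub for a call such as GET /orders?customer={id} on the order service.
    return [{"order_id": 1, "customer_id": customer_id, "total": 40}]

def get_customer_orders(customer_id):
    # The composer fans out to the services that own the data,
    # then joins the responses in memory.
    customer = customer_service(customer_id)
    orders = order_service(customer_id)
    return {"name": customer["name"], "orders": orders}
```

A real composer would issue the two calls concurrently and handle partial failures, since each downstream service can fail independently.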

3. How to deploy the services?

Each service in the microservice architecture is deployed as a set of service instances for better throughput and scalability. There are various ways of deploying services on hosts.

Service instance per host

The simplest form of deployment, and the purest form of segregation/isolation, is to deploy each service instance on its own host. In this approach, there is no possibility of any resource conflict among services, and each service can use the maximum available resources on the host. It is also straightforward to manage, monitor, and change the deployment of each service instance. However, the number of hosts required grows with the number of services.

Multiple service instances per host

A more efficient way is to have instances of different services run on a single host (virtual or physical machine). The services can be deployed as multiple instances within the same process, or with each instance as a separate process. This allows more efficient resource utilization but, as a result, also introduces some risk of resource conflicts and limits the resources available to each instance. If multiple services are deployed within the same process, it is hard to monitor the resource utilization of each service.

Service instance per VM

Developers can choose to bundle/package their applications as a VM image and then deploy the VMs on the cloud environment using only the bare compute instances of the cloud environment. The service can be scaled by increasing the number of VM instances. Also, all the machinery for running the service is part of the VM image, so the way to start and stop the service is the same across all deployments.

Service instance per container

A more modern concept is to bundle/package the application/service as a container image (the most popular being Docker) and deploy each service instance as a container. There are multiple mechanisms to orchestrate and cluster these containers, like Kubernetes, Mesos, etc., which further simplify the management and monitoring of the containers. Containers are faster to start as compared to VMs, and multiple containers can be deployed on the same VM instance. While this isolates each service instance within a container, it also puts limits on the resources consumed by the service.

Serverless

This is the latest kid on the block and the leanest mechanism of deployment as it hides all the complexity of servers, resources, containers from the application developers. The developers just need to package their code and upload it to the cloud along with the desired performance requirements. Each cloud provider has its own mechanism of running the code as a service instance.

4. How do the clients access the microservices?

Since microservices provide specific functionality via fine-grained APIs, the clients may need to interact with multiple services to get a business response. In some cases, the types of clients may differ and yet need to be served in the same fashion. To address this, one of the following options can be used.

API Gateway

This service becomes the entry point for all calls coming from external clients. It can then route the requests to the other services by simply proxying/routing a request to a specific service, or by fanning out a request to multiple services. All clients can call the same API, which is handled by this service and dealt with appropriately. It could also provide a different API for each type of client.
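The routing core of a gateway can be sketched as below. The route table and service names are hypothetical; a real gateway would also handle authentication, rate limiting, and fan-out.

```python
ROUTES = {
    "/orders": "order-service",
    "/customers": "customer-service",
}

def route(path):
    # Proxy the request to the service owning the longest matching
    # prefix, so /orders/1/items still reaches the order service.
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path.startswith(prefix):
            return ROUTES[prefix]
    return None  # no owning service: the gateway would answer 404
```

Longest-prefix matching lets more specific routes (say, a dedicated mobile API under a deeper path) shadow broader ones without reordering the table.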

Backends for frontend

This is a variation of the API gateway pattern mentioned above. Instead of having a single service/gateway to handle different types of clients by providing different APIs, this pattern defines a separate API gateway that provides a single API for each different type of client.

5. How would the gateway find other services?

In monolithic applications, it is easy to find other components/services as they are part of the same application and the applications are hosted on machines with a dedicated IP address and port. In microservices, the independent services can be deployed separately and may be running in containerized or virtualized environments or the VMs may be assigned IP addresses dynamically. In such cases, the location of the service is not so well-known. There needs to be a way for services to be able to find/discover other services.

Clients can use client-side discovery or server-side discovery to find the location of the service to which they want to send requests. Also, for services to be found through either of these discovery mechanisms, they need to have their location registered in a central place.

Service Registry

A Service Registry is like a database of services with their locations and instance details. Services can register themselves on startup and deregister on shutdown. Clients or other services needing to use a service look up the Service Registry to find the coordinates of the needed service. This concept is similar to Apache ZooKeeper or etcd. Services can register themselves with the service registry (the self-registration pattern), or a 3rd party can register service instances with it (the 3rd-party registration pattern).
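The registry’s core operations can be sketched as below, with self-registration and a client-side lookup. The in-memory dictionary and random instance selection are simplifying assumptions; a real registry also needs health checks and lease expiry so crashed instances disappear.

```python
import random

registry = {}  # service name -> list of "host:port" instance locations

def register(name, host, port):
    # Called by a service instance on startup (self-registration).
    registry.setdefault(name, []).append(f"{host}:{port}")

def deregister(name, host, port):
    # Called on graceful shutdown.
    registry[name].remove(f"{host}:{port}")

def lookup(name):
    # Client-side discovery: pick one instance (randomly here; real
    # clients often round-robin or use health-weighted selection).
    instances = registry.get(name)
    if not instances:
        raise LookupError(f"no instances registered for {name}")
    return random.choice(instances)
```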

Client-side discovery

When a client needs to make a request to a service, it can look up the service registry itself. In this case, the logic of searching for the service location in the registry needs to live on the client side. However, this pattern allows the client to reach services in fewer network hops.

Server-side discovery

Instead of the client directly accessing the service registry as mentioned above, the client just calls an intermediate router, such as a load balancer, which in turn invokes the service registry to get the service instance location. The router then forwards the request to the service instance found. The client code is much lighter in this case, as it does not need the logic for accessing the service registry.

6. How do the services communicate with each other?

Synchronous communication between services introduces tight coupling between the applications. To take advantage of the decoupled model, services should, where possible, communicate with each other in an asynchronous manner.

Remote Procedure Invocation

RPI has long been one of the most-used patterns of communication among services, REST being one of the technologies that follow it. The client uses a request-reply model to make requests to another service. This model is simple and well known, and there is no intermediary or broker between the two services. Hence, the client needs to locate the service instances on its own, thereby using the client-side discovery pattern.

Domain-Specific Protocol

The services can use a protocol specific to their function to connect with each other, e.g. email services can use SMTP/IMAP.

Messaging

This is the most commonly used pattern for inter-service communication, and it is asynchronous in nature. It can use multiple styles of asynchronous communication, and multiple technologies provide support for these different styles. The messaging pattern can follow the standard request/response model, where the sender expects a reply; a notification model, where the sender just sends a message without expecting any reply; or a publish/subscribe model, where the sender sends the message to a common location without knowing who the receivers of the message will be. This pattern has added complexity due to the presence of a broker that makes the communication asynchronous.
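The publish/subscribe style can be sketched as below with a toy in-process broker; the topic names are hypothetical, and a production broker would add queues, durability, and acknowledgements.

```python
from collections import defaultdict

subscribers = defaultdict(list)  # topic -> subscriber handler callables

def subscribe(topic, handler):
    # A consumer registers interest in a topic.
    subscribers[topic].append(handler)

def publish(topic, message):
    # The broker fans the message out to every current subscriber;
    # the publisher never learns who, if anyone, received it.
    for handler in list(subscribers[topic]):
        handler(message)
```

The publisher’s ignorance of its receivers is exactly what keeps the services loosely coupled: adding a new consumer requires no change to the producer.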

Recall from the Saga pattern that the services in the “saga” use messaging to communicate with each other. CQRS also uses messaging to place tasks in an asynchronous queue.

Putting it all together

Mind-mapping all the patterns needed to address the concerns discussed so far:

With all of these considerations, we would have our application designed, deployed, and accessible. The next step is to ensure the application runs securely, is scalable, and performs well. In the next article, we will further look into observability, reliability, performance, and security patterns.
