Microservices Best Practices

Md Khadeer
13 min read · Jul 25, 2019


Microservices are a software development technique: a variant of the service-oriented architecture (SOA) style that structures an application as a collection of loosely coupled services. In a microservices architecture, services are fine-grained and the protocols are lightweight.

Before starting:

  • Know the advantages and disadvantages of microservices.
  • Avoid disastrous mistakes.
  • Make better technological decisions regarding microservices.

When selecting a technology for a microservice, it’s recommended to consider:

  • Maintainability
  • Fault-tolerance
  • Scalability
  • Cost of architecture
  • Ease of deployment

Some examples of frameworks and technologies that teams use for microservices:

  • Scrapy for web crawling
  • Celery + RabbitMQ for communication between microservices

This definition includes three microservice design principles:

  • Single purpose: each service should focus on one single purpose and do it well.
  • Loose coupling: services know little about each other. A change to one service should not require changing the others. Communication between services should happen only through public service interfaces.
  • High cohesion: each service encapsulates all related behaviours and data together. If we need to build a new feature, all the changes should be localized to just one single service.

When we model microservices, we should be disciplined across all three design principles. It is the only way to achieve the full potential of the microservice architecture. Missing any one of them would become an anti-pattern.

Without a single purpose, each microservice ends up doing too many things and grows into yet another “monolithic” service. We do not get the full benefits of the microservice architecture, yet we still pay its operational cost.

Without loose coupling, changes to one service affect other services, so we would not be able to release changes fast and safely, which is the core benefit of microservice architecture. More importantly, issues caused by tight coupling could be disastrous, e.g., data inconsistencies or even data loss.

Without high cohesion, we will end up with a distributed monolithic system — a messy set of services that have to be changed and deployed at the same time in order to build a single feature. A distributed monolithic system is often much worse than a centralised monolithic system because of the complexity and cost of coordination of multiple services, sometimes across multiple teams.

In the meantime, it’s also important to realise what a microservice is not:

  • A microservice is not a service that has a small number of lines of code or does “micro” tasks. This misconception comes from the name “microservice”. The goal of the microservice architecture is not to have as many small services as possible. Services could be complex and substantial as long as they meet the above three principles.
  • A microservice is not a service that is built with new technology all the time. Even though the microservice architecture allows teams to test new technology more easily, it is not the primary goal of microservice architecture. It is totally fine to build new services with the exact same technology stack, as long as the team benefits from decoupled services.
  • A microservice is not a service that has to be built from scratch. When you already have a well-architected monolithic app, avoid getting into the habit of building every new service from scratch. There might be opportunities to extract the logic from the monolithic service directly. Again, the above three principles should still hold.

Microservices Best Practices

  1. Keep microservices independent and loosely coupled.
  2. Try to reach the Glory of REST.
  3. Use distributed configuration.
  4. Use Spring HATEOAS, which helps you build navigable RESTful APIs.
  5. Monitor and log everything.
  6. Use application performance management (APM), which collects extra details to help you troubleshoot issues. Zipkin is one such tool.
  7. Practice continuous delivery.
  8. Use API gateways to aggregate data for specific clients.
  9. Consider Event Sourcing and CQRS (Command and Query Responsibility Segregation).

Best Practices

  1. Loose coupling
  2. Event-driven architecture
  3. Stateless design
  4. Asynchronous communication
  5. Timeouts
  6. Retries with backoff and jitter
  7. Self-contained services
  8. Max retries
  9. Rate Limit
  10. Rejection
  11. Circuit-breaker design pattern
  12. Consider separating data storage: Data should be private to each microservice, which becomes the owner of its data. Any access to data owned by a service should happen only through its APIs. Otherwise, multiple services end up accessing the database owned by one service, which couples them together. Architecture patterns such as CQRS (Command and Query Responsibility Segregation) come in handy for data that must be read by different kinds of users.
  13. Build separate teams for different microservices: Divide teams by microservice, with one team working on one microservice. Each team consists of a product manager and DevOps staff (development, QA, and Ops). Recall that microservices shine when they help organizations build cloud-native applications that can be released to the cloud frequently with very short lead times.
  14. Design domain-driven APIs: Design APIs with the business domain in mind, and keep implementation details out of the API design.
  15. Design cohesive services: Group functions that need to change together into a single unit rather than separate services. Otherwise, the resulting volume of inter-service communication amounts to tight coupling.
  16. Consider separating services for cross-cutting concerns: Design separate services for cross-cutting concerns such as authentication and authorization.
  17. Automate enough for independent deployment: Well-designed microservices can be deployed independently. Build and release automation improves the deployment process, leading to quicker releases and shorter overall lead time. This helps make microservices truly cloud-native: wrapped in containers and easily deployed to any environment, including the cloud. Good DevOps practice followed organization-wide helps achieve this objective.
  18. Failure isolation: A microservices-based architecture should isolate failures within independent microservices. Several of the design patterns listed above, such as timeouts, retries with backoff, rate limiting, and the circuit breaker, help achieve this.
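
To make the circuit-breaker item in the list above more concrete, here is a minimal sketch in Python. The names (`CircuitBreaker`, `CircuitOpenError`, `max_failures`, `reset_timeout`) are our own illustrative choices, not taken from any particular library, and a production setup would more likely rely on an existing library or the service mesh.

```python
import time


class CircuitOpenError(Exception):
    """Raised when a call is rejected because the circuit is open."""


class CircuitBreaker:
    """Hypothetical breaker: opens after max_failures consecutive failures."""

    def __init__(self, max_failures=3, reset_timeout=30.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.reset_timeout = reset_timeout
        self.clock = clock          # injectable so the breaker can be tested
        self.failures = 0
        self.opened_at = None       # None means the circuit is closed

    def call(self, func, *args, **kwargs):
        if self.opened_at is not None:
            if self.clock() - self.opened_at < self.reset_timeout:
                # Fail fast instead of piling load onto a struggling service.
                raise CircuitOpenError("circuit open; call rejected")
            # Timeout elapsed: half-open, let one trial call through.
        try:
            result = func(*args, **kwargs)
        except Exception:
            self.failures += 1
            if self.opened_at is not None or self.failures >= self.max_failures:
                self.opened_at = self.clock()  # open (or re-open) the circuit
            raise
        # A success closes the circuit and resets the failure count.
        self.failures = 0
        self.opened_at = None
        return result
```

A caller would wrap each remote call, e.g. `breaker.call(fetch_profile, user_id)` with a hypothetical `fetch_profile`, and treat `CircuitOpenError` as a signal to fall back or reject the request quickly.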

Microservice Strategies

Adopting the microservice architecture is not trivial. It could go awry and actually hurt engineering productivity. In this section, we will share seven strategies that helped us in the early stage of adoption:

  • Build new services with clear value
  • Monolithic persistent storage considered harmful
  • Decouple “building a service” and “running services”
  • Thorough and consistent observability
  • Not every new service needs to be built from scratch
  • Respect failures because they will happen
  • Avoid “microservice syndromes” from day one

Build New Services with Clear Value

One may think adopting a new server architecture means a long pause of product development and a massive rewrite of everything. This is the wrong approach. We should never build new services for the sake of building new services. Every time we build a new service or adopt a new technology, there must be clear product value and/or engineering value.

Product value should be represented by benefits we can deliver to our users. A new service must either make it possible to deliver that value or make it faster to deliver than building it in the monolithic Node.js app would. Engineering value should make the engineering team better and faster.

If building a new service does not have either product value or engineering value, we leave it in the monolithic app. It is totally fine if in ten years Medium still has a monolithic Node.js app that supports some surfaces. Starting with a monolithic app actually helps us model the microservices strategically.

Monolithic Persistent Storage Considered Harmful

A big part of modeling microservices is to model their persistent data storage (e.g., databases). Sharing persistent data storage across services often appears to be the easiest way to integrate microservices. However, it is actually detrimental and we should avoid it at all costs. Here is why.

First of all, persistent data storage is about implementation details. Sharing data storage across services exposes the implementation details of one service to the entire system. If that service changes the format of the data, or adds caching layers, or switches to different types of databases, many other services have to be changed accordingly as well. This violates the principle of loose coupling.

Secondly, persistent data storage does not capture service behaviors, i.e., how to modify, interpret, and use the data. If we share data storage across services, other services also have to replicate those behaviors. This violates the principle of high cohesion: behaviors in a given domain are leaked to multiple services. If we modify one behavior, we will have to modify all of these services together.

In microservice architecture, only one service should be responsible for a specific type of data. All the other services should either request the data through the API of the responsible service or keep a read-only non-canonical (maybe materialized) copy of the data.

This may sound abstract, so here is a concrete example. Say we are building a new recommendation service and it needs some data from the canonical post table, currently in AWS DynamoDB. We could make the post data available for the new recommendation service in one of two ways.

In the monolithic storage model, the recommendation service has direct access to the same persistent storage that the monolithic app does. This is a bad idea because:

  • Caching can be tricky. If the recommendation service shares the same cache as the monolithic app, we will have to duplicate the cache implementation details in the recommendation service as well; if the recommendation service uses its own cache, we won’t know when to invalidate its cache when the monolithic app updates the post data.
  • If the monolithic app decides to change to use RDS instead of DynamoDB to store post data, we will have to reimplement the logic in the recommendation service and all other services that access the post data as well.
  • The monolithic app has complex logic to interpret the post data, e.g., how to decide if a post should not be viewable to a given user. We have to reimplement that logic in the recommendation service. Once the monolithic app changes or adds new logic, we need to make the same changes everywhere as well.
  • The recommendation service is stuck with DynamoDB even if it is the wrong option for its own data access pattern.

In the decoupled storage model, the recommendation service does not have direct access to the post data, nor do any other new services. The implementation details of post data are retained in only one service. There are different ways of achieving this.

Ideally, there should be a Post Service that owns the post data and other services can only access post data through the Post Service’s APIs. However, it could be an expensive upfront investment to build new services for all core data models.
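
As a rough illustration of that ideal, the sketch below models a Post Service that keeps its storage private and exposes only an API. Everything here is hypothetical: the class names, the `hidden_from` field, and the in-memory dict standing in for DynamoDB are assumptions for the sake of the example, and the services are shown as in-process classes rather than real network services.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass(frozen=True)
class PostView:
    """Public shape of a post; hides how the data is stored."""
    post_id: str
    title: str


class PostService:
    """Hypothetical owner of post data. Storage is private to this class."""

    def __init__(self) -> None:
        # Stand-in for DynamoDB, RDS, or a cache; callers never see it.
        self._rows: Dict[str, dict] = {}

    def save_post(self, post_id: str, title: str,
                  hidden_from: frozenset = frozenset()) -> None:
        self._rows[post_id] = {"title": title, "hidden_from": hidden_from}

    def get_post_for_user(self, post_id: str, user_id: str) -> Optional[PostView]:
        """Visibility logic lives here only, not in every consumer."""
        row = self._rows.get(post_id)
        if row is None or user_id in row["hidden_from"]:
            return None
        return PostView(post_id=post_id, title=row["title"])


class RecommendationService:
    """Depends on the PostService API, never on its storage."""

    def __init__(self, posts: PostService) -> None:
        self._posts = posts

    def recommend(self, user_id: str, candidate_ids: List[str]) -> List[str]:
        views = (self._posts.get_post_for_user(pid, user_id)
                 for pid in candidate_ids)
        return [view.title for view in views if view is not None]
```

Because the recommendation code only sees `PostView`, the Post Service could change its database or add a cache without any change on the consumer side.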

There are a couple of more pragmatic approaches when staffing is limited, and they can actually be better depending on the data access pattern. In Option B, the monolithic app lets the recommendation service know when relevant post data is updated. Usually, this doesn’t have to happen immediately, so we can offload it to the queuing system. In Option C, an ETL pipeline generates a read-only copy of the post data for the recommendation service, plus potentially other data that is useful for recommendations. In both options, the recommendation service owns its data completely, so it has the flexibility to cache the data or use whichever database technology fits best.
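
Option B can be sketched with an in-process queue standing in for the real queuing system. The event shape (`type`, `post_id`, `title`) and the function names are assumptions made for this example, not a description of any actual pipeline.

```python
import queue


class RecommendationStore:
    """Read-only, non-canonical copy of post data owned by the recommendation service."""

    def __init__(self) -> None:
        self._posts = {}

    def apply(self, event: dict) -> None:
        # Hypothetical event types; the last event for a post wins.
        if event["type"] == "post_updated":
            self._posts[event["post_id"]] = event["title"]
        elif event["type"] == "post_deleted":
            self._posts.pop(event["post_id"], None)

    def titles(self) -> list:
        return sorted(self._posts.values())


def publish_post_update(events: queue.Queue, post_id: str, title: str) -> None:
    """Called by the monolithic app after it writes the canonical post record."""
    events.put({"type": "post_updated", "post_id": post_id, "title": title})


def drain(events: queue.Queue, store: RecommendationStore) -> int:
    """Consume all pending events; returns how many were applied."""
    applied = 0
    while True:
        try:
            event = events.get_nowait()
        except queue.Empty:
            return applied
        store.apply(event)
        applied += 1
```

The monolithic app only publishes events; how the recommendation service stores or caches its copy stays entirely its own decision.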

Decouple “Building a Service” and “Running Services”

If building microservices is hard, running services is often even harder. It slows the engineering teams down when running services is coupled with building each service and teams have to keep reinventing the ways of doing it. We want to let each service focus on its own work and not worry about the complex matter of how to run services, including networking, communication protocols, deployment, observability, etc. The service management should be completely decoupled from each individual service’s implementation.

The strategy of decoupling “building a service” and “running services” is to make running-services tasks service-technology-agnostic and opinionated, so that app engineers can fully focus on each service’s own business logic.

Thanks to recent advancements in containerization, container orchestration, service mesh, application performance monitoring, etc., decoupling “running services” is more achievable than ever.

Networking. Networking (e.g., service discovery, routing, load balancing, etc.) is a critical part of running services. The traditional approach is to provide libraries for every platform/language. It works but is not ideal because applications still need a non-trivial amount of work to integrate and maintain the libraries. More often than not, applications still need to implement some of the logic separately. The modern solution is to run services in a Service Mesh. At Medium, we use Istio and Envoy as the sidecar proxy. Application engineers who build services don’t need to worry about the networking at all.

Communication Protocol. No matter which tech stacks or languages you choose to build microservices, it is extremely important to start with a mature RPC solution that is efficient, typed, cross-platform and requires the minimum amount of development overhead. RPC solutions that support backward-compatibility also make it safer to deploy services even with dependencies among them. At Medium, we chose gRPC.

A common alternative is REST+JSON over HTTP, which has been the blessed solution for server communication for a long time. However, although that stack is great for the browsers to talk to servers, it is inefficient for server-to-server communication, especially when we need to send a large number of requests. Without automatically generated stubs and boilerplate code, we will have to manually implement the server/client code. Reliable RPC implementation is more than just wrapping a network client. In addition, REST is “opinionated”, but it can be difficult to always get everyone to agree on every detail, e.g., is this call really REST, or just an RPC? Is this thing a resource or is it an operation? etc.

Deployment. Having a consistent way to build, test, package, deploy and manage services is very important. All of Medium’s microservices run in containers. Currently, our orchestration system is a mix of AWS ECS and Kubernetes, but we are moving towards Kubernetes only.

We built our own system to build, test, package and deploy services, called BBFD. It strikes a balance between working consistently across services and giving each individual service the flexibility to adopt a different technology stack. It works by letting each service provide the basic information, e.g., the port to listen to, the commands to build/test/start the service, etc., and BBFD takes care of the rest.
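
BBFD itself is not shown in this post, so the following is only a guess at the general idea: each service declares a few basics, and shared tooling turns them into the same pipeline for everyone. The `ServiceManifest` fields and the `pipeline_steps` function are hypothetical names chosen for illustration and do not reflect BBFD's real format.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class ServiceManifest:
    """Hypothetical per-service description; field names are illustrative."""
    name: str
    port: int
    build_cmd: str
    test_cmd: str
    start_cmd: str


def pipeline_steps(manifest: ServiceManifest) -> List[str]:
    """Turn any manifest into the same ordered build/test/start sequence."""
    if not 0 < manifest.port < 65536:
        raise ValueError(f"invalid port for {manifest.name}: {manifest.port}")
    return [
        f"build: {manifest.build_cmd}",
        f"test: {manifest.test_cmd}",
        f"start: {manifest.start_cmd} (listening on {manifest.port})",
    ]
```

The point of the design is that the commands can differ per technology stack while the shape of the pipeline stays identical across services.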

Thorough and Consistent Observability

Observability includes the processes, conventions, and tooling that let us understand how the system is working and triage issues when it isn’t working. It covers logging, performance tracking, metrics, dashboards, and alerting, and it is critical for the microservice architecture to succeed.

When we move from one single service to a distributed system with many services, two things can happen:

  1. We lose observability because it becomes harder to do or easier to be overlooked.
  2. Different teams reinvent the wheel and we end up with fragmented observability, which is essentially low observability because it is hard to use fragmented data to connect the dots or triage any issues.

It is very important to have good and consistent observability from the beginning, so our DevOps team came up with a strategy for consistent observability and built tools in support of achieving that. Every service gets detailed DataDog dashboards, alerts, and log search automatically, which are also consistent across all services. We also heavily use LightStep to understand the performance of the systems.
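
One small piece of consistent observability is making every service log in the same structured format. The sketch below uses only Python's standard `logging` and `json` modules; the field names (`service`, `request_id`) and the `make_logger` helper are assumptions for illustration and are unrelated to how DataDog or LightStep are actually wired in.

```python
import json
import logging


class JsonFormatter(logging.Formatter):
    """Format every record as one JSON object with the same keys."""

    def __init__(self, service: str) -> None:
        super().__init__()
        self.service = service

    def format(self, record: logging.LogRecord) -> str:
        return json.dumps({
            "service": self.service,
            "level": record.levelname,
            "message": record.getMessage(),
            # request_id is attached via `extra=`; None means no request context.
            "request_id": getattr(record, "request_id", None),
        }, sort_keys=True)


def make_logger(service: str, stream=None) -> logging.Logger:
    """Build a logger that every service configures the same way."""
    logger = logging.getLogger(f"svc.{service}")
    logger.handlers.clear()          # avoid duplicate handlers on re-creation
    handler = logging.StreamHandler(stream)  # stream=None falls back to stderr
    handler.setFormatter(JsonFormatter(service))
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger
```

A service would then call something like `make_logger("post-service").info("fetched %d posts", 3, extra={"request_id": "r-1"})`, and log search can filter on the same keys for every service.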

Not Every New Service Needs to be Built from Scratch

In microservice architecture, each service does one thing and does it really well. Notice that it has nothing to do with how to build a service. If you migrate from a monolithic service, keep in mind that a microservice doesn’t always have to be built from scratch if you can peel it off from the monolithic app.

Here we take a pragmatic approach. Whether we should build a service from scratch depends on two factors: (1) how well Node.js is suited for the task and (2) how much it costs to reimplement in a different tech stack.

If Node.js is a good technical option and the existing implementation is in good shape, we peel the code off from the monolithic app and create a microservice with it. Even with the same implementation, we will still get all the benefits of microservice architecture.

Our monolithic Node.js app was architected in a way that makes it relatively easy for us to build separate services with the existing implementation. We will discuss how to properly architect a monolithic app later in this post.

Respect Failures Because They Will Happen

In a distributed environment, more things can fail, and they will. Failures of mission-critical services, when not handled well, could be catastrophic. We should always think about how to test failures and gracefully handle failures.

  • First and foremost, we should expect everything will fail at some point.
  • For RPC calls, put extra effort into handling failure cases.
  • Make sure we have good observability (mentioned above) into failures when they happen.
  • Always test failures when bringing a new service online. It should be part of the new service check-list.
  • Build auto-recovery if possible.
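
For the RPC failure handling mentioned in the list above, a common building block is a bounded retry with exponential backoff and jitter. This is a minimal sketch under our own assumptions: `call_with_retries` and its parameters are hypothetical names, and real code should retry only on errors known to be transient and only for idempotent calls.

```python
import random
import time


def call_with_retries(func, max_retries=3, base_delay=0.1, max_delay=2.0,
                      sleep=time.sleep, rng=random.random):
    """Call func(); on failure retry up to max_retries times with full jitter."""
    attempt = 0
    while True:
        try:
            return func()
        except Exception:
            if attempt >= max_retries:
                raise  # retry budget exhausted: surface the failure
            # Exponential backoff capped at max_delay, scaled by a random
            # factor in [0, 1) so that clients do not retry in lockstep.
            cap = min(max_delay, base_delay * (2 ** attempt))
            sleep(cap * rng())
            attempt += 1
```

The `sleep` and `rng` parameters exist only so the behavior can be tested without real waiting; callers would normally leave them at their defaults.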

Avoid Microservice Syndromes from Day One

The microservice architecture is not a panacea. It solves some problems but creates others, which we call “microservice syndromes”. If we don’t think about them from day one, things can get messy fast, and it costs more to take care of them later. Here are some of the common symptoms.

  • Poorly modeled microservices cause more harm than good, especially when you have more than a couple of them.
  • Allowing too many different choices of languages/technology, which increases the operational cost and fragments the engineering organization.
  • Coupling running services with building services, which dramatically increases the complexity of each service and slows the team down.
  • Overlooking data modeling and ending up with microservices with monolithic data storage.
  • Lack of observability, which makes it difficult to triage performance issues or failures.
  • When facing a problem, teams tend to create a new service instead of fixing the existing one even though the latter may be a better option.
  • Even though the services are loosely coupled, lack of a holistic picture of the whole system could be problematic.

AVOID MAKING THESE MISTAKES

  1. Sharing data between microservices is a big no-no. If two services are manipulating the same data, you will start experiencing consistency issues.
  2. Avoid trying to switch to microservices without figuring out the platform and the dependencies. Also, believing that microservices are good because every microservice can be written in a different language is a bad practice.
  3. Handling data is crucial. It’s pretty easy to screw up data but really hard to restore it. Data migration should happen in multiple steps.
  4. Avoid breaking an application into too many pieces that are too small, or forcing a system into microservices when it shouldn’t be one.

Suggestions:

  • Do not stress about selecting the perfect technology. Take an iterative, experimental approach instead.
  • Every microservice architecture is unique; the selected technology should be aligned with the system’s needs.
  • Keep in mind that too many different technologies make hiring more complicated.

Best Practices With Cloud and Microservices

You can read the first four parts here:

  1. The 12 Factor App: Best Practices in Cloud Native Applications and Microservices
  2. Microservices Architecture: Event Driven Approach
  3. Microservices Best Practices: Why Do You Build a Vertical Slice?
  4. Microservices Architecture Best Practices: Messaging Queues
