Service Mesh concept explained in plain English

Alex Burnos
Published in The Startup
6 min read · May 26, 2019

If you’ve heard the term “Service Mesh” but still struggle to internalize what it is, and feel it is probably too late to admit it, this article is for you.

You start by googling “What is Service Mesh.” Hopes are high, and the first results tell you it is a programmable infrastructure that is likely there to connect services. If this googling endeavor did not help, and the question “Very nice… but what is Service Mesh?” is still in your head, read on.

This article aims to explain the “Service Mesh” concept in generic and relatively simple terms: what it is and what kinds of problems it tries to solve, without reviewing specific implementations.

Understand the problem to understand the solution

To understand what Service Mesh is, you need to understand what problems it is solving. It begins with a tale of Monoliths and Microservices.

Traditionally, applications are monolithic, meaning the application is one program, built as one binary and run as one process.

A monolithic application is what you think of as a “normal application.”

Monoliths are simple, but they come with challenges of their own, to name a few:

  • Hard to scale. If any single component of your application needs to scale, you need to scale the whole application.
  • Hard to release. Because everything within a monolith is tightly coupled, any change you make in the application could affect other parts of the code. Multiply this by many teams trying to make changes simultaneously and you get your own version of release hell.
  • Reduced technology flexibility. Trying new technology (like a new language, for example), without migrating the whole codebase to it, is often problematic.
  • Impact on the team’s dynamic. Last but not least, it is harder to draw borders of responsibility, assign roles, and develop teams when you only have one big deliverable.

Many of these drawbacks only become a problem once a particular scale is reached. For many use cases, the monolith is a right and reasonable choice. However, suppose you have started scaling and the monolith is no longer an option for you. What to do? Well, of course, let’s break things… down.

Microservices to the rescue

People were building “Microservices” before it became fashionable to call them that; the industry likes fancy terms it can build hype around. “Microservices” is a software design pattern that favors breaking an application down into independent components that are de-coupled from each other. De-coupled means that each component has its own codebase and runs as a standalone program, while the components stay interconnected via some communication interface. This interface is usually the network.

When you had a monolith, it consisted of several logical modules: things like frontend/UI, backend logic, a database, or other storage.

A monolith consists of several logical components: A, B, C, D.

To turn it into “Microservices,” we take each logical component and make it an independently developed and deployed program. Each such program interacts with the others over the network, and from the user’s perspective they form the same coherent whole, as if nothing had changed. When it runs, we will call such a program a “microservice.” In practice, the terms “microservice” and “service” are interchangeable; “micro” attempts to highlight that this service is a relatively small part of a broader application or service.

The monolith becomes “Microservices”: services A, B, C, D now interoperate over the network.
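
The move from an in-process call to a network call can be sketched in a few lines of Python. This is only an illustration under stated assumptions: service B is reduced to a tiny HTTP server on localhost, and the `/price` path, the JSON payload, and all function names are made up for this example rather than taken from any real system.

```python
import json
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer
from urllib.request import ProxyHandler, build_opener


class ServiceBHandler(BaseHTTPRequestHandler):
    """Hypothetical service B: answers every GET with a small JSON document."""

    def do_GET(self):
        body = json.dumps({"service": "B", "price": 42}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, *args):
        pass  # silence the default per-request logging


def start_service_b():
    # Port 0 asks the operating system for any free port.
    server = HTTPServer(("127.0.0.1", 0), ServiceBHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server


def service_a_get_price(base_url):
    """Hypothetical service A: in a monolith this would be a plain function call."""
    opener = build_opener(ProxyHandler({}))  # ignore any proxy settings for localhost
    with opener.open(f"{base_url}/price", timeout=5) as response:
        return json.load(response)["price"]


server = start_service_b()
host, port = server.server_address
price = service_a_get_price(f"http://{host}:{port}")
print(price)  # 42
server.shutdown()
server.server_close()
```

Note that service A has to be handed the address of service B. In this sketch it is passed in by hand, which only works while there is a single, fixed copy of B.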

Once you move to this design pattern, a whole world of possibilities opens up:

  • You can scale each service independently of the others.
  • You can de-couple release cycles of each service.
  • You are free to write each service using the technology of your choice, as long as they use the same communication interface between them.
  • You can build your organization structure around independent services, being able to assign clear ownership, roles, and responsibilities.

So far, so good, but we also went to this trouble to achieve scale and resilience, right? That’s why we don’t want to run a single instance of services A, B, C, D (what if one of them goes down, or a service needs more capacity?); we want many of them. We spin up something like Apache Mesos, Kubernetes, or Nomad to run many copies of these services at scale.

Multiple copies of services help to achieve resilience and scalability.

Are microservices containers?

Let’s resolve a common misconception. I often hear people confusing “microservices” with “containers.” The truth is that they are orthogonal concepts. You don’t need containers to have microservices, and you don’t need microservices to use containers. Containers are a way to package your service. It is a choice that can be applied (or not) equally to both monoliths and microservices.

I’ll leave you with this thought, and for the rest of the article, microservices will be what they are — independent programs, containerized or not.

The monolith problems are dead. Long live microservices problems!

Congratulations, you got rid of the monolith’s shortcomings and acquired a whole new class of microservices problems.

Microservices come with their own challenges.

Instead of deploying one big binary, you now have N smaller binaries running around, each with its own lifecycle, scheduled and terminated as needed on many different servers, all while communicating with each other.

Effectively, you got yourself a mesh of services (hint, hint) and the following brand new issues to resolve:

  • Service discovery. Service A needs to talk to B, C to D, etc. Since each service can be scheduled anywhere and can migrate between servers as they come and go, how does service A know which server to connect to in order to talk to B? It needs to discover the exact location of the service before making a connection.
  • Traffic management. Maybe you have instances of service D running in two different datacenters. You want service C to use service D in Datacenter 1 and to fall back to the same service in Datacenter 2 only if the first one is not available.
  • Security & Policies management. You want to control who can talk to service C, and how. Network-level access control is no longer sufficient, since the same service can have different addresses that might belong to various networks. You also want to ensure that all communication between services is encrypted.
  • Observability. A fancy word for monitoring when you have to monitor much more than a single binary. You want to know (observe) the state of your services, their connectivity, and their failures, with the ability to trace and troubleshoot specific issues. Observability gets progressively harder as the number of services and interconnections grows.
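
The first two issues, service discovery and traffic management, can be made concrete with a small sketch. It assumes a hypothetical in-memory registry that instances register with; the `Instance` and `Registry` names, the addresses, and the "prefer one datacenter, fall back to any healthy instance" rule are illustrative choices, not how any particular product works.

```python
from dataclasses import dataclass, field


@dataclass
class Instance:
    service: str
    address: str
    datacenter: str
    healthy: bool = True


@dataclass
class Registry:
    """Hypothetical in-memory registry of running service instances."""

    instances: list = field(default_factory=list)

    def register(self, instance):
        self.instances.append(instance)

    def set_health(self, address, healthy):
        for instance in self.instances:
            if instance.address == address:
                instance.healthy = healthy

    def resolve(self, service, preferred_dc):
        """Discover a healthy instance, preferring one datacenter over the rest."""
        healthy = [i for i in self.instances if i.service == service and i.healthy]
        preferred = [i for i in healthy if i.datacenter == preferred_dc]
        candidates = preferred or healthy  # fall back only if the preferred DC is empty
        if not candidates:
            raise LookupError(f"no healthy instance of {service!r}")
        return candidates[0]


registry = Registry()
registry.register(Instance("D", "10.0.1.5:8080", "dc1"))
registry.register(Instance("D", "10.0.2.7:8080", "dc2"))

print(registry.resolve("D", preferred_dc="dc1").address)  # 10.0.1.5:8080
registry.set_health("10.0.1.5:8080", False)               # Datacenter 1 copy goes down
print(registry.resolve("D", preferred_dc="dc1").address)  # 10.0.2.7:8080
```

Service C never hard-codes an address here: it asks for "D" by name, and the lookup decides which copy answers.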

You could push the solution to these problems into the services themselves by arming them with this additional, infrastructure-specific logic. However, this would mean the following:

  • Application scope would grow far beyond its business needs. Do your developers want to work on the application or on the infrastructure?
  • You would need to re-implement this logic for every technology stack you use (and a microservices architecture could mean there is more than one).

It is better to outsource the solution to these problems to where they originated: the infrastructure. You need some software stack that modernizes your deployment so that your services can discover each other, that controls traffic and policies, and that provides observability, ideally without modifying the services themselves. This infrastructure solution is called “Service Mesh.”
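
As a rough illustration of the “without modifying the services” idea, the sketch below routes every call through a separate layer that enforces a policy and counts requests, while the service itself stays plain business logic. The `MeshLayer` class, its allow-list of caller/callee pairs, and the counters are hypothetical names invented for this example; a real mesh works at the network level rather than through in-process function calls.

```python
from collections import Counter


class PolicyError(Exception):
    """Raised when a caller is not allowed to reach a service."""


class MeshLayer:
    """Hypothetical stand-in for the infrastructure sitting between services."""

    def __init__(self, allowed):
        self.allowed = set(allowed)  # (caller, callee) pairs permitted to talk
        self.handlers = {}           # service name -> unmodified handler
        self.requests = Counter()    # (caller, callee, outcome) -> count

    def attach(self, name, handler):
        self.handlers[name] = handler

    def call(self, caller, callee, payload):
        # Policy check happens in the layer, not in the service.
        if (caller, callee) not in self.allowed:
            self.requests[(caller, callee, "denied")] += 1
            raise PolicyError(f"{caller} may not call {callee}")
        try:
            result = self.handlers[callee](payload)
        except Exception:
            self.requests[(caller, callee, "error")] += 1
            raise
        self.requests[(caller, callee, "ok")] += 1
        return result


def service_c(payload):
    # Plain business logic: no discovery, policy, or metrics code in here.
    return payload.upper()


mesh = MeshLayer(allowed={("A", "C")})
mesh.attach("C", service_c)

print(mesh.call("A", "C", "hello"))     # HELLO
print(mesh.requests[("A", "C", "ok")])  # 1
```

Because the policy and the counters live outside `service_c`, the same layer would work unchanged for a service written in any other technology stack.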

The tricky part is the overloaded terminology: “Service Mesh” means both a “deployment where many instances of many services communicate with each other” and the “infrastructure solution” for the problems that come with a mesh of services.

I hope this helped you develop the right mental model for the problems that “Service Mesh” solves and for what it effectively is. In the next article, I go over conventional approaches to implementing a “Service Mesh,” along with the pros and cons of each.
