Why do we need a service mesh?
Ok, so what actually is a service mesh to begin with? William Morgan, the CEO of Buoyant, the company behind Linkerd, one of the “big 3” service meshes, wrote a great post about the whats, the whys, and the hows.
In summary, in a microservice architecture, an app is split up into (micro-)services, each fulfilling a certain task. Each service needs to talk to other services to implement the app’s functionality.
The important part here is “needs to talk to”. While it may seem simple to have services talk to each other, it can get complicated quite quickly in practice. Just think about things like service discovery, monitoring, tracing, load balancing, or encryption of communication. Without a service mesh, each service and application would potentially need to solve these over and over again. What a waste!
Service meshes move this functionality into a new application-independent infrastructure layer and thus decouple business logic from service-to-service communication logic. It comes as no surprise that many microservice deployments rely on service meshes. By far the most popular service meshes are Istio, Consul, and Linkerd.
The following is a sketch of a microservice and container-based application (e.g., on Kubernetes) with a service mesh.
How do service meshes actually work?
Fundamentally, service meshes are implemented using so-called sidecars. The most prevalent sidecar is probably Envoy. In essence, sidecars are separate containers that proxy network communication from services. Sidecars observe, control, and often encrypt the network communication between services.
Sidecars form the data plane of service meshes. The other part of a service mesh, the control plane, manages and configures the sidecars to balance load, enforce policies, and collect stats.
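To make the split between data plane and control plane a bit more tangible, here is a minimal in-memory sketch in Python. It is a toy model and not how Envoy, Istio, or any real mesh is implemented: the class names `Sidecar` and `ControlPlane`, their methods, and the addresses are all made up for illustration, and no actual network traffic is involved.

```python
from dataclasses import dataclass, field
from itertools import cycle


@dataclass
class Sidecar:
    """Toy data-plane proxy sitting next to one service (hypothetical)."""
    service: str
    routes: dict = field(default_factory=dict)  # destination -> endpoint iterator
    stats: dict = field(default_factory=dict)   # destination -> request count

    def configure(self, endpoints):
        # Called by the control plane: round-robin over each endpoint list.
        self.routes = {dest: cycle(addrs) for dest, addrs in endpoints.items()}

    def send(self, dest, payload):
        # The service only names the destination; the sidecar picks the endpoint.
        endpoint = next(self.routes[dest])
        self.stats[dest] = self.stats.get(dest, 0) + 1
        return endpoint, payload


class ControlPlane:
    """Toy control plane: keeps a service registry and configures sidecars."""

    def __init__(self):
        self.registry = {}
        self.sidecars = []

    def register(self, service, addrs):
        self.registry[service] = list(addrs)
        self._push()

    def attach(self, sidecar):
        self.sidecars.append(sidecar)
        self._push()

    def _push(self):
        # Push the current registry to every known sidecar.
        for sidecar in self.sidecars:
            sidecar.configure(self.registry)

    def collect_stats(self):
        return {sc.service: dict(sc.stats) for sc in self.sidecars}


control_plane = ControlPlane()
control_plane.register("payments", ["10.0.0.1:8080", "10.0.0.2:8080"])
frontend = Sidecar("frontend")
control_plane.attach(frontend)
targets = [frontend.send("payments", b"charge")[0] for _ in range(3)]
```

The point of the sketch is that the "frontend" service never deals with endpoint lists or counters itself. Load balancing and stats live in the sidecar, and the control plane decides what the sidecar knows.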
Ok, you said sidecars “encrypt the network communication between services”. Can’t we then just put our services into secure enclaves and be done with it?
Glad you asked! Unfortunately, in the context of confidential computing, that wouldn’t help much. Briefly, there are two main reasons:
- Encrypted service-to-service communication needs to terminate inside secure enclaves instead of in separate sidecars. Otherwise, an attacker could just tap the service-to-sidecar communication, manipulate the sidecar, etc.
- A crucial aspect of confidential computing is verifiability. Someone needs to make sure that each service in the cluster is actually running inside secure enclaves and that it was initialized with the right parameters and code.
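As a rough illustration of the verifiability point, the following Python sketch checks a service's reported "measurement" (a hash over its code and initialization parameters) against a list of expected values. This is a heavy simplification under stated assumptions: real remote attestation relies on hardware-signed quotes that are verified against the hardware vendor's certificates, whereas here an HMAC with a made-up `HARDWARE_KEY` stands in for that signature. The function names and the `manifest` dictionary are hypothetical and do not describe Marblerun's actual design.

```python
import hashlib
import hmac

# Assumption: stand-in for a key that only genuine enclave hardware can use.
HARDWARE_KEY = b"stand-in for a hardware-protected signing key"


def measure(code: bytes, params: bytes) -> str:
    """Hash code and init parameters (stand-in for an enclave measurement)."""
    h = hashlib.sha256()
    h.update(len(code).to_bytes(8, "big"))  # length prefix avoids ambiguity
    h.update(code)
    h.update(params)
    return h.hexdigest()


def quote(code: bytes, params: bytes) -> dict:
    """What a service would present: its measurement plus a 'hardware' signature."""
    measurement = measure(code, params)
    signature = hmac.new(HARDWARE_KEY, measurement.encode(), hashlib.sha256).hexdigest()
    return {"measurement": measurement, "signature": signature}


def verify(service: str, q: dict, manifest: dict) -> bool:
    """Accept only a correctly signed quote that matches the expected measurement."""
    expected_sig = hmac.new(
        HARDWARE_KEY, q["measurement"].encode(), hashlib.sha256
    ).hexdigest()
    if not hmac.compare_digest(expected_sig, q["signature"]):
        return False  # not produced by the (simulated) hardware
    return manifest.get(service) == q["measurement"]


# Hypothetical manifest: the code and parameters each service is expected to run.
manifest = {"payments": measure(b"payments v1", b"--db=prod")}
ok = verify("payments", quote(b"payments v1", b"--db=prod"), manifest)
tampered = verify("payments", quote(b"payments v1 + backdoor", b"--db=prod"), manifest)
```

Here `ok` is `True` and `tampered` is `False`: changing either the code or the parameters changes the measurement, so the service no longer matches what was expected.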
In particular, the second reason requires careful design and thought. Of course, there is also much more to consider, like “what happens in case of failures or updates?” or “how does the user of the app know that everything is alright without jumping through 10,000 hoops?”.
By now, we hope to have convinced you that confidential computing is useful, that service meshes are useful, and that we clearly need a dedicated service mesh for confidential computing, just like Marblerun.
In the next and final episode of this series we’re going to dig deeper into the requirements for a service mesh for confidential computing and give an intro to the design of Marblerun. See you soon!
This blog post is the second in a series of three concerned with the question “Why do we need a service mesh for confidential computing?”.