Istio service mesh

sushant kode
Jan 28, 2024

What is Istio?

Istio is an open-source implementation of a service mesh that enables organizations to secure, connect, and monitor microservices-based apps anywhere.
It adds a networking layer that automates and secures communication between applications, which improves how services talk to each other in a microservice architecture. It was originally developed by IBM, Google, and Lyft.

Components of Istio

Istio service mesh has two pieces: a data plane and a control plane. The data plane in Istio consists of Envoy proxies that control the communication between services. The control plane portion of the mesh is responsible for managing and configuring the proxies.

Istio architecture

Envoy (data plane)

Istio injects the Envoy proxy as a sidecar container next to your application container. This proxy then intercepts all inbound and outbound traffic for that service. In addition, it handles load balancing, circuit breaking, fault injection, and similar concerns. Envoy also supports a pluggable extension model based on WebAssembly (WASM).
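As a minimal sketch of how sidecar injection is usually switched on, the snippet below builds a Kubernetes Namespace manifest carrying the `istio-injection: enabled` label, which is the label Istio's automatic injector looks for. The namespace name `demo` and the helper name are hypothetical; the output is JSON, which `kubectl apply -f` accepts as well as YAML.

```python
import json


def namespace_with_injection(name: str) -> dict:
    """Build a Namespace manifest labeled for automatic sidecar injection."""
    return {
        "apiVersion": "v1",
        "kind": "Namespace",
        "metadata": {
            "name": name,
            # Pods created in this namespace get an Envoy sidecar injected.
            "labels": {"istio-injection": "enabled"},
        },
    }


if __name__ == "__main__":
    # "demo" is a placeholder namespace name.
    print(json.dumps(namespace_with_injection("demo"), indent=2))
```

Pods that were already running before the label was added keep running without a sidecar until they are restarted.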

Istiod (control plane)

Istiod is the control plane component that provides service discovery, configuration, and certificate management features. Istiod takes the high-level rules written in YAML and converts them into an actionable configuration for Envoy.

Istiod bundles what used to be separate components: Pilot, Citadel, and Galley.

Pilot abstracts the platform-specific service discovery mechanisms (Kubernetes, Consul, or VMs) and converts them into a standard format that sidecars can consume.

Citadel acts as a certificate authority and generates certificates that allow secure mutual TLS communication between the proxies in the data plane.

Galley is Istio’s configuration validation, ingestion, processing, and distribution component. It is responsible for insulating the rest of the Istio components from the details of obtaining user configuration from the underlying platform.

Istio features:

Istio provides value in three main areas:

1. Traffic management

Simple and advanced routing
With a VirtualService we can define traffic routing rules that are applied when a client tries to connect to the service. We can define subsets of a service based on labels (in a DestinationRule) and split traffic between them by weight. This lets us deploy a new version of a service and run it next to the released production version without disrupting production traffic. With both versions deployed, we can gradually release the new version (a canary release) by routing a percentage of incoming traffic to it.
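A canary release like the one described above could be sketched as two resources: a DestinationRule that defines the `v1` and `v2` subsets by Pod label, and a VirtualService that splits traffic between them by weight. The host name `reviews`, the `version` label, and the helper name are assumptions for illustration; the API version may also differ depending on your Istio release.

```python
import json

API_VERSION = "networking.istio.io/v1beta1"


def canary_manifests(host: str, canary_weight: int) -> list:
    """Return a DestinationRule and VirtualService that send
    canary_weight percent of traffic to subset v2 and the rest to v1."""
    if not 0 <= canary_weight <= 100:
        raise ValueError("canary_weight must be between 0 and 100")

    destination_rule = {
        "apiVersion": API_VERSION,
        "kind": "DestinationRule",
        "metadata": {"name": host},
        "spec": {
            "host": host,
            # Subsets select Pods by label; "version" is an assumed label key.
            "subsets": [
                {"name": "v1", "labels": {"version": "v1"}},
                {"name": "v2", "labels": {"version": "v2"}},
            ],
        },
    }
    virtual_service = {
        "apiVersion": API_VERSION,
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {
            "hosts": [host],
            "http": [{
                "route": [
                    {"destination": {"host": host, "subset": "v1"},
                     "weight": 100 - canary_weight},
                    {"destination": {"host": host, "subset": "v2"},
                     "weight": canary_weight},
                ],
            }],
        },
    }
    return [destination_rule, virtual_service]


if __name__ == "__main__":
    # Route 10% of traffic to the new version.
    for manifest in canary_manifests("reviews", 10):
        print(json.dumps(manifest, indent=2))
```

Raising `canary_weight` step by step and re-applying the VirtualService shifts more traffic to the new version without touching the Deployments.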

Outlier Detection
Outlier detection is a circuit breaker implementation that tracks the status of each host (Pod) in the upstream service. If a host starts returning 5xx HTTP errors, it gets ejected from the load balancing pool for a predefined time. For TCP services, Envoy counts connection timeouts or failures as errors.
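Outlier detection is configured in the `trafficPolicy` of a DestinationRule. The sketch below ejects a host after five consecutive 5xx responses; the thresholds, the host name, and the helper name are illustrative assumptions rather than recommended values.

```python
import json


def outlier_detection_rule(host: str) -> dict:
    """Build a DestinationRule that ejects hosts returning repeated 5xx errors."""
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "DestinationRule",
        "metadata": {"name": f"{host}-outlier"},
        "spec": {
            "host": host,
            "trafficPolicy": {
                "outlierDetection": {
                    # Eject a host after 5 consecutive 5xx responses...
                    "consecutive5xxErrors": 5,
                    # ...checking hosts every 10 seconds...
                    "interval": "10s",
                    # ...and keep an ejected host out for at least 30 seconds.
                    "baseEjectionTime": "30s",
                    # Never eject more than half of the pool at once.
                    "maxEjectionPercent": 50,
                },
            },
        },
    }


if __name__ == "__main__":
    print(json.dumps(outlier_detection_rule("reviews"), indent=2))
```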

Resiliency
The goal for resiliency is to return the service to a fully functioning state after a failure occurs. A crucial element in making services available is using timeouts and retry policies when making service requests. We can configure both on Istio’s VirtualService.
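As a rough sketch, a timeout and a retry policy sit side by side on an HTTP route of a VirtualService. The durations, the retry conditions, and the host name below are assumptions chosen for illustration.

```python
import json


def resilient_virtual_service(host: str) -> dict:
    """Build a VirtualService with an overall timeout and a retry policy."""
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": host},
        "spec": {
            "hosts": [host],
            "http": [{
                "route": [{"destination": {"host": host}}],
                # Overall deadline for the request, retries included.
                "timeout": "10s",
                "retries": {
                    # Retry up to 3 times, each attempt limited to 2 seconds.
                    "attempts": 3,
                    "perTryTimeout": "2s",
                    # Only retry on these failure conditions.
                    "retryOn": "5xx,connect-failure",
                },
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(resilient_virtual_service("reviews"), indent=2))
```

Retries are only safe for requests the service can handle more than once, so it is worth limiting them to idempotent calls.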

Failure Injection
We can apply fault injection policies to HTTP traffic and specify one or more faults to inject when forwarding the request to the destination. There are two types of fault injection. We can delay requests before forwarding them, to emulate a slow network or an overloaded service, and we can abort the HTTP request and return a specific HTTP error code to the caller. With the abort, we can simulate a faulty upstream service.
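Both fault types can be sketched in a single VirtualService route. In this example a share of requests is delayed by five seconds and another share is aborted with HTTP 503; the percentages, the delay, the status code, and the host name are assumptions for illustration.

```python
import json


def fault_injection_virtual_service(host: str,
                                    delay_percent: float,
                                    abort_percent: float) -> dict:
    """Build a VirtualService that delays and aborts a share of requests."""
    for value in (delay_percent, abort_percent):
        if not 0.0 <= value <= 100.0:
            raise ValueError("percentages must be between 0 and 100")
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "VirtualService",
        "metadata": {"name": f"{host}-faults"},
        "spec": {
            "hosts": [host],
            "http": [{
                "fault": {
                    # Delay: emulate a slow network or overloaded service.
                    "delay": {
                        "percentage": {"value": delay_percent},
                        "fixedDelay": "5s",
                    },
                    # Abort: emulate a faulty upstream service.
                    "abort": {
                        "percentage": {"value": abort_percent},
                        "httpStatus": 503,
                    },
                },
                "route": [{"destination": {"host": host}}],
            }],
        },
    }


if __name__ == "__main__":
    print(json.dumps(fault_injection_virtual_service("ratings", 10.0, 5.0), indent=2))
```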

ServiceEntry
With the ServiceEntry resource we can make external services or internal services that are not part of our mesh look like part of our service mesh. When a service is in the service registry, we can use the traffic routing, failure injection, and other mesh features, just like we would with other services.
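A minimal sketch of a ServiceEntry for an external HTTPS API might look like the following. The host `api.example.com` and the helper name are placeholders, and the port and resolution settings are assumptions that depend on the service being registered.

```python
import json


def external_service_entry(name: str, host: str) -> dict:
    """Build a ServiceEntry that adds an external HTTPS host to the registry."""
    return {
        "apiVersion": "networking.istio.io/v1beta1",
        "kind": "ServiceEntry",
        "metadata": {"name": name},
        "spec": {
            "hosts": [host],
            # The service lives outside the mesh.
            "location": "MESH_EXTERNAL",
            # Resolve the host through DNS instead of static endpoints.
            "resolution": "DNS",
            "ports": [{"number": 443, "name": "https", "protocol": "TLS"}],
        },
    }


if __name__ == "__main__":
    # "api.example.com" is a placeholder host.
    print(json.dumps(external_service_entry("external-api", "api.example.com"), indent=2))
```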

2. Security

Authentication
Istio issues each workload an X.509 certificate tied to its service account, and it encodes the workload identity according to the SPIFFE (Secure Production Identity Framework for Everyone) specification. The identity is stored in the Subject Alternative Name (SAN) field of the certificate. During the TLS handshake, the Envoy proxies also perform the check that SPIFFE validation requires (inspecting the SAN field) to confirm that the peer presents a valid SPIFFE identity.
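In Istio, the SPIFFE identity for a workload takes the form `spiffe://<trust-domain>/ns/<namespace>/sa/<service-account>`. The helpers below are a small illustrative sketch for building and parsing that format; they are not part of Istio, and the trust domain `cluster.local` and the service account name are example values.

```python
from urllib.parse import urlparse


def spiffe_id(trust_domain: str, namespace: str, service_account: str) -> str:
    """Build the SPIFFE ID for a service account in a namespace."""
    return f"spiffe://{trust_domain}/ns/{namespace}/sa/{service_account}"


def parse_spiffe_id(uri: str) -> dict:
    """Split an Istio-style SPIFFE ID into its parts, or raise ValueError."""
    parsed = urlparse(uri)
    parts = parsed.path.strip("/").split("/")
    valid = (
        parsed.scheme == "spiffe"
        and parsed.netloc != ""
        and len(parts) == 4
        and parts[0] == "ns"
        and parts[2] == "sa"
    )
    if not valid:
        raise ValueError(f"not an Istio-style SPIFFE ID: {uri!r}")
    return {
        "trust_domain": parsed.netloc,
        "namespace": parts[1],
        "service_account": parts[3],
    }


if __name__ == "__main__":
    identity = spiffe_id("cluster.local", "default", "bookinfo-reviews")
    print(identity)
    print(parse_spiffe_id(identity))
```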

Certificate and key management
There are three parts in play when creating identities at runtime:

Citadel (part of the control plane)
Istio Agent
Envoy’s Secret Discovery Service (SDS)

Istio Agent works together with Envoy sidecars and helps them connect to the mesh by securely passing them configuration and secrets.

Secret Discovery Service (SDS) simplifies certificate management. Whenever certificates expire, SDS pushes renewed certificates, and Envoy can use them right away.

Every time we create a new service account, Citadel creates a SPIFFE identity for it. Whenever we schedule a workload, Pilot configures its sidecar with initialization information that includes the workload’s service account.

Istio provides two types of authentication: peer authentication and request authentication.

Using the PeerAuthentication resource, we can turn on mutual TLS (mTLS) across the mesh without making code changes. Istio also supports a gradual rollout, where we opt into mutual TLS one workload or namespace at a time. In this mode, called permissive mode, a workload accepts both mutual TLS and plain-text traffic.
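The two modes could be sketched with a single helper that builds a PeerAuthentication resource. Applying it in the Istio root namespace (commonly `istio-system`, an assumption about your installation) makes the policy mesh-wide, while any other namespace scopes it to that namespace. The helper name is hypothetical.

```python
import json


def peer_authentication(namespace: str, mode: str = "STRICT") -> dict:
    """Build a PeerAuthentication policy for one namespace.

    STRICT accepts only mutual TLS; PERMISSIVE accepts mTLS and plain text.
    """
    if mode not in ("STRICT", "PERMISSIVE"):
        raise ValueError("mode must be STRICT or PERMISSIVE")
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "PeerAuthentication",
        "metadata": {"name": "default", "namespace": namespace},
        "spec": {"mtls": {"mode": mode}},
    }


if __name__ == "__main__":
    # Start permissive in one namespace, then tighten to strict later.
    print(json.dumps(peer_authentication("demo", "PERMISSIVE"), indent=2))
    print(json.dumps(peer_authentication("demo", "STRICT"), indent=2))
```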

Request authentication (the RequestAuthentication resource) verifies the credential attached to the request, and we use it for end-user authentication.
Request-level authentication is done with JSON Web Token (JWT) validation. Istio supports any OpenID Connect provider, such as Auth0, Firebase or Google Auth, Keycloak, or ORY Hydra.
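A RequestAuthentication resource mainly needs to know who issues the tokens and where to fetch the signing keys. The sketch below assumes a hypothetical issuer at `https://issuer.example.com` and a workload labeled `app: frontend`; both are placeholders.

```python
import json


def request_authentication(namespace: str, app_label: str,
                           issuer: str, jwks_uri: str) -> dict:
    """Build a RequestAuthentication that validates JWTs for one workload."""
    return {
        "apiVersion": "security.istio.io/v1beta1",
        "kind": "RequestAuthentication",
        "metadata": {"name": f"{app_label}-jwt", "namespace": namespace},
        "spec": {
            # Apply only to Pods carrying this label.
            "selector": {"matchLabels": {"app": app_label}},
            "jwtRules": [{
                # Expected "iss" claim of the token.
                "issuer": issuer,
                # Where the provider publishes its public signing keys.
                "jwksUri": jwks_uri,
            }],
        },
    }


if __name__ == "__main__":
    manifest = request_authentication(
        "demo",
        "frontend",
        "https://issuer.example.com",  # placeholder issuer
        "https://issuer.example.com/.well-known/jwks.json",  # placeholder URL
    )
    print(json.dumps(manifest, indent=2))
```

On its own, this resource rejects requests that carry an invalid token but still lets through requests with no token at all; requiring a token takes an additional AuthorizationPolicy.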

3. Observability: telemetry and logs

Istio generates three types of telemetry to provide observability to services in the mesh:

Metrics
Distributed traces
Access logs

Istio generates metrics based on the four golden signals: latency, traffic, errors, and saturation. It collects metrics at three levels: proxy, service, and control plane.

Istio uses Prometheus to record metrics that track the health of Istio and applications in the mesh.
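As one example of the "errors" signal, the sketch below builds a PromQL expression for the share of requests to a service that ended in a 5xx response, using Istio's standard `istio_requests_total` metric. The helper name, the service name, and the five-minute window are assumptions for illustration.

```python
def error_ratio_query(destination_service: str, window: str = "5m") -> str:
    """Build a PromQL query for the 5xx error ratio of one service."""
    selector = f'destination_service="{destination_service}"'
    # Request rate for responses with a 5xx status code.
    errors = (
        f'sum(rate(istio_requests_total{{{selector}, '
        f'response_code=~"5.."}}[{window}]))'
    )
    # Request rate for all responses.
    total = f'sum(rate(istio_requests_total{{{selector}}}[{window}]))'
    return f"{errors} / {total}"


if __name__ == "__main__":
    # Placeholder service name following the Kubernetes DNS convention.
    print(error_ratio_query("reviews.default.svc.cluster.local"))
```

The resulting string can be pasted into the Prometheus expression browser or used as the query of a Grafana panel.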

Grafana can connect to various data sources and visualize the data using graphs, tables, heatmaps, and more. With its query support, you can customize the existing dashboards and create more advanced visualizations.

Zipkin is a distributed tracing system. We can easily monitor distributed transactions in the service mesh and discover any performance or latency issues.
