Microservices Mesh — Part III — Istio Advanced

Asad Faizi
May 24 · 11 min read

This is the fourth in a series of blogs — Kubernetes and Microservices Mesh. In our last article, we went over the basics of Istio and saw how useful it can be for setting up and managing more complicated cloud architectures. In a distributed microservices setting, Istio lets you use a service mesh to reap some of the benefits that come with centralization. Today, we’ll be looking into some deeper Istio features to show what a service mesh is really capable of.

Istio is powerful but it can also be quite complicated. Building a scalable service mesh capable of dealing with heavy loads can be fraught. Ideally, by the end of this article, you’ll have a better understanding of the tools at your disposal when building your mesh, and an appreciation of its complexity. While most of the time you won’t need to be manually configuring sidecars or configuring networks, it helps to have an understanding of what’s going on underneath the hood. With a topic as deep as service mesh, the more help you can get out of your tools, the better off you’ll be.

In that spirit, today we’ll be taking a deeper dive into 4 key components of Istio and its service mesh implementation: traffic management (Envoy), the control plane (Pilot), the telemetry component (Mixer) and the security component (Citadel). We’ll look at each of these components in turn and explain how they fit into the larger service mesh picture.

Envoy

You’re probably already familiar with the sidecar proxy from our last piece on Istio. There, we showed how the proxy is added to each service definition and deployed in the same pod as the service itself. We’ll treat Envoy and the sidecar as practically synonymous, even though it’s possible to plug in a different proxy as the sidecar.

The sidecar acts as the main networking gateway for traffic to and from an individual service you’ve defined: it intercepts traffic directed at the service and routes it according to the rules and policies you’ve set in your overall configuration.

The sidecars need to handle control information coming from two sources. The first is you, and the configuration you’ve deployed in your mesh. When you push configuration changes, such as new load balancing parameters, new nodes and services, or new network routing information, they go into Pilot, which is Istio’s source of truth for all the pertinent information about the state of your app. The sidecars periodically check in with Pilot to make sure they have the latest configuration information and make any adjustments they need to their local rules.

The second source of control information is the apps the sidecars are attached to. In its role as a load balancer, the Envoy sidecar constantly checks up on the health of the instances it’s attached to and pings them to make sure they’re still active. It also monitors key indicators like response time to make sure requests are being handled. Envoy will remove bad instances from the pool to make sure that bad deploys or server errors don’t bring down your entire service.

So what does adding a sidecar mean in terms of benefits? Aside from the built-in load balancing and health checks, you can also configure traffic in interesting ways to help you gain insights into how your app functions. For example, when deploying new versions with significant code changes, there’s only so much you can learn from testing in a development environment. Often, you’d like to direct a small amount of production traffic to an instance running new code so you can see how it holds up under real-world scenarios.

Envoy lets you define a configuration that weights the load between different versions. That means you can start by directing just 5% of traffic to a new version of a service. When you push the configuration change to Pilot, it changes the load balancing behavior on the sidecar. Initially a small amount of traffic goes to the new service and you can collect data on your experiment. As you gain confidence you can scale up from 5% to 100% incrementally, then iterate again.

Configuring the Sidecar

What does this configuration look like in practice? Well, to move traffic to different destinations, you just need to set the weight parameter in the service configuration, like so:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
  - hello1
  http:
  - route:
    - destination:
        host: hello1
        subset: v1
      weight: 75
    - destination:
        host: hello1
        subset: v2
      weight: 25
---
apiVersion: networking.istio.io/v1alpha3
kind: DestinationRule
metadata:
  name: helloworld
spec:
  host: hello1
  subsets:
  - name: v1
    labels:
      version: v1
  - name: v2
    labels:
      version: v2

Here we have a single host, hello1, with two versions of the destination on it, v1 and v2. We assign v1 75% of the traffic at the start. After pushing the configuration, you can check traffic in a pre-defined interval and change the parameters again.
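As confidence in the new version grows, subsequent pushes only need to adjust the weights. A final cut-over might look like the following sketch, which reuses the same hello1 host and subsets from the example above:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
  - hello1
  http:
  - route:
    - destination:
        host: hello1
        subset: v2
      weight: 100   # all traffic now goes to v2; the v1 subset can be retired
```

Applying this through kubectl updates Pilot, and the sidecars pick up the new weights on their next sync.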

Envoy gives you great control over the way traffic is sent to individual destinations. You can also inject faults to test how your services cope with trouble: for example, you can add artificial latency to a fraction of requests by setting the fault parameter:

apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: helloworld
spec:
  hosts:
  - hello1
  http:
  - fault:
      delay:
        percent: 10
        fixedDelay: 3s
    route:
    - destination:
        host: hello1
Pilot

The main counterpart to the Envoy sidecar is Pilot, a centralized manager for all the Envoy instances in your mesh. Pilot is the central source of truth: it stores the rules for your different services and how they communicate, along with information about what’s running. Pilot is often treated as synonymous with Istio’s control plane, because it manages and configures all the Envoy proxies (although other components, such as Citadel, are technically part of the control plane too).

Naturally, Pilot is a pretty important piece of the puzzle when you’re trying to understand the mesh, but in practice much of the work associated with it is performed by the Envoy proxies themselves. They check in periodically with the mesh’s Pilot to make sure they have the latest configuration and update themselves accordingly. The control plane is also what you’re interacting with when you change the configuration for the mesh. This division of duties is a key idea in Istio: the Envoy proxies act in concert with Pilot to produce the services you’re building your app out of. Understanding how they all interact is a crucial part of understanding the mesh concept.

In our first article, we talked about how the great abstraction of the service mesh is that it allows you to specify higher-level commands about how you’d like your services to interact, and the service mesh parses those into precise configurations for the different services it controls. In the Istio setup, the Pilot is where this core functionality happens. Pilot takes as input service discovery rules and spits out specific actions that are performed on the Envoy proxies (or any sidecar format that is compatible with the Envoy API for that matter).

Test-Driving Pilot

The best way to familiarize yourself with the Pilot instances for your mesh is to take a look at what sidecars have checked in and understand what their status is. You can get a list of the services in your mesh along with the Pilot id that controls them by running the command:

istioctl proxy-status

And out pops a list of the services and their ids, their Pilot id, and the current synced status. Sidecars that are listed as “Synced” or “Synced (100%)” are up-to-date and have acknowledged the latest configuration changes from the Pilot. The status “Not Sent” means that there are no recent configuration changes, so there’s nothing to sync. And the status “Stale” means the sidecar is not responding to changes.

If you do have sync issues, the proxy-status command also allows you to supply a service id and view the sync details that are going awry between the Pilot and the sidecar. For example, if your service list above included an id like “hello-world-v2-686a3b641-4r52s.default”, you could run the command:

istioctl proxy-status hello-world-v2-686a3b641-4r52s.default

You should see output along these lines:

Clusters Match
Listeners Match
Routes Match (RDS last loaded at Fri, 24 May 2019 20:22:14 UTC)

Note that the output of the proxy-status command has changed across Istio versions (1.0.5, 1.1.2 and the latest 1.1.7). Earlier versions printed a complete JSON dump; starting with 1.1.7, the output looks like the above, so you may see a different format.

If the sidecar is out of sync, Pilot will show you a formatted diff of the configuration changes that haven’t made it to the service and been acknowledged yet.

Mixer

Similar to the Pilot, Mixer is an Istio component that operates on traffic and applies rules that you configure. The key difference is that Mixer operates on the level of the mesh as a whole, and lets you apply global rules. This means you can use Mixer to collect telemetry on how your app is performing as a whole, or set global limits on how certain services are used.

It’s important to get a handle on the different ways that Envoy and Mixer operate. As you would expect from the name sidecar proxy, Envoy is added to each service deployed in your application and operates on the traffic coming to and going from that individual service. In contrast, Mixer operates at the level of the application as a whole, and applies rules that you’ve set for the entire application.

What might that look like? Well the rules can range from something as simple as logging every request with a timestamp and an IP address, to applying complex quotas and whitelisting rules to protect security. Mixer is the component that brings you the centralization you’d expect from a monolithic application structure. Want to set up billing for your SaaS application based on the number of requests a user sends? You can do that through Mixer. Want to add logging to an external service whenever a user on your application accesses sensitive information? Mixer.
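To give a flavor of what Mixer configuration looks like, here is a sketch of the three pieces involved in the request-logging case above: an instance (what data to extract from each request), a handler (where to send it), and a rule (when to apply it). The names here are made up for illustration, but the shape follows Mixer’s config.istio.io/v1alpha2 API as of Istio 1.1:

```yaml
apiVersion: config.istio.io/v1alpha2
kind: instance
metadata:
  name: requestlog
  namespace: istio-system
spec:
  compiledTemplate: logentry     # built-in template describing a log line
  params:
    severity: '"info"'
    timestamp: request.time
    variables:
      source: source.workload.name | "unknown"
      destination: destination.service.host | "unknown"
---
apiVersion: config.istio.io/v1alpha2
kind: handler
metadata:
  name: stdoutlogger
  namespace: istio-system
spec:
  compiledAdapter: stdio         # write log entries to Mixer's stdout
  params:
    outputAsJson: true
---
apiVersion: config.istio.io/v1alpha2
kind: rule
metadata:
  name: logallrequests
  namespace: istio-system
spec:
  match: "true"                  # apply to every request in the mesh
  actions:
  - handler: stdoutlogger
    instances:
    - requestlog
```

Quotas, whitelists and billing-style metrics follow the same instance/handler/rule pattern, just with different templates and adapters.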

The other great thing about Mixer is its integration with third-party plugins. Naturally, the main source of centralized information about your application lends itself well to tools for visualizing activity or querying metrics. Mixer lets you add these to Kubernetes and expose the information it collects, often in real time. One monitoring platform, Prometheus, comes pre-installed.

In general, these tools can make your life a lot easier. Mixer gives you insight into your service mesh that used to be out of reach for microservices deployments; getting visibility into a distributed system is not exactly easy, so you should take all the help you can get from the tools that are out there.

Using Mixer

Istio comes with several Mixer policies ready to deploy. To start collecting telemetry data, it’s as simple as applying the YAML file with the configuration:

kubectl apply -f samples/bookinfo/telemetry/metrics.yaml

Then, incoming traffic will be logged and sent to the central Mixer instance. If your service is live, you’ll see results immediately. Otherwise, you can send some traffic to the service artificially, then see the result.

The easiest way to see the traffic is via Prometheus, the metrics-querying tool that comes built in with Istio. Set up Prometheus port forwarding:

kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=prometheus -o jsonpath='{.items[0].metadata.name}') 9090:9090 &

Then, Prometheus will be available on port 9090 on your local machine, and you can query the collected metrics from there.
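From the Prometheus UI at http://localhost:9090 you can query the standard metrics Mixer generates. For example, the istio_requests_total counter (the metric name as of Istio 1.1; the service name below is a hypothetical placeholder) breaks request rates down by response code:

```
sum(rate(istio_requests_total{destination_service="hello1.default.svc.cluster.local"}[1m]))
  by (response_code)
```

A rising rate of 5xx codes here is often the first visible symptom of a bad deploy, which ties back to the canary-weighting workflow from the Envoy section.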

Citadel

Apart from the scalability provided by Envoy and the insights provided by Mixer, one of the main benefits of the service mesh structure is security. In Istio, that’s handled by Citadel, the main security component. Citadel helps manage the keys and certificates necessary for a modern microservices deployment.

Simply put, managing security is typically one of the more frustrating parts of moving to a microservices architecture. You’re forced to choose between managing lots of individual, constantly changing self-signed certificates if you want to do mutual TLS, or leaving things unencrypted and hoping for the best.

Citadel takes a lot of the work out of that for you. It automatically handles the creation and storage of keys based on your service definitions and manages them using Kubernetes’ built-in secret management infrastructure. By interacting with Kubernetes, Citadel can ensure that each new service has a certificate assigned and that each new Envoy proxy is configured to trust the certificate that deployed with that new service.

That means there’s no longer any excuse for not using mutual TLS, considered best practice for service architectures these days. Each service performs a TLS handshake when sharing data with other services, so even if an attacker is watching your network traffic, your data stays encrypted and the risk of exposure is far lower.

Of course, self-signed certificates and even mutual TLS are just the start of what Citadel can do. It can interact with third-party certificate authorities and use alternative certificates if you have existing security measures in place, and it can enforce stricter policies, such as requiring mutual TLS for every service-to-service call. For a security service, it’s surprisingly flexible.

Using Citadel

While the security features it provides can be quite complicated, it’s actually pretty easy to get started using Citadel. As with Mixer, most of the magic happens automatically. You just need to activate the correct configuration to get things going.

To activate global mutual TLS, for example, you’ll need to change the mesh’s authentication policy. That can be done via kubectl; just apply the following:

kubectl apply -f - <<EOF
apiVersion: "authentication.istio.io/v1alpha1"
kind: "MeshPolicy"
metadata:
  name: "default"
spec:
  peers:
  - mtls: {}
EOF

However, that’s only half the story. While you’ve told the mesh services to only accept incoming traffic with TLS, they’re not configured to send outgoing requests using TLS, so they can’t communicate at the moment. To fix that, configure the networking rules to use mutual TLS for outgoing requests:

kubectl apply -f - <<EOF
apiVersion: "networking.istio.io/v1alpha3"
kind: "DestinationRule"
metadata:
  name: "default"
  namespace: "istio-system"
spec:
  host: "*.local"
  trafficPolicy:
    tls:
      mode: ISTIO_MUTUAL
EOF

Then give your services a test. They should still be able to communicate with each other, but now they’re doing so with encrypted traffic. With two commands, you’ve made your network safer and your traffic more secure.
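If you want more than a smoke test, istioctl in the Istio 1.1 line can report the TLS settings in effect between a client pod and a destination service. The pod id below is a placeholder you’d replace with a real one from kubectl get pods:

```shell
# Check whether mutual TLS is consistently configured between this pod
# and the hello1 service ("hello-v1-abc123.default" is hypothetical).
istioctl authn tls-check hello-v1-abc123.default hello1.default.svc.cluster.local
```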

Conclusion

By now, you should be more familiar with the depth and breadth of what Istio offers, and how that relates to the benefits of the service mesh pattern more generally. One nice thing about Istio is how cleanly it separates concerns. Each component, like the Envoy sidecar or the Citadel key management system, is isolated and self-contained.

This means Istio rewards taking some time to learn how individual components work. So take some time to play around with one or two of the components in a scratch application, or work through one of our examples. It’s rewarding and pays dividends in your understanding of what service meshes have to offer.

Of course, it’s nearly impossible for a single developer to really get a grasp on what Istio has to offer. As technology evolves, keeping up with changes and new complexity can be a full-time job. While the examples here can help you learn what’s going on under the hood, it’s typically best to use the available tools to make sure you’re doing things the correct “Istio” way.

Asad Faizi

Founder, CEO
CloudPlex.io, Inc

asad@cloudplex.io

www.cloudplex.io

Faun

The Must-Read Publication for Aspiring Developers & DevOps Enthusiasts
