Istio and Stackdriver

Yuri Grinshteyn
Google Cloud - Community
6 min read · Jul 19, 2019

Update 08/31/2021

The mechanisms that Istio uses to communicate telemetry have changed completely since this was written. For now, treat the following page as the source of truth:

https://istio.io/latest/docs/reference/config/proxy_extensions/stackdriver/

Update 11/7/2019

The Stackdriver adapter that I manually installed in the OSS Istio section below has been updated, and the link no longer works. You now need to install the adapter using the instructions here.

Introduction

Developers use microservices to architect for portability, scale, and decoupling. This creates operational and management challenges: you have to manage all the various services and understand the interactions between them. This is where a service mesh comes in. The term service mesh describes the network of microservices that make up a distributed application and the interactions between them. The real value of a service mesh like Istio is that it enhances the security, reliability, and observability of services, none of which is necessarily easy to achieve on Kubernetes without it. I wanted to take a look at how a service mesh and Stackdriver can work together to help understand what’s going on in a distributed application.

Observability with the Istio on GKE Add-On

The first option I wanted to review was monitoring a GKE cluster with the Istio add-on. Istio on GKE is an add-on for GKE that lets you quickly create a cluster with all the components you need to create and run an Istio service mesh, in a single step. Once installed, your Istio control plane components are automatically kept up-to-date, with no need for you to worry about upgrading to new versions. You can also use the add-on to install Istio on an existing cluster.

Here’s the command I used to create a cluster with the add-on:

gcloud beta container clusters create istio-addon-cluster \
    --addons=Istio --istio-config=auth=MTLS_PERMISSIVE \
    --cluster-version=latest \
    --num-nodes=5 \
    --enable-stackdriver-kubernetes

Note that I used the PERMISSIVE setting for the mutual TLS (mTLS) configuration for the sake of simplicity. I recommend reviewing the relevant Istio documentation to choose the right policy for your deployment.
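Once the cluster is up, a quick sanity check is to fetch its credentials and list the Istio control plane pods. The sketch below is one way to do that, not the exact steps from the original walkthrough. The helper name check_istio_control_plane is hypothetical, and it assumes a default zone or region is already set in the gcloud configuration.

```shell
# Hypothetical helper: fetch cluster credentials, then list the Istio
# control plane pods that the add-on installs in the istio-system namespace.
# Assumes a default compute zone/region is configured for gcloud.
check_istio_control_plane() {
  cluster_name="${1:-istio-addon-cluster}"
  gcloud container clusters get-credentials "$cluster_name" &&
    kubectl get pods --namespace istio-system
}

# Usage (not run here):
#   check_istio_control_plane istio-addon-cluster
```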

From there, I deployed the Hipster Shop microservices demo app on the cluster using the Istio instructions. That created a lot of services (as expected):

These services are instrumented with the proxy sidecar. The proxies send telemetry information to a component called Mixer, which in turn sends it to Stackdriver via the Stackdriver adapter.
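To confirm that the sidecar is actually present, one option is to list the containers in each pod and look for istio-proxy next to the application container. This is a minimal sketch; the helper name list_pod_containers and the choice of the default namespace are assumptions.

```shell
# Hypothetical helper: print each pod in a namespace with its container names.
# Pods in the mesh should show an istio-proxy container alongside the app.
list_pod_containers() {
  pod_namespace="${1:-default}"
  kubectl get pods --namespace "$pod_namespace" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.spec.containers[*].name}{"\n"}{end}'
}

# Usage (assumes the demo app runs in the "default" namespace; not run here):
#   list_pod_containers default | grep istio-proxy
```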

The easiest place to see Istio metrics in Stackdriver is in Metrics Explorer — I simply searched for “istio” and saw all the metrics created by Istio for the services in the mesh:

From there, I was able to create an Alerting Policy for service availability — I used the Server Response Latency metric, filtered it by Destination Service Name for just the front end, and further filtered it by response code to just measure latency for successful requests.
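The same kind of policy can also be described as a file and created from the command line. The sketch below is an illustration under several assumptions, not the policy from the original walkthrough: the metric type, the resource type, the label names, the service name "frontend", the quoted form of the response code filter, and the 500 ms threshold should all be confirmed in Metrics Explorer before use.

```shell
# Write an alerting policy definition to a file. All values below are
# illustrative assumptions; verify metric and label names in Metrics Explorer.
cat > latency-policy.json <<'EOF'
{
  "displayName": "Frontend server response latency",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "p99 latency of successful frontend requests above 500 ms",
      "conditionThreshold": {
        "filter": "metric.type=\"istio.io/service/server/response_latencies\" AND resource.type=\"k8s_container\" AND metric.labels.destination_service_name=\"frontend\" AND metric.labels.response_code=\"200\"",
        "aggregations": [
          { "alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_PERCENTILE_99" }
        ],
        "comparison": "COMPARISON_GT",
        "thresholdValue": 500,
        "duration": "300s"
      }
    }
  ]
}
EOF

# To create the policy from the file (not run here):
#   gcloud alpha monitoring policies create --policy-from-file=latency-policy.json
```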

Note that you should consider creating a policy for metric absence so that you can be notified of any issues with Mixer and the Stackdriver adapter sending metrics to Stackdriver. Otherwise, you run the risk of not catching an issue due to a problem with your monitoring infrastructure.
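A metric-absence policy can be sketched in the same file-based style. The example below assumes that the server request count metric is a reasonable signal that the pipeline is alive. The metric type, resource type, and five-minute window are illustrative assumptions rather than values from the original setup.

```shell
# Write a metric-absence policy definition to a file. It fires when no Istio
# request count data arrives for five minutes. Values are assumptions.
cat > absence-policy.json <<'EOF'
{
  "displayName": "Istio metrics missing",
  "combiner": "OR",
  "conditions": [
    {
      "displayName": "No Istio request counts for 5 minutes",
      "conditionAbsent": {
        "filter": "metric.type=\"istio.io/service/server/request_count\" AND resource.type=\"k8s_container\"",
        "aggregations": [
          { "alignmentPeriod": "60s", "perSeriesAligner": "ALIGN_RATE" }
        ],
        "duration": "300s"
      }
    }
  ]
}
EOF

# To create the policy from the file (not run here):
#   gcloud alpha monitoring policies create --policy-from-file=absence-policy.json
```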

The Stackdriver adapter exports more than just metrics — it also sends traces to Stackdriver Trace. I went to the Trace List screen to confirm that:

Note that tracing is actually disabled by default if the Istio add-on has installed Istio version 1.1.7 or later (as per the documentation). You can enable it by updating the stackdriver-tracing-rule in the istio-system namespace — refer to the documentation for the latest instructions on how to enable this integration.
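Before changing anything, it can help to look at the current state of that rule. The sketch below only reads the rule and prints it as YAML. The helper name show_tracing_rule is hypothetical, and it assumes the Mixer-era rule resource is still installed in the cluster.

```shell
# Hypothetical helper: print the current tracing rule so its match expression
# can be reviewed before editing. Assumes the Mixer "rule" resource exists.
show_tracing_rule() {
  kubectl get rule stackdriver-tracing-rule --namespace istio-system -o yaml
}

# Usage (not run here):
#   show_tracing_rule
```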

The adapter also exports Istio-specific logs to Stackdriver. For example, I was able to view the istio-proxy logs, which list all the requests traversing the mesh:
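The same logs can be read from the command line. This is a minimal sketch that assumes the proxy logs appear under the k8s_container resource type with a container name of istio-proxy. The helper name read_proxy_logs and the default limit of 20 entries are hypothetical.

```shell
# Hypothetical helper: read recent istio-proxy log entries from Stackdriver
# Logging. The filter assumes the k8s_container resource type is in use.
read_proxy_logs() {
  log_project_id="$1"
  log_limit="${2:-20}"
  gcloud logging read \
    'resource.type="k8s_container" AND resource.labels.container_name="istio-proxy"' \
    --project "$log_project_id" \
    --limit "$log_limit" \
    --format json
}

# Usage (replace the project ID with your own; not run here):
#   read_proxy_logs my-project-id 10
```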

The same documentation link will also help you disable logging integration if logging costs are a concern.

Using the add-on with the automatically installed Stackdriver adapter provided everything I needed for microservice observability. The takeaway here is that the Istio add-on makes service monitoring very easy: there’s no need to install or manage the Stackdriver adapter or any associated configuration, because everything was done for me.

Observability with manually installed Istio

Some users may prefer to install Istio in their clusters themselves, rather than use the Istio add-on, to have tighter control over the Istio version or use features or customizations that the add-on cannot support. To get the same observability benefits with Stackdriver, they will need to install the Stackdriver adapter themselves. Here’s what that experience is like.

I started with a GKE cluster with Stackdriver monitoring and logging enabled. As before, I installed Istio with mutual TLS set to PERMISSIVE for the purposes of this demo.

Next, I needed to determine how the adapter was going to authenticate to the Stackdriver APIs. I chose the simplest route — an API key. I created it from the APIs and Credentials section in the Cloud Console and restricted it to only the Monitoring, Logging, and Tracing APIs.

Now, it was time to configure the Stackdriver adapter. I downloaded the configuration file and modified the top section to specify my project ID and API key (note that I uncommented the apiKey line):

spec:
  # We'll use the default value from the adapter, once per minute, so we don't need to supply a value.
  # pushInterval: 1m
  # Must be supplied for the stackdriver adapter to work
  project_id: <your project ID>
  # One of the following must be set; the preferred method is `appCredentials`, which corresponds to
  # Google Application Default Credentials. See:
  #    https://developers.google.com/identity/protocols/application-default-credentials
  # If none is provided we default to app credentials.
  #appCredentials:
  apiKey: <api key>
  # serviceAccountPath:

I created the adapter by running:

kubectl apply -f stackdriver.yaml

As before, I went to Metrics Explorer to confirm that metrics were being sent:

I also validated traces in the Trace List:

and logs in the Log Viewer, again by looking at the istio-proxy logs:

Without the Istio add-on, connecting the observability data to Stackdriver took a bit more work: I had to configure authentication and then install the adapter myself. The end result was the same, though. Users who need to install Istio on GKE manually, for example to control the Istio version explicitly, still get all of the observability benefits of the Stackdriver integration. Metrics, traces, and logs were being sent to Stackdriver, and I had a full observability picture of my application running in the Istio mesh.

Note: if you’re concerned about the overhead of using Istio, you can review the relevant performance and scalability documentation, which addresses this directly.

Conclusion

In this exercise, I set out to understand how the experience of monitoring Istio with Stackdriver differs between using the Istio on GKE add-on and installing Istio myself. It turns out that the configuration was only slightly more complex with manually installed Istio: I had to choose a means of authentication and configure the Stackdriver adapter. In the end, the observability data was exactly the same.

I hope you found this useful. As always, comments and feedback are appreciated.

Resources and additional information

Yuri Grinshteyn
Google Cloud - Community

CRE at Google Cloud. I write about observability in Google Cloud, especially as it relates to SRE practices.