Istio & Metrics

Matt Law
Nov 15, 2018


The out-of-the-box dashboards described in Part 2 are a great start for understanding what's going on inside the system. We, however, have a pre-existing time series monitoring platform, SignalFx. We use it as the aggregator for all of our time series monitoring, and it combines well with PagerDuty to alert the relevant teams to issues.

What I intend to cover in this article is how to incorporate external monitoring, using SignalFx as an example.

Before starting, a bit of background reading: have a look at the Collecting Metrics and Logs page and the Adapters page. A number of external adapters are supported, and SignalFx is one of them.

What does it all mean?

There are a couple of new concepts here that need to be understood. The basic premise is that we are going to tell Mixer how to handle metrics for an adapter and where to send them. Here is my very simplistic overview:

Metrics

As the name indicates, these are the metrics you're interested in seeing. Istio comes with a few default metrics, which you can see here (git repo is here). There is scope for adding your own via the metric template, but I'm not going to cover that.

Rule

A rule defines filters for providing metrics or alerts on all or some resources, e.g. collect tcpbytesent for all instances, or tcpbytesent for a specific service only. We have the ability to be very fine-grained here; however, I'm still looking for documentation of how this is structured.

Adapter

Istio provides a number of adapters out of the box. These integrate with Mixer to allow Istio to interface with different, usually external, backends for metrics and logging.

The SignalFx adapter is here, and corresponding example configuration here. To see examples of the various supported adapters, have a look at the git repo: https://github.com/istio/istio/tree/master/mixer/adapter

Handler

A handler takes the predefined adapter and links the metrics into it. It may be easier to start by referencing the code that Istio provides.

Putting it together:

I think it's easier to work from examples, so let's have a look at how it all fits together. Let's start with the handler definition. Its `kind` is signalfx and its name is handler. This confused me at first (I thought it should be the other way around), but it does allow for multiple handlers for the same adapter, depending on the use case. For example, we have multiple development teams (and brands), and it might be a good idea to split the metrics per team in SignalFx so that we can apportion cost.

The purpose of the definition is to start tying all the pieces together. We can see how it specifies which metrics to include. The custom piece in the configuration below is the access token, which can be generated from the SignalFx “Access Tokens” page. Different handlers will have their own mechanism for injecting data.

apiVersion: "config.istio.io/v1alpha2"
kind: signalfx
metadata:
  name: handler
  namespace: istio-system
spec:
  access_token: REDACTED
  metrics:
  - name: requestcount.metric.istio-system
    type: COUNTER
  - name: requestduration.metric.istio-system
    type: COUNTER
  - name: requestsize.metric.istio-system
    type: COUNTER
  - name: responsesize.metric.istio-system
    type: COUNTER
  - name: tcpbytesent.metric.istio-system
    type: COUNTER
  - name: tcpbytereceived.metric.istio-system
    type: COUNTER
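To illustrate the per-team split mentioned earlier, a second handler for the same adapter might look like the sketch below. The handler name ratingsteam and its token placeholder are hypothetical, and I'm assuming each team would get its own SignalFx access token; only the fields already used above appear here.

```yaml
# Hypothetical second handler for the same signalfx adapter.
# "ratingsteam" is an illustrative name, not something Istio ships.
apiVersion: "config.istio.io/v1alpha2"
kind: signalfx
metadata:
  name: ratingsteam
  namespace: istio-system
spec:
  # Assumption: a separate token per team, so usage can be attributed per team.
  access_token: REDACTED_TEAM_TOKEN
  metrics:
  - name: requestcount.metric.istio-system
    type: COUNTER
  - name: requestduration.metric.istio-system
    type: COUNTER
```

A rule would then refer to this handler as ratingsteam.signalfx, following the same name.kind pattern as handler.signalfx.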

The metrics outlined above pre-exist. Check them out:

kubectl get metrics --all-namespaces
NAMESPACE      NAME              AGE
istio-system   requestcount      22d
istio-system   requestduration   22d
istio-system   requestsize       22d
istio-system   responsesize      22d
istio-system   tcpbytereceived   22d
istio-system   tcpbytesent       22d
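For orientation, each of these is a metric instance that maps request attributes onto a value and a set of dimensions. The sketch below is trimmed and illustrative only: the dimensions shown are my assumption of a typical subset, and the real definitions in your cluster will carry more (and possibly differently named) attributes depending on the Istio version.

```yaml
# Trimmed, illustrative sketch of a metric instance.
# The dimension list is an assumption; inspect the real object in your cluster.
apiVersion: "config.istio.io/v1alpha2"
kind: metric
metadata:
  name: requestcount
  namespace: istio-system
spec:
  # Each request contributes 1 to the count.
  value: "1"
  dimensions:
    source_service: source.service | "unknown"
    destination_service: destination.service | "unknown"
    response_code: response.code | 200
  monitored_resource_type: '"UNSPECIFIED"'
```

The handler refers to this instance as requestcount.metric.istio-system, i.e. name, kind, then namespace.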

Following on from the handler definition, we now have a rule definition. Here we put our ‘match’, as well as the ‘actions’ to take when it matches. The documentation for rules is here. The following will match all services that have traffic for the protocols listed.

apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: signalfxhttp
  namespace: istio-system
spec:
  match: context.protocol == "http" || context.protocol == "grpc"
  actions:
  - handler: handler.signalfx
    instances:
    - requestcount.metric
    - requestduration.metric
    - requestsize.metric
    - responsesize.metric
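Note that this rule only forwards the HTTP and gRPC metrics. The two TCP metrics listed in the handler would need a rule of their own. A minimal sketch, assuming the same handler is reused and with signalfxtcp as a rule name of my own choosing:

```yaml
# Sketch of a companion rule for TCP traffic.
# "signalfxtcp" is an illustrative name.
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: signalfxtcp
  namespace: istio-system
spec:
  # Only evaluate this rule for raw TCP connections.
  match: context.protocol == "tcp"
  actions:
  - handler: handler.signalfx
    instances:
    - tcpbytesent.metric
    - tcpbytereceived.metric
```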

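To match on a specific service, your match expression would change to something like this, using the glob-style match function from Mixer's expression language (a plain == is an exact comparison and needs a quoted string):

match: match(destination.service, "ratings*")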
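Dropped into a full rule, a service-scoped match could look like the sketch below. The rule name signalfxratings is hypothetical, and the expression assumes the glob-style match() function from Mixer's expression language, combined with the protocol check used earlier.

```yaml
# Sketch of a rule scoped to services whose name starts with "ratings".
# "signalfxratings" is an illustrative name.
apiVersion: "config.istio.io/v1alpha2"
kind: rule
metadata:
  name: signalfxratings
  namespace: istio-system
spec:
  # Glob match on the destination service, limited to HTTP traffic.
  match: context.protocol == "http" && match(destination.service, "ratings*")
  actions:
  - handler: handler.signalfx
    instances:
    - requestcount.metric
    - requestduration.metric
```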

We have now tied together our (pre-existing) metrics, our handler and our rules to send data to our external monitoring service.

It even worked!
