Debugging microservices on Kubernetes with Istio, OpenTelemetry and Tempo — Part 1

Sander Rodenhuis
Otomi Platform
7 min read · Aug 9, 2023


I recently worked on a side project to improve tracing in Otomi by implementing Grafana Tempo and OpenTelemetry. I’m going to share my experiences and configuration in two posts (because there is a lot involved here). This is the first one.

Don’t expect a fluffy story about tracing in general. I’m going to explain the full setup and share my experiences.

Why this project

Otomi (a self-hosted PaaS for Kubernetes) uses Istio at its core and includes an advanced multi-tenant observability stack with logging (Loki), metrics (Prometheus), alerting (Alertmanager) and tracing (Jaeger). But we got some questions from users about the tracing setup, like: “Why do I only see partial data with partial context (single spans)?” and “Where are traces stored, and for how long?”

What you need to know about Tracing with Istio

First of all, tracing with Istio is easy to set up. Istio is responsible for managing traffic, so it can also report traces that provide visibility into Istio and application behaviour. But because there is no code running within the application itself to collect data, Istio can only collect partial data with partial context.

Ehh, what does that mean? Well, when service A calls service B, Istio creates a span that represents that call. However, when service B calls service C, Istio cannot recognize that this call is part of the same trace originating from service A. To solve this, you’ll need to instrument each service to extract the trace context from the incoming request and inject it into the requests to downstream service(s). Instrumentation can be done manually using the OpenTelemetry SDK, or automatically using the OpenTelemetry Operator.
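
To give an idea of what the automatic route looks like, here is a minimal sketch of an Instrumentation resource for the OpenTelemetry Operator (the name and exporter endpoint are assumptions; the collector it points to is created later in this post, and instrumenting the application is the topic of part two):

apiVersion: opentelemetry.io/v1alpha1
kind: Instrumentation
metadata:
  name: my-instrumentation   # assumed name
spec:
  exporter:
    # assumed endpoint: the OTLP gRPC port of the collector we create later
    endpoint: http://otel-collector-collector.otel.svc.cluster.local:4317
  propagators:
    - tracecontext
    - baggage
    - b3

Workloads then opt in with a pod annotation such as instrumentation.opentelemetry.io/inject-java: "true".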

Where are traces stored?

Jaeger natively supports two open source NoSQL databases as trace storage backends: Cassandra and Elasticsearch. In Otomi we only use object storage. There are some open source projects you can use to connect object storage (like AWS S3) to Jaeger, but these projects are not actively maintained. That’s why we didn’t configure Jaeger with a storage backend, so the answer to the question is: in a K8s volume. That’s not ideal, especially not when you have long retention requirements. This led me to look at using Grafana Tempo as a backend.

Extending the Tracing setup in Otomi with OpenTelemetry and Tempo

I realised that the tracing setup in Otomi was quite limited, so I started a little side project to integrate Tempo as a tracing backend and let teams (tenants) on the platform query Tempo, see all elements involved in a request, and see the interrelationships between their various services using a node graph in Grafana.

Otomi uses the Istio service mesh at its core. Istio leverages Envoy’s distributed tracing feature to provide tracing integration out of the box. Although Istio proxies can automatically send spans, additional information is needed to join those spans into a single trace. So we need context propagation.

This led to the following solution architecture:

  • Install Grafana Tempo
  • Install the OpenTelemetry Operator
  • Configure OpenTelemetry Collector
  • Configure Istio to use the opentelemetry tracing provider and send spans to the OpenTelemetry Collector
  • Configure Grafana datasource for Tempo
  • Configure the Grafana datasource for Loki to provide a direct link from a traceID in the logs to the trace in Tempo
  • Configure Instrumentation for context propagation

Install Grafana Tempo

I’m going to use Tempo as the backend for the traces. Tempo can be configured to use object storage services like AWS S3, Azure Blob Storage or (in my case) a local (S3-compatible) Minio instance running in the cluster.

Before we install the tempo-distributed Helm chart, let’s first look at some important values. I always install charts with my own values ;-)

metricsGenerator:
  enabled: true
  config:
    storage:
      path: /var/tempo/wal
      wal:
      remote_write_flush_deadline: 1m
      remote_write:
        - url: http://po-prometheus.monitoring:9090/api/v1/write

storage:
  trace:
    backend: s3
    s3:
      bucket: tempo
      endpoint: minio.minio.svc.cluster.local:9000
      access_key: my-access-key
      secret_key: my-secret-key
      insecure: true

traces:
  otlp:
    http:
      enabled: true
    grpc:
      enabled: true

metaMonitoring:
  serviceMonitor:
    enabled: true
    labels:
      prometheus: system

Install the chart:

helm repo add grafana https://grafana.github.io/helm-charts
helm install -f my-values.yaml tempo grafana/tempo-distributed -n tempo

As you can see, I’m enabling the metrics generator. This will enable us to see trace-related metrics in Grafana dashboards. More on this later. Please also note that we did not look at resource configuration and scaling options. This is still a PoC, right?!

If you’re using Prometheus (the values below are for the kube-prometheus-stack chart), make sure to enable the remote write receiver like this:

prometheus:
  prometheusSpec:
    enableRemoteWriteReceiver: true

You should now see the following pods running:

# kubectl get po -n tempo                                                                              
NAME READY STATUS RESTARTS AGE
tempo-compactor-d59b598b5-8287b 1/1 Running 4 (6h19m ago) 16h
tempo-distributor-7b5b649487-fbzf2 1/1 Running 4 (6h19m ago) 16h
tempo-ingester-0 1/1 Running 4 (6h19m ago) 16h
tempo-ingester-1 1/1 Running 4 (6h19m ago) 16h
tempo-ingester-2 1/1 Running 4 (6h19m ago) 16h
tempo-memcached-0 1/1 Running 0 16h
tempo-metrics-generator-66c5dfc565-5dhsv 1/1 Running 4 (6h19m ago) 16h
tempo-querier-694cbf6d7-gxjzj 1/1 Running 4 (6h20m ago) 16h
tempo-query-frontend-67b4ff47c6-9msmv 1/1 Running 4 (6h19m ago) 16h

Note that my Minio instance has been set up independently of Tempo. If you don’t already have Minio running (or don’t want to use S3 or an Azure storage container), you can install Minio using the Tempo Helm chart.
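
For completeness: the tempo-distributed chart ships an optional MinIO subchart. As a minimal sketch (assuming the chart’s minio values), enabling it would look like this:

minio:
  enabled: true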

Now that we have Tempo up and running, let’s install the OpenTelemetry Operator. Why am I using the Operator? Well, I’m not a fan of the OpenTelemetry Collector Helm chart, because it never creates the collector configuration I want. With the OpenTelemetry Operator you can create your own custom Collector configuration and have more control over it. Another benefit of using the Operator is that it supports automatic instrumentation!

Install the OpenTelemetry Operator

The configuration is quite straightforward, so let’s just install it:

helm repo add open-telemetry https://open-telemetry.github.io/opentelemetry-helm-charts
helm repo update
helm install opentelemetry-operator open-telemetry/opentelemetry-operator -n otel

Now comes the interesting part: configuring the Collector. Create an OpenTelemetryCollector resource. You can use the following as an example:

apiVersion: opentelemetry.io/v1alpha1
kind: OpenTelemetryCollector
metadata:
  name: otel-collector
spec:
  config: |
    receivers:
      otlp:
        protocols:
          grpc:
          http:
    processors:
      memory_limiter:
        check_interval: 1s
        limit_percentage: 75
        spike_limit_percentage: 15
      batch:
        send_batch_size: 10000
        timeout: 10s
    exporters:
      logging:
        loglevel: info
      otlp:
        endpoint: tempo-distributor.tempo.svc.cluster.local:4317
        sending_queue:
          enabled: true
          num_consumers: 100
          queue_size: 10000
        retry_on_failure:
          enabled: true
        tls:
          insecure: true
    service:
      pipelines:
        traces:
          receivers:
            - otlp
          processors:
            - memory_limiter
            - batch
          exporters:
            - logging
            - otlp
  mode: deployment
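
Save the resource to a file (the filename is my own choice) and apply it in the otel namespace:

kubectl apply -f otel-collector.yaml -n otel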

When the OpenTelemetryCollector is created, you will see the following pods running:

kubectl get po -n otel 
NAME READY STATUS RESTARTS AGE
otel-collector-collector-776cdc65f8-pgmvs 1/1 Running 0 162m
otel-operator-78fc8b6975-h7lkh 2/2 Running 0 16h

Now that we have a backend (Tempo) and a Collector (OpenTelemetry) running, the next step is to send some spans. Let’s start with Istio. Istio controls all traffic, and when debugging applications this becomes a very relevant aspect.

Configure Istio for tracing

Well, that’s easier said than done. There are many ways to configure Istio (Envoy) for tracing, the documentation is fragmented, and complete tutorials are hard to find. You can choose to configure tracing in the defaultConfig or use extensionProviders, and there are multiple extensionProviders. So the question is: “Which configuration to use, and when?” I don’t have all the answers.

I decided to go for the OpenTelemetry tracing provider, and to use the default Envoy provider to add the TRACEPARENT header to the access logs of the istio-proxy sidecar. This is quite handy, because you can then configure the Loki datasource to create a link from the traceID in the logs directly to the trace in Tempo. More on that later.

I’m using Otomi, and Otomi uses the Istio Operator (version 1.17.4). To configure tracing in Istio, we’ll first need to modify the IstioOperator resource, using the following meshConfig:

meshConfig:
  accessLogFile: /dev/stdout
  accessLogFormat: |
    [%START_TIME%] "%REQ(:METHOD)% %REQ(X-ENVOY-ORIGINAL-PATH?:PATH)% %PROTOCOL%" %RESPONSE_CODE% %RESPONSE_FLAGS% %RESPONSE_CODE_DETAILS% %CONNECTION_TERMINATION_DETAILS% "%UPSTREAM_TRANSPORT_FAILURE_REASON%" %BYTES_RECEIVED% %BYTES_SENT% %DURATION% %RESP(X-ENVOY-UPSTREAM-SERVICE-TIME)% "%REQ(X-FORWARDED-FOR)%" "%REQ(USER-AGENT)%" "%REQ(X-REQUEST-ID)%" "%REQ(:AUTHORITY)%" "%UPSTREAM_HOST%" %UPSTREAM_CLUSTER% %UPSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_LOCAL_ADDRESS% %DOWNSTREAM_REMOTE_ADDRESS% %REQUESTED_SERVER_NAME% %ROUTE_NAME% traceID=%REQ(TRACEPARENT)%
  enableAutoMtls: true
  extensionProviders:
    - opentelemetry:
        port: 4317
        service: otel-collector-collector.otel.svc.cluster.local
      name: otel-tracing
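
For context: this meshConfig block goes under spec in the IstioOperator resource. A minimal sketch, assuming a resource name (yours may differ):

apiVersion: install.istio.io/v1alpha1
kind: IstioOperator
metadata:
  name: istiocontrolplane   # assumed name; edit your existing resource
  namespace: istio-system
spec:
  meshConfig:
    # ...the accessLog settings and extensionProviders shown above go here...
    extensionProviders:
      - opentelemetry:
          port: 4317
          service: otel-collector-collector.otel.svc.cluster.local
        name: otel-tracing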

To enable the extensionProvider, you’ll need to create a Telemetry resource:

apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: otel-tracing
  namespace: istio-system
spec:
  tracing:
    - providers:
        - name: otel-tracing
      randomSamplingPercentage: 100

By creating this resource in the istio-system namespace, the provider will be active for all namespaces.

I set the randomSamplingPercentage to 100%. In a production environment this will probably be 0.1%.
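
Sampling can also be tuned per team: a Telemetry resource in a workload namespace overrides the mesh-wide one in istio-system. A minimal sketch, assuming a team namespace:

apiVersion: telemetry.istio.io/v1alpha1
kind: Telemetry
metadata:
  name: otel-tracing
  namespace: team-demo   # assumed namespace
spec:
  tracing:
    - providers:
        - name: otel-tracing
      randomSamplingPercentage: 1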

After the Istio operator has reconciled the changes, you should see spans coming in that are then exported to Tempo:

kubectl logs otel-collector-collector-776cdc65f8-pgmvs -n otel
2023-08-09T13:07:58.186Z info TracesExporter {"kind": "exporter", "data_type": "traces", "name": "logging", "resource spans": 5, "spans": 7}
2023-08-09T13:08:08.186Z info TracesExporter {"kind": "exporter", "data_type": "traces", "name": "logging", "resource spans": 2, "spans": 6}

In Otomi we also use the Nginx Ingress Controller. To eventually see the complete trace, from the ingress controller through the Istio gateways and ingresses down to the application, we’ll also configure the Nginx controller to send spans to the collector. Configure Nginx Ingress using the following values:

controller:
  opentelemetry:
    enabled: true
  config:
    enable-opentelemetry: true
    otel-sampler: AlwaysOn
    otel-sampler-ratio: 0.1
    otlp-collector-host: otel-collector-collector.otel.svc
    otlp-collector-port: 4317
    opentelemetry-config: "/etc/nginx/opentelemetry.toml"
    opentelemetry-operation-name: "HTTP $request_method $service_name $uri"
    opentelemetry-trust-incoming-span: "true"
    otel-max-queuesize: "2048"
    otel-schedule-delay-millis: "5000"
    otel-max-export-batch-size: "512"
    otel-service-name: "nginx"
    otel-sampler-parent-based: "true"
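
If you install the controller yourself with the ingress-nginx Helm chart, applying these values could look like this (release name, values file and namespace are assumptions):

helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm upgrade --install ingress-nginx ingress-nginx/ingress-nginx \
  -f nginx-values.yaml -n ingress-nginx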

See the ingress-nginx documentation to learn more about tracing in Nginx Ingress using OpenTelemetry.

Wrap up (for now)

We now have a backend for our traces, a Collector to receive trace spans and export them to the backend (Tempo), and Istio and Nginx Ingress Controller sending trace spans to the Collector.

In the second part, we’re going to instrument our application and configure datasources in Grafana for Tempo to see the real power of tracing in Kubernetes.
