Grafana Alloy & OpenTelemetry

Say Hello to Alloy

Magsther
May 18, 2024

Introduction

Grafana Agent has recently transitioned to Grafana Alloy, and in this post I’ll guide you through the change. While most examples show telemetry data being sent directly from the agent to a vendor’s backend, OpenTelemetry introduces an exciting component called the Collector.

With the OpenTelemetry Collector, users gain control over their data, enabling manipulation, enrichment, and vendor-agnostic destination decisions.

Please refer to my earlier post, OpenTelemetry as a Service, if you want more information on setting up a central OpenTelemetry collector (gateway).


What is Grafana Alloy, and why should I use it?

Grafana Alloy operates as an OpenTelemetry (OTEL) agent, enabling the collection and transmission of telemetry data from diverse sources to monitoring systems for analysis and visualization.

Alloy replaces ingestion components such as the OpenTelemetry Collector or Grafana Agent, but it does not replace backends like Mimir, Tempo, or Loki.

Grafana Alloy is a vendor-neutral distribution of the OTEL collector and the recommended way to send OTEL data to Grafana Cloud.

For example, if you intend to ingest and display logs, metrics, and traces in Grafana, you can set up Loki, Mimir, and Tempo as datasources and deploy Alloy to scrape and send the data there.

This approach lets you manage ingestion for multiple purposes with a single component, making Alloy a versatile tool for a range of monitoring needs.

Walkthrough of Grafana Alloy

Prerequisites

Before we begin, ensure you have a Kubernetes cluster set up. You can create a new local Kubernetes cluster by running the following command:

kind create cluster

You’ll also need a free Grafana Cloud account. Grafana Cloud is a fully managed cloud observability platform built on open-source projects like Grafana, Mimir, Loki, and Tempo. If you haven’t already, sign up for a free account as it provides more than enough for what we’ll be doing here.

Add the necessary repository:

helm repo add grafana https://grafana.github.io/helm-charts
helm repo update

Setting up configuration

There are two alternatives for configuring Grafana Alloy for your Kubernetes cluster in Grafana Cloud. I’ll outline both, starting with the easiest method.

Alternative 1 (the easiest way)

To configure Grafana Alloy for your Kubernetes cluster in Grafana Cloud, follow these steps:

  1. Navigate to Infrastructure > Kubernetes > Configuration in Grafana Cloud.
  2. Select the desired features you want to enable.
  • Metrics: This option scrapes Kubernetes cluster infra metrics and sends them to Grafana Cloud Prometheus.
  • Cost Metrics: Enable this to scrape cost metrics, sending them to Grafana Cloud Prometheus.
  • Cluster Events: Capture Kubernetes cluster events, sending them to Grafana Cloud Loki.
  • Pod Logs: Capture pod logs and send them to Grafana Cloud Loki.
  • OTLP Receivers: Grafana Alloy will be configured to receive OpenTelemetry data via OTLP/gRPC and OTLP/HTTP.

3. Proceed with the configuration process.

However, if you want to send data to your own OpenTelemetry Collector, continue with Alternative 2.

Alternative 2 (the recommended way)

This approach offers more flexibility and control over the configuration process.

  1. Begin by preparing the Helm values file.
  2. Configure the Helm values file according to your requirements, replacing <OTLP endpoint>, <your_endpoint>, <my_username>, and <my_secret_password> with your actual values. Also, ensure the protocol is set to otlphttp if you're using OTLP over HTTP. (A fuller, hedged sketch of a values file follows the deployment command below.)
host: <OTLP endpoint>
writeEndpoint: /<your_endpoint>
protocol: otlphttp
basicAuth:
  username: <my_username>
  password: <my_secret_password>

3. Deploy the Helm chart to your Kubernetes cluster.

helm repo add grafana https://grafana.github.io/helm-charts &&
helm repo update &&
helm upgrade --install --atomic --timeout 300s grafana-k8s-monitoring grafana/k8s-monitoring \
  --namespace "default" --create-namespace --values - <<EOF
# paste your Helm values here (see step 2)
EOF
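
For reference, here is a minimal sketch of what a complete values file could look like. The cluster and externalServices keys are my assumptions about the grafana/k8s-monitoring chart's layout, and the hosts are placeholders — verify them against the chart's documentation and your Grafana Cloud stack before using it.

# Hypothetical values sketch for the grafana/k8s-monitoring chart.
# Key names and endpoints are assumptions; verify against the chart docs
# and replace the placeholders with values from your own stack.
cluster:
  name: my-kind-cluster                  # any name that identifies this cluster

externalServices:
  prometheus:
    host: <OTLP endpoint>                # your collector or Grafana Cloud endpoint
    writeEndpoint: /<your_endpoint>
    protocol: otlphttp                   # use otlphttp when sending OTLP over HTTP
    basicAuth:
      username: <my_username>
      password: <my_secret_password>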

Verify Deployment

Ensure the deployment was successful by inspecting the deployed resources using:

kubectl get all -n default

PODS

NAME READY STATUS RESTARTS AGE
grafana-k8s-monitoring-alloy-0 2/2 Running 0 7m29s
grafana-k8s-monitoring-alloy-events-6fc5d58d6f-k4fz9 2/2 Running 0 7m29s
grafana-k8s-monitoring-alloy-logs-x4chm 2/2 Running 0 7m29s
grafana-k8s-monitoring-kube-state-metrics-f995ccbbc-7gdkh 1/1 Running 0 7m29s
grafana-k8s-monitoring-opencost-6659b44b6f-dn8ck 1/1 Running 0 7m29s
grafana-k8s-monitoring-prometheus-node-exporter-g2ddm 1/1 Running 0 7m29s

DEPLOYMENT
NAME READY UP-TO-DATE AVAILABLE AGE
grafana-k8s-monitoring-alloy-events 1/1 1 1 7m46s
grafana-k8s-monitoring-kube-state-metrics 1/1 1 1 7m46s
grafana-k8s-monitoring-opencost 1/1 1 1 7m46s

SERVICES
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
grafana-k8s-monitoring-alloy ClusterIP 10.96.55.19 <none> 12345/TCP,4317/TCP,4318/TCP,9999/TCP,14250/TCP,6832/TCP,6831/TCP,14268/TCP,9411/TCP 8m7s
grafana-k8s-monitoring-alloy-cluster ClusterIP None <none> 12345/TCP,4317/TCP,4318/TCP,9999/TCP,14250/TCP,6832/TCP,6831/TCP,14268/TCP,9411/TCP 8m7s
grafana-k8s-monitoring-alloy-events ClusterIP 10.96.222.205 <none> 12345/TCP 8m7s
grafana-k8s-monitoring-alloy-logs ClusterIP 10.96.68.158 <none> 12345/TCP 8m7s
grafana-k8s-monitoring-grafana-agent ClusterIP 10.96.207.204 <none> 12345/TCP,4317/TCP,4318/TCP,9999/TCP,14250/TCP,6832/TCP,6831/TCP,14268/TCP,9411/TCP 8m7s
grafana-k8s-monitoring-kube-state-metrics ClusterIP 10.96.168.190 <none> 8080/TCP 8m7s
grafana-k8s-monitoring-opencost ClusterIP 10.96.55.101 <none> 9003/TCP 8m7s
grafana-k8s-monitoring-prometheus-node-exporter ClusterIP 10.96.162.70 <none> 9100/TCP

DAEMONSET
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
grafana-k8s-monitoring-alloy-logs 1 1 1 1 1 kubernetes.io/os=linux 8m52s
grafana-k8s-monitoring-prometheus-node-exporter 1 1 1 1 1 kubernetes.io/os=linux 8m52s

CONFIGMAPS
NAME DATA AGE
grafana-k8s-monitoring-alloy 1 9m4s
grafana-k8s-monitoring-alloy-events 1 9m4s
grafana-k8s-monitoring-alloy-logs 1 9m4s
kube-root-ca.crt 1 121m
kubernetes-monitoring-telemetry 1 9m4s

Inspecting Grafana Alloy

To inspect the Grafana Alloy configuration, run:

kubectl get cm grafana-k8s-monitoring-alloy -o yaml

Here, you can observe the configuration details, including receivers, processors, connectors and exporters used within the OpenTelemetry Collector.

Discovery

You can see that Grafana Alloy first sets up Kubernetes discovery for various Kubernetes resources, such as nodes, services, endpoints, and pods, which can then be used for visualization, monitoring, and analysis in Grafana.

discovery.kubernetes "nodes" {
  role = "node"
}

discovery.kubernetes "services" {
  role = "service"
}

discovery.kubernetes "endpoints" {
  role = "endpoints"
}

discovery.kubernetes "pods" {
  role = "pod"
}

OTLP Receivers

Receivers are set up to collect telemetry data using the OpenTelemetry Protocol (OTLP) over gRPC and HTTP.

// OTLP Receivers
otelcol.receiver.otlp "receiver" {
  debug_metrics {
    disable_high_cardinality_metrics = true
  }

  grpc {
    endpoint = "0.0.0.0:4317"
  }

  http {
    endpoint = "0.0.0.0:4318"
  }

  output {
    metrics = [otelcol.processor.resourcedetection.default.input]
    logs    = [otelcol.processor.resourcedetection.default.input]
    traces  = [otelcol.processor.resourcedetection.default.input]
  }
}

The output block indicates that metrics, logs, and traces should be sent to a processor called otelcol.processor.resourcedetection.default.input.

Processors

Processors are configured to transform and enrich telemetry data, improving the observability of our monitoring infrastructure. Processors play a big role in transforming, detecting, extracting, and filtering telemetry data within the OpenTelemetry Collector.

As an example, here are the processors being used.

otelcol.processor.transform          # Transforms metric data by adding attributes.
otelcol.processor.resourcedetection  # Detects resources in the environment.
otelcol.processor.k8sattributes      # Extracts Kubernetes-related metadata.
otelcol.connector.host_info          # Collects host information.
otelcol.processor.batch              # Batches data before export.
otelcol.processor.filter             # Filters data.
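
To make the wiring concrete, here is a minimal sketch (not the chart's actual generated config) of how a processor such as otelcol.processor.batch forwards data onward; the exporter it points at is a hypothetical one named default, similar to the sketch in the Exporters section below.

// Minimal sketch, not the generated config: a batch processor that
// forwards all three signals to a hypothetical otlphttp exporter.
otelcol.processor.batch "example" {
  output {
    metrics = [otelcol.exporter.otlphttp.default.input]
    logs    = [otelcol.exporter.otlphttp.default.input]
    traces  = [otelcol.exporter.otlphttp.default.input]
  }
}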

Exporters

Exporters are defined to export telemetry data to external systems. For example, exporting metrics to Prometheus, logs to Loki, and traces to Tempo.
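
As a rough, hedged sketch of what exporter blocks can look like in Alloy syntax (the endpoints and component labels are placeholders, not what the chart generates):

// Sketch only: export traces over OTLP/HTTP and metrics via remote_write.
// Endpoints and labels are placeholders for illustration.
otelcol.exporter.otlphttp "default" {
  client {
    endpoint = "https://<OTLP endpoint>"
  }
}

prometheus.remote_write "metrics" {
  endpoint {
    url = "https://<prometheus host>/api/prom/push"
  }
}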

Annotation Autodiscovery

Rules are defined to dynamically discover and scrape metrics from pods and services based on annotations.
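
For example, a pod could opt in to scraping with annotations along these lines; note that the annotation keys shown here (k8s.grafana.com/scrape and k8s.grafana.com/metrics.portNumber) are my assumption about the chart's convention, so check the chart's documentation before relying on them.

# Hypothetical pod manifest opting in to annotation-based autodiscovery.
# The annotation keys are assumptions; verify against the chart's docs.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
  annotations:
    k8s.grafana.com/scrape: "true"              # opt this pod in to scraping
    k8s.grafana.com/metrics.portNumber: "8080"  # port exposing /metrics
spec:
  containers:
    - name: my-app
      image: my-app:latest
      ports:
        - containerPort: 8080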

Scraping Configurations

Configurations are set up to scrape metrics from various components such as Kubernetes, cAdvisor, Kubelet, and Node Exporter.
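
As a minimal sketch in Alloy syntax (component labels are hypothetical), a scrape component takes targets from the Kubernetes discovery shown earlier and forwards the samples on:

// Sketch only: scrape discovered pod targets and forward the samples
// to the hypothetical remote_write component from the Exporters sketch.
prometheus.scrape "pods" {
  targets    = discovery.kubernetes.pods.targets
  forward_to = [prometheus.remote_write.metrics.receiver]
}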

Relabeling

Rules are defined to modify labels attached to metrics and logs. For example, renaming labels, dropping unnecessary labels, and filtering out certain metrics.
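
Here is a small, hedged sketch of what such rules can look like in Alloy syntax; the metric pattern and label names are made up for illustration:

// Sketch only: drop Go runtime metrics and rename a label before remote_write.
prometheus.relabel "cleanup" {
  forward_to = [prometheus.remote_write.metrics.receiver]

  rule {
    source_labels = ["__name__"]
    regex         = "go_.*"
    action        = "drop"
  }

  rule {
    source_labels = ["pod"]
    target_label  = "pod_name"
    action        = "replace"
  }
}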

Authentication

Basic authentication is configured for exporters and other components where necessary.
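
As a hedged sketch, basic-auth credentials can be defined once and attached to an exporter like this (labels and endpoint are placeholders):

// Sketch only: basic-auth credentials attached to an OTLP/HTTP exporter.
otelcol.auth.basic "creds" {
  username = "<my_username>"
  password = "<my_secret_password>"
}

otelcol.exporter.otlphttp "authenticated" {
  client {
    endpoint = "https://<OTLP endpoint>"
    auth     = otelcol.auth.basic.creds.handler
  }
}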

I recommend inspecting the ConfigMap to understand the configuration file, as it contains the complete details and configurations.

Once you’ve reviewed and configured the Grafana Alloy settings as per your requirements, return to the UI to verify that everything is functioning as expected.

Explore Monitoring Components

Cluster Navigation

Explore the out-of-the-box cluster navigation features provided by Grafana, providing valuable insights into your Kubernetes cluster.

Cost Monitoring

Utilize Grafana’s cost overview page to monitor cloud costs, essential for efficient resource management.

Alerts

Set up alert rules and recording rules to ensure proactive monitoring and timely response to critical events.

Cardinality

This post isn’t focused on cardinality, but it’s important to mention. Monitoring cardinality helps you understand data uniqueness, which is crucial for optimizing performance and resource utilization. High cardinality refers to data sets with many unique values.

Adaptive Metrics in Grafana Cloud could be used to solve increased cardinality issues.

Configure Application Instrumentation

Now we need to tell our applications to send telemetry data to Grafana Alloy at http://grafana-k8s-monitoring-grafana-agent.default.svc.cluster.local:4318.

That’s right, this local OpenTelemetry agent serves as the intermediary, receiving data from both the cluster and applications.

Metrics and Logs

Now that our metrics and logs are flowing from our Kubernetes Cluster to Grafana Alloy, they’re routed to our central OpenTelemetry Collector and then visualized in Grafana Cloud.

Accessing this data is easy — you can navigate through the Cluster Navigation or use the Explore tool. Personally, I prefer the Explore tool for its versatility. It enables exploration of telemetry data from diverse sources. With this powerful tool, you can craft queries, apply filters and transformations, and even perform calculations to extract valuable insights efficiently.

Traces

But what about Traces? Traces provide crucial insights into the journey of user requests through your application. To utilize traces, you’ll need an instrumented application configured to send data to Grafana Alloy. This can be done by pointing the application at Alloy’s OTLP endpoint.

For example, in a Spring Boot application, you can set the OTEL_EXPORTER_OTLP_ENDPOINT environment variable to the following address:

- name: OTEL_EXPORTER_OTLP_ENDPOINT
  value: http://grafana-k8s-monitoring-grafana-agent.default.svc.cluster.local:4317

I also added the OTEL_SERVICE_NAME variable, setting it to springboot-service:

- name: OTEL_SERVICE_NAME
  value: springboot-service
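
Putting the two together, a container spec for a hypothetical springboot-service Deployment could carry both variables like this (the image, labels, and port are placeholders):

# Hypothetical Deployment snippet combining both environment variables.
# The image and labels are placeholders for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: springboot-service
spec:
  replicas: 1
  selector:
    matchLabels:
      app: springboot-service
  template:
    metadata:
      labels:
        app: springboot-service
    spec:
      containers:
        - name: springboot-service
          image: my-registry/springboot-service:latest
          env:
            - name: OTEL_EXPORTER_OTLP_ENDPOINT
              value: http://grafana-k8s-monitoring-grafana-agent.default.svc.cluster.local:4317
            - name: OTEL_SERVICE_NAME
              value: springboot-service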

Once the application is deployed and generating data, we can analyze the traces in the Explore tool.

Exploring Traces in Tempo

Let’s delve into tracing with Tempo. Tempo is an open-source distributed tracing system and provides visibility into the flow of requests across distributed systems.

In distributed systems, a trace is a collection of data that represents the journey of a single user request as it travels through various services. Each step of this journey is called a span, which includes information such as the service name, operation performed, and time taken. Traces and spans help developers understand the performance and behavior of their applications in complex, interconnected environments.

In the Explore tool, select the grafanacloud-<name>-traces datasource. From the Service Name dropdown list, choose your service (for example, springboot-service) to locate your traces. Click on any trace to uncover detailed insights.


We’ve set ourselves up for success with excellent observability for our services, applications, and infrastructure. But there’s more! In the next post, we’ll explore Application Observability, another Grafana product that meets all your observability needs. Stay tuned! 😊

Conclusion

By following these steps, you should have successfully deployed Grafana Alloy to your Kubernetes cluster, using the power of OpenTelemetry for observability.

If you find this helpful, please click the clap 👏 button and follow me to get more articles on your feed.
