OpenTelemetry and Cloud Trace: Usage and More

Irvi Aini
Google Cloud - Community
5 min read · Apr 27, 2023

OpenTelemetry and Cloud Trace

Google Cloud Trace (GCP Cloud Trace) is a distributed tracing service on Google Cloud Platform that lets developers analyze and troubleshoot the latency of their application's operations. It collects latency data from applications deployed on Google Cloud and presents it in near real time in the Google Cloud Console. This enables developers to identify performance bottlenecks and optimize their applications.

GCP Cloud Trace provides insights into how requests propagate through your application, where time is spent, and what is causing latency. Traces consist of a set of spans. Each span represents a single timed event within a trace, like a remote procedure call. Spans can be nested and form a trace tree. The trace tree shows the path of the request through various services and is essential in distributed tracing.
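To make the trace tree concrete, here is a minimal sketch in plain Go. The `Span` struct, its fields, and the sample span names are assumptions made up for illustration; they are not the Cloud Trace or OpenTelemetry types. The sketch only shows how parent links turn a flat set of spans into a tree.

```go
package main

import (
	"fmt"
	"strings"
	"time"
)

// Span is a simplified, hypothetical model of one timed operation in a trace.
// It is not the Cloud Trace or OpenTelemetry span type.
type Span struct {
	ID       string
	ParentID string // empty for the root span
	Name     string
	Duration time.Duration
}

// childrenOf groups spans by the ID of their parent.
func childrenOf(spans []Span) map[string][]Span {
	byParent := make(map[string][]Span)
	for _, s := range spans {
		byParent[s.ParentID] = append(byParent[s.ParentID], s)
	}
	return byParent
}

// renderTree returns the trace tree as indented text, one span per line.
func renderTree(byParent map[string][]Span, parentID string, depth int) string {
	var b strings.Builder
	for _, s := range byParent[parentID] {
		fmt.Fprintf(&b, "%s%s (%v)\n", strings.Repeat("  ", depth), s.Name, s.Duration)
		b.WriteString(renderTree(byParent, s.ID, depth+1))
	}
	return b.String()
}

// sampleTrace returns made-up spans for a single request.
func sampleTrace() []Span {
	return []Span{
		{ID: "a", Name: "handle-request", Duration: 120 * time.Millisecond},
		{ID: "b", ParentID: "a", Name: "call-auth", Duration: 15 * time.Millisecond},
		{ID: "c", ParentID: "a", Name: "call-backend", Duration: 90 * time.Millisecond},
		{ID: "d", ParentID: "c", Name: "query-database", Duration: 70 * time.Millisecond},
	}
}

func main() {
	fmt.Print(renderTree(childrenOf(sampleTrace()), "", 0))
}
```

Running it prints the root span first and each nested span indented under its parent, which is essentially what the trace view shows for a request.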

OpenTelemetry, on the other hand, is a set of APIs, libraries, agents, and instrumentation that standardizes the generation, collection, and description of telemetry data (traces, metrics, and logs) for observability. OpenTelemetry is a Cloud Native Computing Foundation (CNCF) project formed by merging two earlier projects: OpenCensus, which originated at Google, and OpenTracing, which was itself a CNCF project.

The data model of OpenTelemetry covers traces, metrics, and logs. Traces in OpenTelemetry are defined similarly to GCP Cloud Trace. A trace is a set of spans, where each span represents a single operation like a function call or a remote procedure call. Each span carries a SpanContext (its trace ID and span ID), and a child span refers to its parent through the parent's SpanContext, which is how the spans form a trace tree. Metadata can be attached to spans in the form of attributes.
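When a request crosses a service boundary, the identifying part of the SpanContext has to travel with it. One common wire format is the W3C Trace Context `traceparent` header. The sketch below parses version `00` of that header using only the standard library; the `SpanContext` struct and the `parseTraceparent` helper are hypothetical names for illustration, not the OpenTelemetry types, and the real SDK propagators handle this for you.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
)

// SpanContext is a simplified stand-in for the identifying part of a span.
type SpanContext struct {
	TraceID string // 32 lowercase hex characters
	SpanID  string // 16 lowercase hex characters
	Sampled bool
}

// isLowerHex reports whether s is exactly n lowercase hex characters.
func isLowerHex(s string, n int) bool {
	if len(s) != n {
		return false
	}
	for _, c := range s {
		if !((c >= '0' && c <= '9') || (c >= 'a' && c <= 'f')) {
			return false
		}
	}
	return true
}

// parseTraceparent parses a version-00 traceparent header of the form
// 00-<trace-id>-<parent-id>-<flags>.
func parseTraceparent(h string) (SpanContext, error) {
	parts := strings.Split(h, "-")
	if len(parts) != 4 || parts[0] != "00" {
		return SpanContext{}, errors.New("unsupported traceparent format")
	}
	traceID, spanID, flags := parts[1], parts[2], parts[3]
	if !isLowerHex(traceID, 32) || !isLowerHex(spanID, 16) || !isLowerHex(flags, 2) {
		return SpanContext{}, errors.New("malformed traceparent field")
	}
	// All-zero IDs are defined as invalid.
	if traceID == strings.Repeat("0", 32) || spanID == strings.Repeat("0", 16) {
		return SpanContext{}, errors.New("all-zero trace or span ID")
	}
	f, err := strconv.ParseUint(flags, 16, 8)
	if err != nil {
		return SpanContext{}, err
	}
	// The lowest bit of the flags byte is the "sampled" flag.
	return SpanContext{TraceID: traceID, SpanID: spanID, Sampled: f&0x01 == 1}, nil
}

func main() {
	sc, err := parseTraceparent("00-4bf92f3577b34da6a3ce929d0e0e4736-00f067aa0ba902b7-01")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("trace=%s span=%s sampled=%t\n", sc.TraceID, sc.SpanID, sc.Sampled)
}
```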

OpenTelemetry works by providing a unified way to generate, collect, and export telemetry data. Instrumentation libraries generate telemetry data from your applications. The data is then collected by the OpenTelemetry Collector, which can run as an agent alongside the application or as a standalone service, and which exports the data to one or more backends for analysis. This makes OpenTelemetry a highly flexible tool, as it allows you to choose the backend that best suits your needs, whether it's Google Cloud Trace, Jaeger, Zipkin, or any other supported backend.

One of the significant benefits of OpenTelemetry is that it standardizes telemetry data, making it easier for developers to switch between different backends or use multiple backends simultaneously. This can be particularly beneficial in complex, distributed systems where a variety of telemetry data may be needed to effectively monitor and troubleshoot the system.
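The idea of sending the same telemetry to several backends can be sketched with a small interface. Everything here (`SpanData`, `Exporter`, `memoryExporter`, `fanOut`) is a hypothetical simplification written for this article, not the OpenTelemetry SDK API; it only shows why a common data format makes backends interchangeable.

```go
package main

import "fmt"

// SpanData is a hypothetical, minimal record of a finished span.
type SpanData struct {
	Name       string
	DurationMS int64
}

// Exporter is a hypothetical interface standing in for a backend-specific exporter.
type Exporter interface {
	Export(spans []SpanData) error
}

// memoryExporter records what it receives; it stands in for a real backend.
type memoryExporter struct {
	backend  string
	received []SpanData
}

func (m *memoryExporter) Export(spans []SpanData) error {
	m.received = append(m.received, spans...)
	return nil
}

// fanOut sends the same batch to every exporter and collects any failures,
// so one failing backend does not stop delivery to the others.
func fanOut(spans []SpanData, exporters ...Exporter) []error {
	var errs []error
	for _, e := range exporters {
		if err := e.Export(spans); err != nil {
			errs = append(errs, err)
		}
	}
	return errs
}

func main() {
	a := &memoryExporter{backend: "backend-a"}
	b := &memoryExporter{backend: "backend-b"}
	batch := []SpanData{{Name: "checkout", DurationMS: 42}}

	errs := fanOut(batch, a, b)
	fmt.Printf("%s got %d span(s), %s got %d span(s), %d error(s)\n",
		a.backend, len(a.received), b.backend, len(b.received), len(errs))
}
```

Because both backends accept the same `SpanData`, swapping one out or adding a third requires no change to the code that produces the spans.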

Example Use Case

To illustrate the usage of Google Cloud Trace, consider an e-commerce application deployed on Google Cloud. This application involves several microservices, such as user authentication, product listing, order management, and payment processing. When a user places an order, it triggers a sequence of operations across these services. To monitor and optimize the performance of this operation, developers can use Cloud Trace.

First, developers need to instrument their application code to generate traces. For applications running on Google Cloud's App Engine, Cloud Functions, or Cloud Run, traces are generated automatically for incoming HTTP requests, subject to sampling. However, for applications running on other platforms like Google Kubernetes Engine (GKE) or Compute Engine, developers need to instrument their code themselves using the OpenTelemetry libraries (or the older OpenCensus libraries, which OpenTelemetry supersedes).

Once the application is instrumented and generating traces, these traces can be viewed and analyzed in the Google Cloud Console. Each trace shows the path of the request through the application, with spans representing operations in different services. The spans show the start and end times of each operation, allowing developers to identify where latency is occurring. By analyzing these traces, developers can identify performance bottlenecks, such as a slow database query or a third-party API call, and work to optimize them.

Tags, called attributes in OpenTelemetry and labels in Cloud Trace, can be used to further enhance trace analysis. They are key-value pairs that can be attached to spans to add context, such as the user ID for a request or the instance ID of the VM handling the request. For example, if a particular user reports an issue with order placement, developers can filter traces by the user ID tag to analyze the performance of requests from that user.

In addition, developers can set up alerts based on trace data. For example, an alert can be set up to trigger if the latency of the order placement operation exceeds a certain threshold. This allows developers to proactively detect and address performance issues.

Moreover, Cloud Trace integrates with other Google Cloud services for enhanced observability. For example, it can be used with Cloud Logging to correlate log entries with specific traces, providing more context for troubleshooting. Similarly, it can be used with Cloud Monitoring to view trace data alongside metrics and logs in a single dashboard.

OpenTelemetry, Cloud Trace, and Go

To send a trace to Google Cloud Trace using OpenTelemetry in Go, you’ll need to do several things: install the necessary OpenTelemetry packages, set up a trace exporter, and instrument your code to generate traces. Here’s an example of how to do it:

Firstly, install the necessary Go packages:

go get go.opentelemetry.io/otel
go get go.opentelemetry.io/otel/sdk
go get github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace

Then you can use the following Go code:

package main

import (
	"context"
	"log"

	texporter "github.com/GoogleCloudPlatform/opentelemetry-operations-go/exporter/trace"
	"go.opentelemetry.io/otel"
	sdktrace "go.opentelemetry.io/otel/sdk/trace"
)

func main() {
	ctx := context.Background()

	// Create the Cloud Trace exporter for the given project
	exporter, err := texporter.New(
		texporter.WithProjectID("YOUR_GCP_PROJECT_ID"),
	)
	if err != nil {
		log.Fatalf("texporter.New: %v", err)
	}

	// Create a new tracer provider with a batch span processor and the Cloud Trace exporter
	tp := sdktrace.NewTracerProvider(
		sdktrace.WithBatcher(exporter),
	)

	// Set the global tracer provider
	otel.SetTracerProvider(tp)

	// Get a tracer
	tracer := otel.Tracer("example.com/trace")

	// Start a new span and end it right away
	ctx, span := tracer.Start(ctx, "my-span")
	span.End()

	// Make sure all spans are sent before exiting
	if err := tp.ForceFlush(ctx); err != nil {
		log.Printf("ForceFlush: %v", err)
	}
}

In this example, replace "YOUR_GCP_PROJECT_ID" with your actual Google Cloud project ID.

This code sets up a new OpenTelemetry trace exporter for Google Cloud Trace and uses it to create a new tracer provider. It then sets this tracer provider as the global tracer provider. After this setup, it gets a tracer, starts a new span (representing a timed operation), and ends the span. Finally, it flushes the tracer provider to make sure all spans are sent to Cloud Trace. In a real application, you would start and end spans around the operations you want to trace, and you might also create child spans for sub-operations.
