Datadog APM Tracing for Golang Service

Chirag Samtani
Published in Qoala Engineering
4 min read · Jul 7, 2021

As one of the few insurtech players in Southeast Asia (SEA), Qoala partners with many other companies, so anything that affects our uptime or latency needs to be monitored and studied closely. Application Performance Management (APM) tools such as Datadog are crucial for increasing observability and for debugging performance issues. Datadog has been Qoala’s pick for APM over competitors like New Relic because alarms are easy to configure and add via the Datadog dashboard, and because APM integration is supported for the programming languages we use.

Golang is the backbone of many of our product services, so getting insight into those services is crucial. After deploying APM for Qoala’s services, we were able to track down performance issues in our APIs easily and reduce the average latency of some of those APIs by 88%.

Enough selling, let’s get to implementation!

Prerequisites:

  1. Go version >= 1.12
  2. Datadog agent >= 5.21.1. To set up the Datadog agent in your infrastructure, refer to this document: https://docs.datadoghq.com/agent/

More often than not, if you’re using Golang to serve REST APIs, you are probably using some kind of routing library. The Datadog Golang tracing library has contrib packages that provide tracing for commonly used libraries such as mux, chi and echo, and even gorm and redis.

Note: these libraries are not compatible with OpenTracing.

Let’s use the chi library as an example. To trace all of your APIs, import the contrib package for chi as well as the tracer library.

The same approach also works for the LabStack Echo library.
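To make it clearer what such a routing middleware does for every request, here is a minimal sketch that uses only the Go standard library. It is not the Datadog contrib code: `spanRecord`, `recorder`, `traceMiddleware` and `tracedRequest` are hypothetical names, and the spans are kept in memory instead of being sent to an agent. The middleware has the `func(http.Handler) http.Handler` shape that chi middleware uses.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
	"time"
)

// spanRecord is a hypothetical, simplified stand-in for one finished span.
type spanRecord struct {
	Service  string
	Resource string // for example "GET /health"
	Status   int
	Duration time.Duration
}

// recorder keeps finished spans in memory. A real tracer would ship
// them to the Datadog agent instead.
type recorder struct {
	mu    sync.Mutex
	spans []spanRecord
}

func (r *recorder) add(s spanRecord) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.spans = append(r.spans, s)
}

// statusWriter remembers the status code the wrapped handler sends.
type statusWriter struct {
	http.ResponseWriter
	status int
}

func (w *statusWriter) WriteHeader(code int) {
	w.status = code
	w.ResponseWriter.WriteHeader(code)
}

// traceMiddleware records one span per request, tagged with the service name.
func traceMiddleware(service string, rec *recorder) func(http.Handler) http.Handler {
	return func(next http.Handler) http.Handler {
		return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
			start := time.Now()
			sw := &statusWriter{ResponseWriter: w, status: http.StatusOK}
			next.ServeHTTP(sw, r)
			rec.add(spanRecord{
				Service:  service,
				Resource: r.Method + " " + r.URL.Path,
				Status:   sw.status,
				Duration: time.Since(start),
			})
		})
	}
}

// tracedRequest sends one in-process request through the middleware and
// returns the span it produced.
func tracedRequest(method, path string) spanRecord {
	rec := &recorder{}
	mux := http.NewServeMux()
	mux.HandleFunc("/health", func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, "ok")
	})
	handler := traceMiddleware("go-agent-service", rec)(mux)
	handler.ServeHTTP(httptest.NewRecorder(), httptest.NewRequest(method, path, nil))
	return rec.spans[0]
}

func main() {
	s := tracedRequest(http.MethodGet, "/health")
	fmt.Printf("%s %q status=%d\n", s.Service, s.Resource, s.Status)
}
```

The point of the sketch is that the handlers themselves stay untouched: the middleware is registered once on the router, and every route behind it gets a span carrying the service name.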

The service name is the Datadog APM service name, and it is visible on this page: https://app.datadoghq.com/apm/service. Contrib packages supply a service name by default; for example, “echo” is the default service name for echo tracers. In our case, we set the service name to go-agent-service.

Example of a trace using the server code above

The tracer package is the Datadog APM tracing client integration. Once the tracer is started, each traced request begins with a root span, and the spans created under it together make up a trace. We will see how that is relevant when we start collecting metrics from our SQL calls. The tracer can also separate your APM traces by environment, which it does through span tags.

Traces can be split by environment by configuring the started tracer’s environment

It is good practice to separate your logs by environment and service name. To do so, add the service name and environment as environment variables in your container or host.
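As a rough illustration of that setup, the sketch below resolves the service name and environment from environment variables and falls back to defaults when they are unset. The variable names `DD_SERVICE` and `DD_ENV` are an assumption here (they follow Datadog's unified service tagging convention, so check them against the tracer version you run), and `tracerConfig`, `resolveTracerConfig` and the `development` default are hypothetical.

```go
package main

import (
	"fmt"
	"os"
)

// tracerConfig holds the two values we want to vary per deployment.
// It is a hypothetical helper type, not part of any Datadog package.
type tracerConfig struct {
	Service string
	Env     string
}

// resolveTracerConfig reads the service name and environment through the
// given lookup function, keeping the defaults when a variable is empty.
// Passing the lookup in makes the function easy to exercise without
// touching the real process environment.
func resolveTracerConfig(getenv func(string) string) tracerConfig {
	cfg := tracerConfig{Service: "go-agent-service", Env: "development"}
	if v := getenv("DD_SERVICE"); v != "" { // assumed variable name
		cfg.Service = v
	}
	if v := getenv("DD_ENV"); v != "" { // assumed variable name
		cfg.Env = v
	}
	return cfg
}

func main() {
	cfg := resolveTracerConfig(os.Getenv)
	fmt.Printf("service=%s env=%s\n", cfg.Service, cfg.Env)
}
```

The resolved values would then be handed to whatever starts the tracer, so the same binary reports under a different environment in staging and production without a rebuild.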

Next, let’s trace SQL calls. Tracing APIs is not as interesting if we cannot also monitor calls to databases, caches (Redis) and so on.

We will assume that you use some kind of ORM library, as most developers do when making calls to a database. The popular choice is gorm, so let’s see how we can trace it and attach its calls as spans to our HTTP request traces.

In the above snippet, we open the MySQL connection using a data source name. Instead of opening our MySQL connections with gorm directly, we use the gorm library provided by Datadog’s contrib package. The sqltracer register call is required when using the gormtracer library; it registers the dialect that is used.

Opening traced connections with the Datadog gorm library without the sqltracer registration is currently NOT possible, because the Datadog gorm library enforces it.

Now comes the question: how do we pass SQL sessions to handlers that usually live outside the main.go file? In most cases, Golang contexts fit this use case. Echo has its own derivation of context, which is simply an interface that represents the context of an HTTP request. We can extend that interface so that it also carries MySQL sessions.

When extending context, include the context itself as a part of the struct.
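The embedding pattern can be sketched without Echo itself. In the example below, `requestContext` is a hypothetical, trimmed-down stand-in for a framework context interface such as Echo's, `session` is a placeholder for a database handle, and `withDB` and `listPolicies` are illustrative names. The real types have many more methods, but the idea is the same: embed the original context so existing methods keep working, and add the session as an extra field.

```go
package main

import "fmt"

// requestContext is a hypothetical stand-in for a framework's
// request context interface.
type requestContext interface {
	Path() string
}

// baseContext plays the role of the context the framework hands us.
type baseContext struct{ path string }

func (c baseContext) Path() string { return c.path }

// session is a placeholder for a database handle.
type session struct{ DSN string }

// dbContext embeds the original context, so Path() is still available,
// and adds the database session next to it.
type dbContext struct {
	requestContext
	DB *session
}

type handlerFunc func(requestContext) string

// withDB is a middleware that wraps the incoming context in a dbContext
// before calling the next handler.
func withDB(db *session, next handlerFunc) handlerFunc {
	return func(c requestContext) string {
		return next(&dbContext{requestContext: c, DB: db})
	}
}

// listPolicies lives "outside main.go" in spirit: it only sees the
// context and recovers the session with a type assertion.
func listPolicies(c requestContext) string {
	cc, ok := c.(*dbContext)
	if !ok {
		return "no database session on context"
	}
	return fmt.Sprintf("%s uses %s", cc.Path(), cc.DB.DSN)
}

func main() {
	handler := withDB(&session{DSN: "example-dsn"}, listPolicies)
	fmt.Println(handler(baseContext{path: "/policies"}))
}
```

The checked type assertion matters: a handler that is accidentally mounted without the middleware gets a clear fallback instead of a panic.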

To initialize the context, we use an Echo middleware. We use the Datadog library again and call its WithContext function to attach the gorm session to the request context, which already has the HTTP request trace attached to it. The WithContext call is essential, since it attaches the SQL call spans to the parent trace, which is the Echo http.request.
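Why the context matters can be shown with a small standard-library model of span propagation. This is only a sketch of the general idea, not how the Datadog tracer is implemented: `span`, `startSpan` and `simulateRequest` are hypothetical, and the span name `sql.query` is made up for the example. A span started from the request context becomes a child in the same trace, while a span started from a fresh background context becomes an unrelated root.

```go
package main

import (
	"context"
	"fmt"
	"sync/atomic"
)

// span is a hypothetical, minimal model of a trace span.
type span struct {
	Name     string
	TraceID  uint64
	SpanID   uint64
	ParentID uint64 // 0 marks a root span
}

// spanKey is the private key under which the active span is stored.
type spanKey struct{}

var lastID uint64

// startSpan creates a span. If ctx already carries a span, the new one
// joins its trace as a child; otherwise it starts a new trace.
func startSpan(ctx context.Context, name string) (*span, context.Context) {
	s := &span{Name: name, SpanID: atomic.AddUint64(&lastID, 1)}
	if parent, ok := ctx.Value(spanKey{}).(*span); ok {
		s.TraceID, s.ParentID = parent.TraceID, parent.SpanID
	} else {
		s.TraceID = s.SpanID
	}
	return s, context.WithValue(ctx, spanKey{}, s)
}

// simulateRequest starts a request span, then one SQL span from the
// request context and one from a fresh background context.
func simulateRequest() (*span, *span, *span) {
	root, reqCtx := startSpan(context.Background(), "http.request")
	attached, _ := startSpan(reqCtx, "sql.query")
	detached, _ := startSpan(context.Background(), "sql.query")
	return root, attached, detached
}

func main() {
	root, attached, detached := simulateRequest()
	fmt.Println("same trace with request context: ", attached.TraceID == root.TraceID)
	fmt.Println("same trace with fresh background:", detached.TraceID == root.TraceID)
}
```

In the model, only the SQL span created from the request context shows up under the HTTP request, which mirrors why the session has to be bound to the request context before handlers use it.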

Below is the full code snippet,

As a result, you will be able to see a flame graph outlining all the spans in your request and the time associated with each of those spans. As I mentioned previously, this is not limited to database calls; it can be extended to trace Redis calls and nested API calls as well (for tracing third-party latencies).

Trace with request + SQL spans

P.S.: If you use Echo in your company and start to see a double response body in the output only when using the APM, check whether you have a custom Echo middleware that overrides the response body output. If you have such a middleware, a quick fix is to check whether the response is already committed before writing to it. Refer to the following: https://github.com/labstack/echo/blob/1ac4a8f3d0c6dc6ff8b9d666a3e860e70edcad87/response.go#L20
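The committed check can be illustrated with the standard library alone. The `response` wrapper below is a simplified stand-in for a framework response type that tracks a committed flag, and `envelope` is a hypothetical custom middleware that writes its own body after the handler returns. With the guard off, the body is written twice; with the guard on, the middleware skips its write once something has already been sent.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
)

// response tracks whether anything has been sent to the client yet.
type response struct {
	http.ResponseWriter
	Committed bool
}

func (r *response) WriteHeader(code int) {
	if r.Committed {
		return // headers already sent, ignore the second attempt
	}
	r.Committed = true
	r.ResponseWriter.WriteHeader(code)
}

func (r *response) Write(b []byte) (int, error) {
	if !r.Committed {
		r.WriteHeader(http.StatusOK)
	}
	return r.ResponseWriter.Write(b)
}

// envelope is a hypothetical middleware that writes a body after the
// handler has run. When guard is true it first checks Committed.
func envelope(next http.Handler, guard bool) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, req *http.Request) {
		res := &response{ResponseWriter: w}
		next.ServeHTTP(res, req)
		if guard && res.Committed {
			return // the handler already responded, do not write again
		}
		fmt.Fprint(res, `{"status":"ok"}`)
	})
}

// bodyFor runs one in-process request and returns the full response body.
func bodyFor(guard bool) string {
	handler := http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		fmt.Fprint(w, `{"status":"ok"}`)
	})
	rec := httptest.NewRecorder()
	envelope(handler, guard).ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/", nil))
	return rec.Body.String()
}

func main() {
	fmt.Println("unguarded:", bodyFor(false))
	fmt.Println("guarded:  ", bodyFor(true))
}
```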
