Golang gRPC distributed context propagation

Distributed systems are complex

Green rope meshwork

At Hootsuite, we operate within a complex software architecture, as we use microservices extensively. In such an intricately connected world, understanding what’s happening in order to process a single request is crucial. When running a microservice infrastructure, handling an external request will commonly result in the involvement and orchestration of several different microservices. If something fails in the chain, the time needed to detect the problem can increase by an order of magnitude, because you need to find the failing service(s) and the reason behind the failure.

Context propagation

What makes distributed systems a challenging topic is the fact that the ability to observe the application as a whole and understand what’s happening behind the scenes is rendered far more complex. Context propagation is the fundamental tool that enables spreading information across the entire infrastructure when an action is performed. This is a generic mechanism that serves different purposes. By propagating metadata in the context you can, for example:

  • tag logs produced by different services and re-aggregate them later
  • measure the application performance and breakdown by different services

Context propagation in go

We extensively use golang to build microservices and we use context propagation. The following example will present how you can propagate contextual information across the microservices boundaries in a distributed environment. Bear in mind that there is one object in golang that is passed to every function call and that has some native mechanism to cross the service boundaries, that being the context.

func doSomething(ctx context.Context, params …string) { }

The context package provides functionalities to store request scoped values, manage the request deadlines and cancellations. Remember that if you’re going to store something in the context it should be a value strictly related to the request you’re handling. After that, the context and all the contained values should be safely deleted from the memory. The context itself should be created in the top level of your app and should not be stored in any struct type, just passed as a first argument in the other function calls.

If you’re working on a web server application the context object is usually provided from the networking library. For example, in the net/http it has been integrated since the go.17, with the request struct offering a r.Context() function and the gRPC stack passing the context to any endpoint implemented in the server. Working with the context enables you to store any kind of data within it. In our tracing case, we’re going to use the context to store the metadata received from the request in order to share the information with all the involved services.

Disclaimer
If you want to go down into the rabbit hole, here is a list of interesting resources about the context:
- https://go.dev/blog/context
- https://go.dev/blog/context-and-structs
- https://www.ardanlabs.com/blog/2019/09/context-package-semantics-in-go.html

A gRPC example

Let’s say you’re going to implement a gRPC service to get user information. There is an API gateway in front of your service that provides some metadata such as user-id and request-id. Your business logic involves several other services to retrieve all the information required to fulfill the response (you know, the microservice infrastructure), and we want to propagate the metadata to all these other services too.

gRPC interceptors and golang context join the forces

If you’re not familiar with the gRPC interceptors, they are like a middleware but for the gRPC stack. Just a function to implement this signature. Propagating metadata between gRPC services is very easy. We will store the metadata into the context and then we will propagate them to the downstream gRPC calls. To do that we just need an interceptor to read the metadata from the incoming requests and write them to the outgoing context. In that way the gRPC stack will automatically propagate all the metadata across the services you’re going to call and if they do the same, the metadata will be propagated to the entire infrastructure.

Code example:
A gRPC interceptor to propagate the incoming metadata to the outgoing context

func ContextPropagationUnaryServerInterceptor() grpc.UnaryServerInterceptor {    return func(
ctx context.Context,
req interface{},
info *grpc.UnaryServerInfo,
handler grpc.UnaryHandler,
) (interface{}, error) {
if md, ok := metadata.FromIncomingContext(ctx); ok {
ctx = metadata.NewOutgoingContext(ctx, md)
}
return handler(ctx, req)
}
}

In the example above, we’ve received the user-id and the request-id from the API Gateway. By using that interceptor within all your services, the metadata will go through the entire microservices constellation and this is just the initial part; We can now start doing a lot of interesting things with this data.

Context Propagation is just the start of the journey

Now that we’ve propagated the metadata through the context, we can start playing with it. Coming back to my first statement: “What makes distributed systems a challenging thing is the ability to observe the application as a whole and understand what’s happening behind the scenes.” Observing the application logs can be challenging when the overall application is composed of different microservices, but with the metadata propagation you can now attach these pieces of information to the logs so that they can be filtered and aggregated later. The key point of enriching logs is using a structured logger. With a structured logger we’re going to define a consistent log format, letting other tools extract data sets from them for a later analysis.

Example of plain text log entry

[info] 2022–07–01T15:48:02+02:00 127.0.0.1:53165 hootsuite.User.GetUserInfo "Authenticator Pass"

Example of structure log entry

{
"level": "info",
"message": "Authenticator Pass",
"grpc.start_time": "2022-07-01T15:48:02+02:00",
"system": "grpc",
"span.kind": "server",
"grpc.service": "hootsuite.User",
"grpc.method": "GetUserInfo",
"peer.address": "127.0.0.1:53165",
"user_id": "e46f6c78-d365-465e-b92c-effd33f10a32",
"request_id": "05fd0f39-af14-4191-b26d-3569237de4a7"
}

Zap is one of the universally used golang logging libraries, I’ll provide some examples implemented with it. Logging a structured data is quite easy with zap

logger, _ := zap.NewProduction()
defer logger.Sync()
logger.Info(
"Authenticator Pass",
zap.String(“endpoint”,”GetUserInfo”),
zap.String(“service”, “hootsuite.User”),
)

Using a structured log enriched with the propagated metadata, you can do the following:

  • Filter all the logs produced by a single user, filtering by the user-id field
  • Filter all the logs produced by a single request, filtering by the request-id field

Back to the code example, the metadata is attached to the context object, so what do we need to reach this level of observability ? We need to:

  • Put the metadata into the logger to automatically attach them to every log entry
  • Have a logger instance that is specific for the request we’re handling so that it already have the fields filled

If you’re thinking about another interceptor, you’re right, we can use interceptors for that. I’m assuming the use of zap as a logger library but you can adapt the code to work with the logger of your choice.

First of all we inject all the metadata received into zap, using github.com/grpc-ecosystem/go-grpc-middleware/logging/zap/ctxzap

func ContextToZapFieldsUnaryServerInterceptor() grpc.UnaryClientInterceptor {return func(
ctx context.Context,
method string,
req,
reply interface{},
cc *grpc.ClientConn,
invoker grpc.UnaryInvoker,
opts …grpc.CallOption,
) error {
if md, ok := metadata.FromIncomingContext(ctx); ok {
for k, v := range md {
ctxzap.AddFields(ctx, zap.Strings(k, v))
}
}
return invoker(ctx, method, req, reply, cc, opts…) }
}

Next we use the grpc-ecosystem/go-grpc-middleware/logging/zap logging middleware to fetch a request-scoped from the context. We now have all of the tools in our swiss army knife to do the following:

  • Read the metadata and propagate them to any other microservice we’re depending on
  • Set a request scoped logger that automatically logs the metadata as logging fields

We just need to glue all of the interceptors together and let the magic happen.

func createServer() *grpc.Server {    opts := []grpc.ServerOption{
grpc.UnaryInterceptor(
grpc_middleware.ChainUnaryServer(
ContextPropagationUnaryServerInterceptor(),
grpc_zap.UnaryServerInterceptor(zapLogger),
ContextToZapFieldsUnaryServerInterceptor(),
)),
}
return grpc.NewServer(opts…)}func (s *Server) MyRPCFunction(ctx context.Context, req *pb.MyRequest) (*pb.Response, error) { logger := ctxzap.Extract(ctx)
logger.Info(“My log entry“)
. . .}

The interceptors in the stack will cooperate to do the following:

  • Push the metadata to the outgoing context (ready to be propagated to the outgoing requests)
  • Push the incoming context metadata to the log fields
  • Inject the logger to the context, ready to be extracted as a request-scoped logger

In the following log examples we can see the user_id and request _id propagated through different services

{
"level": "info",
"message": "Authenticator Pass",
"grpc.start_time": "2022–07–01T15:48:02+02:00",
"system": "grpc",
"span.kind": "server",
"grpc.service": "hootsuite.User",
"grpc.method": "GetUserInfo",
"peer.address": "127.0.0.1:53165",
"user_id": "e46f6c78-d365–465e-b92c-effd33f10a32",
"request_id": "05fd0f39-af14–4191-b26d-3569237de4a7"
}
{
"level": "error",
"message": "user info not valid",
"grpc.start_time": "2022–07–01T15:48:02+02:00",
"system": "grpc",
"span.kind": "server",
"grpc.service": "hootsuite.User",
"grpc.method": "GetUserInfo",
"peer.address": "127.0.0.1:53165",
"user_id": "e46f6c78-d365–465e-b92c-effd33f10a32",
"request_id": "05fd0f39-af14–4191-b26d-3569237de4a7"
"error": "error retrieving subscription"
}
{
"level": "error",
"message": "error getting user subscription",
"grpc.start_time": "2022–07–01T15:48:02+02:00",
"system": "grpc",
"span.kind": "server",
"grpc.service": "hootsuite.Subscription",
"grpc.method": "GetUserSubscription",
"peer.address": "127.0.0.1:53165",
"user_id": "e46f6c78-d365–465e-b92c-effd33f10a32",
"request_id": "05fd0f39-af14–4191-b26d-3569237de4a7"
"error": "user subscription not found"
}

Conclusion

In this article, we demonstrated an implementation of context propagation, leveraging Go gRPC middleware. This technique allowed us to readily and easily implement request tracing in our distributed infrastructure. With the addition of a few simple Interceptors, we were able to propagate metadata and start observing the application as a whole even if it’s sliced into different microservices. This opens up new ways to observe our code from many directions, in the previous examples we tagged requests with the request-id and user-id. Similarly, you can choose the granularity level that is better suited for your use cases, for example, you can decorate the request with the service name, you can increment an header value to calculate the longest path in your service chain, there are no limitations to the info you can attach and propagate.

--

--

Get the Medium app

A button that says 'Download on the App Store', and if clicked it will lead you to the iOS App store
A button that says 'Get it on, Google Play', and if clicked it will lead you to the Google Play store
Simone Di Maulo

Simone Di Maulo

Staff Developer interested in system design, scalable systems, message queuing system.