In this tutorial, we’ll examine how to use OpenCensus in your gRPC projects in the Go programming language for observability both into your server and then client! We’ll then examine how we can integrate with OpenCensus exporters from AWS X-Ray, Prometheus, Zipkin and Google Stackdriver Tracing and Monitoring.
gRPC is a modern high performance framework for remote procedure calls, powered by Protocol Buffer encoding. It is polyglot in nature, accessible and useable on a variety of environments ranging from mobile mobile devices, general purpose computers to data centres for distributed computing, and it is implemented in a variety of languages: Go, Java, Python, C/C++, Node.js, Ruby, PHP https://grpc.io/
OpenCensus is a modern observability framework for distributed tracing and monitoring across microservices and monoliths alike. It is polyglot in nature, accessible and useable too on a variety of environments from mobile devices, general purpose computers and data centres for distributed computing and it is implemented in a plethora of languages: Go, Java, Python, C++, Node.js, Ruby, PHP, C#(coming soon) https://opencensus.io/
Go is a modern programming language that powers the cloud as well as modern systems programming, making it easy to build simple, reliable and efficient software. It is a cross platform, fast, statically typed and a simple language https://golang.org
With the above three introductory paragraphs, perhaps you already noticed the common themes: high performance, distributed computing, modern nature, cross platform, simplicity, reliability — those points make the three a match #compatibility, hence the motivation for this tutorial/article.
For this tutorial, we have a company’s service that’s in charge of capitalizing letters sent in from various clients and internal microservices using gRPC.
To use gRPC, we firstly need to create Protocol Buffer definitions and from those, use the Protocol Buffer compiler with the gRPC plugin to generate code stubs. If you need to take a look at the pre-requisites or a primer into gRPC, please check out this article https://grpc.io/docs/
Our service takes in a payload with bytes, and then capitalizes them on the server.
To generate code, we’ll firstly put our definition in a file called “defs.proto” and move it into our “rpc” directory and then run this command to generate gRPC code stubs in Go, using this Makefile below:
make should then generate code that’ll make the directory structure look like this
After the code generation, we now need to add the business logic into the server
Our server’s sole purpose is to capitalize content sent in and send it back to the client. With gRPC, as previously mentioned, the protoc plugin generated code for a server interface. This allows you create your own custom logic of operation, as we shall do below with a custom object that implements the
With that, we can now monetize access to generate money $$$. In order to accomplish that though, we need to create clients that speak gRPC and for that please see below:
Our client makes a request to the gRPC server above, sending content that then gets capitalized and printed to our screen. It is interactive and can be run simply by
go run client.go
which when run interactively, will look like this
And now that we have a client, we are open for business!!
It’s been 1 hour since launch. Tech blogs and other programmers are sharing news of our service all over their internet and social media; our service just got so popular and is being talked about all around the business world too, high fives are shared and congrats shared — after this celebration, we all go back home and call it a night. It’s the latest and greatest API in the world, it is off the charts, customers from all over the world come in, what could go wrong?
It hits 3AM and our servers start getting over loaded. Response time degrades overall for everyone. This however is only noticed after one of the engineers tried to give a demo to their family that they restlessly awoke at 2:00AM due to excitement, but the service is taking 15ms to give back a response. In normal usage, we saw about at most 1ms response time. What is causing the sluggishness of the system? When did our service start getting slow? What is the solution? Throw more servers at it? How many servers should we throw at it? How do we know what is going wrong? When? How can the engineering and business teams figure out what to optimize or budget for? How can we tell we’ve successfully optimized the system and removed bottlenecks?
In comes in OpenCensus: OpenCensus is a single distribution of libraries for distributed tracing and monitoring for modern and distributed systems. OpenCensus can help answer mostly all of those questions that we asked. By “mostly”, I mean that it can answer the observability related questions such as: When did the latency increase? Why? How did it increase? By how much? What part of the system is the slowest? How can we optimize and assert successful changes?
OpenCensus is simple to integrate and use, it adds very low latency to your applications and it is already integrated into both gRPC and HTTP transports.
OpenCensus allows you to trace and measure once and then export to a variety of backends like Prometheus, AWS X-Ray, Stackdriver Tracing and Monitoring, Jaeger, Zipkin etc. With that mentioned, let’s get started.
Part 1: observability by instrumenting the server
To collect statistics from gRPC servers, OpenCensus is already integrated with gRPC out of the box, and one just has to import
go.opencensus.io/plugin/ocgrpc. And then also subscribe to the gRPC server views. This amounts to a 7 line change
and then to trace the application, we’ll start a span on entering the function, then end it on exiting. This amounts to a 7 line change too
In the tracing, notice the
trace.StartSpan(ctx, "(*fetchIt).Capitalize") ?We take a
context.Context as the first argument, to use context propagation which carries over RPC specific information about a request to uniquely identify it
How do we examine that “observability”?
Now that we’ve got tracing and monitoring in, let’s export that data out. Earlier on, I made claims that with OpenCensus you collect and trace once, then export to a variety of backends, simulatenously. Well, it is time for me to walk that talk!
To do that, we’ll need to use the exporter integrations in our app to send data to our favorite backends: AWS X-Ray, Prometheus, Stackdriver Tracing and Monitoring
to finally give this code
and with the following variables set in our environment
as well as our prometheus.yml file
go run server.go
2018/05/12 11:40:17 fetchIt gRPC server serving at ":9988"
Part 2: observability by instrumenting the client
and then for client monitoring, we’ll just do the same thing for gRPC stats handler except using the ClientHandler and then also start and stop a trace and that’s it, collectively giving this diff below
or this which now becomes this code
which gives this visualization
Engineers can add alerts with Prometheus https://prometheus.io/docs/alerting/overview/ or Stackdriver Monitoring https://cloud.google.com/monitoring/alerts/ but also the various teams can examine system behaviour simultaneously, be it traces or metrics on a variety of backends. A question one might have is: “how about observability for streaming?” — for streaming you can use the same logic, but since in order to export a trace, the span needs to have been ended. However, with streaming, you have a single persistent connection that’s perhaps infinitely open. What you can do is register unique identifying information from a streaming request and then per stream response, start and end a span!
With that we are off to the races!
Thank you for reading this far and hope this tutorial was useful, you can find all the code in this tutorial at https://github.com/orijtech/opencensus-for-grpc-go-developers
Please feel free to check out the OpenCensus community https://opencensus.io send us feedback, instrument your backends and share with your friends and teams!
This tutorial is part of a bunch more coming where we’ll use different languages, different transports and provide more samples etc.
Thank you again.
Emmanuel T Odeke