The GFE Server-Timing header has recently been enabled in all Cloud Spanner APIs, which means Spanner users can now retrieve the Google Front End (GFE) latency for every Cloud Spanner response. This helps distinguish whether latency originates in the client, the network, the GFE, or the Spanner backend.
- Today, Cloud customers can see the latency metrics and tracing spans of client RPCs by using OpenCensus in the client libraries. (See the OpenCensus guide.)
- Latency on Spanner requests is automatically captured and exposed as a Stackdriver system metric “spanner.googleapis.com/api/request_latencies”.
- In addition, Spanner QueryStats are available in responses when QueryMode is set to Profile; these include the query execution elapsed time.
- However, there is still a missing piece in the latency breakdown: the latency on the Google Front End (GFE). Without GFE latency it is difficult to debug high-latency issues, because there are multiple possible causes (for example, the client application being too far from the GFE) and not enough information to differentiate between them.
Now, every response from Cloud Spanner, whether REST or gRPC, carries an additional header containing the total elapsed time on the GFE. The format is:
server-timing: gfet4t7; dur=[GFE latency in ms]
Users can make use of the GFE latency information in various ways.
The simplest way is to debug queries in the web browser, for example by inspecting response headers in the Network panel of Chrome developer tools.
Users who are more comfortable with curl or the gcloud CLI can also inspect GFE latency in REST responses from the command line:
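A sketch of what this can look like; the project, instance, and database names below are hypothetical placeholders, the curl call requires real Spanner resources and gcloud credentials, and the sample header value is made up for illustration:

```shell
# Dump response headers (-D -) from a Spanner REST call; server-timing
# appears among them. (Commented out: needs real resources and auth.)
#
#   curl -s -D - -o /dev/null \
#     -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#     "https://spanner.googleapis.com/v1/projects/my-project/instances/my-instance/databases/my-db/sessions"
#
# A returned header looks like the sample below; extract the dur= value (ms):
header='server-timing: gfet4t7; dur=42'
gfe_ms=$(printf '%s\n' "$header" | sed -n 's/.*dur=\([0-9.]*\).*/\1/p')
echo "GFE latency: ${gfe_ms} ms"   # prints "GFE latency: 42 ms"
```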
Next, let's examine how to use this header programmatically, using the Spanner Go, Java, and Python clients as examples.
In the Go client, simply supply a gRPC Header CallOption when calling Spanner APIs; it captures the headers of an outgoing RPC:
See gRPC metadata documentation for more details.
In the Java client, users need to implement the ClientInterceptor interface and intercept the outgoing RPCs in order to retrieve the headers:
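One possible sketch of such an interceptor. The class name matches the `HeaderClientInterceptor` mentioned below; the gRPC `ClientInterceptor`, forwarding-call, and `Metadata` APIs are standard, and `parseGfeLatency` is our own helper:

```java
import io.grpc.CallOptions;
import io.grpc.Channel;
import io.grpc.ClientCall;
import io.grpc.ClientInterceptor;
import io.grpc.ForwardingClientCall.SimpleForwardingClientCall;
import io.grpc.ForwardingClientCallListener.SimpleForwardingClientCallListener;
import io.grpc.Metadata;
import io.grpc.MethodDescriptor;

class HeaderClientInterceptor implements ClientInterceptor {
  private static final Metadata.Key<String> SERVER_TIMING =
      Metadata.Key.of("server-timing", Metadata.ASCII_STRING_MARSHALLER);

  @Override
  public <ReqT, RespT> ClientCall<ReqT, RespT> interceptCall(
      MethodDescriptor<ReqT, RespT> method, CallOptions callOptions, Channel next) {
    return new SimpleForwardingClientCall<ReqT, RespT>(next.newCall(method, callOptions)) {
      @Override
      public void start(Listener<RespT> responseListener, Metadata requestHeaders) {
        super.start(
            new SimpleForwardingClientCallListener<RespT>(responseListener) {
              @Override
              public void onHeaders(Metadata headers) {
                // Response headers arrive here; pull out server-timing.
                String value = headers.get(SERVER_TIMING);
                if (value != null) {
                  System.out.println("GFE latency: " + parseGfeLatency(value) + " ms");
                }
                super.onHeaders(headers);
              }
            },
            requestHeaders);
      }
    };
  }

  /** Parses a value like "gfet4t7; dur=123" into ms; returns -1 if absent. */
  static double parseGfeLatency(String value) {
    for (String part : value.split(";")) {
      part = part.trim();
      if (part.startsWith("dur=")) {
        return Double.parseDouble(part.substring(4));
      }
    }
    return -1;
  }
}
```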
Then when instantiating the Spanner client, supply the HeaderClientInterceptor:
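A sketch of the wiring, assuming a `HeaderClientInterceptor` class implementing `ClientInterceptor` as described above and a hypothetical project ID; `SpannerOptions.Builder.setInterceptorProvider` accepts a provider returning the list of interceptors to install:

```java
import com.google.cloud.spanner.Spanner;
import com.google.cloud.spanner.SpannerOptions;
import java.util.Collections;

public class SpannerWithInterceptor {
  public static void main(String[] args) {
    SpannerOptions options =
        SpannerOptions.newBuilder()
            .setProjectId("my-project") // hypothetical placeholder
            .setInterceptorProvider(
                () -> Collections.singletonList(new HeaderClientInterceptor()))
            .build();
    Spanner spanner = options.getService();
    // ... use spanner.getDatabaseClient(...) as usual; every RPC now passes
    // through HeaderClientInterceptor, which logs the GFE latency.
    spanner.close();
  }
}
```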
In the Python client, things are a bit more complicated, because the Spanner Python API is wrapped by google-api-core, and google-api-core doesn't expose a client option for manipulating headers. To work around this constraint, we need to call the raw proto-generated Spanner APIs directly, then retrieve the headers from call.initial_metadata:
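A sketch under stated assumptions: the transport access below is based on the generated GAPIC layout (`SpannerGrpcTransport` exposing raw gRPC callables whose `.with_call` returns the response plus the call object) and may differ between library versions; `parse_gfe_latency` is our own helper, and the Spanner imports are kept local so the helper runs standalone:

```python
def parse_gfe_latency(value):
    """Extract the dur= value (in ms) from a header like 'gfet4t7; dur=123'."""
    for part in value.split(";"):
        part = part.strip()
        if part.startswith("dur="):
            try:
                return float(part[len("dur="):])
            except ValueError:
                return None
    return None


def query_with_gfe_latency(session_name, sql):
    """Call the proto-generated ExecuteSql directly and read the headers.

    Requires real Spanner resources and application default credentials;
    the session name must reference an existing session.
    """
    from google.cloud.spanner_v1 import ExecuteSqlRequest
    from google.cloud.spanner_v1.services.spanner.transports import (
        SpannerGrpcTransport,
    )

    transport = SpannerGrpcTransport()  # picks up application default credentials
    request = ExecuteSqlRequest(session=session_name, sql=sql)
    # .with_call returns (response, call); call.initial_metadata holds the
    # response headers, including server-timing.
    response, call = transport.execute_sql.with_call(request)
    for key, value in call.initial_metadata:
        if key == "server-timing":
            return response, parse_gfe_latency(value)
    return response, None
```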
Aggregation, Visualization and Debugging
After retrieving the GFE latency metrics, advanced users can export, aggregate, and visualize the latency data in monitoring tools, comparing client application latency, GFE latency, and Spanner query execution latency side by side:
(This example uses Spanner Go client, OpenCensus and Prometheus. The code snippet can be found here.)
Alternatively, users can link the GFE latency as an attribute to Trace Spans:
(This example uses Spanner Go client, OpenCensus and Jaeger. The code snippet can be found here.)
By comparing the latencies of the client, the GFE, and Spanner query execution, users and support engineers should now be able to narrow down the cause of slow Spanner operations. For example:
- If the query execution latency is high, users can focus on optimizing schema, indexing and the query for better performance.
- If GFE latency is high while Spanner latency is low, the issue likely lies in the GFE. Users can include this information in support tickets so that support engineers can focus their troubleshooting on the GFE.
- If client-side latency is high while GFE and Spanner latencies are low, there may be a networking issue between the client and the regional GFE. Users may need to fix their network routing.
W3C Server-Timing Specs: www.w3.org/TR/server-timing/
Google Front End (GFE) Design Overview: cloud.google.com/security/infrastructure/design#google_front_end_service