Round-trip Time Assertions on HTTP Checks

Published in The Opsee Blog · Apr 15, 2016

We’re excited to be rolling out RTT (round-trip time) metrics for HTTP health checks. We’re including RTT as extra metadata on every response. The RTT can then be used to set assertions and trigger alerts if it deviates from your expectations.

Round-trip time assertion on an HTTP check

The simplest use case for RTT checks will be timeouts. If you’ve got a timeout set for your service, configure your Opsee check to alert you if the RTT is ever above your timeout.
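To make the timeout case concrete, here is a minimal sketch in Python of what such an assertion boils down to. It is not Opsee's implementation or API; the function names `measure_rtt_ms` and `rtt_exceeds_timeout` are hypothetical, and the request is passed in as a plain callable so the timing logic can be shown without assuming any particular HTTP client.

```python
import time
from typing import Callable


def measure_rtt_ms(send_request: Callable[[], object]) -> float:
    """Time one request/response cycle and return the elapsed milliseconds.

    `send_request` is a hypothetical stand-in for whatever performs the
    HTTP call and blocks until the response has arrived.
    """
    start = time.monotonic()  # monotonic clock: unaffected by wall-clock changes
    send_request()
    return (time.monotonic() - start) * 1000.0


def rtt_exceeds_timeout(rtt_ms: float, timeout_ms: float) -> bool:
    """Return True when the check should alert: RTT is above the timeout."""
    return rtt_ms > timeout_ms


if __name__ == "__main__":
    # Simulate a slow service with a sleep instead of a real HTTP request.
    rtt = measure_rtt_ms(lambda: time.sleep(0.05))
    print("alert" if rtt_exceeds_timeout(rtt, timeout_ms=20.0) else "ok")
```

In a real check, the callable would wrap an actual HTTP request, and the threshold would be the timeout your service's callers are configured with.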

Why RTT?

So why is this important? In a microservice environment, many cooperating services are often involved in delivering a single response to the user. Any service with dependencies across the network needs some sort of policy for dealing with network failures: typically, when to time out a request to another service and when to retry it. When dependencies are timing out, it’s important to know that and to understand which dependencies are exhibiting latency.
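A timeout-and-retry policy of the kind described above might look roughly like the following Python sketch. Everything here is an assumption for illustration: `call_with_retries` is a hypothetical helper, and the dependency is modeled as a callable that accepts a timeout in seconds and raises `TimeoutError` when it is exceeded. The sketch also records the latency of each attempt, since that is the signal you would want to watch.

```python
import time
from typing import Callable, List, Tuple


def call_with_retries(
    call: Callable[[float], str],
    timeout_s: float,
    max_attempts: int,
) -> Tuple[str, List[float]]:
    """Call a dependency, retrying on timeout up to `max_attempts` times.

    Returns the result together with the observed latency (in seconds)
    of every attempt, including the ones that timed out.
    """
    latencies: List[float] = []
    for _ in range(max_attempts):
        start = time.monotonic()
        try:
            result = call(timeout_s)
        except TimeoutError:
            # Record the failed attempt's latency, then try again.
            latencies.append(time.monotonic() - start)
            continue
        latencies.append(time.monotonic() - start)
        return result, latencies
    raise TimeoutError(f"dependency timed out on all {max_attempts} attempts")
```

A caller that logs or exports the returned latencies can see which dependency is slowing down before the retries start failing outright.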

With retries in the mix, awareness of RTT increases becomes even more important. For instance, consider a dependency that is doing an unusually large amount of work for a particular request. It takes too long and the request times out. However, you need that response, and it’s on the critical path, so your code dutifully retries that expensive request. On top of everything else, the original expensive request wouldn’t have been cancelled until the work was mostly done, so the dependency ends up doing nearly all of that work again. And, lest you think this is a made-up scenario, here’s Jeff Hodges talking about exactly that.
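The arithmetic behind that scenario can be sketched with a deliberately simplified model. It assumes that every attempt costs the dependency the same amount of work and that a timed-out request runs to completion rather than being cancelled (the text says "mostly done", so this slightly overstates the cost). The function name and the numbers are hypothetical.

```python
def dependency_work_ms(work_ms: float, timeout_ms: float, max_attempts: int) -> float:
    """Total work the dependency performs for one logical request.

    Simplifying assumptions: each attempt costs `work_ms`, and a request
    that times out on the client side still runs to completion on the server.
    """
    if work_ms <= timeout_ms:
        # The first attempt finishes in time, so there are no retries.
        return work_ms
    # Every attempt times out, yet the dependency still does the work each time.
    return work_ms * max_attempts


if __name__ == "__main__":
    # A 1500 ms request against a 1000 ms timeout with 3 attempts.
    print(dependency_work_ms(1500, 1000, 3))  # 4500
```

Under these assumptions, a request that is only 50% over the timeout costs the dependency three times the work, and the caller still gets no response.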

Written by Cliff