Aggregate Availability Check with Signal Sciences Data

Px Mx
Sep 12, 2018 · 2 min read

Having the privilege to work with so many great enterprise customers has its benefits. One great benefit is you are always learning something new. With a diverse set of organizations operating in different ways and seeking to solve different problems, having to learn something new is unavoidable.

In this blog post, I want to share a very simple and helpful availability metric I learned about while responding to a customer’s request. This metric is called aggregate availability, and it comes straight out of Google’s seminal book on the subject, Site Reliability Engineering. I imagine aggregate availability is not new to many readers, but if it is new to you, read more about it here.

Photo by Kevin Ku on Unsplash

It involves a very simple calculation. Aggregate availability is equal to successful requests divided by total requests. For Signal Sciences’ customers, later in this post I provide a Python script to run this calculation across all your sites.

In the use case where I applied this, a successful request was defined as any request that resulted in an HTTP status code less than 500. The 5XX series of status codes indicates that a server-side error occurred, so those requests count as unsuccessful.
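To make the calculation concrete, here is a minimal sketch of it in Python. It is not the script referred to in this post; the function name and the shape of the input (a mapping of status code to request count) are assumptions made for illustration.

```python
from typing import Mapping, Optional


def aggregate_availability(status_counts: Mapping[int, int]) -> Optional[float]:
    """Return successful / total requests, or None when there were no requests.

    `status_counts` is a hypothetical mapping of HTTP status code -> request count.
    """
    total = sum(status_counts.values())
    if total == 0:
        return None  # no traffic, so availability is undefined
    # A request counts as successful when its status code is below 500.
    successful = sum(n for code, n in status_counts.items() if code < 500)
    return successful / total


# Made-up counts: 10,000 requests in total, 100 of them server errors.
counts = {200: 9_000, 404: 900, 500: 75, 503: 25}
print(f"{aggregate_availability(counts):.2%}")  # prints 99.00%
```

Note that 4XX responses count as successful here, since they are client-side errors rather than server failures.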

Conveniently, Signal Sciences maintains data on total requests processed by agents, as well as time-series data with counts for all status codes. This data is available via the API. The two endpoints used are Get Overview Report Data and Get Timeseries Request Info.

Here is the example Python script that uses those API endpoints to calculate aggregate availability for each site.

Note that the script can take one argument, which specifies the time period you want to run the calculation for. In the example below, -7d was provided to run the calculation against all requests for the last 7 days.

$ ./sigsci_site_availability.py -7d
Signal Sciences Demo Site1:
Total Requests: 1952594
Server Errors: 7670
Aggregate Availability: 99.61%
Signal Sciences Demo Site2:
Total Requests: 1936173
Server Errors: 35
Aggregate Availability: 100.00%
Signal Sciences Demo Site3:
Total Requests: 0
Server Errors: 0
Aggregate Availability: No site activity.
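The percentages in that output can be reproduced from the two counts shown for each site. The sketch below is only an illustration of that arithmetic and of the zero-traffic case. The function name is hypothetical, and the API calls that would supply the counts are left out.

```python
def format_site_report(site_name: str, total_requests: int, server_errors: int) -> str:
    """Build one site's report from its request and 5XX counts (hypothetical helper)."""
    if total_requests == 0:
        # Avoid dividing by zero for sites that received no traffic.
        availability = "No site activity."
    else:
        successful = total_requests - server_errors
        availability = f"{successful / total_requests:.2%}"
    return (
        f"{site_name}:\n"
        f"Total Requests: {total_requests}\n"
        f"Server Errors: {server_errors}\n"
        f"Aggregate Availability: {availability}"
    )


# Counts taken from the example output above.
print(format_site_report("Signal Sciences Demo Site1", 1952594, 7670))
print(format_site_report("Signal Sciences Demo Site2", 1936173, 35))
print(format_site_report("Signal Sciences Demo Site3", 0, 0))
```

Rounding to two decimal places is why Site2 shows 100.00% despite its 35 server errors. If small error rates matter to you, print more decimal places.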

Aggregate availability is definitely a useful metric to keep an eye on. A drop in availability would be a huge concern for many organizations. I hope you’ve learned something new in this blog post. I’ll continue to share tips and new learnings like this one in the future.

Originally published at labs.signalsciences.com.
