The First Law of Latency: There is no THE LATENCY of the system

Brian Taylor
Engineers @ Optimizely
3 min read · Jun 23, 2022
[Image caption: There is no THE LATENCY of the system. You can’t make decisions using the information in this picture.]

I got this wrong for years. There is no THE LATENCY of the system, because latency is a distribution, not a scalar. This post marks the beginning of a six-part journey in which we will discover the laws of latency and grow our ability to make correct, data-informed decisions about latency.

The Laws

  1. There is no THE LATENCY of the system
  2. Latency distributions are NEVER NORMAL
  3. DON’T LIE when measuring latency (most tools do… and that’s not ok)
  4. DON’T LIE when presenting latency (most presentations do… and that’s not ok)
  5. You can’t capacity plan without a LATENCY REQUIREMENT
  6. You’re probably not asking for ENOUGH NINES

Let’s start with the basics: what is latency? It’s easiest to start with the singular.

A latency is a measurement of how long it took to do one thing.
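As a minimal sketch in Python (where `do_one_thing` is a hypothetical stand-in for any operation you care about), a single latency measurement is just one timed call:

```python
import time

# Hypothetical stand-in for any unit of work.
def do_one_thing():
    sum(range(10_000))

# One latency: how long it took to do one thing, once.
start = time.perf_counter()
do_one_thing()
latency_s = time.perf_counter() - start
print(f"one latency measurement: {latency_s * 1e6:.0f} microseconds")
```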

That was easy. Now the hard part: We’d like to talk about what our users can expect in general when they use our system.

Wrong: “The latency of the system is 676 microseconds.”

This makes no sense because you either have a latency measurement (singular) or you have a distribution of latency measurements. This is the first law: “There is no THE LATENCY of the system.” To talk about plural latency measurements we need to adopt the language of statistical distributions. This is the only way to talk about latency in general.

Still wrong: “The average latency of the system is 676 microseconds.”

Average is an example of a summary statistic. Summary statistics give us lightweight ways to talk about distributions. In particular, the average tells us about the center of the data — but it is only a faithful summary of the center when the data are roughly normally distributed. This latest attempt to talk about latency is therefore still wrong because of the second Law of Latency: Latency distributions are NEVER NORMAL.
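To see why the average misleads, consider a synthetic long-tailed sample (illustrative numbers, not real measurements) with mostly fast requests and a few slow outliers — a shape real latency data often has:

```python
import statistics

# Synthetic, long-tailed latency sample (milliseconds):
# 990 fast requests plus 10 slow outliers.
latencies_ms = [1.0] * 990 + [500.0] * 10

mean_ms = statistics.mean(latencies_ms)
p99_ms = sorted(latencies_ms)[int(0.99 * len(latencies_ms))]

print(f"mean = {mean_ms:.2f} ms")  # ~6 ms: looks fine
print(f"p99  = {p99_ms:.1f} ms")   # 500 ms: 1% of users wait half a second
```

The mean says the system is fast; the tail says 1 in 100 requests is 500× slower.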

Let’s begin to correct our imprecise language by using the language of generalized distributions:

Better: “The system responds to requests with a latency ≤ 100 ms 99.9% of the time.”
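A claim in this form can be checked directly against measurements: the fraction of samples at or under the threshold must reach the target quantile. A sketch, using a synthetic sample (the numbers are illustrative only):

```python
# Synthetic sample: 9,990 fast requests and 10 slow ones.
latencies_ms = [5.0] * 9_990 + [250.0] * 10

threshold_ms = 100.0
target_fraction = 0.999  # "99.9% of the time"

fraction_ok = sum(1 for l in latencies_ms if l <= threshold_ms) / len(latencies_ms)
print(fraction_ok, fraction_ok >= target_fraction)
```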

Great! But we’re still not communicating decision-empowering information, because we haven’t disclosed our load and configuration conditions. Let’s try again:

Useful: “Under a constant load of 1,000 requests per second, a single instance responds to requests with a latency ≤ 100 ms 99.9% of the time.”

Now we can make decisions! If we have a latency requirement, we can capacity plan to handle the real-world throughput we expect. If our latency requirement was ≤ 100 ms 99.9% of the time and the desired overall throughput is 10,000 requests per second, we definitely need at least 10 instances.

If our latency requirement is at a quantile other than 99.9%, then we need measurements reported at that quantile before we can capacity plan. This is the fifth Law of Latency: You can’t capacity plan without a LATENCY REQUIREMENT.
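The arithmetic behind “at least 10 instances” can be sketched as follows (numbers taken from the example above; this assumes per-instance capacity was measured at the same quantile as the latency requirement, and a real plan should add headroom for failover and load spikes):

```python
import math

# Per-instance capacity: the load at which one instance still meets
# the latency requirement (from the "Useful" statement above).
per_instance_rps = 1_000
target_rps = 10_000  # expected real-world throughput

# Ceiling division: this is a floor on instance count, not a recommendation.
instances = math.ceil(target_rps / per_instance_rps)
print(instances)
```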

Remember, there is no THE LATENCY of the system. If you are talking about latency and not using the language of applicable statistical distributions then you are reasoning incorrectly and in danger of making wrong decisions.

Next week we will take a deeper dive into the Second Law of Latency: Latency distributions are NEVER NORMAL.



I am a father and technical leader. I help technical teams make bigger and better impacts by encouraging them to challenge the limits of what is possible.