Booking.com’s Journey with Brotli

The challenges of improving performance in a complex environment

Mark Zabaro
Dec 10, 2020 · 23 min read
The Transfăgărășan road in Romania has many tight curves and winds up a mountain. It gives stunning views and a challenging drive.
The Transfăgărășan road in Romania is known for its jaw-dropping views. But you’re gonna have to work for it. Photo CC BY-SA 2.0 by Antony Stanley, from Flickr.

Our first brotli test (and the basics of performance experimentation on our site)

What we measured and how we measured it

Implementation constraint #1: Dynamic, not static, resources

Implementation constraint #2: Single-flush pages

A sequence diagram that shows the approximate mechanism for pages to load faster when early-flushing is used
An illustration of how an early-flush permits pages to load resources sooner, and thus become ready sooner

Recap of test setup and summary of results

Our second brotli test

Performance metrics: New and Improved!

A density curve of some latency metric sampled from real production traffic
When I look at latency curves like this, I’m often reminded of the drawing in “The Little Prince” of a snake who’s swallowed an elephant.

Results of 2nd test

Time passes…

Wait. Does sending fewer bytes actually drive performance?

Looking at brotli with fresh eyes

Okay, so it makes our pages slower. But how?

Compression time?

Decompression time?

Has anyone else had this problem but us?

A “crazy theory”

Why the theory makes a ton of sense

What good is a hypothesis?

Running two tests for the price of one

Table of effects observed when testing the interplay between brotli and Link headers on pages implemented with an early-flush
Summary of effects on time-to-DCL metric sample distribution and sample collection in the A/B/C/D test

Where we are today

What’s still missing or unknown

Recap

Gratitude

Booking.com Development

Software development at Booking.com