EXPEDIA GROUP TECHNOLOGY — SOFTWARE

Traffic Shedding, Rate Limiting, Backpressure, Oh My!

How to stop your service from getting overloaded

Jack Shirazi
Expedia Group Technology


Options to avoid overloading web applications, e.g. scaling, overprovisioning, queueing, rate limiting, backpressure, and shedding.

Success is lovely. Too much success can be hard to deal with unless you’re prepared for it. This is true for both life and your applications.

In terms of applications, you ideally want to take all the traffic you can get. To accept all the traffic possible, you have three options: scale, overprovision, queue. These are not mutually exclusive; you can use a combination, though queueing is typically only an option if your service can handle requests asynchronously. To be clear:

  • scaling is adding sufficient additional capacity to handle the traffic increase (it can be done automatically or manually)
  • overprovisioning is already having sufficient additional capacity to handle any traffic increase
  • queueing is temporarily holding the traffic somewhere and processing it as resources come free (a minimal sketch follows this list)
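
Here is what the queueing option might look like in code. This is a minimal Java sketch, assuming a fixed pool of workers draining a bounded in-memory queue; the class name, pool size, and queue capacity are all illustrative, and a real deployment would more likely queue into something durable such as a message broker.

    import java.util.concurrent.LinkedBlockingQueue;
    import java.util.concurrent.RejectedExecutionException;
    import java.util.concurrent.ThreadPoolExecutor;
    import java.util.concurrent.TimeUnit;

    // Queueing: temporarily hold requests on a bounded queue and process
    // them as worker threads come free. All sizes here are illustrative.
    public class QueueingService {
        private final ThreadPoolExecutor workers = new ThreadPoolExecutor(
                4, 4,                                  // fixed pool of 4 workers
                0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(1_000),      // hold up to 1,000 requests
                new ThreadPoolExecutor.AbortPolicy()); // reject when the queue is full

        /** Queues the request; returns false once queueing capacity is used up. */
        public boolean accept(Runnable request) {
            try {
                workers.execute(request);
                return true;
            } catch (RejectedExecutionException queueFull) {
                return false; // this is where the options below take over
            }
        }
    }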

So great, you’re done. But what do you do when you can’t scale further, your provisioning is used up, and you have no (more) queueing capacity? That’s where the other three options, from the title of this article, come in.

Typically, if your service is overloaded then at best it starts to respond with longer and longer response times, and at worst it starts to error and ultimately fails. So you want to avoid overloading. When you can’t add capacity, you have to somehow limit the load so that it’s not “over” loaded. You have three options:

  • First, and easiest, you can simply drop traffic. Requests will probably time out as they won’t get a response. That is the traffic shedding option
  • Better is to rate limit (also called throttling) the requests; this is the most common option used. Here you return a specific error telling the client they’re sending too many requests. For example, HTTP has status code 429 built into the protocol precisely for this purpose (both shedding and rate limiting are sketched after this list).
  • Most sophisticated is backpressure. This isn’t something you can do solely on the server side: both the server and the client need to co-operate. The server-side half of backpressure is rate limiting. To complete the loop, the client needs to understand the rate-limiting feedback and reduce its request rate so that it stops exceeding the rate limit (the client side is sketched below too).
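
To make shedding versus rate limiting concrete, here is a minimal sketch of a token bucket in front of the JDK’s built-in HttpServer. The bucket size, refill rate, and port are illustrative assumptions: when a token is available the request is processed; when the bucket is empty the server returns 429, whereas a shedding server would simply drop the request without any response.

    import com.sun.net.httpserver.HttpServer;
    import java.io.OutputStream;
    import java.net.InetSocketAddress;

    public class RateLimitedServer {

        // A simple token bucket: tokens refill at a steady rate up to a
        // burst capacity, and each request consumes one token.
        static final class TokenBucket {
            private final long capacity;
            private final double refillPerSecond;
            private double tokens;
            private long lastRefillNanos = System.nanoTime();

            TokenBucket(long capacity, double refillPerSecond) {
                this.capacity = capacity;
                this.refillPerSecond = refillPerSecond;
                this.tokens = capacity;
            }

            synchronized boolean tryAcquire() {
                long now = System.nanoTime();
                tokens = Math.min(capacity,
                        tokens + (now - lastRefillNanos) / 1e9 * refillPerSecond);
                lastRefillNanos = now;
                if (tokens < 1) {
                    return false; // over the limit
                }
                tokens -= 1;
                return true;
            }
        }

        public static void main(String[] args) throws Exception {
            TokenBucket bucket = new TokenBucket(100, 50.0); // burst 100, 50 req/s
            HttpServer server = HttpServer.create(new InetSocketAddress(8080), 0);
            server.createContext("/", exchange -> {
                if (bucket.tryAcquire()) {
                    byte[] body = "ok\n".getBytes();
                    exchange.sendResponseHeaders(200, body.length);
                    try (OutputStream os = exchange.getResponseBody()) {
                        os.write(body);
                    }
                } else {
                    // Rate limiting: explicit feedback telling the client to
                    // slow down. (Traffic shedding would instead close the
                    // exchange with no response, leaving the client to time out.)
                    exchange.getResponseHeaders().set("Retry-After", "1");
                    exchange.sendResponseHeaders(429, -1);
                    exchange.close();
                }
            });
            server.start();
        }
    }

A token bucket is a common choice here because it allows short bursts up to the bucket’s capacity while still enforcing the steady-state rate.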
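And here is the client half of the backpressure loop: a sketch, under the same illustrative assumptions, of a client that watches for 429 responses and adapts its own request rate, honouring the Retry-After header when the server provides one. The back-off and speed-up numbers are invented for the example; TCP’s congestion control is the classic real-world instance of this kind of feedback loop.

    import java.net.URI;
    import java.net.http.HttpClient;
    import java.net.http.HttpRequest;
    import java.net.http.HttpResponse;

    public class BackpressureClient {
        public static void main(String[] args) throws Exception {
            HttpClient client = HttpClient.newHttpClient();
            HttpRequest request = HttpRequest.newBuilder(
                    URI.create("http://localhost:8080/")).build();
            long delayMillis = 10; // current pause between requests

            for (int i = 0; i < 1_000; i++) {
                HttpResponse<Void> response = client.send(
                        request, HttpResponse.BodyHandlers.discarding());
                if (response.statusCode() == 429) {
                    // The server says we're over its rate limit: honour
                    // Retry-After if present, otherwise double our pause.
                    delayMillis = response.headers().firstValue("Retry-After")
                            .map(seconds -> Long.parseLong(seconds) * 1000)
                            .orElse(Math.min(delayMillis * 2, 5_000));
                } else {
                    // Request accepted: cautiously speed back up again.
                    delayMillis = Math.max(delayMillis / 2, 10);
                }
                Thread.sleep(delayMillis);
            }
        }
    }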

So all in all, there aren’t that many options to think through when considering how to handle traffic volumes. Really there are just the following seven:

  • There are three options to handle as much traffic as possible — scale, overprovision, queue;
  • And another three to avoid overloading when the first three aren’t enough — traffic shedding, rate limiting, backpressure;
  • If you don’t apply some of the first six options, you have chosen the last option by default — failing by overloading.

I’d like to thank my colleagues in the Expedia Group Reliability Engineering team who helped me put together this set of options.

Learn more about technology at Expedia Group


Jack Shirazi
Expedia Group Technology

Working in the Elastic Java APM agent team. Founder of JavaPerformanceTuning.com; Java Performance Tuning (O’Reilly) author; Java Champion since 2005