Laravel Rate Limiting in Production

Tobias
Jan 3, 2020 · 7 min read

Symbolic Image for Rate Limiting (by twinsfisch — unsplash.com)

Whenever you develop a Laravel-based application that makes it into production, you'll probably provide some kind of API, either for internal use or for customers to consume. In any case, you'll most likely want to install some kind of rate limiting mechanism to make sure nobody overuses your API and risks taking down your hosting infrastructure.

A well-known mechanism to prevent APIs from being overloaded with requests (a kind of DoS protection) is rate limiting. You define a maximum number of requests in a given amount of time, and when a client hits that maximum your web server answers with HTTP 429 Too Many Requests. This indicates to the client that there is a cooldown period it needs to wait out before further requests will be processed.

Rate Limiting Middleware

Figure 1 — Rate Limiting using Laravel’s “throttle” middleware
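Figure 1 shows Laravel's built-in throttle middleware in action. As a rough sketch (the route and the limit values are illustrative, not taken from the figure), attaching it to a route group looks like this:

```php
// routes/api.php - illustrative example, not the original figure's code.
// Allow at most 60 requests per minute per client before Laravel
// answers with HTTP 429 Too Many Requests.
Route::middleware('throttle:60,1')->group(function () {
    Route::get('/user', function () {
        return auth()->user();
    });
});
```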

This is pretty cool default functionality. However, it comes with a downside you will experience in production: the request still hits your Laravel application before being denied due to an exceeded limit, which means it still generates load on your web server.

Of course the impact isn't as high as it would be for a fully processed request, but the web server still needs to boot the Laravel framework, and the request passes through every middleware that runs before the throttle middleware itself. That alone can cause database queries to be executed before the check for the exceeded rate limit even happens.

Finally, the request causes load on your cache server (e.g. Redis): Laravel's rate limiting middleware stores the client's IP address alongside the number of requests in a given time period and performs a check on every request.

Load Balancer Rate Limiting

A load balancer is a piece of software that usually sits in front of your web server stack. In Figure 2, traffic flows from left to right until it reaches your Laravel application.

Figure 2 — Position of the Load Balancer in the Web-Server Stack

It is desirable to kill unwanted requests as early as possible in that processing chain to reduce load on the backend. One of the most widely used load balancers is HAProxy. Although the website looks like software from the 90s, it's battle-tested and under very active development. At the time of writing, HAProxy has reached stable version 2.1 and is "cloud ready" for usage with modern technologies like Kubernetes.

HAProxy is primarily a Layer 7 HTTP load balancer (it also supports further protocols, which is pretty awesome). That means it can handle SSL offloading, inspect the user's request, and decide a few key things based on the request details:

  • First of all, it can decide which backend to use for the incoming request, which means you could split your application into two different Laravel applications: one for the frontend and another for the backend.
  • It can restrict certain URIs to a given IP range or require basic authentication for them. That way I'm able to protect the Laravel Horizon dashboard in production: it's only accessible from a specific VPN IP range for additional security.
  • It can distribute your users' requests across several backend web servers, which means you are able to scale your deployment. You no longer need to buy bigger and bigger machines; you can just add some, and remove them when no longer needed (e.g. after a huge sale event, when running a web shop).

However this article will focus on the configuration of rate limiting within HAProxy for the sake of performance and stability of your web-server deployment.

Configuration of HAProxy

Figure 3 — Max Connection Settings for HAProxy
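Figure 3 covers the max connection settings. A minimal sketch of such a backend section (the server names and addresses are placeholders; only the limits of 3 × 30 connections and the 10-second queue timeout come from the text below) could look like this:

```
# haproxy.cfg - illustrative sketch; hostnames and IPs are placeholders.
backend laravel_backend
    # A request waits in the queue for at most 10 seconds before
    # HAProxy gives up and answers HTTP 503 Service Unavailable.
    timeout queue 10s
    # Three servers, each limited to 30 concurrent connections
    # (3 x 30 = 90 concurrent connections in total).
    server web1 10.0.0.11:80 check maxconn 30
    server web2 10.0.0.12:80 check maxconn 30
    server web3 10.0.0.13:80 check maxconn 30
```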

When there are more than 90 (3 times 30) concurrent connections, HAProxy puts additional requests in a queue and forwards them once the active connection count drops below the limit again. Usually this happens within milliseconds, so your users will barely notice under normal circumstances. A request that waits in the queue for more than 10 seconds is dropped and the client receives HTTP 503 Service Unavailable, which means HAProxy couldn't elect a backend server to serve the request.

One might ask why you should limit the connections to the backend servers at all. The idea behind this is that it's better to serve some HTTP errors to some clients than to bury your web backend under such a heavy workload that your application becomes inoperable. It's a kind of protection for your infrastructure.

HAProxy Rate Limiting

To recognize a user we will use their requesting IP address as the dictionary key. The value we are interested in is the number of HTTP requests the client makes.

Figure 4 — Establish Rate Limiting within HAProxy
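The configuration in Figure 4 can be sketched roughly like this (the frontend name is a placeholder; the table size, expiry, rate window, and limit follow the description below):

```
# haproxy.cfg - sketch of the rate limiting frontend from Figure 4.
frontend http_in
    bind :80
    # Line 3: stick-table keyed by client IP (type ipv6 stores IPv4 too),
    # up to 100k entries, entries expire after 30 seconds, tracking the
    # HTTP request rate over a 10-second window.
    stick-table type ipv6 size 100k expire 30s store http_req_rate(10s)
    # Track each client in sticky counter 0, keyed by source IP.
    http-request track-sc0 src
    # Deny with HTTP 429 once a client exceeds 20 requests
    # within the 10-second window.
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 20 }
```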

The first two lines of that configuration example are a plain frontend definition in HAProxy: you create a new frontend (something that receives a user's request) and bind it to port 80, the default HTTP port.

In line 3 you create a stick-table that stores IP addresses as the "dictionary key", has a maximum size of 100k entries whose values expire after 30 seconds, and stores the request rate of the latest 10 seconds for each client. The reason we use ipv6 as the table type is that the default ip type cannot store IPv6 addresses due to a limitation of the key length. Although the type suggests the table can only store IPv6 addresses, this is not the case; it can easily store both, so don't worry.

Afterwards we initialize a so-called sticky counter (sc) to count the incoming HTTP requests of each user, and deny the request with HTTP 429 Too Many Requests if the HTTP request rate exceeds 20 requests within the time window defined in line 3 (in our case, 10 seconds).

HAProxy will automatically take care of the table and purge old entries. So after some time the clients will be able to connect to the server again.

Downsides

Figure 5 — X-RateLimit headers of the Laravel Throttle Middleware

As you can see, Laravel automatically adds some additional headers to its response when a route is rate-limited. You can see the hard limit of requests (5 in the example) and the remaining number of requests you can perform before getting an HTTP 429 from the server. Once the limit is hit, it additionally provides a retry counter and a Unix timestamp that tell you when you are allowed to perform new requests again.
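Assuming Laravel's default throttle middleware, a rate-limited response carries headers roughly like this (the values are illustrative, not taken from the figure):

```
HTTP/1.1 429 Too Many Requests
X-RateLimit-Limit: 5
X-RateLimit-Remaining: 0
Retry-After: 58
X-RateLimit-Reset: 1578042000
```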

You won't get those headers with the HAProxy configuration above. Therefore I personally decided to use the load balancer rate limiting technique alongside Laravel's rate limiting middleware: you can easily configure much higher limits in your load balancer than in your Laravel application, and you still get some protection against flooding.

For example, you could set up the Laravel throttle middleware to allow no more than 60 requests per minute, so the user gets one request per second. Then you could configure HAProxy to deny requests when there are more than 120 requests per minute. If your user uses your API correctly and honors the rate limiting headers, they will never hit your load balancer limit. But if the user ignores the headers and keeps flooding your application with requests even though your Laravel middleware denies them, they will run into the load balancer rate limiting at some point.
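Putting the two layers together, the outer HAProxy limit from this example could be sketched like this (the frontend name and table sizing are assumptions; the inner 60-per-minute limit stays with Laravel's throttle middleware):

```
# haproxy.cfg - outer limit of 120 requests per minute per client IP.
# The inner limit (60 per minute) is handled by Laravel's
# "throttle:60,1" middleware, which also emits the X-RateLimit headers.
frontend http_in
    bind :80
    stick-table type ipv6 size 100k expire 60s store http_req_rate(60s)
    http-request track-sc0 src
    http-request deny deny_status 429 if { sc_http_req_rate(0) gt 120 }
```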

By doing this you can efficiently prevent your infrastructure from being flooded with requests.

Conclusion

  • In production it may be a problem that your users' requests still hit your web server backend even though the user is already rate-limited.
  • Rate limiting via Laravel middleware costs more than rate limiting at the edge of your web stack (at the load balancer).
  • HAProxy provides a convenient way to achieve rate limiting using stick-tables and some easy deny rules at the frontend.
  • It's better to show some users an HTTP error than to bury your infrastructure under heavy load (no matter whether it's a DoS attack or just a high amount of legitimate traffic).

In the future I'll publish more articles about production-specific experiences I've had with Laravel. I hope you can take away something for your own projects.

Hosting is fun and there are many ways to fine-tune your application individually. One-click hosting solutions may suit many projects, but when it comes to performance and security you may prefer a tailored solution.

Written by Tobias
DevOps Engineer — I'm building and breaking stuff that is related to web development.

The Startup
Medium's largest active publication, followed by +756K people.