Understanding rate limiting on HAProxy

Overcome the initial steep learning curve and get standard security practices in place.

Introduction

HAProxy is a free, open-source, high-availability load balancer and proxy server. It has become very popular since it’s low on resources and high on performance. It’s been my go-to solution in various projects because, contrary to many alternatives, HAProxy’s community edition bundles more than enough features for a robust load-balancing deployment.

The software has, in my experience, a steep learning curve. Its technical documentation, however, is very thorough and goes into great detail. I’d go as far as saying it is the most complete of any open-source software I’ve ever used.

In this article I explain how rate limiting can be implemented on HAProxy. Rate limiting is one of the most widely used server security practices, yet implementing it on HAProxy is rarely explained or presented well. It took me a lot of time, effort and trial-and-error to understand it and get it working.


No prior knowledge of the software is necessary, as I try to explain all the steps involved.


Setting up the load balancer

We are going to use Docker and Docker Compose to save ourselves some time and concentrate on what matters within the scope of this article. That way we can set infrastructure considerations aside and simply get the main components up and running quickly.

Our initial goal is to set up a working instance of an HAProxy load balancer with a couple of Apache backend servers behind it.

Clone the repository

$ git clone git@github.com:stargazer/haproxy-ratelimiter.git
$ cd haproxy-ratelimiter

Feel free to take a look at the Dockerfile and docker-compose.yml, which describe the setup we’ll be using. Explaining them is out of the scope of this article, so for now you’ll have to trust me that they set up a working HAProxy instance called loadbalancer with 2 backend servers behind it, api01 and api02. For the HAProxy configuration we’ll initially use the haproxy-basic.cfg file, and then switch to haproxy-ratelimiter.cfg.

For the sake of simplicity, the initial HAProxy configuration file haproxy-basic.cfg is as basic and stripped-down as it gets. Let’s take a look at it.
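In essence it looks something like the following sketch, reconstructed from the description below; the exact file in the repository may differ, and the defaults and timeouts here are assumptions:

defaults
  mode http           # operate at the HTTP layer
  timeout connect 5s  # assumed sane defaults; the repository file may differ
  timeout client 30s
  timeout server 30s

frontend proxy
  bind *:80              # listen on port 80
  default_backend api    # forward all requests to the api pool

backend api
  balance roundrobin     # alternate between the two servers
  server api01 api01:80  # Docker Compose service names resolve as hostnames
  server api02 api02:80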

The section frontend proxy defines that HAProxy listens on port 80 and forwards all requests to the api backend pool.

The section backend api defines the api backend pool with its 2 backend servers, called api01 and api02, and their corresponding addresses. The server that serves any given incoming request is chosen by the roundrobin load-balancing algorithm, which means the two available servers take turns.

Let’s get all 3 of our containers up and running.

$ sudo docker-compose up

We now have the loadbalancer container forwarding requests to the 2 backend servers api01 and api02. Pointing our web browser to the URL http://localhost/ should be enough to get a response from one of the backend servers.

It’s interesting to refresh a few times and observe the logs that docker-compose emits.

As the logs show, the requests are handled in turn by the 2 api servers.
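We can also generate a few requests from the command line and watch the compose logs alternate between api01 and api02. The curl flags just silence the response body:

$ for i in 1 2 3 4; do curl -s -o /dev/null http://localhost/; done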

We now have an instance of HAProxy running a very basic load balancing configuration, and hopefully by now we have an idea of how that works.


Adding rate limiting to the load balancer

In order to add rate limiting support to our load balancer, we need to modify the configuration file that the HAProxy instance uses. We have to make sure that the loadbalancer container picks up the haproxy-ratelimiter.cfg configuration file.

Simply modify the Dockerfile to use this one instead.
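Assuming the Dockerfile follows the official haproxy image’s convention of reading its configuration from /usr/local/etc/haproxy/haproxy.cfg, the change amounts to swapping one COPY line; this is a sketch, not the repository’s exact file:

FROM haproxy
# copy the rate-limiting configuration instead of the basic one
COPY haproxy-ratelimiter.cfg /usr/local/etc/haproxy/haproxy.cfg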

Rate limiting directives

The configuration file haproxy-ratelimiter.cfg is what this article is all about.

Let’s take a closer look.
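Its rate-limiting parts are reproduced below, assembled verbatim from the directives discussed in the rest of this section; the ordering and the glue around them (backend addresses, boilerplate) are a sketch, so refer to the repository for the exact file:

frontend proxy
  bind *:80
  default_backend api

  # ACL declarations (functions; only evaluated when a rule uses them)
  acl is_abuse src_http_req_rate(Abuse) ge 10
  acl inc_abuse_cnt src_inc_gpc0(Abuse) gt 0
  acl abuse_cnt src_get_gpc0(Abuse) gt 0

  # rules, applied in turn to every incoming connection/request
  tcp-request connection track-sc0 src table Abuse
  tcp-request connection reject if abuse_cnt
  http-request deny if abuse_cnt
  http-request deny if is_abuse inc_abuse_cnt

backend api
  balance roundrobin
  server api01 api01:80
  server api02 api02:80

# dummy backend; exists only to hold the stick-table
backend Abuse
  stick-table type ip size 100k expire 30m store gpc0,http_req_rate(10s)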

HAProxy offers a set of very low-level primitives that provide great flexibility and can be used for a variety of use cases. The generic counters it exposes often remind me of a CPU’s accumulator register: they store intermediate results and can take on various semantics, but at the end of the day they are just numbers. To get a good understanding, it makes sense to start from the very end of the configuration file.

The Abuse stick table

Here, we define a dummy backend called Abuse. Dummy, since it’s only used to define a stick-table that the rest of the configuration can refer to by the name Abuse. The stick-table is nothing but a storage space, or rather a lookup table, for request data. Our stick-table has the following characteristics:

  • type ip: Requests stored in the stick table will have their IP as key. So, requests from the same IP will refer to the same record. Essentially this means that we keep track of IPs and data related to them.
  • size 100K: The table has a maximum of 100K entries.
  • expire 30m: The table entries expire after 30 minutes of inactivity.
  • store gpc0,http_req_rate(10s): The table records store the general-purpose counter gpc0 and the IP’s request rate over the last 10-second interval. We’ll be using gpc0 to keep track of the number of times an IP has been marked as abusive. Essentially, a positive value implies that the IP has been marked as abusive. Let’s call this counter the abuse indicator.

All in all, what the Abuse table does is keep track of whether an IP is abusive, as well as its current request rate. We therefore have both its historical track record and its real-time behavior.
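As a side note, HAProxy’s Runtime API can dump the table’s contents while it runs, which is handy for debugging. This is not part of the repository’s setup; it assumes a stats socket has been added to the global section of the configuration:

global
  stats socket /var/run/haproxy.sock mode 600 level admin

With that in place, and socat available inside the container, the table can be queried:

$ echo "show table Abuse" | socat stdio /var/run/haproxy.sock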

Now let’s go to the frontend proxy section and see what’s new there.

ACL functions and rules

An ACL (Access Control List) is a function declaration. The function is only invoked when used by a rule; in and of itself, an ACL is nothing more than a declaration.

Let’s see all 3 of them in detail. Keep in mind that since all of them explicitly refer to the Abuse table, which uses the IP as key, the functions are applied to the request’s source IP.

  • acl is_abuse src_http_req_rate(Abuse) ge 10: Function is_abuse returns True if the current request rate is greater than or equal to 10.
  • acl inc_abuse_cnt src_inc_gpc0(Abuse) gt 0: Function inc_abuse_cnt returns True if the incremented value of gpc0 is greater than 0. src_inc_gpc0 increments the counter and returns its new value, and since the counter starts at 0, the function always returns True. In other words, it increments the value of the abuse indicator, essentially marking the IP as abusive.
  • acl abuse_cnt src_get_gpc0(Abuse) gt 0: Function abuse_cnt returns True if the value of gpc0 is greater than 0. In other words it tells whether the IP has already been marked as abusive.

As mentioned earlier, the ACLs are simple declarations. They are not applied on incoming requests unless invoked by some rule.

It makes sense to take a look at the rules defined in the same frontend section. The rules are applied in turn on every incoming request and make use of the ACLs that we just defined. Let’s see what each one does.

  • tcp-request connection track-sc0 src table Abuse: Tracks the connection’s source address in the table Abuse. Since the table uses the IP as its key, this rule basically adds the request’s IP to the table, or updates its existing entry.
  • tcp-request connection reject if abuse_cnt: Rejects new TCP connections if the IP has already been marked as abusive. In essence, it forbids new TCP connections from an abusive IP.
  • http-request deny if abuse_cnt: Denies the HTTP request if the IP has already been marked as abusive. This applies to already-established connections that are still open but belong to an IP that has just been marked as abusive.
  • http-request deny if is_abuse inc_abuse_cnt: Denies the HTTP request if is_abuse and inc_abuse_cnt both return True. In other words, it denies the request if the IP currently has a high request rate, and in the process marks the IP as abusive. Note the evaluation order: inc_abuse_cnt is only evaluated, and the counter only incremented, once is_abuse has matched.

Essentially we put in place both real-time and historical checks. The second rule rejects all new TCP connections from an IP that has been marked as abusive. The third rule denies HTTP requests from an IP that has already been marked as abusive, regardless of its current request rate. The fourth rule ensures that HTTP requests from an IP are denied the very moment its request-rate threshold is crossed. So the second rule operates on new TCP connections, whereas the third and fourth operate on established connections, the third being a historical check and the fourth a real-time one.

Let’s play!

We can now build and run our containers again.

$ sudo docker-compose down
$ sudo docker-compose build
$ sudo docker-compose up

Now, the loadbalancer should be running in front of the 2 api servers.

Let’s point our browser to http://localhost/. If we refresh a dozen times quickly, surpassing the threshold of 10 requests per 10-second interval, we see that our requests get denied. If we keep doing that, new requests get rejected very early, before the TCP connection is even established.
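The same can be observed from the command line. Each curl invocation below opens a fresh TCP connection and prints only the HTTP status code. Roughly, the first ten requests should return 200, the next one 403 (the default status of http-request deny), and once the abuse indicator is set, subsequent connection attempts are rejected outright, so curl prints 000 since no response is received:

$ for i in $(seq 1 15); do curl -s -o /dev/null -w "%{http_code}\n" http://localhost/; done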


Questions

Why is the threshold 10 requests per 10s?

The Abuse table defines http_req_rate(10s), meaning the request rate is measured over a 10-second window. The is_abuse ACL returns True for a request rate of 10 or more over that window. That makes 10 requests per 10 seconds the rate at which an IP is considered abusive.

In the example of this article, we’ve chosen to set a rate limit that we can easily reach to prove the rate limiter’s functionality.
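The threshold is easy to tune. For instance, a more production-like limit of 100 requests per minute would look something like the following; these are illustrative values, not taken from the repository:

# in backend Abuse: measure the rate over a 60-second window
stick-table type ip size 100k expire 30m store gpc0,http_req_rate(60s)

# in frontend proxy: mark an IP abusive at 100 or more requests per window
acl is_abuse src_http_req_rate(Abuse) ge 100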

What’s the difference between the rules http-request and tcp-request connection?

From the documentation:

http-request: The http-request statement defines a set of rules which apply to layer 7 processing.

From the documentation:

tcp-request connection: Perform an action on an incoming connection depending on a layer 4 condition

What’s the point of denying HTTP requests, if we reject TCP connections altogether anyway?

Picture this scenario: we have a few TCP connections from an IP, sending HTTP requests to the server. The HTTP request rate increases rapidly above the threshold. That’s when the 4th rule kicks in, denies the requests, and marks the IP as abusive.
Now, it could very well be that the connections from that same IP remain open (see HTTP persistent connections) and that the HTTP request rate has dropped back below the threshold. The 3rd rule makes sure to keep denying HTTP requests on those connections, since the abuse indicator shows that this IP has been marked as abusive.

And let’s now assume that the same IP tries to set up TCP connections a few minutes later. These get rejected immediately, since the 2nd rule is in place, sees that the IP is marked as abusive, and drops the connections at birth.
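Note that, because of expire 30m, an abusive IP is unmarked automatically after 30 minutes of inactivity. To lift the ban earlier, the table entry can be removed through the Runtime API mentioned above, assuming the stats socket is configured; 192.0.2.1 is a placeholder IP:

$ echo "clear table Abuse key 192.0.2.1" | socat stdio /var/run/haproxy.sock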


Conclusion

Rate limiting with HAProxy might not be very straightforward initially. It requires some low-level, possibly unintuitive thinking to get right. The documentation is perhaps a bit too technical here and lacks basic examples. I hope this guide provides a good kickstart for anyone who wants to go down that road!