Fixed Window, Sliding Logs, Leaky Bucket, Sliding Window, Token Bucket
A very brief introduction to rate limiting.
Rate limiting is a technique that lets you control the rate at which your users can send requests to your server. It is mostly used to protect your servers from unwanted traffic bursts and malicious attacks.
Rate limiting can be handled on both the server side and the client side. On the server side, rate limiting can be implemented with:
- Proxy servers like Nginx
- An in-memory high-performance database like Redis
- Third-party services like AWS API Gateway
- Your own data structures and algorithms in your code
For now, let’s use Redis to implement rate limiting with some of the most widely used rate-limiting algorithms and discuss their pros and cons.
The following are the algorithms discussed and implemented.
- Fixed Window
- Sliding Logs
- Leaky Bucket
- Sliding Window
- Token Bucket
Fixed Window
In this algorithm, we use fixed window intervals to count requests. For 100 req/min, we use 1-minute (60-second) windows. Each incoming request increments the counter for the currently active window. Windows are defined by flooring the current timestamp to the window size, so with a 60-second window, if the current time is 10:00:27, the current window starts at 10:00:00, and if the time is 10:01:27, the window starts at 10:01:00.
For each request a user makes,
- Check if the user has exceeded the limit in the current window.
- If the user has exceeded the limit, the request is dropped.
- Otherwise, we increment the counter and handle the incoming request.
At regular intervals, say every 10 minutes or every hour, delete all the expired window keys (instead of setting each window key to expire at the start of the next window with Redis’s EXPIRE command, which would add an extra command/load on Redis for every request).
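Here is a minimal sketch of this check in Python, assuming a local Redis instance and the redis-py client; the key layout and the allow_request helper are illustrative, not a definitive implementation.

```python
import time

import redis

r = redis.Redis(decode_responses=True)

LIMIT = 100          # allowed requests per window
WINDOW_SECONDS = 60  # window size: 1 minute

def allow_request(user_id: str) -> bool:
    """Fixed-window check: one counter per user per window."""
    # The window ID is the current timestamp floored to the window size,
    # e.g. 10:00:27 falls in the window starting at 10:00:00.
    window = int(time.time()) // WINDOW_SECONDS
    key = f"fixed_window:{user_id}:{window}"

    count = r.get(key)
    if count is not None and int(count) >= LIMIT:
        return False  # limit exceeded in this window: drop the request

    r.incr(key)  # creates the key with value 1 if it does not exist
    return True
```

Note that the GET-then-INCR pair is not atomic; under heavy concurrency you would close the race with a MULTI/EXEC pipeline or a Lua script. Skipping EXPIRE matches the periodic-cleanup approach above: stale fixed_window:* keys are deleted in bulk by a scheduled job rather than on every request.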
Pros
- Easy to implement.
Cons
- A burst of requests at the window boundary causes the server to handle more requests than the limit, since this algorithm allows requests in both the current and the next window within a short time span. For example, at 100 req/min, if a user makes 100 requests in the 55–60 second span of one window and another 100 requests in the 0–5 second span of the next window, all of them are allowed. The server thus ends up handling 200 requests in 10 seconds for this user, well above the 100 req/min limit.
Sliding Logs
In this algorithm, we log every request a user makes into a sorted set (sorted by time).
For each request a user makes,
- Check if the user has exceeded the limit in the current window by counting the entries in the sorted set that fall within the last window. If the current time is 10:00:27 and the rate is 100 req/min, count entries from the window start (10:00:27 − 60 s = 09:59:27) to the current time (10:00:27).
- If the user has exceeded the limit, the request is dropped.
- Otherwise, we add a unique request ID (taken from the request, or generated as a hash of the user ID and the time) to the sorted set, scored by time.
At regular intervals, say every 10 minutes or every hour, delete all the expired request IDs from the sorted sets using ZREMRANGEBYSCORE from -inf to the start of the last window.
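A sketch of this flow, again assuming a local Redis and the redis-py client; the uuid-based request IDs and the key names are illustrative.

```python
import time
import uuid

import redis

r = redis.Redis(decode_responses=True)

LIMIT = 100
WINDOW_SECONDS = 60

def allow_request(user_id: str) -> bool:
    """Sliding-log check: one sorted-set entry per request, scored by time."""
    now = time.time()
    key = f"sliding_log:{user_id}"

    # Count requests logged within the last window (now - 60 s .. now).
    if r.zcount(key, now - WINDOW_SECONDS, now) >= LIMIT:
        return False  # limit exceeded: drop the request

    # Log this request under a unique ID, scored by its timestamp.
    r.zadd(key, {str(uuid.uuid4()): now})
    return True

def cleanup_expired(user_id: str) -> None:
    """Periodic job: drop log entries older than the last window."""
    cutoff = time.time() - WINDOW_SECONDS
    r.zremrangebyscore(f"sliding_log:{user_id}", "-inf", cutoff)
```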
Pros
- Overcomes the cons of the fixed window by not imposing fixed window boundaries, and is thus unaffected by bursts of requests at the end of a window.
Cons
- It is not memory-efficient, since we store a new entry for every request made.
- It is expensive, because we count the user’s requests over the last window on every incoming request; this scales poorly under large bursts or attacks.
Leaky Bucket
In this algorithm, we use a limited-size queue and process requests from it at a constant rate in First-In-First-Out (FIFO) order.
For each request a user makes,
- Check if the queue limit has been exceeded.
- If the queue limit has been exceeded, the request is dropped.
- Otherwise, we append the request to the end of the queue, to be handled by a background process.
Requests are processed from the queue at a constant rate in FIFO order (removed from the front of the queue and handled) by a background process. (You can use the LPOP command in Redis at a constant rate; for example, for 60 req/min, remove and handle one element per second.)
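Sketched below with a Redis list as the queue, under the same redis-py assumptions; the handle stub and the drain loop are placeholders for your actual request processing.

```python
import json
import time

import redis

r = redis.Redis(decode_responses=True)

QUEUE_SIZE = 100   # bucket (queue) capacity
DRAIN_PER_SEC = 1  # constant processing rate: 60 req/min

def handle(request: dict) -> None:
    """Placeholder for your application's actual request handling."""
    print("processing", request)

def enqueue_request(user_id: str, request: dict) -> bool:
    """Producer: accept the request only if the bucket has room."""
    key = f"leaky_bucket:{user_id}"
    if r.llen(key) >= QUEUE_SIZE:
        return False  # bucket full: drop the request
    r.rpush(key, json.dumps(request))  # append to the end of the queue
    return True

def drain(user_id: str) -> None:
    """Background worker: leak one request per tick, FIFO."""
    key = f"leaky_bucket:{user_id}"
    while True:
        item = r.lpop(key)  # oldest request, from the front of the queue
        if item is not None:
            handle(json.loads(item))
        time.sleep(1 / DRAIN_PER_SEC)
```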
Pros
- Overcomes the cons of the fixed window by not imposing fixed window boundaries, and is thus unaffected by a burst of requests at the end of a window.
- Overcomes the cons of sliding logs by not storing all the requests (only up to the queue size), and is thus memory-efficient.
Cons
- A burst of requests can fill the queue with old requests, delaying the most recent requests, so there is no guarantee that a request is processed within a fixed amount of time.
- This algorithm performs traffic shaping (handling requests at a constant rate, which prevents server overload, a plus point), but that slows users’ requests and can affect your application.
Sliding Window
In this algorithm, we combine the fixed window and sliding log algorithms. We maintain a counter for each fixed window, and we account for a weighted value of the previous window’s request count along with the current window’s request count.
For each request a user makes,
- Check if the user has exceeded the limit in the current window.
- If the user has exceeded it, the request is dropped.
- Otherwise, we calculate the weighted count of the previous window. For example, if 30% of the current window has elapsed, we weight the previous window’s count by 70%.
- Check if the previous window’s weighted count plus the current window’s count exceeds the limit.
- If it does, the request is dropped.
- Otherwise, we increment the counter in the current window and handle the incoming request.
At regular intervals, say every 10 min or every hour, delete all the expired window keys.
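A sketch of the weighted check under the same redis-py assumptions; the two-keys-per-user layout is one possible design, not the only one.

```python
import time

import redis

r = redis.Redis(decode_responses=True)

LIMIT = 100
WINDOW_SECONDS = 60

def allow_request(user_id: str) -> bool:
    """Sliding-window check: current count plus weighted previous-window count."""
    now = time.time()
    window = int(now) // WINDOW_SECONDS
    current_key = f"sliding_window:{user_id}:{window}"
    previous_key = f"sliding_window:{user_id}:{window - 1}"

    current = int(r.get(current_key) or 0)
    if current >= LIMIT:
        return False  # the current window alone already hit the limit

    # Weight the previous window by the fraction of the current window that
    # has NOT yet elapsed: 30% elapsed -> previous count weighted by 70%.
    elapsed_fraction = (now % WINDOW_SECONDS) / WINDOW_SECONDS
    previous = int(r.get(previous_key) or 0)
    weighted_total = previous * (1 - elapsed_fraction) + current

    if weighted_total >= LIMIT:
        return False  # weighted total exceeds the limit: drop the request

    r.incr(current_key)
    return True
```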
Pros
- Overcomes the cons of the fixed window by not imposing fixed window boundaries, and is thus unaffected by a burst of requests at the end of a window.
- Overcomes the cons of sliding logs by not storing every request and not counting entries on every request, and is thus memory- and performance-efficient.
- Overcomes the leaky bucket’s starvation problem by not delaying requests (no traffic shaping).
Cons
- Not exactly a con, but you still need to delete expired window keys, an extra command/load on Redis, which the next algorithm avoids.
Token Bucket
In this algorithm, we maintain a counter showing how many requests a user has left, along with the time the counter was last reset.
For each request a user makes,
- Check if the window time has elapsed since the counter was last reset. For a rate of 100 req/min and a current time of 10:00:27, a last reset time of 10:00:00 has not elapsed (only 27 s have passed), while a last reset time of 9:59:25 has elapsed (10:00:27 − 9:59:25 = 62 s > 60 s).
- If the window time has not elapsed, check if the user has requests left to handle the incoming request.
- If the user has none left, the request is dropped.
- Otherwise, we decrement the counter by 1 and handle the incoming request.
- If the window time has elapsed, i.e., the difference between the last reset time and the current time is greater than the allowed window (60 s), we reset the counter to the allowed limit (100) and handle the incoming request as above.
In this algorithm, there is no need for background code to check and delete expired keys.
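A sketch using a Redis hash to hold both values, under the same redis-py assumptions; the key and field names are illustrative.

```python
import time

import redis

r = redis.Redis(decode_responses=True)

LIMIT = 100          # tokens refilled per window
WINDOW_SECONDS = 60  # refill interval

def allow_request(user_id: str) -> bool:
    """Token-bucket check: a per-user token count plus its last reset time."""
    key = f"token_bucket:{user_id}"
    now = time.time()

    bucket = r.hgetall(key)  # fields: "tokens" and "last_reset"
    if not bucket or now - float(bucket["last_reset"]) > WINDOW_SECONDS:
        # Window elapsed (or first request ever): refill the bucket,
        # then spend one token on this request.
        r.hset(key, mapping={"tokens": LIMIT - 1, "last_reset": now})
        return True

    if int(bucket["tokens"]) <= 0:
        return False  # no tokens left in this window: drop the request

    r.hincrby(key, "tokens", -1)  # spend one token and handle the request
    return True
```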
Pros
- Overcomes the cons of all the above algorithms: no fixed window boundary problem, memory- and performance-efficient, and no traffic shaping.
- No need for background code to check and delete expired keys.
Finally, I would say that no single algorithm suits every problem; use whichever fits your needs. But I prefer and recommend Token Bucket.
That’s all for now. Let me know your views, and comment below for any clarifications.