Riding the swell: A guide for API rate limiting

Jun Liu
Engineering the Skies: Qantas Tech Blog
Jun 21, 2023 · 5 min read

Our Loyalty platform is used by multiple business units, internal channels and external partners, each with varying usage patterns and requirements. To ensure fair access and prevent any single consumer from monopolising the API, we implemented rate limiting with custom limits for each consumer.

What is rate limiting?

Rate limiting is a technique used to regulate the frequency of incoming requests to an online service. Its purpose is to prevent the service from being overloaded by a large number of requests in a short period. Rate limiting is implemented by preventing, slowing down, or rejecting requests that exceed the established limit. For instance, over HTTP a rejected request is typically answered with the status code “429 Too Many Requests”.
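
As a purely illustrative sketch (the exact headers and response body vary between services and gateways), a client that exceeds the limit might receive a response like the one below, where the standard Retry-After header indicates how long to wait before retrying:

HTTP/1.1 429 Too Many Requests
Retry-After: 1
Content-Type: application/json

{ "message": "API rate limit exceeded" }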

Rate limiting benefits

  • Security — rate limiting mitigates the risk posed by brute-force attacks on sensitive functionality such as login and promo code entry. It also helps to defend against denial of service (DoS) and distributed denial of service (DDoS) attacks.
  • Controlling operational costs — when resources auto-scale under a pay-per-use model, rate limiting imposes a virtual ceiling on that scaling and thus helps to regulate operational costs.
  • Preventing cascading failures — where legacy systems cannot handle a high volume of requests efficiently, rate limiting reduces the risk of those systems failing, thereby increasing the availability of the overall system.

Rate limiting on the Loyalty platform

Rate limiting has allowed us to set thresholds for the maximum number of requests that can be made per unit of time by each client or user, tailored to their specific needs. We also recognise that security is about well-defended layers, and we saw rate limiting as an additional layer of protection against DDoS attacks.

We utilise Kong API Gateway to enforce security, and it offers two types of rate-limiting plugins (basic and advanced). The advanced plugin supports consumer-based rate limiting.

The approach we have taken is to impose a default rate limit on an API endpoint and then establish an additional, consumer-specific limit that supersedes the default configuration.

Implementation

The code below illustrates how to interact with Kong using the Kong Admin API, which offers endpoints for managing services, routes, plugins and other entities.

  • Step 1, create a service named loyalty-service.
curl -XPOST 'http://<kong-admin-api-url>/services' \
-d 'name=loyalty-service' \
-d 'url=http://www.example.com/api'

Once the service has been created successfully, Kong returns a response similar to the following.

{
  "id": "<uuid>",
  "created_at": 1677475642,
  "updated_at": 1677475642,
  "name": "loyalty-service",
  "retries": 5,
  "protocol": "http",
  "host": "www.example.com",
  "port": 80,
  "path": "/api",
  "connect_timeout": 60000,
  "write_timeout": 60000,
  "read_timeout": 60000,
  "tags": [],
  "tls_verify": true,
  "tls_verify_depth": null,
  "enabled": true
}
  • Step 2, create a route named service-route for the API endpoint (/member/some-action) on the service. To support CORS, the HEAD and OPTIONS methods must also be enabled. (A quick sanity check of the new route follows the command.)
curl -XPOST 'http://<kong-admin-api-url>/routes' \
-H 'Content-Type: application/json' \
-d '{
  "name": "service-route",
  "methods": ["POST", "HEAD", "OPTIONS"],
  "paths": ["/member/some-action"],
  "service": {"name": "loyalty-service"}
}'
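
To confirm the route has been created and attached to the correct service, it can be fetched back by name. This is only a sanity check and is not required for the setup.

curl -XGET 'http://<kong-admin-api-url>/routes/service-route'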
  • Step 3, obtain the consumer ID through the consumer’s name.

The value of the id field in the response will be used in a subsequent step. As our consumers have already been created, we simply need to retrieve the consumer ID; if a consumer does not exist, it must be created first (a sketch of that call follows the response below).

curl -XGET 'http://<kong-admin-api-url>/consumers/<consumer_name>'

Response:

{
  "id": "9919ac80-3b39-48d7-895c-378d94fdd44b",
  "created_at": 1677475689,
  "username": "<consumer-name>",
  "custom_id": "my-custom-id",
  "tags": []
}
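
If a consumer does not exist yet, creating one is a single call. The username and custom_id values below are placeholders to be replaced with your own naming.

curl -XPOST 'http://<kong-admin-api-url>/consumers' \
-d 'username=<consumer-name>' \
-d 'custom_id=my-custom-id'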
  • Step 4, create a default rate limit at the route level.

Since there are multiple gateway instances operating in production, we need a centralised location to store the rate-limiting counters. Therefore, we are using our internal Redis cluster, as indicated by the strategy field. The limit defines how many requests are permitted within the configured window_size, which in the example below is five requests per second.

Be aware that the identifier is set to the service level, which means all consumers share the same rate-limit counter. In the example below, if Channel 1 sends three requests in one second, Channel 2 and Channel 3 can only send two more requests between them in that same second. The value can be adjusted to suit your requirements. (A quick smoke test of this limit follows the command.)

curl -XPOST http://<kong-admin-api-url>/routes/service-route/plugins \
-d "name=rate-limiting-advanced" \
-d "config.limit=5" \
-d "config.window_size=1" \
-d "config.identifier=service" \
-d "config.sync_rate=0" \
-d "config.strategy=redis" \
-d "config.hide_client_headers=false" \
-d "config.redis.host=<redis-host-url>" \
-d "config.redis.port=<redis-port>" \
-d "config.redis.ssl=true"
  • Step 5, create another rate limit at the route-plus-consumer level.

When Kong recognises a particular consumer, it applies that consumer's overriding rate-limit settings. In this instance, the limit is raised to ten requests per second and the identifier is set to the consumer level. The consumer ID in the URL below is the one obtained in step 3. (A verification call follows the command.)

curl -XPOST http://<kong-admin-api-url>/consumers/9919ac80-3b39-48d7-895c-378d94fdd44b/plugins \
-d "name=rate-limiting-advanced" \
-d "config.limit=10" \
-d "config.window_size=1" \
-d "config.route.name=service-route" \
-d "config.identifier=consumer" \
-d "config.sync_rate=0" \
-d "config.strategy=redis" \
-d "config.hide_client_headers=false" \
-d "config.redis.host=<redis-host-url>" \
-d "config.redis.port=<redis-port>" \
-d "config.redis.ssl=true"

Result

Previously, we used window_type set to sliding and sync_rate set to 0.1. The outcome was generally positive, yet there were a few spikes during load testing. This was because our window size was the smallest Kong permits (1 second), and we were using the Redis cluster to store the request counter and share it asynchronously (synchronising every 0.1 seconds). On occasion the synchronisation lagged, and the sliding algorithm is more intricate than the fixed one, so it takes longer to calculate.

Later, we changed the window_type to fixed and the sync_rate to 0 (synchronous), and the outcome appears to be more stable. In our load-test graph, deep green represents the default consumers, while light green highlights the specific consumer (the dip in the middle is due to a decrease in load and is not related to rate limiting).
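
For reference, switching an existing plugin instance between the two modes is a single PATCH against its plugin ID; the <plugin-id> placeholder below stands in for the id returned when the plugin was created in steps 4 and 5.

curl -XPATCH 'http://<kong-admin-api-url>/plugins/<plugin-id>' \
-d "config.window_type=fixed" \
-d "config.sync_rate=0"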

With this configuration, rate limiting has been promoted to our production environment and has been functioning effectively since deployment.

Jun Liu
Senior Software Engineer - Business Utilities (Qantas Loyalty)