Developing a Rate-Limit Middleware for Vapor, the Server-Side Swift Framework

Jun 27, 2024


Photo by ALTEREDSNAPS from Pexels

Middleware is a chain of logic that sits between the client and a Vapor route handler. It lets you perform operations on incoming requests before they reach the route handler, and on outgoing responses before they are sent to the client.

Vapor makes it simple to create these components and add them to a backend project. In this article, we will see how to add them to a project and how Vapor manages them.

We will also design a Middleware component for one specific task: limiting the number of requests that our server handles.

Statistics and Rate-Limit middleware solution design approach

Let’s do it

Suppose that we are working on the first stage of the development of a web service, and the client wants these two features added first:

  • Statistics. Collect data about user requests and store it in a database.
  • Rate-Limit. Our client will expose a public API, but at this early stage wants to control the number of requests each developer sends to the server.

Neither feature is part of the client’s core business, so we will add them as Middleware in our Vapor server.

Coding Middleware

To create a Middleware for Vapor, our type must conform to either the Middleware or the AsyncMiddleware protocol.

The difference between the two is that the Middleware protocol returns an EventLoopFuture of a Response, while AsyncMiddleware uses the async/await pattern and returns a Response directly.
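As a minimal sketch of that difference, here are two pass-through middlewares, one per protocol. The type names PassthroughMiddleware and AsyncPassthroughMiddleware are hypothetical; the method signatures are the ones I understand Vapor 4 to require, so check them against the version you use.

```swift
import Vapor

// EventLoopFuture-based variant: returns a future of the response.
struct PassthroughMiddleware: Middleware {
    func respond(to request: Request, chainingTo next: Responder) -> EventLoopFuture<Response> {
        // Hand the request to the next responder in the chain, unchanged.
        return next.respond(to: request)
    }
}

// async/await variant: awaits the response and returns it directly.
struct AsyncPassthroughMiddleware: AsyncMiddleware {
    func respond(to request: Request, chainingTo next: AsyncResponder) async throws -> Response {
        // Same behavior as above, written with async/await.
        return try await next.respond(to: request)
    }
}
```

Both types do nothing except forward the request, so they are only useful as a starting template.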

The Vapor documentation recommends saving Middleware source files in a folder named Middleware under the Sources/App folder.

Pay attention to the fact that the Middleware can modify the incoming request or the outgoing response, and even short circuit and send a custom response instead of passing the request to the next Middleware in the chain.
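The sketch below illustrates both behaviors in one place: it short circuits with a custom response when a check fails, and otherwise tags the outgoing response. ShortCircuitMiddleware, the isOverLimit closure, the Retry-After value, and the X-Checked-By header are all assumptions made for illustration, not part of any library.

```swift
import Vapor

struct ShortCircuitMiddleware: AsyncMiddleware {
    // Hypothetical check supplied by the caller; a real limiter would go here.
    let isOverLimit: @Sendable (Request) -> Bool

    func respond(to request: Request, chainingTo next: AsyncResponder) async throws -> Response {
        if isOverLimit(request) {
            // Short circuit: answer directly and never call the next responder.
            let rejected = Response(status: .tooManyRequests)
            rejected.headers.replaceOrAdd(name: "Retry-After", value: "60")
            return rejected
        }

        // Normal path: let the rest of the chain produce the response...
        let response = try await next.respond(to: request)
        // ...then modify it on the way back to the client.
        response.headers.replaceOrAdd(name: "X-Checked-By", value: "ShortCircuitMiddleware")
        return response
    }
}
```

Throwing an Abort error instead of returning a Response would be another way to stop the chain, leaving the final response to Vapor's error handling.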

Middleware execution order

Another important aspect is the order in which the Vapor server executes the different Middlewares. That order is determined by the order in which they are registered in code.
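A registration along these lines would produce the order discussed here. This is a sketch: StatisticsMiddleware and RateLimitMiddleware are placeholder types defined only so the example is complete, and the configure function is assumed to be the usual entry point of a Vapor project.

```swift
import Vapor

// Placeholder middlewares that simply forward the request.
struct StatisticsMiddleware: AsyncMiddleware {
    func respond(to request: Request, chainingTo next: AsyncResponder) async throws -> Response {
        try await next.respond(to: request)
    }
}

struct RateLimitMiddleware: AsyncMiddleware {
    func respond(to request: Request, chainingTo next: AsyncResponder) async throws -> Response {
        try await next.respond(to: request)
    }
}

public func configure(_ app: Application) async throws {
    // Registered first, so it sees each request first.
    app.middleware.use(StatisticsMiddleware())
    // Registered second, so it runs after the statistics middleware.
    app.middleware.use(RateLimitMiddleware())
}
```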

With the Statistics Middleware registered first and the Rate-Limit Middleware second, the Statistics Middleware is executed first and the Rate-Limit Middleware second.

A request workflow in this case is the following:

  1. The user requests one of our endpoints.
  2. The Statistics Middleware processes the request and passes it to the next Middleware, in our case the Rate-Limit one.
  3. The Rate-Limit Middleware also processes the request and passes it to the next Middleware if there is one; if not, the request continues to the endpoint.
  4. A controller processes the request and sends a response.
  5. Vapor passes the controller response to the Rate-Limit Middleware, which modifies it if needed.
  6. The next Middleware to process the response is the Statistics Middleware.
  7. When the last Middleware finishes processing the response, Vapor sends it to the client.
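The steps above can be reproduced with a toy model in plain Swift. This is not how Vapor implements its chain; Handler, ToyMiddleware, TraceLog, and buildChain are hypothetical names used only to show why requests travel in registration order and responses in reverse order.

```swift
// A handler takes a request (a String here) and returns a response.
typealias Handler = (String) -> String

// Records the order in which each stage runs.
final class TraceLog {
    var events: [String] = []
}

struct ToyMiddleware {
    let name: String

    // Wraps the next handler, logging before and after it runs.
    func wrap(_ next: @escaping Handler, log: TraceLog) -> Handler {
        let name = self.name
        return { request in
            log.events.append("\(name): request")
            let response = next(request)
            log.events.append("\(name): response")
            return response
        }
    }
}

// Wraps from the last middleware to the first, so the first one
// registered ends up outermost and sees the request first.
func buildChain(_ middlewares: [ToyMiddleware],
                endpoint: @escaping Handler,
                log: TraceLog) -> Handler {
    return middlewares.reversed().reduce(endpoint) { next, middleware in
        middleware.wrap(next, log: log)
    }
}
```

Building a chain with a "Statistics" and a "RateLimit" ToyMiddleware, in that order, logs the request in Statistics, then RateLimit, then the endpoint, and logs the response in RateLimit, then Statistics.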

In this case the Middleware is applied to all requests, no matter the endpoint, but Vapor also allows a Middleware to be used only for a group of routes.
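A route group could look like the following sketch. The routes, paths, and the placeholder RateLimitMiddleware type are assumptions for illustration; only the routes created from the group pass through the middleware.

```swift
import Vapor

// Placeholder middleware that simply forwards the request.
struct RateLimitMiddleware: AsyncMiddleware {
    func respond(to request: Request, chainingTo next: AsyncResponder) async throws -> Response {
        try await next.respond(to: request)
    }
}

func routes(_ app: Application) throws {
    // Registered on the app directly: not rate limited.
    app.get("health") { req async -> String in
        "OK"
    }

    // Routes added to this group go through RateLimitMiddleware.
    let limited = app.grouped(RateLimitMiddleware())
    limited.get("api", "items") { req async -> String in
        "items"
    }
}
```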

About Rate-Limit middleware

Rate-limiting is a crucial concept in backend development that involves controlling the number of requests a user or client can make to a server within a specified time frame.

It helps protect the server from being overwhelmed by too many requests, which can lead to degraded performance or complete service outages.

Key Concepts

There are three main concepts that you should know about rate limiting:

Rate-Limit Threshold:

  • This is the maximum number of requests allowed within a certain period. For example, you might set a limit of 100 requests per minute per user.

Time Window:

  • The period during which the requests are counted. Common time windows are seconds, minutes, hours, or days.

Client Identification:

  • Typically, clients are identified using API keys, IP addresses, or user tokens to apply rate limits individually.

Why Rate-Limiting is Important

You should implement a rate-limiting solution if you want to:

Prevent Abuse:

  • It prevents a single client or a group of clients from overwhelming the server, intentionally or unintentionally.

Ensure Fair Usage:

  • It ensures that resources are shared fairly among all clients.

Maintain Performance:

  • It helps maintain optimal performance levels by preventing server overload.

Improve Security:

  • It can help mitigate certain types of attacks, such as Denial-of-Service (DoS) attacks.

Implementing Rate-Limiting

Rate-limiting can be implemented at various levels:

Application Level:

  • Implemented directly within the application code using middleware or filters.

API Gateway Level:

  • Many organizations use API gateways (e.g., Kong, AWS API Gateway) that have built-in rate-limiting features. This approach is scalable and can be applied uniformly across multiple services.

Load Balancer Level:

  • Some load balancers, like NGINX or HAProxy, support rate-limiting and can apply limits before requests reach your application servers.

Techniques for Rate-Limiting

Different algorithms apply the rate limit in different ways, so it is important to know them in order to choose the right one for your needs.

The Checkpoint package implements the following:

Fixed Window Algorithm:

  • Counts requests in fixed time intervals (e.g., one minute). Simple, but it can allow bursts at the boundary between two windows, because the counter resets when a new window starts.
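A minimal sketch of a fixed-window counter in plain Swift follows. It is not Checkpoint's implementation; FixedWindowLimiter and its members are hypothetical names, state is kept in memory only, and the current time is passed in as a parameter so the behavior is easy to reason about.

```swift
import Foundation

struct FixedWindowLimiter {
    let limit: Int               // Maximum requests per window.
    let window: TimeInterval     // Window length in seconds.
    // Per client: start of the current window and requests counted so far.
    private var counters: [String: (windowStart: TimeInterval, count: Int)] = [:]

    init(limit: Int, window: TimeInterval) {
        self.limit = limit
        self.window = window
    }

    // Returns true if the request is allowed. `now` is in seconds.
    mutating func allow(client: String, now: TimeInterval) -> Bool {
        // Windows are aligned to multiples of `window` seconds.
        let currentStart = (now / window).rounded(.down) * window
        var entry = counters[client] ?? (windowStart: currentStart, count: 0)
        if entry.windowStart != currentStart {
            // A new window has started: reset the counter.
            entry = (windowStart: currentStart, count: 0)
        }
        guard entry.count < limit else { return false }
        entry.count += 1
        counters[client] = entry
        return true
    }
}
```

With a limit of 2 and a 60-second window, requests at seconds 58, 59, 60, and 61 are all allowed, which is the boundary burst mentioned above.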

Sliding Window Algorithm:

  • More flexible, as it counts requests in a window that moves with the current time, which reduces bursts at window boundaries.
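One common way to build a sliding window is to keep a log of request timestamps per client, as in the sketch below. This is an assumption about one possible variant, not Checkpoint's implementation; SlidingWindowLimiter and its members are hypothetical names, and the log grows with the limit, so it trades memory for accuracy.

```swift
import Foundation

struct SlidingWindowLimiter {
    let limit: Int               // Maximum requests in any window.
    let window: TimeInterval     // Window length in seconds.
    // Per client: timestamps of the requests that were allowed.
    private var timestamps: [String: [TimeInterval]] = [:]

    init(limit: Int, window: TimeInterval) {
        self.limit = limit
        self.window = window
    }

    // Returns true if the request is allowed. `now` is in seconds.
    mutating func allow(client: String, now: TimeInterval) -> Bool {
        // Keep only the requests made within the last `window` seconds.
        var recent = (timestamps[client] ?? []).filter { $0 > now - window }
        let allowed = recent.count < limit
        if allowed {
            recent.append(now)
        }
        timestamps[client] = recent
        return allowed
    }
}
```

With a limit of 2 and a 60-second window, requests at seconds 58 and 59 are allowed, but the ones at 60 and 61 are rejected because the window still contains the first two.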

Token Bucket Algorithm:

  • Tokens are added to a bucket at a fixed rate. Requests consume tokens, and if the bucket is empty, requests are limited. Allows bursts up to a defined limit.
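The token bucket can be sketched in a few lines of plain Swift. This is an illustration under my own assumptions, not Checkpoint's implementation: TokenBucket and its members are hypothetical names, each request costs one token, the bucket starts full, and one bucket would be kept per client.

```swift
import Foundation

struct TokenBucket {
    let capacity: Double         // Maximum tokens, i.e. the largest burst.
    let refillRate: Double       // Tokens added per second.
    private var tokens: Double
    private var lastRefill: TimeInterval

    init(capacity: Double, refillRate: Double, now: TimeInterval) {
        self.capacity = capacity
        self.refillRate = refillRate
        self.tokens = capacity   // Start with a full bucket.
        self.lastRefill = now
    }

    // Returns true if the request is allowed. `now` is in seconds.
    mutating func allow(now: TimeInterval) -> Bool {
        // Add the tokens earned since the last call, up to the capacity.
        let elapsed = max(0, now - lastRefill)
        tokens = min(capacity, tokens + elapsed * refillRate)
        lastRefill = max(lastRefill, now)

        // Each request consumes one token; an empty bucket rejects it.
        guard tokens >= 1 else { return false }
        tokens -= 1
        return true
    }
}
```

A bucket with capacity 3 and a refill rate of 1 token per second accepts a burst of three requests at once, then one more request per second after that.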

Leaky Bucket Algorithm:

  • Similar to the token bucket but with a fixed rate of request processing. Excess requests are discarded or delayed.
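The sketch below shows the variant that discards excess requests rather than delaying them. It is an assumption about one way to write it, not Checkpoint's implementation: LeakyBucket and its members are hypothetical names, and the bucket level stands for the requests still waiting to drain at the fixed rate.

```swift
import Foundation

struct LeakyBucket {
    let capacity: Double         // Maximum requests the bucket can hold.
    let leakRate: Double         // Requests drained per second.
    private var level: Double = 0
    private var lastLeak: TimeInterval

    init(capacity: Double, leakRate: Double, now: TimeInterval) {
        self.capacity = capacity
        self.leakRate = leakRate
        self.lastLeak = now
    }

    // Returns true if the request fits in the bucket. `now` is in seconds.
    mutating func allow(now: TimeInterval) -> Bool {
        // Drain the bucket at the fixed rate since the last call.
        let elapsed = max(0, now - lastLeak)
        level = max(0, level - elapsed * leakRate)
        lastLeak = max(lastLeak, now)

        // A request that would overflow the bucket is discarded.
        guard level + 1 <= capacity else { return false }
        level += 1
        return true
    }
}
```

With capacity 2 and a leak rate of 1 request per second, two requests at the same instant are accepted, a third is discarded, and room for one more appears a second later.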

Checkpoint. The rate limit middleware for Vapor

I’m currently working on a personal project named Checkpoint, a rate-limit implementation for Vapor servers.

You can take a look at the current project status in this GitHub repository.


If you are a backend developer, at some point in your career you will surely face a problem or a feature request that a rate-limiting technique can solve.





I like reading, T-shirts, and sneakers // iOS Software Designer @ Globant // Creator of MoveMAD, Ambiently, Shelves, and a few other things