MicroService Patterns: Rate Limiting with Spring Boot

7 min read · Aug 13, 2023

Part of the Resilience4J Article Series: If you haven’t read my other articles yet, please refer to the following links:
1. Circuit Breaker Pattern in Spring Boot
2. MicroService Patterns: Retry with Spring Boot

Have you ever been curious about rate limiters in the HTTP world? Think of them as traffic controllers. They manage the rate of traffic from clients or services, limiting the number of requests allowed within a specified period. If the request count exceeds the set limit defined by the rate limiter, all the excess calls are blocked.

Here are a few benefits of using rate limiters:

Prevent resource starvation caused by excessive traffic, whether malicious (e.g. DoS or DDoS attacks) or unintentional, by blocking the excess calls.
Reduce cost. Cutting down on extra requests means fewer servers and focusing resources on important APIs.
Prevent servers from being overloaded. Rate limiters can filter out extra requests caused by bots or users’ misbehavior.

What is Resilience4j-Ratelimiter?

Resilience4j RateLimiter provides an easy way to configure and apply rate-limiting strategies to specific parts of the application. Developers just set a limit on the number of requests (limit-for-period) allowed in a given time window (limit-refresh-period). This helps developers be more proactive in managing traffic, preventing service overload, and ensuring the stability and reliability of their applications.

Now, let’s take a closer look at the cycle in Resilience4j-RateLimiter:

Cycles: The default implementation divides time into fixed cycles, each lasting one limit-refresh-period. There is no warm-up or cooldown phase; the configured limit applies from the very first request.
Permission Refresh: At the start of every cycle, the number of available permissions is set back to limit-for-period. Each allowed request consumes one permission. Unused permissions are not carried over, so a quiet cycle does not allow a bigger burst later.
Within the Limit: While permissions remain, requests pass through immediately. The limiter does not space requests evenly: with a limit of 100 requests per minute, all 100 may be served in the first second of a cycle.
Limit Exceeded: When no permissions are left, a request waits for the next cycle for up to timeout-duration. If a permission cannot be obtained within that time, the call is rejected with a RequestNotPermitted exception. With a timeout-duration of 0, excess requests are rejected immediately.
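The core idea, a fixed budget of permissions per refresh period, can be sketched in a few lines of plain Java. This is only an illustrative model and not Resilience4j's actual implementation: the class name CycleRateLimiterSketch and the injectable clock are my own assumptions, and waiting with a timeout is deliberately left out.

```java
import java.util.function.LongSupplier;

/** Simplified model of a cycle-based rate limiter (hypothetical, not Resilience4j code). */
class CycleRateLimiterSketch {
    private final int limitForPeriod;
    private final long refreshPeriodNanos;
    private final LongSupplier clock; // injectable time source, in nanoseconds
    private long currentCycle;
    private int availablePermissions;

    CycleRateLimiterSketch(int limitForPeriod, long refreshPeriodNanos, LongSupplier clock) {
        this.limitForPeriod = limitForPeriod;
        this.refreshPeriodNanos = refreshPeriodNanos;
        this.clock = clock;
        this.currentCycle = clock.getAsLong() / refreshPeriodNanos;
        this.availablePermissions = limitForPeriod;
    }

    /** Consumes one permission and returns true if any is left in the current cycle. */
    synchronized boolean tryAcquire() {
        refreshIfNewCycle();
        if (availablePermissions > 0) {
            availablePermissions--;
            return true;
        }
        return false;
    }

    /** Number of permissions still available in the current cycle. */
    synchronized int availablePermissions() {
        refreshIfNewCycle();
        return availablePermissions;
    }

    private void refreshIfNewCycle() {
        long cycle = clock.getAsLong() / refreshPeriodNanos;
        if (cycle != currentCycle) {
            currentCycle = cycle;
            // Reset the budget; unused permissions are not carried over
            availablePermissions = limitForPeriod;
        }
    }
}
```

Passing System::nanoTime as the clock gives real-time behavior, while a fake clock (as in a unit test) makes the refresh easy to observe without sleeping.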

To learn more about Resilience4j-RateLimiter, refer to the official documentation: Ratelimiter

That sums up the concept of the rate limiter pattern. Pretty simple, isn’t it? 😃 Moving forward, let’s dive into the practical implementation of the rate limiter pattern in Spring Boot.

MicroServices Demonstration

Our demonstration has 2 services named payment-service and payment-processor.

Scenario

The Payment service handles incoming payment requests from shoppers and forwards them to the Payment Processor for processing.
The Payment Processor processes each payment and returns the outcome.

We’ll implement rate limiting on the Payment service to control the rate of incoming payment requests.

Payment processor

Let’s build the payment processor first, since the payment service depends on it.

Please keep in mind that the real-world payment processor might involve much more complex implementations. For the purpose of our demonstration, I’ve simplified it to display success messages.

Prerequisites

Java 17
Maven Wrapper
Spring Boot 3+

Defining Dependencies

Create a Spring Boot project with the dependencies listed in the POM file below. I have named it payment-processor.

https://github.com/buingoctruong/rate-limiting-pattern-spring-boot/blob/master/payment-processor/pom.xml

Properties

server:
  port: 1010
spring:
  application:
    name: payment-processor

Service

public interface PaymentProcessorService {
    String processPayment(String paymentInfo);
}

@Service
public class PaymentProcessorServiceImpl implements PaymentProcessorService {
    @Override
    public String processPayment(String paymentInfo) {
        // Simulated logic to process payment
        return "Payment processed: " + paymentInfo;
    }
}

Controller

@RestController
@RequestMapping("/api/v1/processor-payment")
@RequiredArgsConstructor
public class PaymentProcessorController {
    private final PaymentProcessorService paymentProcessorService;
    @PostMapping
    public String processPayment(@RequestBody String paymentInfo) {
        return paymentProcessorService.processPayment(paymentInfo);
    }
}

We have finished building payment-processor. Run it and send a POST request to http://localhost:1010/api/v1/processor-payment with the request body “Payment Information”; the expected response is shown below.

Payment processed: Payment Information

Payment service

When it comes to payment service, the most interesting part might be configuring Rate Limiter, and monitoring its status through Actuator. I will strive to provide a straightforward explanation, so let’s get started!

Prerequisites

Java 17
Maven Wrapper
Spring Boot 3+
Resilience4j
Actuator

Defining Dependencies

Create a Spring Boot project with the dependencies listed in the POM file below. I have named it payment-service.

https://github.com/buingoctruong/rate-limiting-pattern-spring-boot/blob/master/payment-service/pom.xml

Model

public interface Type {
}

@Data
public class Success implements Type {
    private final String msg;
}

@Data
public class Failure implements Type {
    private final String msg;
}

Service

This is where all the logic lives.

The tricky part is calling an external API. Fortunately, Spring provides RestTemplate to help us do that.

RestTemplate is a core Spring class for consuming web services over any HTTP method. (Remember that we need to create a RestTemplate bean; see the Setup part below.)

public interface PaymentService {
    Type submitPayment(String paymentInfo);
}

@Service
@RequiredArgsConstructor
public class PaymentServiceImpl implements PaymentService {
    private final RestTemplate restTemplate;
    private static final String SERVICE_NAME = "payment-service";
    private static final String PAYMENT_PROCESSOR_URL = "http://localhost:1010/api/v1/processor-payment";
    // Apply the "payment-service" rate limiter instance; rejected calls go to fallbackMethod
    @Override
    @RateLimiter(name = SERVICE_NAME, fallbackMethod = "fallbackMethod")
    public Type submitPayment(String paymentInfo) {
        HttpHeaders headers = new HttpHeaders();
        headers.setContentType(MediaType.APPLICATION_JSON);
        HttpEntity<String> entity = new HttpEntity<>(paymentInfo, headers);
        // Forward the payment to payment-processor and wrap its reply
        ResponseEntity<String> response = restTemplate.exchange(PAYMENT_PROCESSOR_URL,
                HttpMethod.POST, entity, String.class);
        return new Success(response.getBody());
    }

    // Invoked when the rate limiter rejects the call with RequestNotPermitted
    private Type fallbackMethod(RequestNotPermitted requestNotPermitted) {
        return new Failure("Payment service does not permit further calls");
    }
}

As you may have noticed, the method is annotated with “@RateLimiter”. The “name” attribute is set to “payment-service”, so the configuration of the “payment-service” instance is applied to this method (see the Properties part below for the details). The “fallbackMethod” attribute names a backup method that is called when the rate limit is exceeded and a RequestNotPermitted exception is thrown. Note that both methods must return the same data type, which is why both model classes implement the “Type” interface.
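To make the fallback idea concrete without any framework, here is a minimal plain-Java sketch of what the annotation does conceptually: run the primary call, and if it throws, return the result of a backup function of the same return type. The names PaymentResult, PaymentSuccess, PaymentFailure and FallbackSketch are hypothetical stand-ins for the Lombok-based classes above, and the real library wires this through an aspect rather than an explicit helper.

```java
import java.util.function.Function;
import java.util.function.Supplier;

/** Common return type, so the primary and the fallback can return the same thing. */
interface PaymentResult {
    String msg();
}

final class PaymentSuccess implements PaymentResult {
    private final String msg;
    PaymentSuccess(String msg) { this.msg = msg; }
    @Override public String msg() { return msg; }
}

final class PaymentFailure implements PaymentResult {
    private final String msg;
    PaymentFailure(String msg) { this.msg = msg; }
    @Override public String msg() { return msg; }
}

class FallbackSketch {
    /** Runs the primary call; if it throws, returns the fallback's result instead. */
    static PaymentResult callWithFallback(Supplier<PaymentResult> primary,
                                          Function<RuntimeException, PaymentResult> fallback) {
        try {
            return primary.get();
        } catch (RuntimeException e) {
            return fallback.apply(e);
        }
    }
}
```

Because both branches return PaymentResult, the caller can serialize either outcome the same way, which is the reason for the shared interface.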

Setup

@Configuration
public class RestConfig {
    @Bean
    public RestTemplate restTemplate() {
        return new RestTemplate();
    }
}

Properties

server:
  port: 9090
spring:
  application:
    name: payment-service
management:
  endpoint:
    health:
      show-details: always
  endpoints:
    web:
      exposure:
        include: health
  health:
    ratelimiters:
      enabled: true
resilience4j:
  ratelimiter:
    instances:
      payment-service:
        limit-for-period: 5
        limit-refresh-period: 15s
        timeout-duration: 10s
        register-health-indicator: true

Allow me to provide a brief introduction to resilience4j-ratelimiter configurations.

limit-for-period: The number of requests allowed during one “limit-refresh-period”.
limit-refresh-period: The length of one period. At the start of each period, the available permissions are reset to “limit-for-period”.
timeout-duration: The maximum time a request waits for a permission. If none becomes available in that time, the call is rejected with RequestNotPermitted.
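How limit-refresh-period and timeout-duration interact is easiest to see with a little arithmetic. The sketch below is a simplified model with hypothetical names (PermissionWaitSketch and its methods): it assumes the request finds no permissions left and that the next period will have one free for it, and it does not model how long a rejected caller is held before the rejection.

```java
import java.time.Duration;

/** Simplified arithmetic for a request that arrives when no permissions are left. */
class PermissionWaitSketch {

    /** Time until the next refresh for a request arriving elapsedInPeriod into the period. */
    static Duration waitForNextPeriod(Duration refreshPeriod, Duration elapsedInPeriod) {
        return refreshPeriod.minus(elapsedInPeriod);
    }

    /** True if the wait for the next refresh fits within timeoutDuration. */
    static boolean permittedAfterWaiting(Duration refreshPeriod, Duration elapsedInPeriod,
                                         Duration timeoutDuration) {
        Duration wait = waitForNextPeriod(refreshPeriod, elapsedInPeriod);
        return wait.compareTo(timeoutDuration) <= 0;
    }
}
```

With the values used here (15s period, 10s timeout), an excess request arriving 10 seconds into the period would wait 5 seconds and then pass, while one arriving 2 seconds into the period would need 13 seconds and therefore end up in the fallback.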

We have finished building payment-service. Start payment-processor first, then payment-service, and send a request to http://localhost:9090/api/v1/payment-service with the request body “Payment Information”. You should get one of the responses below.

// Successful Response
{
    "msg": "Payment processed: Payment Information"
}

// Rate limit exceeded response when requests exceed limit
{
    "msg": "Payment service does not permit further calls"
}

Playing with Rate Limiting

With both services running, open http://localhost:9090/actuator/health to view the Rate Limiter details.

{
  "rateLimiters": {
    "status": "UP",  // "UP" suggesting that the rate limiters are functioning properly
    "details": {
      "payment-service": {
        "status": "UP",
        "details": {
          "availablePermissions": 5,  // number of allowed requests within the specified rate limit
          "numberOfWaitingThreads": 0 // number of threads waiting for their requests to be processed
        }
      }
    }
  }
}

Call the API http://localhost:9090/api/v1/payment-service 3 times with the request body “Payment Information”, then refresh the actuator link http://localhost:9090/actuator/health to see the change.

{
  "rateLimiters": {
    "status": "UP",
    "details": {
      "payment-service": {
        "status": "UP",
        "details": {
          "availablePermissions": 2, // number of allowed requests now is 2
          "numberOfWaitingThreads": 0
        }
      }
    }
  }
}

Wait for 15 seconds (possibly less if the period began before API access), then refresh the actuator link http://localhost:9090/actuator/health, and we’ll observe the allowed requests resetting to 5.

{
  "rateLimiters": {
    "status": "UP",
    "details": {
      "payment-service": {
        "status": "UP",
        "details": {
          "availablePermissions": 5,
          "numberOfWaitingThreads": 0
        }
      }
    }
  }
}

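Call the API http://localhost:9090/api/v1/payment-service 6 times in quick succession with the request body “Payment Information”. The sixth request exceeds the limit, so it waits for the next refresh period (for example, about 5 seconds if the burst started 10 seconds into the current period; a wait longer than the 10-second timeout-duration ends in the fallback response instead). While it is waiting, refresh http://localhost:9090/actuator/health to see the following details.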

{
  "rateLimiters": {
    "status": "UNKNOWN",
    "details": {
      "payment-service": {
        "status": "RATE_LIMITED",
        "details": {
          "availablePermissions": 0,
          "numberOfWaitingThreads": 1
        }
      }
    }
  }
}
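To reproduce the six-request burst described above without sending the requests by hand, a small client built on the JDK's java.net.http.HttpClient can help. This is a sketch under assumptions: the class name BurstClientSketch is hypothetical, and since the payment-service controller is not shown in this article, I am assuming the endpoint accepts a POST with a plain-text body, as the payment-processor controller does.

```java
import java.io.IOException;
import java.net.URI;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;

/** Sends six sequential requests and prints how long each one took. */
class BurstClientSketch {
    static final String URL = "http://localhost:9090/api/v1/payment-service";

    /** Builds the POST request; method and content type are assumptions. */
    static HttpRequest buildRequest(String body) {
        return HttpRequest.newBuilder(URI.create(URL))
                .header("Content-Type", "text/plain")
                .POST(HttpRequest.BodyPublishers.ofString(body))
                .build();
    }

    public static void main(String[] args) throws InterruptedException {
        HttpClient client = HttpClient.newHttpClient();
        for (int i = 1; i <= 6; i++) {
            long start = System.nanoTime();
            try {
                HttpResponse<String> response = client.send(
                        buildRequest("Payment Information"),
                        HttpResponse.BodyHandlers.ofString());
                long millis = (System.nanoTime() - start) / 1_000_000;
                System.out.println("Request " + i + ": " + response.body() + " (" + millis + " ms)");
            } catch (IOException e) {
                // Typically means payment-service is not running on port 9090
                System.out.println("Request " + i + " failed: " + e);
                break;
            }
        }
    }
}
```

The first five requests should return quickly, while the elapsed time printed for the sixth shows how long it was held by the rate limiter; opening the actuator link during that pause should show the waiting thread.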

We have just explored the concept of rate-limiting and conducted a brief demonstration to observe its behavior.

Hope you can find something useful!

The completed source code can be found in this GitHub repository: https://github.com/buingoctruong/rate-limiting-pattern-spring-boot

I would love to hear your thoughts!

Thank you for reading, and goodbye!