Resilience Unleashed: Building an Intuitive Long Delay Retry Framework

Bharat
Deutsche Telekom Digital Labs
6 min read · Dec 7, 2023

Problem Statement

We often need to execute operations on hardware devices through APIs. If a device is switched off when the call is made, the operation fails, and an immediate retry is unlikely to help because the device may stay off for a long time. We therefore need a user-friendly framework that stores such operations and retries them after a delay. Holding operations in memory for an extended period is impractical because it ties up resources and capacity, so the operations must live in an external system that resubmits them once the delay has elapsed.

Introducing the Delayed Retry Framework

To address the challenges posed by delayed operations, we decided to build a framework that stores these operations in an external system capable of handling several types of data. After evaluating multiple available solutions, we chose Rqueue for its ability to handle high volume and throughput efficiently, and built our custom delayed retry framework on top of it.

Framework Details

To make this solution developer-friendly and easily accessible, we designed the `DelayedRetry` framework. This framework provides a simple yet powerful interface to submit and process requests for delayed retries.

Critical Components of the framework:

DelayedRetryHandlerConfig

This class is responsible for housing configuration properties that define the behavior for various retryable operations within the Delayed Retry framework. By configuring properties in this class, users can easily customize the behavior of the framework without the need to alter core modules. This flexibility ensures that adding new operations or making adjustments is as straightforward as adding configuration settings and implementing the IDelayedRetryProcessor interface, which is discussed in more detail later.
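The article does not list DelayedRetryHandlerConfig itself, so the following is only a minimal sketch of what such a configuration holder could look like. The class name `HandlerConfigSketch` and the helper `ruleFor` are hypothetical; the field names are taken from the configuration keys shown later in this article, and the lookup mirrors the exact class-name comparison used by DelayedRetryable. Failing fast on an unknown operation name is an assumption, not documented behavior.

```java
import java.util.List;
import java.util.Map;
import java.util.Optional;

// Hypothetical stand-in for DelayedRetryHandlerConfig; field names mirror the configuration keys.
class HandlerConfigSketch {

    // One entry of retryableExceptionsProperties.
    record RetryableException(String retryableException, int maxAttempts, String backoff) {}

    // Settings for a single operation.
    record RetryProperties(boolean enable, String processor, int defaultMaxAttempts,
                           String defaultBackoff, List<RetryableException> retryableExceptionsProperties) {

        // Find the rule whose configured class name equals the thrown exception's class name.
        Optional<RetryableException> ruleFor(Throwable ex) {
            return retryableExceptionsProperties.stream()
                    .filter(rule -> rule.retryableException().equalsIgnoreCase(ex.getClass().getName()))
                    .findAny();
        }
    }

    private final Map<String, RetryProperties> operations;

    HandlerConfigSketch(Map<String, RetryProperties> operations) {
        this.operations = Map.copyOf(operations);
    }

    // Assumption: an unconfigured operation name is treated as a programming error.
    RetryProperties getRetryProps(String operationName) {
        RetryProperties props = operations.get(operationName);
        if (props == null) {
            throw new IllegalArgumentException("No retry configuration for operation: " + operationName);
        }
        return props;
    }

    public static void main(String[] args) {
        RetryProperties props = new RetryProperties(true, "retryableTestProcessor", 3, "+10",
                List.of(new RetryableException("java.lang.RuntimeException", 3, "+10")));
        HandlerConfigSketch config = new HandlerConfigSketch(Map.of("componentRetryTest", props));
        System.out.println(config.getRetryProps("componentRetryTest")
                .ruleFor(new RuntimeException("device off")).isPresent());
    }
}
```

Because the comparison is on the class name rather than on the type hierarchy, a subclass such as `IllegalStateException` does not match a rule configured for `java.lang.RuntimeException`; each retryable exception type has to be listed explicitly.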

DelayedRetryable

At the heart of the Delayed Retry framework, the DelayedRetryable class handles the processing of operations submitted to the framework. It leverages the configuration defined in DelayedRetryHandlerConfig to determine the appropriate implementation for handling the request processing. Based on the configuration, it also determines the number of retries and the conditions under which retries should occur.

package delayed.retry.core;

import com.fasterxml.jackson.core.JsonProcessingException;
import com.fasterxml.jackson.databind.ObjectMapper;
import lombok.AllArgsConstructor;
import lombok.Getter;
import lombok.extern.slf4j.Slf4j;
import org.springframework.context.ApplicationContext;
import org.springframework.context.annotation.Scope;
import org.springframework.stereotype.Component;

import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

@Component
@Scope("prototype")
@Slf4j
@Getter
@AllArgsConstructor
public class DelayedRetryable<E, D> {

    private DelayedRetryHandlerConfig delayedRetryHandlerConfig;
    private ApplicationContext applicationContext;
    private IDelayedQueue delayedQueue;
    private ObjectMapper objectMapper;

    @SuppressWarnings("unchecked")
    public void submitOperation(final E request, final OperationDetails operationDetails) {

        // Resolve the retry settings and the processor bean configured for this operation.
        final String operationName = operationDetails.getOperation();
        final DelayedRetryHandlerConfig.RetryProperties retryProperties = delayedRetryHandlerConfig.getRetryProps(operationName);
        final IDelayedRetryProcessor<E, D> processor = (IDelayedRetryProcessor<E, D>) applicationContext.getBean(retryProperties.getProcessor());
        final String backOff = retryProperties.getDefaultBackoff();
        RetryProcessRequestInfo retryProcessRequestInfo = RetryProcessRequestInfo.builder()
                .delay('+' == backOff.charAt(0) ? 0 : 1)
                .attemptNumber(1)
                .operationDetails(operationDetails)
                .retryProperties(retryProperties)
                .build();
        try {
            processor.preProcess(request, retryProcessRequestInfo);
            // An initial delay means the operation is queued before the first attempt.
            if (retryProcessRequestInfo.getInitialDelay() > 0) {
                pushToDelayQueue(request, retryProcessRequestInfo, backOff);
                return;
            }
            final D response = processor.process(request, retryProcessRequestInfo);
            processor.onSuccess(request, response, operationDetails);
        } catch (final Exception ex) {
            if (retryProperties.isEnable()) {
                // Retry only when the exception's class name is listed as retryable.
                final Optional<DelayedRetryHandlerConfig.RetryableExceptions> exProperties = retryProperties.getRetryableExceptionsProperties().stream()
                        .filter(exceptionProperties -> ex.getClass().getName().equalsIgnoreCase(exceptionProperties.getRetryableException()))
                        .findAny();
                if (exProperties.isPresent()) {
                    retryProcessRequestInfo = getRetryRequestInfo(operationDetails, 1, exProperties.get().getBackoff());
                    pushToDelayQueue(request, retryProcessRequestInfo, exProperties.get().getBackoff());
                    return;
                }
            }
            processor.onFailure(request, ex, operationDetails);
        }
    }

    private RetryProcessRequestInfo getRetryRequestInfo(final OperationDetails operationDetails, final int attempt, final String backoff) {
        return RetryProcessRequestInfo.builder()
                .attemptNumber(attempt)
                .delay('+' == backoff.charAt(0) ? 0 : 1)
                .operationDetails(operationDetails)
                .build();
    }

    public void pushToDelayQueue(final E request, final RetryProcessRequestInfo retryProcessRequestInfo, final String backOff) {
        final OperationDetails operationDetails = retryProcessRequestInfo.getOperationDetails();
        if (retryProcessRequestInfo.getInitialDelay() > 0) {
            retryProcessRequestInfo.setAttemptNumber(-1);
        }
        final int attempt = retryProcessRequestInfo.getAttemptNumber();
        // Prefer the explicit initial delay; otherwise derive the interval from the backoff setting.
        final Long retryInterval = retryProcessRequestInfo.getInitialDelay() > 0
                ? Long.valueOf(retryProcessRequestInfo.getInitialDelay())
                : RetryDelayCalculator.calculateDelay(retryProcessRequestInfo.getDelay(), backOff);
        final Map<String, Object> headers = retryOperationHeaders(request, attempt, operationDetails, retryInterval);
        // contextHeaders() is not included in this listing.
        headers.putAll(contextHeaders());
        final RetryMessage retryDelayMessage = getRetryMessage(request, headers);
        delayedQueue.pushMessageToQueue(retryDelayMessage, retryInterval);
    }

    private Map<String, Object> retryOperationHeaders(final E request, final int attempt, final OperationDetails operationDetails, final Long retryInterval) {
        final Map<String, Object> headers = new HashMap<>();
        final Class<?> requestClass = request.getClass();
        headers.put(Constants.OPERATION_HEADER, operationDetails.getOperation());
        headers.put(Constants.OWNER_HEADER, operationDetails.getOwnerId());
        // The header carries the number of the next attempt.
        headers.put(Constants.ATTEMPT_HEADER, attempt + 1);
        headers.put(Constants.DELAY_HEADER, retryInterval);
        // The class name is stored alongside the JSON payload.
        headers.put(Constants.CLASS_NAME_HEADER, requestClass.getName());
        return headers;
    }

    private RetryMessage getRetryMessage(final E request, final Map<String, Object> headers) {
        final RetryMessage retryDelayMessage = new RetryMessage();
        retryDelayMessage.setHeaders(headers);
        try {
            retryDelayMessage.setPayload(objectMapper.writeValueAsString(request));
        } catch (JsonProcessingException e) {
            throw new RuntimeException("unable to serialize the request", e);
        }
        return retryDelayMessage;
    }

}
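The listing calls `RetryDelayCalculator.calculateDelay(delay, backOff)` without showing it, and the article does not define the backoff string format or its time unit. The sketch below is therefore an assumption: it treats `'+N'` as "add N to the previous delay" and `'*N'` as "multiply the previous delay by N". That reading is at least consistent with the seed values DelayedRetryable uses, 0 when the backoff starts with `+` and 1 otherwise. The class name `RetryDelaySketch`, the `*` operator, and `seedDelay` are hypothetical.

```java
// Hypothetical sketch; the article does not show RetryDelayCalculator or define the backoff format.
class RetryDelaySketch {

    // Assumed format: "+N" adds N to the previous delay, "*N" multiplies the previous delay by N.
    static long calculateDelay(long previousDelay, String backOff) {
        if (backOff == null || backOff.length() < 2) {
            throw new IllegalArgumentException("Backoff must be an operator followed by a number: " + backOff);
        }
        long operand = Long.parseLong(backOff.substring(1));
        switch (backOff.charAt(0)) {
            case '+':
                return previousDelay + operand;
            case '*':
                return previousDelay * operand;
            default:
                throw new IllegalArgumentException("Unsupported backoff operator in: " + backOff);
        }
    }

    // Same seed rule as in DelayedRetryable: 0 for an additive backoff, 1 otherwise.
    static long seedDelay(String backOff) {
        return '+' == backOff.charAt(0) ? 0 : 1;
    }

    public static void main(String[] args) {
        System.out.println(calculateDelay(seedDelay("+10"), "+10")); // prints 10
        System.out.println(calculateDelay(seedDelay("*2"), "*2"));   // prints 2
    }
}
```

Under this reading, a seed of 0 keeps the first additive delay equal to the configured step, while a seed of 1 keeps the first multiplicative delay from collapsing to zero.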

IDelayedRetryProcessor

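This interface plays a crucial role in the Delayed Retry framework. Users must implement its methods to provide the processing logic for a given operation. DelayedRetryable also calls a preProcess hook before processing, but that method is not part of the interface listing in this article. The methods shown are: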

process(E request, RetryProcessRequestInfo retryProcessRequestInfo): This method processes the request, which can be an API call, database operation, or any other type of operation based on the specific requirements.

onSuccess(E request, D response, OperationDetails operationDetails): A hook method to handle actions in the event of a successful operation.

onFailure(E request, Exception ex, OperationDetails operationDetails): This hook method is used to manage actions when an operation results in failure.

package delayed.retry.core;

public interface IDelayedRetryProcessor<E, D> {

    // Processes the request, which can be an API call, a DB call, or any other operation as required.
    D process(E request, RetryProcessRequestInfo retryProcessRequestInfo);

    // Hook invoked when the operation succeeds.
    void onSuccess(E request, D response, OperationDetails operationDetails);

    // Hook invoked when the operation fails.
    void onFailure(E request, Exception ex, OperationDetails operationDetails);
}

Next, we will explore how developers can harness the framework to retry requests with customizable delays.

Define Configuration

The framework relies on configuration properties to determine which IDelayedRetryProcessor implementation should be used for each submitted operation. Users can configure these settings according to their specific requirements. The configuration allows for flexibility in enabling or disabling different operations, defining maximum retry attempts, specifying backoff times, and handling retryable exceptions.

delayed:
  retry:
    queue:
      queueName: demoQueue
      deadLetterQueue: demoDDL
      concurrency: 3
      retry: 3
      connectionDetails:
        redisSentinelConfig:
          master:
          sentinels:
            - host: redisHost1
              port: 6379
            - host: redisHost2
              port: 6379

    handler:
      operations:
        componentRetryTest:
          enable: true
          processor: retryableTestProcessor
          defaultMaxAttempts: 3
          defaultBackoff: '+10'
          retryableExceptionsProperties:
            - maxAttempts: 3
              retryableException: "java.lang.RuntimeException"
              backoff: '+10'

Implement IDelayedRetryProcessor

Users are required to implement the IDelayedRetryProcessor interface for their specific use cases. The implementation should define how requests are processed, what actions to take in case of success, and how to handle failures.

package retry.app;

import delayed.retry.core.IDelayedRetryProcessor;
import delayed.retry.core.OperationDetails;
import delayed.retry.core.RetryProcessRequestInfo;
import org.springframework.stereotype.Component;

@Component
public class RetryableTestProcessor implements IDelayedRetryProcessor<String, String> {

    @Override
    public String process(String request, RetryProcessRequestInfo retryProcessRequestInfo) {
        // Simulate a failing operation; RuntimeException is configured as retryable.
        if ("Unspecified_Operation".equals(request)) {
            throw new RuntimeException("Retryable exception");
        }
        return "OK";
    }

    @Override
    public void onSuccess(String request, String response, OperationDetails operationDetails) {
        // Intentionally empty in this demo.
    }

    @Override
    public void onFailure(String request, Exception ex, OperationDetails operationDetails) {
        // Intentionally empty in this demo.
    }
}

Submit the request to the DelayedRetry Framework

This part of the application illustrates how simple it is to submit a request to the Delayed Retry Framework. Users need to provide the request object and operation details, making it a straightforward process to initiate operations within the framework.

package retry.app;

import delayed.retry.core.DelayedRetryable;
import delayed.retry.core.OperationDetails;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.boot.CommandLineRunner;
import org.springframework.boot.SpringApplication;
import org.springframework.boot.autoconfigure.SpringBootApplication;

@SpringBootApplication(scanBasePackages = {"delayed.retry", "retry.app"})
public class MainClass implements CommandLineRunner {

    @Autowired
    private DelayedRetryable<String, String> delayedRetryable;

    public static void main(String[] args) {
        SpringApplication.run(MainClass.class, args);
    }

    @Override
    public void run(String... args) throws Exception {
        // The operation name must match an operation key defined in the configuration.
        OperationDetails operationDetails = OperationDetails.builder()
                .operation("componentRetryTest")
                .build();
        delayedRetryable.submitOperation("Unspecified_Operation", operationDetails);
    }
}

To use the framework, we declare the operations to retry in configuration, implement a processor for each of them, and submit requests through DelayedRetryable. The framework then manages the delays and retries, which removes most of the effort of handling delayed operations by hand.
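The article shows only the submitting side, so how a queued message is handled when it comes due is not documented here. As a rough illustration, the decision after a failed attempt could be reduced to the check below. Everything in this sketch is an assumption: the class `RetryDecisionSketch`, the `Action` enum, and the rule that an operation is requeued only while the failed attempt number is below the configured `maxAttempts`.

```java
import java.util.Set;

// Hypothetical consumer-side decision; the article does not show this part of the framework.
class RetryDecisionSketch {

    enum Action { REQUEUE, GIVE_UP }

    // attempt: 1-based number of the attempt that just failed.
    // maxAttempts: limit configured for the matching retryable exception.
    // retryableExceptionNames: fully qualified class names listed as retryable.
    static Action onProcessingFailure(int attempt, int maxAttempts, Exception ex,
                                      Set<String> retryableExceptionNames) {
        boolean retryable = retryableExceptionNames.contains(ex.getClass().getName());
        boolean attemptsLeft = attempt < maxAttempts;
        return retryable && attemptsLeft ? Action.REQUEUE : Action.GIVE_UP;
    }

    public static void main(String[] args) {
        Set<String> retryable = Set.of("java.lang.RuntimeException");
        System.out.println(onProcessingFailure(1, 3, new RuntimeException("off"), retryable)); // prints REQUEUE
        System.out.println(onProcessingFailure(3, 3, new RuntimeException("off"), retryable)); // prints GIVE_UP
    }
}
```

In such a design, `GIVE_UP` would be the point at which the processor's onFailure hook runs, and `REQUEUE` the point at which the message goes back to the delay queue with the next attempt number.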

Real-World Application: Smart Home Automation

To illustrate the practical application of this framework, let us consider a scenario in smart home automation. Imagine a software vendor that wants to push an update to smart home appliances but does not know whether the appliances are turned on at the end customer's premises. By using the `DelayedRetry` framework, the vendor can store the update operation and retry it after a delay, so that the update is delivered once the appliances are active.
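A processor for this scenario might look like the sketch below. It is illustrative only: `ProcessorSketch` is a simplified local copy of the processor contract (without RetryProcessRequestInfo and OperationDetails) so that the example compiles on its own, and `DeviceOfflineException`, `FirmwareUpdateProcessorSketch`, and the in-memory set of online devices are hypothetical stand-ins for a real device API.

```java
import java.util.Set;

// Simplified local copy of the processor contract so the sketch is self-contained.
interface ProcessorSketch<E, D> {
    D process(E request);
    void onSuccess(E request, D response);
    void onFailure(E request, Exception ex);
}

// Hypothetical exception signalling that the appliance cannot be reached right now.
class DeviceOfflineException extends RuntimeException {
    DeviceOfflineException(String deviceId) {
        super("Device is offline: " + deviceId);
    }
}

class FirmwareUpdateProcessorSketch implements ProcessorSketch<String, String> {

    // Stand-in for a real reachability check against the device.
    private final Set<String> onlineDevices;

    FirmwareUpdateProcessorSketch(Set<String> onlineDevices) {
        this.onlineDevices = Set.copyOf(onlineDevices);
    }

    @Override
    public String process(String deviceId) {
        // Throwing here is what would let the framework schedule a delayed retry.
        if (!onlineDevices.contains(deviceId)) {
            throw new DeviceOfflineException(deviceId);
        }
        return "UPDATED";
    }

    @Override
    public void onSuccess(String deviceId, String response) {
        System.out.println("Update delivered to " + deviceId + ": " + response);
    }

    @Override
    public void onFailure(String deviceId, Exception ex) {
        System.out.println("Giving up on " + deviceId + ": " + ex.getMessage());
    }

    public static void main(String[] args) {
        FirmwareUpdateProcessorSketch processor = new FirmwareUpdateProcessorSketch(Set.of("lamp-1"));
        processor.onSuccess("lamp-1", processor.process("lamp-1"));
        try {
            processor.process("oven-2");
        } catch (DeviceOfflineException ex) {
            processor.onFailure("oven-2", ex);
        }
    }
}
```

For the offline case to be retried rather than reported as a failure, the fully qualified name of the exception would need to appear under `retryableExceptionsProperties` for the operation, with a backoff long enough to give the appliance time to come back online.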

Conclusion

Efficiently managing delayed retries is a critical aspect of modern software operations, particularly when dealing with operations involving hardware devices. The `DelayedRetry` framework, built on top of Rqueue, offers an elegant and developer-friendly solution to handle these challenges effectively. By simplifying the process of delayed retries, this framework contributes to smoother operations and better user experiences, making it an asset in the toolkit of software developers across various domains.
