Don’t Waste Your Time: Reacting to the Passage of Time Using Cloud Events and AWS Step Functions.

Published in

SSENSE-TECH

7 min readJul 15, 2022

“The Persistence of Memory” by Salvador Dalí

This article is part 2 of “Changing Perspective: Embracing Temporal Modeling to Capture the Passage of Time”; reading it beforehand is recommended.

As we learned in the first article of this series, traditional mechanisms reacting to the passage of time are accomplished by proactively determining if the time passed using techniques such as polling combined with UNIX-based cron jobs. While this mechanism can do the job, it has drawbacks such as inefficiency in the events consumed and management and prevention of concurrency. This second article will focus on how we can respond accurately to the passage of time in a reactive and event-driven manner, leveraging Step Functions’ usage and its set of features based on time.

The Problem

As mentioned in the previous article, our fictitious solution introduced a remorse period where a customer can cancel their order if they change their minds within 30 minutes of placing it. For this specific business requirement, time is an essential component, and we need to be able to lock the order after this defined time has passed reactively.

The problem to solve is essentially straightforward, but we have a couple of constraints with our tech requirements:

The architecture of our service must use Serverless technologies.
The use of a reactive approach rather than a polling one.
The accuracy of the event that is going to be triggered.
The ability to communicate, if necessary, the event to other BCs and other domains in the company.

Design & Implementation — AWS Step Functions to the Rescue

Step Functions are a service designed by AWS that helps us orchestrate lambdas, service integrations with the cloud provider, and state management of the app using state machines. This service has a state called Wait, and its primary function is to provide a way for any workflow to wait for a determined number of seconds or for a specific timestamp to be reached prior to progressing. I want to emphasize “a specific timestamp to be reached” as this is an essential feature for us. Instead of trying to guess when an event should happen by actively checking the time, we will let the time itself be the trigger that broadcasts the event.

As illustrated in the previous article, we will implement a Scheduler mechanism that will help us use the passage of time as an event, leveraging Step Functions and its Wait state. The scheduler is divided into three main components:

Scheduler
Passage of Time
Event Broadcasting

Scheduler Component

This component takes care of communicating the domain event that you want to schedule and sends it to the passage of time mechanism, which in this case will be a workflow with Step Functions. The Scheduler component can receive any kind of payload related to the domain event and the name of the event itself to help further downstream mechanisms to react or route depending on the specific event.

Since we want to keep this scheduler mechanism as generic and reusable as possible for this domain, and in other domains, we divided this component into three main subcomponents: a Timer, a Scheduled Event, and a ScheduleTimer.

ScheduleTimer

The schedule timer is the abstraction of the mechanism that will use the passage of time to broadcast our events. In this case, ScheduleTimer will be an interface, while the concrete implementation will be the Step Functions communication mechanism.

It will take a Timer, extract its ID to be used as an event ID, and the executionDate. In our example, this will be the date when the Order needs to be locked. The event payload to be broadcasted from the timer is converted to a string.

export class StepFunctionScheduleTimer implements ScheduleTimer {
 constructor(private readonly stepFunctions: StepFunctions) {}
 
 public async schedule(timer: Timer): Promise<TimerId> {
   const executionId = await this.stepFunctions
     .startExecution({
       name: timer.event.id,
       stateMachineArn: "ARN",
       input: JSON.stringify({
         executionDate: timer.executionDate.toISOString(),
         payload: timer.event.toString(),
       }),
     })
     .promise();
 
   Return TimerId.create(executionId);
 }
}

The previous code sample is just for illustration purposes as it does not handle error scenarios.

ScheduledEvent

The scheduled event provides the abstraction of the structure of the events that our services will be receiving. We chose to represent those events using a specification called Cloud Events.

As a helper method, we have a static factory that will take only two parameters, the event’s contents and the event’s name.

export class ScheduledEvent {
 private constructor(private readonly _cloudEvent: CloudEvent<unknown>) {}
 
 public static create(message: unknown, eventName: string): ScheduledEvent {
   // Add your Validations in here
 
   return new ScheduledEvent(
     new CloudEvent({
       specversion: "1.0.1",
       id: `uuid`,
       type: eventName,
       source: "my-system",
       time: new Date().toISOString(),
       datacontenttype: "application/json",
       data: message,
     })
   );
 }
 
 // getters
}

Timer

The timer is the mechanism that aggregates the Scheduled Event and then the Date of when the event will be broadcasted back to the domain. It is important to mention that the execution date passed must be in the future.

export class Timer {
 private constructor(
   private readonly _event: ScheduledEvent,
   private readonly _executionDatetime: Date
 ) {}
 
 public static create(
   scheduledEvent: ScheduledEvent,
   executionDatetime: Date
 ): Timer {
   const currentDatetime = new Date();
   if (executionDatetime < currentDatetime) {
     throw new ExecutionDateInPastException();
   }
 
   return new Timer(scheduledEvent, executionDatetime);
 }
 
  // getters
}

Passage of Time Component

The passage of time component is the mechanism that will broadcast the messages when a certain point of time has been reached. As mentioned before, it leverages the usage of Step Functions and time-based features, in this case, the Wait state.

The State machine execution receives two values that we send from the ScheduleTimer. The first one is the timestamp of when we need to trigger the event, in this case, locking the order and the event payload which is the cloud event itself. The definition of this state machine will look something like this:

{
   "StartAt": "Wait",
   "States": {
       "Wait": {
           "Type": "Wait",
           "TimestampPath": "$.executionDate",
           "Next": "SendEvent"
       },
       "SendEvent": {
           "Type": "Task",
           "Resource": "arn:aws:states:::sqs:sendMessage",
           "Parameters": {
               "QueueUrl": "QUEUE_URL",
               "MessageBody.$": "$.payload"
           },
           "End": true
       }
   }
}

Wait is evident, it will wait until the timestamp that the scheduler sends is reached. A Send Event is a task that, using Step Functions Integration Patterns, will send the event to a queue in SQS or any other message broker/event stream in the Event broadcasting component.

In this example, the events sent by the passage of time will be added to a queue that will later be consumed by the event broadcasting component. More complex setups can be designed if you are expecting a higher throughput, including having multiple queues based on the type of event.

Event Broadcasting Component

This component is the last one, as it will get the message that was sent by the Passage of Time component and route it to its corresponding destination. For simplicity’s sake, we will continue with the current example of locking an order.

In our example, the component will receive the event sent by the scheduler through the queue and this message will contain the order ID of the order to be locked. The code below illustrates a simple consumer of the events and triggers the expected behavior.

export const handler: SQSHandler = async (sqsEvent: SQSEvent): void => {
 const records = sqsEvent.Records;
 
 for (const record of records) {
   // Get the data from the message
   const { data: orderLockData } = new CloudEvent<OrderLockData>(
     JSON.parse(record.body)
   );
 
   // Using Command Based Ports following Hexagonal Architecture
   const lockOrderCommand = new LockOrderCommand(orderLockData.orderId);
   
   // You can get this handler by using a Dependency Inversion Container
   const lockOrderHandler = new LockOrderHandler();
  
   await lockOrderHandler.handle(lockOrderCommand);
 }
};

And as simple as that, we are reacting to the passage of time in an event-driven manner.

Limitations

We have talked both implicitly and explicitly about the benefits of using this type of design to react to the passage of time, however, this implementation has its limitations to take into account:

The solution is not cloud-agnostic, it depends on the availability and the accuracy of the cloud provider to be effective.
Even though we can react to the passage of time until we reach a point in the future, Step Functions have a limitation of waiting a maximum of one year.
Step Functions quotas impose a limit, and since we may have a very high number of activities waiting for a timestamp to be reached, you have to factor in to see if that is reached.
There will be a delay between the time an event was sent and when it is actually consumed. Like with any event-driven approach, eventual consistency is a factor, so you have to make sure that your application can sustain this and adjust accordingly.

Conclusion

Adopting the passage of time into your solution is not something that is necessarily complex, but it requires a mindset change in order to achieve it. To offload most of the work to the infrastructure, we leverage AWS Step Functions, only keeping two functions in our application: scheduling the event, and reacting to its existence when the time comes.

It is important to highlight that the code examples provided are one of many ways to handle this pattern, and that your use case may dictate a different approach, with multiple queues, a stream of events, or a more robust routing logic.

Like with any solution, because we rely on an existing service, be sure to assess if the limitations and service quotas are acceptable for your application. If you find them acceptable, then go ahead and start preparing for the future!

Editorial reviews by Catherine Heim & Mario Bittencourt

Want to work with us? Click here to see all open positions at SSENSE!