The journey of our scheduler

--

A scheduler is a tool or system for scheduling and managing tasks, events, or resources. We often need to trigger an event or action at a specific time, such as kicking off an automation, exporting data from one system to another at the same time every day, or running batch-processing jobs at regular intervals.

Initial days with Quartz

Quartz is a widely used, open-source Java library for scheduling jobs. It provides a powerful and flexible scheduling mechanism that can be integrated into any Java application. Quartz Scheduler can be used to schedule and execute jobs at specific times or at regular intervals, and it supports cron-like expressions for defining complex scheduling patterns.

In a typical setup, you would have several classes, bean configuration, a database to persist the scheduler state, and the list goes on. In short, you would have a lot of code to maintain.

Once we had a working solution for our first scheduler application, which exported data, we created a library that bundled these beans, classes, properties, and so on, and voilà: any new project could simply import the library and start scheduling without building anything from scratch. Happy days!

But as everyone is aware, "the best code is no code at all". CloudWatch was the answer.

CloudWatch scheduler

Amazon CloudWatch Events allows you to create rules based on AWS event patterns or on a schedule. The schedule can be specified using standard cron or rate expressions.

For example, you can create a rule that runs an AWS Lambda function every day at a specific time, or a rule that runs an EC2 instance every hour.
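As a quick sketch of the rate-based flavour (the rule name here is only an example), an hourly rule would be created like this:

aws events put-rule \
--name HourlyHealthCheck \
--schedule-expression 'rate(1 hour)'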

When the schedule is triggered, it invokes the target that you have configured. Some of the patterns we have are invoking a lambda, pushing the event to an SQS queue, or starting an ECS task.

Here is an example of how to create a rule that sends an SQS message every day at 8:00 a.m. UTC:

aws events put-rule \
--name DailyDataFileExport \
--schedule-expression 'cron(0 8 * * ? *)'

aws events put-targets \
--rule DailyDataFileExport \
--targets '[{"Id":"1","Arn":"arn:aws:sqs:REGION:ACCOUNT_ID:QUEUE_NAME"}]'

In this example, DailyDataFileExport is the name of the rule, cron(0 8 * * ? *) is the schedule expression for 8:00 a.m. UTC every day, and the target is the SQS queue with the specified ARN.

Note that you will need to replace REGION, ACCOUNT_ID and QUEUE_NAME with the correct values for your SQS queue.

When the rule is triggered, it generates an event like the sample below, which is delivered as the input to the SQS target:

{
  "version": "0",
  "id": "53dc4d37-cffa-4f76-80c9-8b7d4a4d2eaa",
  "detail-type": "Scheduled Event",
  "source": "aws.events",
  "account": "123456789012",
  "time": "2023-01-02T14:53:06Z",
  "region": "us-east-1",
  "resources": [
    "arn:aws:events:us-east-1:123456789012:rule/DailyDataFileExport"
  ],
  "detail": {}
}

You can quickly explore different targets, inputs, and other options via the AWS Console.

Time zone issues

Well, the above set-up is good enough if you do not have to worry about time zones, as CloudWatch rules have always been defined in UTC (at least, up until now).

Let’s analyse a real-world example. We have a CloudWatch schedule for system A, a daily automation that takes a dump of data in a certain file format and pushes it to an SFTP server. System B, the legacy system that receives this data, is not able to act on events, so it has to poll for the file at a fixed local time. Our schedule runs at 6:00 p.m. UTC, and the export takes about one hour including some buffer, so the file lands at around 7:00 p.m. UTC. System B is in the AEST time zone (UTC+10), where 7:00 p.m. UTC is 5:00 a.m. the next day, so a 5:00 a.m. poll works. During daylight saving (UTC+11), however, 5:00 a.m. local is only 6:00 p.m. UTC, before the export has finished, so the poll would miss the file. To be safe, system B always runs at 6:00 a.m.

Well, with EventBridge Scheduler, we can set a time zone now.

EventBridge Scheduler

Amazon EventBridge Scheduler is a serverless scheduler that lets you create, run, and manage scheduled tasks in any time zone.

Seeing the time zone option in the AWS Console was so satisfying.

You can use the create-schedule command to create a schedule with different options (replace ROLE_ARN and QUEUE_ARN with the execution role and target queue ARNs for your account):

aws scheduler create-schedule \
--schedule-expression "cron(0 8 * * ? *)" \
--name DailyDataFileExport \
--target '{"RoleArn": "ROLE_ARN", "Arn": "QUEUE_ARN", "Input": "YOUR MESSAGE TO THE TARGET"}' \
--schedule-expression-timezone "Australia/Sydney" \
--flexible-time-window '{"Mode": "OFF"}'

You can use special keywords in your target payload; EventBridge Scheduler replaces them with their actual values when the target is invoked. You can also pass custom data alongside them if you want to apply your own logic. For example, if a single service performs various actions, such as an upload or a download automation on different schedules, you would create multiple schedules and pass an action of EXPORT or IMPORT to drive the different logic, as in the sample payload below and the command sketch that follows it.

{
  "arn": "<aws.scheduler.schedule-arn>",
  "scheduledTime": "<aws.scheduler.scheduled-time>",
  "executionId": "<aws.scheduler.execution-id>",
  "attemptNumber": "<aws.scheduler.attempt-number>",
  "action": "EXPORT"
}
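As a rough sketch of how that payload could be wired up (the schedule name and ARNs are placeholders, and the payload has been trimmed to a few fields), the keywords go straight into the Input string of the target, with the inner quotes escaped:

aws scheduler create-schedule \
--name DailyDataFileExport \
--schedule-expression "cron(0 8 * * ? *)" \
--schedule-expression-timezone "Australia/Sydney" \
--flexible-time-window '{"Mode": "OFF"}' \
--target '{"RoleArn": "ROLE_ARN", "Arn": "QUEUE_ARN", "Input": "{\"action\": \"EXPORT\", \"scheduledTime\": \"<aws.scheduler.scheduled-time>\", \"executionId\": \"<aws.scheduler.execution-id>\"}"}'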

You can also configure a dead-letter queue so that you can troubleshoot when a schedule invocation fails for some reason.
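On the target side, this is just a DeadLetterConfig block pointing at an SQS queue; as a sketch (the DLQ ARN is a placeholder), the target from the earlier command would become:

--target '{"RoleArn": "ROLE_ARN", "Arn": "QUEUE_ARN", "Input": "YOUR MESSAGE TO THE TARGET", "DeadLetterConfig": {"Arn": "arn:aws:sqs:REGION:ACCOUNT_ID:DLQ_NAME"}}'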

Putting everything together in the AWS Console, you should end up with something like this.


Information has been prepared for information purposes only and does not constitute advice.
