Schedule a million emails: The AWS EventBridge Secret

Adithya M
7 min read · Mar 3, 2024

https://en.wikipedia.org/wiki/File:Amazon_Web_Services_Logo.svg

In the fast-paced world of digital communication, timing is everything. Whether it’s delivering promotional offers, critical updates, or personalized messages, the ability to schedule and send a million emails with precision can make all the difference.

In this article, we’ll delve into a real-world scenario where the need arose to schedule the dispatch of a million emails, each uniquely timed for optimal engagement, all while embracing the true essence of serverless computing. By harnessing the combined power of AWS services like API Gateway, SQS (Simple Queue Service), Lambda, and SES (Simple Email Service), we’ll uncover how AWS EventBridge Scheduler simplifies and automates this complex operation.

EventBridge Scheduler

Amazon EventBridge Scheduler is a serverless scheduler that allows you to create, run, and manage tasks from one central, managed service. It allows us to create schedules for recurring patterns or to configure one-time invocations. It is highly configurable, with scheduling precision down to the minute.

Official documentation: https://docs.aws.amazon.com/scheduler/latest/UserGuide/what-is-scheduler.html

Problem Statement

When our system receives a POST API call initiated by a user or another system, it includes essential email parameters such as the recipient’s email address, subject, body, and the designated time for sending. Our responsibility is to process this request and schedule the email to be sent at the specified time.

Design

  • We’ll configure an API Gateway to efficiently handle incoming POST API calls.
  • This API Gateway is integrated with an SQS queue, the Request Queue. Although API Gateway could invoke Lambda directly, placing a queue in between gives us two advantages. First, messages can be processed in batches, which avoids too many Lambda invocations. Second, the queue serves as a safety net: if processing a request fails, the message can be retried.
  • Once the request has been converted to an SQS message, a 200 success response is sent back.
  • SQS messages act as the trigger for the Request Processor Lambda. Feel free to configure the batch size as per your requirements.
  • The Request Processor Lambda processes each request message and creates a one-time schedule with the Email Sender Lambda as the target.
  • If the number of schedules in your AWS account exceeds the quota, currently 1 million, a ServiceQuotaExceededException is raised. In this case, push the message to a Dead Letter Queue (DLQ).
  • There are two approaches to redriving messages back to the Request Queue: event-driven, triggered by specific conditions, and periodic, executed at scheduled intervals. For the first one, refer to this article. For the second one, we will use an EventBridge rule to invoke a Redrive Lambda at regular intervals.

Ensure that the re-drive process occurs only when ample scheduler quota becomes available.

  • When its scheduled time arrives, the schedule invokes the Email Sender Lambda function, passing along the designated payload.
  • The Lambda processes this payload and sends an email to the recipient using Simple Email Service.
  • We should make sure that schedules are deleted after successful invocation, which frees up quota. (This is handled inside the schedule creation logic.)

Implementation

Create a POST API and integrate it with SQS:

There is already a good article on how to do this; follow all of its steps. While creating the queue, set the visibility timeout to a sufficiently high duration (say, 10 minutes) to avoid the message being processed again. Also create a dead letter queue. The URL of this queue will be added to the Request Processor Lambda code.

JSON body for the POST API:

{
  "email": {
    "emailAddress": "testabc@gmail.com",
    "subject": "Test Subject",
    "body": "Test email body"
  },
  "timeToSend": "2024-03-05T08:15:47"
}

Create a Request Processor Lambda:

  • Go to the AWS console, navigate to Lambda, and click Create function.
  • Select Author from scratch and, for the runtime, choose Python (or any other language of your choice). Click Create function again.
  • Once the function is created, head to Configuration > Permissions in the Lambda console.
  • Under Execution role, click the role name. It redirects you to the Lambda’s IAM role.
  • In the IAM console, add a customer inline policy under Permissions policies: Add permissions > Create inline policy.
  • In the policy editor, choose JSON, add the policy below, and save. These are all the permissions required for your Lambda.

(Add your account id instead of 123456789012)

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "iam:PassRole",
        "sqs:ReceiveMessage",
        "sqs:DeleteMessage",
        "scheduler:CreateSchedule",
        "sqs:GetQueueAttributes"
      ],
      "Resource": [
        "arn:aws:sqs:eu-west-1:123456789012:*",
        "arn:aws:iam::123456789012:role/*",
        "arn:aws:scheduler:eu-west-1:123456789012:schedule/*/*"
      ]
    }
  ]
}
  • Go back to the Lambda console and click Add trigger in the Function overview. Select SQS as the source and choose the queue you created in the previous step. Configure the batch size, batch window, and concurrency as per your requirements.
  • Let’s add the code to the Lambda function and deploy. Make sure that the Python version is 3.12 (or above); some EventBridge Scheduler functions are not supported in older versions. (Some of the AWS resources the code refers to will be created in the next steps.)
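
The original code embed is not reproduced here, so the following is only a minimal sketch of what a Request Processor Lambda could look like, assuming boto3 and the request body shown earlier. The environment variable names, the placeholder ARNs, the schedule naming scheme, and the assumption that `timeToSend` is in UTC are all illustrative, not taken from the article.

```python
import json
import os
import uuid

# Hypothetical configuration: these names and ARNs are placeholders.
EMAIL_SENDER_ARN = os.environ.get(
    "EMAIL_SENDER_ARN",
    "arn:aws:lambda:eu-west-1:123456789012:function:email-sender-lambda",
)
SCHEDULER_ROLE_ARN = os.environ.get(
    "SCHEDULER_ROLE_ARN",
    "arn:aws:iam::123456789012:role/email-scheduler-role",
)
DLQ_URL = os.environ.get("DLQ_URL", "")


def build_schedule_request(body: dict) -> dict:
    """Translate one request body into keyword arguments for CreateSchedule."""
    return {
        "Name": f"email-{uuid.uuid4()}",  # must be unique within the schedule group
        # One-time schedules use the at(yyyy-mm-ddThh:mm:ss) expression.
        "ScheduleExpression": f"at({body['timeToSend']})",
        "ScheduleExpressionTimezone": "UTC",  # assumption: timeToSend is in UTC
        "FlexibleTimeWindow": {"Mode": "OFF"},
        "Target": {
            "Arn": EMAIL_SENDER_ARN,
            "RoleArn": SCHEDULER_ROLE_ARN,
            "Input": json.dumps(body["email"]),  # payload handed to the target
        },
        # Delete the schedule once it has fired, which frees up quota.
        "ActionAfterCompletion": "DELETE",
    }


def lambda_handler(event, context):
    import boto3  # bundled with the Lambda Python runtime

    scheduler = boto3.client("scheduler")
    sqs = boto3.client("sqs")
    for record in event["Records"]:
        body = json.loads(record["body"])
        try:
            scheduler.create_schedule(**build_schedule_request(body))
        except scheduler.exceptions.ServiceQuotaExceededException:
            # Quota reached: park the original message in the DLQ for a later redrive.
            sqs.send_message(QueueUrl=DLQ_URL, MessageBody=record["body"])
```

Note that sending to the DLQ explicitly, as this sketch does, would also need the `sqs:SendMessage` permission on that queue.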

Create Email Sender Lambda: Create a Lambda similar to the one above, but without explicitly adding a trigger; the scheduler will take care of invocation. Only the permission policy and the Python code are different for this Lambda.

Please note that the sender email address needs to be verified in SES. Official documentation: https://docs.aws.amazon.com/ses/latest/dg/creating-identities.html#just-verify-email-proc

Permission policy:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "VisualEditor0",
      "Effect": "Allow",
      "Action": [
        "ses:SendEmail",
        "ses:SendRawEmail"
      ],
      "Resource": [
        "arn:aws:ses:eu-west-1:123456789012:configuration-set/*",
        "arn:aws:ses:eu-west-1:123456789012:identity/*"
      ]
    }
  ]
}

Lambda function code:
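
The original code embed is not reproduced here. As a rough sketch only, an Email Sender Lambda could look like the following, assuming boto3 and assuming the schedule's payload is the `email` object from the request body. The `SENDER_EMAIL` variable and its default address are hypothetical.

```python
import os

# Hypothetical sender address; it must be an identity verified in SES.
SENDER = os.environ.get("SENDER_EMAIL", "sender@example.com")


def build_ses_request(email: dict) -> dict:
    """Map the scheduler payload onto keyword arguments for SES SendEmail."""
    return {
        "Source": SENDER,
        "Destination": {"ToAddresses": [email["emailAddress"]]},
        "Message": {
            "Subject": {"Data": email["subject"], "Charset": "UTF-8"},
            "Body": {"Text": {"Data": email["body"], "Charset": "UTF-8"}},
        },
    }


def lambda_handler(event, context):
    import boto3  # bundled with the Lambda Python runtime

    # The scheduler passes the schedule's input as the event payload.
    ses = boto3.client("ses")
    response = ses.send_email(**build_ses_request(event))
    return {"messageId": response["MessageId"]}
```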

Create an EventBridge Scheduler role that allows invocation of the Email Sender Lambda: Each EventBridge Scheduler schedule requires an execution role, and that role needs a permission policy. We will first create the policy and then attach it to a role.

  • Head to the IAM console. Select Policies under Access Management and click Create policy. In the policy editor, add the JSON below and save it with a meaningful name.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "lambda:InvokeFunction"
      ],
      "Resource": [
        "arn:aws:lambda:eu-west-1:123456789012:function:email-sender-lambda"
      ]
    }
  ]
}
  • Go back to the IAM console. Select Roles under Access Management and click Create role.
  • For the trusted entity type, choose Custom trust policy, add the JSON below, and click Next.
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "Statement1",
      "Effect": "Allow",
      "Principal": {
        "Service": "scheduler.amazonaws.com"
      },
      "Action": "sts:AssumeRole"
    }
  ]
}
  • In this step, attach the permission policy we just created: search for it and select its checkbox. Choose Next and create the role with a meaningful name. (This role’s ARN (Amazon Resource Name) needs to be added to the Request Processor Lambda.)

Voila! You have successfully created the entire infra to schedule and send millions of emails.

Testing

  • Make a POST API call as described in the first step of the implementation. If the API-SQS integration is successful, you will get a response as mentioned in the link.
  • Check the CloudWatch logs of both Lambdas to confirm that they were invoked correctly.
  • Go to Amazon EventBridge in the console and select Schedules under Scheduler. Here you can find all the schedules you have created; select one to check its schedule, target, and payload.
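
The first test step could be scripted roughly as follows, using only the Python standard library. `API_URL` is a placeholder and not the real endpoint; replace it with the invoke URL of your own API Gateway stage.

```python
import json
import urllib.request

# Hypothetical invoke URL; replace it with your API Gateway stage URL.
API_URL = "https://example.execute-api.eu-west-1.amazonaws.com/prod/schedule-email"


def build_request(email_address, subject, body, time_to_send):
    """Build the POST request carrying the JSON body the API expects."""
    payload = {
        "email": {
            "emailAddress": email_address,
            "subject": subject,
            "body": body,
        },
        "timeToSend": time_to_send,
    }
    return urllib.request.Request(
        API_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def send(request):
    """Send the request and return the HTTP status and response text."""
    with urllib.request.urlopen(request) as response:
        return response.status, response.read().decode("utf-8")
```

Calling `send(build_request("testabc@gmail.com", "Test Subject", "Test email body", "2024-03-05T08:15:47"))` should then produce a schedule that is visible in the console.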

DLQ redrive:

  • The DLQ redrive Lambda steps are not covered in this article, since it is already long. Periodic DLQ redrives, scheduled at regular intervals such as every 12 hours, offer a safer and more controlled approach than event-driven redrives based on the number of messages in the DLQ, because the latter could lead to an infinite redrive loop.
  • The drawback of this mechanism is that emails waiting in the DLQ may be sent later than their specified time, even after emails that were scheduled for a later time.
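
Although the full setup is out of scope here, the core of a periodic redrive might be sketched as below. This is an assumption-laden illustration rather than the author's implementation: the environment variable names and the `max_messages` cap are hypothetical, and the cap is a simple way to avoid flooding the Request Queue while quota is still tight.

```python
import os

# Hypothetical configuration for the two queues.
DLQ_URL = os.environ.get("DLQ_URL", "")
REQUEST_QUEUE_URL = os.environ.get("REQUEST_QUEUE_URL", "")


def redrive(sqs, dlq_url, request_queue_url, max_messages=1000):
    """Move up to max_messages from the DLQ back to the Request Queue."""
    moved = 0
    while moved < max_messages:
        response = sqs.receive_message(
            QueueUrl=dlq_url,
            MaxNumberOfMessages=min(10, max_messages - moved),  # SQS allows at most 10
            WaitTimeSeconds=1,
        )
        messages = response.get("Messages", [])
        if not messages:
            break  # the DLQ is empty
        for message in messages:
            # Re-enqueue first, then delete, so a failure never loses a message.
            sqs.send_message(QueueUrl=request_queue_url, MessageBody=message["Body"])
            sqs.delete_message(QueueUrl=dlq_url, ReceiptHandle=message["ReceiptHandle"])
            moved += 1
    return moved


def lambda_handler(event, context):
    import boto3  # bundled with the Lambda Python runtime

    return {"moved": redrive(boto3.client("sqs"), DLQ_URL, REQUEST_QUEUE_URL)}
```

Sending before deleting means a crash between the two calls can duplicate a message but not drop one, so the downstream processing should tolerate duplicates.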

Quota Increase

We can ask AWS to increase the Scheduler quota at the account level. Here are the steps:

  1. Go to the AWS Management Console and navigate to Service Quotas.
  2. In the navigation pane, choose AWS Services.
  3. In Find Services, search for Amazon EventBridge Scheduler and select it.
  4. In the list of quotas, select Number of schedules, then choose Request increase at account level.

Endpoints and Quota: https://docs.aws.amazon.com/general/latest/gr/eventbridgescheduler.html

Scaling issues? Quota increase insufficient? An alternative

Instead of EventBridge Scheduler, use a DynamoDB table that has both a partition key and a sort key. The partition key should be the Unix time in minutes. The idea is to store all records due at the same time in a single partition, which makes fetching them faster. The sort key could be a unique ID for your reference.

The real challenge arises when your requests per minute are very high and a single consumer microservice or Lambda cannot process them all within a minute. Spin up more instances? You will end up sending duplicate emails. One way around this is to sub-partition the table further.
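
To make the key design concrete, here is a small sketch of how such keys might be derived. The shard count, the `pk`/`sk` attribute names, the `minute#shard` format, and the assumption that timestamps are naive UTC strings are all illustrative choices, not part of the original design.

```python
import hashlib
from datetime import datetime, timezone

NUM_SHARDS = 8  # hypothetical number of sub-partitions per minute


def minute_bucket(time_to_send: str) -> int:
    """Unix time in whole minutes, assuming a naive timestamp in UTC."""
    moment = datetime.fromisoformat(time_to_send).replace(tzinfo=timezone.utc)
    return int(moment.timestamp()) // 60


def shard_for(record_id: str, num_shards: int = NUM_SHARDS) -> int:
    """Deterministically assign a record to one sub-partition."""
    digest = hashlib.sha256(record_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_shards


def build_keys(time_to_send: str, record_id: str) -> dict:
    """Partition key groups records by minute and shard; sort key is the unique ID."""
    return {
        "pk": f"{minute_bucket(time_to_send)}#{shard_for(record_id)}",
        "sk": record_id,
    }
```

With this layout, each consumer instance could own a fixed subset of shard numbers and query only its own `minute#shard` partitions, so no two instances pick up the same record.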

(Yes, this compelling content warrants its own dedicated article. Expect one where we’ll dive deep into distributed world!!)

Conclusion

EventBridge Scheduler is one of the most powerful AWS services, allowing you to schedule a million tasks. As long as your load is well within this range, there is no need to build and maintain your own infrastructure. It is cute and convenient to use, plus it can invoke targets across more than 270 AWS services.

Post Script

  • Apologies, this ended up becoming a lengthy article. I should have split it into two, one explaining the scheduler and another detailing its implementation.
  • Forgive me if my Python code isn’t up to industry standards. I used it for convenience; I am not a Python developer.
  • Embracing serverless while staying true to my microservices roots.

I thank Suman for introducing me to EventBridge Scheduler and sharing the best practices around it.
