While exploring serverless architecture, I found that many use cases can be built on top of it. One I worked on recently is report scheduling: reports are scheduled by the user and then sent out over email. It sounds easy at first glance, but a few complications make such use cases challenging to develop with serverless.
Use Case: Report Scheduling
_> The user can choose the timezone in which they want to receive the report, and then pick the time at which it should arrive.
_> The user can choose the report type they want to schedule, for example a transaction report or a summary report.
_> The user can pick the report occurrence: daily, weekly, monthly, or custom.
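Taken together, the user's choices might be captured in a request shaped like this; all field names and values here are illustrative assumptions, not a fixed schema:

```python
# One possible shape for a schedule request capturing the choices above;
# every field name and value is an assumption for illustration.
schedule_request = {
    "user_id": "u-123",
    "report_type": "transaction",   # or "summary"
    "occurrence": "daily",          # daily | weekly | monthly | custom
    "timezone": "Asia/Kolkata",     # report timezone chosen by the user
    "send_at_local": "09:00",       # local delivery time in that timezone
}
```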
The solution built on top of these use cases and parameters is shaped by a few hard requirements.
If you search for such scheduling patterns, you will quickly come across the DynamoDB TTL mechanism.
What is DynamoDB TTL (as an ad-hoc scheduler)?
DynamoDB TTL can act as an ad-hoc scheduling mechanism: it solves the scheduling problem without adding any external event-based cron or triggers.
With TTL (Time to Live) enabled, each item you insert into the table carries an attribute marking when it should expire, and DynamoDB deletes the item once that time passes. (You have to enable the TTL functionality on the DynamoDB table — Reference is here)
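Writing such an item is a one-line `put_item` once the expiry is converted to epoch seconds, which is the format the TTL attribute expects. A minimal sketch, assuming a table named `report-schedules` with illustrative attribute names:

```python
from datetime import datetime, timezone

def ttl_epoch(run_at: datetime) -> int:
    """Convert an aware datetime to the Unix-epoch seconds DynamoDB TTL expects."""
    return int(run_at.timestamp())

def schedule_report(table, schedule_id: str, report_type: str, run_at: datetime):
    """Insert a schedule item that DynamoDB will expire (delete) around run_at.

    `table` would be a boto3 Table resource in real code, e.g.
    boto3.resource("dynamodb").Table("report-schedules"); the table must have
    TTL enabled on the "ttl" attribute.
    """
    table.put_item(Item={
        "pk": schedule_id,          # assumed partition key name
        "report_type": report_type,
        "ttl": ttl_epoch(run_at),   # epoch seconds; DynamoDB expires after this
    })
```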
For this, refer to the image AWS-SchedulewithDynamoDB above; the approach is:
1) One Lambda creates events in DynamoDB with a TTL. Once the time arrives and the item expires, the deletion flows through the table's stream and triggers the Report Execution Lambda (which is paired with the DynamoDB stream to receive the trigger and perform the invocations).
2) Once the Report Execution Lambda is invoked successfully, it adds an entry to SQS for further processing. (Why SQS? _> To absorb failures and retry with DLQs.)
3) SQS triggers the Email Sender Lambda with the request payload, which sends the email to the respective users, with SNS used for generating notifications at the different subscription points.
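Steps 1 and 2 hinge on telling TTL expiries apart from ordinary deletes on the stream: TTL deletions arrive as `REMOVE` records whose `userIdentity.principalId` is `dynamodb.amazonaws.com`. A sketch of that filtering inside the Report Execution Lambda, with assumed attribute names (`pk`, `report_type`):

```python
def extract_expired_jobs(event):
    """Pull report jobs out of TTL-expired DynamoDB stream records.

    `event` is the DynamoDB Streams payload Lambda receives; the stream view
    type must include OLD_IMAGE so the expired item's attributes are present.
    """
    jobs = []
    for record in event.get("Records", []):
        if record.get("eventName") != "REMOVE":
            continue  # only deletions can be TTL expiries
        identity = record.get("userIdentity") or {}
        if identity.get("principalId") != "dynamodb.amazonaws.com":
            continue  # a manual delete, not a TTL expiry
        old = record["dynamodb"]["OldImage"]
        jobs.append({
            "schedule_id": old["pk"]["S"],
            "report_type": old["report_type"]["S"],
        })
    return jobs
```

In the real handler, each job would then be forwarded with boto3's `sqs.send_message(QueueUrl=..., MessageBody=json.dumps(job))`.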
[NOTE: DynamoDB is used here only for storing the scheduled report events, not the report data itself, since TTL items are deleted/removed once they expire. Refer here]
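Step 3's handoff from SQS to email can be sketched as a small translation from the SQS record into an SNS publish. The topic ARN and payload fields below are assumptions; the actual send would be boto3's `sns.publish(**build_publish_kwargs(record))`, with subscribers attached to the topic:

```python
import json

# Illustrative topic ARN; in real code this would come from configuration.
TOPIC_ARN = "arn:aws:sns:us-east-1:123456789012:report-ready"

def build_publish_kwargs(sqs_record):
    """Map one SQS record from the report queue to SNS publish arguments."""
    body = json.loads(sqs_record["body"])
    return {
        "TopicArn": TOPIC_ARN,
        "Subject": "Your {} report is ready".format(body["report_type"]),
        "Message": json.dumps({"schedule_id": body["schedule_id"]}),
    }
```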
> Takeaways from the DynamoDB TTL (ad-hoc) approach to scheduling events.
- This approach is highly scalable: it can take millions of requests at a time and process them on the go with Lambda.
- It is cost-effective and hassle-free from a scalability point of view: as requests scale up, Lambda scales up too, and the DynamoDB stream controls how many records are handed to Lambda at a time (configurable on the Lambda/DynamoDB event source mapping).
- The main drawback is timing. Tasks may suffer delays in delivery: TTL can scale to many tasks all expiring at the same time, but DynamoDB does not guarantee exactly when an expired item is deleted, so it cannot guarantee that tasks are executed on time.
There is another approach that gives you a stronger delivery guarantee, tending to run the scheduled tasks on time. It uses a similar architecture with small changes in handling that eliminate the TTL mechanism.
This event-driven architecture is still mostly decoupled and uses lightweight polling to pick up events when it is time to work on them.
For this, refer to the image AWS-SchedulewithEventDriven above; the approach is:
1) A Lambda puts the request and creates an item in DynamoDB with the scheduled time when the task is scheduled.
2) The Report Execution Lambda runs on a window of every 5 minutes (with an EventBridge cron) and reads the DynamoDB items for scheduled task times.
3) For each task found to be due in the next 5 minutes, the Report Execution Lambda executes the report and adds a message to SQS, using the SQS delivery delay to hide the task until its original scheduled time, so SQS only makes the task visible when its time arrives.
4) SQS triggers the Email Sender Lambda, which sends the email and SNS notifications to subscribers.
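The selection logic in steps 2 and 3 can be sketched as follows; field names are assumptions, and note that SQS caps `DelaySeconds` at 900, which comfortably covers a 5-minute window:

```python
import time

WINDOW_SECONDS = 300  # the EventBridge cron fires every 5 minutes

def tasks_for_window(items, now=None):
    """Return the tasks due this window, each with the SQS DelaySeconds
    that hides its message until the original scheduled time."""
    now = int(time.time()) if now is None else now
    batch = []
    for item in items:
        # Due (or overdue) within this polling window.
        if item["run_at"] < now + WINDOW_SECONDS:
            delay = max(0, min(item["run_at"] - now, 900))  # SQS max is 900s
            batch.append({"schedule_id": item["pk"], "delay_seconds": delay})
    return batch
```

A real implementation would also mark items as dispatched (or delete them) so the next poll does not pick the same tasks up again.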
> Takeaways from the event-driven approach to scheduling events.
- It reliably processes the tasks whether you have just one or millions at that moment. The more you have, the longer it takes to run them all, due to the Lambda concurrency limit. (As your scale increases, you can request a higher Lambda concurrency quota to be able to process the required number of reports within the limited time you have.)
- It is reliable and gives you a timely window for report/task execution, but it comes with DB lookups and some latency when handling very frequent tasks, which needs to be handled accordingly.
Conclusion -> There are other approaches to handling these kinds of use cases, but I found these two minimal and easy-going, with plenty of room for scalability improvements as you go further. It is a matter of choosing between the approaches above: one is timely, and the other comes with easy, hands-off triggers.
If you liked this article, I would be glad to see your thoughts in the comments. If you have any questions or problems regarding AWS infra, reach out to me and I will surely help.
Hope this article helps you in some way,
Looking forward to the next article. Thanks!