Export AWS DynamoDB data to S3 on a recurring basis using Lambda
--
Have you ever wanted to configure an automated way to export DynamoDB data to S3 on a recurring basis, only to realise that the console only allows you to do a single export?
In this article, I will walk through the setup to automate the export process, using Lambda as an orchestrator to invoke the API and EventBridge as a trigger to determine the frequency of the export.
Implementation Steps:
- Enable Point-In-Time Recovery (PITR) for your DynamoDB table
- Create a Lambda function that invokes the exportTableToPointInTime DynamoDB API from the AWS SDK
- Create an EventBridge rule to trigger the Lambda function
I will not be covering the provisioning steps for S3 and DynamoDB. If you are familiar with Infrastructure as Code, you can refer to my Terraform code here.
1) Enable Point-In-Time Recovery (PITR) for your DynamoDB table
Go to DynamoDB → Select the table → Backup section → Edit PITR → Enable PITR
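If you prefer to script this step instead of clicking through the console, PITR can also be enabled through the AWS SDK. Below is a minimal sketch using the Node.js AWS SDK v2; the table name is a placeholder.

// Sketch: enable point-in-time recovery on the table programmatically.
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB();

dynamodb.updateContinuousBackups({
  TableName: 'my-table', // placeholder: replace with your table name
  PointInTimeRecoverySpecification: { PointInTimeRecoveryEnabled: true },
}).promise()
  .then((res) => console.log(res.ContinuousBackupsDescription.PointInTimeRecoveryDescription))
  .catch(console.error);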
2) Create Lambda Function
We will be creating the Lambda function (Node.js 16.x) to invoke the AWS SDK API. In this example, we will create a new IAM role with the default permissions required for Lambda.
Copy the following code into the Lambda function and replace bucket_name and table_arn with your own values.
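Here is a minimal sketch of such a handler, assuming the AWS SDK for JavaScript v2 that ships with the Node.js 16.x runtime; bucket_name and table_arn are placeholders to replace.

// Lambda handler: exports the DynamoDB table to S3 under a yyyy/mm/dd/ prefix.
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB();

// Placeholders: replace with your own bucket name and table ARN.
const BUCKET_NAME = 'bucket_name';
const TABLE_ARN = 'table_arn';

exports.handler = async () => {
  // Build the yyyy/mm/dd/ prefix from today's date.
  const now = new Date();
  const year = now.getFullYear();
  const month = String(now.getMonth() + 1).padStart(2, '0');
  const day = String(now.getDate()).padStart(2, '0');
  const prefix = `${year}/${month}/${day}/`;

  // Trigger the export from the latest point-in-time backup.
  const result = await dynamodb.exportTableToPointInTime({
    TableArn: TABLE_ARN,
    S3Bucket: BUCKET_NAME,
    S3Prefix: prefix,
    ExportFormat: 'DYNAMODB_JSON',
  }).promise();

  console.log('Export started:', result.ExportDescription.ExportArn);
  return result.ExportDescription.ExportArn;
};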
- The code extracts today's date and formats it as yyyy/mm/dd/.
- Each export is stored in S3 under a prefix derived from the formatted date.
- The code then invokes the DynamoDB exportTableToPointInTime API to trigger the export.
For the code change to take effect, we will have to redeploy the Lambda function.
Lastly, for the code to work, we will need to update the IAM role that we created earlier with the necessary permissions to allow PutObject to S3 and ExportTableToPointInTime from DynamoDB.
Go to IAM → Roles → Search for the role name (export-to-s3-role) → Click on Edit → Add the JSON snippet below into the existing role → Click on Review Policy → Save Changes
{
  "Sid": "DynamoDBPermission",
  "Effect": "Allow",
  "Action": [
    "dynamodb:ExportTableToPointInTime"
  ],
  "Resource": "*"
},
{
  "Sid": "S3Permission",
  "Effect": "Allow",
  "Action": [
    "s3:PutObject"
  ],
  "Resource": "*"
}
Now that our Lambda function is ready, all we need is a trigger to invoke it.
3) Create an EventBridge Rule to trigger the Lambda
Go to Amazon EventBridge → Rules → Create Rule → Configure the schedule → Select the Lambda function created earlier as the target.
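If you would rather create the schedule programmatically, the sketch below does the same thing with the Node.js AWS SDK v2; the rule name, schedule expression, and function ARN are placeholders. Note that the console adds the invoke permission for you, while the API route needs an explicit lambda:AddPermission call.

// Sketch: create a daily EventBridge rule and point it at the export Lambda.
const AWS = require('aws-sdk');
const eventbridge = new AWS.EventBridge();
const lambda = new AWS.Lambda();

// Placeholders: replace with your own rule name and function ARN.
const RULE_NAME = 'export-dynamodb-to-s3-daily';
const FUNCTION_ARN = 'arn:aws:lambda:us-east-1:123456789012:function:export-to-s3';

async function createTrigger() {
  const { RuleArn } = await eventbridge.putRule({
    Name: RULE_NAME,
    ScheduleExpression: 'rate(1 day)', // or a cron expression, e.g. cron(0 2 * * ? *)
    State: 'ENABLED',
  }).promise();

  await eventbridge.putTargets({
    Rule: RULE_NAME,
    Targets: [{ Id: 'export-lambda-target', Arn: FUNCTION_ARN }],
  }).promise();

  // Allow EventBridge to invoke the function (the console does this automatically).
  await lambda.addPermission({
    FunctionName: FUNCTION_ARN,
    StatementId: 'allow-eventbridge-invoke',
    Action: 'lambda:InvokeFunction',
    Principal: 'events.amazonaws.com',
    SourceArn: RuleArn,
  }).promise();
}

createTrigger().catch(console.error);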
Conclusion
We will be able to see our Lambda function updated with EventBridge as a trigger. Let's verify our implementation to see whether the DynamoDB export is triggered and the data is uploaded to S3 successfully. After the Lambda function invokes the API, the export process can take up to 30 minutes depending on the size of your table.
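If you do not want to wait in the console, the export ARN returned by the API call can be polled with describeExport. A quick sketch, with a placeholder ARN:

// Sketch: check the status of a running export using its ExportArn.
const AWS = require('aws-sdk');
const dynamodb = new AWS.DynamoDB();

async function checkExport(exportArn) {
  const { ExportDescription } = await dynamodb.describeExport({ ExportArn: exportArn }).promise();
  // ExportStatus will be IN_PROGRESS, COMPLETED or FAILED.
  console.log(ExportDescription.ExportStatus);
  return ExportDescription;
}

// Placeholder ARN; use the ExportArn logged by the Lambda function.
checkExport('arn:aws:dynamodb:us-east-1:123456789012:table/my-table/export/01234567890123-abcdefgh')
  .catch(console.error);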
You will be able to see the objects under the prefix yyyy/mm/dd/AWSDynamoDB/{ExportID}/data/. The objects are compressed in gzip format; download one and unzip it with gunzip {filename} to see the data in JSON format.
With these 3 steps, you can now export your DynamoDB table data to S3 on a recurring basis for use cases such as cross-account sharing of data via S3, backups to S3, and more. You can refer to the complete Terraform code base in my GitHub account. Do comment below or reach out to me via LinkedIn if you need clarification on the implementation.