AWS: Export DynamoDB into S3 using Data Pipeline
AWS Data Pipeline is a web service that you can use to automate the movement and transformation of data.
Here I will show an example of exporting a DynamoDB table's contents into an S3 bucket through Data Pipeline.
First, create a table in DynamoDB as shown below.
Then create 2–3 items in the created table with some very basic info.
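If you prefer scripting the setup over clicking through the console, here is a minimal boto3 sketch of the same two steps. The table name "Employee", the key schema, the region, and the sample items are my own placeholders, not fixed by this article:

```python
import boto3

dynamodb = boto3.client("dynamodb", region_name="us-east-1")

# Hypothetical table: the name "Employee", the key schema, and the
# region are assumptions for this sketch.
dynamodb.create_table(
    TableName="Employee",
    AttributeDefinitions=[{"AttributeName": "id", "AttributeType": "S"}],
    KeySchema=[{"AttributeName": "id", "KeyType": "HASH"}],
    BillingMode="PAY_PER_REQUEST",
)
dynamodb.get_waiter("table_exists").wait(TableName="Employee")

# Seed a few very basic items, as described above.
for i in range(1, 4):
    dynamodb.put_item(
        TableName="Employee",
        Item={"id": {"S": str(i)}, "name": {"S": f"user-{i}"}},
    )
```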
After that create an s3 bucket as shown in the diagram
and here we won’t add any configuration and permission for now just creating a basic bucket. After creating a bucket it will be empty only, Now we will go to the data pipeline section which is one of the AWS services and create a new data pipeline
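The bucket step can also be scripted; a quick sketch, assuming the hypothetical bucket name "my-dynamodb-export-bucket" (bucket names must be globally unique):

```python
import boto3

s3 = boto3.client("s3", region_name="us-east-1")

# In us-east-1 no CreateBucketConfiguration is needed; in any other
# region you must also pass a LocationConstraint.
s3.create_bucket(Bucket="my-dynamodb-export-bucket")
```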
Choose the Export DynamoDB table to S3 template from the drop-down, and specify the output S3 bucket we just created and the source DynamoDB table.
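The same pipeline can be created from the SDK as well. A minimal boto3 sketch, where the pipeline name and uniqueId are my own placeholders:

```python
import boto3

dp = boto3.client("datapipeline", region_name="us-east-1")

# uniqueId guards against creating duplicate pipelines if the call
# is retried with the same value.
resp = dp.create_pipeline(
    name="dynamodb-to-s3-export",
    uniqueId="dynamodb-to-s3-export-v1",
)
pipeline_id = resp["pipelineId"]
```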
For now, we can run our pipeline on pipeline activation, but we can also schedule the pipeline job, for example on a weekly or monthly basis; the schedule interval can be 15 minutes or more, but not less than 15 minutes. We will disable logging for now, since we don't need it, and leave the IAM roles set to default.
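In the pipeline definition, that scheduling choice corresponds to a Schedule object. A hedged sketch of what the relevant object might look like, mirroring the 15-minute minimum mentioned above (the id and name are placeholders):

```python
# A Schedule pipeline object for put_pipeline_definition (sketch).
# "period" cannot be shorter than 15 minutes.
schedule_object = {
    "id": "ExportSchedule",
    "name": "Every15Minutes",
    "fields": [
        {"key": "type", "stringValue": "Schedule"},
        {"key": "period", "stringValue": "15 minutes"},
        {"key": "startAt", "stringValue": "FIRST_ACTIVATION_DATE_TIME"},
    ],
}
```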
If we are creating the data pipeline for official purposes, we should create tags, as they help keep track of cost and billing. After specifying all the details, click on Edit in Architect.
We then land on this page. Now we need to select Activities, where we will find many fields to fill in: the Name of the activity, its Type, the Output (where to back up the content to), the Input (where we are transferring the data from), and Maximum Retries, which defines how many times the process should be retried if an error occurs along the way.
Then we need to select Add an optional field, which will create a new action like the one below.
To keep track of success and failure, add two optional fields: one On Success and the other On Fail. For now, keep DefaultAction1 for On Success; for On Fail, select Create new action, which will create a new action, DefaultAction2.
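In the pipeline definition, those Input, Output, Maximum Retries, On Success, and On Fail fields become refValue entries pointing at other pipeline objects. A skeletal sketch, continuing with the dp client and schedule_object from above; the console's template additionally defines the EMR cluster, the DynamoDB and S3 data nodes, and the export step, all of which I omit here, and the object ids are placeholders:

```python
activity_object = {
    "id": "TableBackupActivity",
    "name": "TableBackupActivity",
    "fields": [
        # The export template runs this as an EMR-based activity; the
        # surrounding EmrCluster and data-node objects are omitted.
        {"key": "type", "stringValue": "EmrActivity"},
        {"key": "input", "refValue": "DDBSourceTable"},      # DynamoDB data node
        {"key": "output", "refValue": "S3BackupLocation"},   # S3 data node
        {"key": "maximumRetries", "stringValue": "2"},
        {"key": "onSuccess", "refValue": "DefaultAction1"},  # success SNS action
        {"key": "onFail", "refValue": "DefaultAction2"},     # failure SNS action
    ],
}

dp.put_pipeline_definition(
    pipelineId=pipeline_id,
    pipelineObjects=[schedule_object, activity_object],  # plus the omitted objects
)
```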
Now select Others, as shown below.
Here we can fill in the Name and Subject with a suitable title, and for Type we need to select SnsAlarm, which will notify us whether the export succeeded or failed.
For Topic Arn, we need to select a topic from Simple Notification Service (another AWS service), as shown in the image below.
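If you don't have a topic yet, it can be created from boto3 too. A sketch showing the topic creation and the fields an SnsAlarm pipeline object carries; the topic name, subject, message, and role are placeholders, and the role must be allowed to publish to the topic:

```python
import boto3

sns = boto3.client("sns", region_name="us-east-1")
topic_arn = sns.create_topic(Name="dynamodb-export-alerts")["TopicArn"]

# SnsAlarm pipeline object referenced by the onSuccess field (sketch).
success_alarm = {
    "id": "DefaultAction1",
    "name": "SuccessSnsAlarm",
    "fields": [
        {"key": "type", "stringValue": "SnsAlarm"},
        {"key": "topicArn", "stringValue": topic_arn},
        {"key": "subject", "stringValue": "DynamoDB export succeeded"},
        {"key": "message", "stringValue": "Export to S3 completed."},
        {"key": "role", "stringValue": "DataPipelineDefaultRole"},
    ],
}
```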
Above I have shown the SuccessSnsAlarm, and now I will show the FailureSnsAlarm.
Now that all the fields are filled in, we can proceed with saving the data pipeline.
After saving, it will ask us to activate it.
So we click Activate, and it will activate our newly created data pipeline. It may take some time to process, and it will show a status like the one in the diagram below.
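Activation and the status check can be scripted as well, continuing with the dp client and pipeline_id from the earlier sketch:

```python
dp.activate_pipeline(pipelineId=pipeline_id)

# Inspect the pipeline's state; it moves through states such as
# PENDING and SCHEDULED while the export runs.
desc = dp.describe_pipelines(pipelineIds=[pipeline_id])
for field in desc["pipelineDescriptionList"][0]["fields"]:
    if field["key"] == "@pipelineState":
        print("pipeline state:", field.get("stringValue"))
```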
After some time the status will change to processing, and the pipeline will dump our DynamoDB content into the S3 bucket. We can go and check the S3 bucket; we may need to refresh it, as sometimes the file won't show up right away. After refreshing, we can see the file in our bucket.
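Instead of refreshing the console, you can list the bucket from the SDK, reusing the s3 client and the assumed bucket name from earlier:

```python
# Print every object the export wrote into the bucket.
resp = s3.list_objects_v2(Bucket="my-dynamodb-export-bucket")
for obj in resp.get("Contents", []):
    print(obj["Key"], obj["Size"])
```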
Now I have opened this file, and I can see exactly the data present in DynamoDB. Below I will show both the data present in DynamoDB and the file I exported to S3.
And here is the exported file with the same data.
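To see the DynamoDB side of the comparison without the console, a simple scan prints what the table currently holds, reusing the dynamodb client and assumed table name from the first sketch (fine for a 2–3 item demo table; a large table would need pagination):

```python
# Dump the table's items for a side-by-side check with the export file.
items = dynamodb.scan(TableName="Employee")["Items"]
for item in items:
    print(item)
```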
After completing the whole process, we can check the status in the Data Pipeline section; it will show as Finished.
I hope this article helps with exporting data from DynamoDB to S3. Sometimes the data is of no use after 1–2 days, but to keep a record for history purposes we need to save it somewhere, and storing it in S3 is cheaper than keeping it in the table.
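On that cost point: if the backups only matter for a limited time, an S3 lifecycle rule can expire old export files automatically. A sketch, reusing the assumed bucket name from above with a hypothetical 30-day retention:

```python
s3.put_bucket_lifecycle_configuration(
    Bucket="my-dynamodb-export-bucket",
    LifecycleConfiguration={
        "Rules": [
            {
                "ID": "expire-old-exports",
                "Status": "Enabled",
                "Filter": {"Prefix": ""},    # apply to the whole bucket
                "Expiration": {"Days": 30},  # hypothetical retention period
            }
        ]
    },
)
```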