Building a Serverless Data Pipeline with AWS S3, Lambda and DynamoDB

Yi Ai
Mar 5, 2019 · 3 min read

AWS Lambda plus Layers is one of the best solutions for managing a data pipeline and implementing a serverless architecture. This post shows how to build a simple data pipeline using AWS Lambda functions, S3 and DynamoDB.

Every day, an external data source exports the previous day's data in CSV format to an S3 bucket, and that data is imported into an AWS DynamoDB table. The S3 event triggers an AWS Lambda function that performs the ETL process and saves the data to DynamoDB.

Before getting started, install the Serverless Framework. Open a terminal and run npm install -g serverless.

Create a new service using the AWS Python template, specifying a unique name and an optional path.

$ serverless create --template aws-python --path data-pipline

Then run the following command from the project root directory to install the serverless-python-requirements plugin:

$ serverless plugin install -n serverless-python-requirements

Edit the serverless.yml file to add the plugin and its configuration:

plugins:
  - serverless-python-requirements

custom:
  pythonRequirements:
    dockerizePip: non-linux
    layer: true # Put dependencies into a Lambda Layer.

You need to have Docker installed to be able to set dockerizePip: true or dockerizePip: non-linux.

This will create a dev.document.files bucket that fires the importCSVToDB function whenever a CSV file is added to the bucket.

functions:
  importCSVToDB:
    handler: handler.importCSVToDB
    layers:
      - {Ref: PythonRequirementsLambdaLayer}
    environment:
      documentsTable: ${self:custom.documentsTableName}
      bucketName: ${self:custom.s3bucketName}
    events:
      - s3:
          bucket: ${self:custom.s3bucketName}
          event: s3:ObjectCreated:Put
          rules:
            - suffix: .csv

The full sample serverless.yml is as follows:
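The embedded gist with the full file is not reproduced here; a sketch assembled from the fragments above might look like this (the runtime version, the table schema, and the IAM statements are assumptions, not the author's exact file):

```yaml
service: data-pipline

plugins:
  - serverless-python-requirements

custom:
  documentsTableName: Documents
  s3bucketName: dev.document.files
  pythonRequirements:
    dockerizePip: non-linux
    layer: true # Put dependencies into a Lambda Layer.

provider:
  name: aws
  runtime: python3.7
  stage: ${opt:stage, 'dev'}
  iamRoleStatements:
    - Effect: Allow
      Action:
        - dynamodb:UpdateItem
      Resource:
        Fn::GetAtt: [DocumentsTable, Arn]
    - Effect: Allow
      Action:
        - s3:GetObject
      Resource: arn:aws:s3:::${self:custom.s3bucketName}/*

functions:
  importCSVToDB:
    handler: handler.importCSVToDB
    layers:
      - {Ref: PythonRequirementsLambdaLayer}
    environment:
      documentsTable: ${self:custom.documentsTableName}
      bucketName: ${self:custom.s3bucketName}
    events:
      - s3:
          bucket: ${self:custom.s3bucketName}
          event: s3:ObjectCreated:Put
          rules:
            - suffix: .csv

resources:
  Resources:
    DocumentsTable:
      Type: AWS::DynamoDB::Table
      Properties:
        TableName: ${self:custom.documentsTableName}
        AttributeDefinitions:
          - AttributeName: id
            AttributeType: S
        KeySchema:
          - AttributeName: id
            KeyType: HASH
        BillingMode: PAY_PER_REQUEST
```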

Now, let’s update handler.py to create a pandas DataFrame from the source CSV in the S3 bucket, convert the DataFrame to a list of dictionaries, and load each dictionary into the DynamoDB table using the update_item method:

As you can see from the Lambda function above, we use pandas to read the CSV file. pandas is the most popular data-manipulation package in Python, and DataFrames are the pandas data type for storing tabular, two-dimensional data.
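As a quick illustration of the DataFrame-to-dictionaries step (the sample data here is made up):

```python
import io

import pandas as pd

# A tiny stand-in for the exported CSV file.
csv_file = io.StringIO("id,name\n1,Alice\n2,Bob\n")

df = pd.read_csv(csv_file)    # tabular 2D data as a DataFrame
rows = df.to_dict("records")  # one dictionary per CSV row
assert rows == [{"id": 1, "name": "Alice"}, {"id": 2, "name": "Bob"}]
```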

Let’s deploy the service and test it out!

$ sls deploy --stage dev

To test the data import, we can manually upload a CSV file to the S3 bucket, or use the AWS CLI to copy a local file to it:

$ aws s3 cp sample.csv s3://dev.document.files

And there it is: the data is imported into the DynamoDB DocumentsTable table.

You can find the complete project in my GitHub repo:

Alternatively, you can use AWS Data Pipeline to import a CSV file into a DynamoDB table.

AWS Data Pipeline is a web service that you can use to automate the movement and transformation of data. With AWS Data Pipeline, you can define data-driven workflows, so that tasks can be dependent on the successful completion of previous tasks. You define the parameters of your data transformations and AWS Data Pipeline enforces the logic that you’ve set up.





Written by Yi Ai

AWS Community Builder | AWS & AZURE Certified Engineer | A Cloud Technology Enthusiast | AWS Certified Security/Machine Learning/Database Analytics Specialty
