Building a serverless application on AWS — Part 2: Create a DynamoDB table and store data
In the first article in this series I gave you a brief example of how to build a simple API and ship it to AWS. Now it’s time to put more logic into the API.
But so far I haven’t explained what you are going to build in the end. Let’s change that now; here are the requirements:
A user is able to upload an image. For each uploaded image, three different thumbnails should be created (small, medium, large). Before that, the uploaded image needs to be validated: only image/jpg, image/jpeg and image/png are allowed. At any given time the user is able to check the current status of the image processing. An image can have one of the following statuses: waiting_for_upload, waiting_for_thumbnails, published and, in case of a validation error, invalid_content_type.
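The status values and allowed content types from the requirements can be captured as constants. This is only a minimal sketch: the names ImageStatus, ALLOWED_CONTENT_TYPES and isAllowedContentType are hypothetical and not taken from the repository.

```javascript
// Status values an image can have, as listed in the requirements.
const ImageStatus = Object.freeze({
  WAITING_FOR_UPLOAD: 'waiting_for_upload',
  WAITING_FOR_THUMBNAILS: 'waiting_for_thumbnails',
  PUBLISHED: 'published',
  INVALID_CONTENT_TYPE: 'invalid_content_type',
});

// Content types the requirements allow for an upload.
const ALLOWED_CONTENT_TYPES = Object.freeze(['image/jpg', 'image/jpeg', 'image/png']);

// Hypothetical helper: true when the content type is one of the allowed values.
function isAllowedContentType(contentType) {
  return ALLOWED_CONTENT_TYPES.includes(String(contentType).toLowerCase());
}
```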
Prerequisites
Before you start with the implementation, let me give you a short introduction to AWS DynamoDB.
Amazon DynamoDB is a fully managed NoSQL database service that provides fast and predictable performance with seamless scalability.
One of the most important parts of using a NoSQL database such as AWS DynamoDB begins before you ever put data into it: designing the table and the corresponding primary key. AWS DynamoDB supports two types of primary keys: a Hash Key and a Hash and Range Key (called partition key and sort key in the current AWS documentation).
- A Hash Key consists of a single attribute that uniquely identifies an item
- A Hash and Range Key consists of two attributes that together uniquely identify an item
To see both types of primary keys in action, I will give you an example of each, covering the write and read operation together with the AWS CloudFormation AWS::DynamoDB::Table resource.
Example: “Hash Key” based primary key
1. Write operation
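A minimal sketch of such a write, assuming the hash key attribute is called id and the AWS SDK for JavaScript v2 DocumentClient is used. The helper name buildPutParams and the table name images are hypothetical.

```javascript
// Builds the parameter object for DocumentClient.put() on a table whose
// primary key is the single hash key `id` (attribute name is an assumption).
function buildPutParams(tableName, item) {
  if (!item || typeof item.id !== 'string' || item.id === '') {
    throw new Error('item.id (hash key) is required');
  }
  return { TableName: tableName, Item: item };
}

// Assumed wiring with the AWS SDK for JavaScript v2:
//   const AWS = require('aws-sdk');
//   const documentClient = new AWS.DynamoDB.DocumentClient();
//   await documentClient
//     .put(buildPutParams('images', { id: '42', status: 'waiting_for_upload' }))
//     .promise();
```

Writing a second item with the same id replaces the first one, because the hash key alone identifies the item.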
2. Read operation
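Reading the item back only needs the hash key. Again a sketch under the same assumptions (attribute id, SDK v2 DocumentClient); buildGetParams is a hypothetical helper.

```javascript
// Builds the parameter object for DocumentClient.get(): with a hash-key-only
// primary key, the key consists of exactly one attribute.
function buildGetParams(tableName, id) {
  return { TableName: tableName, Key: { id } };
}

// Assumed wiring with the AWS SDK for JavaScript v2:
//   const { Item } = await documentClient.get(buildGetParams('images', '42')).promise();
//   // Item is undefined when no item with that id exists.
```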
3. AWS CloudFormation resource
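The resource could look roughly like the following sketch, written as a JavaScript object that serializes to the JSON form of a CloudFormation template. The logical ID ImageTable, the attribute name id and the billing mode are assumptions, not the values from the original figure.

```javascript
// AWS::DynamoDB::Table with a hash-key-only primary key.
const hashKeyTemplate = {
  Resources: {
    ImageTable: {
      Type: 'AWS::DynamoDB::Table',
      Properties: {
        // Only key attributes are declared; all other attributes are schemaless.
        AttributeDefinitions: [{ AttributeName: 'id', AttributeType: 'S' }],
        KeySchema: [{ AttributeName: 'id', KeyType: 'HASH' }],
        // Assumption: on-demand capacity, so no throughput numbers are needed.
        BillingMode: 'PAY_PER_REQUEST',
      },
    },
  },
};

// Print the JSON that could be pasted into a template file.
console.log(JSON.stringify(hashKeyTemplate, null, 2));
```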
Example: “Hash and Range Key” based primary key
1. Write operation
The difference is that you are forced to add the required “Range Key” (createdAt) in the put operation.
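As a sketch, assuming the hash key is id and createdAt is stored as an ISO 8601 string (the helper name buildPutParamsWithRangeKey is hypothetical):

```javascript
// Builds the parameter object for DocumentClient.put() on a table with a
// hash and range key: both `id` and `createdAt` must be present on the item.
function buildPutParamsWithRangeKey(tableName, item) {
  for (const key of ['id', 'createdAt']) {
    if (!item || typeof item[key] !== 'string' || item[key] === '') {
      throw new Error(`item.${key} is required`);
    }
  }
  return { TableName: tableName, Item: item };
}

// Example item; the timestamp format is an assumption.
const exampleItem = {
  id: '42',
  createdAt: new Date().toISOString(),
  status: 'waiting_for_upload',
};
```

Two items with the same id but different createdAt values are stored side by side instead of overwriting each other.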
2. Read operation
The difference is that you use the query operation instead of get.
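A sketch of the query parameters, assuming the hash key attribute is id. The helper name buildQueryParams is hypothetical, and returning the newest items first is a choice made for this example.

```javascript
// Builds the parameter object for DocumentClient.query(): all items that
// share the same hash key are returned, ordered by the range key.
function buildQueryParams(tableName, id) {
  return {
    TableName: tableName,
    KeyConditionExpression: '#id = :id',
    ExpressionAttributeNames: { '#id': 'id' },
    ExpressionAttributeValues: { ':id': id },
    ScanIndexForward: false, // descending range key order: newest createdAt first
  };
}

// Assumed wiring with the AWS SDK for JavaScript v2:
//   const { Items } = await documentClient.query(buildQueryParams('images', '42')).promise();
```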
3. AWS CloudFormation resource
The difference is that createdAt is defined as the RANGE key (Line 7–8 + 12–13).
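Such a resource could be sketched as follows, again as a JavaScript object that serializes to the JSON template format. The logical ID, the attribute types and the billing mode are assumptions, and the line numbers mentioned above refer to the original figure, not to this sketch.

```javascript
// AWS::DynamoDB::Table with a hash and range key.
const hashAndRangeKeyTemplate = {
  Resources: {
    ImageTable: {
      Type: 'AWS::DynamoDB::Table',
      Properties: {
        AttributeDefinitions: [
          { AttributeName: 'id', AttributeType: 'S' },
          // The range key attribute has to be declared as well.
          { AttributeName: 'createdAt', AttributeType: 'S' },
        ],
        KeySchema: [
          { AttributeName: 'id', KeyType: 'HASH' },
          { AttributeName: 'createdAt', KeyType: 'RANGE' },
        ],
        // Assumption: on-demand capacity, so no throughput numbers are needed.
        BillingMode: 'PAY_PER_REQUEST',
      },
    },
  },
};

console.log(JSON.stringify(hashAndRangeKeyTemplate, null, 2));
```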
Let’s get back to work now…
So which type of primary key should you choose? Well, it depends. Do you want to store multiple items for the same ID (hash key) or is it enough to have only one item per ID (hash key)?
With the requirements in mind I assume you already designed a simple entity like this:
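A minimal sketch of what such an entity could look like, assuming only the two attributes the text relies on: an id (the hash key) and the status. Any further attributes are left out because they would be guesses, and createImage is a hypothetical helper.

```javascript
// Creates the image entity in its initial state.
function createImage(id) {
  return {
    id, // hash key
    status: 'waiting_for_upload', // first status from the requirements
  };
}
```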
Referring to that entity, the “Hash Key” based primary key is enough. The only thing we want to do is to update the status of our entity as it moves through the different processes. Therefore let’s create the DynamoDB table right away. You can find the matching AWS CloudFormation resource in Figure 1, Line 6–17.
You also want to store the uploaded images somewhere, and I highly recommend using AWS S3 for that. The AWS CloudFormation resource can be found in Figure 1, Line 1–4.
As soon as you have created these two resources, you can start to change the logic of the API.
Disclaimer: AWS API Gateway has a payload size limit of 10 MB (which cannot be increased), so we need a workaround in case our user wants to upload an image larger than 10 MB. For that we use an API offered by AWS S3 to generate a pre-signed URL. A “pre-signed URL” is a URL which grants the user temporary access to AWS S3, but only to one specific object in it. The pre-signed URL expires after a given time (15 minutes by default in the AWS SDK for JavaScript).
To be concrete: the user sends a GET request to your API and gets an uploadUrl as a response. In the second step, the user sends a PUT request to the uploadUrl with the image binary as payload and application/x-www-form-urlencoded as Content-Type.
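From the client's point of view, the two steps could be sketched like this. It assumes the API answers with a JSON body containing an uploadUrl property and that a fetch implementation is available (global in Node.js 18+ and in browsers); apiUrl, buildUploadRequest and uploadImage are hypothetical names.

```javascript
// Builds the options for the second request: a PUT of the raw image bytes.
function buildUploadRequest(imageBytes) {
  return {
    method: 'PUT',
    headers: { 'Content-Type': 'application/x-www-form-urlencoded' },
    body: imageBytes,
  };
}

// Runs both steps and resolves with the URL the image was uploaded to.
// `fetchFn` can be replaced with a stub in tests.
async function uploadImage(apiUrl, imageBytes, fetchFn = fetch) {
  // Step 1: ask the API for a pre-signed upload URL.
  const response = await fetchFn(apiUrl);
  const { uploadUrl } = await response.json();

  // Step 2: send the image directly to S3, bypassing API Gateway.
  const upload = await fetchFn(uploadUrl, buildUploadRequest(imageBytes));
  if (!upload.ok) {
    throw new Error(`Upload failed with status ${upload.status}`);
  }
  return uploadUrl;
}
```

Because the second request goes straight to S3, the 10 MB payload limit of API Gateway does not apply to the image itself.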
Let’s check how this is implemented:
- persist initial image data with status waiting_for_upload (Figure 2, Line 13–20)
- generate uploadUrl with help of the AWS S3 SDK (Figure 2, Line 22–29)
- return uploadUrl (Figure 2, Line 31–35)
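The three steps above can be sketched as a handler factory. This is not the code from Figure 2: it assumes AWS SDK for JavaScript v2 clients are passed in, that the image id doubles as the S3 object key, and that the Lambda returns an API Gateway proxy-style response. The name createUploadUrlHandler and the expiry value are hypothetical.

```javascript
const { randomUUID } = require('crypto');

// Returns a Lambda-style handler. The clients are injected so the function
// can be wired to `new AWS.DynamoDB.DocumentClient()` and `new AWS.S3()`
// in production and to stubs in tests.
function createUploadUrlHandler({ documentClient, s3, tableName, bucketName }) {
  return async function handler() {
    const id = randomUUID();

    // 1. Persist the initial image data with status waiting_for_upload.
    await documentClient
      .put({ TableName: tableName, Item: { id, status: 'waiting_for_upload' } })
      .promise();

    // 2. Generate the pre-signed URL for a single object in the bucket.
    const uploadUrl = s3.getSignedUrl('putObject', {
      Bucket: bucketName,
      Key: id,
      Expires: 300, // seconds; illustrative value
    });

    // 3. Return the uploadUrl (the id is included as an assumption).
    return { statusCode: 200, body: JSON.stringify({ id, uploadUrl }) };
  };
}
```

Returning the id alongside the uploadUrl is a design choice of this sketch; it gives the user something to ask for when checking the processing status later.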
Two last steps are missing
First: In our updated API we are referencing two new environment variables, BUCKET_NAME and IMAGE_TABLE, so we need to pass them to our application as well. To achieve that, change the ApiLambda resource and add the Environment.Variables property. Check Figure 3, Line 24-27.
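Both sides of this can be sketched briefly: the template fragment that passes the values in, and the code that reads them. The logical IDs ImageBucket and ImageTable and the helper readConfig are assumptions; only BUCKET_NAME, IMAGE_TABLE and Environment.Variables come from the text.

```javascript
// Fragment for the Properties of the ApiLambda resource, in JSON template form.
// Ref on a bucket or table resolves to its name.
const apiLambdaEnvironment = {
  Environment: {
    Variables: {
      BUCKET_NAME: { Ref: 'ImageBucket' },
      IMAGE_TABLE: { Ref: 'ImageTable' },
    },
  },
};

// Reads the two variables inside the Lambda and fails fast when one is missing.
function readConfig(env = process.env) {
  const { BUCKET_NAME, IMAGE_TABLE } = env;
  if (!BUCKET_NAME || !IMAGE_TABLE) {
    throw new Error('BUCKET_NAME and IMAGE_TABLE must be set');
  }
  return { bucketName: BUCKET_NAME, tableName: IMAGE_TABLE };
}
```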
Second: Two new AWS services are called in your application: AWS.S3() and AWS.DynamoDB.DocumentClient(). With the existing configuration your application will fail with an access-denied error (AccessDenied from S3, AccessDeniedException from DynamoDB). To tackle that, you need to add two more policies to the ApiLambdaExecutionRole resource. Check Figure 3, Line 13-14.
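One way to grant that access is a narrowly scoped inline policy. The sketch below is not necessarily what Figure 3 uses: the helper buildImagePolicy is hypothetical, and the list of actions assumes the application only writes and reads single items and only uploads objects.

```javascript
// Builds an IAM policy document for the execution role of the Lambda.
function buildImagePolicy(tableArn, bucketArn) {
  return {
    Version: '2012-10-17',
    Statement: [
      {
        // Needed for DocumentClient.put() and DocumentClient.get().
        Effect: 'Allow',
        Action: ['dynamodb:PutItem', 'dynamodb:GetItem'],
        Resource: tableArn,
      },
      {
        // The pre-signed URL is signed with the role's credentials, so the
        // role itself must be allowed to put objects into the bucket.
        Effect: 'Allow',
        Action: ['s3:PutObject'],
        Resource: `${bucketArn}/*`,
      },
    ],
  };
}
```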
Hint: As soon as you are using a new AWS service (e.g. DynamoDB, S3, etc.) in your application, you will most likely have to add a new policy to your ExecutionRole in order to grant access to that service. Following this small hint saves you a lot of (debugging) time. Trust me.
At any given time the user is able to check the current status of the image processing.
Following this quote from the requirements, you need to set up an API route that returns the latest status of the image processing. It’s as easy as it sounds.
Get the latest data out of AWS DynamoDB (Figure 4, Line 8–20) and return it (Figure 4, Line 22–27).
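A sketch of such a route handler, assuming an API Gateway proxy event with the image id in pathParameters.id and an injected AWS SDK for JavaScript v2 DocumentClient. This is not the code from Figure 4; createStatusHandler, the route parameter and the 404 response are assumptions.

```javascript
// Returns a Lambda-style handler that looks up an image and returns its status.
function createStatusHandler({ documentClient, tableName }) {
  return async function handler(event) {
    const id = event.pathParameters && event.pathParameters.id;

    // Get the latest data out of DynamoDB by hash key.
    const { Item } = await documentClient
      .get({ TableName: tableName, Key: { id } })
      .promise();

    if (!Item) {
      return { statusCode: 404, body: JSON.stringify({ message: 'Image not found' }) };
    }

    // Return the current status of the image processing.
    return { statusCode: 200, body: JSON.stringify({ id: Item.id, status: Item.status }) };
  };
}
```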
In my next article I will explain how to validate uploaded images and how to create thumbnails from them.
That’s it. Feel free to check the GitHub repository and see how it evolves through this article series. Hope I helped you a little bit in the big -serverless- world. Stay tuned!
-Maik