DynamoDB Streams — Using AWS Lambda to Process DynamoDB Streams for Change Data Capture

In this article, we are going to learn how to use AWS Lambda to process DynamoDB Streams for change data capture of DynamoDB tables.

Using AWS Lambda to Process DynamoDB Streams

DynamoDB supports streaming of item-level change data capture records in near-real time. By the end of the article, we will do a hands-on lab: Using AWS Lambda to Process DynamoDB Streams for Change Data Capture.

I have just published a new course — AWS Lambda & Serverless — Developer Guide with Hands-on Labs.

Working with Streams on Amazon DynamoDB

We can build applications that consume DynamoDB streams and take action based on their contents. Many applications benefit from capturing changes to items in a DynamoDB table. The following are some example use cases:

  • A new customer adds data to a DynamoDB table. This event invokes another application that sends a welcome email to the new customer.
  • An application automatically sends notifications to the mobile devices of all friends in a group as soon as one friend uploads a new picture.
  • A mobile app modifies data in a DynamoDB table at a rate of thousands of updates per second, such as the view count of a YouTube video or likes on Instagram pictures.
  • A financial application modifies stock market data in a DynamoDB table. Different applications running in parallel track these changes in real time.

A DynamoDB stream is an ordered flow of information about changes to items in a DynamoDB table. When we enable a stream on a table, DynamoDB captures information about every modification to data items in the table.

Whenever an application creates, updates, or deletes items in the table, DynamoDB Streams writes a stream record with the primary key attributes of the items that were modified.
A stream record contains information about a data modification to a single item in a DynamoDB table. We can configure the stream so that the stream records capture additional information, such as the “before” and “after” images of modified items.

DynamoDB Streams writes stream records in near-real time so that you can build applications that consume these streams and take action based on the contents.

In order to read and process DynamoDB streams, our application must connect to a DynamoDB Streams endpoint and send API requests. A stream consists of stream records. Each stream record represents a single data modification in the DynamoDB table to which the stream belongs. Each stream record is assigned a sequence number, reflecting the order in which the record was published to the stream.

Stream records are organized into groups, or shards. Each shard acts as a container for multiple stream records, and contains information required for accessing and iterating through these records.
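Ordering is guaranteed per shard, so within a batch we can restore order by comparing sequence numbers. Note that sequence numbers are numeric strings that can exceed Number.MAX_SAFE_INTEGER, so a sketch comparison should use BigInt (the record shapes below are minimal, illustrative examples):

```javascript
// Sketch: order a batch of stream records by their sequence numbers.
// Sequence numbers are numeric strings and can be larger than
// Number.MAX_SAFE_INTEGER, so compare them as BigInt values.
function sortBySequenceNumber(records) {
  return [...records].sort((a, b) => {
    const sa = BigInt(a.dynamodb.SequenceNumber);
    const sb = BigInt(b.dynamodb.SequenceNumber);
    return sa < sb ? -1 : sa > sb ? 1 : 0;
  });
}

// Minimal illustrative usage:
const ordered = sortBySequenceNumber([
  { dynamodb: { SequenceNumber: "300" } },
  { dynamodb: { SequenceNumber: "100" } },
  { dynamodb: { SequenceNumber: "200" } },
]);
console.log(ordered.map((r) => r.dynamodb.SequenceNumber)); // [ '100', '200', '300' ]
```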

Using AWS Lambda with Amazon DynamoDB

Amazon DynamoDB is integrated with AWS Lambda so that we can create triggers and automatically respond to events in DynamoDB Streams. With triggers, we can build applications that react to data modifications in DynamoDB tables.

We can enable DynamoDB Streams on a table; after that, whenever an item in the table is modified, a new record appears in the table's stream. AWS Lambda polls the stream and invokes our Lambda function synchronously when it detects new stream records.

So we can use an AWS Lambda function to process records in an Amazon DynamoDB stream. With DynamoDB Streams, we can trigger a Lambda function to perform additional work each time a DynamoDB table is updated. Lambda reads records from the stream in batches and invokes our function synchronously with an event that contains the stream records.

Example DynamoDB Streams record event:

{
  "Records": [
    {
      "eventID": "1",
      "eventVersion": "1.0",
      "dynamodb": {
        "Keys": {
          "Id": { "N": "101" }
        },
        "NewImage": {
          "Message": { "S": "New item!" },
          "Id": { "N": "101" }
        },
        "StreamViewType": "NEW_AND_OLD_IMAGES",
        "SequenceNumber": "111",
        "SizeBytes": 26
      }
    }
  ]
}
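The AWS SDK provides unmarshall() in @aws-sdk/util-dynamodb to convert this attribute-value format into plain objects. A minimal hand-rolled sketch covering only the S and N types used above might look like this:

```javascript
// Minimal sketch: convert a DynamoDB attribute-value map ({ S: ... } / { N: ... })
// into a plain JavaScript object. Real code should prefer unmarshall() from
// @aws-sdk/util-dynamodb, which handles every attribute type.
function simpleUnmarshall(image) {
  const out = {};
  for (const [name, value] of Object.entries(image)) {
    if ("S" in value) out[name] = value.S;
    else if ("N" in value) out[name] = Number(value.N);
    else out[name] = value; // other types left as-is in this sketch
  }
  return out;
}

const item = simpleUnmarshall({
  Message: { S: "New item!" },
  Id: { N: "101" },
});
console.log(item); // { Message: 'New item!', Id: 101 }
```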

Hands-on Lab: Process DynamoDB Streams Using AWS Lambda for Change Data Capture of DynamoDB Tables

Here is the overall architecture of our hands-on lab:

Using AWS Lambda to Process DynamoDB Streams

We basically capture new order create operations in DynamoDB and process the DynamoDB stream using AWS Lambda to perform business logic.

In this case, the business logic could be sending a confirmation e-mail to the customer, but we won't implement it here; I will give you an assignment to perform the business logic.

In this lab, we will create an AWS Lambda trigger to process a stream from a DynamoDB table.

  1. A user inserts, updates, or deletes an item in the DynamoDB order table.
  2. A new stream record is written to reflect that the item in the order table has changed.
  3. The new stream record triggers an AWS Lambda function (newOrderCreatedFunction).
  4. The Lambda function writes to CloudWatch Logs.

Create a DynamoDB Table with a Stream Enabled

The easiest way to manage DynamoDB Streams is through the AWS Management Console. On the DynamoDB console dashboard, choose Tables and then choose Create table.

  • Create a table named order with attributes id, name, status, and date.

On the Exports and streams tab, in the DynamoDB stream details section, choose Enable. In the Enable DynamoDB stream window, choose the information that will be written to the stream whenever the data in the table is modified:

  • Key attributes only — Only the key attributes of the modified item.
  • New image — The entire item, as it appears after it was modified.
  • Old image — The entire item, as it appeared before it was modified.
  • New and old images — Both the new and the old images of the item.

Select — New and old images

When the settings are as you want them, choose Enable stream.
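The same setting can be applied programmatically. This sketch only builds the UpdateTable request parameters matching the console choices above (table name order is from this lab); with the AWS SDK for JavaScript v3 it would be passed to an UpdateTableCommand from @aws-sdk/client-dynamodb:

```javascript
// Sketch: UpdateTable parameters that enable a stream with both images,
// mirroring the console settings chosen above.
const enableStreamParams = {
  TableName: "order",
  StreamSpecification: {
    StreamEnabled: true,
    StreamViewType: "NEW_AND_OLD_IMAGES", // or KEYS_ONLY, NEW_IMAGE, OLD_IMAGE
  },
};

// e.g. client.send(new UpdateTableCommand(enableStreamParams))
console.log(enableStreamParams.StreamSpecification.StreamViewType);
```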

Configure DynamoDB Stream as an Event Source of Lambda Function

We are going to create a Lambda function and configure the DynamoDB stream as an event source of the Lambda function.

If you remember the AWS Lambda invocation types, we use event source mapping (polling) invocations when interacting with DynamoDB streams, because streams use a shard-and-queue mechanism similar to Amazon SQS and Kinesis. So we will create a Lambda function and create an event source mapping between Lambda and the DynamoDB stream.

  • Go to Lambda — Create function — newOrderCreatedFunction

The important thing here is granting the required permissions. Lambda needs the following execution role permissions to manage resources related to our DynamoDB stream. Add them to the function's execution role:

  • dynamodb:DescribeStream
  • dynamodb:GetRecords
  • dynamodb:GetShardIterator
  • dynamodb:ListStreams

The AWSLambdaDynamoDBExecutionRole managed policy includes these permissions, so we can simply attach that policy.
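For reference, a policy granting these permissions would look roughly like the following (this is an illustrative shape, including the CloudWatch Logs permissions every Lambda function needs; check the managed policy itself for the authoritative version):

```json
{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "dynamodb:DescribeStream",
        "dynamodb:GetRecords",
        "dynamodb:GetShardIterator",
        "dynamodb:ListStreams",
        "logs:CreateLogGroup",
        "logs:CreateLogStream",
        "logs:PutLogEvents"
      ],
      "Resource": "*"
    }
  ]
}
```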

To configure the function to read from DynamoDB Streams, we have two options:

1- Add a trigger on the DynamoDB table
2- Add a trigger on the Lambda function

I choose to create the DynamoDB trigger in the Lambda console. Add the trigger to Lambda:

  • To create a trigger: open the Functions page of the Lambda console, choose the name of the function, and under Function overview choose Add trigger. Choose DynamoDB as the trigger type, configure the required options, and then choose Add. Lambda supports several options for DynamoDB event sources.
  • Event source options: select the order DynamoDB table and set the batch size.
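The trigger can also be scripted. This sketch only builds the CreateEventSourceMapping parameters (the account ID and stream ARN are placeholders); with the AWS SDK for JavaScript v3 it would be passed to a CreateEventSourceMappingCommand from @aws-sdk/client-lambda:

```javascript
// Sketch: parameters for mapping the order table's stream to the function.
// The stream ARN below is a placeholder, not a real resource.
const mappingParams = {
  FunctionName: "newOrderCreatedFunction",
  EventSourceArn:
    "arn:aws:dynamodb:us-east-2:123456789012:table/order/stream/2022-06-09T13:42:39.824",
  StartingPosition: "LATEST", // or TRIM_HORIZON to read existing records
  BatchSize: 100,
};

// e.g. client.send(new CreateEventSourceMappingCommand(mappingParams))
console.log(mappingParams.StartingPosition);
```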

With that, we have created a Lambda function and configured the DynamoDB stream as its event source.

Test and Verify DynamoDB Streams Event Source Mapping Lambda Function

We are going to test and verify the DynamoDB Streams event source mapping of the Lambda function.

  • Create item in Order table
  • id — 1
    name — IPhone order
    status — IN_PROGRESS
    date — 12.12.2022

Create the item in the DynamoDB table. Now we can check the Lambda logs for the invocation triggered by DynamoDB Streams:

2022-06-09T13:58:36.902Z caabdc23-b422-44bb-ac1d-94411898bc7d INFO event: {
  "Records": [
    {
      "eventID": "2a55a9ad1b8981ad01184d0d21150e90",
      "eventName": "INSERT",
      "eventVersion": "1.1",
      "eventSource": "aws:dynamodb",
      "awsRegion": "us-east-2",
      "dynamodb": {
        "ApproximateCreationDateTime": 1654783116,
        "Keys": {
          "id": { "S": "2" }
        },
        "NewImage": {
          "date": { "S": "12.12.2022" },
          "name": { "S": "samsung order" },
          "id": { "S": "2" },
          "status": { "S": "IN_PROGRESS" }
        },
        "SequenceNumber": "1300000000000877783207",
        "SizeBytes": 54,
        "StreamViewType": "NEW_AND_OLD_IMAGES"
      },
      "eventSourceARN": "arn:aws:dynamodb:us-east-2:xxx:table/order/stream/2022-06-09T13:42:39.824"
    }
  ]
}

See "eventName": "INSERT" and the details included. If we update the order item, the record will include both New and Old image details.
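Because the record shape differs by event type (REMOVE records carry only an OldImage, for example), a small dispatcher over eventName is a common pattern. A sketch:

```javascript
// Sketch: select the relevant image(s) based on the event type.
// INSERT -> NewImage only, MODIFY -> both, REMOVE -> OldImage only.
function imagesFor(record) {
  switch (record.eventName) {
    case "INSERT":
      return { after: record.dynamodb.NewImage };
    case "MODIFY":
      return { before: record.dynamodb.OldImage, after: record.dynamodb.NewImage };
    case "REMOVE":
      return { before: record.dynamodb.OldImage };
    default:
      return {};
  }
}

// Minimal illustrative usage:
const r = imagesFor({ eventName: "INSERT", dynamodb: { NewImage: { id: { S: "2" } } } });
console.log(r.after.id.S); // 2
```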

As you can see, Lambda polls events from the Amazon DynamoDB stream via the event source mapping, and all the change data capture details arrive in the incoming event object of the Lambda function.

Hands-on Lab: Develop a Lambda Function for the DynamoDB Streams Event Source Mapping

Based on the incoming event, let's develop our Lambda function:

exports.handler = async (event) => {
  console.log("event:", JSON.stringify(event, undefined, 2));

  // Use for...of instead of forEach so that any awaited work inside
  // the loop completes before the handler returns.
  for (const record of event.Records) {
    console.log("Record: %j", record);

    console.log("Table Event : %j", record.eventName);
    // NewImage is present only for INSERT and MODIFY events.
    console.log("Order Status : %j", record.dynamodb.NewImage?.status?.S);
    // TODO : business logic; send email to customer, send SNS notifications
  }
};

We iterate over the incoming records from DynamoDB Streams and process them by logging their details. At this stage, we could also implement real-world use cases such as sending an email to the customer or publishing an SNS notification.
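As one possible direction for that business logic, the function could publish the change to an SNS topic. This sketch only builds the Publish parameters from a record (the topic ARN is a placeholder); with the AWS SDK for JavaScript v3 the object would be passed to a PublishCommand from @aws-sdk/client-sns:

```javascript
// Sketch: build SNS Publish parameters from a stream record.
// The topic ARN is a placeholder, not a real resource.
function buildOrderNotification(record) {
  return {
    TopicArn: "arn:aws:sns:us-east-2:123456789012:order-events",
    Subject: `Order ${record.eventName}`,
    Message: JSON.stringify(record.dynamodb.NewImage),
  };
}

// e.g. await snsClient.send(new PublishCommand(buildOrderNotification(record)))
const params = buildOrderNotification({
  eventName: "INSERT",
  dynamodb: { NewImage: { status: { S: "IN_PROGRESS" } } },
});
console.log(params.Subject); // Order INSERT
```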

Next Step: Amazon SNS fan-out pattern to Amazon SQS

At this point, I would like to give you an assignment that you can develop in your own AWS environment. There is a great article about how to perform ordered data replication between applications by using Amazon DynamoDB Streams.

That article explains how to process DynamoDB streams using AWS Lambda and suggests using fan-out patterns to process them.

There are several fan-out patterns for processing DynamoDB streams:

  • Lambda fan-out pattern
  • Amazon SNS fan-out pattern to Amazon SQS
  • Kinesis Data Streams fan-out pattern

You can pick any of these fan-out patterns as an assignment, but I suggest the second one: the Amazon SNS fan-out pattern to Amazon SQS.

I have already developed a similar example in one of my other articles.

With that, we have successfully completed the hands-on lab: DynamoDB Streams — Using AWS Lambda to Process DynamoDB Streams for Change Data Capture. To see the full development of this hands-on lab, you can check the course below on Udemy.

Step by Step Design AWS Architectures w/ Course

I have just published a new course — AWS Lambda & Serverless — Developer Guide with Hands-on Labs.

In this course, we will learn almost all the AWS serverless services in depth. We are going to build serverless applications using AWS Lambda, Amazon API Gateway, Amazon DynamoDB, Amazon Cognito, Amazon S3, Amazon SNS, Amazon SQS, Amazon EventBridge, AWS Step Functions, and Kinesis streams. This course is 100% hands-on; you will develop a real-world application with hands-on labs, together and step by step.

Source Code

Get the source code from the Serverless Microservices GitHub repository — clone or fork it, and if you like it, don't forget to give it a star. If you find a problem or have a question, you can open an issue directly on the repository.


Mehmet Ozkaya
AWS Lambda & Serverless — Developer Guide with Hands-on Labs

Software Architect | Udemy Instructor | AWS Community Builder | Cloud-Native and Serverless Event-driven Microservices https://github.com/mehmetozkaya