Lambda Event Filtering
Choose what you are really interested in.
Without event filtering, every event/message is sent for consumption by Lambda. This forces inefficient & complex Lambda code that must discard everything except the events it actually cares about (like INSERT & DELETE). At re:Invent 2021, AWS announced that we can filter events when we create an event source mapping for SQS, Kinesis & DynamoDB streams. With this announcement, the architecture diagram of the event source mapping object is as below.
In the above diagram, the source system (SQS, Kinesis, or DynamoDB) & the target Lambda function have been removed to focus only on the mapping object.
From the above diagram, we see that the filter is applied after the polling service and before batching. This has several direct implications:
- The filter reduces the number of events that reach the batching step, so fewer events count towards the batch size & payload limits.
- The filter is after the polling service, which means all events are read from the source. In the case of SQS, unmatched events/messages are marked as processed and deleted, so they are lost. Kinesis and DynamoDB streams are persistent, so we can replay events or attach multiple consumers to process unmatched events. This means that while using event filtering with SQS, we should use a fan-out pattern if we need the capability to process unmatched events.
- Since the filter is after polling, we are charged for all the polling API calls (e.g. SQS ReceiveMessage). Filtering gives no reduction in this cost.
- Since fewer messages are sent to Lambda, processing time is reduced and we don't have to implement filter logic ourselves.
- Applying complex filter types or a larger number of filter rules adds some latency. AWS does not document this overhead; in most cases it is very small, but for latency-sensitive applications it should be benchmarked before productionizing.
We can have up to 5 different rules on one event source mapping. Filter rules use the same syntax as EventBridge rules, as below:
```
{
  "Filters": [
    {
      "Pattern": "<STRING REPRESENTATION OF JSON>"
    }
  ]
}

# EVENT PATTERN IN JSON
{
  "Metadata1": [ rule1 ],
  "Data1": [ rule2 ]
}
```
Note: I have formatted the filter for readability. If we are using the Python CDK, we can use the json.dumps() method to produce the string representation of the pattern.
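To make the string-vs-JSON distinction concrete, here is a minimal sketch of building FilterCriteria in Python. The field names (`body`, `status`, `eventName`) and the queue/function names are illustrative assumptions, not values from this article:

```python
import json

# Hypothetical filter: match only messages whose JSON body has
# status == "active" and eventName == "INSERT".
pattern = {
    "body": {
        "status": ["active"],
        "eventName": ["INSERT"],
    }
}

# Lambda expects Filters[].Pattern as a JSON *string*, not a nested
# object -- hence json.dumps() rather than embedding the dict directly.
filter_criteria = {
    "Filters": [
        {"Pattern": json.dumps(pattern)}
    ]
}

# With boto3, the criteria would then be attached to the mapping, e.g.:
# boto3.client("lambda").create_event_source_mapping(
#     EventSourceArn=queue_arn,       # assumed SQS queue ARN
#     FunctionName="my-function",     # assumed function name
#     FilterCriteria=filter_criteria,
# )

print(filter_criteria["Filters"][0]["Pattern"])
```

Round-tripping the pattern through json.dumps() keeps the mapping definition readable while still satisfying the string-typed Pattern field.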
Full Rule syntax
Note: Lambda evaluates the format of the incoming message body against the format of our filter pattern for body. If there is a mismatch, Lambda drops the message. So we need to ensure that the format of body in our FilterCriteria matches the expected format of body in the messages we receive from SQS. For Kinesis & DynamoDB, we need to ensure that both the data field in the record & the corresponding field in the filter criteria (data, dynamodb) are in JSON format.
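A quick local check can catch this class of mismatch before deployment. This is a small sketch (the helper name is my own, not part of any AWS SDK) that tests whether a message body parses as JSON, i.e. whether a body-level filter pattern could ever match it:

```python
import json

def body_is_json(body: str) -> bool:
    """Return True if an SQS message body parses as JSON.

    If the filter pattern addresses fields inside body but the body is a
    plain string, the formats mismatch and Lambda silently drops the
    message -- so producers and filters must agree on the body format.
    """
    try:
        json.loads(body)
        return True
    except json.JSONDecodeError:
        return False

print(body_is_json('{"status": "active"}'))  # a body-level rule can match this
print(body_is_json('plain text payload'))    # this would be dropped by a body-level rule
```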
Refresher on batching
With batching, we can increase the average number of records passed to the function per invocation. This is helpful in bringing down the number of invocations and optimizing cost. With batching applied, a Lambda function is invoked when any one of the following conditions is met:
- The payload size reaches 6 MB (the maximum payload for Lambda)
- The Batch Window reaches its maximum value
- The Batch Size reaches its maximum value.
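The window and size knobs map directly onto the event source mapping configuration. A minimal sketch (values and the commented-out update call are illustrative assumptions):

```python
# Batching knobs on an event source mapping. BatchSize values above 10
# for SQS require a batching window to be set, so both appear together.
mapping_config = {
    "BatchSize": 100,                      # max records per invocation
    "MaximumBatchingWindowInSeconds": 30,  # max wait before flushing a partial batch
}

# The function is invoked when the 6 MB payload limit, the window,
# or the batch size is reached -- whichever happens first.
# boto3.client("lambda").update_event_source_mapping(
#     UUID=mapping_uuid,  # assumed mapping UUID
#     **mapping_config,
# )

print(mapping_config)
```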
With an SQS source, events are processed as a whole batch, so a failure in a single message marks the entire batch as failed. To avoid that, we can report which message IDs failed: we add ReportBatchItemFailures to the FunctionResponseTypes of the mapping and return the failed IDs in the function's response.
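A handler using this partial-batch response might look like the following sketch. The processing logic (`process`, the `status` field) is a placeholder assumption; the `batchItemFailures` / `itemIdentifier` response shape is what ReportBatchItemFailures expects:

```python
import json

def process(payload):
    # Hypothetical business logic: reject payloads without a "status" field.
    if "status" not in payload:
        raise ValueError("missing status")

def handler(event, context=None):
    # Collect the IDs of messages that failed, so only they are retried;
    # successfully processed messages are deleted from the queue.
    failed = []
    for record in event["Records"]:
        try:
            process(json.loads(record["body"]))  # assumes a JSON body
        except Exception:
            failed.append({"itemIdentifier": record["messageId"]})
    # This response shape only takes effect when ReportBatchItemFailures
    # is enabled on the event source mapping.
    return {"batchItemFailures": failed}

# Simulated batch: the first record succeeds, the second fails.
event = {"Records": [
    {"messageId": "1", "body": '{"status": "active"}'},
    {"messageId": "2", "body": '{"other": 1}'},
]}
print(handler(event))  # {'batchItemFailures': [{'itemIdentifier': '2'}]}
```

Returning an empty `batchItemFailures` list tells Lambda the whole batch succeeded.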
The number of messages in a batch also depends on the following configuration:
- Receive Message Wait Time
- Delivery Delay (SQS)
- Concurrency of Lambda