Dealing with DynamoDb write capacity limits and Lambda timeouts

I came across a couple of libraries that can help you have your cake and eat it too when it comes to API Gateway, Lambda and DynamoDb. (The cake is DynamoDb's effortless scaling; eating it means you can still afford the cake while hitting it with heavy PUT and POST operations.)

This is a resolution for a problem I mentioned in my article, Top 5 things I learned from trying to build a serverless website.

If you’re doing Serverless/API Gateway/DynamoDb like me to aggregate data through your REST endpoints, then you’ve probably run into trouble with your self-imposed rate limit. Yes, if you have massive data then you should use Kinesis to stream it to Elasticsearch, S3 or Redshift. In this case I’m only talking about 900 records that I want to pull from a vendor’s API and then store in my database for easy geolocation searches.

At first I started out with what I thought was reasonable provisioned capacity on DynamoDb: 5 read capacity units and 5 write capacity units. Then I started getting timeout errors from my Lambda. (I was inadvertently swallowing the errors in my Lambda instead of passing them back up to API Gateway.)

For me, I had to upgrade the memory on my Lambda to 512 MB, increase the timeout to one minute, and then throttle my ETL service to process only one PUT at a time. I also had to increase my write capacity units to 50 on DynamoDb. I was hoping this would be enough for a while, because it worked when I tested it, but when I went out this morning to check on my application to see what EventBrite icons were available on my map, I noticed that either the city is not really busy today or else something failed.
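Throttling the ETL to one PUT at a time can be as simple as chaining the writes through a callback, so the next put doesn't start until the previous one finishes. Here's a minimal sketch of that idea; `processSerially` and `putRecord` are hypothetical names, with `putRecord` standing in for whatever function wraps your single-item DynamoDb put:

```javascript
// Process records one at a time so writes never exceed the table's
// provisioned write capacity. putRecord is a hypothetical async
// function: putRecord(record, callback).
function processSerially(records, putRecord, done) {
  var i = 0;
  function next(err) {
    if (err) return done(err);                      // stop on first failure
    if (i >= records.length) return done(null, i);  // all records written
    var record = records[i++];
    putRecord(record, next);                        // one PUT in flight at a time
  }
  next(null);
}
```

This trades throughput for predictability: 900 records at one write each still finish comfortably inside a one-minute Lambda timeout at low capacities, as long as each put is fast.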

I wanted to start with DynamoDb provisioned capacity of 5/5, but to get it working I had to set it to 10/50. Domo arigato, mister roboto. And still that wasn't enough unless I throttled my ETL code.

Sure enough, I go to my Lambda function (PutBanana) and pull up the Monitoring tab and see this.

I see in “Invocation duration” that my average execution time (the green hump) was about 73 milliseconds, with a max (the blue hump) of 534 milliseconds. Then I have 99 errors under the “Invocation errors” graph.

I have 99 errors in my “Invocation errors” box!!! Tell me, where is the love? Where is the love, the love, the love?

So after searching YouTube for this screenshot from The Black Eyed Peas video, I then found two recommended Node libraries for respecting your DynamoDb write capacity units.


The first one is called node-rate-limiter. This library lets you specify a rate limit per second, minute, hour or day. You then gate each of your requests through the limiter, which lets you get away with setting cheap capacities on DynamoDb without getting back the horrible 400 status codes or Lambda timeouts from hammering your endpoints with updates.

The GitHub repo is here

Your implementation might look something like this if your DynamoDb write capacity is set to 5 units per second.

var RateLimiter = require('limiter').RateLimiter;

// Allow 5 requests per second to match the table's 5 write capacity
// units. The limiter also understands 'minute', 'hour', 'day', or a
// number of milliseconds
var limiter = new RateLimiter(5, 'second');

// Throttle requests
limiter.removeTokens(1, function(err, remainingRequests) {
  // err will only be set if we request more than the maximum number of
  // requests we set in the constructor

  // remainingRequests tells us how many additional requests could be
  // sent right this moment

  // safe to issue the DynamoDb put here
});


Still, if other people might use your API, you should have a backup plan for retries.

The author of the above code also recommended oibackoff


This is the GitHub link

You would use oibackoff to retry a failed request with an increasing delay between attempts, using a backoff strategy such as Fibonacci or exponential.
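To give a flavor of the idea without reproducing oibackoff's exact API (check its README for that), here's a minimal hand-rolled sketch of retry-with-Fibonacci-backoff; `fibonacciDelays` and `retryWithBackoff` are hypothetical names, and `fn` stands in for any async operation, such as a DynamoDb put that may come back with a 400:

```javascript
// Build a list of Fibonacci-spaced delays: 0, 1, 1, 2, 3, 5, ... units.
function fibonacciDelays(maxTries, unitMs) {
  var delays = [];
  var a = 0, b = 1;
  for (var i = 0; i < maxTries; i++) {
    delays.push(a * unitMs);
    var next = a + b;
    a = b;
    b = next;
  }
  return delays;
}

// Retry fn until it succeeds or the delays run out. fn(cb) is any
// async operation; done(err, result) fires once, on success or on
// the final failure.
function retryWithBackoff(fn, delays, done) {
  var attempt = 0;
  function tryOnce() {
    fn(function (err, result) {
      if (!err) return done(null, result);
      if (attempt >= delays.length) return done(err);
      setTimeout(tryOnce, delays[attempt++]);
    });
  }
  tryOnce();
}
```

The point of backing off rather than retrying immediately is to give a throttled DynamoDb table room to recover instead of piling more writes onto it.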

I hope this helps you avoid 400 errors and timeouts on API Gateway, Lambda and DynamoDb, and that it lets you keep your dev DynamoDb tables set to low provisioned capacity. Because cheap is a virtue.

Task timed out after 3.00 seconds

^^ the error I was originally getting in my Lambda logs

If you liked this article, please favorite and follow.

I write about my misadventures in building serverless applications as well as other random things that fascinate me.
