Testing AWS Lambda Concurrency Limits (Safety Throttles)
NOTE: I wrote this months ago and it has sat in my drafts since then. It contains some useful info though, so I’m going to post it as-is.
Lambda is pretty close to being a home run. It’s a breeze to configure, use, and maintain.
Don’t know what AWS Lambda is? Here’s a snippet from the docs:
AWS Lambda is a compute service where you can upload your code to AWS Lambda and the service can run the code on your behalf using AWS infrastructure.
To say that a different way: It’s a zero-configuration, zero-maintenance, auto-scaling cloud event handler. Code goes in, data comes out, and from your point of view there are no servers, containers, packages, or anything else.
And seeing is understanding so here’s a very simple node.js Lambda function that randomly succeeds or fails. It’s the cloud/async equivalent of doing `someAsyncFunction(event, function(err, data) { … });`:
exports.handler = function(event, context) {
    // Use the current timestamp as a cheap coin flip.
    var time = new Date().getTime();
    if (time % 2 === 0) {
        // Even millisecond: report success and hand data back to the caller.
        context.succeed({
            yay: 'it succeeded. this data gets passed back to the caller, depending on how the function was invoked'
        });
    }
    else {
        // Odd millisecond: report the invocation as failed.
        context.fail(new Error('fail invocation with an error'));
    }
};

It’s a joy to use except for a couple of problems: one is merely annoying, the other is a more serious concern.
node.js v0.10.x
With node.js being my current language of choice, I’m happy to see Lambda supporting it. However, Lambda is currently stuck at v0.10.x which is very, very old.
Since node.js v0.10.x was released, the node.js band broke up AND got back together AND formed a non-profit foundation AND jumped to v4.x. Also v0.12 was released.
tl;dr: v0.10 is old enough to be a real problem as some libraries move to ES6.
An old node.js version is annoying at most. The Concurrency Limits, however, are a more serious concern in my eyes.
Safety Throttles
Concurrency Limits, or Safety Throttles as the Lambda documentation calls them, cap how many executions of a Lambda function can be in flight at the same time.
That’s not completely accurate because as with everything AWS, there are some nuances to how your current concurrency is actually calculated.
I’m going to skip over all that though. For now, just know that it’s not a hard limit and you can burst well above it. Additionally, you can easily request an increase once you’re ready for production.
Once you hit that limit though, your Lambda function invocations will start getting throttled.
If your account exceeds the safety throttle at any time […] your functions get throttled […] If Lambda functions are invoked synchronously, it returns a throttling error (error code 429). If Lambda functions are invoked asynchronously and are throttled, they are retried for up to 15–30 minutes […]
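For the synchronous case, the caller has to deal with the 429 itself. Here is a minimal sketch of one way to do that: retry with exponential backoff. It assumes a current Node.js on the calling side (not the v0.10 runtime inside Lambda), a caller-supplied `invoke` function that returns a Promise, and that a throttled call rejects with an error carrying `statusCode === 429`. The names `isThrottled`, `backoffDelayMs`, and `invokeWithRetry` are made up for illustration and are not part of any AWS SDK.

```javascript
// Assumption: a throttled synchronous invocation surfaces as an error
// object with statusCode 429, matching the docs quoted above.
function isThrottled(err) {
  return Boolean(err) && err.statusCode === 429;
}

// Exponential backoff with a cap: 100, 200, 400, ... ms, never above maxMs.
function backoffDelayMs(attempt, baseMs = 100, maxMs = 5000) {
  return Math.min(maxMs, baseMs * Math.pow(2, attempt));
}

const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

// `invoke` is a hypothetical function (payload) => Promise supplied by the
// caller, e.g. a thin wrapper around whatever SDK call you use.
// `wait` is injectable so the delay can be stubbed out in tests.
async function invokeWithRetry(invoke, payload, maxAttempts = 5, wait = sleep) {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await invoke(payload);
    } catch (err) {
      const lastAttempt = attempt === maxAttempts - 1;
      // Only throttling is retried; any other error is passed straight up.
      if (!isThrottled(err) || lastAttempt) throw err;
      await wait(backoffDelayMs(attempt));
    }
  }
}
```

Asynchronous (Event) invocations are retried by Lambda itself according to the docs, so a wrapper like this only makes sense for synchronous callers.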
The short of it is you have one big pool of Lambda concurrency that all your functions share per-account per-region.
That’s right, be careful. If you’re developing and testing a Lambda function be sure to do it in an isolated AWS region or else it could lead to production functions getting throttled.
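To make the shared pool concrete, here is a toy model of it. This is only an illustration of the "one pool per account per region" idea, not how Lambda is actually implemented: it treats the limit as a hard cap and ignores bursting, and the `ConcurrencyPool` class, the limit of 100, and the function names are all invented for the example.

```javascript
// Toy model: every function in the account/region draws from one pool.
class ConcurrencyPool {
  constructor(limit) {
    this.limit = limit;
    this.inFlight = 0;
    this.byFunction = new Map(); // function name -> running executions
  }

  // Returns true if the execution may start, false if it is throttled.
  tryStart(name) {
    if (this.inFlight >= this.limit) return false;
    this.inFlight += 1;
    this.byFunction.set(name, (this.byFunction.get(name) || 0) + 1);
    return true;
  }

  // Marks one execution of `name` as finished, freeing a slot in the pool.
  finish(name) {
    const running = this.byFunction.get(name) || 0;
    if (running === 0) return;
    this.byFunction.set(name, running - 1);
    this.inFlight -= 1;
  }
}

const pool = new ConcurrencyPool(100);

// A runaway experiment tries to start 150 long-running executions...
let started = 0;
for (let i = 0; i < 150; i++) {
  if (pool.tryStart('dev-experiment')) started += 1;
}

// ...and an unrelated production function in the same pool is rejected too.
const prodAccepted = pool.tryStart('prod-checkout');
console.log(started, prodAccepted); // 100 false
```

The production function did nothing wrong here. It simply shares a counter with the experiment, which is why isolating test workloads matters.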
Yikes! Throttling is scary! What exactly does it mean though? I decided to conduct some tests and find out.
Test Code
in my tests, i could burst above 100. my invocation type was Event. the docs mention bursting for s3 but it seems to apply here too.
when throttled, invoking from the console would sometimes return a 429 and sometimes not. invocations were definitely more likely to succeed from the console, even when throttled.
no way to abort execution. deleting function did not immediately clear up throttling. shortly after deletion though, the functionality did return but it was unclear if this was coincidence or deletion does in fact abort execution of existing long-running invocations. i did not re-run the test.
even when throttled, an invoke with invocation type Event does in fact get accepted (as documented). i did not test whether these do in fact get retried for 15–30m.
when throttled, RequestResponse invocations would succeed at a seemingly predictable rate of about 1 in 4.
i’d guess that throttled-success-rate of 25% goes down the more functions you have in a throttled state.
even with only one function being throttled though, every other function was immediately throttled so this seems less like “might” — as the docs state — and more like “will”. this might possibly be related to the time of day i was testing? maybe off-peak times they’re more lenient.
functions created after throttle is active are still throttled.
things work well and very much as advertised during regular use. it’s the atypical use/situations that give me pause, which i detail in AWS Lambda: Unsafe at Any Scale.
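For anyone wanting to measure something like the throttled success rate themselves, a burst test could be sketched roughly as follows. This is an illustration only, not the code behind the notes above. It assumes a current Node.js, a caller-supplied `invoke` function returning a Promise, and that throttling shows up as an error with `statusCode === 429`. The `burst` helper and the stub are hypothetical.

```javascript
// Fire `count` invocations at once and tally how each one ended.
// `invoke` is a hypothetical (index) => Promise supplied by the caller.
async function burst(invoke, count) {
  const outcomes = await Promise.all(
    Array.from({ length: count }, (_, i) =>
      invoke(i).then(
        () => 'ok',
        // Assumption: throttling surfaces as an error with statusCode 429.
        (err) => (err && err.statusCode === 429 ? 'throttled' : 'failed')
      )
    )
  );

  const tally = { ok: 0, throttled: 0, failed: 0 };
  for (const outcome of outcomes) tally[outcome] += 1;
  tally.successRate = count === 0 ? 0 : tally.ok / count;
  return tally;
}

// Stub standing in for a real invocation: every 4th call succeeds and the
// rest are rejected as throttled. Made-up behaviour, only to exercise burst().
const stubInvoke = (i) =>
  i % 4 === 0 ? Promise.resolve({}) : Promise.reject({ statusCode: 429 });

burst(stubInvoke, 8).then((tally) => console.log(tally));
```

Swapping the stub for a real synchronous invocation would give an actual success rate for whatever state the account is in at the time.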
Here’s the code I used to trigger throttling along with the function code.