Lambda programming errors that could cost you thousands of dollars a day!

This is a real story, one that is still unfolding. Within a week we accumulated a bill close to $10K due to a programming error. I am writing this to help others be better prepared and avoid the kind of mistake that can run up a bill of several thousand dollars.


We started SLAppForge a few months back to build high-productivity development tools for Serverless computing. We launched Sigma, the browser-based IDE for Serverless computing, in February, and got hundreds of users trying out our platform in beta. We were happily working on new features and testing, and writing many articles and samples about Serverless computing on AWS, the platform we support at the moment. On the 17th of May, I got a call from the AWS Business Development team for Sri Lanka (located in Bangalore, India) about general matters, after which I was asked about our AWS bill for the month, which was over $7,500!

As the founder of a startup, with my personal credit card on file with AWS, my heart skipped a beat. I was worried and hurt, since my first thought was that someone had hacked into our account and used it for some illegal purpose.


Finding out the culprit — a Lambda function!

Our whole team went into panic mode, trying to find out what happened. We realized that the culprit was a Lambda function we had written ourselves under our testing account.

We had configured a Lambda to trigger from CloudWatch Events, using the S3 API Call via CloudTrail configuration. For this to work, we had enabled CloudTrail to write events into an S3 bucket. Unfortunately, we had mistakenly configured the trail to log S3 events for all buckets, which included the bucket the trail itself writes its logs to. As a result, a loop was created wherein every CloudTrail log write (which results in an S3 API call) generated another event, and that continued exponentially!
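
This kind of misconfiguration can be checked for mechanically. Below is a minimal sketch, assuming the trail uses basic event selectors in the shape CloudTrail returns them (a list of selectors, each with `DataResources` entries holding a `Type` and a list of ARN-prefix `Values`), and assuming, as I understand the CloudTrail documentation, that the values `arn:aws:s3` and `arn:aws:s3:::` stand for every bucket in the account. The function name `logs_its_own_bucket` and the bucket names are hypothetical, and the check is deliberately conservative: it flags any selector that touches the log bucket at all.

```python
def logs_its_own_bucket(event_selectors, log_bucket):
    """Return True if S3 data-event logging covers the trail's own log bucket.

    event_selectors: list of dicts shaped like CloudTrail basic event selectors.
    log_bucket: name of the bucket the trail delivers its log files to.
    """
    own_arn = f"arn:aws:s3:::{log_bucket}"
    all_buckets = {"arn:aws:s3", "arn:aws:s3:::"}  # assumed "every bucket" values

    for selector in event_selectors:
        # Log delivery is a write, so read-only selectors cannot form the loop.
        if selector.get("ReadWriteType", "All") == "ReadOnly":
            continue
        for resource in selector.get("DataResources", []):
            if resource.get("Type") != "AWS::S3::Object":
                continue
            for value in resource.get("Values", []):
                if value in all_buckets:
                    return True
                # Conservative: the whole log bucket, or any prefix inside it.
                if value.rstrip("/") == own_arn or value.startswith(own_arn + "/"):
                    return True
    return False
```

Running such a check before enabling a trail would have flagged our configuration, since the selector covered all buckets while the trail was delivering logs to one of them.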

Additionally, the above-mentioned Lambda got invoked for each of these S3 events, resulting in a huge number of invocations as well. The daily cost of our test account, which had been under $5, shot up to around $1,400 per day!

Trying to shut down

Unfortunately, there is no enable/disable switch for Lambda functions within AWS, and there is no account lockdown button. We didn’t want to delete the function right away, since we wanted to keep it around to analyze the issue, read the logs, and support a possible forensic investigation. So we disabled the triggers that invoke the function and thought that should prevent further invocations.


Still getting charged!

Looking at the dashboards, the executions went down, and we thought we had nailed it. We opened a case with AWS to alert them and request a review of the billing. The following day, we found out that our account was still getting charged, and the Lambda function was executing again! We asked AWS to lock down our account, since the charges had accumulated further to around $10K. Then we figured out that it could be due to throttled events that had been queued up earlier getting replayed, even though the triggers were disabled.


Disabling the Lambdas

In the end, we iterated through all our Lambda functions and set the concurrent execution limit of each to zero, which finally seemed to stop the billing.
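
For anyone who needs to do the same in a hurry, here is a minimal sketch of that step. It is an illustration rather than the exact script we ran. It assumes a boto3 Lambda client is passed in (boto3 installed, credentials configured) and uses the client's `list_functions` paginator and `put_function_concurrency` call; the function name `throttle_all_functions` is hypothetical.

```python
def throttle_all_functions(lambda_client):
    """Set reserved concurrency to 0 on every function the client can list.

    lambda_client: a boto3 Lambda client (or any object with the same
    get_paginator / put_function_concurrency methods).
    Returns the names of the functions that were throttled.
    """
    throttled = []
    paginator = lambda_client.get_paginator("list_functions")
    for page in paginator.paginate():
        for function in page.get("Functions", []):
            name = function["FunctionName"]
            # Reserved concurrency of 0 means every new invocation is throttled.
            lambda_client.put_function_concurrency(
                FunctionName=name,
                ReservedConcurrentExecutions=0,
            )
            throttled.append(name)
    return throttled
```

With boto3 this would be called as `throttle_all_functions(boto3.client("lambda"))`. Lambda is a regional service, so the call has to be repeated with a client for each region in use. Once the cause is fixed, `delete_function_concurrency` removes the limit again.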


This can happen to you too!

I’m quite sure this could happen to anybody, and it’s a very unfortunate situation to get into, especially if you are a freelancer, an individual user, or a startup. I’m also sure there are many other combinations that could land you in a costly billing loop. Handing your credit card to AWS, with billing occurring only at the end of the month, is extremely dangerous, since AWS does not enforce a limit on your account spending.

Any AWS account could get compromised, be hit by a Denial of Service attack, or fall victim to a genuine developer error like this. Unlike with EC2 and similar services, Lambda costs can be very unpredictable: the only thing that prevented our account from incurring even more each day was the limit of 1000 concurrent invocations, which I wish were much lower by default.

If not for the call from the AWS Business Development team, we would have kept incurring thousands of dollars more each day until the billing period ended several days later.


What I wish AWS would do!

Here is my humble wish list for AWS to please consider:

  • Allow users to set a hard limit on billing (e.g. the account spending limit on Facebook), after which AWS services would stop serving our requests. (Budgets are not enough, and they should be on by default, with at least a warning at, say, $200 of spending.)
  • Allow prepaid credit, and deduct from this instead for individual accounts and SMEs. This will also be a hard limit which the user does not want to overrun.
  • Charge the credit card in chunks, say for every $250 incurred, without waiting for the month end and tens of thousands of dollars in accumulated billing. This way, at least the bank will block the card and alert the user, even if the person concerned is on a flight when something bad happens.
  • Set the default concurrent execution limit to 5 or 10, and increase it only after an explicit request from a user, so that they would be aware of the possible effects.
  • Have an easily accessible account LockDown button somewhere on the AWS console to stop the chargeable services at once in an emergency.
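
Until something like a hard limit exists, a warning is the closest thing available. Here is a minimal sketch of the $200 warning from the first wish. It only builds the arguments for CloudWatch's `put_metric_alarm` call and does not stop any service. It assumes billing alerts are enabled on the account; `billing_alarm_params`, the alarm name, and the SNS topic ARN are hypothetical.

```python
def billing_alarm_params(threshold_usd, sns_topic_arn):
    """Build put_metric_alarm arguments for an estimated-charges warning.

    threshold_usd: month-to-date charge (in USD) above which to alert.
    sns_topic_arn: hypothetical SNS topic that delivers the notification.
    """
    return {
        "AlarmName": f"estimated-charges-over-{threshold_usd:g}-usd",
        "Namespace": "AWS/Billing",
        "MetricName": "EstimatedCharges",
        "Dimensions": [{"Name": "Currency", "Value": "USD"}],
        "Statistic": "Maximum",
        "Period": 21600,  # six hours; billing metrics update only a few times a day
        "EvaluationPeriods": 1,
        "Threshold": float(threshold_usd),
        "ComparisonOperator": "GreaterThanThreshold",
        "AlarmActions": [sns_topic_arn],
    }
```

The result would be passed as `boto3.client("cloudwatch", region_name="us-east-1").put_metric_alarm(**params)`, since billing metrics are published in us-east-1. Because the metric is updated only a few times a day, an alarm like this narrows the window but would not have caught our loop within the hour.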

Finally,

  • I believe something like this can be easily detected by AWS itself using AI capabilities for anomaly detection. I believe it’s not difficult to trigger an event that issues a warning to the user, followed by an automatic lockdown of the account within an hour or two. For many users like us, a denial of service is far preferable to a bill of thousands of dollars :(