How to Monitor AWS Lambda with Zero Overhead

With Thundra, you can gather useful insights about your AWS Lambda functions, such as detecting errors and performance degradations. Thundra gives you the flexibility to instrument your code either manually or with an automated approach that requires no code changes. Once your code is instrumented, the monitoring data has to be sent to Thundra for evaluation. Thundra gives you two options for sending it:

  • Send it synchronously just before function execution ends.
  • Write the monitoring data to CloudWatch and let Thundra handle the rest.

Let’s review both approaches and go through the advantages and disadvantages of each.

Both are easy to configure through environment variables, and you don’t have to change any code to switch between a manual approach, an automated approach, or no monitoring at all. To send monitoring data synchronously, you just make an HTTP call to Thundra’s collector inside your Lambda function. However, we consider this synchronous approach risky. An HTTP call can fail for numerous reasons. If your Lambda runs inside an Amazon Virtual Private Cloud (VPC), or if the Thundra collector is unresponsive at that moment, you may lose your monitoring data or, even worse, your Lambda function may fail to execute because the HTTP call fails.
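To make the risk concrete, here is a minimal Python sketch of the synchronous pattern. It is not Thundra's agent code: the `COLLECTOR_URL` endpoint, the `monitored` wrapper, and the payload shape are all hypothetical and exist only for illustration.

```python
import json
import time
import urllib.request

# Hypothetical collector endpoint, for illustration only.
COLLECTOR_URL = "https://collector.example.com/monitoring-data"


def post_to_collector(payload, timeout_s=1.0):
    """Blocking HTTP POST; raises on timeouts, DNS failures, or HTTP errors."""
    request = urllib.request.Request(
        COLLECTOR_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout_s) as response:
        response.read()


def monitored(handler, send=post_to_collector):
    """Wrap a handler so monitoring data is sent just before it returns."""
    def wrapper(event, context):
        started = time.monotonic()
        result = handler(event, context)
        payload = {"duration_ms": (time.monotonic() - started) * 1000.0}
        try:
            # The invocation is still running (and being billed) during this call.
            send(payload)
        except Exception:
            # Without this guard, a failed send would fail the whole invocation.
            # With it, the monitoring data for this invocation is simply lost.
            pass
        return result
    return wrapper
```

Even in this guarded form, the handler cannot return until the send finishes or times out, so the network round trip is always part of the invocation.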

Making an HTTP call to Thundra’s collector also increases the execution time of your Lambda function because of network latency. This directly increases the billed duration and, indirectly, your costs. We couldn’t leave our customers with only this option, so we came up with a solution.

Our suggestion is to take advantage of our asynchronous option for sending monitoring data.

Here’s a diagram showing how asynchronous monitoring works with Thundra:

Asynchronous monitoring structure of Thundra

With asynchronous sending, Thundra instructs your Lambda function to write the monitoring data to CloudWatch as logs. This triggers Thundra’s Lambda, “thundra-monitor-cw”, which sends those logs to Thundra’s collector. This approach can be applied to any Lambda function, independent of programming language.
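The flow above can be sketched in Python as follows. This is an illustration under stated assumptions, not the actual thundra-monitor-cw implementation: the `MARKER` prefix and both function names are hypothetical, since this post does not show the log format Thundra writes. The event layout (`awslogs.data` holding base64-encoded, gzip-compressed JSON with a `logEvents` list) is the standard shape of a CloudWatch Logs subscription event delivered to a Lambda.

```python
import base64
import gzip
import json

# Hypothetical marker; the real log format Thundra writes is not shown in this post.
MARKER = "MONITORING_DATA "


def write_monitoring_log(data):
    """Function side: in Lambda, anything printed to stdout lands in CloudWatch Logs."""
    print(MARKER + json.dumps(data))


def extract_monitoring_data(event):
    """Forwarder side: decode a CloudWatch Logs subscription event."""
    compressed = base64.b64decode(event["awslogs"]["data"])
    payload = json.loads(gzip.decompress(compressed))
    records = []
    for log_event in payload.get("logEvents", []):
        message = log_event["message"]
        # Skip platform lines such as START/END/REPORT and ordinary application logs.
        if message.startswith(MARKER):
            records.append(json.loads(message[len(MARKER):]))
    return records
```

The monitored function only pays for a local write to stdout. Decoding and forwarding happen in a separate invocation of the forwarder, so network problems there cannot affect the monitored function.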

With asynchronous monitoring, you don’t have to worry that your function will fail because of us, since the Thundra Lambda that sends the data runs independently of yours. The most important advantage of asynchronous monitoring is that writing logs to CloudWatch adds only negligible overhead to your Lambda. We call this “zero overhead”. Learn more about the advantages of asynchronous monitoring by reading our detailed blog post.

To achieve asynchronous monitoring, subscribe the thundra-monitor-cw Lambda to your Lambda’s log group. Setting up the thundra-monitor-cw Lambda is straightforward, and you can automate the setup using Thundra’s deployment tool or the Serverless Framework. Check out our documentation to set up asynchronous monitoring for your environment.
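If you want to see what the subscription step amounts to, here is a hedged Python sketch. It is not Thundra's deployment tool: `subscribe_forwarder` is a hypothetical helper, the filter name is arbitrary, and it assumes your function logs to the default `/aws/lambda/<function-name>` log group. The `put_subscription_filter` call is the CloudWatch Logs API operation as exposed by a boto3 `logs` client.

```python
def subscribe_forwarder(logs_client, function_name, forwarder_arn):
    """Subscribe a forwarder Lambda to another function's log group.

    logs_client is expected to behave like boto3.client("logs").
    """
    # Assumption: the function uses Lambda's default log group name.
    log_group = "/aws/lambda/" + function_name
    logs_client.put_subscription_filter(
        logGroupName=log_group,
        filterName="thundra-monitor-cw",  # arbitrary name chosen for this sketch
        filterPattern="",                 # an empty pattern forwards every log event
        destinationArn=forwarder_arn,
    )
    return log_group
```

In practice you would pass `boto3.client("logs")` as `logs_client` and the ARN of your deployed thundra-monitor-cw function as `forwarder_arn`. CloudWatch Logs also needs permission to invoke the forwarder Lambda before the subscription can be created, which the automated setup options are meant to handle for you.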

PROVING OUT OUR ZERO-OVERHEAD CLAIM

To prove that we really add zero overhead to the execution of your Lambda, we conducted an experiment. Our experiment used the naive-hi-handler Lambda to compare the effects of asynchronous versus synchronous monitoring at three different levels of monitoring, both within and across AWS regions. Here are the monitoring levels we included in our testing:

  • No Monitoring: We ran the Lambda without any monitoring to establish a baseline.
  • Basic: We gathered only invocation duration, any errors, and logs.
  • Advanced: We enabled every possible monitoring option of Thundra, such as monitoring local values, request and response, line-by-line tracing, and JDBC and AWS SDK integration. Our goal was to produce as much monitoring data as we could in an effort to increase communication durations as much as possible.

Our experiment therefore consisted of testing five different configurations:

  • No monitoring
  • Basic monitoring and synchronous communication
  • Basic monitoring and asynchronous communication
  • Advanced monitoring and synchronous communication
  • Advanced monitoring and asynchronous communication

We also wanted to test the effects of synchronous versus asynchronous monitoring across different regions. Our assumption was that sending monitoring data synchronously across regions would increase Round-Trip Time (RTT) and, with it, the cost of running your Lambdas.

We ran the Thundra collector in the Oregon (us-west-2) region and ran tests of the naive-hi-handler located in the following regions:

  • Oregon (us-west-2)
  • N. California (us-west-1)
  • London (eu-west-2)

Here are the results of our experiment. Keep in mind that the business logic of our test Lambda, which is written in Java, takes 100 ms on average with 1536 MB of memory.

Experiment results for zero overhead

Our experiment shows that asynchronous monitoring adds a minimal overhead of ~2 ms to the invocation of your function. This is negligible, especially considering that the overhead doesn’t increase when the Thundra collector and your AWS Lambda function are not located in the same region.

On the other hand, synchronous monitoring adds ~25–30 ms of overhead when the Thundra collector and your Lambda are in the same region, and this overhead only gets worse as the distance between the Lambda and the collector increases.
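To put these overheads in perspective, the sketch below turns them into relative slowdown and an approximate cost per million invocations. The 100 ms baseline, the 1536 MB memory size, and the ~2 ms and ~30 ms overheads come from the experiment above. The price per GB-second and the per-millisecond billing granularity are assumptions made for illustration, so check current AWS pricing before relying on the dollar figures.

```python
MEMORY_GB = 1536 / 1024             # memory size used in the experiment
BASELINE_MS = 100.0                 # average business-logic duration in the experiment
PRICE_PER_GB_SECOND = 0.0000166667  # assumed price in USD; check current AWS pricing


def cost_per_million(duration_ms):
    """Approximate USD cost of one million invocations, assuming per-millisecond billing."""
    gb_seconds = MEMORY_GB * (duration_ms / 1000.0) * 1_000_000
    return gb_seconds * PRICE_PER_GB_SECOND


def relative_overhead(overhead_ms):
    """Overhead as a fraction of the baseline duration."""
    return overhead_ms / BASELINE_MS


for label, overhead_ms in [("async", 2.0), ("sync, same region", 30.0)]:
    total = cost_per_million(BASELINE_MS + overhead_ms)
    print(f"{label}: +{relative_overhead(overhead_ms):.0%}, ${total:.2f} per million")
```

Under these assumptions, the asynchronous overhead is about 2% of the baseline duration, while same-region synchronous sending at the upper end of the measured range is about 30%.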

Our experiment reinforces our original hypothesis. We recommend asynchronous monitoring as the best option for monitoring your Lambda without slowing down its execution. With asynchronous monitoring, you also avoid the risk of your Lambda function failing due to complications with transferring the monitoring data.

Try it out! Take advantage of serverless monitoring with zero overhead, and sign up for the beta to give us feedback on how we can make Thundra better!