AWS Lambda: Faster Is Cheaper

CPU-bound functions execute faster AND cost less as CPU power increases

I decided to dig into AWS Lambda by writing a test harness that executes the same CPU-bound function many times, comparing the runtime duration across varying amounts of CPU power. I wanted to ask the question: does it cost the same amount to run the same compute task with varying levels of CPU?

AWS Lambda offers various allocations of memory, from 128 MB at the low end to a maximum of 1536 MB. Lambda’s pricing is based on the amount of memory allocated to the function multiplied by the time consumed. As memory increases, so does CPU power (doubling the memory causes the CPU to double), so a CPU-bound function will run faster. The higher CPU power will cost more but the time consumption will cost less. What happens to the effective cost in this case? Does it remain the same as CPU power increases and time consumption decreases?
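To make the question concrete, the pricing rule above can be written as a small Go function. This is a minimal sketch, not the harness's actual code: the name computeCost and the pricePerGBSecond constant are assumptions for illustration, the rate should be checked against current AWS pricing, and per-request charges and billing-increment rounding are ignored. The durations in main are hypothetical and halve exactly with each doubling of memory, which is the naive expectation.

```go
package main

import "fmt"

// pricePerGBSecond is an assumed rate used only for illustration;
// check the current AWS Lambda pricing page before relying on it.
const pricePerGBSecond = 0.00001667

// computeCost returns the duration-based charge for one invocation:
// allocated memory (in GB) times seconds consumed times the rate.
func computeCost(memoryMB int, seconds float64) float64 {
	return float64(memoryMB) / 1024 * seconds * pricePerGBSecond
}

func main() {
	// Hypothetical durations: every doubling of memory exactly halves the time.
	mem, secs := 128, 20.0
	for i := 0; i < 4; i++ {
		fmt.Printf("%4d MB  %5.2f s  $%.8f\n", mem, secs, computeCost(mem, secs))
		mem *= 2
		secs /= 2
	}
}
```

Under that assumption every row prints the same cost, so any measured difference in cost has to come from durations that do not scale exactly with memory.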

The Compute Task

I wanted a function that taxes the CPU without stretching memory or waiting for I/O. Enter the Sieve of Eratosthenes, a classic algorithm for finding all prime numbers up to a given limit. I created a Lambda function containing this algorithm and configured it to calculate all prime numbers up to one million.
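For readers unfamiliar with the algorithm, here is a minimal Go sketch of the Sieve of Eratosthenes. It is an illustration only: the function name sieve is my own, and it is not necessarily the code (or the language) deployed in the Lambda functions.

```go
package main

import "fmt"

// sieve returns all primes <= limit using the Sieve of Eratosthenes.
func sieve(limit int) []int {
	if limit < 2 {
		return nil
	}
	composite := make([]bool, limit+1) // composite[n] is true once n is known not to be prime
	var primes []int
	for n := 2; n <= limit; n++ {
		if composite[n] {
			continue
		}
		primes = append(primes, n)
		if n > limit/n {
			continue // n*n exceeds limit, so there is nothing left to mark
		}
		// Start at n*n: smaller multiples were already marked by smaller primes.
		for m := n * n; m <= limit; m += n {
			composite[m] = true
		}
	}
	return primes
}

func main() {
	fmt.Println(sieve(30))           // prints [2 3 5 7 11 13 17 19 23 29]
	fmt.Println(len(sieve(1000000))) // prints 78498
}
```

The work is almost entirely loops over an in-memory slice, which is why it suits a test that should stress the CPU rather than memory or I/O.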

Lambda lets you configure the memory size of each function. I pasted this Sieve of Eratosthenes algorithm into four Lambda functions with the following memory sizes: 128, 256, 512 and 1024 MB. Each function would perform the exact same calculation, with the same memory consumption, but with different CPU power (the CPU power doubles when the memory size doubles). Therefore we would expect the 256 MB function to perform the calculation in half the time of the 128 MB function. Since Lambda pricing is based on (memory * time), we would expect the effective cost of performing the calculation to remain the same while getting it done in half the time. I set out to confirm this.

The Test Harness

I wanted a test harness that invoked the four Lambda functions repeatedly and with high concurrency. I wanted to run as many parallel executions as I could without being throttled by AWS (the default max concurrency of Lambda for a new AWS account is 100). The reason for stretching for maximum concurrency was so I could have fun doing concurrent programming in Golang and so my test would run faster.
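One common way to cap concurrency in Go is a buffered channel used as a counting semaphore. The sketch below shows that pattern under stated assumptions: runLimited and the fake workload are hypothetical names, the real harness may be structured differently, and limit must be at least 1.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// runLimited calls task(i) for every i in [0, n), keeping at most limit calls
// in flight at once, and returns the wall-clock duration of each call.
func runLimited(n, limit int, task func(i int)) []time.Duration {
	durations := make([]time.Duration, n)
	slots := make(chan struct{}, limit) // buffered channel acting as a counting semaphore
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		slots <- struct{}{} // blocks while limit calls are already running
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			defer func() { <-slots }() // free the slot when the call ends
			start := time.Now()
			task(i)
			durations[i] = time.Since(start) // each goroutine writes only its own index
		}(i)
	}
	wg.Wait()
	return durations
}

func main() {
	// Stand-in for an HTTP call to a Lambda function (hypothetical workload).
	fake := func(i int) { time.Sleep(10 * time.Millisecond) }
	d := runLimited(200, 80, fake)
	fmt.Println("completed calls:", len(d))
}
```

Keeping the limit below the account's concurrency ceiling leaves headroom, so the test measures execution time rather than throttling behavior.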

The test harness runs each function many times, capturing the duration of each run. These runs vary in actual duration, which I suspect is a result of the multi-tenant nature of Lambda. The average duration is calculated and output at the end of the test.
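Because individual runs vary, it can help to look at the spread as well as the average. The following is a small sketch of that aggregation, assuming durations are collected as seconds in a slice; the summarize function and the sample values are hypothetical and are not measurements from this test.

```go
package main

import "fmt"

// summarize returns the mean, lowest, and highest of a set of run durations
// given in seconds. It returns zeros for an empty slice.
func summarize(secs []float64) (mean, lo, hi float64) {
	if len(secs) == 0 {
		return 0, 0, 0
	}
	lo, hi = secs[0], secs[0]
	sum := 0.0
	for _, s := range secs {
		sum += s
		if s < lo {
			lo = s
		}
		if s > hi {
			hi = s
		}
	}
	return sum / float64(len(secs)), lo, hi
}

func main() {
	// Hypothetical sample durations, in seconds.
	mean, lo, hi := summarize([]float64{1.8, 2.1, 1.9, 2.4})
	fmt.Printf("avg %.2fs (min %.2fs, max %.2fs)\n", mean, lo, hi)
}
```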

The Result

I ran the test harness to calculate all prime numbers up to 1,000,000 and to do this 5,000 times per function (four functions: 128, 256, 512 and 1024 MB). The test harness limited the maximum concurrency so as not to get throttled by AWS.

The following table and chart show the average execution time and effective cost for each of the four functions, where each one was invoked 5,000 times. As we see, the effective cost goes down as CPU power increases, while the execution time at the highest memory size is less than one tenth of that at the lowest memory size.

Function    Avg Execution Time    Effective Cost
128 MB      20.8 sec              $0.217
256 MB      9.8 sec               $0.206
512 MB      4.4 sec               $0.182
1024 MB     1.9 sec               $0.162

In conclusion, increasing the CPU power of a CPU-bound Lambda function reduces the duration and, in this test, also reduces the cost. Faster is cheaper in this case.
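The reason the cost falls rather than staying flat is visible in the averages themselves: each doubling of memory gave slightly more than a 2x speedup (roughly 2.1x to 2.3x), so the billable memory * time per run shrinks at every step. The Go sketch below works through that arithmetic using the average durations from the test output; the type and function names are my own and it leaves out the price per GB-second, which is the same for every row.

```go
package main

import "fmt"

// result pairs a memory size with the average duration measured for it.
type result struct {
	memoryMB   int
	avgSeconds float64
}

// reported holds the average durations from the test output in this article.
var reported = []result{
	{128, 20.771653},
	{256, 9.834174},
	{512, 4.350742},
	{1024, 1.928931},
}

// gbSeconds is the billable quantity for one run: memory in GB times duration.
func gbSeconds(r result) float64 {
	return float64(r.memoryMB) / 1024 * r.avgSeconds
}

// speedup is how many times faster b ran than a.
func speedup(a, b result) float64 {
	return a.avgSeconds / b.avgSeconds
}

func main() {
	for i, r := range reported {
		fmt.Printf("%4d MB  %.3f GB-s per run", r.memoryMB, gbSeconds(r))
		if i > 0 {
			fmt.Printf("  %.2fx faster than the previous size", speedup(reported[i-1], r))
		}
		fmt.Println()
	}
}
```

Had every speedup been exactly 2x, the GB-seconds column would be constant and the cost would not change with memory size.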

The Code

The code for both the Lambda function and the test harness, as well as a detailed report, is here:

Test harness invocation and output

This was the command line used to invoke the test:

go run main.go -execs 5000

This was the output:

Triggering 4 Lambda functions 5000 times each, all in parallel
Each function will loop 1 time(s) and in each loop calculate all primes <=1000000
function 128mb returned status code: 504
function 128mb returned status code: 504
function 128mb returned status code: 504
function 128mb returned status code: 504
Number of lambda executions returning errors: 4
Stats for each Lambda function by Lambda memory allocation:
128mb 20.771653sec(avg) $0.217241(total) to calculate 4996 times all prime numbers <=1000000
256mb 9.834174sec(avg) $0.205920(total) to calculate 5000 times all prime numbers <=1000000
512mb 4.350742sec(avg) $0.182317(total) to calculate 5000 times all prime numbers <=1000000
1024mb 1.928931sec(avg) $0.161776(total) to calculate 5000 times all prime numbers <=1000000
Total cost of this test run: $0.767254

This test took 40 minutes to run. The test harness limited the concurrency to 80, meaning that no more than 80 Lambda functions were executing at the same time.

The 128 MB function encountered a timeout from the API Gateway four times. The API Gateway has a timeout of 30 seconds. These four failures out of 5,000 invocations (0.08%) are few enough that we can disregard them when analyzing the test results.