How to configure custom docker images for AWS lambda functions (ubuntu to run bash/shell scripts on linux and node nodejs)

Aug 28, 2023

AWS provides a set of lambda runtimes and steps to implement your own. Nevertheless, setting up custom images to execute bash/shell scripts and very recent versions of nodejs via API Gateway v2 using json payloads proved to be more challenging than expected.

Whys and wherefores

There are multiple reasons for building a custom runtime for lambda:

Flexibility #1: The number of runtimes supported by AWS is limited to a few languages and versions. You might want to use a newer version not yet supported or an older version no longer supported.
Flexibility #2: Many frameworks, packages, libraries, tools, and commands are not available in the supported runtime languages. It can be much faster to develop a function with a runtime that is well suited to certain types of operations than with another programming or scripting language. For example, the kubectl program, which sends commands to a kubernetes cluster, has no equivalent in nodejs or python.
Simplicity: Following on from the last point, even when a binary program like kubectl is copied into and called from a supported runtime, running the command can become much more complicated than anticipated once you deal with piping, loading a profile, redirecting stdin/stdout/stderr, encoding/decoding special characters, managing async callbacks, and handling output and errors.
Familiarity: DevOps engineers are typically more comfortable with bash/shell scripting than with traditional programming languages.
Cheaper: Lambdas are billed for the duration they run. Provisioning an EC2 instance, whether configured by commands at launch or set up in advance with a baked AMI, might incur more costs. Also, when the project has no long-running instances, RunCommand or kubernetes jobs would be overkill for work that lasts only a few seconds or minutes.

Side notes on Flexibility #1: It takes time for AWS to bring newly released versions of nodejs and other frameworks to lambda runtimes. Here are my observations:

For major versions: Around 7 months. The last upgrade was to nodejs 18.x, announced on November 18, 2022, although nodejs 18 was released 7 months earlier on April 19, 2022. The current version is 20.x (released on April 18, 2023), but no nodejs20.x runtime is available for lambda yet.
For minor versions: More than 6 weeks. The current LTS is 18.17.1 (released on August 9, 2023). AWS lambdas created with nodejs18.x are currently using version 18.16.1 (released on June 20, 2023).
New versions might not be supported right away in AWS SAM, CloudFormation, and AWS CDK, which could delay adoption when your project's CI/CD pipelines rely on Infrastructure as Code.

Proof of Concept

Lambdas are awesome for running short-lived, non-real-time background work. They can be called ad hoc via the API Gateway, directly with the Function URL, through the lambda API, and from the Console. They can also be scheduled using CloudWatch Events rules or be triggered by a multitude of events. I created a minimal PoC with custom images using API Gateway v2 (awsCustomLambdas). It includes 5 functions: 3 based on docker images (public.ecr.aws/lambda/nodejs:18, node:20-bookworm, and ubuntu:22.10) and 2 deployed as zip packages on different runtimes (provided.al2 and nodejs18.x).

bonjour: public.ecr.aws/lambda/nodejs:18 (image)
salut: node:20-bookworm-slim (image)
hello: ubuntu:22.10 (image)
coucou: provided.al2 (runtime & zipped code)
howdy: nodejs18.x (runtime & zipped code)

Straightforward architecture

The user makes a POST request to the API Gateway, which calls the lambda function. The lambda pulls the image from the ECR repository and executes with a dedicated IAM role.
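To make the request flow concrete, here is a minimal Node.js sketch of a handler sitting behind an API Gateway v2 proxy integration. It assumes payload format version 2.0, where the POST body arrives as a string in event.body and may be Base64-encoded. The names decodeBody, respond, and handleEvent, as well as the fields of the reply, are illustrative assumptions and not code from the PoC.

```javascript
// Illustrative sketch only: function names and reply fields are assumptions,
// not code taken from the PoC.

// Decode the request body delivered by API Gateway v2 (payload format 2.0).
// Returns the parsed value, {} when there is no body, or null for invalid JSON.
function decodeBody(event) {
  if (!event || typeof event.body !== 'string' || event.body === '') return {};
  const raw = event.isBase64Encoded
    ? Buffer.from(event.body, 'base64').toString('utf8')
    : event.body;
  try {
    return JSON.parse(raw);
  } catch (err) {
    return null;
  }
}

// Build a proxy response. The body must be a string, so the data is serialized.
function respond(statusCode, data) {
  return {
    isBase64Encoded: false,
    statusCode,
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify(data),
  };
}

// Synchronous core of the handler, kept separate so it is easy to test locally.
function handleEvent(event) {
  const payload = decodeBody(event);
  if (payload === null) {
    return respond(400, { error: 'request body is not valid JSON' });
  }
  return respond(200, { greeting: 'bonjour', received: payload });
}

// Lambda entry point (CommonJS export, guarded so the file also loads elsewhere).
const handler = async (event) => handleEvent(event);
if (typeof module !== 'undefined') module.exports = { handler };
```

Keeping the logic in a plain function such as handleEvent makes it possible to exercise the handler locally with a hand-written event before building and pushing an image.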

Here are the steps involved in setting up the above design.

There are 4 scripts to set up and 3 to test. It should take less than 5 minutes to have your first lambda working.

build_image.sh takes 12 to 260 seconds (see details of timing below)
create_role.sh takes 10s
create_lambda.sh takes 6s
create_api.sh takes 22s

Timing and image details related to script build_image.sh

Only the nodejs18.x runtime with zipped code has initialization and execution durations fast enough to be considered for customer-facing apps. The high latency of the others makes them good candidates only for non-interactive backend workloads, i.e. scheduled or event processing.

Durations and memory usage for each lambda type
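One way to collect numbers like these is to parse the REPORT line that Lambda writes to CloudWatch Logs at the end of each invocation. The sketch below assumes the usual layout of that line (Duration, Billed Duration, Memory Size, Max Memory Used, plus Init Duration on cold starts). parseReportLine is a hypothetical helper, and the sample line is made up for illustration; it is not a measurement from the PoC.

```javascript
// Hypothetical helper, not part of the PoC: extracts timing and memory
// figures from a Lambda REPORT log line.
const REPORT_FIELDS = {
  // The lookbehind keeps this from matching "Billed Duration" or "Init Duration".
  durationMs: /(?<!Billed |Init )Duration:\s*([\d.]+)\s*ms/,
  billedDurationMs: /Billed Duration:\s*([\d.]+)\s*ms/,
  initDurationMs: /Init Duration:\s*([\d.]+)\s*ms/, // only present on cold starts
  memorySizeMb: /Memory Size:\s*(\d+)\s*MB/,
  maxMemoryUsedMb: /Max Memory Used:\s*(\d+)\s*MB/,
};

// Returns an object with one number per field (null when the field is absent),
// or null when the line is not a REPORT line.
function parseReportLine(line) {
  if (typeof line !== 'string' || !line.startsWith('REPORT')) return null;
  const result = {};
  for (const [name, pattern] of Object.entries(REPORT_FIELDS)) {
    const match = line.match(pattern);
    result[name] = match ? Number(match[1]) : null;
  }
  return result;
}

// Made-up sample line, for illustration only.
const sampleReport =
  'REPORT RequestId: 00000000-0000-0000-0000-000000000000\t' +
  'Duration: 12.34 ms\tBilled Duration: 13 ms\tMemory Size: 128 MB\t' +
  'Max Memory Used: 65 MB\tInit Duration: 150.50 ms';
console.log(parseReportLine(sampleReport));
```

Comparing initDurationMs across the image-based and zip-based functions is what separates cold-start cost from execution cost.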

Challenges encountered

Defining which script should be configured as entrypoint and command in the Dockerfile. Lambda would report Runtime.InvalidEntrypoint, Runtime.ImportModuleError, or simply exit with code 2. Playing with the Image configuration overrides in the Console saved me a lot of debugging time.
API Gateway v2 is leaner and more difficult to debug than its predecessor. The biggest problem I spent effort on was getting status code 500 when calling with curl even though the lambda returned 200 in the Console. The root cause was the encoding of the response body in json. The documentation does not provide examples more complex than returning a simple string. When a bash/shell script returns a json response, passing the json as a plain string works when the lambda is invoked directly but fails when the call comes from API Gateway v2. The quotes inside the body need to be escaped, or the payload needs to be encoded in Base64. An example of a valid payload is '{ "isBase64Encoded": false, "statusCode": 200, "body": "{ \"allo\": \"hehe\" }" }'. API Gateway v2 only has access logs, which makes debugging a nightmare. Note the response will be encoded in the reply even if the response was not encoded. More on the topic in my re:Post question AWS Custom Lambda returns status OK but API Gateway v2 fails with 500 (see the accepted answer).
Finding the values for the lambda Triggers and Permissions when the integration is set up as AWS_PROXY was not obvious.
Configuring the Function URL via the Console is much simpler than figuring out the related AWS CLI commands. The Console is able to add a StringEquals condition for lambda:FunctionUrlAuthType to the policy statement, but I could not find an equivalent option in the CLI command aws lambda add-permission.
The documentation is vague on AWS CLI commands like "aws apigatewayv2 create-stage", whose option "--access-log-settings" has a "Format=string" field with no sample payload. In the Console it is simple to configure a json format; unfortunately the same cannot be said about the CLI.
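The body-escaping issue from the API Gateway v2 item above can be sketched in Node.js. The snippet wraps the raw json text printed by a script into the response shape shown in the valid payload example, either with the inner quotes escaped or with the body encoded in Base64. wrapScriptOutput is a hypothetical helper name; a bash/shell script would need to produce the same escaped string by its own means.

```javascript
// Hypothetical helper, not part of the PoC: wraps the text a script printed
// on stdout into a proxy response serialized as a single JSON string.
// JSON.stringify escapes the quotes inside the body, which is the step that
// is easy to miss when building the response by hand in a shell script.
function wrapScriptOutput(stdoutText, { statusCode = 200, base64 = false } = {}) {
  const body = base64
    ? Buffer.from(stdoutText, 'utf8').toString('base64')
    : stdoutText;
  return JSON.stringify({ isBase64Encoded: base64, statusCode, body });
}

const scriptOutput = '{ "allo": "hehe" }';

// Escaped variant:
console.log(wrapScriptOutput(scriptOutput));
// {"isBase64Encoded":false,"statusCode":200,"body":"{ \"allo\": \"hehe\" }"}

// Base64 variant: the body is opaque text and isBase64Encoded is true.
console.log(wrapScriptOutput(scriptOutput, { base64: true }));
```

In both variants the body field is a string rather than a nested json object, which matches the valid payload given above.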

Final thoughts

Completing the PoC unlocked more power for lambda functions thanks to various custom runtimes. AWS has extensive documentation on its products and features, and yet I wish there were more examples of different commands and payloads. API Gateway v2 has great potential but needs a better user experience with debugging capabilities.