This is a follow-up to our previous article on Managing multi-environment serverless architecture using AWS API Gateway. To see an implementation of the approach described in this article, see the sample project on GitHub.
Error handling in microservices or serverless architectures can be tricky. Different components may be integrated using different protocols and built using different stacks, yet any client-facing error response should honour the same API contract.
In this article we are going to look at a simple serverless To-Do app built using SAM (Serverless Application Model), AWS CloudFormation, API Gateway and Lambda functions written in Go.
Our goal is to meet the following requirements in relation to error handling:
- Consistent error responses, regardless of the error’s type or origin
- Clear separation of external and internal interfaces
- No leaking of private error details
Client facing errors
This is our API contract for error responses. Clients expect any non-2XX response to contain an application/json body with a code and a message.
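As a rough illustration of that contract, the error body can be modelled as a small Go type. This is a sketch only: the field names code and message are inferred from the description in this article, and the ErrorResponse type, the encode helper and the sample values are hypothetical names introduced for the example.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ErrorResponse is a hypothetical Go view of the client-facing error body.
// The JSON field names are assumptions based on the article's description.
type ErrorResponse struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// encode renders the error body as a JSON string, falling back to a
// hardcoded (assumed) internal error body if encoding fails.
func encode(e ErrorResponse) string {
	b, err := json.Marshal(e)
	if err != nil {
		return `{"code":"INTERNAL_ERROR","message":"Internal error"}`
	}
	return string(b)
}

func main() {
	fmt.Println(encode(ErrorResponse{Code: "NOT_FOUND", Message: "Task not found"}))
	// prints {"code":"NOT_FOUND","message":"Task not found"}
}
```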
AWS API Gateway
Let’s investigate the request flow with AWS API Gateway and AWS Lambda. As we can see below, there are two sources for errors, the API Gateway itself (Gateway Responses) and the integration function (Integration Responses).
Gateway Responses represent errors that occur before reaching the integration (such as access control errors, internal configuration errors, etc), or when the integration response cannot be mapped to a method response. These can be customised to fit our error schema by using a simple mapping template. Here’s the relevant section of our CloudFormation template.
AWS provides a full list of response types that can be used to define response mappings.
For the purpose of this example we chose to map only the catch-all DEFAULT_4XX and DEFAULT_5XX types. In the case of 4XX errors we return error.responseType as our code and error.messageString as our message, which provides validation and access control error details to clients. In the case of 5XX errors, however, we hardcode the code and message to avoid leaking internal configuration error details to clients.
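The gateway applies this policy through mapping templates rather than application code, but the rule itself is simple enough to model in Go. The sketch below only illustrates the decision; the gatewayErrorBody function, the gatewayError type and the hardcoded INTERNAL_ERROR values are assumptions made for the example, not part of the template.

```go
package main

import "fmt"

// gatewayError is a hypothetical stand-in for the client-facing error body.
type gatewayError struct {
	Code    string
	Message string
}

// gatewayErrorBody models the policy described above: 4XX errors pass the
// gateway's response type and message through to the client, while anything
// else is replaced with a hardcoded body so that no internals leak.
func gatewayErrorBody(status int, responseType, messageString string) gatewayError {
	if status >= 400 && status < 500 {
		return gatewayError{Code: responseType, Message: messageString}
	}
	// Assumed hardcoded values; the real template may use different ones.
	return gatewayError{Code: "INTERNAL_ERROR", Message: "Internal error"}
}

func main() {
	// A 4XX error keeps its details.
	fmt.Printf("%+v\n", gatewayErrorBody(403, "ACCESS_DENIED", "User is not authorized"))
	// A 5XX error has its details replaced.
	fmt.Printf("%+v\n", gatewayErrorBody(502, "INTEGRATION_FAILURE", "misconfigured role"))
}
```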
The strategy for handling errors returned by a Lambda function is dependent on how the function is integrated with the endpoint, which can be done using either a proxy integration or a custom integration. The former proxies HTTP requests to the Lambda whereas the latter decouples the function from the original HTTP request further and completely relies on request and response mappings.
Our Lambda handlers use the custom integration type. This allows us to write handlers with clearly defined inputs and outputs, without any knowledge of the HTTP request initially made to API Gateway.
Not only does this keep the function simple, it also makes it easier to invoke and test our command with a simple event payload and inspect the result. We found sam-local to be a useful tool during development and leverage it for integration testing, as will be explained later.
Lambda error responses
AWS Lambda uses its own error schema which can later be inspected and modified by API Gateway. This is why the following Lambda handler may not do what you expect.
Instead of outputting a simple error string, the Go Lambda runtime wraps the message string in a custom error type which results in the following response.
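To make the wrapping concrete, here is a minimal sketch that imitates it without depending on the Lambda runtime. The getTask handler is hypothetical, and the wrap function only approximates what the runtime does (message taken from Error(), type taken from the Go type name); the real envelope may carry additional fields.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"reflect"
)

// getTask is a hypothetical handler in the custom-integration style:
// plain input, plain output, no HTTP details.
func getTask(id string) (string, error) {
	return "", errors.New("task not found")
}

// invocationError approximates the envelope produced for a failed
// invocation (fields beyond these two are omitted).
type invocationError struct {
	ErrorMessage string `json:"errorMessage"`
	ErrorType    string `json:"errorType"`
}

// wrap imitates the runtime: the message comes from Error() and the type
// from the name of the error's underlying Go type.
func wrap(err error) invocationError {
	t := reflect.TypeOf(err)
	if t.Kind() == reflect.Ptr {
		t = t.Elem()
	}
	return invocationError{ErrorMessage: err.Error(), ErrorType: t.Name()}
}

func main() {
	_, err := getTask("42")
	b, _ := json.Marshal(wrap(err))
	fmt.Println(string(b))
	// prints {"errorMessage":"task not found","errorType":"errorString"}
}
```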
However, we want to retain error codes and reliably separate private and public error details. To do that, our handler needs to return a structured error whose Error() method produces a JSON-encoded string, resulting in a json-within-json integration response.
We introduced a lambdaError type, meant to be used by a Lambda handler function to wrap errors before returning them.
Our internal error schema contains a code, a public_message and a private_message. The code will be useful for matching on the gateway later, the public_message is a human-readable string that does not leak any technical details, and the private_message is the detailed error string.
Invoking our handler now returns the following error response.
This is not particularly elegant, but it’s the only way to return a structured error message when using custom Lambda integration in API Gateway.
Response mappings in API Gateway
A response received from the Lambda function then gets mapped by API Gateway in order to conform to our external API contract. To map errors we rely on integration response mappings, which use regular expressions to match error codes.
Our example app’s matching strategy is to first create a matching rule for the absence of an error (success response), then rules for any expected errors that can be proxied to the client, and finally a catch-all for unexpected errors that simply maps to a hardcoded 500 Internal error.
Below is a snippet from the template’s integration response mapping that deals with ‘not found’ errors.
The regular expression matches a specific error code inside the errorMessage string value (which happens to be our JSON-encoded structured error), maps it to a 404 status code, and uses the Velocity Template Language to decode the JSON string and fit it to our expected output shape. The private message is left out of the response but still gets logged.
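The gateway does this work with selection patterns and a VTL template, but the same logic can be modelled in Go to show what happens to the payload. The sketch below is an approximation under stated assumptions: the NOT_FOUND code, the pattern strings, the rule and clientError types and the hardcoded internal error body are all illustrative, and the success rule is left out for brevity.

```go
package main

import (
	"encoding/json"
	"fmt"
	"regexp"
)

// rule pairs a selection pattern with the HTTP status it maps to.
type rule struct {
	pattern *regexp.Regexp
	status  int
}

// rules are tried in order; the last one is the catch-all.
var rules = []rule{
	{regexp.MustCompile(`(?s)^.*"code":"NOT_FOUND".*$`), 404},
	{regexp.MustCompile(`(?s)^.*$`), 500},
}

// clientError is the external shape returned to clients.
type clientError struct {
	Code    string `json:"code"`
	Message string `json:"message"`
}

// internalError is the assumed hardcoded body for unexpected errors.
var internalError = clientError{Code: "INTERNAL_ERROR", Message: "Internal error"}

// mapError picks the first matching rule for the errorMessage string, then
// decodes the nested JSON and keeps only the public fields.
func mapError(errorMessage string) (int, clientError) {
	for _, r := range rules {
		if !r.pattern.MatchString(errorMessage) {
			continue
		}
		if r.status == 500 {
			return 500, internalError
		}
		var inner struct {
			Code          string `json:"code"`
			PublicMessage string `json:"public_message"`
		}
		if err := json.Unmarshal([]byte(errorMessage), &inner); err != nil {
			return 500, internalError
		}
		return r.status, clientError{Code: inner.Code, Message: inner.PublicMessage}
	}
	return 500, internalError
}

func main() {
	msg := `{"code":"NOT_FOUND","public_message":"Task not found","private_message":"no row for task 42"}`
	status, body := mapError(msg)
	out, _ := json.Marshal(body)
	fmt.Println(status, string(out))
	// prints 404 {"code":"NOT_FOUND","message":"Task not found"}
}
```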
The resulting error returned by the gateway to our client is just as we expect.
The most important thing to note here is that we can manipulate the integration errors in any way we see fit, for example by renaming an internal code to RESOURCE_NOT_FOUND if that’s what the client expects.
Below is the flow diagram from before, now annotated with the error payloads at different stages:
The good news is that we were able to fulfil our error handling requirements using AWS API Gateway and Lambda, but we did find that aspects of the solution have some drawbacks.
The json-within-json hack used to return a structured error and the regex matching both seem particularly brittle. This could perhaps be ameliorated by the introduction of support for a protocol such as gRPC on the integration side.
In the end the trade-off is some clunkiness vs the time that it would take to roll out your own gateway service.
One more thing: Integration testing with SAM Local
Full integration testing including all gateway mappings, validation, authorisers, etc. would have to be done in a testing CloudFormation stack. However, SAM Local helps considerably with testing the Lambda responses and gives us a sense of how our handler command will behave when executed in the real Lambda environment.
Using sam local invoke we can execute a handler cmd in a local environment by providing it with an event file similar to what API Gateway would send it.
We decided to automate this process by writing tests that execute sam local invoke with a prebuilt cmd binary and check the response payloads.
You can find our helper invoke function here.
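For readers who just want the general idea, a helper of this kind might look roughly like the sketch below. It is not the helper from the sample project: the function names, the GetTaskFunction logical ID and the file names are hypothetical, and it assumes that sam is on the PATH and writes the function result to standard output.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"os/exec"
)

// invokeCommand builds (but does not run) a `sam local invoke` command for
// the given function, event file and template.
func invokeCommand(function, eventFile, template string) *exec.Cmd {
	return exec.Command("sam", "local", "invoke", function,
		"--event", eventFile, "--template", template)
}

// invocationError mirrors the error envelope returned by a failed handler.
type invocationError struct {
	ErrorMessage string `json:"errorMessage"`
	ErrorType    string `json:"errorType"`
}

// parseError reports whether the output is an error envelope and, if so,
// returns it decoded.
func parseError(out []byte) (invocationError, bool) {
	var e invocationError
	if err := json.Unmarshal(bytes.TrimSpace(out), &e); err != nil {
		return invocationError{}, false
	}
	return e, e.ErrorMessage != ""
}

// invoke runs the command and returns whatever it wrote to standard output.
func invoke(cmd *exec.Cmd) ([]byte, error) {
	var stdout bytes.Buffer
	cmd.Stdout = &stdout
	if err := cmd.Run(); err != nil {
		return nil, err
	}
	return stdout.Bytes(), nil
}

func main() {
	cmd := invokeCommand("GetTaskFunction", "event.json", "template.yaml")
	fmt.Println(cmd.Args)

	// Parsing a sample payload instead of running sam, which may not be installed.
	e, isErr := parseError([]byte(`{"errorMessage":"task not found","errorType":"errorString"}`))
	fmt.Println(isErr, e.ErrorMessage)
}
```

In a test, the output of invoke would be passed to parseError and the decoded errorMessage compared against the expected structured error.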
Note: SAM Local is also able to bring up a local API Gateway. However, custom integration is currently unsupported, which meant that we couldn’t use that functionality.
- Sample App Source
- AWS Lambda Go
- AWS SAM Local
- Dave Cheney’s error package
- VTL Reference
- Set up API Gateway Request and Response Data Mappings https://docs.aws.amazon.com/apigateway/latest/developerguide/mappings.html
- API Gateway Mapping Template Reference https://docs.aws.amazon.com/apigateway/latest/developerguide/api-gateway-mapping-template-reference.html
- Set up Gateway Responses to Customize Error Responses https://docs.aws.amazon.com/apigateway/latest/developerguide/customize-gateway-responses.html
- Domain errors discussion https://softwareengineering.stackexchange.com/a/351062
- API Gateway pattern http://microservices.io/patterns/apigateway.html