Building BAM! Serverless Framework
BAM! is a serverless framework that makes it quick (hence, the name) and easy to get small applications up and running using Node.js and Amazon Web Services (AWS). It is optimized for the deployment of AWS Lambda functions integrated with Amazon API Gateway endpoints, and on average, it takes only 20 seconds for initial deployment of a dependency-free lambda. BAM! also allows for the creation of Amazon DynamoDB tables, which can be used to persist data between lambda invocations.
In comparison to other technologies, the biggest benefits of serverless are, arguably, the rapidity of iteration and the abstraction of infrastructure management. The irony is that the experience of working with serverless technologies often lacks the very simplicity the idea of serverless computing promises to deliver. As a result, BAM! prioritizes speed and convenience for the developer.
This article will examine the design decisions we made while building the BAM! framework, as well as our solutions to a variety of challenges we faced.
Amazon Web Services
Amazon is by no means the only provider of serverless technologies. There are many others such as Microsoft, Google, and IBM; however, with over 1 million active customers, AWS currently enjoys prominence among engineers working with the cloud¹. We therefore built BAM! for use with AWS.
AWS provides a veritable litany of cloud-based services; however, we designed BAM! to interact with four in particular: AWS Lambda, Amazon API Gateway, AWS IAM, and Amazon DynamoDB.
Introduced in November 2014, AWS Lambda is Amazon’s take on Functions as a Service (FaaS), providing an event-driven, serverless computing platform, which runs functions in response to events while automatically handling the provisioning of required resources². To accomplish this, AWS will spin up a new server instance within a container as needed, and tear down idle instances after a period of inactivity to avoid waste.
Using AWS Lambda, it is possible to code some small piece of functionality on one’s local machine and deploy it to the cloud, whereafter it can be triggered by events such as a call to an API endpoint from anywhere in the world. BAM! was built to make this process easier. While AWS Lambda supports a variety of languages and runtimes, BAM! supports Node.js 8.10.
Amazon API Gateway
Far and away one of the most common event sources encountered in cloud computing is the hitting of an API endpoint³, and the creation, management, and hosting of APIs is precisely what Amazon’s API Gateway service was designed to handle.
As the name suggests, API Gateway most often acts as the outward, app-facing component of Amazon’s cloud infrastructure. Engineers can use it to build anything from a fully RESTful, secure backend to a slim, lightweight interface for interacting with a single service. BAM! was designed specifically in the latter context of generating API Gateway endpoints that are integrated with AWS Lambda functions.
One common thread permeating the whole of Amazon’s cloud infrastructure is the necessity for managing access permissions between various cloud services, which is where the AWS Identity and Access Management (IAM) service comes into play. IAM allows developers to securely control authentication and authorization for use of resources at a highly granular level.
One final service BAM! supports is DynamoDB, Amazon’s NoSQL key-value and document storage service. We selected DynamoDB because its lack of rigid schema is a natural means of providing flexibility to users.
The services mentioned above are by no means an exhaustive list, but rather a brief discussion of those most relevant to the BAM! framework.
Challenges of AWS
While AWS is an industry leader in the new and exciting field of serverless computing, using its services is certainly not without its share of difficulties. Chief among these is a high degree of complexity, which steepens the learning curve for developers new to the technology and can prove to be burdensome in a more seasoned engineer’s day-to-day experience with the platform. There are over 150 AWS services⁴, each with its own configuration and jargon the developer needs to be familiar with. Moreover, many tasks that seem discrete are actually composed of several micro-tasks under the hood.
We will be walking through an example to see just how complex working with AWS can be. For context, here is the scenario:
Suppose a co-worker wants to view company data, such as a sales report, and has come to you, the developer, for help. While you could manually access this data and send it to them, or write a script to produce the relevant data and manually distribute it, these options can be tedious, especially if you are frequently asked to retrieve a variation of this information. Instead, with the AWS software development kit (SDK), you can create a lambda function integrated with an API gateway. The lambda will have access and logic to process the data and the endpoint will be configured to call the lambda and display the output in a browser. This way, you will be able to give your co-worker the endpoint and enable them to retrieve the data at their convenience without disrupting your workflow.
Manual Deployment with AWS
Before you can even send a request to AWS, it is necessary to perform several local operations to prepare a deployment package for the lambda. This includes confirming your AWS credentials are properly set up; deciding what configuration details to use (e.g. profile and region); creating a properly formatted, local Node.js file for the lambda function, which should contain exports.handler; installing any node package dependencies; zipping all relevant files into a deployment package; and verifying the lambda name does not conflict with any existing resources.
Now, the deployment package is ready to be sent to AWS. Start by using the AWS Lambda createFunction() method. This method takes several parameters including the zip file you just created and an IAM role. The lambda must assume an existing IAM role, which determines what permissions the lambda will have to interact with other AWS services. Then, create an API Gateway object using the createRestApi() method. It would be simple and intuitive if those were the only two steps, but there are still several more steps to complete before a callable endpoint exists.
At this point, the API Gateway will only consist of a root resource representing the root path of the API. If you want your gateway to support access to path parameters within the lambda function code, you must create an additional resource for the greedy path. The greedy path matches the path portion of the URL after the root slash (/). To accomplish this, call the Amazon API Gateway method createResource(). Even this operation is complex in that you need to retrieve the resource ID of the API’s root resource using the getResources() method and supply it as a parameter to that call.
To review, the lambda, API gateway, and path resources now exist. The following three steps are needed to add each HTTP method to the root path and integrate the API resource and associated methods with the lambda. Note: these steps will need to be repeated for the greedy path.
First, use the AWS Lambda addPermission() method to give permission to the API Gateway so that it can invoke the lambda function. Next, use the Amazon API Gateway putMethod() method to add the HTTP method to the resource. Finally, call the putIntegration() method to complete the integration between the lambda and the resource.
Generally speaking, integrations provide a way to take an incoming HTTP request to an API gateway and pass it along to some other AWS service, possibly with some intermediate processing. There are a number of integration types, including AWS_PROXY, also known as "Lambda Proxy". If you want to expose as much of the HTTP request data as possible to your lambda, you will want to choose Lambda Proxy.
As mentioned before, the integration will need to be repeated for each HTTP method on the greedy path.
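As a rough illustration of the three per-method steps, the helper below builds the parameter objects that addPermission(), putMethod(), and putIntegration() would receive in version 2 of the AWS SDK for JavaScript. The function name, the StatementId format, the choice of no authorization, and the sample identifiers in the usage note are assumptions made for this sketch; they are not taken from BAM! itself.

```javascript
// Hypothetical helper: builds the parameters for the three SDK calls needed
// to wire one HTTP method on one API Gateway resource to a lambda.
function buildIntegrationParams({
  region, accountId, restApiId, resourceId, lambdaName, httpMethod, path,
}) {
  const lambdaArn = `arn:aws:lambda:${region}:${accountId}:function:${lambdaName}`;

  return {
    // Step 1, lambda.addPermission(): lets API Gateway invoke the function.
    addPermission: {
      FunctionName: lambdaName,
      StatementId: `${restApiId}-${resourceId}-${httpMethod}`, // assumed naming scheme
      Action: 'lambda:InvokeFunction',
      Principal: 'apigateway.amazonaws.com',
      SourceArn: `arn:aws:execute-api:${region}:${accountId}:${restApiId}/*/${httpMethod}${path}`,
    },
    // Step 2, apigateway.putMethod(): adds the HTTP method to the resource.
    putMethod: {
      restApiId, resourceId, httpMethod, authorizationType: 'NONE',
    },
    // Step 3, apigateway.putIntegration(): connects the method to the lambda.
    putIntegration: {
      restApiId,
      resourceId,
      httpMethod,
      type: 'AWS_PROXY',
      integrationHttpMethod: 'POST', // API Gateway always invokes Lambda with POST
      uri: `arn:aws:apigateway:${region}:lambda:path/2015-03-31/functions/${lambdaArn}/invocations`,
    },
  };
}
```

Calling it with placeholder values such as region 'us-east-1', account '123456789012', and path '/' yields three objects that could each be handed to the matching SDK method and awaited in order.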
Finally, call the createDeployment() method to bundle all of these resources, methods, and integrations into one deployment resource.
Only now is the REST API endpoint ready to be called.
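To make the ordering of these calls concrete, here is a minimal sketch of the whole sequence. It assumes version 2 of the AWS SDK for JavaScript, with the Lambda and API Gateway clients passed in rather than constructed, so nothing here touches a real account. The function name, the integrate callback (a stand-in for the three per-method steps), the index.handler entry point, and the stage name are all hypothetical; {proxy+} is API Gateway's syntax for a greedy path variable.

```javascript
// Sketch of the manual deployment sequence with injected SDK v2 clients.
// `integrate` is a hypothetical callback that performs addPermission,
// putMethod, and putIntegration for one HTTP method on one resource.
async function deployManually({ lambda, apigateway, integrate }, options) {
  const {
    name, zipBuffer, roleArn, region, httpMethods, stageName,
  } = options;

  await lambda.createFunction({
    FunctionName: name,
    Runtime: 'nodejs8.10',
    Handler: 'index.handler', // assumes the lambda file is named index.js
    Role: roleArn,
    Code: { ZipFile: zipBuffer },
  }).promise();

  const { id: restApiId } = await apigateway.createRestApi({ name }).promise();

  // A new API contains only the root resource ("/"); look up its ID.
  const { items } = await apigateway.getResources({ restApiId }).promise();
  const rootId = items.find(resource => resource.path === '/').id;

  // Create the greedy path resource beneath the root.
  const greedy = await apigateway.createResource({
    restApiId, parentId: rootId, pathPart: '{proxy+}',
  }).promise();

  // Repeat the per-method steps for the root and the greedy resource.
  for (const resourceId of [rootId, greedy.id]) {
    for (const httpMethod of httpMethods) {
      await integrate({ restApiId, resourceId, httpMethod });
    }
  }

  await apigateway.createDeployment({ restApiId, stageName }).promise();
  return `https://${restApiId}.execute-api.${region}.amazonaws.com/${stageName}`;
}
```

Even in this compressed form, one lambda behind one endpoint takes five distinct SDK calls plus three more per HTTP method per resource, which is the overhead a framework can hide.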
As you can see from the above, working with AWS directly is usually a complex process. A considerable amount of knowledge about each service is needed, and it can require some digging to determine the sequence of commands to achieve a desired outcome. Many frameworks have come into being to enable developers to make the most of AWS while avoiding parts that are cumbersome.
Each of these frameworks has a different set of tradeoffs, but BAM! was optimized for the use case we just described. The complexity of deploying a lambda integrated with an endpoint is handled by one BAM! command (deploy). The goal with BAM! is to have the right functionality for a small application, while being quick and easy to use for the developer.
BAM! is designed to be human-friendly. We spent considerable time deciding which AWS services to integrate with Lambda and how to make working with those services most helpful for a developer. According to a 2018 report by Serverless Framework⁵, HTTP endpoints account for more than 2/3 of all event sources. This is why BAM! is centered around lambdas connected to endpoints. We aimed to simplify common scenarios for developers using these services.
Our objective was to utilize an architecture that allows the developer to get up and running quickly. BAM! has flexible commands, requires no configuration, supplies instructional templates, and adapts to the developer’s local lambda file organization.
When a BAM! command is first issued, a hidden .bam directory is created. This directory acts as a staging area for package.json file creation, dependency installation, file compression, and lambda deployment. Additionally, this directory contains a number of JSON files, which keep track of the resources deployed using the BAM! framework.
If you’ve used BAM! to deploy the lambda together with an API gateway (to provide data for your co-worker, for example), the following topology will be generated to process the HTTP GET request sent to the endpoint.
There are several types of endpoints, and in most cases, the best is an Edge-optimized endpoint because AWS routes the user’s request through an Amazon CloudFront edge location (data center).
CloudFront will route the user’s request to Amazon API Gateway, which checks the IAM role and associated policies to confirm the path resource has permission to invoke the lambda. If API Gateway receives a successful response from IAM, the lambda will be invoked to perform some processing. This could involve interacting with other services such as a database or even another lambda. In this example, the lambda should produce a sales report, so a database query is made for last month’s total sales.
Then, the data is sent back to AWS Lambda for further processing, metadata can be persisted to a DynamoDB table, and the response is returned to API Gateway. With Lambda Proxy integrations, the lambda must return a JSON formatted response which the gateway can transform into an HTTP response. Finally, API Gateway routes the response back to the Edge location and to your co-worker, who can view the sales report in their browser.
BAM! is an opinionated framework that allows for deployment of AWS resources without having to deal with JSON and YAML files.
Instead, BAM! finds the necessary information needed to call various SDK methods automatically. For instance, BAM! calls the AWS Security Token Service’s (STS) getCallerIdentity() method in order to get the developer’s account number. Additionally, BAM! uses the region and default profile specified in the hidden config and credentials files, which exist in the .aws directory after proper installation of the AWS command line interface (CLI). Lastly, BAM! creates a default role with permissions to interact with CloudWatch logs and invoke other lambdas.
Note that since BAM! uses the aws-sdk, the framework does not directly touch the developer’s AWS Access Key or Secret Access Key.
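As an illustration of how a region might be discovered without asking the developer, the sketch below parses the text of an AWS CLI config file, which uses INI-style sections such as [default] and [profile dev]. The function name is hypothetical and this is not necessarily how BAM! reads the file; it only shows that the information is recoverable with a few lines of code.

```javascript
// Hypothetical sketch: extract the region for a profile from the text of
// the AWS CLI config file (normally found in the .aws directory).
function regionFromConfig(configText, profile = 'default') {
  const header = profile === 'default' ? '[default]' : `[profile ${profile}]`;
  let inSection = false;

  for (const rawLine of configText.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith('[')) {
      inSection = line === header; // entering a new section
    } else if (inSection) {
      const match = line.match(/^region\s*=\s*(\S+)/);
      if (match) return match[1];
    }
  }
  return undefined; // profile or region not present
}
```

In practice the text could be read with fs.readFileSync from a path built with os.homedir(), and the account number could come from awaiting sts.getCallerIdentity().promise() and reading its Account property.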
The BAM! framework includes six templates to help a developer get going quickly. These templates ensure lambda functions will be compatible with both the AWS Lambda programming pattern for Node.js and Lambda Proxy integration with Amazon API Gateway. Templates can be created with or without instructional comments, and although useful, these templates are not a requirement for deploying a lambda with the BAM! framework.
The basic template is created by default when bam create is run without any flags. This template shows the developer how to handle any or all of the HTTP methods in one lambda function and exposes query and path parameters. All of the remaining templates extend this one and include functionality to invoke another lambda; access HTML, CSS, and JS; and/or interact with a DynamoDB table.
Flexible Local File Organization
The BAM! framework allows the developer to organize lambda files in any way. For example, all lambda files could exist within one directory or could be organized into specific project directories. Because of the way BAM! is designed, the framework can handle either scenario.
Suppose a developer is deploying a lambda function they have written in Node.js, which requires the fs native Node module and the uuid npm package. After copying the lambda file or directory to the staging area, BAM! parses the require statements within the file, determines which dependencies are native to Node (in this case fs), creates a package.json file, adds only the non-native dependencies to it (in this case uuid), installs modules, zips all the files together, and deploys the lambda to AWS.
The advantages of designing the BAM! framework this way are:
- the developer’s local files remain uncluttered
- the deployment package sent to AWS is lightweight instead of being bloated with unnecessary dependencies
- the developer can ultimately deploy a lambda quickly without being well versed in the nuances of AWS services
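The dependency-detection step described above can be sketched in a few lines. This is an illustrative approximation, not BAM!'s source: the function names are hypothetical, the regular expression only catches require calls with a literal string argument, and the '*' version range is an assumption. The list of native modules comes from Node's own module.builtinModules.

```javascript
const { builtinModules } = require('module');

// Hypothetical sketch: collect required module names and keep only those
// that are neither local files nor built into Node.
function findNpmDependencies(source) {
  const requirePattern = /require\(\s*['"]([^'"]+)['"]\s*\)/g;
  const names = new Set();
  let match;

  while ((match = requirePattern.exec(source)) !== null) {
    const name = match[1];
    if (name.startsWith('.') || name.startsWith('/')) continue; // local file

    // Reduce "pkg/sub/path" to "pkg" and "@scope/pkg/sub" to "@scope/pkg".
    const parts = name.split('/');
    names.add(name.startsWith('@') ? parts.slice(0, 2).join('/') : parts[0]);
  }

  return [...names].filter(name => !builtinModules.includes(name)).sort();
}

// Build the contents of a package.json listing only the npm dependencies.
function buildPackageJson(lambdaName, dependencies) {
  const deps = {};
  dependencies.forEach((dep) => { deps[dep] = '*'; }); // version range is an assumption
  return { name: lambdaName, version: '1.0.0', dependencies: deps };
}
```

For a file that requires fs and uuid, findNpmDependencies returns only uuid, so the generated package.json stays free of entries that npm would not need to install.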
Of course, as with any engineering project, the process of designing BAM! came with a host of challenges, of which the most interesting are discussed below.
First, it is notable that BAM! exists in a wider ecosystem, and we had to account for the possibility that users of the framework would also interact with AWS through means beyond our control. In practice, this meant that there was a plethora of conceivable edge cases where, say, the existence of a necessary resource could not be taken for granted.
For example, consider the case when a user has previously deployed an integrated lambda and API endpoint, and thereafter attempts to add additional HTTP methods for the API to support.
The challenge here is that unbeknownst to BAM!, the user could, meanwhile, have deliberately or accidentally deleted the API Gateway in the AWS console, while leaving the lambda intact. Under such circumstances, there will exist a lambda without an associated endpoint in the cloud. In this case, BAM! will still maintain local records of both the lambda and endpoint. If any attempt is made to update HTTP methods, the SDK will raise an exception, as there is no such API to which those methods can be attached.
We accounted for this problem by first having BAM! check for the existence of the lambda and API, and respond intelligently to the circumstances. The presence of added HTTP methods signals to BAM! the presumed existence of an API Gateway instance; BAM! will therefore create a new API behind the scenes and add the desired methods, overwriting the old local record with the replacement API’s identifying information.
In fact, BAM! will attempt to respond in a convenient and expected manner under a variety of differing circumstances when redeploying a lambda. To add a new API when one does not exist in the cloud, BAM! accepts either the --addEndpoint flag, or, to reiterate, the user may imply the addition of an API by adding HTTP methods. In the latter case, BAM! is making what we feel to be a reasonable assessment of user intent.
If BAM! finds the lambda itself does not exist in the cloud while attempting to redeploy, it will simply warn the user to run the deploy command instead. Inversely, since AWS enforces uniqueness of lambda names for a specific region and account, we designed BAM! to warn a user to redeploy whenever they attempt to deploy a lambda with the same name as a preexisting function in the cloud.
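The rules in the preceding paragraphs amount to a small decision table, sketched below as a pure function. Everything about it is hypothetical (the function name, the input fields, the action labels, and the warning text); it is meant only to show how checking the cloud state first lets a tool choose a sensible response, not to reproduce BAM!'s internals.

```javascript
// Hypothetical decision table for deploy and redeploy, given what was found
// in the cloud and what the user asked for.
function planCommand({
  command, lambdaExistsInCloud, apiExistsInCloud,
  addEndpointFlag = false, addedMethods = [],
}) {
  if (command === 'deploy') {
    return lambdaExistsInCloud
      ? { action: 'warn', message: 'A lambda with this name already exists; use redeploy.' }
      : { action: 'deploy' };
  }

  // Everything below handles redeploy.
  if (!lambdaExistsInCloud) {
    return { action: 'warn', message: 'Lambda not found in the cloud; use deploy.' };
  }

  // Adding HTTP methods implies the user expects an API to exist.
  const wantsApi = addEndpointFlag || addedMethods.length > 0;
  if (!apiExistsInCloud && wantsApi) {
    return { action: 'updateLambdaAndCreateApi', methods: addedMethods };
  }
  return {
    action: apiExistsInCloud ? 'updateLambdaAndApi' : 'updateLambda',
    methods: addedMethods,
  };
}
```

Keeping the decision separate from the SDK calls makes each edge case, such as an API deleted from the console, straightforward to enumerate and test.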
Considering further user interaction with AWS beyond the confines of BAM!, we had to account for the possibility that developers may wish to use our framework to pull down the code for existing lambdas created without using BAM!, or for which they have no local Node.js file. To provide functionality in this regard, we added the get command to our framework.
On a related note, unless the user dictates otherwise, BAM! will create a default role to be assumed by all deployed lambdas in the absence of a specified alternative. The associated challenge is that, while BAM! will not itself permit this operation, the user can, much the same as with other resources, delete the default role in the AWS console. Without some workaround, all future lambda deployments will fail, owing to the nonexistence of a resource pivotal to the framework’s operation. As in previous cases, our simple solution is for BAM! to check AWS for the existence of this role and rebuild it if necessary.
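A check-then-rebuild step of this kind might look like the sketch below, which takes an IAM client from version 2 of the AWS SDK for JavaScript as an argument. The function name is hypothetical, and the sketch assumes the SDK reports a missing role with the error code NoSuchEntity. It recreates only the role and its trust policy; attaching the permissions policies (for CloudWatch logs and lambda invocation) is omitted for brevity.

```javascript
// Sketch: return the ARN of a role, creating the role if it has been deleted.
// `iam` is an injected AWS SDK v2 IAM client; `roleName` is a placeholder.
async function ensureRole(iam, roleName) {
  try {
    const { Role } = await iam.getRole({ RoleName: roleName }).promise();
    return Role.Arn; // role still exists
  } catch (err) {
    // Assumed error code for a missing role; anything else is unexpected.
    if (err.code !== 'NoSuchEntity') throw err;
  }

  // Trust policy that allows the Lambda service to assume the role.
  const trustPolicy = {
    Version: '2012-10-17',
    Statement: [{
      Effect: 'Allow',
      Principal: { Service: 'lambda.amazonaws.com' },
      Action: 'sts:AssumeRole',
    }],
  };

  const { Role } = await iam.createRole({
    RoleName: roleName,
    AssumeRolePolicyDocument: JSON.stringify(trustPolicy),
  }).promise();
  return Role.Arn;
}
```

Running such a check before every deployment costs one extra request but removes a failure mode the user would otherwise have to diagnose in the console.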
The ultimate point here is that BAM! was designed to function in a manner compatible with a developer’s broader interactions with AWS.
There is a well known problem of persisting data with FaaS. To reiterate, cloud providers such as AWS will perform autoscaling on behalf of developers with a variety of services, and AWS Lambda is no exception. This means that as the demand for computational resources arises, AWS will spin up a new server instance within an ephemeral container to run whichever lambda is being invoked.
This is ultimately beneficial to developers in that the task of provisioning resources can be abstracted away from day-to-day operations. However, any data stored by a lambda, for instance via closure of its handler function over some mutable object, will not persist beyond the lifetime of the container and will typically be lost due to teardown after roughly 15 minutes of inactivity⁶.
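The effect is easy to reproduce locally. In the sketch below, a factory function stands in for a container: each call to the hypothetical createContainer gives a fresh closure, just as a newly provisioned container loads the lambda file afresh. Within one container the counter survives between invocations; a new container starts again from zero.

```javascript
// Simulates container-scoped state. In a real lambda file, `invocations`
// would be a variable declared outside the handler function.
function createContainer() {
  let invocations = 0; // lives only as long as this "container"

  return async function handler() {
    invocations += 1;
    return { statusCode: 200, body: JSON.stringify({ invocations }) };
  };
}
```

Calling the same handler twice reports counts of 1 and then 2, whereas a handler obtained from a second createContainer call reports 1 again, which mirrors what a user sees when a request lands on a freshly started container.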
Unless a developer wishes to spend money keeping an idle server running, which would largely defeat the purpose of autoscaling serverless technologies anyway, they must find some alternative means of persisting data. For this purpose, we designed BAM! to support DynamoDB, Amazon’s service for NoSQL key-value and document storage. Supporting DynamoDB for optional persistence provides highly desirable functionality, while, at the same time, staying true to our intended use case of building and deploying small applications. We selected DynamoDB over other storage options, such as AWS RDS, Amazon’s relational database service, since the absence of rigid schema facilitates more agile and versatile development.
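As a small example of what moving state into DynamoDB can look like, the sketch below increments a counter stored in a table instead of in memory. It assumes a DocumentClient from version 2 of the AWS SDK for JavaScript is passed in; the function name, the table name, the id key, and the invocations attribute are placeholders chosen for illustration.

```javascript
// Sketch: atomically increment a counter item and return the new value.
// `docClient` is an injected AWS.DynamoDB.DocumentClient (SDK v2).
async function incrementCounter(docClient, tableName, id) {
  const { Attributes } = await docClient.update({
    TableName: tableName,
    Key: { id }, // assumes the table's partition key is named "id"
    UpdateExpression: 'ADD #count :one', // creates the attribute if it is absent
    ExpressionAttributeNames: { '#count': 'invocations' },
    ExpressionAttributeValues: { ':one': 1 },
    ReturnValues: 'UPDATED_NEW',
  }).promise();

  return Attributes.invocations;
}
```

Because the increment happens inside DynamoDB, the count stays correct regardless of how many containers are serving requests or how often they are recycled.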
Finally, we encountered an interesting pair of challenges related to timing, specifically an issue with latency within Amazon’s systems as well as a challenge with working around Amazon’s own rate limits.
Recall that, by design, a single BAM! command actually comprises an automated sequence of a large number of individual SDK operations under the hood. While we maintain that abstracting away these sequences of operations drastically improves user experience, we consequently had to account for cases where one operation relies upon Amazon’s system being in a particular state generated by a previous operation.
Within the BAM! framework’s Node.js source code, each SDK operation is performed sequentially, with each operation asynchronously awaiting its predecessor. The problem is that with the AWS SDK, if a request is made, say to deploy a lambda, Amazon comes back with an optimistic AWS Request Object⁷ in the event of success. In other words, even though we are awaiting each SDK call, the next asynchronous operation is performed upon receipt of an optimistic successful response, not when the state of Amazon’s system is actually prepared to accept the next operation. This latency, if left unhandled, would periodically cause the failure of one, and therefore all subsequent SDK operations.
Similar to the challenge posed by unintended latency was that of SDK operations failing due to Amazon’s own rate limits. In situations like this, Amazon throttles one of the many individual steps part way through a sequence of operations.
Since both Amazon’s rate limits and infrastructural latency are beyond our control, it is fair to say that this was not so much a traditional system design problem, but rather an opportunity to take a preexisting architecture and find clever workarounds to accomplish our engineering goals. Fundamentally, both of these challenges boiled down to timing, with BAM! unceremoniously attempting operations doomed to cascading failure. The only key difference was that in the case of rate limits the relevant timing constraints were deliberately imposed.
Our solution was to gracefully retry SDK operations behind the scenes until the state of Amazon’s system is prepared to handle that operation. To this end, we created a function, affectionately referred to as bamBam, to provide this retry functionality.
bamBam was designed to anticipate the type of exceptions that are raised upon encountering one of the two problems described above; respectively, we found the SDK throws an InvalidParameterValueException in the case of the aforementioned latency issue, and a TooManyRequestsException when AWS is throttling requests.
In either case, bamBam responds to the anticipated error by retrying a supplied async callback wrapping an SDK operation, and does so recursively until either the operation succeeds or an unanticipated exception is raised.
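A retry helper in this spirit can be sketched as follows. This is a hypothetical re-creation rather than the actual bamBam source: the function name, the pause between attempts, and the cap on retries are assumptions added here so the sketch cannot loop forever, and it assumes the SDK exposes the exception name on the error's code property.

```javascript
// Error codes the helper treats as "try again later".
const RETRYABLE_CODES = ['InvalidParameterValueException', 'TooManyRequestsException'];

const sleep = ms => new Promise(resolve => setTimeout(resolve, ms));

// Hypothetical bamBam-style helper: run an async operation and retry it
// recursively while it fails with an anticipated error code.
async function retryOnAnticipatedErrors(operation, { delayMs = 3000, retriesLeft = 10 } = {}) {
  try {
    return await operation();
  } catch (err) {
    const anticipated = RETRYABLE_CODES.includes(err.code);
    if (!anticipated || retriesLeft === 0) throw err; // give up and surface the error

    await sleep(delayMs); // assumed pause to let AWS catch up
    return retryOnAnticipatedErrors(operation, { delayMs, retriesLeft: retriesLeft - 1 });
  }
}
```

An SDK call would be wrapped in a callback, for example retryOnAnticipatedErrors(() => lambda.createFunction(params).promise()), so that each retry issues a fresh request.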
All of the challenges described above presented an opportunity to refine BAM! and further optimize it for building and deploying small applications. Designing a framework atop Amazon’s cloud computing services and simplifying developer interaction therewith has required a significant degree of ingenuity and will hopefully save developers a great deal of frustration.