Your Express-way to Serverless APIs

Part 1 — Serverless Architecture and AWS Lambdas

Gita Alekhya Paul
SRMKZILLA
15 min read · Aug 10, 2020


Introduction to this Serverless Computing Series:

If you are an upcoming web developer, you may have heard words like “Serverless”, “FaaS”, “Event-driven Architecture”, and whatnot. Often amazed by these buzzwords, young developers are confused about when and how to start.

A popular saying goes: “There is no cloud. It’s just someone else’s computer.” So do you think serverless means no computer at all? All these confusions and misconceptions will be cleared up in this series.

This Medium series focuses mainly on serverless architecture, a trending topic and a popular choice for upcoming developers.

Did you know?

Interestingly, the serverless revolution was not started by any of the so-called “tech giants” back in the day. The first PaaS (Platform-as-a-Service), named Zimki, was launched in March 2006 by Fotango, a London-based company owned by Canon Europe.

The series is divided into two parts:

  1. Serverless Architecture and AWS Lambdas:
    This article covers the basics of serverless architecture, its application lifecycle and modelling strategies, along with the pros and cons that will help you decide whether to build and deploy serverless applications. After that, we will quickly move on to what Lambda functions are, how we can use them to build APIs and the best practices for building serverless applications.
  2. Using Netlify Functions to create your very first Serverless API:
    The second article will focus on how we can use Netlify Functions to build our very first Serverless Web Application, while understanding how Netlify uses Lambda functions and getting a grasp of the infrastructure as a whole. So, keep a look-out for this article in the future!

An Insight into Serverless

Okay, a lot of talking, now let’s get back to work, shall we?

Is Serverless truly server-less?

TL;DR No. It just means you don’t manage your servers and they are fully managed by your provider. Confused? I was too. Read on to understand the differences.

What are PaaS and FaaS?

Let me define two terms beforehand, PaaS and FaaS:
PaaS: Platform-as-a-Service is a category of cloud computing services that offers a platform on which customers can develop, run and manage applications without the complexity of building and maintaining the infrastructure typically associated with developing and deploying an application, such as servers, load balancers and network configurations.
FaaS: Function-as-a-Service, also called serverless computing, is a software design pattern where applications are hosted by a third-party service, eliminating the need for the developer to manage server software and hardware. The applications are broken up into individual functions that can be invoked and scaled individually.

So, how are PaaS and FaaS (Serverless Computing) different?

Good thing you asked! Although PaaS offerings like Heroku, AWS Elastic Beanstalk or Azure Web Apps offer many of the same benefits as Serverless Architecture, the main difference lies in the way your application scales on each platform. In PaaS, the application is deployed as a single unit and thus it also scales as a single unit. Scaling is done at the application level.

This is where FaaS differs from PaaS. In serverless computing, the whole application is broken up into individual, autonomous functions. Each function is hosted by the FaaS provider and scales automatically with the number of requests that function receives. FaaS can be very cost-effective in comparison to PaaS, as it does not require the whole application to be scaled; it scales only the functions that need it, based on their individual usage.

Due to this, the way a serverless function executes is quite different from an application built on a PaaS platform. On a PaaS platform, the application is always on, eating through the provider’s resources (and probably your wallet too), listening for requests, handling them as they arrive and occasionally spinning up more instances when the traffic increases, even if the extra traffic only hits one of the application’s functions. In FaaS, it is different. We use a design and execution pattern called event-driven system architecture.

A relationship between Event-driven architecture, Serverless Architecture and FaaS

The lifecycle of Event-driven systems:

So, what are event-driven systems? As explained earlier, in serverless computing we do not run a server that is always on, listening for requests and acting only when one arrives. Instead, we define functions which are triggered by a certain event, hence they are “event-driven”. This allows us to use fewer resources, and only when the functions are triggered. The following image gives a brief overview of how the lifecycle of these applications works:

[Image: The lifecycle of an Event-Driven Architecture]

As you may notice, there are three main components to a serverless application:
1. Event Source:
The event source is what triggers the event. It can be a REST API call, a new object in a storage bucket like Amazon S3, a new record in the data stream of a scalable database like Amazon DynamoDB or even a trigger from an IoT appliance with its gathered data.
2. Logic / Function:
The main logic, which is your Lambda function or the equivalent FaaS offering from another provider. It contains the code that processes the event data and passes the result on to the next step.
3. Services:
The services which are integrated with the function. This is where the results of the function’s operations end up. It can range from sending an appropriate JSON response to the REST API call to creating a new data record from the incoming data; the possibilities are endless.

Key points while modelling Serverless Applications:

  1. Stateless Runtime:
    A key part of understanding the stateless runtime provided by FaaS services is how containers are provisioned to execute the code after the function is invoked. It is very important to keep the stateless nature of your function in mind while writing it. It should not be stateful, meaning it should not depend on any state left behind by an earlier invocation. Thus, unfortunately, if you are looking to hold a long-lived WebSocket connection open inside a function, you are out of luck. But if you are up for making REST APIs, we can move forward and create them. In-memory data stores, persistent filesystem content, and even Memcache databases are thus not an option for keeping state while writing Lambda functions.
  2. Limited Resources:
    Another important aspect of Serverless architecture is the limited resources available for use. For example, AWS Lambda lets you allocate memory from a minimum of 128 MB up to a fixed maximum per container (the cap has been raised over the years, from 1.5 GB to 10 GB more recently, so check the current limits). This shows the limited amount of performance available. But this is the inherent nature of Lambda functions. They should be quick and light to execute. The startup time and execution time can also be reduced by following some best practices recommended by AWS, which I will list in the last section of this article.
  3. Reusability of Container:
    Last, but most important, is the way Lambda containers are reused and how we should optimise for that.
    According to the AWS Developer Guide:

When you write your Lambda function code, do not assume that AWS Lambda automatically reuses the execution context for subsequent function invocations. Other factors may dictate a need for AWS Lambda to create a new execution context, which can lead to unexpected results, such as database connection failures.

This is especially important while writing database services and other connection profiles for the Lambda function. The main factor we should always have in mind is that Lambda does not guarantee that a container will be reused, and thus we should account for that in our code. In my second article, I will discuss in detail the problem I faced while writing the database services.

Serverless Modelling Code Patterns:

There are 4 major code patterns we follow in Serverless:
1. Microservices Pattern
2. Services Pattern
3. Monolithic Pattern
4. Graph Pattern

[Image: Serverless Modelling Patterns]
  1. Microservices Pattern
    In this pattern, each job functionality is isolated within a single Lambda function. Each Lambda function responds to a single event and thus the whole application can be modularised. In this way, we can modify our application’s components individually without affecting the application as a whole. It also makes the application easy to debug and maintain.
  2. Services Pattern
    In this pattern, a Lambda function handles multiple operations (around four), usually related to a single data model. For example, a Lambda function managing students’ data for a classroom can perform CRUD (Create, Read, Update, Delete) operations on all the student data. Thus, this Lambda function becomes a service maintaining the student data.
  3. Monolithic Pattern
    In this pattern, the entire application is crammed into a single Lambda function. It is usually not preferred for Serverless applications. Lambdas written in this pattern have inconsistent performance and often hit their resource limits, since they start up the whole application for a single event which probably only needs one piece of functionality. This is why Microservices are usually the way to go for Serverless applications.
  4. Graph Pattern
    This is similar to the Monolithic pattern but uses GraphQL to execute queries, which reduces the number of endpoints the Lambda function has to expose. You do have to learn GraphQL for it, though.
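
To make the Services pattern a little more concrete, here is a minimal sketch of one function acting as a "student service". It assumes an API Gateway proxy-style event (with `httpMethod`, `pathParameters` and `body`), and the in-memory `students` map is a hypothetical stand-in for a real database, used only so the sketch runs locally. Being in-memory, it would not survive between containers, so treat it as an illustration rather than a real implementation.

```javascript
// Hypothetical in-memory store, used only so the sketch runs locally.
// A real function would talk to an external database instead.
const students = new Map();

// Small helper that builds an API Gateway style response object.
const respond = (statusCode, payload) => ({
  statusCode,
  body: JSON.stringify(payload),
});

const notFound = () => respond(404, { message: 'Student not found' });

// One Lambda function handling all CRUD operations for one data model.
// It routes on the HTTP method instead of using one function per operation.
const handler = async (event) => {
  const id = event.pathParameters ? event.pathParameters.id : undefined;

  switch (event.httpMethod) {
    case 'POST': {
      const student = JSON.parse(event.body);
      students.set(student.id, student);
      return respond(201, student);
    }
    case 'GET':
      return students.has(id) ? respond(200, students.get(id)) : notFound();
    case 'PUT': {
      if (!students.has(id)) return notFound();
      // Merge the changes into the stored record, keeping the original id.
      const updated = { ...students.get(id), ...JSON.parse(event.body), id };
      students.set(id, updated);
      return respond(200, updated);
    }
    case 'DELETE':
      return students.delete(id) ? respond(200, { deleted: id }) : notFound();
    default:
      return respond(405, { message: 'Method not allowed' });
  }
};

exports.handler = handler;
```

In the Microservices pattern, each `case` above would instead live in its own Lambda function with its own event source.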

Pros and Cons of Serverless Architecture:

There’s a lot to debate about, but I will go over some of the main points I thought you guys should be aware of:
Pros:
1. No server management: “Serverless” actually means that you don’t have to maintain your servers, not that there aren’t any servers involved!
2. Flexible Scaling: The applications are scaled automatically based on the usage of a certain function, instead of scaling the application as a whole.
3. Cost-effective: Since we have an event-driven architecture and flexible scaling, it is extremely easy on your wallet compared to PaaS, where a server runs 24x7 while handling requests only occasionally.
4. Improved Development Speed: Developers have to worry less about servers and deployments and can focus more on the application logic.

Cons:
1. Less system control
Since the servers are managed by your vendor, there are a lot of limitations imposed by your provider, and building around one provider’s services often results in “vendor lock-in”. These limitations make it hard to customise the runtime environment and often make debugging harder too.
2. Not suitable for large-scale complicated applications
When writing large-scale complicated applications, it is often easier to write the application in a monolithic pattern and let the server run continuously. Developers mostly prefer to use Lambdas while writing microservices for their application.
3. More complexity required for testing on local environments
It is often difficult to run FaaS code in local testing environments, which makes thorough testing of applications more laborious.
4. Architectural Complexity
One of the most prominent complaints against Serverless Architecture is the difficulty of setting it up. Although Lambda functions may seem easy to code, provisioning the appropriate resources (like DynamoDB, S3 and API Gateway) and writing the CloudFormation template, which includes all the services the Lambda function intends to use, is a difficult task for beginner developers.

But we will give you a head-start by introducing you to Netlify functions in our next article, an easier way to get started using the Serverless Architecture.

Some popular providers of Serverless Architecture:

There are many of them, so I am listing some of the most common ones. Feel free to explore them in your own way:
1. AWS Lambda
2. Azure Functions
3. Google Cloud Functions
4. Netlify Functions (coming up, next in the series)
5. Webtask by Auth0

An Introduction to AWS Lambda Architecture

Phew! So much theory about Serverless! Let’s now see where it is used on a large scale. Introduced in 2014, AWS Lambda is one of the most popular and successful products on the AWS Platform.

What is a Lambda function?

Lambda is a high-scale, provision-free serverless compute offering based on functions. It provides the cloud logic layer for your application. Lambda functions can be triggered by a variety of events, coming either from AWS services or from third-party services. Lambda functions scale precisely, down to the individual request. When there are multiple concurrent events, Lambda simply starts up more copies of the function in parallel, leaving almost no idle time (more on the invocation lifecycle of Lambda functions later in the article). It is a type of serverless FaaS. Each Lambda function contains the code you want to execute, the configuration of the environment you want to execute it in and the event sources the function has to react to.

An example Lambda function

[Embedded code: an example of a “synchronous” handler function.]

Above is a Lambda function with a synchronous “handler”. Below is an example of an asynchronous handler.

[Embedded code: an example of an “asynchronous” handler function.]
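
The embedded examples are not reproduced in this text, so here is a minimal sketch of what the two handler styles typically look like in Node.js. It is not the original code: the `notes` store and the `deleteNote` helper are hypothetical stand-ins for a real database call, and the event is assumed to come from API Gateway with an `id` path parameter.

```javascript
// Hypothetical data-access helper standing in for a real database call.
const notes = new Map([['42', { id: '42', text: 'hello' }]]);
const deleteNote = (id) =>
  new Promise((resolve) => setImmediate(() => resolve(notes.delete(id))));

// Builds the HTTP-style response both handlers send back.
const toResponse = (deleted) => ({
  statusCode: deleted ? 200 : 404,
  body: JSON.stringify({ deleted }),
});

// Callback style (the "synchronous" handler in this article):
// the result is reported through the third argument, callback(err, response).
const callbackHandler = (event, context, callback) => {
  deleteNote(event.pathParameters.id).then(
    (deleted) => callback(null, toResponse(deleted)),
    (err) => callback(err)
  );
};

// Async style: the handler returns a promise. Its resolved value becomes
// the response and a rejection is reported as the error, so no callback is used.
const asyncHandler = async (event) => {
  const deleted = await deleteNote(event.pathParameters.id);
  return toResponse(deleted);
};

// Export whichever style you prefer as the entry point.
exports.handler = asyncHandler;
```

Both handlers do the same work; the only difference is how the result travels back to the invoker.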

How do serverless APIs work with AWS?

As discussed earlier, each Lambda function gets triggered by an event and uses certain services to complete its execution. Although a Lambda function can be invoked for many purposes, a common use is to create APIs. To create a serverless API using AWS, we have to configure the following things:
1. Amazon API Gateway: It serves as the event source. It defines the available endpoints of a REST API and invokes a certain Lambda function based on the incoming request.
2. Lambda Function: The main logic of what you want to do with the incoming data.
3. Services: Either AWS Services or Third-party services which are used by the Lambda function. Examples are Amazon S3, DynamoDB or CloudWatch Logs to name a few.

Main components of a Lambda Function:

As you may have noticed in the above two examples of Lambda functions, the function signature is as follows:

exports.handler = (event, context, callback) => { /* report the result with callback(err, response) */ }

The Handler
When a Lambda function is invoked, the code execution begins at the handler function. The handler is a specific code method (Java, C#), or a function (Node.js, Python) that we have created and included in the package. The Lambda function can then call any other methods and functions within the files and classes. They can also interact with other AWS services or third-party APIs based on the use case.
The Event Object
As we have seen multiple times earlier, the event source is the first step in the invocation of a Lambda function. The event object contains all the data and metadata passed from the event source. Thus, the structure of the event object varies from one event source to another. For example, an event from API Gateway will contain the HTTP request properties, whereas an event from an S3 bucket will contain details about the bucket and the new object. The examples above use the event.body and event.pathParameters properties of the event object, which are available when the event source is API Gateway. event.body contains the request body, while event.pathParameters contains the path parameters of the request.
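
As a rough illustration of how the event shape depends on the source, the sketch below tells an API Gateway event apart from an S3 event. The `describeEvent` function is a name made up for this example, and the two sample events are heavily trimmed down to the few fields used here; real events carry many more properties.

```javascript
// Returns a short description of where an event came from, based on its shape.
const describeEvent = (event) => {
  // S3 notifications arrive as a batch under `Records`, tagged "aws:s3".
  const first = Array.isArray(event.Records) ? event.Records[0] : undefined;
  if (first && first.eventSource === 'aws:s3') {
    return `S3 object ${first.s3.object.key} in bucket ${first.s3.bucket.name}`;
  }
  // API Gateway (REST proxy integration) events carry the HTTP request details.
  if (event.httpMethod) {
    const id = event.pathParameters ? event.pathParameters.id : undefined;
    // The body arrives as a string (or null), so it has to be parsed.
    const body = event.body ? JSON.parse(event.body) : null;
    return `${event.httpMethod} request for id ${id} with body ${JSON.stringify(body)}`;
  }
  return 'unknown event source';
};

// Trimmed-down sample events containing only the fields used above.
const apiGatewayEvent = {
  httpMethod: 'PUT',
  pathParameters: { id: '7' },
  body: JSON.stringify({ name: 'Ada' }),
};
const s3Event = {
  Records: [
    {
      eventSource: 'aws:s3',
      s3: { bucket: { name: 'uploads' }, object: { key: 'photo.png' } },
    },
  ],
};

console.log(describeEvent(apiGatewayEvent));
console.log(describeEvent(s3Event));
```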
The Context Object
The context object allows your Lambda code to interact with the execution environment. It contains data about that specific execution environment and invocation. Later in this series, we will learn how to use “Netlify Identity”, an easy-to-use user identity management system by Netlify.
But, before moving on, I wanted to discuss a specific line of code in the above Gist:

context.callbackWaitsForEmptyEventLoop = false;

The above callbackWaitsForEmptyEventLoop is a boolean property of the context object. By default it is true, which means that after the callback is called, the execution environment waits for the Node.js event loop to become empty (for example, for all pending promises and open connections to finish) before sending the response. Setting it to false tells Lambda to send the response as soon as the callback is called and then “freeze” the execution environment, even if some work is still pending. It is especially important in situations like the one shown above, where the delete operation can be non-blocking. Executing such async code has its caveats, but without this line the Lambda execution environment waits for the pending work to finish, which may introduce latency or, in the worst case, a timeout. If the same execution context is reused for a later invocation, the remaining work can continue then, but since reuse is not guaranteed, you should not rely on it for anything that must complete.
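
Here is a minimal sketch of the situation this flag is meant for, assuming a callback-style handler. The `openConnection` helper is hypothetical: it uses a timer as a stand-in for something like an open database connection that keeps the Node.js event loop busy. Outside of Lambda the flag is just a property on a plain object and has no effect, so the sketch only shows where the line goes.

```javascript
// Stand-in for a resource that keeps the Node.js event loop busy,
// such as an open database connection (hypothetical, for illustration only).
let keepAlive = null;
const openConnection = () => {
  if (!keepAlive) keepAlive = setInterval(() => {}, 1000);
  return keepAlive;
};

const handler = (event, context, callback) => {
  // Ask Lambda to respond as soon as callback() is called, instead of
  // waiting for the event loop (kept busy by the timer above) to drain.
  context.callbackWaitsForEmptyEventLoop = false;

  openConnection();
  callback(null, { statusCode: 200, body: JSON.stringify({ ok: true }) });
};

exports.handler = handler;
```

With the default value of true, a handler like this one would be expected to hang until the function times out, because the timer never lets the event loop become empty.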

The Callback
The callback function, although not compulsory, is often used to send some data back to the invoker, either an HTTP response or the errors that occurred during execution. In asynchronous handlers, simply returning the promise is enough; a callback is not needed.

Container Reuse: Optimising Code for “Warm Starts”

Let’s now discuss the prime factor in optimising your code for Lambda containers. Let’s see an example of how AWS provisions and manages its Lambda containers:

[Image: Lambda Invocation and Execution Lifecycle]

Although AWS strictly says that we should not assume any stateful behaviour in our code, developers often still optimise their code for “warm starts”. Now what do we mean by “warm starts” and “cold starts”? You see, each time a function container is created and invoked, it remains active for a certain time in anticipation of a subsequent invocation. If another invocation arrives in that window, it is said to run on a “warm container”. Otherwise, the container is eventually discarded, and the next invocation requires a new container to be created and the code package to be loaded and initialised from scratch; that invocation is said to experience a “cold start”. Now, why do you think this is important for developers? This mechanism allows developers to optimise their code for warm starts by reusing module references and database connections, resulting in faster startup and even faster execution. It is highly recommended to scope variables in such a way that their contents can be reused on subsequent invocations, while still keeping cold starts in mind.
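
A common way to write code that benefits from warm starts but still works on a cold start is to cache expensive objects in module scope and create them lazily. The sketch below assumes a hypothetical `connectToDatabase` routine standing in for a real database driver; the counter exists only to make the reuse visible.

```javascript
// Module scope: this runs once per container (on a cold start) and stays
// in memory for as long as the container is reused.
let cachedConnection = null;
let connectionsOpened = 0;

// Hypothetical connection routine standing in for a real database driver.
const connectToDatabase = async () => {
  connectionsOpened += 1;
  return {
    id: connectionsOpened,
    query: async (statement) => `result of ${statement}`,
  };
};

const getConnection = async () => {
  // Cold start: nothing cached yet, so connect.
  // Warm start: reuse what the previous invocation left behind.
  if (!cachedConnection) {
    cachedConnection = await connectToDatabase();
  }
  return cachedConnection;
};

const handler = async () => {
  const connection = await getConnection();
  const result = await connection.query('SELECT 1');
  return {
    statusCode: 200,
    body: JSON.stringify({ connectionId: connection.id, result }),
  };
};

exports.handler = handler;
```

Calling the handler twice in the same process opens only one connection, which is what happens on a warm container. On a cold start the module is loaded afresh, `cachedConnection` is null again and the connection is simply recreated.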

Function Invocation Patterns:

This small section shows the categories of events that can trigger Lambdas:
1. Push events:
Events where the function is invoked only when a particular event occurs within another AWS service. Examples: Amazon S3, Amazon API Gateway, Amazon Alexa.

2. Pull events:
Events where Lambda polls a data source for new data and invokes our function with any new records that arrive. Examples: Amazon DynamoDB, Amazon Kinesis Streams.
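
For pull events, the function receives a batch of records rather than a single request. The sketch below assumes a Kinesis stream whose records carry JSON payloads; Kinesis delivers each payload base64-encoded under `kinesis.data`. The `toRecord` helper and the sensor readings are made up for this example, and the sample batch contains only the fields used here.

```javascript
// Decodes every record of a Kinesis batch. Lambda polls the stream and hands
// new records to the function in `event.Records`, each payload base64-encoded.
const handler = async (event) => {
  const readings = event.Records.map((record) =>
    JSON.parse(Buffer.from(record.kinesis.data, 'base64').toString('utf8'))
  );
  return { processed: readings.length, readings };
};

// Hypothetical helper that builds a trimmed-down sample record for local runs.
const toRecord = (payload) => ({
  eventSource: 'aws:kinesis',
  kinesis: { data: Buffer.from(JSON.stringify(payload)).toString('base64') },
});

const sampleEvent = {
  Records: [
    toRecord({ sensor: 'a', value: 21 }),
    toRecord({ sensor: 'b', value: 19 }),
  ],
};

handler(sampleEvent).then((result) => console.log(result));

exports.handler = handler;
```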

Handling Exceptions:

Last but not least, the methods of handling exceptions. When using synchronous handler functions, proper error handling has to be done, and a proper error message has to be sent to the invoker. But things get interesting when we start handling asynchronous errors and exceptions:
Introducing Dead Letter Queues…
Dead-letter queues are a powerful concept which helps software developers find issues within their asynchronous function components. To explain it simply: when your application encounters an unhandled error in an asynchronous process, the error information and the context can be sent to another location, typically a message queue such as Amazon SQS (Simple Queue Service). Thus, developers can also handle such errors gracefully.

[Image: a short flow diagram of dead-letter queues]

Best Practices while coding Serverless Applications

Here are some of the best practices which should be followed while writing serverless applications:

  • Language Runtime Preference:
    This is a tough fight between compiled languages like Java and C# and interpreted runtimes like Node.js and Python. AWS has observed that compiled languages have the longest startup times, while interpreted languages start up very quickly. But compiled languages perform better under sustained heavy load, so it is the developer’s call to trade startup speed against workload capacity.
  • Business Logic outside the handler function:
    It is recommended to keep the initialisation of business logic, database connections and SDK clients outside the handler function, so that subsequent invocations can take advantage of them. It is also recommended to use code reuse features like static/global variables and singletons to avoid reinitialising variables, taking full advantage of warm starts. But it is important to keep in mind that the cold start logic must still be in place for whenever a container starts from scratch.

In the next article, we will address this topic of reusing database connections in MongoDB and the problems I faced. Surely an interesting thing to learn while using Netlify functions!

  • Minimising the deployment package:
    The deployment package should be minimised to reduce the initial cost of a cold start. Developers are encouraged to keep external dependencies to a minimum. Instead of uploading entire SDKs, you are encouraged to selectively depend on libraries.

Conclusion

Woo-hoo! Man, that was a lot of information to share! We hope you enjoyed reading it as much as we loved making it for you guys! We tried to keep it as simple as possible to take the complexity out of AWS for young developers. If you liked the article, do give it a clap!
This is Gita Alekhya Paul signing off. Next week, we will discuss how Netlify Functions work, and build a hands-on project, your first serverless API! Until then, stay safe, stay protected and follow us at SRMKZILLA to stay posted about our upcoming articles!

Gita Alekhya Paul
SRMKZILLA

A Web Developer, Cyber Security Enthusiast and a full-time tinkerer!