What is Serverless and what is AWS Lambda?
A mini guide for the uninitiated
I’ve been trying to get my head around the serverless initiatives that are super popular right now. I finally have a working understanding of how it all fits together, and I’ll summarize it for you.
At the core, serverless and lambda functions have nothing to do with web development or APIs.
Basically, a Lambda function runs inside a mini virtual machine that exists only for a short time: a single invocation can run for at most 15 minutes.
Because (a) the virtual machine exists only temporarily and (b) you don’t manage it, it is called serverless.
A lambda function is a packaged program, just like your email client that runs on your machine.
You create a lambda function by writing a program and adding the bare minimum helper executable that can run that program. For example, a lambda could be:
Lambda = A bash script + The bash interpreter itself + some wrapper logic
At its core this is what a lambda is.
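To make the "program + interpreter + wrapper" idea concrete, here is a minimal sketch in Python. The `handler(event, context)` shape mirrors what AWS Lambda expects from a Python function, but everything else (the `PROGRAM` script, the `"text"` and `"output"` keys) is made up for illustration and runs on your own machine.

```python
import subprocess
import sys

# Hypothetical "program": a tiny script that upper-cases whatever it reads on stdin.
PROGRAM = "import sys; sys.stdout.write(sys.stdin.read().upper())"


def handler(event, context=None):
    """Wrapper logic: turn the event into program input, run it, return the output."""
    result = subprocess.run(
        [sys.executable, "-c", PROGRAM],  # the interpreter + the program
        input=event["text"],              # "text" is an assumed input field
        capture_output=True,
        text=True,
        check=True,                       # raise if the program exits with an error
    )
    return {"output": result.stdout}


if __name__ == "__main__":
    print(handler({"text": "hello"}))
```

The wrapper is the only part that knows about events; the program itself is an ordinary script that reads input and writes output.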
OK but what is unique about it?
What is unique is that instead of running this program on your local computer, you create an API out of it (define a specific input and an output), and you deploy it to the cloud.
It works for all types of programs you run on your machine.
How does it get used?
This means any program that runs on your machine now has a place on the web where you can access it in the form of an input and an output.
For example: Upload a video, extract thumbnails from it. If you can run this as a program on your machine, you can now do it via a Lambda cloud service.
OK so it doesn’t make sense for large programs?
No, it doesn’t. My original example of an email client is not a suitable program. It works best for executables that have a single, well-defined input and output, like the video processor example I mentioned. Things that run in batch mode and give you a result are good candidates.
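As a rough sketch of what "a single, well-defined input and output" looks like, here is a toy function that decides where in a video to grab thumbnails. It does no real video processing; the `duration_seconds` and `count` fields are assumptions I made up for the example.

```python
def handler(event, context=None):
    """Input: video duration and how many thumbnails we want.
    Output: evenly spaced timestamps (in seconds) to grab them at."""
    duration = event["duration_seconds"]  # assumed input field
    count = event["count"]                # assumed input field
    # Split the video into count + 1 equal parts and take the inner boundaries.
    timestamps = [duration * (i + 1) / (count + 1) for i in range(count)]
    return {"timestamps": timestamps}


if __name__ == "__main__":
    print(handler({"duration_seconds": 60, "count": 3}))
```

One request goes in, one result comes out, and nothing needs to be remembered afterwards. That is the shape that suits a Lambda.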
When it is in the cloud, it is everywhere:
Services like zeit.co deploy your program to multiple availability zones around the planet. Lambdas also spawn new instances when there are multiple users of them.
Imagine that our original program that extracts video thumbnails is used across the world simultaneously by many users. You can do this without thinking about where the VM is running or when it spawned and terminated.
It is as if, instead of running multiple instances of a program on your computer, you run them around the world (it’s kind of crazy when you think about it).
How does this relate to web applications?
Well, in the very specific scenario of a web application using Next.js + React, it helps a lot.
The special case of Next.js + React is that every URL you visit is kind of an individual application. For example, all the source code / logic you need to service a request to a particular route can be packaged as its own Lambda.
(Disclaimer: I have never used Next.js, but I think I understand what’s going on.) In Next.js you can enter an application from any route. There is no single entry point; each route is an entry point on its own. So each route is a candidate to be converted into a Lambda.
How would you convert each to a Lambda? Well, you do the same thing: package the web script servicing that route + its interpreter + some wrapper script that describes the Lambda function.
What’s interesting about zeit.co is that they convert such an application and deploy each route as a Lambda automatically.
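Here is a small sketch of the "one route, one Lambda" idea. It is plain Python, not Next.js or zeit.co code: `LambdaPackage`, `package_routes`, the route functions and the `"nodejs"` runtime label are all hypothetical names I invented to show the shape of the thing.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class LambdaPackage:
    """Hypothetical description of one deployable unit."""
    name: str
    runtime: str                      # which interpreter gets bundled
    handler: Callable[[dict], dict]   # the script servicing this route


def home_page(event: dict) -> dict:
    return {"status": 200, "body": "home"}


def about_page(event: dict) -> dict:
    return {"status": 200, "body": "about"}


# Each route is its own entry point.
ROUTES: Dict[str, Callable[[dict], dict]] = {"/": home_page, "/about": about_page}


def package_routes(routes, runtime: str = "nodejs") -> Dict[str, LambdaPackage]:
    """Wrap every route in its own package instead of one package for the whole app."""
    packages = {}
    for path, fn in routes.items():
        name = "route-" + (path.strip("/").replace("/", "-") or "index")
        packages[path] = LambdaPackage(name=name, runtime=runtime, handler=fn)
    return packages


if __name__ == "__main__":
    for path, pkg in package_routes(ROUTES).items():
        print(path, "->", pkg.name, pkg.handler({}))
```

Each package can be called on its own, without going through the others.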
What about ExpressJS?
Contrast this with ExpressJS. In Express, an application route is also developed like a single application, in the sense that there is a request/response on a single route.
However, in Express, while each route is developed individually, there is a single entry point: ‘/’. Express is one whole application that will start a request on the entry point of ‘/’, and the Express router will dispatch the request to the proper route (e.g. /your-route-1). Node.js can handle many requests concurrently, but they all go through the same whole application.
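To show what "a single entry point plus a router" means, here is a toy version in Python. This is not Express code and does not use its API; `app`, `ROUTER` and the route functions are stand-ins I made up.

```python
def route_one(request: dict) -> dict:
    return {"status": 200, "body": "route 1"}


def route_two(request: dict) -> dict:
    return {"status": 200, "body": "route 2"}


# The routing table lives inside the application.
ROUTER = {"/your-route-1": route_one, "/your-route-2": route_two}


def app(request: dict) -> dict:
    """The single entry point: every request comes in here first."""
    handler = ROUTER.get(request["path"])
    if handler is None:
        return {"status": 404, "body": "not found"}
    return handler(request)


if __name__ == "__main__":
    print(app({"path": "/your-route-1"}))
    print(app({"path": "/missing"}))
```

The routes are written separately, but none of them can be reached without going through `app` and its router, so the deployable unit is the whole application.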
So an ExpressJS application is monolithic, and as it stands it is not a natural candidate to be split into one Lambda per route.
What about databases and stateful applications?
In short: Lambdas are stateless, so you can’t rely on things that live in the process between requests, like a pool of open database connections.
At this point people use Lambdas together with NoSQL or third-party API databases such as Firebase rather than traditional DBs. I think the reason is that when you make a request to a NoSQL service you can move on instantly, whereas a traditional DB transaction takes longer to complete, which is too long for a serverless app.
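A small simulation of what "stateless" means in practice. `ExternalStore` is a made-up stand-in for an outside service such as Firebase, and `make_instance` pretends to be the platform spinning up a fresh copy of the function; neither is a real API.

```python
class ExternalStore:
    """Stand-in for a third-party database that outlives any one instance."""

    def __init__(self):
        self._data = {}

    def get(self, key, default=0):
        return self._data.get(key, default)

    def put(self, key, value):
        self._data[key] = value


def make_instance(store: ExternalStore):
    """Simulate a freshly spawned function instance with empty local state."""
    local_hits = 0  # lives only as long as this instance does

    def handler(event: dict) -> dict:
        nonlocal local_hits
        local_hits += 1
        total = store.get("hits") + 1  # shared state is read from outside
        store.put("hits", total)       # and written back outside
        return {"local_hits": local_hits, "total_hits": total}

    return handler


if __name__ == "__main__":
    store = ExternalStore()
    first, second = make_instance(store), make_instance(store)
    print(first({}))   # local counter starts at 1
    print(second({}))  # new instance: local counter starts at 1 again
```

Anything kept in the instance itself starts from scratch whenever a new instance is spawned, so whatever has to survive needs to live in the external service.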
So we need to break down monolithic apps to make use of lambdas?
Yes, it looks like it, most of the time. A multi-process application on a single machine would be chopped up, with each process run as a micro-VM in the cloud with a well-defined input and output.
How do they run these Micro-VMs?
Lambdas are powered by AWS Firecracker: https://aws.amazon.com/blogs/aws/firecracker-lightweight-virtualization-for-serverless-computing/
It is a KVM-based virtual machine plus VM manager logic written in Rust. Amazon used to run Lambdas on plain EC2 instances and now uses AWS Firecracker, which is much more lightweight.