How and When to Use AWS Lambda Functions

Pete Saia · Stackahoy · Sep 12, 2017
A deep metaphor describing cloud-based resources.

Ephemeral cloud-based functions like AWS Lambda and Google Cloud Functions are gaining traction quickly. Done right, they can be economically beneficial and enable infinite scalability™ without forcing you to wade into swarms of containers. Applied incorrectly, they can result in a much more expensive, over-engineered, and difficult-to-manage application. It’s important to understand that it isn’t all or nothing. Serverless is in our future, but it isn’t our exclusive future.

Functions as a service (FaaS) are functions configured with a cloud provider, each with a single entry point, that lie dormant until triggered. Planning for FaaS is similar to the way you would approach learning about immutable objects, functional programming paradigms, or migrating from a monolithic to a distributed architecture: you break things apart, simplify, and re-think the application as a whole. That doesn’t mean you need to completely rebuild the application, though.
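
To make that concrete, here’s a minimal sketch of a Lambda entry point in Python. The handler name and event shape are purely illustrative; AWS simply calls whatever handler you configure and passes it the triggering event.

```python
# handler.py: a minimal AWS Lambda entry point (Python runtime).
# The function stays dormant until a trigger (API Gateway, an S3 event,
# a scheduled rule, etc.) invokes it with an event payload.
import json


def handler(event, context):
    # `event` carries the trigger's payload; `context` carries runtime metadata.
    name = event.get("name", "world")
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps({"message": f"Hello, {name}"}),
    }
```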

Banana for scale

Say you have an existing monolithic application which acts as a tiling server — one similar to what Google or Mapbox uses to generate their beautiful jpg/png/json tiles.

The application’s core features are:

  1. Allow authenticated users to submit new geographical data to be processed and ultimately translated to tiles via Mapnik. (Mapnik is common cartographic imaging software for processing things like shapefiles.)
  2. Serve tiles via HTTP — /{tile}/{zoom}/{lat}/{lng}.png (sketched below)
  3. Allow for a special URL to generate a large map based on a bounding box — /map/{north}/{south}/{east}/{west}/{zoom}.png
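
To make feature 2 a bit more concrete, here’s a rough sketch of what the tile-serving endpoint could look like as a Lambda handler behind API Gateway. The bucket name, environment variable, S3 key layout, and the routing of the .png suffix are all assumptions for illustration, not the application’s actual code.

```python
# serve_tile.py: illustrative tile-serving handler behind API Gateway.
# Bucket name, env var, and key layout are assumptions; the handling of the
# ".png" suffix in the route is glossed over here.
import base64
import os

import boto3

s3 = boto3.client("s3")
TILE_BUCKET = os.environ.get("TILE_BUCKET", "my-tile-cache")  # placeholder bucket


def handler(event, context):
    # Assumed API Gateway proxy event for /{tile}/{zoom}/{lat}/{lng}.png
    p = event["pathParameters"]
    key = "{}/{}/{}/{}.png".format(p["tile"], p["zoom"], p["lat"], p["lng"])

    try:
        obj = s3.get_object(Bucket=TILE_BUCKET, Key=key)
    except s3.exceptions.NoSuchKey:
        # In a fuller version this is where you'd invoke the tile generator.
        return {"statusCode": 404, "body": "tile not cached yet"}

    # API Gateway returns binary payloads as base64 when binary media types
    # are configured for the API.
    return {
        "statusCode": 200,
        "headers": {"Content-Type": "image/png"},
        "isBase64Encoded": True,
        "body": base64.b64encode(obj["Body"].read()).decode("ascii"),
    }
```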

It takes a lot of CPU to generate tens of thousands of tiles. If you want a performant application, you’ll be putting a substantial chunk of your budget into GCE or AWS VMs. You also have to handle more requests than a typical application: one page load generally results in six or more tile requests (whatever it takes to fill the screen), and dragging the map around quickly produces a lot more. Multiply that by a lot of people doing the same thing and… well, you get the idea. A lot of requests.
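
For a sense of scale, the standard XYZ (slippy-map) tiling scheme has 4^z tiles at zoom level z, so pre-rendering even a handful of zoom levels for a full-world layer adds up quickly. A quick back-of-envelope check:

```python
# Tile counts per zoom level in the standard XYZ scheme: 2^z x 2^z = 4^z tiles.
for z in range(0, 11):
    print(f"zoom {z:2d}: {4 ** z:>10,} tiles")

# Cumulative tiles needed to pre-render zooms 0 through 10 for a full-world layer:
print("total for zooms 0-10:", sum(4 ** z for z in range(11)))  # 1,398,101
```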

Your initial build or MVP may look something like this:

Monolithic Approach

This may work great at first. You could write background “tile warming” scripts that cache tiles before they get requested, plus other small enhancements, to get a proof of concept out.
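
As a rough sketch of what such a tile-warming script amounts to (this one assumes plain z/x/y tile coordinates rather than the lat/lng scheme above, and the base URL, layer name, and zoom range are placeholders):

```python
# warm_tiles.py: naive tile-warming sketch that requests every tile so the
# cache fills before real users arrive. All names here are placeholders.
import urllib.request

BASE_URL = "https://tiles.example.com"
LAYER = "parcels"

for z in range(0, 8):               # warm the low zoom levels only
    for x in range(2 ** z):
        for y in range(2 ** z):
            url = f"{BASE_URL}/{LAYER}/{z}/{x}/{y}.png"
            try:
                urllib.request.urlopen(url, timeout=30).read()
            except Exception as exc:
                print(f"failed {url}: {exc}")
```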

But what about when it’s time to scale? You could take a microservice approach with tools like Docker, Kubernetes, Packer, and Terraform, which would ultimately distribute the traffic and computation across more VMs. However, you’d still be left with two pain points:

  1. You’re limited by the server’s specs until auto-scaling takes effect (assuming auto-scaling is set up), which isn’t reasonable for mission-critical maps.
  2. You must allocate enough resources ahead of time to accommodate your largest spike and avoid a crash. Obviously, this can be very costly.

Rather than throwing money at the problem, this would be an appropriate time to leverage the power of Lambda functions.

Let’s make some fair assumptions.

  • Mapnik processing and tile serving are our bottleneck, especially when tiles aren’t already cached at request time (more likely at higher zoom levels).
  • Tiles aren’t being processed or served 100% of the time; for instance, in the middle of the night, or after all tiles for a given layer have been warmed (cached).
  • The admin area is a single-page JavaScript app that talks to a simple API using OAuth2 authentication. This API gets nowhere near as much traffic as the tiling server. Basically, it’s a simple web app.

Based on this information, we could break up the application in the following way:

FaaS Approach

With the tiling server and tile generator broken out, the VM instance for the base client application becomes much simpler and requires minimal resources. We could probably get away with a standard-size VM for this.
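
For illustration, the broken-out tile generator might boil down to a Lambda handler along these lines. This is a hedged sketch, not the real thing: it assumes the python-mapnik bindings and a stylesheet are bundled with the deployment package, and the event shape, bucket, and key layout are made up.

```python
# generate_tile.py: sketch of a tile-generating Lambda using python-mapnik.
# Assumes mapnik and its native dependencies ship with the deployment package;
# the bucket, stylesheet path, and event shape are illustrative only.
import os

import boto3
import mapnik

s3 = boto3.client("s3")
TILE_BUCKET = os.environ.get("TILE_BUCKET", "my-tile-cache")
TILE_SIZE = 256


def handler(event, context):
    # Assumed event shape: {"layer": "...", "z": 3, "bbox": [west, south, east, north]}
    layer, z = event["layer"], event["z"]
    west, south, east, north = event["bbox"]  # assumed to be in the map's projection

    m = mapnik.Map(TILE_SIZE, TILE_SIZE)
    mapnik.load_map(m, f"/opt/styles/{layer}.xml")   # stylesheet bundled with the package
    m.zoom_to_box(mapnik.Box2d(west, south, east, north))

    image = mapnik.Image(TILE_SIZE, TILE_SIZE)
    mapnik.render(m, image)
    png = image.tostring("png")

    key = f"{layer}/{z}/{west}_{south}_{east}_{north}.png"
    s3.put_object(Bucket=TILE_BUCKET, Key=key, Body=png, ContentType="image/png")
    return {"key": key}
```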

Last but not least, costs will be kept in check, which makes the common adage “if we need to scale, that’s going to be a good problem!” ring truer, because cost now tracks usage rather than provisioned capacity.
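
A quick back-of-envelope illustrates the point, using Lambda’s published pay-per-use pricing at the time of writing (about $0.20 per million requests plus roughly $0.0000167 per GB-second of compute) and some made-up traffic numbers:

```python
# Rough Lambda cost estimate; every input below is an illustrative assumption.
requests_per_month = 5_000_000          # tile requests actually served by Lambda
avg_duration_s = 0.2                    # average execution time per request
memory_gb = 0.512                       # configured memory (512 MB)

request_cost = requests_per_month / 1_000_000 * 0.20
compute_cost = requests_per_month * avg_duration_s * memory_gb * 0.0000166667

print(f"requests: ${request_cost:.2f}  compute: ${compute_cost:.2f}")
# About $1.00 in request charges and about $8.53 in compute; $0 when nothing runs.
```

When nobody requests tiles overnight, the bill for those hours is simply zero, which is exactly the usage-to-cost curve you want.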

Application Structure

Breaking specific services out of your application creates more moving parts, and thus more complexity. To help relieve this, we built functionality into Stackahoy (our continuous-deployment SaaS tool) that allows directory-specific Lambda packaging, so you won’t have to manage it independently.

Unobtrusive Directory to Lambda Packaging

Using the directory-to-function mapping feature, you simply point to a directory that should be exported. Stackahoy then creates the package (a zip file) and updates the function within your AWS account, while any other, more “traditional” Docker or file-based deployment happens at the same time.
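
Under the hood, directory-to-function packaging comes down to zipping a directory and calling Lambda’s update API. The sketch below is not Stackahoy’s implementation, just a generic illustration of that mechanism using boto3, with placeholder directory and function names:

```python
# deploy_function.py: generic sketch of packaging a directory and pushing it
# to an existing Lambda function. Directory and function names are placeholders.
import io
import pathlib
import zipfile

import boto3


def package_directory(directory: str) -> bytes:
    """Zip every file under `directory` into an in-memory archive."""
    buf = io.BytesIO()
    root = pathlib.Path(directory)
    with zipfile.ZipFile(buf, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in root.rglob("*"):
            if path.is_file():
                zf.write(path, path.relative_to(root).as_posix())
    return buf.getvalue()


if __name__ == "__main__":
    client = boto3.client("lambda")
    client.update_function_code(
        FunctionName="tile-generator",                    # placeholder function name
        ZipFile=package_directory("functions/tile-generator"),
    )
```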

Alternatively, you could build a fully serverless application in one repo without any other type of deployment enabled.

Conclusion

FaaS is by no means a magic fix for everything, but, used appropriately, it can be an incredibly powerful and cost-saving tool.

Here are some useful links for further reading:
