Five Essential Principles for Developing Lambdas

“By 2022, most platform as a service (PaaS) offerings will evolve to a fundamentally serverless model, rendering the cloud platform architectures dominating in 2017 as legacy architectures” - Gartner

There is certainly a lot of interest from organizations and developers in adopting a serverless approach. In this article, we discuss a few essential principles for writing lambdas (which also serve as guidelines for developing serverless solutions).

These principles are derived from Peter Sbarski’s excellent book “Serverless Architectures on AWS” [1] and from the “Serverless Compute Manifesto” [2]. Building on these insightful works and drawing on our experience in creating serverless solutions, we present a few essential principles for developing lambdas.

We use an illustrative example to make the principles easier to understand. Consider BookMyMovie, a movie booking website created using a serverless approach. We use the AWS platform (specifically AWS Lambda) for the explanations; the outlined principles also apply to other platforms that support serverless functions, such as Azure and GCP.

Principle #1. Single-purpose lambdas

One of the important things to keep in mind when developing lambdas is that they must be single-purpose. Deriving from the Single Responsibility Principle (SRP), “each lambda must be non-trivial, have a unique responsibility and encapsulate an axis of change”.

A lambda should be non-trivial. For example, consider the functionality of sorting movies by values such as rating, release date and language. If a lambda takes movie details as input, sorts the entries and returns them to the front-end, that lambda is doing something trivial. A serverless web application encourages rich front-ends, and hence the sorting functionality can be implemented in the front-end itself instead of making lambda call(s) for it.

A lambda should have a unique responsibility. Consider the example of a lambda that generates a PDF movie ticket that includes a QR code. If the ticket generator also contains the code for generating the QR code, then it violates the single-purpose principle. So, what is the alternative? You can create two lambdas: one that generates the PDF and another that generates the QR code. The PDF generator lambda can invoke the QR code generator lambda.
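The split described above might look roughly like the following Python sketch. The function name `qr-code-generator` and the event and response fields (`ticket_id`, `text`, `qr_png_base64`) are hypothetical and not part of the original example. The Lambda client is passed in as an argument; in a deployed function it would typically be `boto3.client("lambda")`, while here a small local stub stands in so the sketch can run without AWS.

```python
import io
import json


def generate_ticket_pdf(event, lambda_client, qr_function_name="qr-code-generator"):
    """PDF generator lambda: delegates QR code creation to a separate lambda.

    `qr_function_name` and the event/response fields are hypothetical.
    `lambda_client` is expected to behave like boto3.client("lambda").
    """
    response = lambda_client.invoke(
        FunctionName=qr_function_name,
        InvocationType="RequestResponse",  # synchronous: wait for the QR lambda
        Payload=json.dumps({"text": event["ticket_id"]}).encode("utf-8"),
    )
    qr = json.loads(response["Payload"].read())
    # Placeholder for real PDF rendering: only shows where the QR image is used.
    return {"ticket_id": event["ticket_id"], "qr_png_base64": qr["qr_png_base64"]}


class StubLambdaClient:
    """Local stand-in for boto3.client("lambda") so the sketch runs without AWS."""

    def __init__(self):
        self.invoked = []

    def invoke(self, FunctionName, InvocationType, Payload):
        self.invoked.append(FunctionName)
        body = json.dumps({"qr_png_base64": "UVI="}).encode("utf-8")
        return {"Payload": io.BytesIO(body)}
```

The PDF generator knows nothing about how a QR code is produced; it only knows the name of the lambda responsible for it, so either side can change independently.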

A lambda must encapsulate an axis of change. Consider a ticket generation lambda that contains the business logic for generating the contents of a ticket and also directly updates the database with the ticket details using an SQL query. The lambda can now change for at least two reasons: a change in the business logic or a modification of the SQL query. This indicates that the lambda does not encapsulate an axis of change. To adhere to the principle, the lambda must be broken into (at least) two lambdas: one for the business logic of ticket generation and another for updating the database.

Creating single-purpose lambdas is especially important because it has a positive influence on qualities like reusability and maintainability. However, it may sometimes result in increased latency, because work that could run directly within one lambda now requires further lambda calls.

Principle #2. Stateless lambdas

Lambdas must be stateless, i.e., a lambda should not depend on its previous executions, on storage in the execution environment, and so on. For example, a ticket PDF generator should take all the required payment and ticket details as input and generate the PDF as output. It should depend entirely on the input given as an argument to the lambda function handler and return a status code; it may persist the generated PDF in S3 (or, less preferably, return the output as base64-encoded text). However, let us say that the lambda stores the details of the tickets in the /tmp folder to return them later. Merely storing a file temporarily in the /tmp folder doesn’t make the lambda stateful, but if the lambda depends on the PDFs left behind by previous executions, then it is stateful and that can result in defects. For example, the files in the /tmp directory will be lost when a new copy of the lambda gets created.
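As a minimal sketch of a stateless handler in Python: everything the function needs arrives in `event`, and nothing is read from earlier invocations. The event fields (`ticket`, `payment`) and the response shape are assumptions made for illustration, and the "PDF" is a plain byte string standing in for real rendering. To stay self-contained, the sketch returns the content base64-encoded rather than persisting it in S3.

```python
import base64


def lambda_handler(event, context):
    """Stateless ticket handler: the result depends only on `event`.

    The `ticket` and `payment` fields are hypothetical.
    """
    ticket = event["ticket"]
    payment = event["payment"]
    # Stand-in for real PDF rendering.
    content = (
        f"Ticket {ticket['id']} for {ticket['movie']}, paid {payment['amount']}"
    ).encode("utf-8")
    return {
        "statusCode": 200,
        "isBase64Encoded": True,
        "body": base64.b64encode(content).decode("ascii"),
    }
```

Because no file, cache or counter from an earlier run is consulted, calling the handler twice with the same event gives the same response, whether or not the two calls land in the same copy of the lambda.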

Another scenario where a lambda can become stateful is when it stores values in global variables. For example, a lambda may increment a global ticket counter and use it as part of the ticket number. This is a really bad practice and will almost certainly result in a defect when a new copy of the lambda gets created and the global variable is reset to zero.
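The defect can be illustrated with a short Python sketch. The helper `simulate_cold_start` is purely a demonstration device that mimics module-level code running again in a fresh copy of the lambda. The UUID-based alternative is one possible fix chosen for illustration, not something prescribed by the sources cited in this article.

```python
import uuid

ticket_counter = 0  # module-level state: starts again at zero in every new copy


def simulate_cold_start():
    """Demonstration helper: mimic a fresh execution environment."""
    global ticket_counter
    ticket_counter = 0


def next_ticket_number_stateful():
    """Anti-pattern: the ticket number depends on how often this copy has run."""
    global ticket_counter
    ticket_counter += 1
    return f"T-{ticket_counter}"


def next_ticket_number_stateless():
    """One possible alternative: an identifier that needs no shared counter."""
    return f"T-{uuid.uuid4().hex}"
```

Two different copies of the lambda would both hand out `T-1` as their first ticket number with the stateful version, whereas the stateless version does not rely on anything remembered between invocations.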

Principle #3. Lambda is the unit of deployment

Every lambda function has dependencies, and those that are not available by default in the execution environment must be provided as part of the deployment package.

For example, PDF ticket generation could be a lambda function. If it is implemented in Python, it requires certain modules to execute. The code for these modules needs to be part of the deployment package (e.g., a “zip file”). If the code changes in the future to include a new dependency, for example the PIL module to add an image, then that also needs to be part of the deployment package. If a later change removes images from the ticket, then we should remove the PIL module from the deployment package. In other words, a lambda and its dependencies co-evolve, hence forming the unit of deployment.

When we create one large deployment package containing commonly required Python modules and use it for all the lambdas, we violate this principle. This is a fairly common problem for those new to writing lambdas. The reason is that in conventional programming (e.g., developing desktop applications) we don’t worry too much about the dependencies we use in the code, since they are always packaged together when the application is deployed.
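One way to keep each lambda and its dependencies together is to build a separate package per function. The following Python sketch uses only the standard library; the function name `build_package` and the directory layout it expects (one handler file plus a list of dependency folders) are assumptions for illustration, not a prescribed tool or workflow.

```python
import zipfile
from pathlib import Path


def build_package(handler_file, dependency_dirs, output_zip):
    """Zip one lambda's handler together with only its own dependencies."""
    handler_file = Path(handler_file)
    with zipfile.ZipFile(output_zip, "w", zipfile.ZIP_DEFLATED) as zf:
        # The handler sits at the root of the archive.
        zf.write(handler_file, arcname=handler_file.name)
        for dep_dir in map(Path, dependency_dirs):
            for path in sorted(dep_dir.rglob("*")):
                if path.is_file():
                    # Keep the package folder name so imports still resolve.
                    arcname = path.relative_to(dep_dir.parent).as_posix()
                    zf.write(path, arcname=arcname)
    return output_zip
```

With this arrangement, adding or dropping a module such as PIL for the ticket generator changes only that lambda's package and leaves the packages of the other lambdas untouched.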

Principle #4. Orchestrate lambdas instead of hand-coding coordination

Let’s say you are booking a ticket for “Fantastic Beasts: The Crimes of Grindelwald” at PVR Orion East Mall, Bangalore, for the 11:00 am show on Saturday the 9th. Multiple lambdas can be invoked to complete your ticket booking.

A lambda gets the payment and customer details as input. It invokes another lambda to generate the PDF for the ticket, which in turn invokes another lambda to create the QR code used in that PDF. Later, another lambda emails the ticket to the purchaser, and yet another lambda updates the ticket table in the database with the ticket and payment details.

It is actually straightforward to write these independent lambda functions. Since a lambda can invoke other lambda(s), it is also easy to hand-code this coordination. However, doing so is a really bad practice. Why? There are many reasons, and here are a few key ones:

  1. If one of the called lambdas fails, then error/exception handling needs to be hand-coded in the calling lambda, including failover/retry mechanisms.
  2. We need to write a considerable amount of code to “glue” the lambdas together. For example, we need to specify the sequence of calls, capture the output of one lambda to feed it to the next, wait for a called lambda to complete if the calls are sequential, etc.

One of the better alternatives is to use the built-in orchestration mechanism: AWS Step Functions. It lets you coordinate multiple lambdas and create workflows that maintain state between lambda functions. Additionally, built-in try/catch, retry and rollback capabilities are available out of the box and deal with errors and exceptions automatically with almost no extra code!

For example, if your lambda that sends the confirmation email fails, you can re-invoke the lambda to send the email to an alternative email id provided by the user (in case that is available).

Another example: if your lambda that adds the purchaser details to the database fails because of a connection error or a function timeout, you can easily re-invoke / retry the lambda.
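The email fallback and database retry scenarios above could be expressed as a Step Functions state machine along the following lines. The definition is written as a Python dictionary and serialized to JSON in the Amazon States Language. The state names, function names, account ID and region in the ARNs are all placeholders, and the retry settings are illustrative values rather than recommendations. The helper `undefined_targets` is a small sanity check added for this sketch, not part of any AWS API.

```python
import json

# Placeholder ARN prefix: account ID, region and function names are hypothetical.
ARN = "arn:aws:lambda:us-east-1:123456789012:function:"

state_machine = {
    "Comment": "BookMyMovie ticket booking workflow (illustrative)",
    "StartAt": "GenerateTicketPdf",
    "States": {
        "GenerateTicketPdf": {
            "Type": "Task",
            "Resource": ARN + "generate-ticket-pdf",
            "Next": "EmailTicket",
        },
        "EmailTicket": {
            "Type": "Task",
            "Resource": ARN + "email-ticket",
            # If sending fails, fall back to the alternative email address.
            "Catch": [{"ErrorEquals": ["States.ALL"], "Next": "EmailAlternativeAddress"}],
            "Next": "UpdateTicketTable",
        },
        "EmailAlternativeAddress": {
            "Type": "Task",
            "Resource": ARN + "email-ticket-alternative",
            "Next": "UpdateTicketTable",
        },
        "UpdateTicketTable": {
            "Type": "Task",
            "Resource": ARN + "update-ticket-table",
            # Retry on timeouts and task failures, with exponential backoff.
            "Retry": [{
                "ErrorEquals": ["States.Timeout", "States.TaskFailed"],
                "IntervalSeconds": 2,
                "MaxAttempts": 3,
                "BackoffRate": 2.0,
            }],
            "End": True,
        },
    },
}


def undefined_targets(definition):
    """Return transition targets that are not defined states (ideally empty)."""
    states = definition["States"]
    targets = {definition["StartAt"]}
    for state in states.values():
        if "Next" in state:
            targets.add(state["Next"])
        for catcher in state.get("Catch", []):
            targets.add(catcher["Next"])
    return sorted(targets - set(states))


definition_json = json.dumps(state_machine, indent=2)
```

The sequencing, the fallback and the retries live in the workflow definition, so the individual lambdas contain no coordination code at all.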

However, note that it is okay to call one lambda from another; using orchestration whenever more than one lambda is involved is overkill. For example, it is okay for a lambda that is meant for updating purchaser details to call a lambda that updates the details in a database. Only when a lambda directly or indirectly makes many other lambda calls does it make sense to use orchestration instead.

Principle #5. Use fully managed services

A serverless application consists of not just lambdas, but also the services it consumes.

In the case of BookMyMovie, the lambdas may have to use databases for ticket and customer information (DynamoDB, RDS, Aurora or any other database), send emails using a service (SES or SNS), send notifications to purchasers (SNS), save documents in storage (S3), etc. These services could be fully managed by the cloud provider or be managed by you.

Consider using a database, for example. You can run a database on an EC2 instance. In that case, you are responsible not just for configuring and maintaining the database but also for maintaining the underlying server: hardening the OS, restricting network access, etc. On the other hand, if you use a managed service such as DynamoDB or Aurora Serverless, then you need not concern yourself with the underlying machine or with operating the database; AWS will take care of that.
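From the lambda's point of view, using a managed database reduces to a service call. The Python sketch below assumes a DynamoDB table accessed through an object that behaves like a boto3 Table resource, for example `boto3.resource("dynamodb").Table("Tickets")`. The table name and the item attributes are hypothetical, and an in-memory stand-in is included so the sketch runs without AWS.

```python
def save_ticket(table, ticket):
    """Persist a ticket through a fully managed table.

    `table` is expected to behave like a boto3 DynamoDB Table resource;
    the item attributes are hypothetical.
    """
    item = {
        "ticket_id": ticket["id"],
        "movie": ticket["movie"],
        "seat": ticket["seat"],
    }
    table.put_item(Item=item)
    return item


class InMemoryTable:
    """Local stand-in for a DynamoDB Table so the sketch runs without AWS."""

    def __init__(self):
        self.items = []

    def put_item(self, Item):
        self.items.append(Item)
```

Nothing in the function refers to a host, a port or an operating system; those concerns stay with the provider.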

When architecting serverless solutions, prefer services that are fully managed for you. While it is not always possible to follow this, at least make sure that servers, virtual machines and containers are not visible in your solution.

Summary

An increasing number of organizations and developers are adopting serverless for developing their solutions. For instance, AWS Lambda adoption grew from 24% in 2017 to 29% in 2018 [3].

It is easy to start developing applications without understanding or applying the essential principles for creating serverless solutions. For example, it is easy to create stateful lambdas, get a lambda to do multiple things, put all the dependencies of different lambdas in a single deployment package, hand-code coordination between multiple lambdas, or end up using services that are not fully managed. It is even possible to force-fit lambdas or a serverless approach onto a problem that could be solved more elegantly in another way.

In these scenarios, the end result is an ineffective solution, even though you may call it a “serverless solution”.

Hence, it is important to understand the fundamental principles discussed in this article and apply them in practice.

References

[1] “Serverless Architectures on AWS: With examples using AWS Lambda”, Peter Sbarski, Manning, 2017. https://www.amazon.com/Serverless-Architectures-AWS-examples-Lambda/dp/1617293822/

[2] “Serverless Compute Manifesto”, https://de.slideshare.net/AmazonWebServices/getting-started-with-aws-lambda-and-the-serverless-cloud/29

[3] “The State of Modern Applications & DevSecOps in the Cloud” report for 2018 by Sumo Logic has numerous insights on cloud adoption usage in organizations: https://www.sumologic.com/wp-content/uploads/Modern_Apps_2018.pdf

Written by: Ganesh Samarthyam and Srushith Repakula from KonfHub team (www.konfhub.com)