Developing a fully Serverless Web app

Background

Hootsuite Engineering · Dec 31, 2017

As more companies move towards a service-oriented architecture, developers often wonder how many parts of their service can be moved into the cloud. Databases, file storage, and even servers are steadily transitioning to the cloud, with servers running in virtual containers rather than on dedicated machines. Recently, FaaS (function as a service) platforms were introduced, letting developers upload their "application logic" to the cloud without managing the server, essentially abstracting the servers away. Yet despite not having to worry about servers, developers find they now have to deal with the cloud: the complexity of uploading, deploying, and versioning becomes cloud-related. This, along with several current limitations of the FaaS model, has often positioned serverless technology as best suited to complementing a dedicated server.

Motivation

Recently, the Serverless Framework was introduced, abstracting away even the cloud part of the development process. With just about everything server- or cloud-related hidden away, developers can write code for the actual application directly. The serverless model also offers other advantages over traditional servers, such as lower cost and ease of scaling. So would it be possible to completely replace the server with a serverless model? Taking the limitations of the FaaS model into account, we set out to build a fully functional, cloud-based, serverless app.

What did we build?

A service that allows Hootsuite employees to list themselves and their skills, as well as create, search, and join internal projects within the company.

The backend for our service is primarily based on AWS Lambda, an event-driven FaaS platform provided by Amazon. The computing service is provided in the form of Lambda functions.

To build this project, we needed the Lambdas to carry out some simple tasks:

  • Store users and projects in a database
  • Retrieve users and projects from the database
  • Modify attributes of the users and projects

Along with some more powerful capabilities:

  • Search users and projects
  • Send emails to users
  • Rank projects by upvotes and time

In addition, the Lambdas will need to be able to connect to several other services so we can easily integrate with them.

For this project, we used the following AWS services:

  • Lambda — serverless computing
  • IAM & Cognito — permissions and authentication
  • DynamoDB — data storage
  • S3 — static website hosting and file storage
  • SES — sending emails
  • CloudWatch — debugging and scheduling events
  • CloudFront — hosting the production single page application (SPA)
  • API Gateway — HTTP endpoints to front our Lambda functions

And two third-party services:

  • Algolia — search service
  • s3_website — an open source tool for deploying static websites to S3

For the SPA frontend:

  • React — the single page application framework

Most importantly, the Serverless Framework is used in our project to test and deploy our infrastructure. Since we didn't have to worry about configuring services in the AWS cloud or anything DevOps-related, we were able to focus on writing application code, rapidly deploying, and working through our milestones.

The constraints

Before we jump in and start developing the entire app, it is best to know exactly what we are working with. Lambdas, as the name suggests, are anonymous functions, so our backend is technically limited to single, independent function calls. The only other resources we have are storage and services on the cloud. This puts several constraints on a fully serverless app that traditional servers would have little trouble with. We outline some of these constraints below.

Statelessness of Lambda functions

The Lambda functions themselves are event driven and naturally stateless: each event from the client is typically followed by a single invocation of a function. On a traditional backend, state would be kept in server memory or in a store such as Redis or a database. In our case, since we only have cloud resources, we chose a serverless NoSQL database (DynamoDB) for fast storage and retrieval. Consequently, our workflow and functions must be designed to work as single, independent, stateless functions.
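To make that concrete, here is a minimal sketch (the table and attribute names are hypothetical, not our actual schema) of how state lives entirely in DynamoDB; nothing useful can be kept in the function's memory between invocations:

```javascript
// Hypothetical upvote handler: no state survives between invocations,
// so the counter lives entirely in DynamoDB.
const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB.DocumentClient();

module.exports.upvoteProject = (event, context, callback) => {
  const projectId = JSON.parse(event.body).projectId; // request body from API Gateway

  dynamo.update({
    TableName: 'projects',                     // assumed table name
    Key: { id: projectId },
    UpdateExpression: 'ADD upvotes :one',      // atomic increment; no read-modify-write
    ExpressionAttributeValues: { ':one': 1 },
    ReturnValues: 'UPDATED_NEW'
  }).promise()
    .then(result => callback(null, {
      statusCode: 200,
      body: JSON.stringify(result.Attributes)
    }))
    .catch(err => callback(err));
};
```

The atomic ADD expression matters here: two containers may handle upvotes concurrently, so a read-increment-write sequence inside the function would race.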

Prolonged workloads

Currently, there is a 5-minute execution time limit for Lambda functions, in contrast with traditional servers, which have no execution time limit. This makes Lambdas challenging for tasks that require continuous execution, such as processing data streams. If a task would take longer than five minutes, the workload must be split into batches, with the data blocks and state tracked externally.

Scheduled tasks, however, were simple to configure: any Lambda uploaded to AWS can be set to run at a fixed interval using Amazon CloudWatch. So if something needs to run continually on Lambdas, there is the option of scheduling events at short intervals, mimicking a continuous runtime. The application should be designed to avoid leaning on this, though. In fact, serverless architecture encourages building such tasks as independent services that expose their own APIs for other services to connect to.
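As a minimal sketch of the batching pattern above (the table, job name, and worker helper are all hypothetical stand-ins), a CloudWatch-scheduled Lambda can load a checkpoint from DynamoDB, process one batch well within the time limit, and save the cursor for the next run:

```javascript
// Hypothetical scheduled worker: each run resumes from a checkpoint kept in
// DynamoDB so that no single invocation approaches the 5-minute limit.
const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB.DocumentClient();

// Stand-in for the real work; returns the cursor for the next batch.
function processItems(cursor, batchSize) {
  return Promise.resolve(cursor + batchSize);
}

module.exports.processBatch = (event, context, callback) => {
  dynamo.get({ TableName: 'job_state', Key: { jobId: 'nightly-job' } }).promise()
    .then(res => {
      const cursor = (res.Item && res.Item.cursor) || 0;
      return processItems(cursor, 100)
        .then(nextCursor => dynamo.put({
          TableName: 'job_state',
          Item: { jobId: 'nightly-job', cursor: nextCursor }
        }).promise());
    })
    .then(() => callback(null, 'batch done'))
    .catch(callback);
};
```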

Cold start

When a Lambda function is not in use, the cloud provider typically “spins down” its container. When the function is invoked again, the runtime container must “spin up” before executing it, so any overhead needed to set up the execution environment, such as the JVM for Java functions, adds latency to the initial execution. An analysis of start times was done earlier this year by Neil Powers and Paul Cowles.

For background and housekeeping tasks, this is hardly an issue. However, for the app we are building, several pieces of IO are bound directly to the Lambda functions, so careful attention must be paid to avoiding cold starts and keeping the frontend responsive. For Java implementations, a “ping” might be required when the user first loads the client so that the functions are warm by the time the user invokes them. For this app, we chose Node.js, as recent sources indicated it has among the shortest startup latencies.
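For Node.js functions the same warming trick is cheap to implement. A minimal sketch, assuming the pinger marks its events with a custom source field (our convention, not an AWS one):

```javascript
// Module scope runs once per container, so anything initialized here
// survives across warm invocations.
const initializedAt = Date.now();

module.exports.getProfile = (event, context, callback) => {
  // Assumed convention: the warm-up ping (from the client or a CloudWatch
  // schedule) sets a custom "source" field so we can return immediately.
  if (event.source === 'warmup-ping') {
    return callback(null, { warmSince: initializedAt });
  }

  // ...handle the real user request here...
  callback(null, { statusCode: 200, body: JSON.stringify({ ok: true }) });
};
```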

Dependency and locality

Many of the advantages and disadvantages of Lambda are based on how the AWS Lambda execution model works.

  1. When a function is initially invoked, AWS launches a container and executes your function based on your configuration settings (cold start).
  2. For subsequent invocations, AWS Lambda will try to reuse that container to execute the function.
  3. If traffic is high and more requests come in, AWS Lambda simply spins up more containers to accommodate the demand. Each new container incurs the startup cost.

From this, it is clear that AWS Lambda takes care of horizontal scaling for you, which is a huge advantage. Unfortunately, since each function gets its own separate container, there is overhead associated with creating each container. This also means splitting your Lambda functions into many smaller Lambda functions will further increase that overhead.

Since Lambda containers are created on demand, there is no guarantee that your functions will run within the same memory space, or even on the same machine. Unlike code on a traditional server, where calling any function within the codebase costs essentially nothing, Lambda functions are not “local” and must pass state back and forth through the cloud. It is therefore a bad idea to have the client wait on a chain of dependent functions, especially in a user-facing flow.

If functions must be chained, they should be chained on the cloud side, where message passing can be configured through Amazon SNS or with services such as AWS Step Functions for function workflows.
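A minimal sketch of the SNS variant (the topic ARN and function names are hypothetical): the first function publishes a message and returns, and a second Lambda subscribed to the topic carries on with the work, say sending notification emails through SES:

```javascript
// Hypothetical first link in a chain: publish to SNS rather than making the
// client orchestrate a second, dependent function call.
const AWS = require('aws-sdk');
const sns = new AWS.SNS();

module.exports.onProjectCreated = (event, context, callback) => {
  const project = JSON.parse(event.body);

  sns.publish({
    TopicArn: process.env.NOTIFY_TOPIC_ARN,   // assumed environment variable
    Message: JSON.stringify({ projectId: project.id })
  }).promise()
    .then(() => callback(null, { statusCode: 202, body: 'queued' }))
    .catch(callback);
};
```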

Authentication

This part has less to do with serverless architecture, but it shows how our app connects to a BaaS (backend as a service) to manage things like authentication. Since all the server-side logic is managed by the BaaS provider, things are still serverless from our point of view. Skip ahead if you’re just interested in the serverless architecture.

As we mentioned above, a typical user of our app will be interacting with several of Amazon’s services. We needed a way to manage authentication along with several AWS permissions and API access. Amazon IAM (Identity and Access Management) and Cognito allowed us to configure these right within the console.

  1. A specific AWS action (such as uploading to S3) can be allowed or denied by adding it to an IAM policy.
  2. Several of these policies can then be associated with a role; a role is simply a bundle of policies.
  3. With the role configured, it can be attached to Cognito Federated Identities, essentially creating a “type” of user for us to vend out.

Following this, we need an identity provider to authenticate users, allowing us to vend instances of our federated identities. For this app, we chose a custom identity provider in the form of a Cognito User Pool. This lets us handle registration, maintain the user directory, and provide our own identity tokens for authentication, while keeping everything in the cloud. The workflow for authentication is shown below, followed by a client-side sketch.

  1. The client sends its authentication information to the IdP (Identity Provider).
  2. Our IdP (Cognito User Pool) sends a confirmation token to the client and Amazon Federated Identities.
  3. The client sends its token to Federated Identities and the token is verified.
  4. Federated Identities retrieves the IAM role corresponding to the IdP and returns temporary access credentials.
  5. Our client can now access our app’s API and services using the credentials.
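As a rough client-side sketch of steps 3 to 5 using the AWS SDK for JavaScript (the region and pool IDs are placeholders), the User Pool token is exchanged for temporary IAM credentials:

```javascript
const AWS = require('aws-sdk');

// idToken is the JWT the Cognito User Pool returned after login (steps 1-2).
function getTemporaryCredentials(idToken) {
  AWS.config.region = 'us-east-1';                       // assumed region
  AWS.config.credentials = new AWS.CognitoIdentityCredentials({
    IdentityPoolId: 'us-east-1:00000000-0000-0000-0000-000000000000', // placeholder
    Logins: {
      // Key format: cognito-idp.<region>.amazonaws.com/<user pool id>
      'cognito-idp.us-east-1.amazonaws.com/us-east-1_EXAMPLE': idToken
    }
  });

  // Steps 3-4: Federated Identities verifies the token and vends
  // temporary credentials tied to the IAM role we configured.
  return new Promise((resolve, reject) =>
    AWS.config.credentials.get(err =>
      err ? reject(err) : resolve(AWS.config.credentials))
  );
}
```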

Now we can go ahead and add any permissions we want to give to the user.

The full serverless architecture

Our Lambda functions are invoked by the client through Amazon API Gateway. The Lambda function then invokes the API of another service and returns the corresponding result. A typical workflow: the client calls an API Gateway endpoint, the endpoint invokes a Lambda, and the Lambda calls a downstream service before returning the response.
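A minimal sketch of that workflow (the table name and route are hypothetical), using the API Gateway proxy event to fetch a record from DynamoDB:

```javascript
// GET /projects/{id}: API Gateway invokes the Lambda, the Lambda calls
// DynamoDB, and the result flows back to the client as an HTTP response.
const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB.DocumentClient();

module.exports.getProject = (event, context, callback) => {
  dynamo.get({
    TableName: 'projects',                    // assumed table name
    Key: { id: event.pathParameters.id }      // path parameter from API Gateway
  }).promise()
    .then(res => callback(null, {
      statusCode: 200,
      body: JSON.stringify(res.Item)
    }))
    .catch(err => callback(null, {
      statusCode: 500,
      body: JSON.stringify({ error: err.message })
    }));
};
```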

However, there are several other ways to invoke a Lambda function. Several of Amazon’s services, such as S3, DynamoDB, SES, and even Alexa, have the ability to trigger Lambdas. For example, we used CloudWatch events to schedule a project-ranking algorithm. This also means we can use Lambdas to control workflow between our microservices.
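We won’t reproduce our actual ranking logic here, but a sketch of the shape of such a scheduled job (the table and attribute names are hypothetical, and the decay formula is a stand-in) might look like this:

```javascript
// Hypothetical CloudWatch-scheduled job: recompute each project's score from
// its upvotes and age, then write the score back for fast sorted reads.
const AWS = require('aws-sdk');
const dynamo = new AWS.DynamoDB.DocumentClient();

module.exports.rankProjects = (event, context, callback) => {
  dynamo.scan({ TableName: 'projects' }).promise()   // a scan is fine at our scale
    .then(res => Promise.all(res.Items.map(project => {
      // createdAt assumed to be epoch milliseconds.
      const ageHours = (Date.now() - project.createdAt) / 3.6e6;
      const score = (project.upvotes || 0) / Math.pow(ageHours + 2, 1.5);
      return dynamo.update({
        TableName: 'projects',
        Key: { id: project.id },
        UpdateExpression: 'SET #s = :score',
        ExpressionAttributeNames: { '#s': 'score' },  // alias avoids reserved words
        ExpressionAttributeValues: { ':score': score }
      }).promise();
    })))
    .then(() => callback(null, 'ranked'))
    .catch(callback);
};
```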

One useful Lambda trigger we used was Amazon Cognito’s pre-signup trigger (sketched below). It allows us to verify that an email belongs to Hootsuite before the signup is passed on to the Identity Provider, effectively restricting access to Hootsuite employees. There are also event hooks at every step of the signup and login workflow, giving us plenty of flexibility to customize our authentication.
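A sketch of such a trigger, assuming the domain check is as simple as a suffix match:

```javascript
// Cognito pre-signup trigger: reject any address outside the company domain.
module.exports.preSignUp = (event, context, callback) => {
  const email = (event.request.userAttributes.email || '').toLowerCase();

  if (!email.endsWith('@hootsuite.com')) {
    // Returning an error aborts the signup before the user is created.
    return callback(new Error('Registration is restricted to Hootsuite employees'));
  }

  // Returning the event unchanged lets the signup continue.
  callback(null, event);
};
```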

Search

A cool thing about serverless is the ease of connecting with other services. Performing fast, complex, dynamic searches on a database usually requires running a server instance (for example, Elasticsearch). Since we wanted the app to be built entirely on serverless services, we delegated the task to Algolia, a SaaS (software as a service) provider. From our side, all we have to do is use their API to index items and perform queries through Lambda, and we remain serverless.
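The calls involved are small enough to show in full. A minimal sketch (the index name and environment variables are our assumptions):

```javascript
// Index and query projects through Algolia from inside a Lambda.
const algoliasearch = require('algoliasearch');

const client = algoliasearch(process.env.ALGOLIA_APP_ID, process.env.ALGOLIA_API_KEY);
const index = client.initIndex('projects');   // assumed index name

// Index (or re-index) a project whenever it is created or updated.
function indexProject(project) {
  return index.saveObject(Object.assign({ objectID: project.id }, project));
}

// Run a user's query; Algolia handles ranking, typo tolerance, and so on.
function searchProjects(query) {
  return index.search(query).then(res => res.hits);
}

module.exports = { indexProject, searchProjects };
```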

This essentially makes the combination of all the Lambdas our “server”: together they form the layer that manages workflow and modifies the state of our application.

In other words, the Lambda function layer has replaced the “server” layer of a traditional application.

Frontend

We built the single page React app separately from the Lambdas, communicating only through API Gateway. The static app is hosted on S3 and served through CloudFront. With serverless architecture, we expect to build several components completely independently, as we did here.

Developing Lambda functions

The actual development process after the initial AWS setup was exactly as simple and productive as advertised. After adding a function to the serverless configuration file, the only thing left to do is write the function itself.

To test a function, the Serverless Framework provides the option to invoke Lambda functions locally by mocking the request. The functions can be run without being uploaded to the cloud, and any console output and exceptions go straight to the terminal that invoked the function. After some use, it felt as convenient as running the function straight from an IDE.
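For instance, with a minimal handler like the following (the function name is arbitrary):

```javascript
// handler.js: a trivial function for exercising the local-invoke workflow.
module.exports.hello = (event, context, callback) => {
  console.log('received:', JSON.stringify(event));   // shows up in your terminal
  callback(null, { statusCode: 200, body: JSON.stringify({ message: 'hello' }) });
};
```

Once the function is declared in serverless.yml, running serverless invoke local --function hello --path event.json executes it in your terminal with the mocked request from event.json, no upload required.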

Functions can also be debugged in the cloud. Any exceptions, logs, or function invocations are automatically recorded, and after a bit of navigation in Amazon CloudWatch you can find logs and statistics for every version of every function deployed. If you need to track down the request that is causing errors in production, it can easily be done through CloudWatch.

As for uploading and deployment, it is as simple as running a single command (serverless deploy) in the Serverless CLI. Previously, a multitude of configurations and actions were needed each time you wanted to upload your Lambda functions, and developers had to write and debug their own scripts to automate the process. With the Serverless Framework, all of this is built in and automated. The very last bit of anything to do with servers is abstracted away, and the development process truly feels “serverless”.

Benefits

Zero fixed cost

Many of Amazon’s services have free tiers that cover usage during development, allowing developers to experiment with projects at virtually no cost. Once in production, pricing for serverless services is usage-based, so a service that isn’t used costs nothing. This extends to the storage we used as well; DynamoDB charges us next to nothing for data sitting at rest. A traditional server hosted in the cloud, such as an EC2 instance, is billed hourly, making serverless the cheaper alternative for prototypes and for lightly or intermittently used production applications.

Reduced time on DevOps

With the introduction of FaaS, a large portion of DevOps time shifted from maintaining servers to building and maintaining the infrastructure between the code and the cloud provider. The Serverless Framework closes yet another gap, putting the developer’s code closer and closer to production and removing a considerable amount of the remaining DevOps work. In this project, a relatively small amount of time went into planning the architecture and setting up services, while the majority of the time went into developing the actual application.

Enforcement of the micro in microservices

Another benefit of a full FaaS backend is that you are fundamentally limited to microservices. Several of our app’s requirements, such as authentication and search, call for a continuously running service. Since building something like that directly doesn’t suit AWS Lambda, we had to delegate them to microservices: authentication and search are either built as separate services or outsourced to pre-existing ones. Our app then connects to each microservice through its API, which is managed independently from our app. Building for a serverless world inherently restricts you to building true microservices.

Is there some functionality that doesn’t suit Lambda functions? Build it as an external microservice and connect the app to it.

What’s next?

Serverless technology has gone through a lot of growth in the past year. Several guides and tutorials we read at the beginning of this project have since been updated to reflect recent releases from Amazon, Microsoft, and Google. By the time you read this, there will certainly be further improvements to the serverless stack.

We have demonstrated that serverless is a cheap, fast way to develop and deploy microservices and even full-fledged web applications. As serverless and cloud technology continue to improve, the performance cost generally associated with FaaS architecture will become less and less noticeable. Serverless is already commonly integrated with server-hosted microservices.

It is very likely that the container and serverless movements will converge in the future. Even if that isn’t the case, it is likely only a matter of time before a company built on top of microservices embraces a fully serverless infrastructure.

About the author

Harry is a full-stack developer working on the People team at Hootsuite. He is a third-year Computer Engineering student at the University of Waterloo. Connect with him on LinkedIn.
