The Goods and Bads of Serverless

Benjamin Tanone
Jun 17 · 9 min read
AWS SAM is a fairly nifty serverless framework which is represented by a fairly cute squirrel

Does it actually live up to the hype?

Contrary to what serverless marketers tell you, using a serverless architecture is not all that rosy. Sure, your application’s scalability is only limited by your corporate credit card’s credit limit, but there are quite a few gotchas which make me wonder whether our decision to go serverless was correct.

Let’s start with what I have found to be less than pleasant about using a serverless architecture (marketers have already emphasised The Goods so much that you probably know them by now).

The Bads

Cold starts

Ask a serverless veteran what their #1 gripe with serverless is, and they’d probably say, “Cold starts.”

Cold starts are basically this: when a “function” has been idle for quite a while, your serverless provider (e.g. Amazon) reclaims the resources that your function had been taking up. This means that the next time the function is called, your serverless provider first has to prepare it for execution again.

There are a lot of articles and resources on the internet explaining the problem. For example, this article stated that Lambda functions running Node take about 12 milliseconds on average to initially load (not to run). This looks fine at first, but according to another article, the figure jumps to roughly 7 seconds when you operate within VPCs (which you often need to do, because DBs are usually hidden behind VPCs).

This was one of the reasons my team and I are planning to move away from a serverless architecture. Every time we navigated the front-end, the data would take around 5–10 seconds to load initially (afterwards it only takes around 500 ms). Note that each of our API endpoints is served by a separate function, so that 5–10 second wait recurs every time you use a different piece of functionality. Anyone with a hint of UX experience would tell you that this is not good UX at all.

On the other hand, there are ways to get around cold starts. For example, you can ping your functions periodically to ensure they’re always warm. If you’re feeling adventurous, you can also set up a single function to handle all API calls, so that a cold start only happens once (instead of, as in our case, each API endpoint having its own function with its own cold start).
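
Both workarounds can live in the same handler. Here is a minimal sketch, assuming an API Gateway proxy-style event (with `httpMethod` and `path` fields) and a scheduled ping whose payload is `{"warmup": true}`. The `warmup` key, the routes and the two endpoint functions are made up for illustration; they are not from our project.

```python
import json

# Hypothetical marker: assumes the scheduled ping sends {"warmup": true}.
WARMUP_KEY = "warmup"


def list_users(event):
    # Placeholder endpoint logic.
    return {"users": []}


def get_health(event):
    # Placeholder endpoint logic.
    return {"status": "ok"}


# One function serves every endpoint, so only one function has to be kept warm.
ROUTES = {
    ("GET", "/users"): list_users,
    ("GET", "/health"): get_health,
}


def handler(event, context):
    # Scheduled pings return immediately; their only job is to keep the
    # container alive.
    if event.get(WARMUP_KEY):
        return {"statusCode": 204, "body": ""}

    route = ROUTES.get((event.get("httpMethod"), event.get("path")))
    if route is None:
        return {"statusCode": 404, "body": json.dumps({"error": "not found"})}
    return {"statusCode": 200, "body": json.dumps(route(event))}
```

Keep in mind that a periodic ping only keeps one container warm. If several requests arrive at the same time, the extra containers your provider spins up can still start cold.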

EDIT: As of June 14th, it seems AWS has started to implement a new architecture which they’ve been talking about; cold-starts now only take 2.5 seconds on average. In the [proposed] new architecture, Lambda functions can share/reuse existing ENIs (think VPC connections), so the heavy cold-start only happens once per VPC. See: https://www.nuweba.com/AWS-Lambda-in-a-VPC-will-soon-be-faster

Your applications have to be truly stateless

But wait, the systems powering RESTful APIs aren’t always fully stateless. For example, Spring Boot applications take a few seconds to start up and reach a state where they are ready to serve a request. During that start-up period, the application may prepare a pool of connections to its database layer and retrieve its configuration from remote sources so that, when requests are served, it doesn’t have to waste time preparing itself.

In a serverless environment, you don’t have the luxury of making sure your application is ready to serve your request; you have to design your application in such a way that it is ready to serve a request from a cold, dead state.

Do you want to cache that particular decryption key locally? Tough luck.

Do you want to reuse those connections? Nah, we have to get rid of them.

These little quirks with being fully stateless may seem small at first, but they can add up really quickly to the time it takes for your application to serve a request. Is this user authenticated and authorised to access this endpoint? Wait, let me contact our identity provider and see if they’re actually legit. Alright, we can start doing something now. Oh wait, before we can get our data, we have to negotiate and establish a connection to the database.
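
To be fair, the picture is a little less bleak once a container is warm. AWS documents that objects created outside the handler stay initialised for as long as an execution environment is reused, although it makes no promise about how long that is. Here is a minimal sketch of leaning on that, where `connect_to_database` is a hypothetical stand-in for a real (slow) connection handshake:

```python
import time

# Module-level state: it lives only as long as this container does.
_connection = None


def connect_to_database():
    # Hypothetical stand-in for a slow connection handshake.
    time.sleep(0.05)
    return {"connected_at": time.time()}


def get_connection():
    global _connection
    if _connection is None:  # always true on a cold start
        _connection = connect_to_database()
    return _connection


def handler(event, context):
    # Warm invocations skip the handshake; cold ones pay for it in full.
    conn = get_connection()
    return {"statusCode": 200, "body": str(conn["connected_at"])}
```

This only softens the problem. Every cold start still begins from the cold, dead state, so the worst-case request is as slow as before.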

Remember: seconds count when serving a request, because they can quickly add up as pages make multiple API calls (e.g. authenticate/authorise, get user data, get non-user data).

No one really makes serverless apps…

As of writing, there are fewer than 900 questions on Stack Overflow related to the Serverless Framework (AWS SAM’s older, cross-provider sibling; SO tag: [serverless]). On the other hand, there are around 57,000 questions on Stack Overflow related to Spring Boot (SO tag: [spring-boot]).

…which means there is no mature framework for it (yet)

This means that we had to do a lot of manual work which would normally be taken care of by other frameworks (e.g. Spring Boot). For example, we had to map error responses ourselves, whereas in Spring Boot, unhandled errors are automatically mapped and returned as an HTTP 500 (which you can configure further if the default does not suit your needs).
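
To give an idea of what that manual mapping looks like, here is a minimal sketch of a decorator that turns exceptions into API Gateway proxy-style responses. `NotFoundError`, `map_errors` and `get_user` are hypothetical names for illustration, not our actual code.

```python
import functools
import json
import logging

logger = logging.getLogger(__name__)


class NotFoundError(Exception):
    """Hypothetical application error that should surface as HTTP 404."""


def map_errors(func):
    """Wrap a handler so that exceptions become HTTP responses."""
    @functools.wraps(func)
    def wrapper(event, context):
        try:
            return func(event, context)
        except NotFoundError as exc:
            return {"statusCode": 404, "body": json.dumps({"error": str(exc)})}
        except Exception:
            # Log the details, but never leak a stack trace to the caller.
            logger.exception("Unhandled error")
            return {
                "statusCode": 500,
                "body": json.dumps({"error": "Internal Server Error"}),
            }
    return wrapper


@map_errors
def get_user(event, context):
    user_id = (event.get("pathParameters") or {}).get("id")
    if user_id != "42":  # placeholder lookup
        raise NotFoundError(f"user {user_id} not found")
    return {"statusCode": 200, "body": json.dumps({"id": user_id})}
```

Every handler then needs the decorator applied by hand, which is exactly the kind of chore a mature framework does for you.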

In addition to this, AWS SAM does not yet have a healthy and mature ecosystem of libraries around it when compared to older frameworks such as Spring. With Spring Boot, you can literally pull in dependencies such as Spring Security, configure them a bit, and start building your business logic after a few hours of tinkering. Well-established best practices are available everywhere too, as a lot of people have worked on Spring Boot applications in the past.

It’s not really good for long-running jobs

Lambda functions can run for at most 15 minutes per invocation. Admittedly, for user-facing serverless APIs, you wouldn’t want to make the user wait 15 minutes anyway. However, when you’re dealing with datasets that run into the six figures (as our team does), you need to re-examine whether that time limit hinders your ability to process all the data.

On the other hand, several articles on the internet (such as this one) have suggested packaging your long-running jobs as Docker containers and deploying them to services such as AWS Fargate. However, this approach is arguably not fully serverless, as you’d need to deal with the environment in which your application runs (in this case, Docker containers and images).

There are also suggestions to use Lambda recursively for long-running jobs, although I personally haven’t played with that concept yet. I am a bit wary of the dangers though, as it removes the concept of “time limits” when executing a function, which may result in you having to explain to your boss why your bill jumped from $1 to $20 within the span of one day.
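
If I were to try it, I would want a hard cap on the number of self-invocations. Here is an untested-in-production sketch: each invocation processes one batch and hands the rest to a fresh invocation. `BATCH_SIZE`, `MAX_HOPS` and `process` are assumptions for illustration, and `invoke_self` is injected so the sketch runs without AWS.

```python
BATCH_SIZE = 1000  # hypothetical: items handled per invocation
MAX_HOPS = 200     # hard cap on self-invocations, so a bug cannot loop forever


def process(batch):
    # Hypothetical stand-in for the real per-item work.
    return len(batch)


def handler(event, invoke_self, dataset):
    """Process one batch, then hand the remainder to a fresh invocation."""
    offset = event.get("offset", 0)
    hops = event.get("hops", 0)
    if hops >= MAX_HOPS:
        raise RuntimeError(f"aborting after {hops} hops; possible runaway loop")

    batch = dataset[offset:offset + BATCH_SIZE]
    processed = process(batch)

    next_offset = offset + len(batch)
    if next_offset < len(dataset):
        # Carry the position and the hop count along to the next invocation.
        invoke_self({"offset": next_offset, "hops": hops + 1})
    return processed
```

In a real deployment, `invoke_self` would presumably wrap an asynchronous invoke of the same function (for example boto3's Lambda `invoke` with `InvocationType="Event"`), and `dataset` would be read from storage rather than passed in. The hop counter is what puts a ceiling on the bill.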

Development environments are… Tricky

While I don’t know if this applies to the Serverless Framework, developing on AWS SAM is a bit of a pain.

One of my team’s major issues is that we have a lot of dependencies which are not automatically managed by SAM. For example, did you forget to run “npm install” locally? SAM will just crap out and say “Error: Failed to import module YourModule” or something along those lines. Or, how about when you want to include pre-run initialisation, such as setting up your DB? Well, you’ve got to either (1) do that manually, or (2) create your own script/function to do it.
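
Option (2) does not have to be fancy. As a minimal sketch (my own illustration, not something SAM provides), a pre-flight script can at least catch the forgotten “npm install” before SAM does:

```python
from pathlib import Path


def missing_installs(project_root):
    """Return directories that contain package.json but no node_modules."""
    root = Path(project_root)
    missing = []
    for manifest in root.rglob("package.json"):
        # Skip manifests that belong to already-installed dependencies.
        if "node_modules" in manifest.relative_to(root).parts:
            continue
        if not (manifest.parent / "node_modules").is_dir():
            missing.append(manifest.parent)
    return sorted(missing)


if __name__ == "__main__":
    for directory in missing_installs("."):
        print(f"Run 'npm install' in {directory}")
```

Run it from the project root before starting SAM locally. It only checks that a `node_modules` directory exists, not that its contents are up to date.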

This problem is made worse by the fact that serverless, by nature, means that you have to rely on external services to temporarily store state. However, AWS SAM only emulates a (really-dumbed-down) API Gateway + Lambda environment, which may cause you problems when you want to start using other AWS services such as SQS or SNS.

The Goods

Pretty Darn Good Scalability

With a serverless architecture, you pretty much say “Hey Cloud Provider, you deal with making sure I have enough resources/servers to serve my customers. I’ll just deal with the logic of my application.” This means that scalability is taken care of by your cloud provider, and you don’t need to think too much about the things that would normally make an Ops engineer bang their head while their boss complains about the company servers being down.

That being said though, as we are still developing our project, we have yet to see how this would fare in a production environment.

You Pay for What You Need

When you have a serverless application, you literally pay for your application’s level of usage. Did you fail to tell a single soul that you launched your serverless application? No problem — you do not have to pay for anything.
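
As a rough sketch of the arithmetic, assuming Lambda-style pricing (a fee per request plus a fee per GB-second of execution time): both rates below are assumed example figures, and the sketch ignores any free tier, so check your provider’s current pricing page before relying on them.

```python
# Assumed example rates in USD; not authoritative.
PRICE_PER_MILLION_REQUESTS = 0.20
PRICE_PER_GB_SECOND = 0.0000166667


def monthly_cost(invocations, avg_duration_ms, memory_mb):
    """Estimate a monthly bill from usage alone."""
    request_cost = invocations / 1_000_000 * PRICE_PER_MILLION_REQUESTS
    gb_seconds = invocations * (avg_duration_ms / 1000) * (memory_mb / 1024)
    return request_cost + gb_seconds * PRICE_PER_GB_SECOND


print(monthly_cost(0, 500, 512))  # no traffic, no bill: prints 0.0
print(round(monthly_cost(3_000_000, 500, 512), 2))
```

The bill is a pure function of usage: zero invocations cost nothing, and the cost grows linearly with traffic, duration and memory size.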

I’m not going to elaborate too much on this; a quick Google search will show you real-world examples of how going serverless can make your bill really light (or, as we have discussed before, really expensive).

No Need for an Ops Team

Setting up ops for the first time for server-based applications is always a pain; not necessarily because it’s complex, but because the process requires effort. You’d have to set up scaling policies, configure load balancers, configure VPCs, configure server provisioning and so on and so forth.

In a serverless environment, your developers don’t have to deal with those technicalities. You only deal with the data coming in and out. The most I’ve had to deal with my execution environment is probably assigning a VPC to a Lambda; no need to patch and secure my “servers” and ensure that they’re running smoothly, or to set up autoscaling to ensure that my servers can handle the load.

Conclusion

Serverless is the future of computing — except the future is not now.

For mission-critical projects (e.g. collecting data to be sent to the government), I would be hesitant to recommend serverless (or AWS’s rendition of serverless) to my clients. This is because:

  • It’s relatively new stuff, hence it may not be the most maintainable solution. Not a lot of developers are adept at developing serverless applications; they would need to be experienced with both working on the cloud and developing using the serverless paradigm (i.e. things ideally have to be lightweight and stateless).
  • Statically typed languages (e.g. Java), which help ensure data integrity and code maintainability, have long cold starts on Lambda. Conversely, if you want shorter cold starts, you’d probably have to use dynamically typed languages such as Python or JavaScript, which are less “safe”. This is one of the reasons why my current team is considering moving from serverless to a server-based solution: it’s starting to become a pain to track what data gets passed around.
  • It’s very easy to make mistakes (because you can write any code you want in serverless, as long as it spits out an output).

If you have any corrections to make or any questions to ask, feel free to post them as a response! I would love to have any incorrect facts in my post corrected!
