Why serverless is still in its infancy

FaaS Architecture (Work-in-Progress) ordered from business logic (BL) to operational logic (OL). Source.

Computer scientists Johannes Grohmann and Erwin van Eyk, together with three colleagues, recently published a paper about the performance challenges facing vendors that provide FaaS, currently the most mature type of serverless architecture. I had a two-hour Slack chat with Grohmann and Van Eyk, discussing these challenges and the need for a benchmarking report to assess all those serverless vendors.


Bas van Essen:

First, a question about your background. What is your experience as computer scientists, and what is your interest in serverless architectures?

Johannes Grohmann:

Our research group focuses on performance engineering for software systems. To that end we’re an active member of the SPEC RG effort, focusing on various performance-relevant issues from a research perspective. With the increasing interest in serverless and FaaS platforms, SPEC RG also decided to launch an initiative to tackle performance-relevant challenges for FaaS. The paper we now discuss is one of the initial products of this working group.


Bas van Essen:

To make sure everybody understands the definition of FaaS: how would you describe it?

Johannes Grohmann:

There still exist a lot of different perspectives on the exact definition of FaaS; our research group discussed this issue in an earlier paper as well. In general, serverless computing describes a paradigm in which all operational concerns, such as deployment and resource provisioning, are delegated to a cloud platform with a pay-per-use model. Function-as-a-Service (FaaS) is a form of serverless computing where you execute the functions of your application in a serverless environment.


Bas van Essen:

Why did you guys decide to focus on FaaS?

Erwin van Eyk:

We started looking into serverless computing back in May 2017. Back then, “serverless” was used almost interchangeably with FaaS. That was also one of the initial issues we wanted to tackle: what is serverless, and how does FaaS relate to it?

Afterwards, we decided to keep the focus on FaaS, since it is the most mature of the serverless cloud models.


Bas van Essen:

Can you explain what exactly performance isolation is?

Johannes Grohmann:

By performance isolation, we mean ensuring that different VMs/containers/functions on the same physical resource do not (negatively) influence each other performance-wise. In FaaS platforms, multiple functions are usually executed on the same physical host in order to utilize the given resources efficiently. However, if one executes, e.g., two independent, CPU-intensive functions at the same time on the same core, both will “steal” each other’s CPU time and therefore show increased latency.

Performance isolation refers to the degree to which two functions executed on the same host are independent of each other’s resource usage. Ideally, function A should not be influenced at all by how many other functions B and C are running on A’s host, or by their resource usage.
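A quick way to observe this effect on a Linux machine is to pin two CPU-bound processes to the same core and compare their latency against a solo run. A minimal sketch (my illustration, not from the paper; `sched_setaffinity` is Linux-only):

```python
import os
import time
from multiprocessing import Process

def cpu_bound(n=5_000_000):
    # Busy loop standing in for a CPU-intensive cloud function.
    s = 0
    for i in range(n):
        s += i * i
    return s

def timed_run(core):
    os.sched_setaffinity(0, {core})  # pin this process to one core (Linux-only)
    start = time.perf_counter()
    cpu_bound()
    print(f"pid={os.getpid()} finished in {time.perf_counter() - start:.2f}s")

if __name__ == "__main__":
    # Solo run: the "function" has core 0 to itself.
    solo = Process(target=timed_run, args=(0,))
    solo.start(); solo.join()

    # Contended run: two "functions" pinned to the same core steal each
    # other's CPU time, so each one's latency roughly doubles.
    procs = [Process(target=timed_run, args=(0,)) for _ in range(2)]
    for p in procs: p.start()
    for p in procs: p.join()
```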


Bas van Essen:

Before diving deeper, one more thing about FaaS: in which situations or scenarios is FaaS, in your eyes, the best type of serverless computing to use? You say that the serverless landscape has matured in the meantime, so perhaps you see alternatives?

Erwin van Eyk:

Using FaaS solutions is great for situations where you have to serve a bursty, ideally CPU-bound, workload. This of course is pretty generic, which makes FaaS applicable in many use cases. I don’t really have a specific scenario in mind, though the various vendors have been publishing quite a number of use cases.

The landscape has indeed matured. You now have, for example, offerings like databases (e.g., AWS Aurora) and containers (AWS Fargate) being marketed as serverless. These are not so much alternatives or replacements for FaaS, but rather complements, enabling more complex serverless applications.


Erwin van Eyk:
“I don’t think there is a clear competition happening between BaaS and FaaS; both have their uses. Likely we will see platforms experimenting with variations on these two models.”

Bas van Essen:

The older concept of BaaS (Backend-as-a-Service) is often considered another type of serverless computing. To what extent do you consider it part of the serverless family, and do you think it will continue to compete with FaaS in the future?

Erwin van Eyk:

What is and what isn’t serverless is indeed a bit of a controversial topic. In our initial vision, we argue for a broad definition of serverless. A serverless service should in principle exhibit the following aspects:

(1) Granular billing:

The service only bills the user for the resources actually used to execute business logic. For example, a traditional VM does not have this characteristic, as users are billed hourly, not for the resources they actually utilize (a toy billing comparison follows after this answer).

(2) Minimal operational logic:

Operational logic, such as resource management, provisioning, and autoscaling, should be delegated to the cloud provider.

(3) Event-Driven:

User applications (whether they are functions, queries, or containers) should only be active/deployed when they are needed, i.e., when an event requests them.

So with that definition, BaaS is indeed serverless.

I don’t think there is a clear competition happening between BaaS and FaaS; both have their uses. Likely we will see platforms experimenting with variations on these two models.
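To make aspect (1), granular billing, concrete: here is a toy comparison of pay-per-use billing against hourly VM billing. All prices are hypothetical placeholders, not vendor quotes:

```python
# Illustrative comparison of granular (pay-per-use) vs hourly billing.
# Prices are hypothetical placeholders, not current vendor quotes.

GB_SECOND_PRICE = 0.0000167   # per GB-second of function execution
PER_REQUEST_FEE = 0.0000002   # flat fee per invocation
VM_HOURLY_PRICE = 0.10        # per hour, billed whether used or not

def faas_bill(requests, avg_duration_s, memory_gb):
    compute = requests * avg_duration_s * memory_gb * GB_SECOND_PRICE
    return compute + requests * PER_REQUEST_FEE

def vm_bill(hours):
    return hours * VM_HOURLY_PRICE  # idle time is billed too

# A bursty workload: 100k requests of 200 ms at 0.5 GB, spread over one day.
print(f"FaaS: {faas_bill(100_000, 0.2, 0.5):.2f}")  # pay only for execution
print(f"VM:   {vm_bill(24):.2f}")                   # pay for the whole day
```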


Bas van Essen:

Ok, onwards to the performance challenges you see regarding FaaS architectures. Can you describe them briefly?

Johannes Grohmann:

We identify six major performance challenges:

(1) Overhead:

The FaaS platform introduces some overhead (e.g., provisioning overhead when starting a new function instance), which might prevent adoption for some use cases.

(2) Performance isolation:

We already talked about this one. Here, it is important to find a balance between efficiency and performance guarantees.

(3) Scheduling policies:

Once a function event is triggered, a request has to be scheduled to a specific function instance. This provides great room for optimization. However, you have to consider that the scheduling is done in an online fashion and therefore introduces additional overhead.

(4) Performance prediction:

Many techniques have been proposed to predict the performance of traditional software systems. It is unclear how they can be adapted to FaaS platforms.

(5) Engineering for Cost-Performance:

The pay-per-use pricing model of serverless platforms seems calibrated for a moderate number of requests per second. For higher workload intensities, dedicated VMs can turn out to be cheaper. Here, more complex pricing models might be investigated in order to relate performance to cost.

(6) Evaluating and Comparing FaaS Platforms:

As serverless platforms are still somewhat new, there is a lack of standardization and benchmarks for FaaS platforms, which prevents making informed decisions when evaluating FaaS offerings. This is one of the main issues that we as the SPEC FaaS research group are currently working on.

FaaS function executions in theory (left) and in practice (right).

Bas van Essen:

Regarding the first challenge: besides provisioning overhead, you point to request overhead and function lifecycle management & scheduling overhead. As stated in your paper, provisioning overhead is the dominant one of the three, as FaaS platforms need to deploy cloud functions prior to their use, and provision the underlying resources (such as containers or VMs) prior to deployment. How do you recommend vendors tackle this overhead?

Johannes Grohmann:

I think the key here is to prepare the underlying infrastructure as much as possible, so that the actual provisioning task is as small, and therefore as fast, as possible. This means that, as you already pointed out, VMs or containers should be readily available when a new function instance is about to be deployed. Additionally, it is not wise to shut an instance down immediately after it finishes executing a function. Instead, one should keep a “warm” instance running for a while, in case additional requests arrive. This is, of course, a trade-off again, where you have to weigh the cost of keeping “warm” instances alive against the cost of cold-start function runs.

The typical lifecycle of a cold start and warm execution.
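That keep-warm trade-off boils down to one inequality: keep an instance alive as long as the expected cost of idling stays below the cost of a cold start. A toy sketch, with entirely hypothetical cost figures:

```python
# Back-of-the-envelope keep-alive trade-off (a sketch, not a vendor policy).
# Keeping an instance warm costs idle resources; shutting it down makes the
# next request pay the cold-start penalty.

IDLE_COST_PER_S = 0.00001   # hypothetical cost of a warm-but-idle instance
COLD_START_COST = 0.002     # hypothetical cost (incl. latency penalty) per cold start

def keep_warm(expected_idle_s):
    """Keep the instance alive iff idling is cheaper than a cold start."""
    return expected_idle_s * IDLE_COST_PER_S < COLD_START_COST

for gap in (10, 60, 300, 3600):
    print(f"expected gap {gap:>5}s -> keep warm: {keep_warm(gap)}")
```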

On many platforms, function runtimes are preemptively deployed and only the function code needs to be deployed during a cold start. As an example, AWS Lambda’s layers take this concept a step further. Here a function is defined as a set of layers.

For example, a function might use a Python runtime, add a numpy layer and a scikit-learn layer on top, and then the actual function code itself might be fairly small. This enables Lambda to preemptively provision not only the Python runtime but also the numpy and scikit-learn layers, as many different functions might build upon them.
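As a rough illustration of how this looks from the developer side, here is a sketch using boto3’s Lambda client; the role, bucket, and layer ARNs are placeholders:

```python
# Sketch of registering a Lambda function built from layers, using boto3.
# The ARNs, role, and bucket names below are placeholders.
import boto3

lam = boto3.client("lambda")

lam.create_function(
    FunctionName="predict-churn",
    Runtime="python3.9",
    Role="arn:aws:iam::123456789012:role/lambda-exec",  # placeholder role
    Handler="handler.predict",
    Code={"S3Bucket": "my-deploy-bucket", "S3Key": "predict-churn.zip"},
    # The heavy dependencies live in layers, so the platform can cache and
    # pre-provision them independently of the (small) function code above.
    Layers=[
        "arn:aws:lambda:eu-west-1:123456789012:layer:numpy:3",
        "arn:aws:lambda:eu-west-1:123456789012:layer:scikit-learn:1",
    ],
)
```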

Erwin also discussed this issue in an article.

The anatomy of the runtime of a FaaS platform. Source.

Bas van Essen:

Thanks for the explanation and link. How do vendors overcome the other overhead types you mentioned?

Johannes Grohmann:

Unfortunately, the involved vendors are not particularly forthcoming with information about their underlying implementations. To the best of our knowledge, the major open-source platforms currently do not employ any specialized optimization for these overheads. This is, again, due to the relatively new nature of the whole field.


Bas van Essen:

Alright, we discussed the definition of performance isolation earlier. The challenge for vendors here is to find a balance between efficiency and performance guarantees. What is your recommended best practice for organizing infrastructure to overcome this?

Johannes Grohmann:

We don’t think there are established best practices for that issue yet, either. In general, finding a balance between efficiency and performance also depends on the underlying infrastructure of the platform: does it use containers, micro-VMs, VMs, or a different concept altogether?

However, performance isolation has been a research topic for many years, as general virtualisation techniques such as VMs and containers have the same problems. Therefore, there exist a lot of approaches for achieving performance isolation, and it would be interesting to investigate how the existing solutions can be adapted to FaaS platforms.


Bas van Essen:

Can you summarise briefly which approaches to achieving performance isolation are the most widely applied?

Johannes Grohmann:

The authors of this paper give a nice overview of the topic. The first thing to do is to apply quotas, i.e., to define a limit on how much of each resource each virtualisation unit can consume. By applying hard quotas and forbidding overbooking, one can ensure that the assigned quotas are always available.

However, there are still resources (e.g., disk) where it is not trivial to apply quotas. Additionally, hard quotas and no overbooking of course come at the cost of lost efficiency whenever the guaranteed quotas are not used. Therefore, cloud providers usually do not apply such hard limits.
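To make the quota idea concrete: container runtimes already expose such knobs. A sketch using the Docker SDK for Python (docker-py), with a container standing in for a function instance:

```python
# Hard quotas on a "function instance" via the Docker SDK (docker-py):
# a sketch of the hard-quota end of the isolation/efficiency trade-off.
import docker

client = docker.from_env()

container = client.containers.run(
    "python:3.9-slim",
    command=["python", "-c", "print('hello from an isolated function')"],
    cpu_period=100_000,   # scheduler period in microseconds
    cpu_quota=50_000,     # at most 0.5 CPU cores per period
    mem_limit="256m",     # hard memory cap
    detach=True,
)
container.wait()          # block until the short-lived "function" exits
print(container.logs().decode())
```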


Bas van Essen:

You spoke about scheduling policies as another performance challenge for FaaS providers. Have vendors already found ways to cope with this?

Johannes Grohmann:

We do not have perfect knowledge of how the vendors deal with these challenges, as the implementations are usually closed-source. However, we think that scheduling policies remain an open issue with lots of interesting research problems.

The interesting part is that scheduling policies offer a lot of optimization potential by considering workflow deadlines, the location of input data and code, load balancing, and/or co-located functions. Therefore, using a (near-)optimal algorithm can lead to major cost savings. On the other hand, you have real-time constraints: more complex scheduling policies introduce a greater scheduling delay, which is to be avoided. Schedulers therefore need to make fast decisions that still lead to satisfactory placements.
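As a minimal illustration of that tension, here is a deliberately naive warm-first scheduler; it is a sketch, not any platform’s actual policy, and a real scheduler would also weigh deadlines, data locality, and load balancing while staying fast itself:

```python
# Deliberately naive warm-first scheduler (a sketch, not a vendor policy).
from dataclasses import dataclass, field

@dataclass
class Instance:
    host: str
    warm: bool

@dataclass
class Scheduler:
    instances: list = field(default_factory=list)

    def schedule(self) -> Instance:
        # Prefer a warm instance to avoid provisioning overhead...
        for inst in self.instances:
            if inst.warm:
                return inst
        # ...otherwise pay a cold start on a freshly provisioned host.
        cold = Instance(host="new-host", warm=False)
        self.instances.append(cold)
        return cold

sched = Scheduler([Instance("host-a", warm=False), Instance("host-b", warm=True)])
print(sched.schedule())  # picks the warm instance on host-b
```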

One approach to optimized scheduling was presented by Cristina Abad, one of our co-authors, in this paper describing the optimization. It is an improvement over current open-source platforms at least. However, this is just a first step, and there is still room for more research in that area.


Bas van Essen:

As you stated previously, many techniques have been proposed to predict the performance of traditional software systems, but it is still unclear how they can be adapted to FaaS platforms. Which techniques seem to have the most potential?

Johannes Grohmann:

One way is to utilise architectural performance models. However, they need to model both the hardware and platform structure (the cloud-provider view) and the application (the cloud-customer view), which can be hard to gather at the same time.

We are currently working on automated extraction techniques from the cloud-provider point of view that infer information about the application from monitoring data. One can also use machine learning models to try to predict the performance of individual functions and/or requests; we are working on an algorithm that tries to achieve that as well.
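A toy version of that idea, using scikit-learn on synthetic “monitoring” features; the group’s actual extraction techniques are far more involved than this sketch:

```python
# Toy latency predictor trained on synthetic monitoring data.
# Features, numbers, and the latency formula are all made up for illustration.
import numpy as np
from sklearn.ensemble import RandomForestRegressor

rng = np.random.default_rng(0)
n = 1000

# Hypothetical monitoring features: payload size (KB), memory (MB), host load.
X = np.column_stack([
    rng.uniform(1, 1024, n),                # payload_kb
    rng.choice([128, 256, 512, 1024], n),   # memory_mb
    rng.uniform(0, 1, n),                   # co-located load on the host
])
# Synthetic latency (ms): grows with payload and load, shrinks with memory.
y = 5 + 0.02 * X[:, 0] + 2000 / X[:, 1] + 30 * X[:, 2] + rng.normal(0, 1, n)

model = RandomForestRegressor(n_estimators=50).fit(X[:800], y[:800])
pred = model.predict(X[800:])
print(f"mean abs error: {np.abs(pred - y[800:]).mean():.2f} ms")
```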


Bas van Essen:

Alright, thanks. Let’s discuss the pay-for-what-you-use model further, which in your eyes is a typical trait of serverless architectures. You stated that this can turn out to be more expensive than VMs at higher workload intensities. Can you describe a context (with numbers) in which this happens?

Johannes Grohmann:

I can try to give a concrete example: let’s assume one function execution costs 0.001€ on a given FaaS platform. We have a (more or less) constant workload of 1000 requests/s that is not very latency-sensitive and can therefore be handled sequentially. This would cost around 1€ per second to run.

However, I can also rent a VM, deploy my function, and use it to serve the same requests. I need to rent a VM that is able to serve 1000 requests/s, which costs about 0.75€ per second. In this case, using the VM would turn out to be cheaper, saving about 0.25€ per second while serving the same number of requests.
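The arithmetic of this example fits in a few lines; the break-even rate below assumes, for simplicity, that the VM price stays flat regardless of load:

```python
# Reproducing the back-of-the-envelope numbers from the example above.
PRICE_PER_EXECUTION = 0.001   # EUR per function execution (FaaS)
VM_PRICE_PER_S = 0.75         # EUR per second for a VM serving 1000 req/s
REQUESTS_PER_S = 1000

faas_cost = REQUESTS_PER_S * PRICE_PER_EXECUTION   # 1.00 EUR/s
vm_cost = VM_PRICE_PER_S                           # 0.75 EUR/s
print(f"FaaS: {faas_cost:.2f} EUR/s, VM: {vm_cost:.2f} EUR/s")

# Break-even: below this request rate, the FaaS platform is cheaper
# (assuming, unrealistically, a flat VM price at any load).
break_even = VM_PRICE_PER_S / PRICE_PER_EXECUTION  # 750 req/s
print(f"break-even at {break_even:.0f} requests/s")
```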

This can be due to the additional overhead spent by the FaaS platform in trying to estimate your resource needs without having information about your workload intensity, plus the additional scheduling overhead, etc.


Bas van Essen:

And the solution you propose consists of more complex pricing models. Presumably, more complex pricing models are not very transparent or user-friendly for the consumer; what could they look like?

Johannes Grohmann:

One way would be to introduce a concept like resource reservation, which is already known from the domain of VMs. If I, as a customer, know beforehand which and how many requests to expect in the following minute/hour/day, I can book resources in advance. These resources are then sold at a cheaper price compared to spot instances (for VMs).

On the downside, I still have to pay for these resources even if I do not use them. I would like to note that this kind of pricing and resource reservation should still be based on the actual requests that I book in advance, as the operational concerns are usually abstracted away in the serverless context.


Bas van Essen:

Do you think serverless architectures can/should only be called serverless if the pay-for-what-you-use model is applied? That would no longer be the case if you book in advance, as you do not know for sure whether you will really consume it in practice. Do we need a broader definition?

Johannes Grohmann:

I would not alter the definition. First off, this proposal is only designed for the niche case of very big customers with constant load demands, and they should still make a rather conservative reservation that they will definitely use, so it remains pay-per-use. Additionally, the proposed cost model can be altered to just give a small discount on all used requests that were reserved/declared early, while all “normal”, non-reserved requests are charged at the normal rate. This way, it is pay-per-use in any case.
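The discounted-reservation variant Grohmann describes can be sketched in a few lines; the rate and the discount factor are hypothetical:

```python
# Sketch of the discounted-reservation model described above: used requests
# that were reserved in advance get a small discount, everything else pays
# the normal rate, and unused reservations are not billed at all.
NORMAL_RATE = 0.001    # EUR per request (hypothetical)
DISCOUNT = 0.8         # reserved requests pay 80% of the normal rate

def bill(actual_requests, reserved_requests):
    discounted = min(actual_requests, reserved_requests) * NORMAL_RATE * DISCOUNT
    overflow = max(actual_requests - reserved_requests, 0) * NORMAL_RATE
    return discounted + overflow  # still strictly pay-per-use

print(bill(1_200_000, 1_000_000))  # 1M discounted + 200k at the normal rate
```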


Bas van Essen:

Then the last point you describe as a challenge: a lack of standardization and benchmarks for FaaS platforms. This is one of the main issues your SPEC FaaS research group is currently working on. Very recently, we had a related interview about another research paper trying to tackle this problem. I do not know if you have read that paper, but could you check whether it is in line with the frameworks you are creating to compare serverless architectures?

Johannes Grohmann:

I did not go through the paper thoroughly, but at first sight, the goals of this study look complementary to the benchmark we are working on. Our focus is on comparing FaaS platforms from a performance view, leaving the functional offerings aside. Additionally, we want to provide a realistic benchmark suite, consisting of multiple realistic workloads and use cases, and we want to ensure reproducibility. This way, the benchmark is not a snapshot comparison, but can be used to track developments in the performance of the individual FaaS platforms over time.


Bas van Essen:

It’s great that initiatives like yours are being put in motion. With regards to the roadmap, how would you prioritize the various performance challenges?

Johannes Grohmann:

We are currently prioritizing the last challenge, as our main focus is the creation of the benchmark. I want to use this opportunity to invite interested researchers and/or industrial partners to join our efforts and to propose and discuss solutions for the aforementioned challenges.