Evaluation of Serverless Technologies at Jet

Khalid Hasanov
Jan 21

Feature Evaluation Criteria for Serverless Function Runtimes

We identified the features required of a serverless runtime based on the goal of minimising adoption barriers and maintenance cost. The following criteria formed the basis of our evaluation:

  1. Supported Languages — We have multiple teams using different tech stacks and programming languages. To get wide adoption of serverless within the company, it is important that our choice of serverless runtime supports all the languages used across different teams at Jet.
  2. Event Triggers — Support for various triggers: HTTP, Kafka, Azure Cosmos DB, Azure Blob Storage, etc. HTTP triggers need little justification, as they are the simplest and apply to the widest range of use cases. The other important trigger for us to evaluate was the Kafka trigger, since Kafka is our main streaming platform for asynchronous communication between microservices.
  3. Integration with Existing Infrastructure — All new microservices at Jet are deployed on Nomad and get transparent integration with Consul for service discovery, Vault for secrets management, Prometheus and Grafana for monitoring, and Splunk for log management. This means that if we can deploy our choice of serverless runtime on Nomad, we get all those integrations at almost zero cost.
  4. Complexity to Manage — A system with complex runtime dependencies could be difficult and costly to operate. We wanted to avoid any such serverless runtime unless there was a strong reason to accept that complexity.
  5. Onboarding and Developer Tooling — Serverless, as a new paradigm, already demands a different way of thinking from developers. Ideally, we would like to provide a serverless-oblivious deployment pipeline, so that, if we decided to adopt a serverless tech stack, developers could use existing tooling without needing to think differently.
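To make the developer-facing surface concrete, here is a minimal sketch of what a function can look like using the OpenFaaS Python template, whose convention is a module-level `handle(req)` entry point. The JSON echo logic is a hypothetical placeholder for illustration, not one of our production functions:

```python
# Minimal OpenFaaS-style Python handler sketch.
# The OpenFaaS python template invokes a module-level `handle(req)` function;
# the greeting logic below is a hypothetical placeholder.
import json


def handle(req: str) -> str:
    """Echo the caller's name from a JSON payload, defaulting to 'world'."""
    try:
        payload = json.loads(req) if req else {}
    except json.JSONDecodeError:
        payload = {}
    name = payload.get("name", "world")
    return json.dumps({"message": f"Hello, {name}!"})
```

A handler this small is the whole unit of deployment, which is exactly why the surrounding tooling (triggers, secrets, monitoring) dominates the adoption decision.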
Main Features of FaaS Runtimes
The Reasons Some FaaS Runtimes Were Not Considered

Performance Evaluation Criteria for Serverless Function Runtimes

The main performance criteria for our evaluation were cold-start time and auto-scaler efficiency. These two are related, but are not the same thing. Let’s elaborate a bit more on this.

Evaluating the Benefits of Serverless Function Runtimes

The features and performance provided by a specific serverless function runtime do not matter if, at the end of the day, we cannot answer the question, “why serverless?”. Therefore, we decided to evaluate the benefits of serverless computing as claimed by the community: cost saving and improved developer productivity.

Cost Saving

Proponents justify this promise by the fact that serverless function runtimes can scale down to zero and scale up from zero on demand. This can potentially save some cost, and we do some cost estimation below.

Cost Estimation

Let’s assume we are going to redesign about 1000 microservices (about 30% of our microservices) as serverless functions. We believe 30% here is over-optimistic, as not all workloads fit into a serverless model and even if they do, it would require a huge effort to redesign all these services and integrate them together.

Azure Resource Consumption Billing Calculation
--------------------------------------------------------------------
Resource Consumption per Function per 2 Minutes:
Executions:                                    24,000 executions
Execution duration:                            1 second
Total resource consumption:                    24,000 seconds

Resource consumption in GB:
200 MB * 100 servers / 1024 MB ~ 20 GB

Total GB-s per 2 minutes per function:         20 GB * 24,000 s = 480,000 GB-s
Total GB-s per 24 hours (4 minutes of load per day): 960,000 GB-s
Total GB-s per function in 30 days:            28,800,000 GB-s
--------------------------------------------------------------------
Billable Resource Consumption
Resource consumption:                          28,800,000 GB-s
Monthly free grant:                          -    400,000 GB-s
Total monthly consumption per function:        28,400,000 GB-s

Monthly Resource Consumption Cost per Function
Billable resource consumption:                 28,400,000 GB-s
Resource consumption price:                  x $0.000016/GB-s
Total cost per function:                       $454.40
--------------------------------------------------------------------
Executions Billing Calculation
Total monthly executions:                      1,440,000 executions
Monthly free executions:                     - 1,000,000 executions
Monthly billable executions:                     440,000 executions

Monthly Executions Cost
Monthly billable executions:                     440,000 executions
Price per million executions:                  $0.20
Monthly execution cost:                        $0.088

Total Monthly Consumption Bill per Function:   $454.488
--------------------------------------------------------------------
Total Monthly Consumption Bill for 1000 Functions: $454,488
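The arithmetic of this estimate can be reproduced in a few lines. All inputs are the assumptions stated above (200 MB per instance across 100 servers, 24,000 one-second executions per 2 minutes, roughly 4 minutes of such load per day) together with the consumption-plan rates used in the estimate:

```python
# Reproduce the consumption-plan billing estimate above.
# All inputs are the assumptions stated in the estimate, not measured values.
GB_S_PRICE = 0.000016           # $ per GB-second (rate used in the estimate)
EXEC_PRICE = 0.20 / 1_000_000   # $ per execution
FREE_GB_S = 400_000             # monthly free grant, GB-s
FREE_EXECS = 1_000_000          # monthly free executions

memory_gb = 20                          # 200 MB * 100 servers / 1024 MB ~ 20 GB
gb_s_per_2min = memory_gb * 24_000      # 24,000 one-second executions -> 480,000 GB-s
gb_s_per_month = gb_s_per_2min * 2 * 30 # 4 minutes of load per day, 30 days

execs_per_month = 24_000 * 2 * 30       # 1,440,000 executions per month

resource_cost = max(gb_s_per_month - FREE_GB_S, 0) * GB_S_PRICE
execution_cost = max(execs_per_month - FREE_EXECS, 0) * EXEC_PRICE
per_function = resource_cost + execution_cost

print(f"Monthly bill per function:    ${per_function:,.3f}")
print(f"Monthly bill, 1000 functions: ${per_function * 1000:,.0f}")
```

Note how the GB-second charge dwarfs the per-execution charge; the estimate is dominated by memory footprint and duration, not call volume.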

Improving Developer Productivity

This is one of the biggest promises of serverless functions; however, our evaluation shows that this claim holds mostly for small start-ups or for individuals who do not have established developer tooling, continuous integration/continuous deployment, and container orchestrators in place. In fact, if we take how our developers at Jet deploy and manage their services in production as an example, we would not see any significant benefit from a serverless function runtime in terms of infrastructure abstraction. Our microservices platform already provides many layers of abstraction that hide most, if not all, infrastructure details from our developers. Most of the time, deploying a new microservice is a matter of pushing a deployment file into a version-control system. In addition, we have auto-scalers in place to automatically scale the number of VMs as well as the number of containers. Usually, our developers do not need to think about the number of instances for their microservices running on Nomad, nor do they need to think about failures that may happen to a VM or a container.

Kafka Controllers in FaaS Runtimes

Experiments

We conducted some experiments to understand the performance of Azure Functions and of OpenFaaS on Nomad. The main performance criteria we considered were the cold-start time of a single request and cold-start behaviour under continuous load for a short period of time.
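The single-request measurement can be sketched as follows: time the first request (which pays the cold start) and a subsequent request (served warm) against an HTTP endpoint. Here a local stub server that sleeps before answering its first request stands in for a real function gateway; that stub, and its 0.5-second delay, are assumptions for illustration only, whereas in the actual experiments the targets were the Azure Functions and OpenFaaS endpoints:

```python
# Sketch of the cold-start measurement: compare first (cold) vs second (warm)
# request latency. A local stub simulates a cold start by sleeping once.
import http.server
import threading
import time
import urllib.request

COLD_START_DELAY = 0.5  # simulated warm-up in seconds (illustrative only)
_served_once = False


class StubFunction(http.server.BaseHTTPRequestHandler):
    def do_GET(self):
        global _served_once
        if not _served_once:            # the first hit pays the "cold start"
            time.sleep(COLD_START_DELAY)
            _served_once = True
        self.send_response(200)
        self.end_headers()
        self.wfile.write(b"ok")

    def log_message(self, *args):       # keep the sketch quiet
        pass


def timed_get(url: str) -> float:
    """Return the wall-clock latency of one GET request, in seconds."""
    start = time.perf_counter()
    urllib.request.urlopen(url).read()
    return time.perf_counter() - start


server = http.server.HTTPServer(("127.0.0.1", 0), StubFunction)
threading.Thread(target=server.serve_forever, daemon=True).start()
url = f"http://127.0.0.1:{server.server_port}/"

cold = timed_get(url)   # includes the simulated warm-up
warm = timed_get(url)   # served by the already-warm function
server.shutdown()
print(f"cold: {cold * 1000:.0f}ms, warm: {warm * 1000:.0f}ms")
```

For cold starts under sustained load we used Vegeta against the real endpoints; the reports are shown below.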

Experiments using Azure Functions

The first invocation of the function took 9567ms; about 76% of this was spent in server processing, which included warming up the function. Another 16% was spent on DNS lookup.

The First Invocation of the Azure Function
The Second Invocation of the Azure Function
Azure Functions Load Testing
Azure Functions Vegeta Report

Experiments using OpenFaaS on Nomad

The first invocation of the OpenFaaS function took 6556ms; about 80% of this was spent in server processing, which included warming up the function. Another 8% was spent on DNS lookup.

The First Invocation of the OpenFaaS Function
The Second Invocation of the OpenFaaS Function
OpenFaaS Load Testing
OpenFaaS Vegeta Report

Conclusion

Our research suggests that the serverless technology stacks currently available to us are not yet mature enough to deliver the cost savings and productivity improvements we were looking for compared to our current microservice strategy. We do, however, see the potential of serverless computing; we would like to see serverless runtimes go beyond functions and provide sophisticated function controllers to abstract away IO interactions between different functions, between functions and external systems, and between functions and their triggers. This idea is not new; we have already implemented and used similar kinds of abstractions in Jet’s Order Management System, and Azure Durable Functions provides a similar workflow engine. What is missing in these solutions is generality: ideally, the controllers should support multiple languages and pluggable event sources such as Kafka. We believe this would improve developer productivity significantly and help save costs.

Should I use OpenFaaS or Azure Functions?

If you answer “yes” to the following questions, then it makes sense to use OpenFaaS or another non-managed serverless function runtime, provided it meets your needs:

  1. Do you have multiple teams using different languages?
  2. Are you a heavy user of Kafka and are looking for a Kafka controller?
  3. Are you ready to extend existing controllers if they don’t satisfy your needs?
  4. Should your function runtime be able to handle a high number of requests?
  5. Would you/your company like to have full control of your function runtime?

On the other hand, Azure Functions may be the better choice if you value:

  1. A powerful workflow engine, i.e. Azure Durable Functions
  2. Better developer-tooling such as integration with IDEs, local development and testing
  3. Support from Microsoft




Jet Tech

Sharing our engineering org’s learnings & stories as we build the world’s best experience to shop curated brands and city essentials in one place.

Written by Khalid Hasanov, Senior Software Engineer at Jet.com
