Lean OpenWhisk: Open Source FaaS for Edge Computing

David Breitgand
Published in Apache OpenWhisk · Jul 5, 2018

This blog is co-authored with Pavel Kravchenko as part of our work at IBM Research — Haifa*.

Unless you have been living in a cave for the last few years, chances are that you’ve encountered Function-as-a-Service (also known as “serverless” computing) and Edge Computing buzzwords.

Consider a typical IoT setting depicted in Figure 1. There is an Edge Node (sometimes referred to as a Gateway) that has direct connectivity (e.g., via WiFi or Bluetooth) to the physical devices, such as sensors and actuators, on one side, and to a Cloud on the other. A typical Edge Node would be a Linux box that can execute workloads in proximity to the managed physical devices. This facilitates shorter control loops, saves bandwidth, and provides for autonomous operation.

Figure 1: Typical IoT Edge Computing Topology

How is this connected to serverless, you might ask? The most obvious connection is that many edge workloads (especially the IoT ones) are event-driven and the serverless programming model is an ideal fit for event-driven applications.

If you are a user of Amazon AWS Lambda, then you might be aware of Amazon Greengrass, an edge solution allowing you to execute Python 2.7 Lambda functions on the Edge Node. Likewise, if you are an Azure Functions aficionado, you might know that Azure Functions are offered in preview as an IoT Edge module. But what if you fancy a community-driven open source FaaS framework, such as Apache OpenWhisk (if you are not familiar with OpenWhisk, take a few moments to read about it)? Can you install it on the Edge Node and run your OpenWhisk actions there?

The answer is: it depends. If you have a highly capable Edge Node at your command, probably all you have to do to equip your Gateway with serverless capability is to install Apache OpenWhisk. However, in many other typical scenarios, your Edge Node might be resource constrained. It might have only a relatively small amount of main memory, limited disk space, and just one or two cores.

If this is your case, using Apache OpenWhisk out of the box might not be feasible. For example, consider a Raspberry Pi 3 as your Gateway. It has just 1 GB of RAM. Installing a full-blown Apache OpenWhisk would require about 2.5 GB of RAM (just for the core components, before running any actions). Obviously, this is a non-starter.

Apache OpenWhisk is a production-grade multi-tenant system that aims at executing tens of thousands of user actions concurrently in a large compute cluster. But in the discussed use case, we only have one resource-constrained Edge Node. In all likelihood, we do not need all the components of Apache OpenWhisk to obtain the same functionality, do we?

So, the question is: can we have a “lean” OpenWhisk distribution that would be a fully functional Apache OpenWhisk, based on the same code, but much smaller? Fast-forwarding to the solution part of this blog: the answer is yes. Before getting to that part, let’s outline the challenges that we face:

  • We need to significantly downsize Apache OpenWhisk (to make it fit smaller form factors) without touching its core components, especially the Invoker and the Controller;
  • We do not need yet another project just for Lean OpenWhisk; rather, we want to continue using the Apache OpenWhisk code base as-is and consume all upstream changes.

Architecture

Figure 2 recaps the Apache OpenWhisk architecture. See an excellent explanation by Markus Thömmes on what’s under the hood in Apache OpenWhisk (note that Consul is no longer part of the architecture).

Figure 2: Apache OpenWhisk Architecture (source)

Figure 3 shows the structure of Lean OpenWhisk. Our architecture takes advantage of OpenWhisk’s pluggable design, supported through the Service Provider Interface (SPI) pattern. The Controller component of OpenWhisk instantiates a Load Balancer using SpiLoader. We implement our own Lean LoadBalancer that is loaded via the same mechanism, but with a different factory class name that provides a Lean LoadBalancer instance. In the Lean OpenWhisk distro, Controller and Invoker are compiled together. We refer to this as Controller-Invoker. The Lean LoadBalancer directly calls the Invoker’s method for handling an action queued in the in-memory queue.
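The gist of the SPI mechanism can be illustrated with a small sketch. This is written in JavaScript for consistency with the action code later in this post; the real Controller is Scala code that loads a factory through SpiLoader based on a configured class name, and all names below are illustrative only:

```javascript
// Illustrative sketch of SPI-style load-balancer selection (not actual
// OpenWhisk code). Each "factory" builds one LoadBalancer implementation.
const factories = {
  // Regular OpenWhisk: publish the activation to Kafka for a remote Invoker.
  RegularBalancerFactory: () => ({
    publish: (action) => `sent ${action} to a Kafka topic`,
  }),
  // Lean OpenWhisk: hand the activation to the co-located Invoker directly.
  LeanBalancerFactory: () => ({
    publish: (action) => `queued ${action} for the local Invoker`,
  }),
};

// The Controller reads the factory name from configuration and instantiates
// whatever implementation it names -- no other Controller code changes.
function loadBalancer(factoryName) {
  return factories[factoryName]();
}

const balancer = loadBalancer("LeanBalancerFactory");
console.log(balancer.publish("sleepy")); // queued sleepy for the local Invoker
```

Swapping the configured factory name is the entire switch between the regular and the lean deployment, which is what lets Lean OpenWhisk reuse the upstream code base untouched.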

We use an in-memory queue object that mimics Apache Kafka. Removing Kafka from the architecture dramatically downsizes Lean OpenWhisk. We define a new Gradle project to build Lean OpenWhisk from the existing OpenWhisk code base. An Ansible script is used to deploy Lean OpenWhisk with the right options.
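To give a feel for the idea, here is a minimal in-memory queue sketch that mimics a Kafka-like produce/consume interface. This is an illustration under our own simplified assumptions, not the actual Lean OpenWhisk implementation:

```javascript
// Minimal in-memory stand-in for a Kafka-like message queue (illustrative).
// Producers append messages to a named topic; one consumer callback per
// topic drains them in FIFO order, all within a single process.
class InMemoryQueue {
  constructor() {
    this.topics = new Map();    // topic name -> pending messages
    this.consumers = new Map(); // topic name -> consumer callback
  }

  send(topic, message) {
    if (this.consumers.has(topic)) {
      // A consumer is attached: hand the message over with a direct call --
      // no broker, no network hop.
      this.consumers.get(topic)(message);
    } else {
      if (!this.topics.has(topic)) this.topics.set(topic, []);
      this.topics.get(topic).push(message);
    }
  }

  subscribe(topic, callback) {
    this.consumers.set(topic, callback);
    // Drain anything produced before the consumer attached.
    (this.topics.get(topic) || []).forEach(callback);
    this.topics.set(topic, []);
  }
}

// Usage: the Lean LoadBalancer plays producer; the in-process Invoker
// plays consumer.
const queue = new InMemoryQueue();
const processed = [];
queue.subscribe("invoker0", (activation) => processed.push(activation));
queue.send("invoker0", { action: "sleepy", timeout: 30 });
console.log(processed.length); // 1
```

Because producer and consumer live in the same process, the queue reduces to a function call plus a buffer, which is precisely why dropping Kafka saves so much memory.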

Figure 3: Lean OpenWhisk Architecture

Getting Started

And now, it is time to take Lean OpenWhisk for a spin! The Lean OpenWhisk Git fork can be found here. Just follow the instructions on how to install and run Lean OpenWhisk. You have a number of options to select from:

  • Quick start using Docker Compose
  • Quick start using Vagrant
  • “Native” installation of Lean OpenWhisk (recommended only for developers)

Performance Evaluation Setup

After installing Lean OpenWhisk, you might be interested to run some simple stress tests to see how well it performs in your environment (we will be happy to hear back from you about your experience).

Obviously, we need some action to run. To this end, we first define a simple Node.js action and save it as sleepy.js in the local file system. This action is anything but sophisticated: it just sleeps for the number of milliseconds specified in its argument and echoes the argument back.

function main(args) {
    return new Promise(function(resolve, reject) {
        setTimeout(function() {
            resolve({ timeout: args.timeout });
        }, args.timeout);
    });
}
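Since main is an ordinary function returning a Promise, you can sanity-check it locally with plain node before deploying. The snippet below pastes the function body rather than requiring sleepy.js, since OpenWhisk actions do not export anything:

```javascript
// The action body from sleepy.js, pasted here for a quick local check.
function main(args) {
    return new Promise(function(resolve, reject) {
        setTimeout(function() {
            resolve({ timeout: args.timeout });
        }, args.timeout);
    });
}

// Invoke with a 30 ms timeout and verify the argument is echoed back.
main({ timeout: 30 }).then(function(result) {
    console.log(JSON.stringify(result)); // {"timeout":30}
});
```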

Second, we create the sleepy action in Lean OpenWhisk, using the wsk CLI just the same way we would have created it with regular OpenWhisk:

$ wsk action create sleepy sleepy.js

We use this simple action only to stress our Lean OpenWhisk implementation and obtain a performance baseline reflecting the action invocation overhead.

For this blog, we use an Oracle VM VirtualBox machine based on an osboxes image of Ubuntu 14.04, with 1 GB of RAM and 1 CPU core, to run Lean OpenWhisk. The underlying hardware is an Intel(R) Core(TM) i7-6820HQ CPU @ 2.70GHz (in one of our follow-up blogs, we will show how to use Lean OpenWhisk on a Raspberry Pi 3).

Results

In our tests, we vary two key parameters: the concurrency of the back-end (i.e., how many simultaneous actions Lean OpenWhisk can execute) and that of the loadtest client (i.e., how many simultaneous action invocations are issued by the client). We vary these parameters in lockstep from 1 to 5. The action sleeps 30 ms and then returns the timeout argument.

For each concurrency value, we replicate the experiment 30 times. Figures 4 and 5 present our results for throughput and latency, respectively. The latency in Figure 5 shows the time required to invoke the action; we subtract the 30 ms sleep from the total latency to study the invocation overhead.
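The post-processing of the raw measurements is straightforward and can be sketched as follows. The sample latencies below are made up for illustration; only the 30 ms subtraction and the median computation reflect what we actually do:

```javascript
// Post-processing sketch: given raw end-to-end latencies from the load
// generator, subtract the action's 30 ms sleep to isolate the invocation
// overhead, then take the median. The sample values are illustrative only.
const SLEEP_MS = 30;
const rawLatencies = [44, 45, 43, 46, 44, 47, 43, 45, 44, 46]; // made-up samples

function median(values) {
  const sorted = [...values].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  return sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
}

const overheads = rawLatencies.map((l) => l - SLEEP_MS);
console.log(median(overheads)); // 14.5
```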

As one can see, at concurrency level 1, Lean OpenWhisk attains a median throughput of 24 requests per second with very low standard deviation. As Figure 5 shows, at this concurrency level, the median latency of action invocation is 14 ms. Again, the variance is very small. As we increase concurrency both at the backend and in the client, throughput increases as well, but at the same time, the latency of invocation grows and the variance increases significantly. We observe that at concurrency level 5, Lean OpenWhisk reaches saturation, and it does not make sense to increase concurrency beyond this point in our resource-constrained environment.

Figure 4: Throughput of Lean OpenWhisk (requests per second) as a function of the backend concurrency
Figure 5: Latency of Lean OpenWhisk as a function of the backend concurrency

Conclusion

So, hopefully, you had some fun installing and testing Lean OpenWhisk. However, at this point you might wonder how to put it to work for your IoT applications at the edge. Indeed, Lean OpenWhisk is but one important building block in your edge solutions. In our next blog, we will show how to build a simple but fully functional IoT application that combines a serverless edge and a cloud. Stay tuned!

* This project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 761699

DISCLAIMER

All views expressed in this blog are those of the authors and do not represent the opinions of IBM. The content of this blog cannot and should not be viewed as an explicit or implicit description of any IBM product or service or construed as a product or a service plan or an indication of intent by IBM. Any action you take upon the information in this blog is strictly at your own risk. Neither the authors nor IBM will be liable for any losses and damages in connection with the use of the code and information published in this blog.
