Apache OpenWhisk Runtime for BallerinaLang

OpenWhisk is an open source serverless platform initiated by IBM and Adobe in 2016 and now hosted in the Apache Incubator. Among the handful of serverless frameworks available today, OpenWhisk stands out for its platform-independent design and for the ability to add language runtime support just by creating a Docker image with a simple HTTP proxy. Apache Stratos used a similar architecture some time ago. OpenWhisk supports Node.js, Python, Java, Go, PHP and Swift out of the box. It can currently be deployed on Docker and Kubernetes, and the community is planning to extend it to support DC/OS in the future. For evaluation and development purposes it can be run on a local machine using either Docker or Vagrant. Its only dependency on the container platform is the container orchestration interface. It does not depend on any platform-native features for function execution. All of its internal features have been implemented using its own components and by reusing industry-proven technologies such as Nginx, CouchDB, and Kafka.

Recently Sanjiva made an open request on the BallerinaLang mailing list to implement a runtime extension for Apache OpenWhisk to support Ballerina functions. In response, I evaluated OpenWhisk and implemented a new OpenWhisk runtime for Ballerina. In this article, I will explain how OpenWhisk works and how the runtime extension was written.

OpenWhisk Architecture

Image reference: https://github.com/apache/incubator-openwhisk/blob/master/docs/about.md

As shown in the above architecture diagram, OpenWhisk deploys functions as actions. Actions run in containers and can be implemented in any supported language. A function must accept its input parameters as a single JSON object and return its output the same way. Actions can be triggered using event triggers, the OpenWhisk API gateway, or the OpenWhisk CLI. Event triggers use rules to forward events to the functions that handle them. The mapping between event triggers and actions can be changed dynamically as required. Actions can either implement business logic on their own or use backend services to provide the intended functionality. OpenWhisk also supports implementing service workflows by chaining a collection of actions together using sequences.
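The JSON-in, JSON-out contract and the idea of a sequence can be sketched in a few lines of Go. This is only an illustrative model: the Action type and the Hello, Shout, Sequence and InvokeJSON functions are hypothetical names made up for this example, not part of OpenWhisk.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// Action is a hypothetical type modelling the contract described above:
// one JSON object in, one JSON object out.
type Action func(params map[string]interface{}) map[string]interface{}

// Hello greets the caller named in params["name"], defaulting to "World".
func Hello(params map[string]interface{}) map[string]interface{} {
	name, ok := params["name"].(string)
	if !ok || name == "" {
		name = "World"
	}
	return map[string]interface{}{"payload": "Hello, " + name + "!"}
}

// Shout upper-cases the "payload" string produced by a previous action.
func Shout(params map[string]interface{}) map[string]interface{} {
	payload, _ := params["payload"].(string)
	return map[string]interface{}{"payload": strings.ToUpper(payload)}
}

// Sequence chains actions so that the output of one becomes the input of the next.
func Sequence(actions ...Action) Action {
	return func(params map[string]interface{}) map[string]interface{} {
		for _, action := range actions {
			params = action(params)
		}
		return params
	}
}

// InvokeJSON decodes a JSON object, runs the action and encodes the result.
func InvokeJSON(action Action, input string) (string, error) {
	var params map[string]interface{}
	if err := json.Unmarshal([]byte(input), &params); err != nil {
		return "", err
	}
	out, err := json.Marshal(action(params))
	if err != nil {
		return "", err
	}
	return string(out), nil
}

func main() {
	out, err := InvokeJSON(Sequence(Hello, Shout), `{"name":"Ballerina"}`)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(out) // prints {"payload":"HELLO, BALLERINA!"}
}
```

In OpenWhisk itself each action in a sequence runs in its own container; the in-process chaining here only illustrates how the data flows.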

OpenWhisk Messaging Architecture

Image reference: https://github.com/apache/incubator-openwhisk/blob/master/docs/about.md

In OpenWhisk, Nginx acts as a reverse proxy that forwards HTTP requests from event triggers, the OpenWhisk CLI, and applications to the controller. The controller exposes a REST API for system operations and for invoking actions. When an HTTP request reaches the controller, it first performs authentication and authorization, using the user and permission information managed in CouchDB. It then looks up the relevant action in CouchDB and forwards the request to an internal load balancer. The load balancer selects an invoker instance to execute the action and dispatches the request to it via the Kafka message broker for reliable message delivery. Once Kafka accepts the message, a response containing an activation identifier is sent back to the client. This process allows actions to be executed asynchronously while guaranteeing their execution.

Once the invoker receives the request via the message broker, it checks whether an action container is available and spawns one if none exists. The invoker then makes an HTTP request to the action’s HTTP proxy endpoint to initialize its state. Action container images are built by packaging an HTTP proxy together with the language runtime. The proxy exposes two HTTP resources: one for initializing its state by downloading the function, and one for executing the function. Once the function is initialized, the invoker dispatches the function request and waits for the response. The result of the function execution is stored in CouchDB against the previously generated activation identifier. Clients can either query the activation via the OpenWhisk API to find the result of an action execution, or invoke the action in synchronous mode and get the response directly.
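The asynchronous flow described above can be pictured with a toy, in-process model. In this sketch a buffered channel stands in for Kafka, a map stands in for CouchDB, and a goroutine plays the invoker. All names (System, Activation, Invoke, Result, Wait) and the "activation-N" identifier format are assumptions made for illustration; the real components are separate services.

```go
package main

import (
	"fmt"
	"sync"
)

// Activation is a hypothetical message passed from the controller to the invoker.
type Activation struct {
	ID     string
	Params map[string]interface{}
}

// System is a toy model: queue stands in for Kafka, results for CouchDB.
type System struct {
	queue   chan Activation
	mu      sync.Mutex
	results map[string]map[string]interface{}
	nextID  int
	pending sync.WaitGroup
}

// NewSystem starts one invoker goroutine that executes the given action.
func NewSystem(action func(map[string]interface{}) map[string]interface{}) *System {
	s := &System{
		queue:   make(chan Activation, 16),
		results: make(map[string]map[string]interface{}),
	}
	go func() {
		for msg := range s.queue {
			result := action(msg.Params)
			s.mu.Lock()
			s.results[msg.ID] = result // store the result against the activation ID
			s.mu.Unlock()
			s.pending.Done()
		}
	}()
	return s
}

// Invoke plays the controller: it queues the request and returns an
// activation identifier without waiting for the action to run.
func (s *System) Invoke(params map[string]interface{}) string {
	s.mu.Lock()
	s.nextID++
	id := fmt.Sprintf("activation-%d", s.nextID)
	s.mu.Unlock()
	s.pending.Add(1)
	s.queue <- Activation{ID: id, Params: params}
	return id
}

// Result looks up the stored result of an activation, if it has finished.
func (s *System) Result(id string) (map[string]interface{}, bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	result, ok := s.results[id]
	return result, ok
}

// Wait blocks until every accepted activation has been processed.
func (s *System) Wait() { s.pending.Wait() }

func main() {
	sys := NewSystem(func(p map[string]interface{}) map[string]interface{} {
		return map[string]interface{}{"payload": fmt.Sprintf("Hello, %v!", p["name"])}
	})
	id := sys.Invoke(map[string]interface{}{"name": "Ballerina"})
	fmt.Println("accepted:", id)
	sys.Wait()
	result, _ := sys.Result(id)
	fmt.Println(result["payload"])
}
```

Calling Result before the invoker has finished simply reports that no result is stored yet, which mirrors a client polling for an activation that is still running.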

Even though the above architecture diagram shows Consul being used together with the controller, it was removed at a later stage of the project due to difficulties in handling configurations via Ansible and executing deployments on container cluster managers.

Language Runtime Architecture

The OpenWhisk CLI can determine the language of a function by looking at its file extension. For instance, if a function is written in Node.js, an action can be created as follows:

cat > hello.js << EOL
function main(params) {
    var name = params.name || 'World';
    return {payload: 'Hello, ' + name + '!'};
}
EOL
wsk action create hello hello.js
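The extension-based lookup can be pictured as a small table from file extension to runtime. The sketch below is not the CLI's actual code; the kindsByExtension table, the kind names in it and the KindForFile function are assumptions chosen for illustration.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// kindsByExtension is an illustrative table, not the CLI's real data.
var kindsByExtension = map[string]string{
	".js":    "nodejs",
	".py":    "python",
	".swift": "swift",
	".php":   "php",
	".go":    "go",
	".jar":   "java",
}

// KindForFile returns the runtime kind for a file name and whether one was found.
func KindForFile(fileName string) (string, bool) {
	ext := strings.ToLower(filepath.Ext(fileName))
	kind, ok := kindsByExtension[ext]
	return kind, ok
}

func main() {
	for _, name := range []string{"hello.js", "hello.bal"} {
		if kind, ok := KindForFile(name); ok {
			fmt.Printf("%s -> %s\n", name, kind)
		} else {
			fmt.Printf("%s -> unknown extension, a Docker image must be given\n", name)
		}
	}
}
```

A table compiled into the binary like this one can only grow by rebuilding the program.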

However, at the moment the metadata of native languages is built directly into the OpenWhisk CLI, so adding a new native language requires recompiling the CLI. On the backend, the controller keeps track of the available language runtimes and their Docker images using a configuration file called runtimes.json. This file is only used for prefetching language runtime Docker images at OpenWhisk installation time. Nevertheless, a new language can be added to OpenWhisk without the above feature by passing the runtime Docker image tag as a parameter to the wsk action create command:

wsk action create hello <function>.<file-extension> --docker <language-action-docker-image>

The language action Docker image would need to be created by implementing an HTTP proxy and packaging the language runtime together. The HTTP proxy would need to expose two HTTP resources as follows:

Initialization Request

HTTP POST /init
Headers: Accept-encoding=gzip,deflate, Accept=application/json, Connection=Keep-Alive, Host=<host>:8080, User-agent=<user-agent>, Content-type=application/json, Content-length=<content-length>
{"value":{"name":"<function-name>","binary":false,"main":"<main-function-name>","code":"<function-source-code>"}}

Function Execution Request

HTTP POST /run
Headers: Accept-encoding=gzip,deflate, Accept=application/json, Connection=Keep-Alive, Host=<host>:8080, User-agent=<user-agent>, Content-type=application/json, Content-length=<content-length>
{"<function>":"<request-payload>"}

The initialization request is sent only once per action container to initialize its state, and the execution request is sent each time a function request needs to be processed.
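To make the two resources concrete, here is a minimal sketch of such a proxy using only Go's standard library. It is not a working runtime: Run merely echoes the request body instead of executing code, and the Proxy type, the 409 response to a second /init and the 400 response to /run before /init are choices made for this example rather than documented OpenWhisk behavior. The demo drives the handler in memory with httptest, so no port is opened.

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"io"
	"net/http"
	"net/http/httptest"
	"sync"
)

// initRequest mirrors the /init body shown above.
type initRequest struct {
	Value struct {
		Name   string `json:"name"`
		Binary bool   `json:"binary"`
		Main   string `json:"main"`
		Code   string `json:"code"`
	} `json:"value"`
}

// Proxy holds the state of a single action container.
type Proxy struct {
	mu          sync.Mutex
	code        string
	initialized bool
}

// Init stores the function source and returns an HTTP status code.
// Rejecting a second /init with 409 is this sketch's own choice.
func (p *Proxy) Init(body []byte) int {
	var req initRequest
	if err := json.Unmarshal(body, &req); err != nil {
		return http.StatusBadRequest
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.initialized {
		return http.StatusConflict
	}
	p.code, p.initialized = req.Value.Code, true
	return http.StatusOK
}

// Run stands in for function execution: it echoes the JSON request body.
// A real runtime would hand the body to the language runtime instead.
func (p *Proxy) Run(body []byte) (int, []byte) {
	p.mu.Lock()
	ready := p.initialized
	p.mu.Unlock()
	if !ready || !json.Valid(body) {
		return http.StatusBadRequest, []byte(`{"error":"not initialized or invalid JSON"}`)
	}
	return http.StatusOK, body
}

// Handler exposes Init and Run as the /init and /run resources.
func (p *Proxy) Handler() http.Handler {
	mux := http.NewServeMux()
	mux.HandleFunc("/init", func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body) // a truncated body fails JSON decoding in Init
		w.WriteHeader(p.Init(body))
	})
	mux.HandleFunc("/run", func(w http.ResponseWriter, r *http.Request) {
		body, _ := io.ReadAll(r.Body) // a truncated body fails validation in Run
		status, out := p.Run(body)
		w.Header().Set("Content-Type", "application/json")
		w.WriteHeader(status)
		w.Write(out)
	})
	return mux
}

// post sends one in-memory POST request to the handler.
func post(h http.Handler, path, body string) (int, string) {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(http.MethodPost, path, bytes.NewBufferString(body)))
	return rec.Code, rec.Body.String()
}

func main() {
	// Inside a container the handler would instead be served with
	// http.ListenAndServe(":8080", handler).
	handler := new(Proxy).Handler()
	fmt.Println(post(handler, "/init", `{"value":{"name":"hello","binary":false,"main":"main","code":"<function-source-code>"}}`))
	fmt.Println(post(handler, "/run", `{"name":"Ballerina"}`))
}
```

Keeping Init and Run free of HTTP types makes the container state easy to test without a server.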

Ballerina Runtime for OpenWhisk

Following the above language runtime architecture, the Ballerina runtime for OpenWhisk was implemented using an HTTP proxy and a Docker image. The HTTP proxy was implemented in Go using the gorilla/mux library. In the proxy, the function initialization HTTP resource reads the HTTP request body, writes the function source code to the file system, and builds it using the Ballerina tools distribution:

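import (
	"encoding/json"
	"io/ioutil"
	"net/http"
	"os/exec"
)

// Value carries the function metadata and source code sent by the invoker.
type Value struct {
	Name   string `json:"name"`
	Binary bool   `json:"binary"`
	Main   string `json:"main"`
	Code   string `json:"code"`
}

// Init is the body of the /init request.
type Init struct {
	Value Value `json:"value"`
}

// InitHandler saves the function source code to disk and compiles it.
func InitHandler(writer http.ResponseWriter, request *http.Request) {
	defer request.Body.Close()

	// Decode request message body
	var init Init
	if err := json.NewDecoder(request.Body).Decode(&init); err != nil {
		http.Error(writer, err.Error(), http.StatusBadRequest)
		return
	}

	// Write function to a ballerina file
	fileName := "function.bal"
	if err := ioutil.WriteFile(fileName, []byte(init.Value.Code), 0644); err != nil {
		http.Error(writer, err.Error(), http.StatusInternalServerError)
		return
	}

	if !init.Value.Binary {
		// Compile ballerina function
		if _, err := exec.Command("sh", "-c", "ballerina build function.bal").Output(); err != nil {
			http.Error(writer, err.Error(), http.StatusInternalServerError)
			return
		}
	}
	writer.WriteHeader(http.StatusOK)
}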

The function execution HTTP resource runs the function using a shell command and reads the response from standard output (STDOUT):

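// RunHandler executes the compiled function and returns its standard output.
// It uses the "log", "net/http" and "os/exec" packages.
func RunHandler(writer http.ResponseWriter, request *http.Request) {
	// Execute ballerina function
	out, err := exec.Command("sh", "-c", "ballerina run function.balx").Output()
	if err != nil {
		http.Error(writer, err.Error(), http.StatusInternalServerError)
		return
	}
	log.Printf("%s", out)
	writer.Write(out)
}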
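RunHandler waits for the command for as long as it takes. One possible hardening, shown here only as a sketch, is to bound the execution time with a context. RunWithTimeout and DefaultLimit are hypothetical names, the ten second limit is an arbitrary assumption, and the demo runs echo rather than ballerina so that it works on any Unix-like machine.

```go
package main

import (
	"context"
	"fmt"
	"os/exec"
	"time"
)

// DefaultLimit is an arbitrary limit chosen for this sketch.
const DefaultLimit = 10 * time.Second

// RunWithTimeout runs a command and returns its standard output.
// The process is killed if it is still running when the limit expires.
func RunWithTimeout(limit time.Duration, name string, args ...string) ([]byte, error) {
	ctx, cancel := context.WithTimeout(context.Background(), limit)
	defer cancel()

	out, err := exec.CommandContext(ctx, name, args...).Output()
	if ctx.Err() == context.DeadlineExceeded {
		return nil, fmt.Errorf("%s timed out after %s", name, limit)
	}
	return out, err
}

func main() {
	out, err := RunWithTimeout(DefaultLimit, "echo", "hello")
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("%s", out)
}
```

In the handler above, the call could become RunWithTimeout(DefaultLimit, "ballerina", "run", "function.balx"), which also avoids going through sh.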

With this implementation a Ballerina function can now be deployed and executed on OpenWhisk as follows:

cat > hello.bal << EOL
import ballerina.io;

function main (string[] args) {
    var output = { "hello": "world!" };
    io:println(output);
}
EOL
wsk action create hello hello.bal --docker imesh/ballerina-action
wsk action invoke hello --result
{
    "hello": "world!"
}

The source code and instructions for this implementation can be found in the following GitHub repository:

https://github.com/imesh/openwhisk-runtime-ballerina

Conclusion

Apache OpenWhisk stands out among the serverless frameworks available today for its platform-independent design and its extensible language runtime model. It is still under incubation at the Apache Software Foundation (ASF), and its first release is now being planned on the Dev mailing list. OpenWhisk supports Node.js, Python, Java, Go, PHP and Swift as native programming languages. Support for any other language can be added by implementing an HTTP proxy and packaging it with the language runtime in a Docker image. With that approach, the Docker image tag must be passed as a parameter to the “wsk action create” command. If a new native language needs to be added, the CLI and a few other components have to be recompiled. I’m now trying to improve this process with the help of the OpenWhisk community by using the runtimes.json file to specify the file extensions and other meta information needed for adding a new native language runtime.

References

[1] Serverless functions in your favorite language with OpenWhisk, Rodric Rabbah: https://medium.com/openwhisk/serverless-functions-in-your-favorite-language-with-openwhisk-f7c447558f42

[2] Apache OpenWhisk Documentation: https://github.com/apache/incubator-openwhisk/tree/master/docs

[3] Apache OpenWhisk Website: https://openwhisk.apache.org

[4] Apache OpenWhisk Blog: https://medium.com/openwhisk