Announcing GraalWasm — a WebAssembly engine in GraalVM

Published in graalvm, Dec 2, 2019

We’re happy to announce the initial public work on GraalWasm — the WebAssembly engine implemented in GraalVM. GraalWasm currently implements the WebAssembly MVP (Minimum Viable Product) specification, and can run WebAssembly programs in the binary format, generated with compiler backends such as Emscripten.

Supporting WebAssembly adds a whole new set of languages to those that GraalVM can already execute, and is a further step towards making it a universal platform for programming language execution. This feature was also highly requested by the GraalVM community, and we are happy to share our first results.

Note that this is a very early implementation, and GraalWasm is currently in experimental mode. We are working on extending our test suites and benchmarks, and we plan to improve performance and to implement WebAssembly extensions in the future. Feedback and open-source contributions are very welcome!

This blog post will cover what WebAssembly is, how GraalWasm was implemented in GraalVM, and how to use it.

A considerable part of the work on GraalWasm was the result of the successful internship of our recent intern Ergys Dona. If you are looking for exciting internship projects with direct impact on the tech world, check out the GraalVM internship program.

A brief introduction to WebAssembly

WebAssembly is a portable binary instruction format for executable programs. Its main goal is to enable high-performance applications on web pages, but it can also be embedded in other environments, such as IoT devices. WebAssembly is designed to run on an abstract stack-based virtual machine, can be parsed faster than JavaScript, and its binaries are smaller than the corresponding JavaScript programs thanks to a very compact code representation.

WebAssembly specifies two formats: textual and binary. The textual format is intended to be human-readable — below is an example of a recursive factorial function written in C, and translated to textual WebAssembly:

Recursive factorial in C.

The corresponding WebAssembly looks as follows:

The get_local 0 instruction pushes the first argument n to the expression stack. The i64.eqz instruction pops this argument from the stack, checks if it is zero, and pushes either 1 or 0 back to the stack. The if instruction demonstrates a very interesting aspect of WebAssembly - its bitcode format is semi-structured. An if instruction is followed by a linear block of instructions, after which an else can optionally follow, and an end that marks the termination of the if construct. Completely unstructured jumps are thus not possible, which makes it much easier to understand the WebAssembly code.

Consider what happens in the if branch: an i64.const 1 instruction pushes a constant (the base case of the recursion) to the stack. In the else branch, the argument n is pushed twice to the stack, the topmost value is decremented by one with the i64.const 1 and i64.sub instructions, and call 0 is used to invoke the function with the index 0 (which in this case is the fact function itself). Its return value is then multiplied with the n that was pushed first. The final value that remains on the stack is the return value of the function.
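To make the stack discipline concrete, here is a minimal Java sketch that replays the instruction sequence described above on an explicit operand stack. The class name FactStackDemo is made up for illustration, and this is not how GraalWasm executes code; it only mirrors the order of pushes and pops, with each line annotated with the instruction it stands for.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical illustration: replays the instructions of fact on an explicit operand stack. */
final class FactStackDemo {
    static long fact(long n) {
        Deque<Long> stack = new ArrayDeque<>();
        stack.push(n);                            // get_local 0
        stack.push(stack.pop() == 0 ? 1L : 0L);   // i64.eqz
        if (stack.pop() != 0) {                   // if
            stack.push(1L);                       //   i64.const 1
        } else {                                  // else
            stack.push(n);                        //   get_local 0
            stack.push(n);                        //   get_local 0
            stack.push(1L);                       //   i64.const 1
            long rhs = stack.pop();
            long lhs = stack.pop();
            stack.push(lhs - rhs);                //   i64.sub
            stack.push(fact(stack.pop()));        //   call 0
            long b = stack.pop();
            long a = stack.pop();
            stack.push(a * b);                    //   i64.mul
        }                                         // end
        return stack.pop();                       // the remaining value is the result
    }
}
```

Calling FactStackDemo.fact(5) walks through the else branch five times and the if branch once, leaving a single value on the stack each time the function returns.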

The binary format is a one-to-one mapping of the textual format, and is intended for parsing and compilation. It is designed to be compact, yet efficient to parse and validate. The GraalWasm engine executes binary WebAssembly files, which are intended as the default method of distributing WebAssembly programs.
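As a small illustration of what parsing the binary format involves: every WebAssembly binary starts with the four magic bytes "\0asm", followed by a 4-byte little-endian version number (1 for the MVP), and integers inside the binary are stored in the variable-length LEB128 encoding. The sketch below checks the header and decodes one unsigned LEB128 value. The class and method names are illustrative assumptions and are not taken from GraalWasm.

```java
/** Hypothetical helper: not part of GraalWasm. */
final class WasmHeader {
    /** Returns true if the bytes start with "\0asm" followed by version 1 (little-endian). */
    static boolean hasMvpHeader(byte[] b) {
        return b.length >= 8
            && b[0] == 0x00 && b[1] == 0x61 && b[2] == 0x73 && b[3] == 0x6D
            && b[4] == 0x01 && b[5] == 0x00 && b[6] == 0x00 && b[7] == 0x00;
    }

    /** Decodes an unsigned LEB128 integer (up to 32 bits) that starts at the given offset. */
    static int readUnsignedLeb128(byte[] b, int offset) {
        int result = 0;
        int shift = 0;
        while (true) {
            int next = b[offset++] & 0xFF;
            result |= (next & 0x7F) << shift;   // the low 7 bits carry the payload
            if ((next & 0x80) == 0) {
                return result;                  // a clear high bit marks the last byte
            }
            shift += 7;
        }
    }
}
```

Small values take a single byte in this encoding, which is one of the reasons the binaries stay compact.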

GraalWasm — implementing a WebAssembly engine inside GraalVM

This section lays out some technical details of our implementation. To implement GraalWasm, we used GraalVM as a platform that provides an efficient partial evaluation engine. Using GraalVM’s Truffle API, we first implemented an interpreter for WebAssembly binaries.

WebAssembly’s semi-structured format allowed us to easily recover the control-flow structure of the programs, which in turn allowed our in-memory data structures that store the code to be represented as ASTs. An interpreter for programs represented with ASTs can be written in a very straightforward manner. However, while AST-based data structures are arguably easier to inspect and manipulate, they have the disadvantage of introducing additional memory overhead. A bitcode-based code representation, on the other hand, does not require instantiating a tree node for each basic instruction. This is why bitcode-based GraalVM interpreters, such as Sulong for LLVM, typically have a smaller memory footprint.

Since each WebAssembly block contains just a linear sequence of instructions, GraalWasm was able to combine the best of both interpreter approaches: an AST is superimposed on top of WebAssembly’s control-flow instructions, such as if and loop, but each block is represented with a single Truffle AST node, called a Wasm block node. This reduces the memory footprint, because individual instructions within each block do not require separate node objects. Furthermore, GraalWasm block nodes do not copy parts of the original instruction stream. Instead, they just contain pointers into the byte array of the WebAssembly binary.

Correspondence between textual WebAssembly, binary WebAssembly and a GraalWasm AST

The interpreter implemented on top of this data structure is a hybrid between AST-based interpretation and bitcode-based interpretation. On the higher, control-flow level, it dispatches between the appropriate basic blocks. Within each basic block, interpretation is done inside an interpreter loop that iterates over the opcodes of that basic block. This design made the interpreter easier to comprehend, and simplified its partial evaluation.
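The hybrid design can be sketched in plain Java as follows. This is a simplified illustration under several assumptions, not GraalWasm's actual code: the class names are invented, the Truffle API is left out entirely, only three opcodes are handled, immediates are limited to a single byte, and the tree is built by hand instead of being derived by a parser. What it does show is the split described above: control flow lives in tree nodes, while each block node only stores offsets into the shared byte array and runs a small opcode loop over that range.

```java
import java.util.ArrayDeque;
import java.util.Deque;

/** Hypothetical sketch of a hybrid AST/bytecode interpreter; not GraalWasm's actual classes. */
final class HybridInterpreterSketch {
    // Opcode values from the WebAssembly MVP binary format.
    static final int I32_CONST = 0x41, I32_ADD = 0x6A, I32_MUL = 0x6C;

    interface Node {
        void execute(byte[] code, Deque<Integer> stack);
    }

    /** One node per linear block: it stores only offsets into the shared byte array. */
    static final class BlockNode implements Node {
        final int start, end;

        BlockNode(int start, int end) { this.start = start; this.end = end; }

        @Override
        public void execute(byte[] code, Deque<Integer> stack) {
            int pc = start;
            while (pc < end) {                   // bytecode-style loop within the block
                int opcode = code[pc++] & 0xFF;
                switch (opcode) {
                    case I32_CONST: stack.push((int) code[pc++]); break; // one-byte immediate only
                    case I32_ADD:   stack.push(stack.pop() + stack.pop()); break;
                    case I32_MUL:   stack.push(stack.pop() * stack.pop()); break;
                    default: throw new IllegalStateException("unsupported opcode " + opcode);
                }
            }
        }
    }

    /** AST-style control flow: pops the condition and runs one of two child nodes. */
    static final class IfNode implements Node {
        final Node thenNode, elseNode;

        IfNode(Node thenNode, Node elseNode) { this.thenNode = thenNode; this.elseNode = elseNode; }

        @Override
        public void execute(byte[] code, Deque<Integer> stack) {
            Node target = stack.pop() != 0 ? thenNode : elseNode;
            target.execute(code, stack);
        }
    }

    /** Runs its child nodes in order. */
    static final class SequenceNode implements Node {
        final Node[] children;

        SequenceNode(Node... children) { this.children = children; }

        @Override
        public void execute(byte[] code, Deque<Integer> stack) {
            for (Node child : children) child.execute(code, stack);
        }
    }

    static int demo() {
        // i32.const 1; if (result i32); i32.const 6; i32.const 7; i32.mul; else; i32.const 0; end
        byte[] code = {0x41, 0x01, 0x04, 0x7F, 0x41, 0x06, 0x41, 0x07, 0x6C, 0x05, 0x41, 0x00, 0x0B};
        // The tree is built by hand here; a real engine would derive it while parsing.
        Node root = new SequenceNode(
            new BlockNode(0, 2),
            new IfNode(new BlockNode(4, 9), new BlockNode(10, 12)));
        Deque<Integer> stack = new ArrayDeque<>();
        root.execute(code, stack);
        return stack.pop();
    }
}
```

Note that the if, else and end bytes in the array are never interpreted by a block node; they are represented by the IfNode, and the three block nodes cover only the linear instruction ranges between them.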

At runtime the interpreter and the program are passed to Truffle’s partial evaluation engine, which then specializes the interpreter to the program, and passes the specialized code to the GraalVM compiler, which finally produces efficient assembly code for the target platform.

Installing and running GraalWasm from GraalVM

Edit: since the GraalVM 20.0 release, GraalWasm can be installed like a normal GraalVM component with the gu tool: https://www.graalvm.org/docs/reference-manual/languages/wasm/.

GraalWasm can be installed with GraalVM 19.3.0 using GraalVM's gu tool. Since the current development version of GraalWasm is 20.0.0-dev, by default the gu tool will not allow you to install GraalWasm with GraalVM 19.3.0. To overcome this, we can use the --force flag to override the version check during the installation of a GraalVM component.

The first step is to download the latest development version of GraalWasm from the list of GraalVM’s development releases. Select the JAR that corresponds to the JDK version of your GraalVM download and to your platform (currently, Linux and OSX are pre-built). For example, if you use the JDK 8 version of GraalVM on Linux, then you should download wasm-installable-java8-linux-<nightly-timestamp>.jar, where <nightly-timestamp> is a unique identifier of the nightly build.

Then, run the following command-line to install GraalWasm:

$ graalvm-ce-java8-19.3.0/bin/gu install --force -L wasm-installable-java8-linux-<nightly-timestamp>.jar

The gu tool will install the new GraalVM component, at which point you can invoke the wasm launcher as follows:

$ graalvm-ce-java8-19.3.0/bin/wasm

The launcher will complain that it needs a WebAssembly module to run:

ERROR: Must specify the binary name.

You can find several WebAssembly modules precompiled from C using Emscripten in this repository. For example, we can download the WebAssembly program that prints Floyd’s triangle here, and run it as follows:

$ graalvm/bin/wasm --PredefinedModules=env:emscripten floyd.wasm

Note that we added the --PredefinedModules=env:emscripten flag, which instructs GraalWasm to link this binary against a predefined emscripten module named env, which contains certain system functions that the Emscripten toolchain normally embeds into the JavaScript file that it generates together with a WebAssembly module. In the future, we will support the WebAssembly System Interface (WASI) as a predefined module, which will become a standard way to run WebAssembly programs outside the browser and will be supported by most WebAssembly toolchains.

Building and testing GraalWasm

You can find the GraalWasm implementation in the GraalVM repo at GitHub. If you would like to build GraalWasm to play with it or contribute back, please follow these steps:

Download the mx build tool from its GitHub repo, which is used to build all GraalVM projects.
Clone GraalVM from GitHub.
Make sure that you have the latest JVMCI-enabled JDK.
Set the JAVA_HOME environment variable to point to your JVMCI-enabled JDK.
In the wasm directory of the GraalVM repo, run the following command line:

$ mx --dy /truffle,/compiler build

This will invoke the mx build tool and build the wasm.jar file in the mxbuild/dists/jdk<version> directory. To run the WebAssembly tests from the GraalWasm suite, you will also need to download the WebAssembly binary toolkit (WABT), which the test suite uses to translate textual WebAssembly files to binaries.

Download the WebAssembly binary toolkit release.
Set the WABT_DIR variable to the path to the root folder of the WebAssembly binary toolkit.

The WebAssembly tests are organized into several different suites. The following command runs all the tests from all the suites:

mx --dy /truffle,/compiler --jdk jvmci unittest \
  -Dwasmtest.watToWasmExecutable=$WABT_DIR \
  -Dwasmtest.testFilter="^.*\$" \
  WasmTestSuite

It is also possible to select specific tests by matching their names with the -Dwasmtest.testFilter regex flag. The following command will run all the tests that contain if in their name:

mx --dy /truffle,/compiler --jdk jvmci unittest \
  -Dwasmtest.watToWasmExecutable=$WABT_DIR \
  -Dwasmtest.testFilter="^.*if.*\$" \
  WasmTestSuite

You should see the following output:

-------------------------------------------------------------------
Running: BranchBlockSuite (4/16 tests - you have enabled filters)
-------------------------------------------------------------------
Using runtime: org.graalvm.compiler.truffle.runtime.hotspot.java.HotSpotTruffleRuntime@7b1d7fff
😍😍😍😍                                
Finished running: BranchBlockSuite
🍀 4/4 Wasm tests passed.

-------------------------------------------------------------------
Running: IfThenElseSuite (4 tests)
-------------------------------------------------------------------
Using runtime: org.graalvm.compiler.truffle.runtime.hotspot.java.HotSpotTruffleRuntime@7b1d7fff
😍😍😍😍                                
Finished running: IfThenElseSuite
🍀 4/4 Wasm tests passed
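The value of -Dwasmtest.testFilter is a regular expression over test names. In the commands above, the backslash in \$ only stops the shell from treating $ as the start of a variable inside double quotes, so the pattern that reaches the JVM is ^.*if.*$. The following sketch shows which names such a pattern would select, assuming that the whole name must match and that java.util.regex is used. The test names in the usage note are hypothetical, and the test suite's own matching code is not shown in this post.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

/** Hypothetical illustration of regex-based test selection; not the GraalWasm test harness. */
final class TestFilterDemo {
    /** Returns the names that match the regex in full, preserving their order. */
    static List<String> select(String regex, List<String> testNames) {
        Pattern pattern = Pattern.compile(regex);
        List<String> selected = new ArrayList<>();
        for (String name : testNames) {
            if (pattern.matcher(name).matches()) {
                selected.add(name);
            }
        }
        return selected;
    }
}
```

With the made-up names if_else, nested_if and loop_sum, the pattern ^.*if.*$ keeps the first two, while ^.*$ keeps all three.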

In addition to raw WebAssembly tests, GraalWasm comes with a suite of C-based tests, which is not a part of the default build because it requires extra dependencies to build. To run it, you need to install the Emscripten SDK on your system (we currently use Emscripten 1.38.45). You should follow the installation steps outlined in the Emscripten documentation. Once you install the Emscripten SDK, you should set your EMCC_DIR environment variable to point to the fastcomp/emscripten/ subdirectory of the SDK.

This will allow building the non-default targets with the following command line:

mx --dy /truffle,/compiler build --all

You can then run the additional C test cases as follows:

mx --dy /truffle,/compiler --jdk jvmci unittest \
  -Dwasmtest.watToWasmExecutable=$WABT_DIR \
  -Dwasmtest.testFilter="^.*\$" \
  CSuite

Embedding GraalWasm into a Java program

GraalWasm, like all other language implementations in GraalVM, can also be accessed using GraalVM’s Polyglot API, which allows embedding the GraalWasm engine into custom Java programs. Below, we show a minimal example of how to run a WebAssembly program from a Java application using GraalWasm.

Let’s assume that you have a simple C program that just returns 42 in a file main.c, and that you translated this program to a WebAssembly binary called main.wasm. You could generate such a binary using an existing compiler that translates languages such as C and Rust to WebAssembly. Alternatively, you could translate the C program using the online WebAssembly Studio:

The WebAssembly studio translates this C program into a WebAssembly binary, whose textual version looks as follows:

The WebAssembly binary file main.wasm (which can be downloaded from WebAssembly Studio, or generated using a toolchain such as Emscripten) can then be run by GraalWasm.

In the following example, we will create a new unit test in the GraalWasm test suite, called WasmExampleTest. We first create a new file wasm/src/org.graalvm.wasm.test/src/org/graalvm/wasm/test/WasmExampleTest.java. In this file, we read our WebAssembly binary into a byte array, and create a Polyglot Context object for the WebAssembly language. Note that the identifier wasm is used to refer to the WebAssembly programs in GraalVM. We then create a Source.Builder for our main.wasm binary, and we invoke build to create a Source object, which can then be parsed.

To parse and validate the Source object corresponding to our WebAssembly program, we call the Context's eval method:

context.eval(source);

If the WebAssembly binary complies with the specification, then eval will complete without throwing exceptions. It will also insert the functions of the respective WebAssembly binary into the Polyglot context that was used during parsing. These functions can be looked up using their respective names from the binary.

The next step is then to obtain a handle to our main function, and execute it, which is done as follows:

Upon completion, the result object will hold a Java Integer with the value 42.
You can run this test with the following command line:

mx --dy /compiler,/truffle --jdk jvmci unittest WasmExampleTest

Future plans

The source code of the GraalWasm implementation is currently on GitHub within the main GraalVM repository and we plan to improve it in the upcoming 20.x releases.

One of the motivations behind GraalWasm is to extend the set of APIs supported by GraalVM’s Node.js implementation. The addition of WebAssembly support will allow it to implement the V8-compliant API functionality that loads WebAssembly binaries.

An immediate next step will be the implementation of the WebAssembly System Interface (WASI), which is necessary to run WebAssembly programs outside of the web context. WASI is a set of APIs that abstracts access to various operating system features, such as the file API, network sockets, and clocks. We plan to support WASI as part of GraalWasm.

We will focus on improving performance. Our initial experiments and performance tuning on several C microbenchmarks showed that GraalWasm currently achieves a peak performance of roughly 0.5x to 0.75x when compared to a native GCC binary, compiled with the highest optimization level. These initial results are good, but there is much more to be done — aside from bringing GraalWasm closer to GCC’s peak performance, the next step will be to do performance tuning on larger, more serious benchmarks.

Yet another future step is improving the debugging support in GraalWasm, and integrating it with the rest of GraalVM. In particular, we will work on extracting the symbol and source map information that some compilers embed into WebAssembly binaries, with the goal of allowing our GraalVM tooling to map the code locations and the raw memory layout to the constructs from the original source code.

We will keep posting updates about WebAssembly on GraalVM, so stay tuned. If you have feedback or feature requests, please create an issue in the GitHub repository or talk to us on Twitter: @graalvm.