At Fluence, we use WebAssembly (Wasm) to run applications in a trustless network of independent nodes. This trustless environment mandates that every piece of the code executed by any network node must be verified by another node to check whether the execution was performed correctly. In order to be verifiable, every computation must be made deterministic, which means that to run WebAssembly code in a trustless network, we need to make its execution deterministic.
There are a great many different tools that will allow a developer to compile C/C++, Rust, or AssemblyScript code to WebAssembly. But it is not enough to give developers just the standard library — it is also important to give them the ability to interact with services and libraries they already know.
In this post, we share our experience of porting an existing open-source software package — the data structure server Redis — to WebAssembly. While this is not the first time that Redis has been ported to Wasm (see this port by Sergey Rublev), it is the first time to our knowledge that the obtained port can be run deterministically.
There are three primary sources of nondeterminism in WebAssembly: external functions invocation, NaN payloads, and VM resource exhaustion. The first one is the most problematic; currently, we deal with it by simply prohibiting any interactions with the host environment from Wasm programs. This means that submitted WebAssembly code should not contain any external imports.
Other nondeterminism sources are less complicated: while nondeterministic NaN payloads require special handling of floating point operations by the host VM and nondeterministic resource exhaustion requires pre-allocating enough memory and stack, these peculiarities are related only to execution of the WebAssembly code and do not greatly affect how the software is ported to Wasm.
If you’d like to learn more, we have discussed those difficulties (and few others as well!) in our Fast, deterministic, and verifiable computations with WebAssembly talk (slides).
Another constraint of our computation engine is that it works only in the request-response model: Wasm code can only be invoked in response to a request submitted by the user. This means that the ported application should not perform any background work such as running cron-like operations or background threads.
The good news is that Redis is mostly based on an event-driven model with a single event processing thread, which fits quite well into the request-response architecture. While Redis does employ background threads and external I/O operations, the clean separation of these functions in the Redis source code has allowed us to strip them off without too much hassle.
Translation to WebAssembly
Another option is to use clang and a special version of the libc library. The main property this library should have is to be able to use some POSIX-like syscalls without the need to import their implementation from the host environment. First of all, it needs to support the dynamic memory management capability that the majority of C/C++ programs use. This capability relies on just a few syscalls: brk-family and mmap(). For programs that don’t use other OS specific functions related to filesystem, interprocess communication and other OS-specific subsystems, it allows to obtain pure Wasm code without any external dependencies. And, surprisingly, there exists a musl fork maintained by Jf Bastien which can be used for compilation using wasm-unknown-unknown target.
While this method was quite reasonable for our needs, we found that WASI, which provides the most suitable libc version for our requirements, was even better.
Let’s quickly review the very high-level decomposition of Redis.
Redis consists of several major parts:
- the core, which orchestrates requests, manages records, and computes usage statistics
- data structures, which can be used by users to store their data in different ways
- the networking subsystem, which is responsible for connectivity
- the cluster subsystem, which is used to group individual Redis instances into the Redis Cluster
- the module subsystem, which allows the creation of Redis plugins
- the multi subsystem, which manages transactions
- the scripting subsystem, which lets users write Lua scripts
- the events subsystem, which manages event queues
- the persistence subsystem, which takes care of synchronizing the Redis in-memory state to disk
Not every subsystem can be directly ported to our restricted environment. For example, the cluster subsystem does not fit well into the request-response framework, and the persistence and networking subsystems need to make external calls to function.
Consequently, we have decided to strip Redis of its not-so-essential functionalities and retain only the core (including data structures, of course!), scripting, and rudimentary networking subsystems.
As we have already mentioned, Redis uses an event-driven model to process client requests. The main Redis thread handles tasks such as
- receiving client requests and sending back responses
- performing lazy (e.g., table rehashing) and asynchronous operations
- executing blocking operations (e.g., blpop or bzpopmax)
- updating statistic counters
There are two types of events used: time events for cron-like jobs and timers in the module subsystem, and file events to work with sockets and all the other stuff. This event architecture allows all the core components to communicate through the main thread.
Redis has several additional threads that are primarily responsible for lazy operations such as object deletion, file closure, and disk synchronization. At the moment, threading support in WebAssembly is still just a proposal, and very few WebAssembly VMs support Wasm threads at all. But fortunately, these extra threads can be easily removed since Redis has several boolean flags to choose between synchronous or asynchronous eviction policies.
The diagram below illustrates how all the mentioned parts are connected to each other. Any Redis module can produce file or time events that will then be processed by the Redis core loop, which, in its turn, can then invoke another Redis module as the reaction to the event.
The Redis core also contains a large number of syscalls. Some of them manage Redis subsystems, some are used to obtain the current system time or generate random numbers, and others are used for event loop support. The good news, however, is that the Redis core could work without any of these calls as an in-memory storage engine if we do a bit of patching.
As an example, Redis uses current time information for multiple things, from removing expired keys to running cron-like jobs. While we wanted to keep this functionality, our approach demands that the code be fully deterministic, so we have chosen to increment “system” time after each client request and after each call of the time() or gettimeofday() functions.
As another example, to generate random numbers, Redis uses the rand() function from the C runtime initialized by the random number obtained from /dev/urandom. For now, we just manually set this initialization constant, but this is not really secure in the trustless environment because an intruder might be able to predict the next “random” number and pick such records that will cause Redis performance degradation. A better trustless setup would use an unpredictable PRNG such as Randao.
One more issue arose when we were porting the Lua interpreter, which is written in C and uses exceptions handling ultimately based on setjmp/longjmp C functions. These functions provide the capability to arbitrarily transfer control flow in the code, but because WebAssembly has only the structured control flow, setjmp/longjmp functions could not be compiled to it. Also, WebAssembly currently does not have explicit exception handling (it is just a proposal now), so we had a tough choice when porting the Lua interpreter to Wasm. For now, we chose the simplest way to “solve” this problem and substituted exception processing with a __builtin_unreachable() call.
Finally, we had to strip most of the asynchronous work from the Redis core and replace its event-driven model with a request-response that does not have queues or background work. In the diagram below, you can see that user requests are now processed directly by the Redis core, and cron jobs and other asynchronous operations are now called synchronously.
We have also prepared a very rough benchmark to compare performance between the vanilla Redis version and its WebAssembly port. To test the original Redis, we used the redis-benchmark, which interacts with the Redis instance through a socket API; to test the WebAssembly port, we have embedded the benchmark code into Redis itself, compiled it into WebAssembly, and launched the obtained Wasm code with Asmble.
Using the default benchmark setup, we got the following results on the m4.large AWS instance:
- vanilla Redis, client-server benchmark: 52.2s
- WebAssembly port run by Asmble, embedded benchmark: 160.1s
Of course, comparing the WebAssembly port to vanilla Redis this way is like comparing apples and oranges, but it should at least demonstrate the approximate slowdown of the port. We hope to construct a more accurate benchmark in the future!
If you would like to port a large project written in C/C++ to WebAssembly, this is the section for you!
There are some Wasm execution environments such as Wasmer or WAVM that allow running WASI-compatible code with imported syscalls. Also, they have an ability to trace WASI syscalls and could be used as a syscall tracers for debugging. If your task is porting something to Wasm without imports, you can use these environments to test semi-finished code to check that everything is good on the current step of porting.
Sometimes, code that contains errors such as null pointer dereferences might be successfully compiled to Wasm and executed by certain virtual machines without throwing an exception. This behavior is possible due to the special WebAssembly linear memory layout: thanks to wasm-ld, the data and the stack are both placed in the linear memory. This means that it is a good idea to compile the code to the native binary first for testing purposes — in this way, all the debug capabilities (such as sanitizers or red zones) of clang or gcc can be used.
Another potential danger lies in the fact that the compilation determinism might be broken by compiling into WebAssembly some C/C++ code that contains undefined behavior. In this case, the compilation output might depend on different subtleties such as compiler version, compilation options, and so on. However, once compiled, the obtained WebAssembly code can be run deterministically by the virtual machine.
Generally, it is a good idea to investigate and attempt to get rid of compiler or linker warnings. But this is especially important when compiling code to WebAssembly! Let’s review one of these seemingly innocuous warnings:
The reason for this error is that we failed to include stdlib.h. For some reason, the compiler used the signature of this function that returns an integer value whereas the original function does not return anything. As a result, we got an unnecessary drop instruction popping the stack head after each call of our function in the resultant WebAssembly code, which makes this code not conform to the WebAssembly specifications.
After investigating this issue, we found that the wasm-ld linker was responsible for such behavior, so hopefully it will get fixed soon :)
If you would like to learn more about the Redis architecture, check out these great articles: