WebAssembly workout — back to the future part 1: compilation and memory
I wanted to write this article as I’ve seen a number of posts talking about WebAssembly but few that detail more ‘real-world’ use-cases. So if you were looking for code and a discussion of something beyond a C function for adding and returning a number, this is for you.
Repo and demo for these articles: wasm-render and demos.alanmacleod.eu
Javascript 2017
I’ve fully embraced the deliciously absurd macrocosm of modern Javascript development: its dynamic, weakly typed nature, its pain in the arse transpilation pipeline, and any given framework’s maddening obsolescence before graduating to v0.0.2. But there is also amazing new kit like AR.js, and interesting new vendor APIs popping up regularly like Shape Detection, Web Share etc, in addition to the already well established big guns: WebRTC, WebAudio, WebGL etc.
It’s an exciting time to work in this field. Make no mistake friends, this is a (second? third?) Golden Age of the web.
But the one I’ve really been waiting for is WebAssembly. Because despite all the cool stuff above, I miss coding in the best language ever; C.
asm.js
Quick recap; the WebAssembly idea started with asm.js (in, omg, 2013) which enabled compilation of C/C++ source into a small subset of Javascript which could then be fully compiled and optimised by the browser before executing, as opposed to the standard, but slower, JIT compilation. The aim of asm.js and WebAssembly is to achieve near-native execution speed, as fast as you get with a real C/C++ sourced binary.
Mozilla was at the forefront of this new tech and they partnered up with Epic to produce an incredible demo in allegedly just four days. It demonstrated two pretty disruptive things to me:
- The browser really could do anything native desktop could
- The rich, varied and untapped reservoir of skills and resources now had a direct route to the web
Theoretically you could compile, say, the Adobe Photoshop source code with Emscripten and run it in Chrome. Additionally, engineers who specialise in C/C++ suddenly had a whole new potential area to move into.
But more importantly to moi: “yay C”.
Moving things forward
WebAssembly (“Wasm”) grew out of asm.js. I guess they had a few different goals and reasons, but for me asm.js, while awesome, felt a bit hacky. Wasm is pretty much the same concept, but with a better file format, vendor buy-in, a solid roadmap and scope to add new features (such as DOM and system API access!).
Wasm reached “cross-browser consensus” in March 2017 and I felt it was time to dive in and rev the engine (bro tip: always pull the throttle cable manually on the carburetor when people are watching). You can only do the basics with it right now, like call some simple C functions. Fortunately, right at the start I stumbled across a generous helping hand to get going: wasm-init is a package that does all the hard work setting up a Wasm project (trust, it’s a pain to do yourself) and makes it easy to compile your C/C++ projects. So have a look at the repo if you’re interested.
Compilation
When you compile a C/C++ source with Emscripten, you get two things;
- a .wasm binary file
- a bootstrap file for the binary (in Javascript)
The Wasm docs are a bit confusing, but I think the eventual goal is to just have the single wasm binary. But for now at least, you absolutely need the .js bootstrap file Emscripten spits out after compilation. This part’s a bit messy but wasm-init provides a template demonstration on how to do it. Essentially you load the Wasm binary manually (via fetch() or whatever), attach the data as an ArrayBuffer to the global scope (eugh) and then execute the bootstrap script provided by Emscripten. Eventually you get a WasmInstance returned via a Promise. The WasmInstance object has the exported symbols from your Wasm/C modules attached, so a function defined in the C/C++ module can, after compiling and loading, be called from Javascript as a method on that object.
Note the exported symbol has an underscore prefix. I’m not sure if that’s down to wasm-init or it’s a universal side-effect from Emscripten.
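To make the "exported symbol on the instance" idea concrete, here is a minimal sketch using only the standard WebAssembly Javascript API. It is not the wasm-init or Emscripten flow: the bytes are a hand-assembled stand-in for a compiled C function (an assumption for illustration), and with the plain API the exports live on instance.exports rather than directly on a WasmInstance wrapper. The name _add is hypothetical.

```javascript
// Hand-assembled stand-in for a compiled C module (an assumption for this
// sketch; a real build would come from Emscripten). It is equivalent to:
//   int add(int a, int b) { return a + b; }
// exported under the name "_add".
const wasmBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // "\0asm" magic, version 1
  0x01, 0x07, 0x01, 0x60, 0x02, 0x7f, 0x7f, 0x01, 0x7f, // type: (i32, i32) -> i32
  0x03, 0x02, 0x01, 0x00, // function 0 uses type 0
  0x07, 0x08, 0x01, 0x04, 0x5f, 0x61, 0x64, 0x64, 0x00, 0x00, // export "_add"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x20, 0x01, 0x6a, 0x0b, // body: a + b
]);

// Synchronous compile + instantiate keeps the sketch short; in a browser you
// would normally use the Promise-based WebAssembly.instantiate() instead.
const wasmModule = new WebAssembly.Module(wasmBytes);
const wasmInstance = new WebAssembly.Instance(wasmModule, {});

// Exported symbols hang off instance.exports, underscore and all.
const sum = wasmInstance.exports._add(2, 3);
console.log(sum); // 5
```

From the calling side it looks like any other Javascript function, which is the whole appeal.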
Memory
Next thing I needed to nail down is how to get my C routines to read and write data accessible by my Javascript code. You can’t just pass your C routines a pointer ref to your Javascript arrays. Again, it’s still early days in Wasm for a few things; memory in particular:
Every WebAssembly instance has one specially-designated default linear memory which is the linear memory accessed by all the memory operators below. In the MVP, there are only default linear memories but new memory operators may be added after the MVP which can also access non-default memories.
What they’re saying here is you can’t allocate any useful memory in the Wasm MVP. Sort of. I discovered the Emscripten bootstrap file does a number of things and one of them is the setup of the above mentioned “specially-designated default linear memory”, which it turns out is 16 MB of heap space for the C/C++ modules to use. Critically, this is also accessible Javascript-side via an ArrayBuffer property in the WasmInstance called buffer. You shouldn’t just write to this arbitrarily though, it’s the C/C++ code’s heap! Instead, after some more exploration and numerous console.log() dumps, I discovered there was a good old fashioned malloc function, a memory manager attached to the Wasm instance, which allocates memory inside this heap for you. It returns an offset into the buffer, from which you can then construct a Javascript TypedArray ‘view’.
You can read/write to that TypedArray (js_buffer, say) as you would normally. Now, the cool part is that the offset malloc returned (heap_pointer, say) is just a number. Specifically, a byte offset into the Wasm heap. So, when calling your C/C++ module’s code you can pass heap_pointer as a parameter to the C/C++ function like a normal pointer.
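The steps above can be sketched with the standard WebAssembly API. Everything here is a stand-in and should be read as assumption rather than the real Emscripten setup: the malloc below is a hypothetical bump allocator written in Javascript (the real one lives on the Wasm instance), and the "C function" is a hand-assembled module equivalent to int peek(int *p) { return *p; }, exported as _peek.

```javascript
// 256 pages x 64 KiB = 16 MiB, matching the heap size mentioned above.
const memory = new WebAssembly.Memory({ initial: 256 });

// Hypothetical stand-in for the instance's malloc: a bump allocator that
// hands out 8-byte-aligned byte offsets into the heap and never frees.
let heapTop = 1024; // arbitrary start so that offset 0 is never handed out
function malloc(numBytes) {
  const ptr = heapTop;
  heapTop += (numBytes + 7) & ~7;
  if (heapTop > memory.buffer.byteLength) throw new Error("out of heap");
  return ptr;
}

// Hand-assembled stand-in for: int peek(int *p) { return *p; }
// It imports the heap as env.memory and exports the function as "_peek".
const peekBytes = new Uint8Array([
  0x00, 0x61, 0x73, 0x6d, 0x01, 0x00, 0x00, 0x00, // magic, version 1
  0x01, 0x06, 0x01, 0x60, 0x01, 0x7f, 0x01, 0x7f, // type: (i32) -> i32
  0x02, 0x0f, 0x01, 0x03, 0x65, 0x6e, 0x76, // import from "env"...
  0x06, 0x6d, 0x65, 0x6d, 0x6f, 0x72, 0x79, 0x02, 0x00, 0x01, // ..."memory", min 1 page
  0x03, 0x02, 0x01, 0x00, // function 0 uses type 0
  0x07, 0x09, 0x01, 0x05, 0x5f, 0x70, 0x65, 0x65, 0x6b, 0x00, 0x00, // export "_peek"
  0x0a, 0x09, 0x01, 0x07, 0x00, 0x20, 0x00, 0x28, 0x02, 0x00, 0x0b, // i32.load at p
]);
const peekInstance = new WebAssembly.Instance(
  new WebAssembly.Module(peekBytes),
  { env: { memory } }
);

// Allocate room for 16 ints and build a TypedArray view over that region.
const heap_pointer = malloc(16 * 4);
const js_buffer = new Int32Array(memory.buffer, heap_pointer, 16);

// Write from Javascript, then pass the plain number to Wasm as a pointer.
js_buffer[0] = 42;
const seenByWasm = peekInstance.exports._peek(heap_pointer);
console.log(heap_pointer, seenByWasm); // 1024 42
```

One thing to watch with views like js_buffer: if the Wasm memory ever grows, the old ArrayBuffer is detached and any views over it need to be recreated.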
And thus we have real pointers and shared dynamic memory between JS and C. So with that mini triumph, I wrote a SharedMemory class to handle all of this. You might be questioning if there’s a performance penalty accessing this special heap buffer compared to a regular JS TypedArray. And the answer is “not really”.
I benchmarked the two array types to check and they’re pretty much identical. Bizarrely, the Wasm view consistently measured a touch faster, at least on my recent version of Chrome.
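For anyone wanting to repeat that comparison, a rough sketch of such a micro-benchmark might look like the following. This is not the benchmark used for the numbers above: the array size, the fill-and-sum workload and the round count are all assumptions, and timings will vary by browser and machine.

```javascript
const COUNT = 1 << 20; // about a million 32-bit ints (4 MiB)

// A regular TypedArray with its own backing store...
const plain = new Int32Array(COUNT);

// ...versus a view over a 16 MiB Wasm linear memory (256 pages x 64 KiB).
const benchMemory = new WebAssembly.Memory({ initial: 256 });
const wasmView = new Int32Array(benchMemory.buffer, 1024, COUNT);

// The workload: write every element, read it back, accumulate a total.
function fillAndSum(arr) {
  let total = 0;
  for (let i = 0; i < arr.length; i++) {
    arr[i] = i & 0xff;
    total += arr[i];
  }
  return total;
}

// Time several rounds after one warm-up pass so the JIT has settled.
function timeIt(label, arr, rounds) {
  fillAndSum(arr);
  const start = performance.now();
  let total = 0;
  for (let r = 0; r < rounds; r++) total = fillAndSum(arr);
  const ms = performance.now() - start;
  console.log(label + ": " + ms.toFixed(2) + " ms");
  return total;
}

const plainTotal = timeIt("plain Int32Array", plain, 20);
const wasmTotal = timeIt("view over Wasm memory", wasmView, 20);
```

Returning the totals keeps the loops from being optimised away and gives a quick sanity check that both arrays did the same work.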
Getting memory “working” was critical to my test project. Without this, there wasn’t really much I could do other than pass a few params back and forth between JS and C.
Next
Clearly drunk in the smug old-school haze of C/C++, I decided a good test of Wasm performance would obviously be something a bit old-school that was simple but demanding of modern CPU hardware and kinda visually interesting. Thus, I decided to write two 3D software rasterisers; one in Javascript and one in C and see how each stacked up against the other.
On to part II…