The road to WebAssembly GC for OCaml

In the current implementation of the OCaml WebAssembly backend, the memory representations of values closely resemble those of the other more traditional OCaml backends. This works great if you can implement your own garbage collector, but doing so in WebAssembly is rather inefficient as you are basically duplicating the effort of the browser. We would need to implement a shadow stack next to the actual stack which significantly increases the download size. Also the interop with JavaScript will be less than ideal. So ideally we want to do GC in harmony with the browser, and let the browser handle GC for us. To be completely fair though, a ship-your-own-GC can be faster.

However, OCaml is not the only managed language that needs to be supported by WebAssembly. In fact the WebAssembly GC spec identifies three main categories: typed functional languages like OCaml, object oriented languages with nominal subtyping like Java, and untyped languages like Python. Together with the security demands of the web, creating a broad GC solution is definitely no easy task. In other words: it will take a while before WebAssembly GC arrives. This however also provides opportunity to give feedback.

In this article we will first look at how memory management works in OCaml, secondly we will look at the WebAssembly backend again, and finally look at how the WebAssembly backend is most likely going to transition towards WebAssembly GC.

Note that we are not going to discuss the details of WebAssembly GC — only the parts that are relevant for the WebAssembly backend.

OCaml memory management

Values

In OCaml the heap is segmented in blocks, where a block consists out of a header and values. The header describes what the block is supposed to be, how many values it contains, and what the GC status of that block is. Values can be either a float or an int. Values are tagged — they can either be an actual int or a pointer based on the least significant bit (the last bit) of the value.

A simple example of a memory block for a 32-bit architecture:

For the translation to WebAssembly, there are two important things to note here:

  • an int fits in 32-bits and therefore is unboxed. If the int was boxed it would require a pointer to a different memory block with a header as well.
  • a pointer also fits in 32-bits, and when two pointers are equal to each other it’s called reference equality.

Ideally we want to keep these properties when moving to a managed solution.

Allocations

Whenever a piece of memory is allocated it’s attached to either the global GC roots or the local GC roots. Whatever is at the top level of your program, can be considered a global GC root. Local GC roots are the connections that the call stack has to an allocated piece of memory. Global GC roots are visible within CMM, local GC roots are not. The important part here is that the OCaml runtime needs to know which memory is in use, and what can be reclaimed.

Garbage collection

The garbage collector uses the before mentioned GC roots to traverse the memory graph. It does so in two different phases, a mark and a sweep phase. During the mark phase all memory that is still in used is marked via the GC status flag, and during the sweep phase the memory is checked for blocks that can be reused. Note that OCaml’s GC is one of the best GC around, and often is better than a handwritten memory management solution.

Pointers

In CMM, functions can pass pointers to other functions. When the actual data needs to be retrieved, the memory is loaded behind that pointer. Note that pointers can pass several functions before they are actually used. With only pointers at function boundaries, it’s hard to know what the related memory block is supposed to be before actual usage. Also it happens often that not every value of a memory block is used in the same function.

WebAssembly backend

Currently in WebAssembly backend, memory is allocated in the available linear memory and we pretend that we have an infinite amount of it. Ints and floats are translated to int32’s and float32’s respectively. Pointers translate to int32’s as well. Overall this should look very similar to what other backends do.

Transitioning towards WebAssembly GC

Outline

To move beyond the current limitations of the WebAssembly backend, we need to move towards WebAssembly GC. As was mentioned before, WebAssembly GC is still a work in progress though.

A stepping stone towards GC (and several other specs) is expected to arrive next year in Firefox and Chrome: reference types, which will introduce anyref. Anyref can be considered a replacement for pointers. And as with pointers, it’s unclear what’s behind an anyref. Anyref also allows you to import anything from JavaScript into WebAssembly. It doesn’t need to be one of the numeric types.

To get to something more specific however, you will need to be able to downcast an anyref to a different type. This is part of WebAssembly GC and is still being fleshed out, but roughly speaking you will need to provide the types to which an anyref can downcast to. The types to which an anyref will downcast will be structs with i31ref’s and references to other structs. A struct can be considered a replacement for a memory block, and i31ref can be considered a wasm version of a unboxed OCaml int. Do note though that i31ref itself is conceptually a reference and not an int.

Our previous memory block example will be transformed into this struct:

Note that there is no block header anymore, and also the tag bits are gone. Most likely the browser engine will add this block header for you as an implementation detail.

Implementation

A pragmatic first step for moving towards WebAssembly GC, is to first move towards reference types and implement all GC allocation and access operations in JavaScript called by WebAssembly in place of future WebAssembly GC instructions. Reference types are already available in Firefox nightly for experimentation.

This is roughly a set of what at least needs to be done:

  • add the anyref type
  • use the struct features of the WebAssembly GC MVP spec instead of using linear memory
  • move the data blocks also somehow to the managed version
  • remove tagged ints and related bit operations
  • ignore Cblockheader
  • and… actually make it all work.

To see examples of how code will be converted to WebAssembly GC, I created a repository: https://github.com/SanderSpies/ocaml-wasm-gc-experimenting. This is still a work in progress and there is not much in it, but more examples will be added over the next month.

Most likely this is not everything, and all the changes described here might also require changes in the LLD linker used by the WebAssembly backend.

This article was made better with help of Maxim Valcke and Luke Wagner. To be honest I rushed the ending of the article a bit as I want to start focusing on implementing.