Writing a Game Boy Emulator in WASM, Part 1

Hey, did you see that Game Boy emulator I wrote in WebAssembly and WebGL2? That project was the result of my desire to build a non-trivial program with WebAssembly, and get an early understanding of the challenges that pop up when integrating with existing web tools. Now that the emulator has grown to the point where I can actually play games, I figure it’s time to share a bit about what I learned.

This is the start of what I’m expecting will be four or five blog posts on the development process (including some WebVR stuff). While we’ll be discussing concepts in the context of an emulator, I’m hoping to be as general as possible when talking about things like interfacing between WASM and JS. Eventually I’ll cover how I implemented specific pieces like the Game Boy’s CPU, graphics unit, and audio by gluing together raw WASM code and modern web APIs.

In this first piece, though, let’s talk about why I thought this was a good application of WebAssembly, different ways to share data between WebAssembly and JS, and places where the WebAssembly development process still shows its immaturity.

Kicking things off, I want to make some clarifications about WebAssembly:

WebAssembly isn’t magic unicorn glitter that makes everything fast.

You can’t just sprinkle WebAssembly over your project to automatically make it faster. There are some applications where WebAssembly will be faster than JS might have been; there are also places where it might perform far worse.

One place where WebAssembly excels today is implementing routines that don’t allocate new memory. Consider an app that has a processing loop that runs once per frame, where you want to reduce the occurrence of garbage collection. If you want to do this in JS, you have to be clever about preallocating the right types of objects. Due to the nature of WebAssembly, you get this side-effect without thinking about it. This need comes up frequently in my WebVR work, which is why I’m excited about the potential for these new tools.
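
To make the JS side of that comparison concrete, here is a minimal sketch of the preallocation trick. The names (`scratch`, `scaleInto`, `processFrame`) and the sample-scaling task are made up for illustration; they are not from the emulator.

```javascript
// Hypothetical per-frame routine: scale a list of samples without allocating.
const MAX_SAMPLES = 1024;
// Allocated once at startup, then reused on every frame.
const scratch = new Float32Array(MAX_SAMPLES);

function scaleInto(out, input, gain) {
  const n = Math.min(input.length, out.length);
  for (let i = 0; i < n; i++) {
    out[i] = input[i] * gain;
  }
  return n; // number of valid entries written to `out`
}

function processFrame(input, gain) {
  // No `new`, no array literals, no closures created here,
  // so this call leaves nothing behind for the garbage collector.
  return scaleInto(scratch, input, gain);
}
```

The discipline has to be maintained by hand in JS: one stray `map()` or object literal inside the loop brings the garbage back.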

WebAssembly isn’t a harbinger of the “closed web.”

I sometimes see people worried that compiled source code will hurt the open nature of the web. But to be frank, I can’t remember the last time I clicked “View Source” on a web page and was presented with something readable. We’ve been uglifying the web for a long time, and it’s not born out of anything sinister; it’s just the natural result of optimizing for performance.

If anything, WebAssembly opens up the web to more use cases. It’s a true compilation target for the web, allowing developers who don’t want to learn JS to take advantage of the web’s distribution power. Here’s a comment I made a few months ago at work during a discussion on the subject:

The best magic about wasm is that it takes what makes the web great (being able to instantly download an “application” on demand) and improves it (reducing parse time, opening up the channels to more developers). A web-style on-demand experience is always going to beat something you have to intentionally download and keep around. It’s Netflix Streaming vs Blockbuster.

As a side note, to maintain readability I wrote the JS parts of my emulator to not use any sort of compilation. Sure, it’s modern ES6, but it’s designed to run as-is. No transform step, no bundling, just a bunch of boring script tags containing IIFEs, like the good ol’ days.

Okay, with that out of the way, let’s talk about emulators!

Why I chose WebAssembly to build a Game Boy

An emulator is often simulating physical hardware and electronics in software, and in the case of the Game Boy most of the work involves dealing with 8-bit buses around (a variant of) the Z80 CPU. That means manipulating a lot of individual bytes, especially while navigating through huge banks of ROM and RAM. The Game Boy is a pretty simple architecture: getting button input requires reading specific memory addresses, and writing pixels to the screen involves pushing bytes to specific places in VRAM.

Conceptually, this is exactly what WebAssembly does. It runs through a large ArrayBuffer, manipulating individual bytes as it executes. I could pre-define sections of this buffer to be my ROM and RAM, and easily read or write to them from both JS and WASM. In fact, any scenario that involves lots of byte manipulation or math (say, a video decoder) is a great use case for WebAssembly!
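
As a rough sketch of that idea, the snippet below carves one page of WebAssembly memory into ROM and RAM regions using typed-array views. The offsets and sizes are assumptions chosen for illustration, not the emulator's real layout; with a Rust-compiled module, the compiler and allocator decide where things live and you would get the addresses from exported pointers instead.

```javascript
// Hypothetical layout: these offsets and sizes are illustrative only.
const PAGE_SIZE = 64 * 1024; // WebAssembly memory grows in 64 KiB pages
const ROM_OFFSET = 0;
const ROM_SIZE = 32 * 1024;
const RAM_OFFSET = ROM_OFFSET + ROM_SIZE;
const RAM_SIZE = 8 * 1024;

const memory = new WebAssembly.Memory({ initial: 1 });

// Each view is a window onto the same underlying bytes; nothing is copied.
const rom = new Uint8Array(memory.buffer, ROM_OFFSET, ROM_SIZE);
const ram = new Uint8Array(memory.buffer, RAM_OFFSET, RAM_SIZE);

ram[0] = 0x42;
// A view over the whole buffer sees that byte at its absolute address.
const all = new Uint8Array(memory.buffer);
console.log(all[RAM_OFFSET].toString(16)); // prints "42"
```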

Why I chose Rust as a starting point

It’s not because Rust is new and hip, or even because I wanted to learn a new language. Actually, it’s as simple as the fact that Rust is one of the few languages that is currently capable of (easily) compiling straight to WASM through LLVM. As someone who used to write C for embedded systems, I’d be more than happy to have done everything in C99, but it seems like we’re still a few months away from that being an easy toolchain to set up.

When it comes to setting up Rust for compilation to WASM, the web has a number of conflicting guides that may have worked at one point but no longer do. That’s the risk of diving into something so new: the rules change quickly. As for my personal setup, I owe a huge debt of gratitude to Jan-Erik Rediger and his guides on hellorust.com. I was able to install the nightly build of Rust, and compile a quick-and-dirty WebAssembly project. The next hurdle would be figuring out how to get it to play nicely with JS.

JS <-> WASM Interop

One of the challenges of building with WebAssembly is sharing data between your JS and your WASM code. Due to the nature of WebAssembly at this point in time, you can’t pass arbitrary JS objects to it. Instead, you’re basically limited to passing numbers back and forth. While this sounds incredibly constraining, this is more-or-less how all high-level programming languages, including JS, are implemented deep down. Those numbers can be pointers into a block of memory where you have encoded some higher-level data structure. JS can send a complex object by writing it to a location in the buffer, and transferring that location to some WASM code that knows how to read that object back out. Projects like wasm-bindgen have been started to simplify a lot of that work for developers.
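
Here is a small sketch of that pointer-passing pattern, using a string as the "complex object." The helpers `writeString` and `readString` and the address `16` are hypothetical; a real module would allocate the space itself, and the reading half would live in WASM rather than JS.

```javascript
const memory = new WebAssembly.Memory({ initial: 1 });

// Encode a string as UTF-8 at `ptr` and return its length in bytes.
function writeString(memory, ptr, text) {
  const bytes = new TextEncoder().encode(text);
  new Uint8Array(memory.buffer, ptr, bytes.length).set(bytes);
  return bytes.length;
}

// What the receiving side does: rebuild the value from (ptr, len).
function readString(memory, ptr, len) {
  return new TextDecoder().decode(new Uint8Array(memory.buffer, ptr, len));
}

const ptr = 16; // hypothetical free address, normally handed out by an allocator
const len = writeString(memory, ptr, 'TETRIS');
// Only these two numbers would actually cross the JS/WASM boundary.
console.log(readString(memory, ptr, len)); // prints "TETRIS"
```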

I didn’t use any object-embedding logic, though. Nearly all of my execution is done on the WASM side. I really just rely on JS to call my code at a pre-defined pace using a requestAnimationFrame loop, and occasionally listen for input events. Even so, I still had to solve some architectural challenges on the WASM side.

The simplest way to write the behavior I described is to have WASM expose an init() method to allocate global state, and a frame() method to update it. However, given the way Rust works, it’s difficult to maintain that global state between different external method calls. Instead, my solution was to return a pointer to that state from the init() call, and pass it to every method like frame(). That way, those methods know where to find the object to operate on, while keeping Rust’s borrow checker happy.

Example from the JS side:

// WASM init logic, abridged
fetch('path/to/gb.wasm')
  .then(res => res.arrayBuffer())
  .then(bytes => WebAssembly.instantiate(bytes, { /* imports */ }))
  .then(result => {
    const exports = result.instance.exports;
    return {
      memory: exports.memory,
      createVM: exports.create_vm,
      frame: exports.frame,
      // ...
    };
  })
  .then(wasm => {
    // create a pointer to an allocated VM object
    const vm = wasm.createVM();
    // update the instance, ideally once per frame
    wasm.frame(vm);
  });

Example from the Rust side:

use std::mem;

#[no_mangle]
pub fn create_vm() -> *mut VM {
    let vm = VM {
        // init the struct
    };
    // allocate the object on the heap, and get a raw pointer to it
    let b = Box::new(vm);
    Box::into_raw(b)
}

#[no_mangle]
pub fn frame(raw: *mut VM) {
    unsafe {
        // rebuild the Box so we can operate on the VM behind the pointer
        let mut vm = Box::from_raw(raw);
        // update the vm
        // ...
        // forget this, to ensure our object isn't deallocated
        mem::forget(vm);
    }
}

This method of passing back pointers to JS is also useful for debugging, and necessary for some I/O behavior implemented on the JS side. For instance, when the emulator boots up, I export a pointer to the location of the ROM memory that has been allocated by Rust/WASM. When a user uploads a Game Boy ROM file, I copy the data byte-by-byte into that buffer, so that the WASM code can execute it. When I was debugging the emulator, I found it necessary to inspect the emulator’s memory after each instruction. By exporting the pointer to the RAM table, I could create a Uint8Array at that location and inspect it byte-by-byte.

const ramPtr = wasm.getRamPointer(vm);
const buffer = wasm.memory.buffer;
const ram = new Uint8Array(buffer, ramPtr, RAM_SIZE);
// Inspect or modify individual byte elements
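
The ROM upload works the same way in reverse. Below is a minimal sketch, assuming the module exports a pointer to its ROM region and that you know the region's capacity; `loadRom`, `romPtr`, and `romCapacity` are hypothetical names, not the emulator's actual API.

```javascript
// Copy an uploaded ROM into the region that WASM reserved for it.
// `romPtr` and `romCapacity` stand in for values the module would export.
function loadRom(memory, romPtr, romCapacity, romBytes) {
  if (romBytes.length > romCapacity) {
    throw new RangeError('ROM is larger than the reserved buffer');
  }
  // Create the view right before use: memory.buffer is replaced
  // with a new ArrayBuffer whenever the WASM memory grows.
  const dest = new Uint8Array(memory.buffer, romPtr, romCapacity);
  dest.set(romBytes); // bulk copy, same result as a byte-by-byte loop
  return romBytes.length;
}
```

In a browser, the bytes would typically come from a file input, by wrapping the uploaded file's ArrayBuffer in a `Uint8Array` before passing it in.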

Growing pains

Once the pointer sharing was sorted out, I pretty much stopped fighting Rust and everything Just Worked™. However, I did uncover one major challenge of developing with Rust and WASM: the debugging story. On the WebAssembly side of things, if something goes wrong, you can often encounter a vague memory error. Chrome and Firefox both provide handy debuggers that display your WebAssembly in its human-readable format, but there’s no way to determine which piece of your Rust code was compiled into that particular bit of WASM.

There are future plans to implement source maps, which would be a huge development boost, but in the meantime you’re left finding another way around this. My strategy was a mix of unit tests (probably a good thing) and a secondary build target that ran my emulator natively (definitely some extra work, but fun to have). With the second, I was able to run line-by-line debugging with lldb to track down most issues.

I’m a bit ashamed to admit that another recurring cause of these inscrutable error messages was simply forgetting to pass the VM pointer to my WASM methods. There’s a market out there for someone who wants to build a tool that generates TypeScript/Flow bindings out of Rust code, and keeps developers like me from making dumb mistakes.
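
Short of generated bindings, one low-tech guard is to wrap the exports so the pointer gets supplied for you. This is a sketch of that idea, not something the emulator ships: `bindVM` is a hypothetical helper, and it assumes the `create_vm` and `frame` exports from the examples earlier in this post.

```javascript
// Wrap raw WASM exports so the VM pointer is passed automatically.
// Assumes every listed export takes the VM pointer as its first argument.
function bindVM(exports, methodNames) {
  const vm = exports.create_vm();
  const bound = { vm };
  for (const name of methodNames) {
    const fn = exports[name];
    if (typeof fn !== 'function') {
      // Fail loudly at startup instead of with a vague memory error later.
      throw new TypeError(`missing WASM export: ${name}`);
    }
    bound[name] = (...args) => fn(vm, ...args);
  }
  return bound;
}
```

With this in place, `const gb = bindVM(result.instance.exports, ['frame'])` lets you call `gb.frame()` with no pointer to forget. The rest arguments do allocate a small array on every call, so for a hot per-frame path you might prefer hand-written wrappers.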

Architecture of the emulator

As I mentioned earlier, I wanted to keep as much logic in the WASM side as possible. The two exceptions are the graphics logic, which was implemented as a series of WebGL shaders that read raw bytes from VRAM, and the audio system, which uses a JS AudioContext.

Within a given frame, here’s everything that occurs:

  • The JS code copies the current button state to locations where the WASM code can read them.
  • JS calls into WASM to execute one full frame, which runs as many CPU instructions as it can before the VSync event occurs (screen has finished drawing, in the original GB hardware).
  • If, during the execution of those instructions, any audio changes are triggered, the WASM code calls back to JS to manipulate the AudioContext.
  • Finally, the raw video memory is fed to the WebGL program, which draws the screen for that frame.
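
The steps above can be sketched roughly like this. Everything named here is hypothetical (`runFrame`, `startLoop`, `inputPtr`, `getVramPointer`, `vramSize`, and the `render` callback); the point is the shape of the frame, not the emulator's actual function names.

```javascript
// One emulator frame, following the steps listed above.
function runFrame(wasm, vm, inputPtr, buttons, render) {
  // Copy button state (one byte per button) to where WASM can read it.
  new Uint8Array(wasm.memory.buffer, inputPtr, buttons.length).set(buttons);
  // Run CPU instructions until the frame is complete. Any audio changes
  // happen inside this call, through JS functions imported by the module.
  wasm.frame(vm);
  // Hand the raw video memory to the renderer (WebGL in the real project).
  const vramPtr = wasm.getVramPointer(vm);
  render(new Uint8Array(wasm.memory.buffer, vramPtr, wasm.vramSize));
}

// Drive a step function once per tick. In a browser, `schedule` would be
// requestAnimationFrame; taking it as a parameter keeps the loop testable.
function startLoop(schedule, step) {
  let running = true;
  const tick = () => {
    if (!running) return;
    step();
    schedule(tick);
  };
  schedule(tick);
  return () => { running = false; }; // call this to stop the loop
}
```

In a page, that might be wired up as `startLoop(requestAnimationFrame, () => runFrame(wasm, vm, inputPtr, buttons, draw))`, where `draw` uploads the bytes to a texture.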

From there, it’s easy to layer on a bunch of extra features, such as periodically copying Save RAM to IndexedDB so you can save your progress, enabling VR mode, or simply pausing the audio when the tab loses focus. All in all, it’s about 7K lines of Rust which compile to a 70 KB WASM bundle. Not bad, considering most Game Boy games are bigger than that.
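
As one example, here is a sketch of the Save RAM idea that only writes when something has changed. `createSaveWatcher`, `savePtr`, `saveSize`, and the `persist` callback are all assumptions for illustration; in a browser, `persist` might wrap an IndexedDB write.

```javascript
// Persist Save RAM only when it differs from the last snapshot.
function createSaveWatcher(memory, savePtr, saveSize, persist) {
  let last = null;
  return function check() {
    const live = new Uint8Array(memory.buffer, savePtr, saveSize);
    if (last !== null && live.every((byte, i) => byte === last[i])) {
      return false; // unchanged since the last snapshot, nothing to write
    }
    // slice() copies the bytes, so later emulation can't mutate the snapshot.
    last = live.slice();
    persist(last);
    return true;
  };
}
```

Calling the returned function from a timer every few seconds would give the periodic behavior described above. Note that the first call always persists, since there is no earlier snapshot to compare against.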


Coming soon: posts on simulating the Game Boy’s CPU in Rust, implementing the graphics unit with a series of WebGL 2.0 shaders, and how to add a WebVR mode to any project that already renders to a canvas.

I’ll have the code up on GitHub in the next few days. Just a few i’s to dot and t’s to cross before it’s good to go, so keep an eye out!
