N64 VR with Javascript

TLDR: I made an N64 emulator work in VR with Javascript and WebXR. Emukit is open source.

Emukit running Ocarina of Time

I’ve always been interested in emulation. One of my first serious introductions to programming was hacking Japanese Pokemon Gold into English before the Western release. Finishing Final Fantasy 7 in an emulator at 4x speed is what got me into Final Fantasy. And I have twitch reflexes for the classic Zeldas — on keyboard and mouse.

So I took the last week off of Exokit browser work to hack on VR-ifying the N64 in Javascript.

…And fixed several bugs in Exokit in the process!

Part 1: Emulating

I knew VR-ifying N64 with Javascript would have to be theoretically possible.

There is a project called Retroarch that provides an emulator frontend to various emulator back ends. One of those back ends is Mupen64/Parallel N64. Retroarch and many of its cores are already compiled to WebAssembly, with a layer that binds to WebGL.

When you’ve got that, anything can be intercepted and hacked with Javascript to draw into a headset with WebXR/WebVR. It’s just a matter of hax!

So the first step was to boot Retroarch Web Player locally for hacking. This was more complicated than it seems:

  • The web player emulates operating system services in the browser.
  • The browser APIs have some impedance mismatch with what you need for implementing an operating system.

Corner cases

One of those corner cases is browserfs, which is a file system implementation in Javascript with several backing drivers, such as XMLHttpRequest. File systems often support blocking I/O, but this is somewhat antithetical to Javascript, especially Javascript emulation, and double-especially Javascript XR.

The XMLHttpRequest FS layer was relying on the blocking request mode to implement the blocking I/O calls. But this has long been deprecated in web browsers, and there is straight up no good solution for doing this kind of thing in Node, short of binary modules or `child_process.execSync`.

I wanted to get this working in Exokit for the performance/hackability benefits (Exokit is a JS npm module, and faster than chrome), and I didn’t want to resort to supporting more backwards-thinking APIs. So I just rewrote the XMLHttpRequest usage to use async calls. Luckily there was no real good reason for them being synchronous in the first place and this was an easy switch.
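The switch from a blocking read to an async one can be sketched roughly like this. It is not the actual browserfs code: `readRemoteFile` and the injectable `fetchImpl` parameter are hypothetical names, and the sketch assumes a `fetch`-compatible function is available.

```javascript
// Hypothetical async replacement for a blocking XMLHttpRequest read.
// `fetchImpl` is injected so the sketch is not tied to one environment.
async function readRemoteFile(url, fetchImpl = globalThis.fetch) {
  const response = await fetchImpl(url);
  if (!response.ok) {
    throw new Error(`Failed to load ${url}: HTTP ${response.status}`);
  }
  // Return raw bytes, which is what a file system driver would store.
  return new Uint8Array(await response.arrayBuffer());
}
```

The caller then has to `await` the read, which is the part that ripples through code that used to assume synchronous I/O.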

One other technicality was that it seems by default Retroarch would try to synchronize audio with video and effectively hang at the Genesis refresh rate (60 Hz) to play audio samples. That is insufficient for good desktop VR (which generally runs at 90 FPS), so I fixed that technicality by simply returning from the audio processing loop early. Which killed audio, but made everything else buttery.

I DJ on the Twitch streams anyway, so it was an acceptable loss — though the Sonic 3 soundtrack is definitely amazing and worth checking out. Some of the tracks were supposedly done by Michael Jackson! </tangent>

So at this point I could boot Retroarch web player into Exokit and play Sonic 3 (& Knuckles!). It worked first try, too! This was both shocking and exciting because at this point we were running a Genesis emulator on top of a WebAssembly runtime, on top of a browser emulator, backed by a WebGL emulated with OpenGL.

Anyway, we were just a couple of small hacks away from a decent UX for booting relatively arbitrary ROMs.

We just had to:

  1. Hook up a file `drop` listener to grab files from the user’s operating system and inject them into the emulated filesystem in a well-known place
  2. Load the appropriate WASM/JS bootstrap core for the target emulator for the ROM type (detected by file name), and
  3. Call the Retroarch WASM “executable” with the right arguments to load the ROM file from its emulated location
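The three steps above can be sketched as follows. Everything here is illustrative: the extension table, the core names, the `/roms` directory and the helper names are assumptions, and `fs` is assumed to expose an Emscripten-style `writeFile(path, bytes)`.

```javascript
// Hypothetical extension-to-core table; the core names are illustrative.
const CORES_BY_EXTENSION = {
  md: 'genesis_plus_gx',
  gen: 'genesis_plus_gx',
  n64: 'mupen64plus',
  z64: 'mupen64plus',
};

// Step 2: pick a core from the ROM's file name.
function coreForRom(fileName) {
  const extension = fileName.split('.').pop().toLowerCase();
  const core = CORES_BY_EXTENSION[extension];
  if (!core) throw new Error(`No core known for .${extension}`);
  return core;
}

// Steps 1 and 3: copy the dropped file into the emulated file system at a
// well-known path, then build the arguments for the emulator entry point.
async function prepareDroppedRom(file, fs, romDir = '/roms') {
  const romPath = `${romDir}/${file.name}`;
  fs.writeFile(romPath, new Uint8Array(await file.arrayBuffer()));
  return { core: coreForRom(file.name), args: [romPath] };
}
```

In the browser this would be called from a `drop` listener with `event.dataTransfer.files[0]` (after `event.preventDefault()`), and the returned `args` would be handed to the WASM entry point once the chosen core has loaded.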

For good measure at this point I booted the Genesis emulator in an Exokit reality tab as a plane whose material texture is bound to the iframe framebuffer.

Hooking up basic gamepad events for a D-pad, A, and Start, we could successfully play emulated Genesis in VR.

Genesis emulated inside WebXR via WebAssembly RetroArch

Part 2: Haxing The Matrix

So, we had Retroarch booting to a screen in VR. I figured if Genesis worked then pretty much any Retroarch-supported console would work — including N64 — because abstracting ROM emulation into GL calls is precisely the domain of Retroarch, and the main hard part would be supporting the APIs in the first place.

But making the emulator run on a screen was boring. It could only work as a (potentially multiplayer) 2D screen, but I wanted to challenge myself.

So I set out to run N64 games in Javascript mixed reality.

The first step was to read up on how the N64 does 3D rendering. If I understood that I could probably get a good sense of what the emulator would be doing and the code would be much clearer when I looked at it.

N64 rendering pipeline

It turns out that the N64 had a pretty basic rendering pipeline, without the modern concept of programmability or even shaders. It had GPU opcodes that different games used, but at its core the pipeline was fixed in how it pushed triangles to the screen.

I also learned that the N64 rendering used the right-hand rule, which was a relief since that would be compatible with THREE.js without extra hax!

From this I figured there would be some standard shader I could hook into to hack in a new model-view/projection matrix to reproject triangles into a VR headset with OpenVR. This turned out to be true.

Since Exokit implements WebGL in Javascript, I could trap and manifest the GL calls list with console.log. I just logged all of the shaderSource calls to see the source of the shaders that were being put into the pipeline.
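A minimal sketch of that kind of trap, assuming only that `gl` is an object with a `shaderSource(shader, source)` method. The `traceShaderSources` name and the `log` parameter are made up for illustration.

```javascript
// Wrap gl.shaderSource so each shader's GLSL is logged before it is
// handed to the real implementation.
// `log` defaults to console.log; pass another function to collect sources.
function traceShaderSources(gl, log = console.log) {
  const original = gl.shaderSource;
  gl.shaderSource = function (shader, source) {
    log(source);
    return original.call(this, shader, source);
  };
  return gl;
}
```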

Vertex shaders source

There seemed to be only two vertex shaders in Ocarina of Time, one of which was an obvious quad texture map, and one looked like textured scene geometry, judging from the uniforms. I confirmed this by hacking the gl_Position with a String.replace to offset Link, Epona, and the Sun programs individually.

It was all just guesswork; I hacked the program based on its ID and looked at what happened when I ran the ROM.

From this I deduced how the vertex shader projection model worked: it takes triangles generated by the CPU, does multiplication to normalize an integer position.z to a float, negates it, and throws it through another shader which does a standard projection to the screen. There the z is re-negated to follow the right-hand rule — I have no idea why it’s like this.

Anyway, since I could control the shaders that the emulator was generating, I added my own uniforms for viewModelMatrix and projectionMatrix with another String.replace, and populated them by hacking in a getUniformLocation further in the pipeline.
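Roughly, the String.replace trick could look like the sketch below. The regular expressions and the `patchVertexShader` name are assumptions, and it only handles a shader with a single plain `gl_Position = ...;` assignment, so treat it as an illustration rather than the exact patch that was used.

```javascript
// Uniform names as mentioned in the text.
const EXTRA_UNIFORMS =
  'uniform mat4 viewModelMatrix;\nuniform mat4 projectionMatrix;\n';

function patchVertexShader(source) {
  // Wrap whatever the shader assigns to gl_Position in the new matrices.
  const reprojected = source.replace(
    /gl_Position\s*=\s*([^;]+);/,
    'gl_Position = projectionMatrix * viewModelMatrix * ($1);'
  );
  // Declare the extra uniforms just before main().
  return reprojected.replace(
    /void\s+main\s*\(/,
    (match) => EXTRA_UNIFORMS + match
  );
}
```

Hooking this into the pipeline is then a matter of running it on the source inside a wrapped `shaderSource` call.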

Initially I set the matrices to identity (no-op) to make sure the multiplication was not exploding. Then I added a translation in the viewModel, which also seemed to work.
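For reference, here is what an identity and a translation matrix look like in the column-major layout that `gl.uniformMatrix4fv` expects. The helper names are hypothetical and the values are only examples.

```javascript
// 4x4 matrices in column-major order, the layout gl.uniformMatrix4fv expects.
function identityMatrix() {
  return new Float32Array([
    1, 0, 0, 0,
    0, 1, 0, 0,
    0, 0, 1, 0,
    0, 0, 0, 1,
  ]);
}

// Translation lives in the last column, i.e. elements 12, 13 and 14.
function translationMatrix(x, y, z) {
  const m = identityMatrix();
  m[12] = x;
  m[13] = y;
  m[14] = z;
  return m;
}

// Multiply a column-major matrix by an [x, y, z, w] vector.
function transform(m, [x, y, z, w]) {
  return [
    m[0] * x + m[4] * y + m[8] * z + m[12] * w,
    m[1] * x + m[5] * y + m[9] * z + m[13] * w,
    m[2] * x + m[6] * y + m[10] * z + m[14] * w,
    m[3] * x + m[7] * y + m[11] * z + m[15] * w,
  ];
}

console.log(transform(identityMatrix(), [1, 2, 3, 1]));           // [1, 2, 3, 1]
console.log(transform(translationMatrix(0, 0, -2), [1, 2, 3, 1])); // [1, 2, 1, 1]
```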

Then I hooked up the HMD matrices coming from Exokit’s OpenVR binding, and the goggle-faced hacking started.

The first thing I noticed in VR is that Link wasn’t as _thicc_ as he should be, as pointed out in the Twitch chat. That is, the whole game was running in a flat plane. It was the screen we wanted to get rid of!

But on closer inspection, it wasn’t actually a screen…

It was actually Hyrule field, except superflat!

The curious case of Z

The different geometries were separated in the Z dimension. One probably couldn’t tell from a 2D render, but you viscerally see those millimeters with the beauty of depth perception ;).

So I figured the right fix was to multiply the Z by some factor. After all, it was already being converted from int to float with a division, so perhaps the range got borked in the math.

The problem is that scaling the Z made it clear this is a logarithmic Z, because the nearest points were stretched out to infinity, even while the furthest points were pretty compact. So we can’t fix this by multiplying by a constant.

I tried guessing at a couple of constant factors for the logarithm which kind of worked: 10, 2, 16, 32, 64, and their inverses. But these fixed the scale of some parts of the scene while stretching others. So I knew that this probably wasn’t a pure logarithm.

That’s when I realized I’d been resetting the `gl_Position.w` factor to 1 instead of keeping its value as provided by the game. By keeping it the projection matrix math ended up being corrected.
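Why `w` matters comes down to homogeneous coordinates in general: a position `[x, y, z, w]` stands for the 3D point `[x/w, y/w, z/w]`, so overwriting `w` changes which point you end up with. The numbers below are made up for illustration, not taken from the game.

```javascript
// Convert a homogeneous [x, y, z, w] position into the 3D point it represents.
function perspectiveDivide([x, y, z, w]) {
  return [x / w, y / w, z / w];
}

const fromGame = [2, 4, 6, 2];     // example values, not taken from the game
const wOverwritten = [2, 4, 6, 1]; // same position with w forced to 1

console.log(perspectiveDivide(fromGame));     // [1, 2, 3]
console.log(perspectiveDivide(wOverwritten)); // [2, 4, 6]
```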

We had Ocarina of Time’s main menu VRified!

You notice some interesting things when you do this:

  1. The N64 clips everything to a literal frustum.
  2. The menu is technically super tiny in world space! It’s just really close to the camera.
  3. The skybox is actually _in front_ of the rest of the geometry, but it’s got the depth test disabled.
  4. The N64 z-culls its geometry with great efficiency.

It lagged enough to boot us back to the Vive tracking environment. But… it was working!

Part 3: Layering

At this point we had the N64 emulated scene running but we had no sense of presence in the world. No hands, no other objects, just the game render.

I figured we should at least have VR controllers, a virtual console, and a controls tutorial in the world to ground us. I had the models lying around from Zeo!

But the question becomes: how do you draw models, or anything, into a scene when the emulator owns and trashes the GL context?

We can of course set up a scene graph with THREE.js to draw our models from the correct HMD perspective, but we need to consider when we do that draw in the render loop, so that it blends with the emulator without any screen wipes or overwrites.

In the end I decided to go with letting the emulator do its thing, setting THREE.js to preserve the emulator’s context (avoid gl.clear()), and then drawing the console and controllers afterwards. One gotcha was making sure to inform THREE.js that the states of the gl context like buffers, shaders, and textures are unstable via renderer.state.reset() before rendering.
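The draw order described above can be sketched like this. `drawOverlay` is a hypothetical helper; `renderer`, `scene` and `camera` are assumed to be THREE.js objects from a version where `renderer.state.reset()` is available.

```javascript
// Draw a THREE.js scene on top of whatever the emulator already rendered.
function drawOverlay(renderer, scene, camera) {
  renderer.autoClear = false;     // do not wipe the emulator's frame
  renderer.state.reset();         // drop cached GL state the emulator changed
  renderer.render(scene, camera); // console and controllers go on top
}
```

This would run once per frame, right after the emulator has finished issuing its own GL calls.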

Part 4: Controls

The next step was adding controls to actually control the game.

The first thing I did was sit down, close my eyes, and figure out a mapping for an N64 controller to the Vive controller. Surprisingly, this is doable for all of the keys that matter.

The second thing I did was to comment out the weird code that Retroarch Web Player had for handling gamepads via the Gamepad API. It didn’t support all of the N64 buttons, it had the analog axes wrong for the Vive touchpad, and I couldn’t be bothered to debug that.

Instead I chose to do the thing I know, which is both easy to implement and, more importantly, easy to debug: `KeyboardEvent` keys!

I wrote a loop polling the Vive gamepads via the Gamepad API, and mapped that to keyboard events dispatched on the `window.document`.
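A sketch of such a polling step, with the event dispatch factored out so it is easy to test. The button-to-key table and the `pollGamepad` name are hypothetical; `gamepad` is assumed to look like a Gamepad API object with a `buttons` array of `{ pressed }` entries.

```javascript
// Hypothetical mapping from gamepad button index to a keyboard key.
const KEY_FOR_BUTTON = { 0: 'x', 1: 'z', 2: 'Enter' };

// Compare the current button states with the previous poll and emit a
// keydown or keyup for every mapped button that changed.
// Returns the new state, to be passed back in on the next poll.
function pollGamepad(gamepad, previous, emit, keyMap = KEY_FOR_BUTTON) {
  const current = gamepad.buttons.map((button) => button.pressed);
  current.forEach((pressed, index) => {
    const key = keyMap[index];
    if (key !== undefined && pressed !== Boolean(previous[index])) {
      emit(pressed ? 'keydown' : 'keyup', key);
    }
  });
  return current;
}
```

In the browser, `emit` could be something like `(type, key) => document.dispatchEvent(new KeyboardEvent(type, { key }))`, called every frame for each entry returned by `navigator.getGamepads()`.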

The one remaining problem was that Retroarch actually does not have key bindings for all of the Nintendo 64 controller buttons, so we had no actual key to emit.

I found out that this was controlled by a file called retroarch.cfg. That file didn’t exist, but with some virtual filesystem hacking I managed to inject it into the right place in the boot loop.
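Generating and injecting the file could look something like this. The option names follow retroarch.cfg's `input_player1_*` convention, but the particular bindings, the helper names and the Emscripten-style `fs.writeFile` are assumptions; the path is left to the caller.

```javascript
// Example bindings; the chosen keys are arbitrary.
const BINDINGS = {
  input_player1_a: 'x',
  input_player1_b: 'z',
  input_player1_start: 'enter',
  input_player1_r_y_minus: 'i', // right analog up (assumed here to act as C-up)
};

// Serialize to the `name = "value"` lines that retroarch.cfg uses.
function buildRetroarchCfg(bindings) {
  const lines = Object.entries(bindings).map(
    ([name, value]) => `${name} = "${value}"`
  );
  return lines.join('\n') + '\n';
}

// Write the config into the emulated file system before the emulator boots.
function injectRetroarchCfg(fs, cfgPath, bindings = BINDINGS) {
  fs.writeFile(cfgPath, buildRetroarchCfg(bindings));
}
```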

I went into VR to “confirm that everything was working”. I actually ended up playing for quite a while, relishing the chance to literally walk through the scenes that made up my childhood.

Part 5: Optimizing

So at this point everything was working, but it was a bit slow.

It was time to hook up Exokit (which is just a Node module) to the Node profiler via Chrome Devtools and capture a profile. This just worked and immediately showed us the prime suspect for the slowdown.

It was the matrix hack; Exhibit A was glGetUniformLocation.

Basically, we were querying the shader for the uniform location to use for our viewModelMatrix and projectionMatrix every single draw call. This multiplied the GL overhead by 3 or so.

Luckily, it was an easy fix. Uniform locations don’t change once the shader program is linked, so we could just cache the location on the WebGLProgram, which made the overhead once per program instead of once per draw call.
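One way to do that kind of caching, sketched with a `WeakMap` keyed by the program object that `getUniformLocation` takes. The `cachedUniformLocation` name is hypothetical, and this is not necessarily how the cache was stored in the actual hack.

```javascript
// Cache uniform locations per program so gl.getUniformLocation runs once
// per (program, name) pair rather than on every draw call.
const locationCache = new WeakMap();

function cachedUniformLocation(gl, program, name) {
  let locations = locationCache.get(program);
  if (!locations) {
    locations = new Map();
    locationCache.set(program, locations);
  }
  if (!locations.has(name)) {
    locations.set(name, gl.getUniformLocation(program, name));
  }
  return locations.get(name);
}
```

Using a `WeakMap` means the cached entries go away together with the program object, so nothing has to be cleaned up by hand.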

There was an immediate difference in performance — it was now pretty consistent regardless of scene complexity (number of shaders per frame). But it still wasn’t 90 FPS. It felt more like 60.

I profiled again and saw that all of the time was taken up by the WASM. That was disappointing, because it meant that there was probably little that could be done to speed up the rendering besides optimizing the WebAssembly implementation of V8 (which I am not at all qualified to do).

Or, hacking in a pure binary, threaded, non-WASM version of Retroarch as a node module — which we can do in Exokit, but it would cease to be web! No go!

But then it hit me that what felt like 60 FPS might not be a coincidence: it could be that the WASM blob was trying to do its own timing and introduce a delay to hit a frame rate target, as an emulator is wont to do.

This turned out to be right: retroarch.cfg was capped at 60 FPS! I bumped it up to 90 and bam! We started hitting the frame target most of the time, and this completely eliminated any instances of us being booted back to the tracking environment.
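To see why a frame cap matters, here is a toy model of a frame limiter. It is not Retroarch's actual timing code, and the 8 ms of work per frame is an invented figure.

```javascript
// How long a simple frame limiter would wait after a frame whose work
// took `workMs`, in order to hit `targetFps`.
function frameDelayMs(workMs, targetFps) {
  const budgetMs = 1000 / targetFps;
  return Math.max(0, budgetMs - workMs);
}

console.log(frameDelayMs(8, 60).toFixed(1)); // "8.7"
console.log(frameDelayMs(8, 90).toFixed(1)); // "3.1"
```

With a 60 FPS target the limiter idles for longer than the frame itself took, which is exactly the kind of delay that looks like slow rendering from the outside.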

That’s it for the functionality. Now it was time for polish.

Namely, the whole thing needed a controls explainer and drag-and-drop prompt to tell the user to drop in a ROM.

These were slapped together in Photoshop as a PNG and drawn as textures in the world. And… that’s it!

Future directions (possibly, let me know!)

  • More consoles: Snes, Gameboy, Playstation, Dreamcast, Genesis.
  • Multiplayer w/avatars.
  • Multiple consoles at once.
  • Save states.

Video

Here’s a longer video of the result in action:

Try it out!

If you have an N64 ROM and a headset, you can load it in VR on the live site at https://emukit.webmr.io/. Emukit is open source on Github!

Firefox might have performance issues; it runs best on Exokit.

Keep in touch

Most of this was developed live on Twitch! I’m also on Twitter and Github.

There’s a bunch of people talking about these hacks on the Exospace Discord. You should come say hi ;)

Like what you read? Give Avaer Kazmer a round of applause.
