saved game from March 2005, successfully loaded and running in vanilla Chrome 🎈

To the Browser!

How to browserize RollerCoaster Tycoon?

Porting OpenRCT2 to the Browser using Emscripten


I’ll illustrate my approach, some of the challenges, and the results I got when porting OpenRCT2 (an open-source RollerCoaster Tycoon clone written in C/C++) to HTML/JavaScript using Emscripten. The sources can be found in my fork of OpenRCT2. Because OpenRCT2 relies on an installation of the original game for the brilliant artwork, I have not hosted my port anywhere. Needless to say, I would love to talk with Chris Sawyer/Origin8/Atari in order to change that!😀


Approach & Challenges

I did all the work on my Windows 10 laptop (i7–5557U@3.1GHz, 16GB), but made heavy use of WSL a.k.a. “Bash on Ubuntu on Windows”, since both Emscripten and OpenRCT2’s platform-dependent code seemed most straightforward to handle on Linux. Wondering how rendering would translate to JavaScript, I was pleased to learn that OpenRCT2 uses SDL2 for its rendering/UI and that Emscripten has a port for that. Bingo!

1) Getting it to compile with Emscripten. Emscripten’s compiler emcc acts like a drop-in replacement for clang/gcc, so working on the build scripts was the first thing to do. OpenRCT2 uses cmake/make, so I could simply set CC/CXX to emcc and .js files should pop out instead of binaries. In theory.

The first compilation attempts failed horribly due to missing symbols, so with the help of Stack Overflow and careful code inspection I improved my cmake command to:

cmake -D COMMON_COMPILE_OPTIONS="-S -emit-llvm -DDISABLE_NETWORK=1 -DDISABLE_HTTP -DDISABLE_TWITCH -D__amd64__ -D__LINUX__ -D__linux__ -D_LIBCPP_HAS_MUSL_LIBC" -D CMAKE_CXX_FLAGS="-stdlib=libc++"

The COMMON_COMPILE_OPTIONS mainly set the right defines in order for code to be less of a nightmare to compile (by literally compiling less code). I disabled things like networking since all I cared about for the prototype was the original game’s single player mode. The CMAKE_CXX_FLAGS are necessary in order for emcc, which is based on clang, to use the LLVM standard library instead of the GNU one.

I finally ended up with a >40MB openrct2.js as output. Up to this point, it felt like your usual experience of trying to compile someone else’s C/C++ project — there’s always some headers missing and some linker error, but at the end of the day, it somehow compiles. What could possibly go wrong now?

the JavaScript file looks good to me

2) Loading the JavaScript file. Right, we can’t test our openrct2.js just yet. Following the tutorials and examples, I crafted an index.html that loads openrct2.js and has a <canvas> element for Emscripten to use as a render target. Opening the webpage, I was positively surprised to see errors in the developer console, complaining that some RCT paths and files were not found — that means the game is trying to start up!

fancy, we even get error locations in the original source files 🤓

Emscripten has a nice file system abstraction, so that should be an easy step. Since the game queries for files synchronously (as in: it calls standard library functions that are expected to return the file contents without resorting to callbacks or similar), I had to make sure that all data is already loaded as soon as the game asks for it. I added code that populates Emscripten’s virtual file system with all resource files before invoking openrct2.js.

That is obviously a drawback — the browser loads the resources of an entire game into memory, regardless of what is actually asked for by the game. I don’t see a way around synchronous file delivery without gutting OpenRCT2 and rewriting parts of it, which is not my intention: the Browser should just be another build target, not something that relies on heavy customization. However, Web Workers allow querying for files synchronously, so I might investigate executing openrct2.js in a worker at some point.
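The preloading step can be sketched roughly as follows. This is a minimal illustration, not the code from the fork: populateFileSystem and the file paths in the usage comment are hypothetical, and fs stands for Emscripten's FS object, of which only mkdir and writeFile are used. The sketch also assumes that none of the directories exist yet.

```javascript
// Sketch: copy already-downloaded resource files into an Emscripten-style
// virtual file system so the game can read them synchronously.
// `fs` is expected to offer mkdir(path) and writeFile(path, bytes),
// like Emscripten's FS object. `files` maps absolute paths to Uint8Arrays.
function populateFileSystem(fs, files) {
  const created = new Set();
  for (const [path, bytes] of Object.entries(files)) {
    // Create each parent directory once, e.g. "/rct" and "/rct/Data"
    // for the (hypothetical) path "/rct/Data/g1.dat".
    const parts = path.split("/").filter(Boolean);
    let dir = "";
    for (const part of parts.slice(0, -1)) {
      dir += "/" + part;
      if (!created.has(dir)) {
        fs.mkdir(dir);
        created.add(dir);
      }
    }
    fs.writeFile(path, bytes);
  }
}

// Possible wiring in index.html (assumption: downloadedFiles was fetched
// completely before openrct2.js is loaded):
//   Module.preRun = [function () { populateFileSystem(FS, downloadedFiles); }];
```

Everything in files has to be in memory before the game starts, which is exactly the drawback discussed above.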

Further minor issues:

  • Emscripten’ed programs have an explicit heap (ArrayBuffer) which was filling up during resource loading. One can resolve that by adding emcc option -s TOTAL_MEMORY=<HOPEFULLY ENOUGH BYTES> or -s ALLOW_MEMORY_GROWTH=1 (which comes at a performance cost).
  • OpenRCT2 tries to do a dlopen on the running assembly when checking for the presence of Steam (function IsSteamOverlayAttached). Emscripten supports dlopen to some extent, but really I don’t wanna get into that business. I patched IsSteamOverlayAttached to always return false.

At that point, the game seemed to load: there were no more errors in the console and the script seemed to be stuck in some loop — which makes sense since OpenRCT2 updates and renders in a loop that is only ever left on exit. Unfortunately, browsers don’t update UI at all while a script is busy, so:

with a little imagination, this is a game

3) Breaking the game loop. The loop lives in src/openrct2/Context.cpp and looks roughly like this:

void RunGameLoop()
{
    _finished = false;
    do
    {
        RunVariableFrame();
    } while (!_finished);
}

It translates into an analogous JavaScript function called __ZN8OpenRCT27Context11RunGameLoopEv (due to C++ name mangling). It appears that _finished will be set to true if the user wants to exit the game. Since this is not really a concern for us (just close the tab/window), I tweaked the build to replace the function body with the following:

setInterval(function()
{
    Module["__ZN8OpenRCT27Context16RunVariableFrameEv"](self);
}, 16);
throw 42;

This registers a timer to periodically call RunVariableFrame and throws an exception in order to skip whatever code would run if the game exits normally (we can’t have it clean up resources or mess with the game state in any way).
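Stripped of the name mangling, the idea behind this patch can be sketched in plain JavaScript. startFrameTimer and its timers parameter are hypothetical names used only for illustration; the parameter exists so the scheduling can be swapped out, and defaults to the global setInterval/clearInterval. An interval of 16 ms means at most 1000 / 16 = 62.5 calls per second. Emscripten also ships emscripten_set_main_loop for this kind of problem, which is not what the port described here uses.

```javascript
// Sketch: run one game frame per timer tick instead of looping forever,
// so the browser gets control back between frames and can repaint.
function startFrameTimer(runFrame, intervalMs, timers = globalThis) {
  let frames = 0;
  const handle = timers.setInterval(() => {
    runFrame();   // one update + render step, then return to the browser
    frames += 1;
  }, intervalMs);
  return {
    stop() { timers.clearInterval(handle); },
    frameCount() { return frames; },
  };
}

// Possible usage in a page (runVariableFrame is a placeholder for the
// compiled frame function):
//   const loop = startFrameTimer(runVariableFrame, 16);
//   ...later: loop.stop();
```

Unlike the patched function body above, this sketch returns normally instead of throwing, so it only shows the scheduling part of the trick.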

Success, kinda.

d-d-does it live? People and pixels are in incredible pain in this version of the matrix
welcome to “Dynamite Dunes”; I actually managed to build something, so that debt is a lie — everything is

4) Glitches everywhere. The game randomly hangs, and some graphics and sounds are horribly broken while others are perfectly fine (like the title song, phew). Looking back, it boggles my mind how the game could have worked at all under these circumstances: like a miracle, the underlying simulation and data representation were apparently unaffected by the UI disaster. Let’s investigate.

This is where I started stepping through this massive 1.4 million LOC JavaScript file that looks something like this most of the time:

if ($60) {
    $61 = HEAP32[$12>>2]|0;
    $62 = ((($61)) + 4|0);
    $63 = ((($62)) + 1|0);
    $64 = ((($63)) + 1|0);
    $65 = HEAP8[$64>>0]|0;
    $66 = $65&255;
    $67 = $66 & 4;
    $68 = ($67|0)!=(0);
    if ($68) {
        $69 = HEAP8[3726389]|0;
        $70 = $69&255;
        $71 = $70 | 2;
        $72 = $71&255;
        HEAP8[3726389] = $72;
    }
}

Not sure what’s more valuable here, understanding JavaScript or Assembler. Anyways, I decided to step through the code that loads the graphics in both actual C/C++ OpenRCT2 and my apparently broken JS code. I identified the spot where one of the broken title screen sprites gets loaded, so I prepared both debugging sessions to be in that spot. I then literally stepped through the code side by side, learning about the meaning of memory locations and cryptic variables on the JavaScript side on the fly, hoping to find something odd on the JS side.

Finally, I found a mismatch in behavior! The code in question was supposed to fetch a stride of pixel data from the resource file using some byte offset. Both the C side and JavaScript side agree about the offset and the contents of the resource file, but the stride of data is different — it’s offset by 1 byte. Argh, alignment! I identified the evil instruction, which was using HEAP32[p>>2] to read data (aligned read!), whereas the C code was doing something in the spirit of *((int*)p).

Upon further investigation, I realized that OpenRCT2 makes use of unaligned access quite a lot. For example, there are packed structs in which some fields end up unaligned:

#pragma pack(push, 1)
typedef struct rct_draw_scroll_text {
    rct_string_id string_id;         // 0x00
    uint32 string_args_0;            // 0x02
    uint32 string_args_1;            // 0x06

    uint16 position;                 // 0x0A
    uint16 mode;                     // 0x0C
    uint32 id;                       // 0x0E
    uint8 bitmap[64 * 8 * 5];        // 0x12
} rct_draw_scroll_text;
#pragma pack(pop)

Most of that packing is actually necessary in order for OpenRCT2 to conform with the original RCT’s file/data formats — memory was valuable back then, so why not pack.
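To see which fields of the struct above end up unaligned, one can check each offset against the field's size. The following sketch assumes that rct_string_id is a 16-bit type (which the offsets in the comments imply), that natural alignment equals the field size, and that the struct itself starts at an aligned address; misalignedFields is a hypothetical helper.

```javascript
// Field layout copied from the offset comments of rct_draw_scroll_text.
// `size` is the size in bytes of the field (or of one array element).
const fields = [
  { name: "string_id",     offset: 0x00, size: 2 }, // assumed 16-bit
  { name: "string_args_0", offset: 0x02, size: 4 },
  { name: "string_args_1", offset: 0x06, size: 4 },
  { name: "position",      offset: 0x0a, size: 2 },
  { name: "mode",          offset: 0x0c, size: 2 },
  { name: "id",            offset: 0x0e, size: 4 },
  { name: "bitmap",        offset: 0x12, size: 1 },
];

// A field is naturally aligned if its offset is a multiple of its size.
function misalignedFields(layout) {
  return layout
    .filter((field) => field.offset % field.size !== 0)
    .map((field) => field.name);
}

console.log(misalignedFields(fields));
// prints [ 'string_args_0', 'string_args_1', 'id' ]
```

Under these assumptions, all three 32-bit fields sit at offsets that are not multiples of 4, so a read that assumes 4-byte alignment would fetch the wrong bytes for each of them.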

So apparently Emscripten assumes 32-bit alignment somewhere, which arguably makes sense and also greatly improves performance. Here’s what the unaligned version of HEAP32[p>>2] looks like:

HEAPU8[p>>0]|HEAPU8[p+1>>0]<<8|HEAPU8[p+2>>0]<<16|HEAPU8[p+3>>0]<<24
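The difference between the two reads can be reproduced with plain typed arrays. In this sketch the 16-byte heap, the address 5 and the helper names readAligned/readUnaligned are made up for illustration; the two expressions are the ones quoted above.

```javascript
// A tiny stand-in for the Emscripten heap: one buffer, two views.
const buffer = new ArrayBuffer(16);
const HEAPU8 = new Uint8Array(buffer);
const HEAP32 = new Int32Array(buffer);

// Store the 32-bit value 0x12345678 at the unaligned address 5
// (little-endian byte order, as C code on x86 would).
new DataView(buffer).setInt32(5, 0x12345678, true);

// Aligned read: p >> 2 drops the low two bits, so address 5 is
// silently treated as address 4.
function readAligned(p) {
  return HEAP32[p >> 2] | 0;
}

// Unaligned read: assemble the value byte by byte.
function readUnaligned(p) {
  return HEAPU8[p] | (HEAPU8[p + 1] << 8) | (HEAPU8[p + 2] << 16) | (HEAPU8[p + 3] << 24);
}

console.log(readUnaligned(5).toString(16)); // prints 12345678
console.log(readAligned(5) === readAligned(4)); // prints true
```

The aligned read at address 5 returns whatever lives at address 4, which matches the shifted data observed while stepping through the sprite loading code.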

A web search quickly finds the magic argument: -fmax-type-align=1. This will essentially make sure that no assumption about alignment is made, unless alignment is clearly derivable by the compiler. Result:

same scene as above, without glitches; who would have guessed that there’s a second row of balloon stands?

5) Optimization time. There is terrible lag. The above park runs at ~8fps if the screen shows busy regions and the browser window is as small as above (roughly 480p). Also, the JavaScript file is still over 40MB large and we all know that adding -fmax-type-align=1 only made that worse.

First I tackled file size by not compiling code that is never called. For example, OpenRCT2 not only supports rendering using SDL (“software”), but also using OpenGL (“hardware”). I didn’t see any point in also porting the OpenGL engine if the SDL engine works fine. At this point I stopped messing with cmake/make and just created a custom build script that compiles and links precisely the right source files with precisely the settings I want, which came in handy again later. I dropped around 20 source files related to OpenGL, networking and some Windows-only code, reducing file size by around 25%.

I then played around with the different optimization levels emcc offers and settled on -O3 at compile time (emits LLVM code) and -O2 at link time (emits JS). Furthermore, I added the following flags that appeared to have a positive effect on performance: -fno-exceptions, -s DISABLE_EXCEPTION_CATCHING=1, -s AGGRESSIVE_VARIABLE_ELIMINATION=1, -s ELIMINATE_DUPLICATE_FUNCTIONS=1, -s ASSERTIONS=0.

I’m not sure whether every single one of these truly benefited performance, but overall I measured a speedup of around 50% at this point, so double-digit frame rates are doable in small windows.

Ironically, I introduced another glitch with all of the above, specifically the moving text in banners/entrances suddenly flickered:

imagine flickering right there

I pinpointed it to -O3 (without that, no flicker) but did not investigate the precise cause any further. Maybe it mistakenly enforces alignment somewhere again? I adjusted the custom build script to deactivate -O3 only for some files related to text rendering, which did the trick (this customization is where I was very glad I had moved away from cmake; with my little cmake knowledge, it would have become a huge mess).

Funnily enough, while writing this post I tried to reproduce the above flicker bug but failed. Emscripten has gone through quite a few versions since I did the port, so maybe they indeed fixed a bug in the optimization!

At this point, file size is down to about 19MB (less than half of what we started with), minification achieves ~1.5MB, but doesn’t further improve performance. I assume that’s to be expected since the JavaScript conforms with asm.js and is hence compiled down to something which no longer cares about the source file (well, maybe the Browser is faster with compiling the minified version, but I’m not too worried about that).

6) Custom cursor glitch. Turns out there was a bug in Emscripten’s port of SDL’s CreateCursor method, which can create custom cursors from sprites. Specifically, the implementation was completely ignoring the desired hotspot:

32x32 sprite used as cursor, bounds indicated in green; hotspot defined at (15 | 31), but rendered as if it was (0 | 0) (for non-RCT players: the above pixel soup depicts a park visitor being dragged around by me)

In the above example, Emscripten sets the CSS cursor property of the canvas to url("data:…"), auto, whereas url("data:…") 15 31, auto would yield the expected result. The code generating the CSS cursor value simply left out the hotspot, so I submitted a fix.
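For illustration, building such a cursor value could look like the sketch below. This is not the actual patch to Emscripten's SDL port; cursorCss is a hypothetical helper and the data URL is a shortened placeholder rather than a real sprite.

```javascript
// Sketch: build a CSS cursor value that includes the hotspot, i.e. the
// pixel of the image that should sit exactly under the pointer position.
function cursorCss(dataUrl, hotspotX, hotspotY) {
  return `url("${dataUrl}") ${hotspotX} ${hotspotY}, auto`;
}

const css = cursorCss("data:image/png;base64,AAAA", 15, 31);
console.log(css);
// prints url("data:image/png;base64,AAAA") 15 31, auto

// In a page, the value would be assigned to the canvas:
//   canvas.style.cursor = css;
```

The trailing auto is the fallback cursor the browser uses if the image cannot be loaded.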

7) More optimization. The game seems functional at this point, but barely having two-digit frame rates is annoying. I used Chrome’s profiler to identify where most time is spent, which unsurprisingly pointed to rendering. I ended up optimizing some C/C++ code a bit and replaced the JS version of a blitting function with a manually optimized version. After a couple of nights of fine-tuning, I got frame rates to increase to >20fps (on a typically busy scene) in full screen mode.

As briefly mentioned earlier, at the time of writing this, Emscripten exists in an updated version, which allows me to use -O3 across the entire code. This not only further reduced file size to ~18.3MB, but also gives me ~30fps on typical scenes now! Then again, Chrome was also updated since I did my last amateurish measurements, so I’m not sure where the speedup comes from exactly. The game is actually very enjoyable at this point! I guess since there’s no need for super-fast reactions and also graphics rely on discrete animations that inherently make certain frame rates pointless, I was not bothered by 30fps at all. OpenRCT2 actually caps at 40fps (limit can be removed in settings), so we’re getting somewhere.

I think there is still tons of optimization potential from a code perspective, but at this point I decided to not spend more time on that. For example, I think one could benefit from the fact that browsers effectively freeze the UI while JS is running, meaning that any form of buffering/off-screen-rendering/buffer-swapping OpenRCT2 and SDL do could be removed!

this is playable: 1400x900 canvas at 31fps 😅
Chrome profiler: update/render a ~20 megapixel frame; putImageData plays a relatively smaller role at lower resolutions

Conclusion

Let’s start with a completely different aspect.

Business Potential. RCT is still an amazing game with lots of potential, as recent releases show. I think the developers/publishers should consider the browser as a platform:

  • play within seconds, no installation required — the barrier for people to try out a game does not get lower than that! Even if they then buy the Steam/mobile game, they might not have done so without the effortless way to try out the game.
  • great F2P potential, e.g. limited set of parks/attractions/etc. for free, anything beyond that requires registration/payment.
  • no impact/interference with other releases like, say, the mobile game: the browser implementation clearly doesn’t run smoothly on handheld devices (and the UI isn’t optimized for it) — but one can redirect people to the respective app store. The website can guide everyone to the right place.

Technical Awesomeness. Emscripten is crazy powerful. It brings together two worlds that IMO couldn’t be more different: On one side the bare metal world of C/C++, where everything is statically compiled to platform-dependent low-level bits. On the other side the dynamic, high-level, platform-agnostic world of HTML and JavaScript that until some time ago could have been classified as a purely interpreted scripting language (well, and it’s still that same language, even if it got way smarter behind the scenes). With the performance of asm.js, Emscripten manages to leverage JavaScript of all languages as a “platform” to compile to. It felt almost bizarre to work on the bugfix for custom cursors: you’re translating data stored in some structs and binary buffers (sprite) into equivalent CSS. Wat. Two completely different worlds play together within a few lines of code.

Fun. Running in a browser comes with some nice implications:

zoom in more than the game allows by literally zooming (here: 250%); OpenRCT2 actually has a custom scale-factor setting living in the options dialog — but how ‘bout we just Ctrl+-/+ 😉 boy, that Simon Foster artwork is beautiful…
zoom out without getting aliasy; left: native in-game zoom, center: scale-factor setting set to 0.5, right: 50% browser zoom; while the in-game zoom drops details (animated text, grass textures), the scale-factor’s sampling introduces aliasing effects; the browser zoom essentially implements oversampling
if zooming out really just increases the logical resolution, that means we can render an entire map at once and at full quality: originally 6272x3183 pixels; surprisingly, this still rendered at 5fps on a Core i7 desktop PC; I created the picture using “Save image as…” on the canvas (after reenabling the context menu), then cutting away the menu and background using Paint.NET
same as above, but bigger map: originally 8064x4111 pixels, rendered at ~1fps
physical pixels this time, no zoom