Hello, GBA! Journey of making an emulator — part 1

Michel Heily
May 11, 2020

Yes, yet another Game Boy Advance emulator. Written in Rust!

My side project has reached a stage where it successfully emulates most games, and you can run it on desktop, on Android, and even in your browser. I believe now is a good time to b̶r̶a̶g̶ ramble about the experience, share some thoughts and insights, discuss internals, debugging techniques, etc. However, this isn’t a “complete guide on how to write an emulator from A-Z” kind of thing.

For those interested in making their own emulator, prepare for an incredibly rewarding journey, but also for a never-ending, vicious cycle of writing buggy code, and then spending a stupendous amount of time fixing it. Of course, all software development in its entirety is like that, but in this particular instance — the fun part is hunting these damn bugs!

In the later stages, you will most likely find that rather than coding, you’ll spend more time debugging and reverse engineering games that fail to boot, freeze, have display glitches, or are otherwise broken in any way you can possibly imagine.

The best thing about it is that in the end, you get to play games on your very own emulator!

RustBoyAdvance runs on your browser!

A rusty old tale

This section details my motivations for doing this project; you may skip ahead if you’re more into the bits.

Emulators have always fascinated me and I remember using them for countless play hours in my youth. The Game Boy Advance is my favorite game console to date, yet as a kid, Santa kept denying me the pleasure of ever owning one. That beardy bastard.

Back in 2016, emulator development was all the rage. I needed a new side project and also wanted to learn Rust. It offered some promising features, such as a robust type system, an LLVM backend, package management, easy cross-compilation, bindings for WASM, and whatnot. Thus began my adventure of learning Rust and my first attempt at a Game Boy Advance (GBA) emulator.

Coming from a background of low-level C/C++ kernel code and embedded development, I struggled with the steep learning curve of Rust and its idioms. The strict ownership system, while making sense, proved troublesome with the tangled architecture of the GBA, which didn’t want to comply with the Rust rules. Simply put, I was trying to write C code in Rust and it didn’t work out well.

Eventually, I lost interest and decided to drop it, not even getting to finish that notorious CPU. The unfinished project joined my GitHub graveyard and remained nothing but a wish that maybe someday I’d find the resolve to make something of it.

Well, as “luck” may have it, last year I had an accident and ended up with 2 broken arms. (Yep, both of my arms). During my long recovery, I found myself in a place where I really needed to feel productive again. (Also my physiotherapist recommended getting back to using a keyboard ASAP).

That’s when I decided to un-learn and re-learn everything about Rust from zero. I participated in some small open-source endeavors and grew to like the language. I also decided to pull out my d̶u̶s̶t̶y̶ rusty old repo and this time, do it the Rust way. I was gonna beat that ARM CPU (pun intended) once and for all! (Had I known back then that the CPU would be the least of my concerns...)

Emulators 101

Before we dive into the meaty details of the GBA, let us ask ourselves: what does emulation even mean, anyway? Some may think of emulators as programs that take game ROMs from old consoles and spawn a window allowing you to play them. While this is not far from the truth, I still want to refine that notion.

In the context of emulating game consoles, an emulator often is a program that models the hardware components of the target system (to some degree of accuracy) to get software made for it (games) to run on the host platform (your computer).

But why do we need emulators? Well, game console emulators help preserve long-forgotten games that are only available for hardware that you can’t easily acquire nowadays. Emulators also often provide enhanced features that are not available on real hardware and that benefit gamers and hackers alike, such as quick save-states, time rewind, debugging and patching (useful for ROM hacking), and more.

Hello, GBA!

Now that we have a solid understanding of emulators, let’s get ourselves familiarized with the GBA hardware architecture.

Warning: We’ll have to get technical from now on (but hey, that’s what you came here for, am I right?). I advise referring to GBATEK by Martin Korth, the go-to document for the GBA internals.

In short, the Game Boy Advance is a handheld game console by Nintendo, released in 2001, powered by the 32-bit ARM7TDMI chip clocked at 16.78 MHz.

Accompanying the main CPU:

  • Video: 240x160 TFT display and a proprietary 2D graphics engine often referred to as PPU (Picture Processing Unit)
  • Audio: 2 PCM channels used to play back wave samples and 4 analog wave generator channels compatible with the Game Boy Color.
  • Input: 10 buttons
  • 4 Timers
  • 4 DMA (Direct Memory Access) channels
  • Serial IO
  • Z80-like secondary CPU for backward compatibility with Game Boy Color games.
  • Games are stored on cartridges called GamePaks and may contain additional hardware such as Flash memory, EEPROM, Real-time clock, gyroscope, solar sensor, vibration, etc.

What do all these components have in common, you ask?
None of them exist in your computer. They are all specialized hardware available in the GBA which the software relies on. So most of these components have to be emulated (to some extent) for games to work.

The beautiful thing is that writing an emulator is an incremental process. You don’t have to implement everything all at once for games to work! You can start with a basic CPU that only supports branch instructions and some ALU (MOV, ADD, SUB), write some assembly test cases for it, and improve upon it and add more instructions slowly and carefully.

After you have a solid CPU emulation and you want to sprinkle some graphics on the whole thing, you will find out that you can get by at first by only implementing a small subset of the PPU functionality. Sound is not even mandatory for games to boot. You get the idea.

The main CPU emulation

The CPU is the heart of the system. It is the main component that connects all pieces and is often the first milestone for emulator developers. All devices on the system share a bus that the CPU uses to interconnect with them. I recommend referring to the GBA Memory map.
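To make the idea of a shared bus a bit more concrete, here is a minimal sketch of how an emulator might decide which device an address belongs to. The `Region` enum and `region_of` helper are hypothetical names chosen for illustration; the address ranges follow the GBA memory map, with mirroring deliberately ignored.

```rust
/// Memory regions of the GBA address space (simplified; mirrors are ignored).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Region {
    Bios,        // 0x0000_0000, 16 KB internal ROM
    Ewram,       // 0x0200_0000, on-board work RAM
    Iwram,       // 0x0300_0000, on-chip work RAM
    Io,          // 0x0400_0000, memory-mapped I/O registers
    Palette,     // 0x0500_0000, palette RAM
    Vram,        // 0x0600_0000, video RAM
    Oam,         // 0x0700_0000, object attribute memory
    GamePakRom,  // 0x0800_0000..=0x0DFF_FFFF
    GamePakSram, // 0x0E00_0000
    Unmapped,
}

/// Hypothetical helper: pick the region from the top byte of the address.
pub fn region_of(addr: u32) -> Region {
    match addr >> 24 {
        0x00 if addr < 0x4000 => Region::Bios,
        0x02 => Region::Ewram,
        0x03 => Region::Iwram,
        0x04 => Region::Io,
        0x05 => Region::Palette,
        0x06 => Region::Vram,
        0x07 => Region::Oam,
        0x08..=0x0D => Region::GamePakRom,
        0x0E => Region::GamePakSram,
        _ => Region::Unmapped,
    }
}
```

A real bus would then forward the read or write to the matching device instead of just returning a tag.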

As mentioned above, the main CPU of the GBA is the ARM7TDMI. It is based on the ARMv4T architecture, has instructions with many quirks and edge cases, and also supports an additional instruction set called Thumb. People often shy away from the GBA for their emulator projects because of how complex this CPU is compared to other platforms such as the NES.

Let’s explore how a CPU is emulated. At a high level, a software model of the CPU will often consist of a struct/class that contains the CPU state (its registers, program counter, flag bits, etc). The system memory modules will often be simple byte arrays.

One popular approach is to implement the CPU as a machine-code interpreter. While for old platforms (GBA included) this is sufficient, it would be too slow on modern ones that have faster chips such as the 3DS or the Nintendo Switch. For these, you will often see JIT emulators.

Basic ARM interpreter example

Let’s explore a basic ARM interpreter.

Keep in mind that this is just a clean example, far from the spaghetti code I use in my own project.

Hypothetical CPU model
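A minimal sketch of what such a model could look like. All names here (`Cpu`, `Memory`, `REG_PC`, and so on) are hypothetical and are not taken from RustBoyAdvance. A real ARM7TDMI model would also need banked registers per CPU mode and saved status registers, and memory is assumed to be one flat byte array.

```rust
pub const REG_SP: usize = 13;
pub const REG_LR: usize = 14;
pub const REG_PC: usize = 15;

/// Hypothetical, heavily simplified CPU state.
#[derive(Debug, Default)]
pub struct Cpu {
    /// General-purpose registers r0..r15 (r13 = SP, r14 = LR, r15 = PC).
    pub regs: [u32; 16],
    /// Current Program Status Register: condition flags, Thumb bit, mode bits.
    pub cpsr: u32,
}

impl Cpu {
    /// Thumb state is indicated by the T bit (bit 5) of the CPSR.
    pub fn is_thumb(&self) -> bool {
        (self.cpsr & (1 << 5)) != 0
    }
}

/// System memory modelled as a simple byte array (an assumption of this sketch).
pub struct Memory {
    bytes: Vec<u8>,
}

impl Memory {
    pub fn new(size: usize) -> Self {
        Memory { bytes: vec![0; size] }
    }

    /// Read a little-endian 32-bit word; panics if out of range.
    pub fn read_32(&self, addr: u32) -> u32 {
        let a = addr as usize;
        u32::from_le_bytes([
            self.bytes[a],
            self.bytes[a + 1],
            self.bytes[a + 2],
            self.bytes[a + 3],
        ])
    }

    /// Write a little-endian 32-bit word; panics if out of range.
    pub fn write_32(&mut self, addr: u32, value: u32) {
        let a = addr as usize;
        self.bytes[a..a + 4].copy_from_slice(&value.to_le_bytes());
    }
}
```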

Every interpreter will have a main loop for the CPU that reads an instruction from memory, decodes it, executes it, and increments the program counter if needed.

CPU main interpreter loop
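As a rough sketch of that fetch, decode, execute cycle, under a few loud assumptions: the `Cpu` and `StepError` types are hypothetical and repeated here so the snippet stands alone, memory is a plain byte slice, condition codes and the ARM pipeline offset are ignored, and the only instruction understood is MOV with an immediate operand.

```rust
const PC: usize = 15;

#[derive(Debug, PartialEq)]
pub enum StepError {
    /// The program counter points outside of memory.
    OutOfBounds(u32),
    /// The opcode is not handled by this sketch.
    Unimplemented(u32),
}

pub struct Cpu {
    pub regs: [u32; 16],
}

impl Cpu {
    pub fn new() -> Self {
        Cpu { regs: [0; 16] }
    }

    /// Fetch the 32-bit little-endian opcode at the current PC.
    fn fetch(&self, mem: &[u8]) -> Result<u32, StepError> {
        let pc = self.regs[PC];
        let start = pc as usize;
        let word = mem
            .get(start..start + 4)
            .ok_or(StepError::OutOfBounds(pc))?;
        Ok(u32::from_le_bytes([word[0], word[1], word[2], word[3]]))
    }

    /// Decode and execute one opcode. Returns true if PC was written.
    fn execute(&mut self, opcode: u32) -> Result<bool, StepError> {
        // MOV Rd, #imm: bits 27..21 are 0b0011101.
        if (opcode & 0x0FE0_0000) == 0x03A0_0000 {
            let rd = ((opcode >> 12) & 0xF) as usize;
            // The 8-bit immediate is rotated right by twice the 4-bit rotate field.
            let rotate = ((opcode >> 8) & 0xF) * 2;
            self.regs[rd] = (opcode & 0xFF).rotate_right(rotate);
            return Ok(rd == PC);
        }
        Err(StepError::Unimplemented(opcode))
    }

    /// One iteration of the interpreter loop.
    pub fn step(&mut self, mem: &[u8]) -> Result<(), StepError> {
        let opcode = self.fetch(mem)?;
        let branched = self.execute(opcode)?;
        if !branched {
            // Only advance when the instruction did not change PC itself.
            self.regs[PC] = self.regs[PC].wrapping_add(4);
        }
        Ok(())
    }
}
```

The emulator's outer loop would simply call `step` repeatedly, interleaved with stepping the other hardware components.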

Now, let’s see how we can go about implementing the branch instruction!

Summary for B/BL instructions from GBATek

If you don’t know any ARM assembly and are interested in picking it up, this guide by azeria-labs is a good place to start.

Bits 28–31 are meant for conditional execution; we will ignore them for this demonstration. We see that bits 25–27 must be the constant 0b101. We can use that to decide if an instruction is a branch! Bit 24 is the link flag. If it is set, then it’s a BL (Branch with Link) instruction, which means that the address of the next instruction is saved to the LR register, useful when we want to call subroutines. Bits 0–23 make up the signed offset (in multiples of 4) from the current PC that we intend to jump to.

Putting it together:

Implementation of the ARM branch instruction.
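A minimal sketch of the B/BL logic described above. The function names are hypothetical, condition codes are ignored, and it assumes `regs[PC]` holds the address of the instruction being executed. On the ARM7TDMI the pipeline makes the PC read as that address plus 8, so the sketch adds 8 explicitly.

```rust
const LR: usize = 14;
const PC: usize = 15;

/// Bits 25..27 of a branch instruction are always 0b101.
pub fn is_branch(opcode: u32) -> bool {
    ((opcode >> 25) & 0b111) == 0b101
}

/// Execute B or BL on a bare register file.
pub fn exec_branch(regs: &mut [u32; 16], opcode: u32) {
    let link = (opcode & (1 << 24)) != 0;

    // Move the 24-bit offset to the top of the word, then shift back
    // arithmetically by 6: this sign-extends it and multiplies it by 4.
    let offset = (((opcode & 0x00FF_FFFF) << 8) as i32) >> 6;

    let current = regs[PC];
    if link {
        // BL: remember where to return to (the next instruction).
        regs[LR] = current.wrapping_add(4);
    }
    // The offset is relative to the pipelined PC (current + 8).
    regs[PC] = current.wrapping_add(8).wrapping_add(offset as u32);
}
```

For example, the opcode 0xEAFFFFFE encodes an offset of -8, so it branches to itself: the classic infinite loop.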

Great, now we only need to do this for every other ARM instruction format... and don’t get me started about Thumb. I dare say that implementing the CPU is an awfully time-consuming task.

ME vs ARM7TDMI

Bare Metal

As I said in the beginning, a tremendous amount of time is spent on reverse engineering GBA games. If you are interested in either hacking ROMs or emulator development, it’s always good practice to get an idea of what kind of software you are dealing with, and what interfacing with the hardware looks like from a software point of view.

GBA programs are what people call “bare metal” programs. They are written specifically for the GBA hardware and do not rely on any operating system. Actually, there isn’t any on the GBA. The ARM7TDMI has no MMU, so no virtual memory. Programs access physical memory directly and talk to the hardware via Memory Mapped IO (MMIO).

Even though the GBA has no operating system, it does contain a small 16KB firmware in its internal ROM, which we call the BIOS. The BIOS is the first code to run when the ARM7TDMI boots. It initializes the stack, clears the RAM, initializes hardware and displays the famous boot animation. It then jumps to the GamePak if one is inserted.

Other than that, the BIOS also provides system calls for GBA programs via software interrupts, so most emulators require the user to have it as well as their game ROM.

With the BIOS belonging to Nintendo, we are not allowed to share it, but you can dump it yourself. Even though the GBA protects the BIOS and won’t allow code running outside of it to simply read its contents, there are several attacks that allow leaking its contents.

Let there be pixels

Let’s take a look at a simple GBA program:

sample program

Lines 3–8 configure the PPU. The address 0x04000000 is mapped to a 16-bit PPU internal register called DISPCNT (Display Control). Bit 10 enables background #2, and bits 0–2 select the display mode, here mode 3.
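From the emulator's side, a write to DISPCNT has to be picked apart into those fields. A minimal sketch, where `DisplayControl` and `decode_dispcnt` are hypothetical names and only the mode bits and the four background-enable bits (bits 8 to 11) are decoded:

```rust
/// A small subset of the DISPCNT fields.
#[derive(Debug, PartialEq)]
pub struct DisplayControl {
    /// Display mode, bits 0..2.
    pub mode: u8,
    /// Enable flags for backgrounds 0..3, bits 8..11.
    pub bg_enabled: [bool; 4],
}

/// Decode a raw 16-bit DISPCNT value.
pub fn decode_dispcnt(value: u16) -> DisplayControl {
    let mut bg_enabled = [false; 4];
    for (i, enabled) in bg_enabled.iter_mut().enumerate() {
        *enabled = (value & (1 << (8 + i))) != 0;
    }
    DisplayControl {
        mode: (value & 0b111) as u8,
        bg_enabled,
    }
}
```

The value 0x0403 (bit 10 set, mode bits equal to 3) decodes to mode 3 with only background 2 enabled.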

The GBA display modes can be divided into 2 categories. Modes 3–5 are the bitmap modes and modes 0–2 are tile modes. Tile modes won’t be covered in this post, but keep in mind that they’re more commonly used by games since they are a lot faster, although harder to design and program with.

Mode 3 is the easiest bitmap mode: the PPU treats VRAM (video memory, mapped to address 0x06000000) as a contiguous frame buffer with 16-bit colors. You will probably not see it used in any real game because it is very inefficient. To keep things simple, assume the CPU can only update VRAM in a special period called VBLANK. Updating the entire frame buffer pixel by pixel consumes many CPU cycles, leaving very little room to perform game logic in VBLANK.

To write a pixel at an arbitrary coordinate (x, y) in Mode 3, we would do:
((uint16_t*)0x06000000)[x + 240 * y] = color

The loop at line 16 then draws a red line in the middle of the screen. 0x1f is the RGB555 representation of the color red, and 240 is the display width.
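On the emulator side, rendering Mode 3 boils down to walking VRAM and converting each 16-bit color for the host display. A minimal sketch, assuming VRAM is a plain byte slice and the host wants 8-bit RGB triples; all function names are hypothetical:

```rust
pub const WIDTH: usize = 240;
pub const HEIGHT: usize = 160;

/// What the game's store to VRAM amounts to: a 16-bit little-endian write.
pub fn write_pixel(vram: &mut [u8], x: usize, y: usize, color: u16) {
    let offset = 2 * (x + WIDTH * y);
    vram[offset..offset + 2].copy_from_slice(&color.to_le_bytes());
}

/// Convert a GBA color (red in bits 0..4, green in 5..9, blue in 10..14)
/// to 8 bits per channel.
pub fn rgb555_to_rgb888(color: u16) -> (u8, u8, u8) {
    let expand = |c: u16| -> u8 {
        let c = (c & 0x1F) as u8;
        // Replicate the top bits so that 0x1F maps to 0xFF.
        (c << 3) | (c >> 2)
    };
    (expand(color), expand(color >> 5), expand(color >> 10))
}

/// Produce one full frame from the Mode 3 frame buffer.
pub fn render_mode3(vram: &[u8]) -> Vec<(u8, u8, u8)> {
    vram[..WIDTH * HEIGHT * 2]
        .chunks_exact(2)
        .map(|px| rgb555_to_rgb888(u16::from_le_bytes([px[0], px[1]])))
        .collect()
}
```

Writing 0x001F to every pixel of row 80 and rendering the frame yields a pure red horizontal line on a black background.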

Running this program, we see the following:

sample.gba
Hello Mode 3

Testing Testing Testing

Testing is always important, but people often overlook it in their private hobby projects. In the last section of this article, you will get a glimpse of what might happen if you rush without paying proper attention to testing.

Anyway, there are various test ROMs out there that I used to test my emulator.

To test the ARM7TDMI processor I used gba-suite by Julian Smolka. These are short ARM snippets that test many CPU edge cases and jump to an infinite loop when a test fails, with the failing test number in a CPU register. Very handy tests indeed; I even integrated them into my CI using Rust’s built-in testing infrastructure. I also used arm-wrestler, a bitmap-mode test ROM that only requires a minimal PPU implementation to work. It took me a great while until I was finally passing arm-wrestler.

To test the PPU, sprites, buttons, DMA, etc., I often used the demos from Tonc. This site is an awesome guide for GBA homebrew development. I reverse-engineered these examples and tested my emulator against them whenever I implemented a new feature. Tonc also explains in detail some of the internals of the GBA hardware, so it is a priceless source of information as well.

All-green sir!

Pardon my undefined behavior

Games often have bugs that, by some miracle, cause nothing bad when running on a real GBA (and are thus potentially overlooked during development). If an emulator is not accurate enough, these previously harmless little bugs might not reproduce correctly and crash the emulator.

What does it mean to be accurate, though? Ask yourself what happens when a program tries to read from an unmapped memory address. On a common machine, you would be right to guess that an access violation exception might occur. On the GBA, things are different. There is no MMU and no exception will be raised. The game would just read garbage data that happened to linger on the bus from a previous operation.

This phenomenon is dubbed “open bus”. If it is not accounted for, a buggy game that works on hardware might read “different” garbage data that just might happen to break it on the emulator. (Yes, “The Legend of Zelda: The Minish Cap”, I’m looking at you!)
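A very rough sketch of the idea: the bus remembers the last value that travelled over it and hands it back for unmapped reads. The `Bus` type is hypothetical and only models EWRAM; real GBA open-bus behavior is more subtle than this (GBATEK describes the details), so treat it as an illustration rather than an accurate model.

```rust
/// Hypothetical bus with a single mapped region (256 KB of EWRAM).
pub struct Bus {
    ewram: Vec<u8>,
    /// Last value seen on the bus, returned for unmapped reads.
    last_value: u32,
}

impl Bus {
    pub fn new() -> Self {
        Bus { ewram: vec![0; 0x4_0000], last_value: 0 }
    }

    /// Word-aligned offset into EWRAM (the region is mirrored every 256 KB).
    fn ewram_offset(addr: u32) -> usize {
        (addr as usize & 0x3_FFFF) & !3
    }

    pub fn write_32(&mut self, addr: u32, value: u32) {
        if addr >> 24 == 0x02 {
            let off = Self::ewram_offset(addr);
            self.ewram[off..off + 4].copy_from_slice(&value.to_le_bytes());
        }
        // Writes to unmapped addresses are silently dropped in this sketch.
    }

    pub fn read_32(&mut self, addr: u32) -> u32 {
        let value = match addr >> 24 {
            0x02 => {
                let off = Self::ewram_offset(addr);
                let b = &self.ewram[off..off + 4];
                u32::from_le_bytes([b[0], b[1], b[2], b[3]])
            }
            // Open bus: no device answers, so the stale value is read back.
            _ => self.last_value,
        };
        self.last_value = value;
        value
    }
}
```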

When I was debugging why Mega Man Battle Network 6 froze whenever I entered the “email” menu, I discovered that the game has a bug. Somehow the game happens to receive a NULL pointer when it displays the mail records and tries to dereference it later.

NULL (0x0) is pointing to the BIOS region, which as I mentioned earlier is protected so games can’t read its real contents. What would be returned instead is the last fetched BIOS opcode, and it just so happens that this value doesn’t cause any crash and the game continues as if nothing happened. I did not implement the BIOS protection in my emulator so when the game tries to access the BIOS region, it actually succeeds in reading what's in there. The value returned in this case happens to make the game freeze.
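The protection described above can be sketched as follows. `Bios`, `fetch_opcode` and `read_32` are hypothetical names, and the rule is simplified to "data reads succeed only while the PC is inside the BIOS; otherwise the last opcode fetched from the BIOS is returned".

```rust
pub const BIOS_SIZE: u32 = 0x4000; // 16 KB

/// Hypothetical BIOS region with read protection.
pub struct Bios {
    rom: Vec<u8>,
    /// Last opcode the CPU fetched while executing inside the BIOS.
    last_opcode: u32,
}

impl Bios {
    pub fn new(rom: Vec<u8>) -> Self {
        assert_eq!(rom.len(), BIOS_SIZE as usize);
        Bios { rom, last_opcode: 0 }
    }

    /// Read the word-aligned little-endian word at `addr`.
    fn word_at(&self, addr: u32) -> u32 {
        let off = (addr & (BIOS_SIZE - 1) & !3) as usize;
        let b = &self.rom[off..off + 4];
        u32::from_le_bytes([b[0], b[1], b[2], b[3]])
    }

    /// Instruction fetch by the CPU while it runs BIOS code.
    pub fn fetch_opcode(&mut self, addr: u32) -> u32 {
        let opcode = self.word_at(addr);
        self.last_opcode = opcode;
        opcode
    }

    /// Data read; `pc` is the address of the code performing the read.
    pub fn read_32(&self, addr: u32, pc: u32) -> u32 {
        if pc < BIOS_SIZE {
            self.word_at(addr)
        } else {
            // Code outside the BIOS only sees the last fetched BIOS opcode.
            self.last_opcode
        }
    }
}
```

With this in place, a game dereferencing a NULL pointer reads the stale opcode rather than the real contents at address 0.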

Luckily, GBATEK documents some of these unpredictable things. The main takeaway is that when emulating an embedded system, never forget that software can be very sensitive to bugs, and what is considered “undefined behavior” often must become “well-defined behavior”; otherwise games may break.

The rotating Metroid — a bug story

This is a bug I found when I was first implementing affine sprites. Wait, affine sprites? Well, in short, sprites (sometimes called objects) are tiled bitmaps that can be displayed on top of all display modes. These are often used for game characters and such. Affine sprites are sprites that the PPU can be configured to rotate and scale.

I used the obj_aff.gba demo from Tonc to test my implementation. In this demo, a sprite of a Metroid can be rotated and scaled with the GBA buttons.

This is the expected result:

The expected result of obj_aff.gba! Rotation and scaling transformations

For me, the scaling transformation worked, but this is what I got when I tried to do rotation:

Bug when trying to rotate the sprite!

What might be the bug? Is this a bug in the rendering code? Is this a bug in the CPU code path that writes the affine transformation parameters to the PPU? How can we tell?

Once I made sure that my math for affine transformation checks out and ruled out a rendering bug, I knew well that I was looking at a CPU bug. (Well, at that point, I forgot I was still failing some opcodes in arm-wrestler so duh)

Since the rotation is the only buggy part, let's look at the obj_aff_rotate function from libtonc:

This function calculates the affine transformation parameters and writes them to the OAM (which stands for Object Attribute Memory). The PPU later uses them to render the transformed sprite.
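For readers without the libtonc source at hand, here is a Rust sketch of roughly what such a rotation function computes. It is not the libtonc code: it uses floating-point sine and cosine instead of a lookup table, and it assumes angles where 0x10000 is a full circle and parameters stored as 8.8 fixed-point values. The names `ObjAffine` and `obj_aff_rotate` merely mirror the C identifiers.

```rust
use std::f64::consts::PI;

/// The four entries of the 2x2 affine matrix, in 8.8 fixed point.
#[derive(Debug, PartialEq)]
pub struct ObjAffine {
    pub pa: i16,
    pub pb: i16,
    pub pc: i16,
    pub pd: i16,
}

/// Build a rotation matrix for `alpha`, where 0x10000 is a full circle.
pub fn obj_aff_rotate(alpha: u16) -> ObjAffine {
    let radians = f64::from(alpha) / 65536.0 * 2.0 * PI;
    // Scale by 256 to get 8.8 fixed point.
    let ss = (radians.sin() * 256.0).round() as i16;
    let cc = (radians.cos() * 256.0).round() as i16;
    // Note the negated sine: this negation is what compiles to a NEG.
    ObjAffine { pa: cc, pb: -ss, pc: ss, pd: cc }
}
```

An angle of 0 gives the identity matrix (256, 0, 0, 256), and a quarter turn (0x4000) gives (0, -256, 256, 0).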

I re-compiled the obj_aff.gba demo with debug information so I could easily locate this function in the built binary and put a breakpoint on it in my emulator.

This is the compiled version of this function:

The compiled version of obj_aff_rotate

I immediately noticed the highlighted rsb r1,r2 (NEG) instruction because I remembered it was one of the instructions that didn’t pass the arm-wrestler tests. Stepping through this function in my emulator alongside the no$gba debugger confirmed my suspicion. I did not implement the Thumb NEG instruction properly!

The mistake was pretty stupid, but I am partially to blame. The Thumb neg Rd, Rs opcode is translated into the ARM-mode rsb Rd, Rs, #0, but I implemented it as rsb Rd, Rs, Rd. Why did I do it?

From the ARM7TDMI thumb datasheet

NEG is the only opcode in thumb format 4 that uses an immediate value (#0) as its second operand. I hadn’t noticed it when I was in a hurry to quickly write the implementation for all these opcodes, so there you have it. The fix was trivial and it got the rotation to work.
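The difference between the two versions is easy to see in a tiny sketch. The helper names are hypothetical, the register file only holds the eight low registers that Thumb ALU operations can address, and flag updates are left out for brevity.

```rust
/// Reverse subtract: result = op2 - rn (wrapping, like the hardware).
fn rsb(rn: u32, op2: u32) -> u32 {
    op2.wrapping_sub(rn)
}

/// Correct Thumb NEG Rd, Rs: equivalent to RSB Rd, Rs, #0.
pub fn thumb_neg(regs: &mut [u32; 8], rd: usize, rs: usize) {
    regs[rd] = rsb(regs[rs], 0);
}

/// The buggy version: RSB Rd, Rs, Rd, which computes Rd - Rs instead.
pub fn thumb_neg_buggy(regs: &mut [u32; 8], rd: usize, rs: usize) {
    regs[rd] = rsb(regs[rs], regs[rd]);
}
```

With r0 = 100 and r1 = 5, the correct `neg r0, r1` leaves -5 (0xFFFFFFFB) in r0, while the buggy version leaves 95. The bug only stays hidden when Rd happens to be zero beforehand.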

Moral of this story? Never neglect failed opcode tests :)

This post has been a long one, but I enjoyed writing it and I do aim for this to become an ongoing series of articles.

I hope these examples showed you the interesting kind of bugs you may deal with while working on an emulator. Sadly, getting these test ROMs to pass does not guarantee that your emulation is 100% correct.

I’ve dealt with other critical bugs that evaded these tests and haunted me long after already booting many games just fine. Some bugs were related to the inaccurate implementation of the peripherals, and some were missing edge cases in my ARM7TDMI interpreter that none of the aforementioned test ROMs covered. I will cover these bug stories in the next post.

If you are interested in these sorts of things, I also recommend hopping into the EmuDev Discord.
