[What I Learned About] Python and Emulators [by] Making a Chip-8 Emulator

Toby Hughes
May 2, 2016

This is my new series, “What I Learned About X By Y”. As I am currently a student, virtually every project I work on will be a pretty large learning experience for me. Therefore, I think it will be beneficial to document the different things I learn in my projects, both to help other programmers learn about new things and to help myself review the concepts I’ve learned. Without further ado, I’ll get on with the article.

CHIP-8 PONG

What is CHIP-8?

Unless you were already involved with the emulation community or enjoyed playing video games on your HP-48 graphing calculator in the ’90s, right now is probably the first time you have ever heard of this CHIP-8 thing. Why does it exist, and why would someone be interested in emulating it?

The story begins back in the 1970s, when CHIP-8 was created by Joseph Weisbecker. CHIP-8 is an interpreted programming language written to be run on a CHIP-8 virtual machine. These virtual machines were included on the COSMAC VIP and the Telmac 1800 microcomputers. The VIP itself shipped with 20 individual games. At the time, CHIP-8 became quite popular, and when a “VIPer” newsletter was created, the first three issues contained the machine code for the virtual machine.

And that would’ve been the story. However, almost 15 years after its creation, it was revived on the HP-48 graphing calculators. Andreas Gustafsson wrote a CHIP-8 interpreter for these calculators called CHIP-48, and Erik Bryntse then built an extended version on top of it called Super CHIP (or SCHIP). This new version had a higher screen resolution and several more opcodes (for those who don’t know what these are, they will be explained later). These efforts revived the language and are possibly the reason why it still exists in the emulation community.

Nowadays, CHIP-8 is mainly a learning tool. The HP-48 calculators were discontinued in 2003 and the virtual machine hasn’t been included in any recent, notable hardware. So why emulate it? Well, it is extremely simple as far as emulators go. After reading a recent progress report for the Dolphin Emulator, I decided I wanted to get in on the action. However, I knew that I couldn’t just jump into GameCube/Wii with no prior knowledge of emulation. So I searched around the internet and was met with the same answer everywhere: do CHIP-8. There are only 35 opcodes to emulate, each 16 bits long, and the registers and memory are pretty simple as well. In order to emulate the system, the only things somebody needs to know are the absolute basics of emulation, a programming language with a graphics library, and how to deal with binary.

If this story sounded interesting enough and you want to learn about all of this by yourself, I’d recommend this guide. They explain just enough of the process to get you started, but hold back enough information for you to figure things out yourself. And if you get too lost there is source code at the bottom. I will shamelessly admit that I peeked more than a couple times.

What I Learned About Emulation

Well, technically, everything. I went into the project only knowing what emulators do. I didn’t know the first thing about how an emulator was created. And that is what I wanted to know. I wanted to literally crack the code of how you get computers to run something as if it was some old, dead hardware. I’m not going to list every little concept that I’ve learned. If you want to learn everything, the best way to do so is to create your own emulator. The only thing it requires is programming knowledge and good google-fu. I’m just going to cover the interesting parts.

OPCODES

This was the first thing that I had to wrap my head around. Just what is an opcode? I saw it mentioned everywhere, but it felt like nobody was actually saying what it was. Even the guide I just recommended with such high praise talks a lot about how to implement opcodes without actually mentioning what they are. I’m going to spare you the quick google search that will lead you to a Wikipedia article (lifesaver, I know). Opcode is an abbreviation of operation code. The opcode is the part of a machine code instruction that defines the action to perform.

Let’s give a quick example from the CHIP-8 instruction set. One opcode is:

0x00E0 — Clear Screen

What does this mean? When 0x00E0 appears in a CHIP-8 program, the expected effect is for the screen to be cleared. So from an emulator’s point of view, you are going through the memory reading each individual opcode, and when you reach 0x00E0, you know you have to go to whatever you are using to display graphics and make the entire screen blank. In my code, I did this:

if self.opcode == 0x00E0:
    for i in range(len(self.graphics)):
        self.graphics[i] = 0
    self.draw_flag = True

So when I run across the opcode I change every pixel to 0 (since CHIP-8 draws by changing pixels) and set a flag that tells the program that it is going to draw again.
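To make “going through the memory reading each individual opcode” a bit more concrete, here is a minimal sketch of the fetch step. It assumes memory is a list of byte values and that a program counter (called pc here purely for illustration) holds the address of the next instruction. Each CHIP-8 opcode is two bytes, stored high byte first, and programs conventionally start at address 0x200. The helper name fetch_opcode is hypothetical.

```python
def fetch_opcode(memory, pc):
    """Combine the two bytes at pc and pc + 1 into one 16-bit opcode."""
    high = memory[pc]
    low = memory[pc + 1]
    return (high << 8) | low


# A tiny fake memory holding 00E0 (clear screen) at address 0x200.
memory = [0] * 4096
memory[0x200] = 0x00
memory[0x201] = 0xE0

opcode = fetch_opcode(memory, 0x200)
is_clear_screen = opcode == 0x00E0
```

After fetching, a real emulator would advance pc by 2 and then decide what to do with the opcode, as in the clear-screen check above.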

For those of you who know some assembly language, you are probably familiar with opcodes. Well, kind of. Here is an example of an x86 instruction:

MOV eax, ebx 
;Moves the contents of the EBX register into the EAX register for those curious

See that MOV there? That is what is known as a mnemonic opcode. Opcodes are part of the reason that assembly languages were created. It is not easy to remember every single machine language instruction. Imagine if every time you wanted to clear a screen you had to type something like 0x00E0. In the minuscule CHIP-8 instruction set alone, there are 35 opcodes. That itself is quite an undertaking. Imagine doing that on something with 100 opcodes! 200 opcodes! You just wouldn’t. That is why we now have assembly languages (and higher level languages). It is much easier to remember that MOV moves things into different registers than to remember some random hexadecimal number.

EMULATING REGISTERS AND MEMORY ISN’T AS SCARY AS IT SOUNDS

I went into this absolutely clueless on how these people were representing registers and memory. I could easily see what opcodes were doing. However, addressing and RAM are those scary words you learn about in your intro CS classes, learn how to use them, then forget any details about them because binary is still scary.

Well, after creating the emulator, it almost seems funny the fears that I had about all of this. Let’s learn about what these registers and memory actually are in regards to the CHIP-8 emulator.

The CHIP-8 has 16 general purpose registers named V0 through VF. Each register is 1 byte long. How can we store 16 indexed, 1-byte values in a programming language? That sounds exactly like an array of characters. In fact, in the guide listed above, that is exactly how they represent the registers. However, this emulator was written in Python, and representing byte-sized data in Python isn’t exactly the smoothest thing in the world. Python does, however, have the integer, which for now we can treat as a 4-byte data type. As long as our code watches out for overflow by keeping every value between 0 and 255, there is no reason we can’t use it, apart from the wasted space. So we would initialize the registers like this:

V = [0] * 16

What if we wanted to set the value of the VE register to 0x8A? Well, it would be as simple as this:

V[0xE] = 0x8A

We have 16 working registers. Simple as that.
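The overflow caveat mentioned above deserves a quick illustration. This is a minimal sketch, assuming we keep every register in the range 0 to 255 by masking with 0xFF after each addition. The helper name add_to_register is hypothetical and is not taken from the emulator’s code.

```python
V = [0] * 16  # sixteen registers, V0 through VF


def add_to_register(registers, index, value):
    """Add value to a register and wrap the result to 8 bits."""
    registers[index] = (registers[index] + value) & 0xFF


V[0xE] = 0xFA  # 250
add_to_register(V, 0xE, 10)
# 250 + 10 = 260, which does not fit in one byte; 260 & 0xFF leaves 4
```

The mask does by hand what a real 1-byte register does for free: anything above bit 7 simply falls off.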

Now you may be thinking, alright, registers are easy. But how in the world do I represent 4KB of memory? Let’s look at what memory actually is. All memory means, in this sense, is a collection of values that can be reached by an address. The fact that we can reach the memory with an address means that it is indexed. 4KB means that there are 4096 of these addresses (for the uninitiated, one KB here is 1024 bytes of memory, not 1000. Technically 1024 bytes is a kibibyte, but in reality, this is what people are generally referring to, especially when talking about binary addressing).

Wait a second. This sounds familiar. 4096 indexed, 1-byte values. Sounds like the same problem we ran into with our registers. In fact, it is the exact same problem. Registers are just a smaller, easily-accessible form of memory. So in order to make our memory, we just need a much larger collection of values. In Python, it looks like this:

memory = [0] * 4096

That’s it. You now have 4KB of addressable memory. Actually, since this is Python, you now have at least 16KB of memory if each integer takes 4 bytes (and in practice CPython’s overhead per element is larger than that). However, since that is still minuscule on our modern computers, it is much simpler to just take the memory-size hit than to pack four bytes into each integer, have 1024 elements in the array, and make the code significantly more complex.

So I hope this makes all of that memory and register stuff a lot less scary. It is pretty much exactly the same concept as the variables we’ve been coding with this entire time: a way to store values. Just sometimes with a bit more of those scary binary and hexadecimal numbers.
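If the wasted space bothers you, Python’s built-in bytearray is an alternative worth knowing about (it is not what the emulator described here uses). It stores exactly one byte per element and raises ValueError if you try to store anything outside 0 to 255. A minimal sketch, where load_program is a hypothetical helper and 0x200 is the conventional address at which CHIP-8 programs start:

```python
memory = bytearray(4096)  # 4096 one-byte cells, all zero


def load_program(memory, program, start=0x200):
    """Copy the program bytes into memory beginning at start."""
    memory[start:start + len(program)] = program


load_program(memory, bytes([0x00, 0xE0]))  # a one-instruction program

try:
    memory[0] = 256  # does not fit in a byte
    overflowed = False
except ValueError:
    overflowed = True
```

The trade-off is that a bytearray complains loudly about out-of-range values instead of wrapping them, so arithmetic results still need to be masked before they are stored.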

What I Learned About Python

I’ve been programming in Python since high school, so this project was significantly less enlightening about the Python Standard Library than it was about emulators. However, I did touch parts of the language that I never had to touch before, so I did learn about some new things. Specifically bitwise operations and time itself. Well, the time module in the standard library. But saying “time itself” sounds a lot cooler.

BITWISE OPERATIONS

Don’t get me wrong, this wasn’t the first time I had to deal with bitwise operations. And as a warning, if you don’t already have a decent grasp on binary and bitwise operations, you should spend some time with those before touching emulation. Emulation bleeds bitwise operations.

I already had a pretty decent grasp on the concept. I’m finishing up my machine language course this semester which is, to put it lightly, a bit of a crash course in non-decimal number systems. Seriously. For those who haven’t taken a machine language course, it will take you the closest you’ve ever been to the cheesy movie trope of the “hacker thinking in binary”.

What was new to me was how python implements these operations. And I must say, after a machine language class that has opcode mnemonics like ROR eax, 1 (seriously, even from a “mnemonic” perspective, that’s hard to remember), it was so refreshing to use python’s bitwise operations. Let me give you the rundown.

Binary AND? var1 & var2
Binary OR? var1 | var2
Binary XOR? var1 ^ var2
Complement? ~var1
Right shift? var1 >> var2
Left shift? var1 << var2

That’s it. It’s that simple. Machine Language students everywhere likely just shed a tear. However, it gets better. These can be combined into assignment operators.

var1 &= var2
var1 |= var2
var1 ^= var2

The two shifts have assignment operators as well: var1 >>= var2 and var1 <<= var2. Only the complement doesn’t, since it is a unary operator and takes a single operand.
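Here is where these operators earn their keep in an emulator. This is a minimal sketch of pulling the individual fields out of a 16-bit opcode with masks and shifts. The field names (x, y, n, nn, nnn) follow the convention used in common CHIP-8 references, and the decode function is a hypothetical helper rather than code from the emulator.

```python
def decode(opcode):
    """Split a 16-bit opcode into its commonly used fields."""
    return {
        "kind": (opcode & 0xF000) >> 12,  # top 4 bits select the instruction family
        "x": (opcode & 0x0F00) >> 8,      # second nibble, usually a register index
        "y": (opcode & 0x00F0) >> 4,      # third nibble, usually a register index
        "n": opcode & 0x000F,             # lowest 4 bits
        "nn": opcode & 0x00FF,            # lowest 8 bits
        "nnn": opcode & 0x0FFF,           # lowest 12 bits, usually an address
    }


fields = decode(0x6E8A)  # in the standard instruction set, 6XNN sets VX to NN
```

For 0x6E8A, x comes out as 0xE and nn as 0x8A, which is exactly the “set VE to 0x8A” example from the registers section.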

Fun fact, remember how I said earlier that Python uses 32-bit numbers? That’s not exactly true. Python’s numbers can reach a theoretically infinite number of bits. This means two things.

1. You can do this:

>>> 1 << 1000
10715086071862673209484250490600018105614048117055336074437503883703510511249361224931983788156958581275946729175531468251871452856923140435984577574698574803934567774824230985421074605062371141877954182153046474983581941267398767559165543946077062914571196477686542167660429831652624386837205668069376

You read that correctly. That is a 302-digit number. Not very useful, but it is good fun to left-shift higher and higher numbers and stress-test IDLE.

2. Because of the nature of negative binary numbers, negative Python numbers are seen by bitwise operations as being infinitely long. For the uninitiated, negative binary numbers are usually stored in two’s complement: to negate a number, you flip every bit and add one, which leaves the highest bit set to 1. So, as an 8-bit number, 3 would be represented as “00000011” and -3 as “11111101”. Well, when your binary numbers are infinite in length, there is no highest bit to set. So, in Python’s case, bitwise operations treat -3 as “…1111111111111111111101” with infinite leading ones.
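The second point is easy to check for yourself. In this small sketch, masking with 0xFF chops the infinite run of leading ones down to the 8 bits a CHIP-8 register would hold. The variable names are just for illustration.

```python
minus_three = -3

# Keep only the lowest 8 bits of the (conceptually infinite) two's complement form.
as_byte = minus_three & 0xFF
bits = format(as_byte, "08b")

# Subtraction in a register wraps the same way: 2 - 5 is -3, stored as 253.
wrapped = (2 - 5) & 0xFF
```

Here as_byte is 253 and bits is the string "11111101", matching the 8-bit representation of -3 described above.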

TIME

I’m not going to go through the entire python time module. Go read the docs, you lazy coder. I’m just going to cover the interesting thing that came up in this CHIP-8 emulator. That is, how you can cycle your program at a certain Hz. Sort of. I’m sure this implementation doesn’t exactly hold up to scientific standards, but it works for the sake of this emulator.

First, let’s remind ourselves what a hertz actually is. One hertz is defined as a frequency of one cycle per second. So if you were to swing a yo-yo around your head so that it made five loops in one second, it would be looping at 5Hz. If you suddenly gained superpowers and swung it around your head 5,000,000,000 times in one second, you would be swinging it at 5GHz.

The CHIP-8 virtual machine has two timers in it: the Delay Timer and the Sound Timer. Both count down at 60Hz. In other words, each timer is decremented 60 times per second. How would we represent this in code? We time it. This is a simple way to loop at 60Hz in Python. It requires the time() function from the time module. All this does is return a floating-point number that represents the current time as the number of seconds since the epoch. So, just a second ago, it returned 1462175614.460102. If I had called it exactly a second later, it would have returned 1462175615.460102. We can use this representation to track how much time has passed between two function calls. One cycle at 60Hz lasts 1/60th of a second, or 0.0166666… of a second. So the code would look something like this:

from time import time   #what a strange looking import
while True:
    start_time = time()
    some = function_call()
    some &= bitwise_math
    #wait until 1/60th of a second has passed
    while time() - start_time < 1.0/60:
        pass

However, a loop wasn’t what I needed in this case. Only the timers need to be decremented at that speed. So the implementation looked something like this:

from time import time    #still funny looking
last_time = time()
def cycle():
    global last_time     #remember the timestamp between calls
    #do random opcode stuff
    current_time = time()
    if current_time - last_time >= 1.0 / 60:
        #do timer stuff
        last_time = current_time

So this way, it doesn’t depend on the cycle speed, it just runs when 1/60th of a second has passed. It should be quite obvious why this isn’t exactly a scientific implementation. The other stuff might take more than 1/60th of a second to run. So if a cycle takes 1/50th of a second to get through, the timers are stuck being decremented at 50Hz.

I am not quite sure of the scientific way of doing this. I would love to get some comments explaining how it’s done. I’m going to take a shot in the dark and guess that it has something to do with a different thread running in the background that decrements the timers. However, for the purposes of this emulator, this loose approach works fine.
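One common single-threaded alternative (offered as a sketch, not as the definitive answer) is to measure how much time has actually elapsed and apply however many 1/60th-second ticks fit into it, carrying the remainder forward. The Timers class and its method names are hypothetical. The current time is passed in as an argument so the logic is easy to test; in a real loop you would pass in something like time.monotonic().

```python
TICK = 1.0 / 60  # length of one 60Hz tick in seconds


class Timers:
    def __init__(self, now):
        self.delay = 0
        self.sound = 0
        self.last_time = now

    def update(self, now):
        """Apply every whole tick that has elapsed since the last update."""
        ticks = int((now - self.last_time) / TICK)
        if ticks > 0:
            self.delay = max(0, self.delay - ticks)
            self.sound = max(0, self.sound - ticks)
            # Advance by whole ticks only, so leftover time is not lost.
            self.last_time += ticks * TICK


timers = Timers(now=0.0)
timers.delay = 60
timers.update(now=0.51)  # one very slow cycle that took about half a second
```

With this approach a slow cycle no longer drags the timers down with it: the half-second cycle above costs the delay timer 30 ticks at once rather than just one.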

Another thing that I left not knowing is how to emulate the actual cycle speed of the CHIP-8 virtual machine. I know how to make it cycle at a certain speed, but I couldn’t find the actual cycle speed anywhere in the different CHIP-8 documentation and guides online. If somebody knows, please leave a comment. I believe that for now, I’m going to make the cycle speed a command line argument that can be decided by the user. But as of now, the program is at the mercy of how fast your computer is.
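For what it’s worth, the command line argument could be sketched with the standard argparse module. The --hz flag name and the default of 500 are placeholders picked for illustration, not values from any CHIP-8 documentation.

```python
import argparse


def parse_args(argv=None):
    """Read the desired cycle speed from the command line."""
    parser = argparse.ArgumentParser(description="CHIP-8 emulator")
    parser.add_argument(
        "--hz",
        type=float,
        default=500.0,  # placeholder default, not an official CHIP-8 speed
        help="CPU cycles to run per second",
    )
    return parser.parse_args(argv)


args = parse_args(["--hz", "700"])
seconds_per_cycle = 1.0 / args.hz  # how long each cycle should take
```

The main loop could then wait until seconds_per_cycle has passed before starting the next cycle, using the same time() comparison as the 60Hz loop earlier.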

What Next?

I’m not quite done with this emulator thing. In fact, as I expressed at the beginning of this article, my journey has only begun. Eventually I want to be quite proficient at this emulator thing and contribute to projects like Dolphin. But for now, I’m going to continue to create my own emulators. Not 100% sure what my next project will be. Thinking of either emulating the Game Boy or the NES. Still looking into what those undertakings will consist of so I’ll hold off on the decision. I would love suggestions and more information about the two in the comments.

The End

And that’s it. This was, in fact, my first real article (blog post? Still not sure what you call these things on Medium) on this website. At almost 3,000 words, it actually turned out a bit longer than I expected. I would love some feedback on the article and the code itself, as well as comments on whether you learned anything new from this post. And if you liked what you read, please follow for more talk about emulators, Python, and whatever other weird things I decide to do.
