The anatomy of a computer (Part 3 of 4)

A brief dip into the processor

Jack Holland
Understanding computer science
6 min read · Mar 24, 2014


This is an ongoing series. Please check out the collection for the rest of the articles.

Metanote: apologies for taking so long to post this — expect more regular updates in the future!

Discussing computer architecture opens a pretty intimidating can of worms. How the processing and memory units actually compute and manage information is extremely complex and counterintuitive. I’m not going to fully open this can yet, but I will peel up the lid very gently so we can peek inside. Hopefully this will stop most of the worms from escaping.

Disappointing fact: worms don’t actually have eyes

The reason I’m bringing up this topic now, rather than postponing it until we’re fully prepared, is that a basic understanding of it is so useful. While the majority of computer scientists and programmers don’t need in-depth knowledge of computer architecture to do their work, a lot of things don’t make sense until you have a basic grasp of what’s going on in the actual hardware.

Additionally, the prevalence and necessity of parallel computing (using multiple processors to compute multiple instructions simultaneously) means that anyone interested in creating efficient programs needs at least a basic understanding of what’s going on under the hood. So, I’m giving you a crash course on the basics with the promise of more to come later.

To begin, it’s important to clarify that there are many different architectures and their differences are not trivial. With that said, we’re going to focus on one architecture in particular: MIPS. MIPS is often used when teaching architecture due to its simplified set of instructions.

The core of MIPS is its set of instructions: the kinds of computation it can perform. These instructions operate on registers, which are fixed-size sequences of bits (binary numbers) that represent values. That is, each value is stored as a binary number, and you can think of a register as an array of bits. Since the bits making up these binary numbers are physically hardwired into the processor, the number of bits per register must be fixed when the processor is created.

An 8-bit register storing the binary number 10110010

The number of bits per number is called the size of the register. Common register sizes are 32-bit and 64-bit. Processors that use x-bit registers are called x-bit processors; you may have heard the term “64-bit processor”, which describes the most common size of processors in use today. The size of the processor (i.e. the number of bits per register) is an important factor in the architecture: it limits how much memory can be addressed and how much information can be computed in one sweep.
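To make the idea concrete, here’s a minimal sketch in Python of a fixed-size register. The class and its methods are purely illustrative (real registers are hardware, not objects), but they show the key property: the width is fixed up front, and any bits beyond it simply don’t exist.

```python
class Register:
    """An illustrative model of a fixed-size hardware register."""

    def __init__(self, size):
        self.size = size  # number of bits, fixed at "creation time"
        self.bits = 0

    def store(self, value):
        # Only the lowest `size` bits fit; any higher bits are discarded,
        # just as a hardware register has nowhere to put them.
        self.bits = value & ((1 << self.size) - 1)

    def __str__(self):
        # Show the contents as a zero-padded binary number.
        return format(self.bits, f"0{self.size}b")


r = Register(8)
r.store(0b10110010)
print(r)  # 10110010

r.store(0b110110010)  # a 9-bit value: the top bit is lost
print(r)  # 10110010
```

Storing a 9-bit value in the 8-bit register silently drops the extra bit, which is exactly why the register size caps the range of representable values.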

Incidentally, MIPS processors have 32 integer registers and 32 floating point registers, regardless of the register size. Integers are whole numbers, like 0, 1, -1, 2, -2, etc. Floating point numbers, or floats, approximate real numbers using a fixed number of bits, like 0.0, 1.0, 1.1, -3.4444, etc.

Note that because the integer and floating point registers have fixed sizes, the values they can store have a fixed range. 32-bit integers range from about -2 billion to +2 billion, and 32-bit floats cover magnitudes from roughly 10^-38 to 10^38, spanning very small and very large numbers. The 64-bit ranges are exponentially larger: 64-bit integers range from about -9*10^18 to +9*10^18, and 64-bit floats (usually called doubles, since they contain double the bits) cover magnitudes from roughly 10^-308 to 10^308, an enormous range.
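You can check these limits yourself. Python’s built-in floats happen to be 64-bit IEEE doubles (the same format MIPS uses for its 64-bit floating point registers), so a short snippet confirms the ranges quoted above:

```python
import sys

# Two's-complement integer limits for 32-bit and 64-bit registers.
print(2**31 - 1)   # 2147483647 (~ +2 billion)
print(-(2**31))    # -2147483648 (~ -2 billion)
print(2**63 - 1)   # 9223372036854775807 (~ 9.2 * 10^18)

# 64-bit float (double) limits, as reported by the runtime.
print(sys.float_info.max)  # about 1.8 * 10^308
print(sys.float_info.min)  # about 2.2 * 10^-308 (smallest normal double)
```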

If you’re not familiar with what ^ means, it’s the exponent operator: x^2 = x*x and x^3 = x*x*x. In general, x^y means x multiplied by itself y times. Also, x^-y is 1/(x^y), so large values of x and/or y correspond to very small fractions. The point of the previous paragraph is not to memorize each register range but to get an intuition for the range of numbers available to the processor. 10^308 is a very large number; there are estimated to be only about 10^80 atoms in the universe.
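A quick check of the notation, using Python’s ** operator in place of ^:

```python
# x^y in the text is written x**y in Python.
print(2**3)   # 8, i.e. 2*2*2
print(3**4)   # 81, i.e. 3*3*3*3

# A negative exponent gives a fraction: x^-y = 1/(x^y).
print(2**-3)  # 0.125, i.e. 1/8

# 10^308 is near the top of the 64-bit double range.
print(10.0**308)
```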

Let me give you some examples of MIPS instructions. As in Cake, instructions receive input and produce output: they accept values (with registers serving as variables) and return a value (by storing it in a specified register). These instructions come in two forms, machine code and assembly code. An instruction in machine code is simply a sequence of bits: some bits represent which instruction to use, while others represent the values to use in the instruction. Most instructions can accept literal values (e.g. 10001011…) or register IDs from which to pull values (e.g. use whatever value is in register 14). Here is an example of some MIPS machine code:

Believe it or not, this binary sequence conveys meaningful information if you know what it represents

Note that in MIPS, instructions are always x bits long, where x is the size of the processor. The above instruction adds the values in registers 10 and 8 and stores the result in register 13. While an experienced assembly programmer can decipher these binary sequences and figure out what each one does, the process is laborious and error-prone. Humans just aren’t equipped to translate bits into meaning intuitively.

Instead, assembly programmers write in assembly code (or assembly language), which the assembler translates into machine code. Assembly language is designed to be as similar as possible to machine code while still being readable. (Well, “readable” may be stretching it, but you know what I mean). Here’s the MIPS assembly code that corresponds to the machine code above:

Register IDs are prefixed with dollar signs ($) and each value given to the add instruction is separated by a comma (,)

Again, this adds the values in registers 10 and 8 and stores the result in register 13. The assembler, which is a program itself, translates this instruction in a fairly straightforward way. To start, each instruction has an operation code (usually called an opcode) made of 6 bits; for add, the opcode is 000000. The opcode is the first part of every MIPS machine code instruction. The next part of the instruction depends on the instruction type, of which there are three:

  1. R: Register instructions, which operate on two registers and store the result in a third.
  2. I: Immediate instructions, which operate on a register and an immediate/literal value like 01101011…, and store the result in a second register.
  3. J: Jump instructions, which tell the processor to jump to a different instruction.
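As a sketch of how the three types differ, here are their bit-field layouts written out in Python. The field names (op, rs, rt, rd, shamt, funct, imm, addr) are the standard MIPS abbreviations; the key point is that every format fills exactly 32 bits:

```python
# (field name, width in bits) for each format, left to right.
R_FORMAT = [("op", 6), ("rs", 5), ("rt", 5), ("rd", 5), ("shamt", 5), ("funct", 6)]
I_FORMAT = [("op", 6), ("rs", 5), ("rt", 5), ("imm", 16)]
J_FORMAT = [("op", 6), ("addr", 26)]

for name, fields in [("R", R_FORMAT), ("I", I_FORMAT), ("J", J_FORMAT)]:
    total = sum(width for _, width in fields)
    print(name, total)  # each format totals exactly 32 bits
```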

add is an R-instruction since it operates on two registers and stores the result in a third. The next two parts of the machine code instruction are the IDs of the two registers the instruction should operate on, each taking 5 bits: register 10 is 01010 in binary and register 8 is 01000. After them comes the 5-bit ID of the register where the result should be stored; in the example above, that’s register 13, which is 01101 in binary.

The next part of the machine code is called the shift amount and isn’t relevant to our example, so we’ll fill it with zeroes: 00000. The final 6 bits, called the function code, are used in conjunction with the opcode to uniquely identify the instruction we want to use; add corresponds to 100000. The final machine code sequence looks like this:

I’ve inserted spaces between each “part” of the instruction so you can see them more easily

I think you’ll agree that assembly code is more readable!
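In fact, the assembler’s job for this one instruction can be sketched in a few lines of Python. The function below is my own illustrative helper, not a real assembler; it packs each field into its position in the 32-bit R-format word, in the standard MIPS order (opcode, two source registers, destination register, shift amount, function code):

```python
def encode_r(op, rs, rt, rd, shamt, funct):
    """Pack the six R-format fields (6+5+5+5+5+6 = 32 bits) into one word."""
    # Shift each field left to its position and OR them all together.
    return (op << 26) | (rs << 21) | (rt << 16) | (rd << 11) | (shamt << 6) | funct


# add $13, $10, $8 -> opcode 000000, sources 10 and 8,
# destination 13, shift amount 0, function code 100000.
word = encode_r(0b000000, 10, 8, 13, 0, 0b100000)
print(format(word, "032b"))  # 00000001010010000110100000100000
```

Reading the output in 6/5/5/5/5/6-bit chunks recovers exactly the parts discussed above: 000000 (opcode), 01010 (register 10), 01000 (register 8), 01101 (register 13), 00000 (shift amount), 100000 (add).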

However, an important question remains: how does the processor take machine code (sequences of bits like those above) and use it to actually determine the answers we want? How does circuitry take the question,

and return the right answer? That’s the conundrum we’ll tackle next time in the final part of this sequence.

Image credit: can-o-worms
