Building a Computer From Scratch: Understanding Computer Architecture #1

13 min readJun 10, 2024

Throughout this blog series, we will embark on a thrilling adventure to construct a computer from scratch. We will utilise the software and hardware simulators provided by nand2tetris alongside the informative guidebook, “The Elements of Computing Systems: Building a Modern Computer from First Principles.” Come join us as we look at the complicated interplay of hardware and software that drives modern computing by breaking it down into computer architecture. Let us begin by exploring the major abstractions that underpin the design of a typical computing system.

The hierarchical design of computer systems
Computer systems are built in layers, with each layer offering abstract services and building blocks for the layer above. This hierarchical technique reduces the complexities of computer design and development. Here’s a breakdown of these layers, from the most basic to the most abstract:

Digital Logic Level: This is where computer hardware gets its foundation. It all comes down to simple electronic circuits like multiplexers, flip-flops, and gates. They serve as the foundation for circuits that are more intricate.

Moving up a level, we have the microarchitecture. This is all about designing the processor and its components. We’re talking about things like the registers, Arithmetic Logic Unit (ALU), and control units. It’s all about figuring out how the CPU will actually execute instructions.

That being said, the Instruction Set Architecture (ISA). This is like the bridge between software and hardware. It lays out the specific set of instructions that a processor can execute, as well as how they’re represented in binary form.

Now, the Operating System Level. The operating system is used in that case. It offers services to the application software and oversees all hardware resources. In essence, it handles all the intricate details in the background, resulting in an intuitive software development interface.

At last , the Application Level is presented. Application software works with the operating system at its most advanced level to carry out particular functions for users. Everything from word processors to web browsers and gaming is included in this level.

We frequently overlook the complex layers that function beneath the surface of a computer when we use one. We often consider it to be as simple as writing a programme that executes on a machine and completes our task. But actuality is much more complicated. To understand and carry out the instructions, the computer carefully breaks down the programme, reads the code, and converts it into binary form. The tiered mechanism that drives everything from binary execution to high-level code is what actually opens our daily computer experiences.

Machine language is just a collection of binary codes that everyone agrees on. It’s like an abstract idea that needs to be brought to life through hardware architecture. This architecture is then implemented using a specific chip set that contains registers, memory units, the ALU, and other useful components. Each of these hardware devices is comprised of a number of elementary logic gates. These gates can be constructed using fundamental gates such as NAND and NOR, which are made up of a collection of tiny switching devices, typically transistors. I won’t get into the nitty-gritty specifics because that’s where computer science meets physics, and things get pretty confusing.

As we get started on the adventure of creating a computer from scratch, we begin with the fundamental building blocks: logic gates. While we will not physically implement these gates, we will define their logical behaviour with Hardware Description Language (HDL). The logical behaviour of these gates is the same across all computers, independent of the materials or fabrication techniques utilised. This consistency originates from the basic laws of digital logic, which are unaffected by physical implementation.

To understand how to design logic gates, we must first grasp their function. Logic gates are physical devices that perform Boolean functions These Boolean functions are algebraic expressions derived from Boolean variables and operations. They define how output values are determined by input values and can be represented in truth tables that list all possible input combinations and their matching outputs. Boolean algebra is a mathematical framework used to analyse and develop logical systems.It works with binary variables, which can only have one of two values: true (1) or false(0). AND, OR, and NOT are the fundamental operations of Boolean algebra, establishing the framework for more complex equations and processes.

Boolean algebra helps to simplify computer circuit design by abstracting the intricate physical mechanisms of gates. This enables computer scientists to concentrate entirely on logic and functionality, rather than the finer points of physical implementation. Gates, the fundamental building blocks of digital circuits, use transistors grouped in precise configurations to carry out basic logical functions. The gates’ input pins accept binary data as inputs, and and outputs are derived from the results of the Boolean functions they implement.

Here are some simple logic gates.

OR Gate: Returns true if at least one of the inputs is true.
AND Gate: Returns true when both inputs are true.
NOT Gate: Returns the inverse of its input.
NAND Gate: Returns false only if both inputs are true; effectively a NOT-AND.
NOR Gate: Outputs true only when both inputs are false; essentially a NOT-OR.
XOR Gate: Returns true when the inputs vary, crucial for comparison tasks.
XNOR Gate: Also known as NOT-XOR, it produces a true output when the inputs are equal.

Since all logic gates have the same input and output semantics (0’s and 1's), they can be chained together to create composite gates of varying complexity. Consider the 3-way Boolean function And(a, b, c). To use Boolean algebra, note that a·b·c = (a·b)·c. Alternatively, using prefix notation, And(a, b, c) = And(And(a, b), c).

Initially, hardware designers built gates physically, frequently combining AND, OR, and NOT gates to form more complicated ones such as XOR gates. However, correcting even minor flaws in physical structure proved difficult. Today, hardware designers use computer workstations to plan and optimise chip architecture. They employ structured modelling languages like Hardware Description Language (HDL), often known as VHDL (where V stands for Virtual), to simulate and refine designs virtually, removing the need for manual creation and making error correction simple.

A chip’s HDL specification is divided into two sections: header and components. The header section defines the chip’s interface, including its name and the names of its input and output pins. The parts section describes the names and arrangements of all of the lower-level components (other chips) utilised to build the chip. Each part is represented by a statement that includes the part name and its connections. When writing an HDL programme, make sure that you define your input and output pins. Inter-part connections are formed by creating and connecting internal pins as required.

Let’s go over the process of implementing an AND gate in the HDL program using a pre-set NAND gate from nand2tetris. Just like axioms in mathematics, primitive gates serve as the foundation for building more complex structures. These gates come with a ready-made implementation, allowing us to use them without concerning ourselves with their internal workings.

/**
 * And gate:
 * if (a and b) out = 1, else out = 0 
 */
CHIP And {
    IN a, b;    // Declare input pins a and b
    OUT out;    // Declare output pin out

    PARTS:
    // Using NAND to implement AND
    Nand(a=a, b=b, out=nandOut);  // First NAND operation with inputs a and b
    Nand(a=nandOut, b=nandOut, out=out);  // Negate the output of the first NAND
}

This piece of code demonstrates the construction of an AND gate using HDL by combining NAND gates.

Input and Output: The gate requires two inputs, labeled as a and b, and generates a single output, referred to as out.
First NAND Gate: It takes a and b as its inputs and produces an intermediate output known as nandOut. A NAND gate produces a true output only when at least one of its inputs is false.
Second NAND Gate: This gate takes nandOut as both of its inputs and inverts the signal. This effectively transforms the NAND gate into a NOT gate.

By combining these two steps, the circuit functions as an AND gate. When both a and b are true, the output out is also true. If either or both inputs are false, the output will be false. This example illustrates how more intricate logic gates can be constructed using simpler ones.

When I run this HDL code in the hardware simulator, it produces the following outputs for the AND gate:

a=0, b=0 => out=0
a=0, b=1 => out=0
a=1, b=0 => out=0
a=1, b=1 => out=1

This confirms that the AND gate is correctly implemented using NAND gates.

In a similar way, let’s explore how MUX and DMUX work, and delve into their implementations.

Multiplexer (MUX)

A multiplexer chooses one out of many input signals and sends it to a single output. It is a digital switch enabling several input signals to utilize one output line simultaneously. In essence, it picks one input from several and sends it to the output according to a selection signal.
Consider a MUX as a railway switch that guides the train (data) from multiple tracks (inputs) to one track (output) depending on a control mechanism (selection signal).
How it works

Input: It usually has numerous inputs, typically denoted as a, b, c,…
Selection Signal: The selection signal (sel) determines which input is to be directed to the output.
Output: The output is a single line that carries the value of the selected input.

For example , a 2-to-1 multiplexer has two inputs and one selection signal:

When sel = 0, the output is the first input (a).
When sel = 1, the output is the second input (b).

Here’s an HDL representation of 2-to-1 multiplexer:

CHIP Mux {
    IN a, b, sel;
    OUT out;

    PARTS:
    // nSel = not sel
    Not(in=sel, out=nSel);

    // aAnd = a AND nSel
    And(a=a, b=nSel, out=aAnd);

    // bAnd = b AND sel
    And(a=b, b=sel, out=bAnd);

    // out = aAnd OR bAnd
    Or(a=aAnd, b=bAnd, out=out);
}

Within this piece of code:
NOT gate is used to invert the selection signal sel.
Two AND gates direct the correct input to the output depending on the chosen signal.
OR gate merges these outputs to generate the ultimate result.

In the same way, we can create a MUX16, MUX4Way16, and MUX8Way16 by utilising fundamental MUX elements.

MUX16

A MUX16 is a 16-bit multiplexer that selects one of two 16-bit inputs based on a selection signal (sel) and produces a 16-bit output. When sel is 0, the output is equal to input a, and when sel is 1, the output is equal to input b.

/**
 * 16-bit multiplexor: 
 * selects one of two 16-bit inputs based on a selection signal & produces a 16-bit output (out).
 * when sel is 0, the output is equal to input a, and when sel is 1, the output is equal to input b
 * for i = 0, ..., 15:
 * if (sel = 0) out[i] = a[i], else out[i] = b[i]
 */

CHIP Mux16 {
    IN a[16], b[16], sel;
    OUT out[16];

    PARTS:
    // Instantiate 16 2-input Mux chips to select between corresponding bits of a and b
    // according to the value of sel.
    Mux(a=a[0], b=b[0], sel=sel, out=out[0]);
    Mux(a=a[1], b=b[1], sel=sel, out=out[1]);
    Mux(a=a[2], b=b[2], sel=sel, out=out[2]);
    Mux(a=a[3], b=b[3], sel=sel, out=out[3]);
    Mux(a=a[4], b=b[4], sel=sel, out=out[4]);
    Mux(a=a[5], b=b[5], sel=sel, out=out[5]);
    Mux(a=a[6], b=b[6], sel=sel, out=out[6]);
    Mux(a=a[7], b=b[7], sel=sel, out=out[7]);
    Mux(a=a[8], b=b[8], sel=sel, out=out[8]);
    Mux(a=a[9], b=b[9], sel=sel, out=out[9]);
    Mux(a=a[10], b=b[10], sel=sel, out=out[10]);
    Mux(a=a[11], b=b[11], sel=sel, out=out[11]);
    Mux(a=a[12], b=b[12], sel=sel, out=out[12]);
    Mux(a=a[13], b=b[13], sel=sel, out=out[13]);
    Mux(a=a[14], b=b[14], sel=sel, out=out[14]);
    Mux(a=a[15], b=b[15], sel=sel, out=out[15]);
}

MUX4Way16

A MUX4Way16 is a 4-way 16-bit multiplexer that selects one of four 16-bit inputs based on a 2-bit selection input. Here’s how we use MUX16 components to build a MUX4Way16:

/**
 * 4-way 16-bit multiplexor:
 * out = a if sel = 00
 *       b if sel = 01
 *       c if sel = 10
 *       d if sel = 11
 * selects between four 16-bit inputs (a, b, c, d) based on a 2-bit selection input (sel)
*/


CHIP Mux4Way16 {
    IN a[16], b[16], c[16], d[16], sel[2];
    OUT out[16];
    
    PARTS:
  // Implementing two 2-way 16-bit multiplexers
    Mux16(a=a, b=b, sel=sel[0], out=temp1);
    Mux16(a=c, b=d, sel=sel[0], out=temp2);

    // Selecting the correct output based on the second bit of the selection input (sel[1])
    Mux16(a=temp1, b=temp2, sel=sel[1], out=out);
}

In this code, we initially utilise two MUX16 components to choose between inputs a and b, and c and d, based on the first bit of the selection signal (sel[0]. The second bit of the selection signal is then used to select between the outputs of the first two MUX16 components.
Using these fundamental components, we can build more complex multiplexers, such as MUX8Way16, by combining MUX4Way16 and MUX16 elements.

Demultiplexer (DMUX)

A demultiplexer, or DMUX, serves the opposite function to a multiplexer. It accepts a single input and assigns it to one of numerous output lines based on a selection signal.
Consider a DMUX to be a distribution switch that directs a single data signal (input) to one of numerous output lines using a control mechanism (selection signal).

How It Works

Input: A demultiplexer has a single input.
Selection Signal: The selection signal (sel) determines which output the input should be routed to.
Outputs: The outputs are multiple lines that can carry the input value.

For example, a 1-to-2 demultiplexer has one input and one selection signal:

When sel = 0, the input is routed to the first output (a).
When sel = 1, the input is routed to the second output (b).

Here is an HDL implementation of a 1-to-2 demultiplexer:

/**
 * Demultiplexor takes a single input signal and routes it to one of two output signals based on the selection signal (sel).
 * [a, b] = [in, 0] if sel = 0
 *          [0, in] if sel = 1
 */
CHIP DMux {
    IN in, sel;
    OUT a, b;

    PARTS:
     // NOT gate to invert the selection signal
    Not(in=sel, out=nsel);

    // AND gates to route the input to the correct output based on the selection signal
    And(a=in, b=nsel, out=a);
    And(a=in, b=sel, out=b);
}

In this code:

The selection signal sel is inverted using a NOT gate.
Two AND gates route the input to the appropriate output based on the selection signal.

Using the basic DMux, we can build more complicated demultiplexers like DMux4Way and DMux8Way.

DMux4Way

A 4-way demultiplexer takes a single input signal and routes it to one of four outputs based on a 2-bit selection signal.

How it works:

The DMux4Way behaves as follows:

When sel is 00, the input is routed to a.
When sel is 01, the input is routed to b.
When sel is 10, the input is routed to c.
When sel is 11, the input is routed to d.

Here’s the HDL implementation of a 4-way demultiplexor:

/**
 * 4-way demultiplexor demultiplexer takes a single input signal and routes it to one of four output signals based on a 2-bit selection signal (sel).
 * [a, b, c, d] = [in, 0, 0, 0] if sel = 00
 *                [0, in, 0, 0] if sel = 01
 *                [0, 0, in, 0] if sel = 10
 *                [0, 0, 0, in] if sel = 11
 */
CHIP DMux4Way {
    IN in, sel[2];
    OUT a, b, c, d;

    PARTS:
   // First level DMux splits the input based on sel[1] (MSB) into two intermediate signals (ab and cd)
    DMux(in=in, sel=sel[1], a=ab, b=cd);
    
    // Second level DMux further splits into outputs a and b based on the LSB of sel (sel[0]).
    DMux(in=ab, sel=sel[0], a=a, b=b);
    
    // Second level DMux further splits into outputs c and d based on the LSB of sel (sel[0]).
    DMux(in=cd, sel=sel[0], a=c, b=d);
   
}

In this code:

We use a DMux to split the input in into two intermediate signals, ab and cd, based on the most significant bit (sel[1]) of the selection signal.
We further split ab into a and b based on the least significant bit (sel[0]) of the selection signal.
Similarly, we split cd into c and d based on the least significant bit (sel[0]) of the selection signal.

DMux8Way

An 8-way demultiplexer takes a single input signal and routes it to one of eight outputs based on a 3-bit selection signal.

How it works:

The DMux8Way behaves as follows:

When sel is 000, the input is routed to a.
When sel is 001, the input is routed to b.
When sel is 010, the input is routed to c.
When sel is 011, the input is routed to d.
When sel is 100, the input is routed to e.
When sel is 101, the input is routed to f.
When sel is 110, the input is routed to g.
When sel is 111, the input is routed to h.

Here’s the HDL implementation of a 8-way demultiplexor:

/**
 * 8-way demultiplexor takes a single input signal and routes it to one of eight possible output signals based on a 3-bit selection signal (sel).
 * [a, b, c, d, e, f, g, h] = [in, 0,  0,  0,  0,  0,  0,  0] if sel = 000
 *                            [0, in,  0,  0,  0,  0,  0,  0] if sel = 001
 *                            [0,  0, in,  0,  0,  0,  0,  0] if sel = 010
 *                            [0,  0,  0, in,  0,  0,  0,  0] if sel = 011
 *                            [0,  0,  0,  0, in,  0,  0,  0] if sel = 100
 *                            [0,  0,  0,  0,  0, in,  0,  0] if sel = 101
 *                            [0,  0,  0,  0,  0,  0, in,  0] if sel = 110
 *                            [0,  0,  0,  0,  0,  0,  0, in] if sel = 111
 */
  CHIP DMux8Way {
    IN in, sel[3];
    OUT a, b, c, d, e, f, g, h;

    PARTS:
    // First level of demultiplexing based on the most significant bit (sel[2])
    //splits 'in' into four signals based on the two most significant bits of 'sel'
    DMux4Way(in=in, sel=sel[1..2], a=w,b=x,c=y,d=z);
    
    // Further demultiplexing based on the remaining bits sel[0]
    // Each of the four signals from the first level is further demultiplexed into two signals
    // based on the least significant bit of 'sel'
    DMux(in=w, sel=sel[0], a=a, b=b);
    DMux(in=x, sel=sel[0], a=c, b=d);
    DMux(in=y, sel=sel[0], a=e, b=f);
    DMux(in=z, sel=sel[0], a=g, b=h);
}

In this code:

We use a DMux4Way to split the input in into four intermediate signals, w, x, y, and z, based on the two most significant bits (sel[1..2]) of the selection signal.
We further split each of these intermediate signals into two output signals based on the least significant bit (sel[0]) of the selection signal:
We split w into a and b based on sel[0].
We split x into c and d based on sel[0].
We split y into e and f based on sel[0].
We split z into g and h based on sel[0].

These components are critical in digital circuits, allowing for efficient data routing and distribution. They are easily implemented using HDL for creating and simulating digital systems. If you want to see more examples, go to my GitHub. This is the end of my first week of building a computer from scratch. Stay tuned for the following section, where we’ll look at arithmetic operations on these logic gates.

Building a Computer From Scratch: Understanding Computer Architecture #1

Multiplexer (MUX)

MUX16

MUX4Way16

Demultiplexer (DMUX)

DMux4Way

How it works:

DMux8Way

How it works:

Written by Ruthu Rao