CPU: How it works…

Gabriel Batista
Aug 28, 2018

Computers have revolutionized the way we lead our everyday lives, but few people understand what's really going on below all that glass and brushed aluminum. Today I want to take a deep dive into how a processor goes about reading, interpreting and storing a piece of code and its outcome. Let's break it down…

The Pieces

Before we get into how the CPU works I want to give a light description of what each piece is. Don’t worry too much if it doesn’t make sense at first, as I will be going over their functions in detail in the next section.

At a basic level, a CPU consists of 6 pieces:

The Clock

The clock synchronizes the internal operations of the CPU with other system components.

The Control Unit

The Control Unit is in charge of directing each instruction from the code to the component that handles it, along with the data needed for the operation.

The ALU (Arithmetic Logic Unit)

The ALU performs arithmetic operations such as addition and subtraction, and logical operations such as AND, OR and NOT.
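
To make that a little more concrete, here is a minimal sketch of an ALU in Python. The function name alu, the operation names and the 32-bit width are my own assumptions for illustration; real hardware does this with logic gates, not if statements.

```python
MASK32 = 0xFFFFFFFF  # assumed register width: keep every result within 32 bits

def alu(op, a, b=0):
    """Hypothetical ALU: apply one operation to unsigned 32-bit inputs."""
    if op == "ADD":
        result = a + b
    elif op == "SUB":
        result = a - b
    elif op == "AND":
        result = a & b
    elif op == "OR":
        result = a | b
    elif op == "NOT":
        result = ~a  # NOT only uses the first input
    else:
        raise ValueError(f"unknown operation: {op}")
    # Wrap around the way fixed-width hardware does
    return result & MASK32

print(alu("ADD", 7, 5))       # 12
print(hex(alu("NOT", 0)))     # 0xffffffff
print(hex(alu("SUB", 0, 1)))  # 0xffffffff, because 0 - 1 wraps around
```

The final mask is the important part: a register has a fixed number of bits, so results that do not fit simply wrap.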

Registers

The registers work as lightning-fast memory that lives inside the CPU. They store values while calculations are performed on them, hold status flags, or reference the next instruction to be executed. On the x86 family, which is the architecture I use throughout this article, there are 8 general purpose registers in 32-bit mode and 16 in 64-bit mode, along with 6 segment registers, an EFLAGS register and an Instruction Pointer register (commonly referred to as EIP in 32-bit mode).

MMU (Memory Management Unit)

The MMU is responsible for the translation of virtual memory addresses to physical addresses. Depending on the design, it may also handle memory protection, cache control and bus arbitration.
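
The translation step can be sketched in a few lines of Python. This is only an illustration: the 4 KiB page size, the page_table contents and the translate function are assumptions I made up, and a real MMU walks multi-level tables in hardware.

```python
PAGE_SIZE = 4096  # assumed 4 KiB pages

# Hypothetical page table: virtual page number -> physical frame number
page_table = {0: 5, 1: 2, 2: 9}

def translate(virtual_address):
    """Split the address into page and offset, then swap the page for its frame."""
    page, offset = divmod(virtual_address, PAGE_SIZE)
    if page not in page_table:
        # On a real system this is where a page fault would be raised
        raise LookupError(f"page fault: virtual page {page} is not mapped")
    return page_table[page] * PAGE_SIZE + offset

print(hex(translate(0x1234)))  # 0x2234: page 1 maps to frame 2, offset 0x234 is kept
```

Only the page number changes during translation; the offset inside the page is carried over untouched.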

The Cache

The cache is a special set of memory located inside of the CPU which stores the data needed for the next few sets of operations. Due to the CPU's incredible speed, a bottleneck is created whenever a piece of data needed for an operation must be retrieved from outside of the processor. The cache mitigates that bottleneck by preloading the needed data in bulk for the next few commands.
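
Here is a rough Python sketch of why loading data in bulk helps. Everything in it is hypothetical (the line size of 4, the main_memory list, the read function); it only models the idea that one slow trip to memory brings back several neighbouring values at once.

```python
LINE_SIZE = 4  # assumed: each cache line holds 4 consecutive values

main_memory = list(range(100, 132))  # stand-in for slow RAM
cache = {}                           # line number -> list of cached values
stats = {"hits": 0, "misses": 0}

def read(address):
    """Return the value at address, loading its whole line on a miss."""
    line, offset = divmod(address, LINE_SIZE)
    if line in cache:
        stats["hits"] += 1
    else:
        stats["misses"] += 1
        start = line * LINE_SIZE
        # One slow trip to memory fetches the neighbours as well
        cache[line] = main_memory[start:start + LINE_SIZE]
    return cache[line][offset]

values = [read(address) for address in range(8)]
print(values)  # [100, 101, 102, 103, 104, 105, 106, 107]
print(stats)   # {'hits': 6, 'misses': 2}
```

Eight reads only cost two trips to memory, because each miss pulls in the three values that are read next.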

Registers

Before we can get to the meat of how an instruction is executed I need to go into detail about the registers. Registers are how you interface with the CPU. The values that you manipulate are stored in registers, and the results of those operations are also placed in a register.

The flow of information in the CPU is directed by the registers. In the register definition I mentioned that a 32-bit computer contains 8 general purpose registers. The first 4 registers are truly general purpose in the sense that they can be used for whatever purpose the programmer sees fit, but they do have a suggested purpose, and some procedures (functions) will look into specific registers for values. The registers are as follows:

EAX

The first of the truly general purpose registers. It is also known as the accumulator and is used for arithmetic operations.

EBX

The second of the truly general purpose registers. It is also known as the base register and is used as a pointer to data. This register is also used for indexed addressing.

ECX

The third of the truly general purpose registers. It is also known as the counter register and is used to keep track of loop iterations and shift/rotate instructions.

EDX

The last of the truly general purpose registers. It is also known as the data register and is used for multiplication and division operations along with EAX.
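
The way EDX teams up with EAX can be sketched in Python. This is a simplified model of the unsigned 32-bit multiply and divide, with hypothetical helper names mul32 and div32: a multiply leaves the high half of the result in EDX and the low half in EAX, and a divide treats EDX:EAX as one 64-bit number.

```python
MASK32 = 0xFFFFFFFF

def mul32(eax, operand):
    """Unsigned multiply: return (edx, eax) holding the high and low halves."""
    product = eax * operand
    return (product >> 32) & MASK32, product & MASK32

def div32(edx, eax, operand):
    """Unsigned divide of EDX:EAX: return (edx, eax) as (remainder, quotient)."""
    dividend = (edx << 32) | eax
    quotient, remainder = divmod(dividend, operand)
    if quotient > MASK32:
        # Assumption: modelled as a Python error; real hardware raises a divide error
        raise OverflowError("quotient does not fit in EAX")
    return remainder, quotient

edx, eax = mul32(0xFFFFFFFF, 2)
print(hex(edx), hex(eax))  # 0x1 0xfffffffe
print(div32(0, 17, 5))     # (2, 3): remainder 2 in EDX, quotient 3 in EAX
```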

These registers are the ones you will find yourself using most often for your calculations. The rest of the registers can also be edited at will, but doing so will affect the flow of your program.

EBP

This register is used to point to the base of the current stack frame. It gives a function a fixed reference point for its parameters and local variables, and it is used to restore the stack pointer when a function invocation is complete.

ESP

This register is used to point to the top of the stack. It works in tandem with the EBP to keep track of the stack.
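
A small Python sketch of how ESP moves as values are pushed and popped. The starting address 0x1000 and the dictionary standing in for memory are assumptions; the part that matches x86 is that the stack grows toward lower addresses, 4 bytes at a time in 32-bit mode.

```python
memory = {}             # hypothetical sparse memory: address -> 32-bit value
regs = {"ESP": 0x1000}  # assumed starting top of stack

def push(value):
    """Make room by moving ESP down 4 bytes, then store the value there."""
    regs["ESP"] -= 4
    memory[regs["ESP"]] = value

def pop():
    """Read the value at the top of the stack, then move ESP back up."""
    value = memory[regs["ESP"]]
    regs["ESP"] += 4
    return value

push(1)
push(2)
print(hex(regs["ESP"]))  # 0xff8: two pushes moved ESP down by 8
print(pop(), pop())      # 2 1: last in, first out
print(hex(regs["ESP"]))  # 0x1000: back where we started
```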

ESI

This register is used to point to the source of a stream. This register can be used for a few different purposes including working with strings and doing mass assignment operations.

EDI

This register is used to point to the destination of a stream. This register can be used for a few different purposes including working with strings and doing mass assignment operations.
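
To show ESI and EDI working as a pair, here is a loose Python model of a repeated byte copy (similar in spirit to the x86 rep movsb instruction). The bytearray standing in for memory and the function name are hypothetical, and I am assuming the copy runs forward, from low addresses to high.

```python
memory = bytearray(b"hello---------")  # hypothetical 14-byte memory

def rep_movsb(esi, edi, ecx):
    """Copy ecx bytes from the source pointer to the destination pointer."""
    while ecx > 0:
        memory[edi] = memory[esi]
        esi += 1  # source pointer advances
        edi += 1  # destination pointer advances
        ecx -= 1  # counter register counts down to zero
    return esi, edi, ecx

esi, edi, ecx = rep_movsb(esi=0, edi=8, ecx=5)
print(memory)         # bytearray(b'hello---hello-')
print(esi, edi, ecx)  # 5 13 0
```

Notice that ECX plays its counter role here as well, which is why these three registers so often show up together.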

EFLAGS

This register is used to keep track of boolean flags such as overflow, carry, zero, sign and many more. Each flag is represented by a single bit in the register.
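
Here is a minimal Python sketch of how four of those flags could be derived from a subtraction, which is what a compare instruction does under the hood. The function names cmp_flags and is_less are my own, and the inputs are assumed to be unsigned 32-bit values.

```python
MASK32 = 0xFFFFFFFF
SIGN_BIT = 0x80000000

def cmp_flags(a, b):
    """Compute a - b on 32-bit values and return only the resulting flags."""
    result = (a - b) & MASK32
    return {
        "ZF": int(result == 0),             # zero: the values were equal
        "SF": int(bool(result & SIGN_BIT)), # sign: top bit of the result
        "CF": int(a < b),                   # carry: an unsigned borrow happened
        # overflow: the operands had different signs and the result's sign flipped
        "OF": int(bool((a ^ b) & (a ^ result) & SIGN_BIT)),
    }

def is_less(flags):
    """Signed 'less than', the condition a jl jump checks: SF differs from OF."""
    return flags["SF"] != flags["OF"]

print(cmp_flags(0, 8))  # {'ZF': 0, 'SF': 1, 'CF': 1, 'OF': 0}
print(cmp_flags(8, 8))  # {'ZF': 1, 'SF': 0, 'CF': 0, 'OF': 0}
print(is_less(cmp_flags(0, 8)))  # True
```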

SEGMENT REGISTERS

There are actually 6 separate segment registers (CS, DS, SS, ES, FS and GS), but suffice it to say they point to segments of the memory used by the currently running program, such as its code, data and stack. When coding in assembly you can split your code into segments such as .data and .code .

and last but certainly not least…

EIP

This is the instruction pointer. It points to the next instruction in line to be executed. This register can be manipulated to jump around the program and implement things like loops.

THE LOGIC

With the definitions behind us we can get to the good stuff.

The process of executing a command has 3 steps: fetch, decode and execute.

FETCH:

The CPU first copies the instruction address held in the IP (instruction pointer) register into the MAR (memory address register), which holds the address of whatever is about to be read from memory. That address is sent over the address bus to the memory controller, which reads the matching location in RAM. (If the page containing that address is not currently in RAM, the operating system has to bring it in from the hard drive before the read can complete.) Once the data is located it is sent back to the CPU over the data bus, and the CPU stores it in the MBR (memory buffer register). Finally the IP is incremented so that it points to the address of the next instruction.

DECODE:

Now that the CPU has the command in hand it needs to work out what it actually means. The fetched instruction is already machine code (assembly language is just the human-readable form of it), so in this step the CPU decodes those bits into an operation and its operands, taking special care to find all the parameters needed for the execution and gathering the operands needed for arithmetic and floating-point calculations.

EXECUTE:

Lastly, we have the execute step. This is the cycle where data processing actually takes place: the instruction is carried out upon the data. The result of this processing is stored in a register or written back to memory.

After the execute step is completed the CPU begins a new cycle by fetching the next command.
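
The whole cycle can be sketched as a tiny made-up machine in Python. Nothing here is real x86: the 16-bit instruction format, the three opcodes and the ACC register are assumptions chosen to keep the example short. What it does mirror is the order of events described above, including the MAR, the MBR and the IP increment.

```python
# Hypothetical 16-bit instruction words: high byte = opcode, low byte = operand
LOAD, ADD, HALT = 0x01, 0x02, 0xFF
memory = [0x0105, 0x0203, 0xFF00]  # LOAD 5; ADD 3; HALT

regs = {"IP": 0, "MAR": 0, "MBR": 0, "ACC": 0}

def step():
    """Run one fetch-decode-execute cycle. Return False once HALT is reached."""
    # FETCH: copy IP into MAR, read memory into MBR, point IP at the next word
    regs["MAR"] = regs["IP"]
    regs["MBR"] = memory[regs["MAR"]]
    regs["IP"] += 1

    # DECODE: split the raw bits into an operation and its operand
    opcode = regs["MBR"] >> 8
    operand = regs["MBR"] & 0xFF

    # EXECUTE: carry out the operation and store the result in a register
    if opcode == LOAD:
        regs["ACC"] = operand
    elif opcode == ADD:
        regs["ACC"] += operand
    elif opcode == HALT:
        return False
    else:
        raise ValueError(f"unknown opcode: {opcode:#04x}")
    return True

while step():
    pass

print(regs["ACC"])  # 8
print(regs["IP"])   # 3: the IP moved past every instruction it fetched
```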

BONUS!

As a bonus I want to run through a VERY simple (and completely useless) piece of assembly code.

The mov command on line 4 simply copies the value of the second parameter into the first parameter, so we are setting the register EAX to the value 0. On the next line we are doing the same, except it's the value 8 going into the EBX register. The cmp command stands for compare. It works by setting flags which we can use to trigger a jump command. Speaking of jump commands, line 7 ( jl L1 ) says jump if less than to label L1. It looks back at the last cmp statement and, if the first parameter was less than the second, moves the instruction pointer to the label named L1.

Below this block there are 2 separate parts of code denoted by labels ( L1: and EX: ). Let's assume that our jl (jump if less than) command has triggered the jump. This would leave us on line 12. The next command to be executed would be the inc command, which simply increments the register passed to it by one. We then compare EAX and EBX again and jump to EX: when the value of EAX is exactly 8. EX: simply exits the program.
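
The assembly listing itself is not reproduced in this text, so here is a rough Python model of the behaviour described above. Treat it as a reconstruction from the prose only: the exact instruction order, the jmp instructions and the je that leaves the loop are my assumptions, not the original code.

```python
# Assumed program, written as tuples instead of real assembly
program = [
    ("mov", "EAX", 0),
    ("mov", "EBX", 8),
    ("cmp", "EAX", "EBX"),
    ("jl", "L1"),          # jump to L1 if EAX < EBX
    ("jmp", "EX"),         # assumption: otherwise go straight to the exit
    ("label", "L1"),
    ("inc", "EAX"),
    ("cmp", "EAX", "EBX"),
    ("je", "EX"),          # assumption: leave the loop once EAX equals EBX
    ("jmp", "L1"),         # assumption: otherwise loop again
    ("label", "EX"),
    ("exit",),
]

def run(program):
    """Interpret the tuples above and return the final register values."""
    labels = {ins[1]: i for i, ins in enumerate(program) if ins[0] == "label"}
    regs = {"EAX": 0, "EBX": 0}
    diff = 0  # stand-in for EFLAGS: the result of the last cmp
    ip = 0    # our instruction pointer
    while True:
        ins = program[ip]
        ip += 1
        op = ins[0]
        if op == "mov":
            regs[ins[1]] = ins[2]
        elif op == "inc":
            regs[ins[1]] += 1
        elif op == "cmp":
            diff = regs[ins[1]] - regs[ins[2]]
        elif op == "jl" and diff < 0:
            ip = labels[ins[1]]
        elif op == "je" and diff == 0:
            ip = labels[ins[1]]
        elif op == "jmp":
            ip = labels[ins[1]]
        elif op == "exit":
            return regs
        # labels and jumps that are not taken simply fall through

print(run(program))  # {'EAX': 8, 'EBX': 8}
```

Every jump is nothing more than an assignment to ip, which is exactly the trick the instruction pointer section hinted at.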

And that's that. I hope this helps to fill you with questions and leads you to learn assembly and the hardware of your computer a little deeper!

As always if you have any questions please leave me a comment!

Gabriel Batista

Full-stack developer with a background in computer repairs, looking for my first break in software engineering. https://linkedin.com/in/gabriel-batista-dev/