CPUs: Past, Present, and Future

A brief introduction to the history of computing and where it stands today.

Marcus Chan
The Startup
11 min read · Oct 28, 2019


Introduction

One thing I find super impressive about the human race is its ability to take simple objects and turn them into complicated ones. For example, you’re reading this article on nothing more than a bunch of electronic switches. I’ve been fascinated with computers ever since I was a child, and I’m super excited to explain how they work. Let’s get into it.

The First Generation: Triodes

Triodes were the first switches used for computing. They were invented in 1906 by American inventor Lee de Forest. Thankfully, triodes are easy to visualize, because they’re so large you can actually see them with the naked eye. Triodes look like this:

Simple diagram of a triode

The triode is made of 3 parts: The cathode, the plate, and the control grid. These parts are all suspended in a vacuum, which is why the triode is a type of vacuum tube. The cathode is heated up so it emits electrons, which are attracted to the plate, with the control grid controlling the flow of electrons between them.

Making the control grid negative causes electrons to be repelled back to the cathode, but making it positive causes electrons to be attracted to the plate instead. Notice how there are only 2 settings: This is the primitive version of the 0/1 mechanism found in today’s computers.

As good as this seems, triodes were not without their drawbacks. To start, they were incredibly inefficient, consuming lots of power and generating massive amounts of heat. They were also unreliable: in large systems, a tube would fail every few hours. On top of that, triodes are made of glass, so they’re fragile. This began the search for a more efficient switch.

The Second Generation: Transistors

Enter the transistor: a solid-state electronic switch invented in 1947 by American physicists John Bardeen, Walter Brattain, and William Shockley at Bell Labs. Transistors had countless benefits over their predecessors: they were much smaller and consumed a tiny fraction of the power.

There are many different kinds of transistors, but we’ll focus on the one used in today’s computers: Metal Oxide Semiconductor Field Effect Transistors, also known as MOSFETs.

Of course, there’s an obvious increase in complexity here, and transistors are not so simple.

Diagram of a MOSFET

See? Told you. This looks a lot more complicated. But don’t worry, I’ll explain. MOSFETs are made by layering materials on a silicon substrate, with some layers given specific impurities through a process called doping. Silicon dioxide (acting as an insulator), polysilicon (acting as the gate electrode), and metal (to connect to other transistors) are also used to construct the transistor.

Depending on how it’s treated, silicon can act as either an insulator or a conductor, hence the name semiconductor. There are 2 types of MOSFETs: PMOS (P-Type transistors) and NMOS (N-Type transistors).

PMOS transistors are doped with boron, which leaves the silicon with a shortage of electrons, giving it a positive character. Conversely, NMOS transistors are doped with phosphorus, which adds extra electrons and gives the silicon a negative character.

If it makes you feel better, MOSFETs are also made of 3 parts, just with different names: the gate, the drain, and the source. NMOS transistors are made with an N-Type source and drain and a piece of P-Type silicon in between. The gate sits above that P-Type silicon, separated from it by an insulating layer of silicon dioxide.

Normally, no current flows between the source and drain. However, if there’s a positive voltage on the gate, the gate creates an electric field that attracts electrons into the P-Type silicon beneath it. Those extra electrons make the P-Type region act like N-Type, creating a conductive path between the two N-Type regions and turning the transistor on.

Simpler diagram of a MOSFET
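
If you like thinking in code, here’s a tiny Python sketch that treats an NMOS transistor as a voltage-controlled switch: above some threshold voltage on the gate it conducts, below it it doesn’t. The 0.7 V threshold is just an illustrative assumption, not a spec from any real device.

```python
# Toy model of an NMOS transistor as a voltage-controlled switch.
# The 0.7 V threshold is an illustrative assumption, not a real device spec.

NMOS_THRESHOLD_V = 0.7

def nmos_conducts(gate_voltage: float) -> bool:
    """Return True if the gate voltage forms a channel (transistor is 'on')."""
    return gate_voltage > NMOS_THRESHOLD_V

print(nmos_conducts(0.0))  # False: no channel, no current between source and drain
print(nmos_conducts(1.2))  # True: a channel forms, current can flow
```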

NMOS and PMOS transistors are combined in such a way that significant power is only drawn when the output actually switches state, greatly reducing power consumption and increasing density compared to vacuum tubes. This combination is called complementary metal oxide semiconductor (CMOS).
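
The simplest example of this pairing is a CMOS inverter: when the input is 0, the PMOS conducts and pulls the output to 1; when the input is 1, the NMOS conducts and pulls the output to 0. At steady state exactly one of the two is on, so there’s never a direct path from the power supply to ground. Here’s a rough logic-level sketch (no real voltages, just the complementary behavior):

```python
# Logic-level sketch of a CMOS inverter: a PMOS and NMOS pair.
# Exactly one of the two transistors conducts for any stable input,
# so there is no direct supply-to-ground path at steady state.

def cmos_inverter(input_bit: int) -> int:
    pmos_on = (input_bit == 0)   # PMOS conducts when the gate is low
    nmos_on = (input_bit == 1)   # NMOS conducts when the gate is high
    assert pmos_on != nmos_on    # complementary: never both on at once
    return 1 if pmos_on else 0   # PMOS pulls the output to 1, NMOS pulls it to 0

for bit in (0, 1):
    print(bit, "->", cmos_inverter(bit))  # 0 -> 1, 1 -> 0
```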

Compared to vacuum tubes, transistors are much easier to miniaturize, leading to an exponential increase in transistor density. This trend is known as Moore’s Law, which observes that the number of transistors on a chip doubles roughly every 2 years.
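
To get a feel for how quickly doubling every 2 years compounds, here’s a quick back-of-the-envelope sketch. The starting point (roughly 2,300 transistors for the Intel 4004 in 1971) is only there for illustration; this shows the trend, not a prediction for any specific chip.

```python
# Rough Moore's Law projection: transistor count doubles roughly every 2 years.
# The starting figure (about 2,300 transistors, the Intel 4004 in 1971) is
# used only for illustration.

def projected_transistors(start_count: int, start_year: int, year: int) -> float:
    doublings = (year - start_year) / 2  # one doubling every 2 years
    return start_count * 2 ** doublings

for year in (1971, 1991, 2011, 2019):
    print(year, f"{projected_transistors(2_300, 1971, year):,.0f}")
```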

However, over the past decade it has become exponentially more difficult and expensive to miniaturize the transistor as its features approach the size of a silicon atom (about 0.2nm). Making a transistor too small also raises the possibility of electrons leaking right through it even when it’s turned off, a quantum-mechanical effect called quantum tunneling.

Billions of these transistors are fabricated on a silicon die, which is attached to a small printed circuit board (PCB) to form a central processing unit (CPU). The bottom of the PCB is covered in electrical connections that make contact with the motherboard of a computer.

Bottom view of the PCB

The CPU produces heat as it computes, so an integrated heat spreader (IHS) is placed on top of the die. It provides contact between the die and a cooler to keep the CPU cool. Some CPUs have a thermal compound between the IHS and the die, while higher-end ones have the IHS soldered to the die for much more effective heat transfer.

Clock Speeds

A CPU’s speed can be measured in many ways. Let’s start with the most basic one: Clock speed. Clock speed is measured in Hertz (Hz), the number of clock cycles a CPU completes per second. There’s also a metric called instructions per clock (IPC), which is simply the number of instructions a CPU can execute in each clock cycle.
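
Multiply the two together and you get a rough estimate of how many instructions per second a CPU can perform. The numbers below are made up purely for illustration:

```python
# Rough throughput estimate: instructions per second ~ clock speed * IPC.
# The example values are illustrative, not measurements of any real CPU.

def instructions_per_second(clock_hz: float, ipc: float) -> float:
    return clock_hz * ipc

# A 4 GHz CPU averaging 2 instructions per clock:
print(f"{instructions_per_second(4e9, 2):.2e} instructions/second")  # 8.00e+09
```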

The first ever digital computer, created in 1941, had a clock speed of a mere 5–10 Hz. When the first commercially sold PC was released in 1974, it ran at 2 MHz (which is 2 MILLION Hertz). And clock speeds just kept on increasing. The 1 GHz (1 BILLION Hertz!) mark was passed in 2000 by an AMD Athlon CPU. Any guesses for the clock speeds of CPUs today? 100 GHz? 500 GHz?

Wrong. Today’s CPUs only run at around 4 GHz. Increasing clock speed uses exponentially more power and leads to a much higher heat output as well. In CPU terms, that power and heat budget is measured in watts and is called the thermal design power (TDP).
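
The underlying reason is that dynamic power in CMOS scales roughly with capacitance times voltage squared times frequency, and higher clock speeds usually need higher voltages too, so power climbs much faster than the clock. Here’s a rough sketch with purely illustrative numbers:

```python
# Rough dynamic power model for CMOS: P ~ C * V^2 * f.
# The capacitance and voltage figures are illustrative assumptions,
# not specs of any real CPU.

def dynamic_power_watts(capacitance_f: float, voltage_v: float, freq_hz: float) -> float:
    return capacitance_f * voltage_v ** 2 * freq_hz

base = dynamic_power_watts(1e-9, 1.0, 4e9)          # 4 GHz at 1.0 V
overclocked = dynamic_power_watts(1e-9, 1.2, 5e9)   # 5 GHz, which also needs more voltage
print(f"{base:.1f} W -> {overclocked:.1f} W")        # 4.0 W -> 7.2 W
```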

Cores

People noticed diminishing improvements in clock speeds in the early 2000s. Engineers needed another way to satisfy the insatiable need for processing power, so they turned to processing tasks in parallel. One approach is called symmetric multiprocessing (SMP).

SMP involves connecting 2 or more identical CPUs that share main memory, input/output (I/O) devices, and a single operating system. Especially intensive applications were programmed to take advantage of this new method of processing: multiple CPUs can be put to work on operations in parallel, greatly reducing the load on any single CPU and speeding up processing.
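
On the software side, a program has to be written to split its work up before extra processors help at all. Here’s a minimal Python sketch of that idea, spreading a toy workload across a few processes; it illustrates parallel processing in general, not how an operating system actually schedules SMP hardware:

```python
# Minimal parallel-processing sketch: split a toy workload across processes.
# This illustrates the software side of using multiple processors; it is not
# a model of how SMP hardware itself works.

from multiprocessing import Pool

def heavy_work(n: int) -> int:
    # Stand-in for a CPU-intensive task.
    return sum(i * i for i in range(n))

if __name__ == "__main__":
    chunks = [2_000_000] * 8  # eight independent chunks of work
    with Pool(processes=4) as pool:              # e.g. a 4-processor system
        results = pool.map(heavy_work, chunks)   # chunks are processed in parallel
    print(sum(results))
```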

As time moved on, applications grew increasingly demanding on the processor. Chipmakers decided to build on the idea of SMP, with one major twist: engineers integrated multiple CPUs into the same package. Those multiple CPUs, now called cores, function together as a single CPU. For example, a quad-core CPU has 4 individual CPUs working together to process tasks in parallel.

The first multi-core CPU to market was the IBM POWER4, released in 2001. The picture below shows 4 distinct sections of the chip, 2 of which are cores.

IBM POWER4 CPU

Simultaneous Multithreading

Processing power can be boosted further using a technique called simultaneous multithreading (SMT). It lets a single core run 2 threads at once to make better use of its resources and increase processing speeds. It’s explained very well by Linus Sebastian from Techquickie (although you might know him from Linus Tech Tips).

Imagine you (a CPU) are trying to eat food (process a task). Your goal is to finish the food as quickly as possible, but you only use one hand at a time to eat (single-threaded processing). You can’t have multiple mouths to eat the food (you’re restricted to single-core processing).

You realize that every time you finish swallowing, there’s a small amount of time where your mouth is empty. So you start using both of your hands (multi-threaded processing), so one hand is always queued up to put food in your mouth, greatly reducing the time your mouth sits empty. From the outside, it almost looks like you have 2 mouths.

Multi-threaded processing works the same way: a single core presents itself as 2 logical cores, streamlining operations and generally increasing processing speeds. However, it is NOT a universal improvement; it only helps applications that are written to use parallel processing.
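
As a very loose software analogy, here’s a sketch where 2 threads share one “core” so that the idle time in one can be filled by the other. Real SMT happens in hardware, sharing a single core’s execution units between 2 instruction streams, but the idea of filling idle slots is the same:

```python
# Loose software analogy for SMT: two threads share one "core" so that
# waiting time in one thread can be filled by work from the other.
# Real SMT is done in hardware by sharing a core's execution units.

import threading
import time

def eat_plate(bites: int) -> None:
    for _ in range(bites):
        time.sleep(0.01)  # "mouth is empty": idle time another thread can fill

threads = [threading.Thread(target=eat_plate, args=(50,)) for _ in range(2)]
start = time.time()
for t in threads:
    t.start()
for t in threads:
    t.join()
print(f"finished in {time.time() - start:.2f}s")  # ~0.5 s instead of ~1.0 s, because idle time overlaps
```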

Now, let’s talk about computers today.

How Can We Build Faster Computers?

To start, I assembled a graph of the best CPUs in each price class that could be bought in a specific time period. There are 4 classes: Entry-level, mid-range, high-end, and enthusiast-level, where cost is not a concern. I’ll compare CPUs from 2008 to 2016, then CPUs from 2016 to the present, by their general speed increase.

CPUs in 2016 vs. CPUs in 2008
CPUs in 2019 vs. CPUs in 2016

Although the time difference from 2008 to 2016 was 8 years and from 2016 to 2019 it was only 3 years, there was a similar improvement in processing speeds for both. What happened? It’s time for a history lesson.

There are 2 big CPU companies: Intel and AMD. They were extremely competitive with each other, releasing new products every year and one-upping each other. However, this only lasted until the late 2000s. AMD, under poor management, started losing huge amounts of money and falling behind in the CPU game. By the time the 2010s came around, Intel was the company to buy from.

Many analysts predicted AMD would go bankrupt very soon. Their stock price plummeted from around $40 per share to $2 per share. There was no longer any competition. Intel stopped innovating, and we saw 5% improvements generation over generation instead of the 20-something percent we were used to.

In 2014, a lot of high-ranking executives either left AMD or were fired, and many new executives joined the company. Among them was AMD’s new CEO, Lisa Su. Under her, AMD began work on a new architecture, called Zen, that would bring high core counts at a low price. The CPUs built on it would be sold under the Ryzen brand.

Left: A single CCX. Right: A rough diagram of a Ryzen CPU.

This is how Ryzen was built. Let’s ignore everything else and only look at the cores and the CCXs, also known as core complexes. Notice how the 2 CCXs, each housing 4 cores, are connected through Infinity Fabric, a high-speed interconnect, to create a single 8-core CPU.

At the time, this approach was practically unheard of. Adding cores to a CPU was difficult because every extra core makes the die larger, and larger dies are more likely to contain a manufacturing defect. AMD’s design greatly reduced that risk.

Intel’s approach was different from AMD’s, using a single monolithic die instead of 2 smaller ones. AMD’s strategy made computing much more scalable: instead of tackling the entire CPU head-on, they split it into 2 parts. This changed how CPUs were built, from low-core-count CPUs all the way up to high-core-count ones. Let’s start low.

There are usually two processing parts to a computer: the CPU and a graphics processing unit (GPU). Typically you’d buy a dedicated GPU from a company like Nvidia, but some CPUs come with the GPU integrated. Intel and AMD both make CPUs with integrated graphics, and AMD’s integrated graphics are normally superior. AMD created something they call an APU, basically a CPU with integrated graphics, made by putting a CCX and a graphics chip on a single package.

This labeled die shot shows many similarities with the Ryzen CPU above, except there’s only 1 CCX, and the integrated GPU takes the place of the other one. Now let’s jump to the best CPUs each company has to offer: server CPUs.

These are the best server CPUs AMD and Intel have to offer

AMD’s EPYC CPU consumes half the power and costs roughly a quarter of the price of the Intel Xeon CPU. In a market where 10% improvements are considered normal, this is insane. Again, AMD is undercutting Intel with a better method of constructing CPUs.

Intel Xeon CPU (left). AMD Epyc CPU (right).

Intel is back at it again with a monolithic die, unlike AMD. AMD is using a method of constructing CPUs called chiplets. The chiplet design involves manufacturing everything separately and then fitting it all together on the same package. While Intel’s approach involves stuffing all the cores onto one die, AMD splits everything up into 8 chunks of 8 cores each, for a total of 64, with an I/O die in the center.

There are numerous advantages to using chiplets to construct microprocessors, but the two largest are cost and scalability. The cost advantage of EPYC is clear from the numbers above, and scalability matters just as much. While Intel’s offering only goes up to 28 cores, AMD can go up to 64 cores, with roughly half the power consumption per core. As electricity prices are a major concern for servers, that kind of power saving makes a massive difference.
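
A simple way to see where the cost advantage comes from is a toy yield model: if defects land randomly on a wafer, the chance that a die comes out defect-free drops off quickly as the die gets bigger, so 8 small chiplets yield far better than one huge monolithic die. The defect density below is an assumed, purely illustrative number:

```python
# Toy yield model: probability a die has no defects, assuming defects land
# randomly on the wafer (Poisson model). The defect density is an
# illustrative assumption, not real fab data.

import math

DEFECTS_PER_MM2 = 0.001  # assumed defect density, purely for illustration

def die_yield(area_mm2: float) -> float:
    """Probability that a die of the given area has zero defects."""
    return math.exp(-DEFECTS_PER_MM2 * area_mm2)

monolithic = die_yield(640)   # one big 640 mm^2 die
chiplet = die_yield(80)       # one 80 mm^2 chiplet (8 of these cover a similar area)
print(f"monolithic yield: {monolithic:.0%}")   # ~53%
print(f"per-chiplet yield: {chiplet:.0%}")     # ~92%
```

The key point is that a defect ruins only one small chiplet instead of an entire 64-core die, so far more of each silicon wafer ends up sellable.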

AMD has made a major comeback. Their success is reflected in their stock price, which has risen over 1,500% over the past 4 years to about $30 at the time of writing. They’ve successfully implemented a new way of making CPUs that is disrupting the industry.

That’s all. In 10-ish minutes, you’ve read a comeback story, an information-packed article, and an unpaid advertisement for AMD. Chiplets are an underrated topic, and their full potential has yet to be exploited. All I can say is that there is no better time to be in the CPU market.

Key Takeaways

  • Triodes are a type of vacuum tube that acts as a switch based on electron movement inside.
  • Transistors are a much smaller, more efficient replacement for triodes.
  • Companies are having trouble miniaturizing transistors, so they’re finding new ways to boost performance, such as adding more cores.
  • CPU performance improvements plateaued in the early 2010s but picked back up in the late 2010s.
  • Chiplets are the future! For now.
