What you should know about GPUs Part 1 — History & Industry

Alex Reinhart
Dec 26, 2023


As someone who likes to understand things deeply, I put together this research to understand AI chips, GPUs and the innovation around them. It was particularly hard to dig up real knowledge because of the abundance of non-substantive content on the subject.

Enjoy the information I found deep in Reddit rabbit-holes and TSMC blog posts.

Part 1 — History & Industry

Part 2 — The Rise of AI

Part 3 — The Paradigm Shift of Compute

Origin of the GPU

The first Graphics Processing Units (GPUs) were created to render visuals in arcade games. As these techniques evolved, the GPU became the heart of graphics compute in everything from complex multiplayer gaming systems to the display of your Mac or PC.

In the early 2000s, researchers at Stanford discovered the relationship between graphics processing and other compute-intensive tasks such as AI, and later partnered with NVIDIA to classify cats, running one of the first ML models (an image classifier) on a GPU. The GPU generalizes so well because it excels at SIMD (Single Instruction, Multiple Data) operations.
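To make SIMD concrete, here is a minimal sketch in Python using NumPy (my choice for illustration; neither is named above): a single add instruction is applied across an entire array of data at once, the same pattern a GPU applies across thousands of values in hardware.

import numpy as np

# Scalar style: the add is issued once per element, one at a time.
a = list(range(1_000_000))
b = list(range(1_000_000))
scalar_sum = [x + y for x, y in zip(a, b)]

# SIMD style: one vectorized add is applied to every element at once.
# NumPy dispatches this to optimized vector code; a GPU pushes the same
# idea further by spreading the work across thousands of cores.
a_vec = np.arange(1_000_000)
b_vec = np.arange(1_000_000)
vector_sum = a_vec + b_vec

assert int(vector_sum[123]) == scalar_sum[123]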

Why is the GPU so good at math?

A CPU is the brain of a computer, and a GPU is the calculator. The CPU controls everything: it saves to memory and decides when work is sent to the GPU for evaluation. A core is the component that does the math, the calculator’s processor. A core is full of transistors that represent numbers in binary. Cores run the boolean operations AND, OR, NOT, and XOR (exclusive OR), which can be combined to perform any multiplication, division, addition, or subtraction. Read a cool description of how this works in a calculator here. The number of cores determines how many operations can run simultaneously.
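As a rough sketch of how boolean operations become arithmetic, the toy Python below builds a one-bit adder out of nothing but AND, OR, and XOR, then chains it into a four-bit adder. Real hardware does the same thing with transistors; the logic is identical.

def full_adder(a, b, carry_in):
    # Add two bits plus a carry using only XOR, AND, and OR.
    s = a ^ b ^ carry_in                        # XOR produces the sum bit
    carry_out = (a & b) | (carry_in & (a ^ b))  # AND/OR produce the carry bit
    return s, carry_out

def add_4bit(x, y):
    # Ripple-carry addition of two 4-bit numbers, one bit at a time.
    result, carry = 0, 0
    for i in range(4):
        bit, carry = full_adder((x >> i) & 1, (y >> i) & 1, carry)
        result |= bit << i
    return result

print(add_4bit(5, 6))  # prints 11: addition built entirely from boolean gates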

This massive parallelization through cores is why GPUs excel. There are 5,120 cores on a V100 (an NVIDIA data-center GPU), while a high-end server CPU has around 48 cores: roughly 100x the core count. Though core count is a useful and frequent comparison, it is not quite apples to apples. GPU cores support a much smaller instruction set, while CPU cores are more versatile. See here for an in-depth explanation.

The use of GPUs for AI was brought to the mass market via NVIDIA’s release of CUDA (Compute Unified Device Architecture) in 2006. CUDA is the toolkit that connects traditional programming languages (Python, Java, C++) to the machine code required to leverage a GPU’s compute. With this innovation, NVIDIA sparked a revolution of GPU adoption in data centers over the following 10 years.

Nvidia Quarterly Revenue by Category
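To show what that connection looks like in practice, here is a minimal sketch using Numba’s CUDA bindings for Python (my choice of wrapper; the article does not name one). It assumes an NVIDIA GPU and the CUDA toolkit are installed.

import numpy as np
from numba import cuda  # Numba compiles the decorated function to GPU machine code

@cuda.jit
def vector_add(a, b, out):
    i = cuda.grid(1)      # this thread's global index
    if i < out.size:      # guard threads that fall past the end of the array
        out[i] = a[i] + b[i]

n = 1_000_000
a = np.random.rand(n).astype(np.float32)
b = np.random.rand(n).astype(np.float32)
out = np.zeros_like(a)

threads_per_block = 256
blocks = (n + threads_per_block - 1) // threads_per_block
vector_add[blocks, threads_per_block](a, b, out)  # one GPU thread per element

assert np.allclose(out, a + b)

The decorator and the launch syntax are the toolkit doing exactly what the paragraph above describes: turning high-level code into instructions the GPU can run across thousands of threads at once.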

CUDA turned the GPU (Graphics Processing Unit) into the GP-GPU (General-Purpose Graphics Processing Unit). Want more? See these articles for more on the GPU and its use for AI & ML.

General-Purpose GPUs (Pre-AI)

To understand the innovations in GPU technology, it is useful to understand the dynamics of the semiconductor industry.

The Semiconductor Industry

Most of us know Intel, Samsung, TSMC, and most recently NVIDIA, but it is critical to understand how these players interact in order to understand the categories of innovation.

The IDMs (Integrated Device Manufacturers, which both design and fabricate their own chips) are the monstrous organizations that ruled semiconductor production during the tech proliferation of the 90s. Nowadays, cutting-edge chips most often come from fabless design companies working in partnership with dedicated foundries.

Fun fact — AMD used to manufacture their own chips, but they spun off GlobalFoundries in 2009. Former AMD CEO Jerry Sanders once remarked that “real men have fabs.” Not anymore, Jerry.

In GPU innovation, the cutting edge of chip design comes from NVIDIA and AMD. The advancement of fitting more compute into smaller spaces (transistor density) while maintaining manufacturability comes from foundry innovation, namely at TSMC.

Design Innovation

The innovation in design has come from engineering components and transistors, arranging wires closer and closer together while avoiding crosstalk (where adjacent wires interfere with each other’s signals), and managing heat.

Design companies also spend time innovating on the interface through which software engineers access GP-GPU compute. Examples range from the CUDA toolkit to complete applications designed to leverage the GPU.

NVIDIA Apps

Chip Hardware Innovation

Chip hardware innovation is largely owned by the foundries, who work to push the boundary of what is possible in terms of size and manufacturability. The underlying philosophy in semiconductor design has been Moore’s Law — that the number of transistors in an integrated circuit (IC) doubles every two years.
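As a back-of-the-envelope illustration of that doubling, here is a small Python sketch. The 1971 starting figure is a commonly cited transistor count for the Intel 4004, used here only for scale; the output is a projection, not measured chip data.

def projected_transistors(start_count, start_year, year):
    # Moore's Law as stated above: transistor count doubles every two years.
    doublings = (year - start_year) / 2
    return start_count * 2 ** doublings

for year in (1971, 1991, 2011, 2021):
    print(year, round(projected_transistors(2_300, 1971, year)))
# Roughly 2.3 thousand in 1971, ~2.4 billion by 2011, ~77 billion by 2021.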

Interesting excerpt from a TSMC article (2019):

Some people believe that Moore’s Law is dead because they believe it is no longer possible to continue to shrink the transistor any further. Just to give you an idea of the scale of the modern transistor, the typical gate is about 20 nanometers long. A water molecule is 0.275 nanometer in diameter! You can now start counting the number of atoms in a transistor. At this scale, many factors limit the fabrication of the transistor.

The primary challenge is the control of materials at the atomic level. How do you place individual atoms to create a transistor? How do you do this for billions of transistors found on a modern chip? How do you build these chips that have billions of transistors in a cost effective manner?
