CPU / GPU / TPU / DPU / QPU

Eugene Evstafev
17 min read · Apr 3, 2024


Slaves digging for quartz

In distant lands, slaves toiled away, digging deep into the earth in search of a precious gem: quartz. This seemingly ordinary mineral is silicon dioxide in crystalline form, and locked within it is the element at the heart of modern computing: silicon. The discovery of quartz and its hidden potential marked the first step in the journey towards the modern computing era.

Refining and cooking quartz into silicon substrate

The journey from quartz to silicon substrate is a fascinating tale of alchemy and innovation. Quartz is already silicon dioxide in nearly pure form, so once it was extracted it underwent a meticulous refining process: removing impurities and preparing the mineral for the extreme heat and chemical reactions that would strip away its oxygen and leave only silicon behind.

The refined silicon dioxide was then "cooked": heated in a furnace in the presence of carbon, which reacted with the oxygen and left behind nearly pure silicon. That silicon was further purified, grown into large single crystals, cooled, and sliced into thin wafers, creating the silicon substrate that would become the foundation of modern computing.

Inscribing microscopic symbols on silicon substrate

Once the silicon substrate was prepared, it was time for the next step in the journey: inscribing microscopic symbols onto its surface. These symbols, known as transistors, form the building blocks of modern computing. The process of creating these transistors is a delicate dance of precision and technology.

Electrical engineers, or “shamans” in our magical analogy, used a process called photolithography to create these microscopic symbols. This process involved projecting patterns of ultraviolet light onto the silicon substrate, which had been coated with a light-sensitive material. The exposed areas of the material were then chemically etched away, leaving behind the desired pattern of transistors.

Each transistor acts as a switch, allowing it to represent a binary digit (or “bit”) of information. By arranging billions of these transistors in intricate patterns, engineers were able to create the complex circuits that power modern computers.
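
To make that concrete, here is a minimal sketch in Python (a toy model, not how real hardware is designed) that treats a transistor as an on/off switch, wires a few of those switches into NAND gates, and combines the gates into a one-bit half adder, the kind of building block that, repeated billions of times, becomes a real circuit.

```python
# A transistor modelled as a voltage-controlled switch: the "gate" input
# decides whether current may flow from source to drain.
def transistor(gate: bool) -> bool:
    return gate  # conducts (True) only when the gate is driven high

# Two transistors in series give a NAND gate, a universal building block:
# the output goes low only when *both* inputs conduct.
def nand(a: bool, b: bool) -> bool:
    return not (transistor(a) and transistor(b))

# Every other gate can be built from NAND alone.
def not_(a: bool) -> bool:
    return nand(a, a)

def and_(a: bool, b: bool) -> bool:
    return not_(nand(a, b))

def xor(a: bool, b: bool) -> bool:
    m = nand(a, b)
    return nand(nand(a, m), nand(b, m))

# A half adder: adds two one-bit numbers, producing a sum bit and a carry bit.
def half_adder(a: bool, b: bool):
    return xor(a, b), and_(a, b)

if __name__ == "__main__":
    for a in (False, True):
        for b in (False, True):
            s, c = half_adder(a, b)
            print(f"{int(a)} + {int(b)} = carry {int(c)}, sum {int(s)}")
```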

Speaking the binary language through lightning

Once the silicon substrate was inscribed with microscopic symbols, it was time to bring these inscrutable markings to life. This was achieved by harnessing the power of electricity, specifically in the form of lightning-fast electrical signals.

When an electrical current passes through a transistor, it can either be allowed to flow freely or be blocked entirely. This on-or-off behavior is the foundation of the binary language that computers use to process information. By manipulating the flow of electrical current through the intricate patterns of transistors, engineers were able to create a system that could speak the language of binary.
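
As a rough illustration, the pattern of on and off states across eight such switches can be read as a byte, and that byte can stand for a number or a letter. The snippet below is only a sketch of the idea, not how a processor actually reads memory:

```python
# Eight on/off states, written most-significant bit first.
switches = [0, 1, 0, 0, 1, 0, 0, 0]

# Fold the bits into an integer: each step shifts left and adds the next bit.
value = 0
for bit in switches:
    value = (value << 1) | bit

print(value)        # 72
print(chr(value))   # 'H' -- the same bits, read as an ASCII character
```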

This binary language, composed of ones and zeros, may seem incomprehensible to the human mind. However, highly trained “wizards” known as software engineers have learned to communicate with these machines, using this binary language to build powerful programs and applications.

In this way, the seemingly magical process of transforming quartz into a silicon substrate, inscribing it with microscopic symbols, and bringing those symbols to life through electrical signals, has enabled the creation of the modern computer — a device capable of processing vast amounts of information and shaping the way we think and interact with the world.

Building powerful machines using binary language

With the ability to speak the binary language, software engineers could now harness the power of these intricate circuits to create complex machines. By writing programs that manipulated the flow of electrical current through the transistors, they could make the machine perform a wide variety of tasks.

These tasks ranged from simple arithmetic calculations to complex simulations of real-world phenomena. By combining the power of binary language with the versatility of silicon substrate, engineers were able to create machines that could process information at an unprecedented scale.

As the complexity of these machines grew, so too did their ability to create illusions. These illusions took the form of graphical user interfaces, video games, and virtual reality experiences. By manipulating the binary language, engineers could create immersive worlds that captivated the imagination and transformed the way people interacted with technology.

In this way, the journey from quartz to binary language enabled the creation of powerful machines that could not only process vast amounts of information but also shape the way people think and act in the real world.

Creating illusions to control people’s thoughts and actions

The power of computing lies not only in its ability to process information but also in its capacity to create immersive experiences that can shape people’s thoughts and actions. These experiences, often referred to as “illusions,” are created by manipulating the binary language that underlies modern computing.

One of the most prominent examples of these illusions is the graphical user interface (GUI). By arranging pixels on a screen in specific patterns, software engineers can create the illusion of a desktop, complete with folders, files, and icons. This illusion makes it easier for people to interact with their computers, as they can use intuitive gestures such as clicking and dragging to perform tasks.
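
As a toy illustration of the idea (nothing like a real windowing system), a screen can be modelled as a grid of pixel values, and an "icon" is simply a particular pattern written into that grid:

```python
# A tiny 8x8 "framebuffer": 0 = background pixel, 1 = lit pixel.
WIDTH, HEIGHT = 8, 8
framebuffer = [[0] * WIDTH for _ in range(HEIGHT)]

# "Draw" a crude folder icon by setting individual pixels.
folder = [
    "11110000",
    "11111111",
    "10000001",
    "10000001",
    "10000001",
    "11111111",
]
for y, row in enumerate(folder):
    for x, pixel in enumerate(row):
        framebuffer[y][x] = int(pixel)

# What the user perceives as an icon is nothing more than these values.
for row in framebuffer:
    print("".join("█" if p else "·" for p in row))
```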

Another example of these illusions is video games. By rendering 3D models, creating realistic physics simulations, and generating lifelike sound effects, software engineers can create immersive worlds that captivate players and provide hours of entertainment. These illusions can also have educational value, as they can teach players about history, science, and other subjects in an engaging and interactive way.

Virtual reality (VR) is perhaps the most immersive of these illusions. By using specialized hardware such as headsets and gloves, VR can create the illusion of being in a completely different environment. This illusion can be used for a variety of purposes, including training, therapy, and entertainment.

In each of these cases, the illusion is created by manipulating the binary language that underlies modern computing. By carefully controlling the flow of electrical current through the transistors on a silicon substrate, software engineers can create experiences that captivate the imagination and transform the way people interact with technology.

Computing things at the hardware level

To truly understand the magic of computing, we must delve into the realm of hardware. At its core, a computer is a machine that manipulates data using electronic circuits. These circuits are built from various components, each with its unique role in the computational process.

Chief among these components are the processing units. Modern machines rely on several kinds, each with its own strengths: the central processing unit (CPU) for general-purpose, sequential work; the graphics processing unit (GPU) for massively parallel computation; the tensor processing unit (TPU) for machine learning workloads; and the data processing unit (DPU) for moving data around. By understanding the roles and capabilities of these different processing units, we can appreciate the intricate dance of hardware and software that enables modern computing. The sections below look at each of them in turn.

Types of processing units: CPU, GPU, TPU, and DPU

In the realm of computing, there are four primary types of processing units that handle different tasks and workloads. Each of these processing units has its unique strengths and weaknesses, making them suitable for specific applications.

Central Processing Unit (CPU)

The central processing unit, or CPU, is the primary processing unit in a computer. It is responsible for executing instructions, performing calculations, and managing other hardware components. CPUs are optimized for sequential computations, making them ideal for tasks that require extensive branching and logic. Modern CPUs often have multiple cores, allowing them to perform multiple tasks simultaneously and improve overall performance.

Graphics Processing Unit (GPU)

The graphics processing unit, or GPU, is a specialized processing unit designed for parallel computing. Unlike CPUs, GPUs have thousands of smaller cores that can handle multiple tasks simultaneously. This makes them highly efficient at rendering graphics and performing other tasks that require large-scale matrix operations, such as video encoding and decoding, scientific simulations, and machine learning.

Tensor Processing Unit (TPU)

The tensor processing unit, or TPU, is a relatively new type of processing unit specifically designed for machine learning applications. TPUs are optimized for tensor operations, which are essential for training and running neural networks. By offloading these tasks from the CPU or GPU, TPUs can significantly improve the performance and energy efficiency of machine learning workloads.

Data Processing Unit (DPU)

The data processing unit, or DPU, is another specialized component designed to handle data-intensive tasks such as networking, storage, and security. DPUs are optimized for moving data around, relieving the CPU from data processing jobs so it can focus on general-purpose computing. This allows for improved performance and energy efficiency in data-centric applications.

First programmable computer: Z1

The first truly programmable computer, the Z1, was built by Konrad Zuse between 1936 and 1938 in his parents' Berlin apartment. This highly mechanical machine, with over twenty thousand parts, represented binary data with sliding metal sheets. It could perform Boolean logic and floating-point arithmetic, and it ran at a clock rate of about 1 Hertz, meaning it could execute roughly one instruction per second. To put that in perspective, modern CPUs are measured in gigahertz: billions of cycles per second.

Von Neumann architecture

The Von Neumann architecture, named after the mathematician John von Neumann, is a fundamental design that describes how data and instructions are stored and handled in a computer. This architecture, which is still used in modern computers today, was introduced in 1945 and revolutionized the way computers functioned.

In the Von Neumann architecture, both data and instructions are stored in the same memory space. This allows the central processing unit (CPU) to access and manipulate them efficiently. The CPU fetches instructions from memory, decodes them, and then executes them. This process, known as the fetch-decode-execute cycle, is the basis of modern computing.

The Von Neumann architecture also introduced the concept of a stored-program computer, where the program and data are stored in the same memory. This allowed for greater flexibility and ease of programming, as the same machine could be used for multiple tasks by simply changing the program stored in memory.
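
A short Python sketch can make the stored-program idea and the fetch-decode-execute cycle concrete. The tiny instruction set below is invented purely for illustration; real machines differ, but the loop is the same: program and data share one memory, and the processor repeatedly fetches the next instruction, decodes it, and executes it.

```python
# A toy von Neumann machine: instructions and data live in the same memory.
# Each instruction is an (opcode, operand) pair; data cells are plain integers.
memory = [
    ("LOAD", 8),    # 0: load the value at address 8 into the accumulator
    ("ADD", 9),     # 1: add the value at address 9
    ("STORE", 10),  # 2: write the accumulator back to address 10
    ("PRINT", 10),  # 3: print the value at address 10
    ("HALT", 0),    # 4: stop
    0, 0, 0,        # 5-7: unused
    40,             # 8: data
    2,              # 9: data
    0,              # 10: result goes here
]

accumulator = 0
pc = 0  # program counter: address of the next instruction

while True:
    opcode, operand = memory[pc]   # fetch
    pc += 1
    if opcode == "LOAD":           # decode + execute
        accumulator = memory[operand]
    elif opcode == "ADD":
        accumulator += memory[operand]
    elif opcode == "STORE":
        memory[operand] = accumulator
    elif opcode == "PRINT":
        print(memory[operand])     # prints 42
    elif opcode == "HALT":
        break
```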

The Von Neumann architecture has had a lasting impact on the field of computing and continues to shape the design of modern computers. Its introduction marked a significant milestone in the evolution of computing, paving the way for the development of more advanced and powerful machines.

Invention of the transistor

The invention of the transistor in 1947 was a groundbreaking moment in the history of computing. This small, yet powerful device revolutionized the electronics industry and laid the foundation for modern computing.

A transistor is a semiconductor that can amplify or switch electrical signals. It works by using a small electrical current to control a larger one. This ability to control the flow of electricity made transistors an ideal replacement for vacuum tubes, which were large, expensive, and prone to failure.

The invention of the transistor was made possible by the work of three scientists at Bell Labs: John Bardeen, Walter Brattain, and William Shockley. They were working on a project to develop a solid-state amplifier when they discovered that a small piece of germanium could be used to control the flow of electricity. This discovery led to the creation of the first transistor, which was made of germanium and had three terminals: the emitter, base, and collector.

The transistor quickly became a popular alternative to vacuum tubes, as it was smaller, more reliable, and consumed less power. This made it possible to build smaller, more portable electronic devices, such as radios and televisions.

The invention of the transistor also had a profound impact on the field of computing. Transistors made it possible to build smaller, more powerful computers that could perform complex calculations quickly and accurately. This led to the development of the first integrated circuits, which combined multiple transistors onto a single chip.

Today, transistors are an essential component of modern electronics, and they are used in everything from smartphones to supercomputers. The invention of the transistor was a pivotal moment in the evolution of computing, and it continues to shape the way we live and work today.

Development of the integrated circuit

The development of the integrated circuit (IC) was a major milestone in the evolution of computing. In 1958, Jack Kilby of Texas Instruments demonstrated the first working integrated circuit, and a few months later Robert Noyce of Fairchild Semiconductor independently developed a practical silicon-based version, allowing multiple transistors to be placed on a single chip.

Before the invention of the IC, electronic circuits were built using individual components, such as transistors, resistors, and capacitors, which were connected together using wires. This process was time-consuming, expensive, and prone to errors. The invention of the IC made it possible to create complex circuits using a single piece of silicon, which reduced the size, cost, and complexity of electronic devices.

The first integrated circuits contained only a few transistors, but as technology improved, it became possible to pack more and more transistors onto a single chip. Today, modern ICs can contain billions of transistors, which allows them to perform incredibly complex calculations quickly and efficiently.

The development of the IC also had a profound impact on the field of computing. It made it possible to build smaller, more powerful computers that could perform complex calculations quickly and accurately. This led to the development of the first microprocessors, which combined the functions of a central processing unit (CPU) onto a single chip.

Microprocessors revolutionized the computing industry, making it possible to build personal computers, smartphones, and other electronic devices that are now an integral part of our daily lives. The development of the IC was a crucial step in the evolution of computing, and it continues to shape the way we live and work today.

First commercially available microprocessor

The first commercially available microprocessor, the Intel 4004, was released in 1971. This 4-bit processor was a marvel of engineering, containing approximately 2,300 transistors and capable of handling four bits of data at a time. With a clock speed of 740 kilohertz, it was incredibly fast for its time.

The Intel 4004 was originally designed for use in calculators, but its success proved the concept of the general-purpose microprocessor. Intel's more powerful successors, such as the 8080 that powered early personal computers like the Altair 8800, soon followed and helped to usher in the era of the microcomputer.

The development of the microprocessor was a major milestone in the evolution of computing. By combining the functions of a central processing unit (CPU) onto a single chip, it made it possible to build smaller, more powerful computers that could perform complex calculations quickly and accurately. Today, microprocessors are an essential component of modern electronics, and they are used in everything from smartphones to supercomputers.

CPU’s role as the brain of a computer

The central processing unit (CPU) is often referred to as the “brain” of a computer. This analogy is fitting, as the CPU is responsible for executing instructions, performing calculations, and managing other hardware components. It acts as the primary processing unit in a computer, coordinating the activities of other components to ensure that tasks are completed efficiently and accurately.

Modern CPUs are optimized for sequential computations, making them ideal for tasks that require extensive branching and logic. They can handle complex algorithms and make decisions based on the data they process. This makes CPUs essential for running operating systems, executing programs, and managing hardware resources.

CPUs have access to the system’s random access memory (RAM) and include a hierarchy of caches on the chip itself for faster data retrieval. This allows them to quickly access the data they need to perform calculations and make decisions.

In addition to their role as the primary processing unit, CPUs also play a crucial role in security. They are responsible for enforcing access controls, managing encryption and decryption, and ensuring that only authorized software is running on the system.

As the “brain” of a computer, the CPU is a vital component that enables modern computing. Its ability to perform complex calculations quickly and accurately has made it possible to create powerful machines that can process vast amounts of data and transform the way we live and work.

Optimization of CPU for sequential computations

As described above, the CPU is the "brain" of the machine, responsible for executing instructions, performing calculations, and managing other hardware components. It is optimized for sequential computations, which makes it ideal for tasks that require extensive branching and logic.

Sequential computations involve performing a series of operations one after the other, with each operation depending on the result of the previous one. This type of computation is common in many applications, such as navigation software that needs to compute the shortest possible route between two points. The algorithm used in such applications may have a lot of conditional logic, such as if-else statements, that can only be computed one by one or sequentially.
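
The route-finding example is exactly the kind of branch-heavy, step-by-step work a CPU handles well. Below is a brief sketch of Dijkstra's shortest-path algorithm in Python over a made-up road graph; each iteration depends on the result of the previous one, so the work cannot simply be spread across thousands of simple cores.

```python
import heapq

# A tiny road network: node -> list of (neighbour, distance) pairs (made up).
roads = {
    "A": [("B", 5), ("C", 2)],
    "B": [("D", 4)],
    "C": [("B", 1), ("D", 7)],
    "D": [],
}

def shortest_distance(start, goal):
    """Classic Dijkstra: inherently sequential, full of data-dependent branches."""
    distances = {node: float("inf") for node in roads}
    distances[start] = 0
    queue = [(0, start)]

    while queue:                              # each step depends on the last
        dist, node = heapq.heappop(queue)
        if node == goal:                      # conditional logic everywhere...
            return dist
        if dist > distances[node]:            # ...which CPUs are built to handle
            continue
        for neighbour, weight in roads[node]:
            candidate = dist + weight
            if candidate < distances[neighbour]:
                distances[neighbour] = candidate
                heapq.heappush(queue, (candidate, neighbour))
    return float("inf")

print(shortest_distance("A", "D"))  # 7, via A -> C -> B -> D
```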

CPUs are optimized for this type of work, as they can handle complex algorithms and make decisions based on the data they process. They can quickly access the data they need to perform calculations and make decisions, thanks to their access to the system’s random access memory (RAM) and the hierarchy of caches on the chip itself.

Modern CPUs also have multiple cores, which allows them to perform multiple tasks simultaneously and improve overall performance. However, adding more cores comes with its own set of challenges, such as increased power consumption and heat dissipation requirements. As a result, CPU cores are expensive, and there is a limit to how many can be added to a single chip.

Despite these limitations, CPUs remain an essential component of modern computing, thanks to their ability to perform complex calculations quickly and accurately. They are the “brain” of the computer, coordinating the activities of other hardware components and ensuring that tasks are completed efficiently and accurately.

Limitations of CPU and the need for other processing units

While CPUs are optimized for sequential computations, they have limitations when it comes to certain workloads. Tasks that require massive parallelism, such as rendering graphics or training machine learning models, are difficult for CPUs to handle efficiently: each core works through its instructions largely one at a time, and adding more cores only goes so far in improving performance.

As a result, other types of processing units have been developed to handle specific types of workloads. For example, graphics processing units (GPUs) are optimized for parallel computing and can handle large-scale matrix operations much more efficiently than CPUs. Tensor processing units (TPUs) are specialized hardware designed specifically for machine learning applications, and can perform tensor operations much faster than CPUs or GPUs.

The need for these specialized processing units has become increasingly important as the demands on computing systems continue to grow. For example, the rise of artificial intelligence and machine learning has led to a need for processing units that can handle large-scale matrix operations quickly and efficiently. Similarly, the increasing demand for high-quality graphics in video games and other applications has led to a need for processing units that can render graphics in real-time.

In summary, while CPUs remain an essential component of modern computing, they have limitations when it comes to handling certain types of workloads. As a result, other types of processing units, such as GPUs and TPUs, have been developed to handle specific types of workloads more efficiently. These specialized processing units are becoming increasingly important as the demands on computing systems continue to grow.

GPU’s optimization for parallel computing

A graphics processing unit (GPU) is a specialized processing unit designed for parallel computing. Unlike a CPU with a measly 16 cores, modern GPUs like Nvidia’s RTX 4080 have nearly 10,000 cores, each capable of handling a floating-point or integer computation per cycle. This allows games to perform tons of linear algebra in parallel to render graphics instantly every time you push a button on your controller.

GPUs are also essential for training deep learning models that perform tons of matrix multiplication on large data sets. This has led to massive demand in the GPU market, and companies like Nvidia have seen their stock prices soar.

However, not all cores are created equal. A single CPU core is far faster than a single GPU core, and its architecture can handle complex logic and branching, whereas a GPU is only designed for simple computations. This means that while GPUs are great for parallel computing tasks, they are not ideal for everything.
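
To see why graphics and deep learning map so well onto thousands of simple cores, consider matrix multiplication: every element of the output is an independent dot product, so in principle they could all be computed at once. The NumPy sketch below only illustrates that structure on a CPU; a GPU library would distribute those independent computations across its many cores.

```python
import numpy as np

# Two matrices of the kind that show up in graphics transforms and neural nets.
rng = np.random.default_rng(0)
A = rng.standard_normal((256, 256))
B = rng.standard_normal((256, 256))

# Each output element C[i, j] is an independent dot product of a row and a
# column: 256 * 256 = 65,536 little computations with no dependencies
# between them. That independence is exactly what a GPU exploits.
C_manual = np.empty((256, 256))
for i in range(256):
    for j in range(256):
        C_manual[i, j] = A[i, :] @ B[:, j]

# The library call computes the same thing; on a GPU this work would be
# spread across thousands of cores instead of executed loop by loop.
C = A @ B
print(np.allclose(C, C_manual))  # True
```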

TPU’s design for tensor operations

The Tensor Processing Unit (TPU) is a specialized hardware designed for machine learning applications, specifically for tensor operations. Developed by Google in 2016, TPUs integrate directly with their TensorFlow software. A TPU contains thousands of multiply-accumulators, which allow the hardware to perform matrix multiplication without the need to access registers or shared memory like a GPU would. This design makes TPUs highly efficient for deep learning tasks, saving millions of dollars in training time for large neural networks.
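
A multiply-accumulator does one very simple thing: multiply two numbers and add the result to a running total. The Python sketch below is purely illustrative (a real TPU wires these units into a hardware grid and runs them in parallel), but it shows how a matrix product reduces to nothing more than repeated multiply-accumulate steps.

```python
# Multiply-accumulate (MAC): the single primitive a TPU repeats thousands of
# times in parallel: acc <- acc + a * b
def mac(acc, a, b):
    return acc + a * b

def matmul_with_macs(A, B):
    """Matrix product built from nothing but MAC operations."""
    rows, inner, cols = len(A), len(B), len(B[0])
    C = [[0.0] * cols for _ in range(rows)]
    for i in range(rows):
        for j in range(cols):
            for k in range(inner):  # a TPU performs these MACs concurrently in hardware
                C[i][j] = mac(C[i][j], A[i][k], B[k][j])
    return C

A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]
print(matmul_with_macs(A, B))  # [[19.0, 22.0], [43.0, 50.0]]
```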

DPU’s optimization for moving data around

The Data Processing Unit (DPU) is a new type of processing unit that is specifically designed for moving data around. Unlike CPUs, which are optimized for general-purpose computing, and GPUs, which are optimized for parallel computing, DPUs are optimized for data-intensive tasks such as networking, storage, and security.

DPUs are most similar to CPUs and are typically based on the ARM architecture. However, they are highly optimized for moving data around, handling networking functions like packet processing, routing, and security, and dealing with data storage like compression and encryption. The main goal of DPUs is to relieve the CPU from any data processing jobs so it can focus on living its best life by doing general-purpose computing.

DPUs are becoming increasingly important as the amount of data being generated and processed continues to grow. They are designed to handle the massive amounts of data that are being generated by modern applications, such as artificial intelligence, machine learning, and the Internet of Things (IoT). By offloading data processing tasks from the CPU, DPUs can improve performance, reduce latency, and increase energy efficiency.
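
The following Python sketch is only an analogy: it performs the kind of per-packet work (compression and checksumming) that would otherwise burn CPU cycles, which is precisely the sort of job a DPU takes over in hardware.

```python
import zlib

# A batch of "packets"; in a real system these would arrive from the network card.
packets = [
    (f"sensor-reading {i}: temperature={20 + i % 5};" * 20).encode()
    for i in range(1000)
]

# Per-packet data-plane work: compress the payload and compute an integrity
# checksum. Done on the CPU here; a DPU offloads exactly this kind of work.
processed = []
for payload in packets:
    compressed = zlib.compress(payload)
    checksum = zlib.crc32(payload)
    processed.append((compressed, checksum))

total_in = sum(len(p) for p in packets)
total_out = sum(len(c) for c, _ in processed)
print(f"{len(processed)} packets processed, {total_in} -> {total_out} bytes")
```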

In short, by taking over networking, storage, and security work, the DPU frees the CPU for general-purpose computing, improving performance, reducing latency, and increasing energy efficiency in data-centric systems.

Quantum Processing Unit (QPU) and its potential impact

As we reach the end of our journey through the evolution of computing, we arrive at the most exciting and potentially transformative frontier: quantum computing. A Quantum Processing Unit (QPU) is a type of processor that leverages the principles of quantum mechanics to perform computations. Unlike classical computers that use bits to represent information, quantum computers use quantum bits or qubits.

Qubits have unique properties, such as superposition and entanglement, which allow them to represent multiple states simultaneously and be interconnected in ways that classical bits cannot. These properties enable quantum computers to solve certain problems much faster than classical computers. For example, quantum algorithms like Shor’s algorithm can factor large numbers exponentially faster than the best-known classical algorithms, potentially breaking current encryption methods.
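
A rough sketch of the underlying mathematics, simulated classically with NumPy (and therefore with none of the speedup of real quantum hardware), shows what superposition means: a qubit's state is a vector of two complex amplitudes, a Hadamard gate puts it into an equal superposition, and the squared amplitudes give the probabilities of measuring 0 or 1.

```python
import numpy as np

# A qubit state is a length-2 complex vector of amplitudes for |0> and |1>.
ket0 = np.array([1, 0], dtype=complex)          # the classical-like state |0>

# The Hadamard gate rotates |0> into an equal superposition of |0> and |1>.
H = np.array([[1, 1],
              [1, -1]], dtype=complex) / np.sqrt(2)

state = H @ ket0
probabilities = np.abs(state) ** 2              # Born rule: |amplitude|^2

print(state)          # [0.707..., 0.707...]
print(probabilities)  # [0.5, 0.5] -- equal chance of measuring 0 or 1
```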

However, quantum computing is still in its early stages, and significant challenges remain, such as error correction and scalability. Nonetheless, the potential impact of quantum computing is enormous, with applications in fields like cryptography, optimization, drug discovery, and materials science. As research and development continue, we may soon witness a new era of computing that transcends the boundaries of classical computing.
