Long Live the Semiconductor
New forms of semiconductor scaling, as well as new forms of software-hardware interactions, will shepherd us into the next era of computing, one in which massive data processing is instant, communication is flawless, and unfathomably complex computation can take place anywhere.
by Charlie Wood for The Engine
Despite being largely invisible and embedded within our devices, semiconductors now form a system as essential as roads or the electrical grid. So many facets of our daily lives — not to mention our future prospects — rely critically on these glimmering objects, and the streams of electrons alternately passing and not passing through their unimaginably thin channels. Pocket-dwelling supercomputers have granted us superpowers, letting us hail cars, identify music, and take photographs sharpened by artificial intelligence. Computational prowess has also advanced our understanding of the world, giving us models of how space quivers when black holes collide and more accurate forecasts of a hurricane’s course. Whether we will be able to keep expanding this arsenal of awesome powers at the same clip depends largely on semiconductor technology, a half-trillion-dollar industry that remains obscure to the average person.
For decades, titans such as Intel and IBM have fashioned computer chips from ever smaller elements, spawning jumps in computation along with drops in price at such regular intervals that the progress became not just an expectation but a law: Moore’s Law. Today’s computer chips boast many millions of times the power of those made 50 years ago. Even the processor inside the brick that charges your phone has hundreds of times the power of the guidance computer that steered Apollo 11 to the Moon, to say nothing of your phone itself. In the last decade, however, the progress of all-purpose processors has stalled as their silicon parts have shrunk so much that manufacturers are nearly working with individual atoms. At the same time, the appetite for handling 0’s and 1’s is exploding, with scientific institutions and businesses alike seeking more answers in bigger datasets. Researchers fear that the tsunami of computational need may swamp the abilities of machines, stymieing progress.
“It will stop innovation,” says Jeff Chou, an electrical engineer and founder of Sync Computing, a startup attempting to accelerate cloud calculations. “It will be a cap on what we can do.”
As this impasse draws closer, it puts more pressure on researchers and entrepreneurs to come up with ways to save computing — ways to reinvent it. As a result, more and more of Silicon Valley’s famous venture capital has been flowing into semiconductors, an industry that over the last two decades has often been considered too capital-intensive to compete for investment with, say, software. While building a bleeding-edge foundry has never been tougher, hiring an existing foundry to produce a bespoke chip has never been easier, and investors are flocking to startups creating processors tailored to artificial intelligence and other lucrative applications.
The federal government is also getting involved. Recognizing the strategic value of producing this essential infrastructure domestically, Biden’s infrastructure plan calls on Congress to invest tens of billions of dollars to reboot U.S. semiconductor fabrication capacity. Much of that funding would go into traditional silicon fabrication, supporting innovators hoping not just to revive Moore’s Law but to surpass it.
There is no single replacement for the silicon transistor; nor is there just one bottleneck to resolve. If society is to continue to enjoy the rapid progress that has defined the information age, we will have to find more efficient ways to work with the processors we have, new processors tailored to the hardest calculations we face, and new materials for novel chips that can help processors communicate more quickly. Semiconductors play many roles in the informational ecosystem, and all of them are ripe for reinvention.
“People are realizing that we’re reaching the limit of where we can get to with the hardware,” says Owen Lozman, an investor with EMD Electronics’s investment arm, M Ventures. “We need a paradigm shift.”
A 50-Year Race to the Bottom
In 1959, Nobel physicist Richard Feynman gave a lecture at the annual meeting of the American Physical Society entitled, “There’s Plenty of Room at the Bottom.” The computers of the era were hulking machines that took up entire rooms in our macroscopic world — “the top,” in Feynman’s way of thinking. Instead, he urged engineers to explore “the bottom,” the miniature world of molecules and atoms. If these particles could become the building blocks of sub-microscopic transistors, computers could dramatically shrink in size while growing in power.
“Computing machines are very large; they fill rooms. Why can’t we make them very small, make them of little wires, little elements — and by little, I mean little,” Feynman said. “The wires should be 10 or 100 atoms in diameter.”
Just six years later, Gordon Moore, a semiconductor researcher who would go on to co-found Intel, wrote an essay observing that the race to the bottom had already begun. He noted that the most economical number of components to carve into an integrated circuit hovered at around 50, but that the figure was doubling every two years, a forecast that became known as Moore’s Law.
The law has various incarnations relating to power, price, and energy, but in practice, the trend’s main driver has been the shrinking of the element at the heart of modern computing: the semiconductor transistor, an electrical switch that flickers on and off with no moving parts.
While early electronics were based on vacuum tubes — airless bulbs with a wire that could produce an on-demand stream of electrons when heated — the modern computing era began in the 1950s with the invention of the silicon transistor. On the atomic level, insulators hold their outer electrons tightly while conductors let them roam free. Semiconductors fall in the middle. Their atoms keep their electrons loosely tethered, so an applied electric field can liberate them.
This property let researchers engineer electric valves out of solid silicon blocks that could switch between the open and closed positions much more quickly, using far less energy than vacuum tubes. Crucially, making continually smaller patterns of silicon was much easier than shrinking complicated bulbs, creating a long runway for companies to take up Feynman’s challenge.
The semiconductor industry delivered, developing a complex international supply chain dedicated to transmuting piles of sand (a plentiful source of silicon) into the most intricately crafted devices in existence. Modern semiconductor chips pack in billions of transistors, each measuring just dozens of nanometers across — so small that it would take more than 200 to cross a red blood cell. In 1965, Moore forecast that chips would someday host as many as 65,000 components. Last year, Apple shipped iPhones with processors containing 11.8 billion transistors.
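As a rough, back-of-the-envelope illustration of how that compounding plays out (a sketch, not a rigorous accounting of chip history), doubling a count of about 50 components every two years from 1965 lands within the same order of magnitude as the A14’s transistor count:

```python
# Back-of-the-envelope sketch: compound a doubling every two years, starting
# from the ~50 components Moore observed in 1965, and compare the result with
# the 11.8 billion transistors in Apple's A14 (2020). Real chips did not follow
# a perfectly smooth curve; this is illustration only.
def moores_law_estimate(start_count, start_year, end_year, doubling_period=2):
    doublings = (end_year - start_year) / doubling_period
    return start_count * 2 ** doublings

estimate = moores_law_estimate(start_count=50, start_year=1965, end_year=2020)
print(f"Projected component count by 2020: {estimate:,.0f}")  # roughly 9.5 billion
print(f"Apple A14 transistor count:        {11_800_000_000:,}")
```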
But room at the bottom, in the atomic realm, is running out.
Modern chip manufacturers use beams of light as scalpels to hew minuscule components, and supplying those scalpels has become a booming business.
And next-generation chip production currently hinges on one machine, from one company, that can produce an exact enough blade of light. Dutch multinational ASML has developed the only technology that can harness extreme ultraviolet light (EUV). To produce ripples of light just 13.5 nanometers in wavelength, ASML uses pulses from a metal-cutting laser to vaporize microscopic droplets of molten tin 50,000 times each second. At those wavelengths (more than a dozen times shorter than the ultraviolet light the industry has long relied on), even air blocks the light, so the entire process takes place in a vacuum. ASML makes a few dozen EUV machines annually, each of which weighs 180 tons, takes four months to build, and costs more than 150 million dollars. ASML’s market capitalization has grown from about $47 billion five years ago to nearly a third of a trillion dollars today.
That’s not to say there’s no progress at the bottom. The Taiwan Semiconductor Manufacturing Company (TSMC) has put ASML’s EUV machines to commercial use producing Apple’s A14 iPhone chip, and the tool is an essential part of the roadmaps of Samsung, Intel, and IBM. Earlier this year, IBM unveiled a chip produced with what it calls “two-nanometer” technology. The transistors themselves aren’t so much smaller than those of previous generations, varying from 15 to 70 nanometers in length, but IBM harnessed EUV manufacturing and other innovations to stack transistors for greater electrical control, packing 50 billion components into a fingernail-sized chip for a density 3.5 times greater than what current so-called “seven-nanometer” processes can achieve.
But the industry can afford only so many advances of this type. Dozens of chip manufacturers have quit the race to the bottom since 2002, squeezed out by prohibitive prices (Intel is spending 20 billion dollars on two new foundries). And the few that remain are starting to band together. ASML’s EUV technology is the result of a decades-old public-private consortium and funding from Intel, Samsung, and TSMC. Despite these efforts, the companies are getting less and less bang for more and more bucks. On one benchmark (known as SPECint), single-core microprocessor performance improved by about 50% per year in the early 2000s, but by only about 4% per year between 2015 and 2018. (The rise of multi-core processors came about in part to compensate for this performance plateau.)
“While Moore’s law is slowing down, we do know there’s a pathway for innovation,” says Mukesh Khare, Vice President at IBM Research. “Tech scaling isn’t only about miniaturization,” he continues, pointing to the rise in specialized computing components as an alternative way to keep powering up chips.
Yet even as Moore’s law falters, the world has never needed it more. An explosion in software services has led to an exponential hunger for computing power. Software companies are increasingly outsourcing their calculations to cloud service providers. This chip rental business generated more than 120 billion dollars in revenue in 2020, a roughly 100-fold increase from 2010. And the number of calculations needed to train the most sophisticated artificial intelligence programs (such as DeepMind’s championship-winning AlphaGo Zero) surged by more than 300,000 times between 2012 and 2018, far outstripping any version of Moore’s law.
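To see how far that outstrips the silicon trend, a quick bit of arithmetic (illustrative only, using the figures quoted above) converts the 300,000-fold rise into an implied doubling time:

```python
# Illustrative arithmetic: a 300,000x increase in training compute over the six
# years from 2012 to 2018 implies a doubling time of just a few months, versus
# a doubling roughly every two years under classic Moore's law.
from math import log2

growth_factor, years = 300_000, 6
doubling_time_months = 12 * years / log2(growth_factor)
moores_law_growth = 2 ** (years / 2)  # two-year doublings over the same span

print(f"Implied AI-compute doubling time: ~{doubling_time_months:.1f} months")
print(f"Moore's-law growth over six years: ~{moores_law_growth:.0f}x")
```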
If demand for computing capacity continues to outpace supply, the era of cheap computing could soon come to an end. Some software companies already spend half their revenue on cloud services, and data centers consume more than one percent of the world’s energy. Researchers and companies once scaled up their enterprises by doing more of their computing at the seemingly endless bottom. But now that expansion is shifting to the top, where companies are building bigger data centers and recruiting more chips at an increasingly high financial and environmental price.
“A lot of the benefits that came from Moore’s law; actually many of those things have already disappeared,” says Neil Thompson, an economist at MIT’s Computer Science and Artificial Intelligence Lab.
Modern necessities like affordable calculation will continue to disappear as the bottom fills up — unless, that is, electrical engineers and computer scientists can make room somewhere else.
Smarter Calculations
One of Feynman’s 1959 predictions was that more capable machines would streamline their own computations. “They would have time to calculate what is the best way to make the calculation that they are about to make,” he said.
Jeff Chou and Suraj Bramhavar are two engineers on their way to realizing a variation of this vision with an entirely different form of computing.
Almost all computers answer queries by flipping transistors on and off in such a way that they execute binary calculations in an order specified by a program: first do this, then do that. But this paradigm is not the only way to calculate.
Nature also computes. Cannonballs trace out parabolic trajectories; light always finds the quickest route between two points. The universe will always seek out the path of least resistance. Such thinking drives the development of some quantum devices, which leverage the bizarre physical behavior of particles in ways that are difficult to capture efficiently with 0’s and 1’s.
Or you might use a classical, but not digital, device known as an analog computer — a machine that physically acts like the specific system you want to study. After meeting at MIT’s Lincoln Laboratory, Chou and Bramhavar developed precisely such a machine using electric currents that synched up in a particular way.
“We built this very cheap 20-dollar circuit that could basically do the same thing that a lot of quantum companies are trying to do,” Chou says. Their research was published in Nature’s Scientific Reports.
Their circuit solved a particular class of math problem known as combinatorial optimization, essentially searching an exhaustive list of possibilities for some ideal solution. One example is the traveling salesman problem, where a salesperson seeks the fastest route between cities on a map. With each additional city, the number of routes the salesperson must check grows factorially, even faster than exponentially.
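A minimal brute-force sketch in Python makes the explosion concrete (an illustration of the problem itself, not of Sync Computing’s method): checking every ordering of n cities means wading through (n - 1)! candidate routes.

```python
# Brute-force traveling-salesman search: an illustration of why the problem
# explodes, not the approach Sync Computing uses. Checking every ordering of
# n cities means examining (n - 1)! candidate routes.
from itertools import permutations
from math import dist, factorial

def shortest_route(cities):
    """Exhaustively try every ordering of the cities and keep the best loop."""
    start, *rest = cities
    best_route, best_length = None, float("inf")
    for order in permutations(rest):
        route = [start, *order, start]  # return to the starting city
        length = sum(dist(a, b) for a, b in zip(route, route[1:]))
        if length < best_length:
            best_route, best_length = route, length
    return best_route, best_length

print(shortest_route([(0, 0), (0, 3), (4, 0), (4, 3), (2, 5)]))

# The number of candidate routes grows factorially with the number of cities.
for n in (5, 10, 15, 20):
    print(f"{n} cities -> {factorial(n - 1):,} routes")
```

Even at 20 cities the tally tops 100 quadrillion, which is why practical schedulers lean on heuristics or, as Sync Computing hopes, on hardware that settles into good solutions on its own.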
This is a problem that logistics organizations like USPS and FedEx tackle daily. It’s also a crucial aspect of cloud computing, Chou and Bramhavar realized, where bits of information flow back and forth between staggering numbers of computer chips in data centers.
“You’re trying to send a bunch of different computing jobs to a bunch of different computers at the right time at the right place,” Bramhavar says. “How do you make 1,000 chips work together better, or 10,000 chips, or 100,000 chips?”
The duo started by developing software that could mimic the behavior of their physical circuit while running remotely on Amazon Web Services (AWS) and founded Sync Computing to commercialize the technology. Early collaborators included NASA and the Air Force; the technology helped speed up their simulations of aircraft performance by 30 to 40 percent.
Now they’ve moved on to more advanced versions of the scheduling algorithm, which helps clients from retailers to restaurants rein in their ballooning AWS cloud bills. The gains vary, but the algorithm has sped up some jobs twentyfold. “It just shows you [all] the potential waste,” Chou says.
The group started out using their algorithm to orchestrate the flow of information between cores on a single chip. Now, they coordinate informational traffic between different racks in a data center. Eventually, they hope to come full circle and design an enterprise version of the original 20-dollar circuit to dispatch jobs between data centers.
“That problem gets so large that you can’t solve it quickly enough with software, and you need hardware,” Bramhavar says. “That’s where our long-term vision comes in.”
Using Light to Go Big
Building huge, datacenter-like computers is a strategy Feynman considered too, although he advocated for “the bottom” to avoid the physical limits of “the top.”
“If we wanted to make a computer that had all these marvelous extra qualitative abilities, we would have to make it, perhaps, the size of the Pentagon,” he said. But “the computer would be limited to a certain speed,” he continued, since “the information cannot go any faster than [that].”
Each of the two largest U.S. data centers already covers nearly one-sixth of the geographic footprint of the Pentagon — the world’s largest office building — and indeed much of the information inside them flies through fiber optic cables at close to light speed. But as data centers have grown, a more significant choke point has emerged. Light is fast enough, but converting it into and out of the sluggish streams of electrons that silicon chips use to calculate takes time.
“The best medium to compute is electrical signals. The best medium to communicate is optical signals. So you see where this problem is,” says Mian Zhang, a photonics researcher and CEO of HyperLight, a company attempting to break this bottleneck.
Zhang, an engineer by training, never expected to co-found an integrated photonics company. But after his post-doctoral work at Harvard developing a new type of photonics conversion chip led to a series of Nature papers in 2018 and 2019, he received an overwhelming response from investors.
“We got very serious people interested,” Zhang says. “Instead of saying, ‘Hey this a nice scientific discovery,’ they were saying, ‘Where can I get these chips?’”
Some large data centers today have millions of integrated circuits devoted to translating information between photons and electrons. These chips are typically made from silicon, due to the semiconductor industry’s prowess at shaping that material. But the element does not respond naturally to light. Manufacturers must infuse it with other atoms to change its properties, which has the inconvenient side effect of making the silicon opaque to the very light it should transform.
This drawback has opened the door for other materials: notably, a transparent salt known as lithium niobate, whose crystal structure warps when an electric field is applied across it, changing how light travels through the material. That responsiveness lets electrical signals be written directly onto passing beams of light, allowing information to cross between the two worlds. Moreover, lithium niobate can flex this way hundreds of billions of times each second, fast enough to keep up with modern communication.
Zhang and his colleagues discovered a way to get the best of both worlds, harnessing the honed technology of semiconductor foundries to chisel thin films of lithium niobate. Their devices have set multiple world records. In December 2020, HyperLight demonstrated a conversion rate suitable for use within data centers that was seven times faster than what silicon devices on the market today can handle. And in March 2021, the company achieved breakthrough combinations of drive voltage and bandwidth in integrated electro-optic modulators. Those speeds should satisfy the growing hunger for data transmission for another decade, Zhang estimates, enabling collective computing on a scale that dwarfs the Pentagon. “All the different racks are going to behave like a single machine,” he says. “Data centers around the world are going to behave together like a mastermind.”
Low Power & High Efficiency
Feynman foresaw another barrier to computation’s expansion at the top: mammoth facilities would drain the electric grid. “There is also the problem of heat generation and power consumption; TVA [the Tennessee Valley Authority] would be needed to run the computer,” he told his audience at the California Institute of Technology.
His forecast was overly pessimistic but not entirely off. Data centers have held their energy consumption steady in recent years thanks to innovations in extreme efficiency, but researchers predict that those gains won’t be able to keep up with the growing appetite for calculations. By 2030, information and communications technologies may consume a fifth of global electricity. And many of those watts won’t even make it into chips to do work. More than 60 percent of power is lost between generation and use, according to Tomás Palacios, an MIT professor and engineer. Resistance in power lines saps energy during transmission, for instance. And after the current comes out of the wall, it passes through power adaptors and other power electronics that repeatedly reduce the voltage to what a device’s processor can handle, wasting energy at each step.
To reduce power consumption and enable other game-changing technologies, Palacios believes the semiconductor industry needs to look beyond its favorite one-size-fits-all material.
“The future of society is all about managing energy, information, and communication,” he says.
Silicon excels at manipulating information but “is not very good for the other two pillars.”
Palacios co-founded Finwave Semiconductors with Bin Lu, another MIT engineer, to bring a new semiconductor into the fold: gallium nitride, or “GaN.”
Semiconductors enable electric switches because they hold onto their electrons loosely enough that the particles can be freed on demand. GaN, however, is a material that won’t give up its electrons without a fight — a “wide bandgap” semiconductor. That property lets GaN withstand far stronger electric fields than silicon can before breaking down, so transistors made from it can handle higher voltages and switch states more frequently. Silicon transistors must prioritize one or the other at the cost of size or efficiency, but GaN transistors can do it all.
“For the future of power electronics, you need high frequency and high voltages,” Palacios says. “Wide bandgap semiconductors are the only materials that can give you both.”
Finwave will soon release a 650-volt GaN transistor that could help data centers save energy, but the company is really aiming to disrupt Palacios’s third pillar: communication.
Beaming information through the air amounts to another conversion of power, from electricity into microwaves. Here, GaN’s greater efficiency and higher frequency pay off in the form of ten times higher output power for base stations and up to four times better battery life for handheld devices. The 5G infrastructure rolling out today operates at relatively low frequencies, but GaN transistors could catalyze a faster, “millimeter wave” communications network.
“We have a unique technology no one else has,” Lu says. It “has a chance to truly enable revolutionary millimeter-wave 5G.”
Starting a New Race
One trend Feynman did not anticipate in 1959 was that once computing hit the bottom, it might strike out in a new direction entirely. We already use light to move data between continents and cities, and recently between server racks in some data centers. For decades, streams of light laden with information have inched steadily closer to where the real action is happening: the motherboard.
“We are at the point where it’s starting to penetrate the box,” says Jean-Louis Malinge, an engineer and investor who has worked with photonics in telecommunications for 30 years. “The photons are progressively replacing the electrons.”
A universal computer based entirely on light remains a distant dream, but a handful of companies are taking the first steps toward bringing photons into the heart of the computational ecosystem with hybrid processors outsourcing specialized, arduous work to photons.
“Photonics computing has been this holy grail type thing for decades,” says Subal Sahni, Director of Photonics Engineering at Celestial AI. “[Moving and manipulating electrons] is expensive due to power dissipation in chips. For light, it’s pretty much free.”
Startups like Celestial AI are building chips that will take advantage of light’s properties for one specific application: machine learning.
When computer scientists first attempted to run machine learning algorithms on the computers of the 1950s, the machines just weren’t up to the challenge. Training neural networks to do useful tasks boils down to multiplying gigantic matrices. Doing so sequentially with a rudimentary CPU was a bit like asking a third grader to multiply interminable numbers by hand.
We owe the recent machine learning renaissance to the rapid development of Graphics Processing Units (GPUs), which run at lower clock speeds than CPUs but can execute hundreds to thousands of operations simultaneously. Today, GPUs are also running out of room at the bottom.
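Matrix multiplication suits that kind of parallel hardware because every entry of the result can be computed independently of the others. A toy sketch in Python, with NumPy standing in for the hardware (real GPU kernels are far more elaborate):

```python
# Toy illustration: each entry of a matrix product is an independent dot
# product, so thousands of entries can be computed at once on parallel
# hardware. NumPy stands in here; real GPU kernels are far more elaborate.
import numpy as np

def one_entry(A, B, i, j):
    """Output entry (i, j) is just row i of A dotted with column j of B."""
    return float(np.dot(A[i, :], B[:, j]))

rng = np.random.default_rng(0)
A = rng.standard_normal((128, 128))
B = rng.standard_normal((128, 128))

# The sequential way: visit all 128 * 128 = 16,384 entries one at a time.
C_loop = np.array([[one_entry(A, B, i, j) for j in range(128)]
                   for i in range(128)])

# The batched way: a single call that parallel hardware can spread across
# thousands of arithmetic units at once.
C_fast = A @ B
print(np.allclose(C_loop, C_fast))  # True
```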
Hybrid photonics chips, however, could be multiplication heroes. To multiply with light, Sahni explains, you simply write a variable into a light beam (in the normal way you might encode a Netflix video) and then modulate the beam a second time to calculate. In this way, the process condenses tedious multiplication into a single step.
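In the abstract, that trick can be modeled as two scalings applied to one beam: encode a number in the beam’s intensity, let a modulator scale it by a second number, and a detector reads out the product. The sketch below is a conceptual toy under that assumption, not Celestial AI’s design; the function names and the 0-to-1 normalization are illustrative.

```python
# Conceptual toy model of multiply-by-modulation (not any company's design):
# encode x as a beam's intensity, pass the beam through a modulator whose
# transmission encodes y, and the detected intensity is proportional to x * y.

def encode_into_beam(x, laser_power=1.0):
    """Set the beam's intensity proportional to x (assume 0 <= x <= 1)."""
    return laser_power * x

def modulate(beam_intensity, y):
    """The modulator transmits a fraction y of the incoming light (0 <= y <= 1)."""
    return beam_intensity * y

def detect(intensity, laser_power=1.0):
    """The photodetector reads the intensity; rescale to recover the product."""
    return intensity / laser_power

x, y = 0.6, 0.75
print(detect(modulate(encode_into_beam(x), y)))  # 0.45, i.e. x * y, in one pass
```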
All manipulation of the light takes place in silicon (which is transparent at telecom wavelengths), where features like grooves and fins guide and shape the beams. Many world-class semiconductor foundries, including Intel and TSMC, can already carve increasingly sophisticated photonics circuits into silicon.
Celestial AI is operating in stealth mode, and its founders couldn’t describe the design or abilities of its machine learning chip. But they feel confident that hybrid photonics can restore the explosive computational growth society has come to expect.
“It can have the same exponential pace of improvement generation over generation that we have historically enjoyed with Moore’s Law,” says Michelle Tomasko, one of Celestial AI’s co-founders and its head of software. “We start at a big pop and go exponential from there.”
Funding the Future
The silicon industry now stands at a crossroads. Deep-pocketed giants like Intel and IBM will likely inch closer to the bottom, but 55 years of Moore’s Law has essentially perfected the silicon transistor.
“It worked really well for many decades,” says Palacios. “We are now at the point where we need another push.”
That push may come from the federal government, which has been considering a substantial investment in semiconductor technologies, initially in the form of the bipartisan $37-billion American Foundries Act (AFA) and the Creating Helpful Incentives to Produce Semiconductors (CHIPS) Act in 2020. This spring, Biden asked Congress to expand the semiconductor investment to more than $50 billion as part of his infrastructure plan, and in June the Senate adopted many of the key provisions in its United States Innovation and Competition Act. The initiative aims to recapture semiconductor manufacturing market share, more than 70% of which has shifted to Taiwan and South Korea, and to help the United States keep its status as a global leader in cutting-edge technologies like AI and supercomputing, even as China endeavors to displace it.
By building local manufacturing capacity for silicon while supporting emerging technologies, the program could help labs and startups introduce new paradigms like analog computing, GaN transistors, and photonics into the wild even sooner.
“This has the potential to really change the world,” Palacios says, “in the same way that the Apollo program opened the space age.”
The transformation would be profound. Today’s watches clock our heartbeats, but tomorrow’s wearables could monitor much more. Apple has invested $70 million into Rockley Photonics, a UK-based company developing a “clinic-on-the-wrist” sensor that tracks blood oxygen, glucose, alcohol, and more — using light. Related photonics technology may shrink LiDAR, improving the eyesight of self-driving cars. Australia’s Sydney Harbour Bridge already hosts 2,400 sensors, which report vibrations to machine learning algorithms that look for signs of an impending catastrophe. And this is just the beginning of the possibilities enabled by the convergence of power-sipping circuits, lightning-fast wireless communication, and artificial intelligence to process it all.
“These are just proofs-of-concept we’re seeing at the moment,” Lozman says.
In the coming decades, purpose-built chips matched to their application could slip into everything from appliances to clothing, literally weaving computation into the fabric of daily life. Screens will melt away as windows display the weather forecast and devices beam holograms into the air. Algorithms may even design the next generation of AI-boosting chips, accelerating the acceleration.
Or perhaps something entirely different will come to pass. Imagining what future engineers will build with advanced versions of today’s rudimentary technologies is a bit like asking a young Moore to speculate about what people might do with billions of transistors in their pockets.
“We are basically back in 1969,” Palacios says. “The microprocessor was not yet invented. Intel had not been founded yet. The personal computer was not yet here. Nobody had heard of the internet. That’s where we are today, with all the opportunities technology is going to give us.”