On October 23, 2019 the journal Nature published Quantum Supremacy using a Programmable Superconducting Processor, which describes the breakthrough by Google’s AI Quantum team in building the first quantum processing system capable of clearly demonstrating an advantage over “classical” supercomputers on a particular class of difficult computational problems. But what does this mean, really, and what is its significance? To answer this question, let us first review just what a quantum computer is, and how Google has built theirs.
Why Bother with Quantum Computing?
Quantum computing algorithms have existed since before the first prototype of a quantum processor was built in 1998, but even so, quantum algorithms are in their infancy as a field. One of the earliest quantum algorithms, Shor’s algorithm, from 1994, allows an integer N to be factored on a quantum computer in polynomial time, O((log N)²(log log N)(log log log N)), which is exponentially faster than the best known classical algorithm, the general number field sieve. It is the very “hardness” of factoring integers that underlies RSA and a number of other cryptographic systems. So solving what was once thought of as a theoretical problem in pure mathematics takes on considerable importance.
Other established quantum algorithms offer fundamental advantages over the best known classical solutions. Some are quite theoretical. Others are more practical. Grover’s search algorithm from 1996 can find an element in an unsorted list of N elements in √N steps, compared to N/2 for a classical scheme. Proposed quantum optimization algorithms offer quadratic to exponential speedups over the best of known classical techniques. Quantum computations are the most natural way to simulate molecules and chemical reactions. But the field is in its infancy. We are only now learning how to build machines to run these algorithms.
Quantum Computing Basics
Classical Binary Data and Quantum States
Conventional binary digital computers,“classical” computers in the quantum computing idiom, operate on the basis of bits, binary elements of information that are either 0 or 1. That’s only two possible values, but they can be concatenated to represent arbitrarily large value ranges, with each additional bit doubling the range of distinct values. Arithmetic, logical, and other operations can be performed on classical bit values by combining sequences of elements of boolean logic — AND, OR, NOT, etc — to operate on binary values and create new binary values.
Quantum computers exploit the fact that quantum particles, such as electrons, photons, or ions, have properties like spin and polarization that can be manipulated to encode and process binary information. These processors operate on quantum bits, or qubits. This quantum information has unusual and counter-intuitive properties that make quantum computing both more powerful and more difficult than classical.
A qubit will be represented by the state of a quantum system that can be usefully visualized as a “Bloch sphere”:
The quantum state of a qubit may be in a definite 0 state or a definite 1 state, corresponding to the topmost and bottommost points on the Bloch sphere, respectively. But it can also be in a superposed state between them, as is the case for point x,y,z in the diagram. A superposed quantum state between ∣0⧽ and ∣1⧽ does not mean that the qubit has a fractional value greater than 0 and less than 1. Rather, it means that the value will be measured as 0 or as 1, according to the probability distribution described by its position on the Bloch sphere. The probability of it reading as 0 in the example above can be computed as cos(σ/2)². If the state corresponds to a point on the dotted “equator” of the Bloch sphere, it will measure as 0 or 1 with equal likelihood.
Measurement is a fundamental concept in quantum mechanics. Measurement is not merely an extraction of a value from a qubit, it also affects the quantum state. If the state of the qubit in the Bloch diagram above is measured, and the result is 0, the qubit’s state will thereafter be ∣0⧽, with σ=0, so that further measurements of the qubit will read as 0 until some operation changes the state. Conversely, had the initial measurement yielded 1, the qubit’s state would thereafter be ∣1⧽, while σ would be 𝜋. Measurement collapses superposition. Schrödinger’s famous cat is only both potentially alive and potentially dead until the moment that the box is opened.
Classical vs. Quantum Addition
For comparison with its quantum counterpart, consider how two classical bits can be added, producing a 2-bit sum, using AND, OR, and NOT operations in the binary logic circuit below. The two-bit binary output value o[1..0] is the sum of input bits i0 and i1. As there is no “carry” input, the only possible results are 00, 01, and 10.
Operations on Qubits
Just as classical binary logic is created by composing simple boolean primitives that can be implemented with a handful of transistors each, and just as one can, in theory, implement all possible binary logic circuits using only NAND gates, quantum computations can be expressed in terms of quantum gates. These have some pretty fundamental differences from their classical counterparts. Unlike binary bits, the full quantum state of a qubit cannot be copied or cloned. A quantum gate computation must be reversible, which requires every gate to have as many outputs as it has inputs. In the case of Google’s quantum technology, the qubits are small, superconducting oscillators, and the gates are applied to them by sending minutely controlled microwave pulses to perturb them. These qubits don’t flow through gates and wires, gates are applied to the qubits.
Where the classical logic model has only one useful unary operation, NOT, much of the richness of quantum computation involves unary operators on individual qubits. Unary gates can generally be visualized as rotations over arcs around the axes of a Bloch sphere. For example, the Hadamard, or H gate, can be thought of as a 180° rotation around the Z axis, followed by a 90° rotation about the Y axis of the Bloch Sphere.
When applied to a quantum state that has been prepared to be ∣0⧽ (a definite 0) , a Hadamard gate puts the qubit in a state of equal superposition of potential 0 and 1 values, a common idiom in quantum computing:
Two-qubit gates are often conditional versions of a unary operator, e.g. CNOT (conditional negation) or CZ (conditional phase inversion). The conditional-conditional-NOT (CCNOT) gate, commonly called a Toffoli gate, is the quantum analog to a NAND gate, the gate out of which one could theoretically construct all possible circuits.
Quantum gate model algorithms are graphically depicted as a left-to-right sequence of operations on qubits. A unary operation will be depicted as a labeled box on a given qubit’s time-line. Multi-qubit operations are depicted with vertical connections between points on multiple time-lines.
The basic one-bit add circuit for qubits is:
If the inputs ∣i₀⧽ and ∣i₁⧽ are the basis vectors ∣0⧽ or ∣1⧽, the top and bottom points on the Bloch sphere, then the measurements will be as one might naively expect based on experience with classical logic, i.e, if both are ∣0⧽, both of the output measurements will be zero, no matter how many times one runs the experiment. But, unlike the classical logic circuit, this gate program also manipulates superposed states.
As mentioned above, a Hadamard gate, when applied to a qubit in a ∣0⧽ basis state, results in a superposition with equal probabilities of measuring 0 or 1 from the qubit.. Extending the adder example with prepared state inputs and H gates, as depicted below, is in a real sense simultaneously computing all possible 1-bit sums, and the measurements will return 00, 01, or 10, with probabilities of 25%, 50%, and 25% respectively, reflecting the superpositions of the probability amplitudes for ∣00⧽, ∣01⧽, ∣10⧽ and ∣11⧽ as the state vectors of the output qubit pair.
So while 2 classical bits can contain one of 4 possible values, 2 qubits can represent 4 simultaneous values, in superposition; the probability amplitudes of 00, 01, 10, and 11. Each additional qubit potentially doubles the number of superposed states — which is why simulating systems with more than about 40 qubits with a classical computer is a major challenge. Each additional qubit doubles the memory requirement to classically store the state of the quantum system.
While there is little “quantum advantage” in the example of addition, this quantum parallelism, the ability to operate simultaneously on a potentially large set of superposed values, is the key to most useful quantum algorithms.
Qubits can become entangled so that measuring one of an entangled pair determines the value that will be measured from the other. This can seem to result in “spooky action at a distance” as Einstein skeptically referred to it, and is often misrepresented in the popular imagination, but it is a key property for secure communications applications, and it happens inevitably in the course of executing quantum algorithms. Some entanglement is induced in the quantum gate addition program above: If the O₁ bit is measured as 1, the probability of O₀ then also measuring as 1 collapses to zero.
Fidelity and Decoherence
Physical qubits operate at physical scales and energy levels such that errors are not so much a matter of if as a matter of when. Quantum states collapse, not only when measured, but when the qubits interact and entangle with elements of the surrounding system. The simple fact that the quantum system is in enough contact with the rest of the universe to allow deliberate manipulation and measurement implies enough random interaction to eventually cause the qubits to decohere and lose their superposed information. In addition, each gate operation carries an additional risk of inducing an error. Single and multiple-qubit gate operators have a fidelity, which is the likelihood of the operation completing without error.
The Space/Time Volume of Quantum Computation
For any implementation technology, there is a limit to the number of qubits that can be made to operate together in a single quantum computer, and there is a limit to the coherence times and operational fidelities of the qubits in the system. The coherence times and fidelities impose a temporal limit, a maximum number of quantum operations before the errors become inevitable and irrecoverable. The number of available qubits multiplied by the number of operations defines a space/time volume of quantum computations that can be performed on a given machine.
The history of quantum computing has been one of more and better qubits causing the available volumes to expand, while more sophisticated quantum algorithms reduce the volume required to solve a problem. The point at which the expanding operational envelope intersects the shrinking requirements envelope is the point where quantum computing becomes useful for applications.
Building a Real Machine
Google Superconducting Qubits
Any two-level quantum mechanical system can act as a qubit — a photon, an electron, an atom or ion, but also macroscopic quantum mechanical systems like oscillating superconducting magnetic fields. The Google team has been working with “Xmon” transmon superconducting qubit designs since 2012, continuing work that had been going on for over a decade at UC Santa Barbara. It may seem counterintuitive, in an industry where performance and value have been driven by driving the size of transistors down to a handful of atoms, but macroscopic qubits are easier to manipulate. Superconducting qubits are relatively easy to engineer, given the scale of research and investment in semiconductor fabrication technology. Google qubits look like this as a planar structure:
The qubits are instantiated in the blue and green false color shaded “plus” shaped structures, which, at half a millimeter, are visible to the naked eye. The qubit frequency, amplitude, and phase are controlled by microwave pulses arriving from below. Their state can be measured using resonators, the squiggly structure above each qubit, to pick up induced current. With this 1D array of qubits, the Google/UCSB team hit a breakthrough 2-qubit fidelity of 99.4% in 2015. The design efforts since then have focused on how to turn this into 2D scalable designs.
The 22-qubit “Foxtail” and 72-qubit “Bristlecone” quantum processor chips take the Xmon qubit design to a “2.5D” fabrication model, where the qubits are implemented on one piece of silicon in a 2-dimensional array, while the control and measurement elements are symmetrically tiled across a matching 2D silicon carrier. The two pieces of silicon are connected using face-to-face “bump bonding” to make a multi-chip module. The Foxtail carrier and qubit chips are shown below.
Control and Isolation in a 2D Qubit Array
The operation of multi-qubit Xmon quantum gates requires an electrical coupling between qubits. This can be accomplished by bringing together the oscillating frequencies of neighboring qubits.
While this is relatively straightforward in a 1D array, where each qubit has at most 2 neighbors, the problem becomes more subtle in a 2D array organization. If a 2-qubit operation is to be performed, not only do the frequencies of the qubits involved need to be brought close enough to induce coupling, but the frequencies of the other neighboring qubits need to be taken further away from the coupling frequency. The diagram below shows the 8 qubits directly affected when a 2 qubit operation is set up in a 2D array of superconducting qubits. The qubits being made to couple and entangle are sent to a common frequency (=) while the surrounding qubits are sent to either a higher (+) or lower (-) frequency in a way that reduces the likelihood of their being accidentally affected.
The Foxtail and Bristlecone chip designs used passive capacitive coupling between neighboring qubits, which allows maximum qubit density, and worked quite well in 1D qubit arrays. In parallel to the bring-up of Bristlecone, a new generation of design was under development to move beyond passive coupling.
The Sycamore Processor
The Sycamore processor design makes the trade-off of fewer qubits in the array in return for more control lines and hardware, to provide better operational fidelities. Active, adjustable coupler circuits are added to the chip, with associated external control logic. The number of qubits is reduced to 53 from the Bristlecone processor’s 72, but the overall space/time volume of computation is enhanced. The key is maintaining control, at scale.
While the limits of the quantum computer’s functionality may be defined by the superconducting qubit chip and its readout logic, which must be operated at about 0.01°K, much of the complexity is in the control and readout logic connected to the superconducting qubit module but located higher up in a dilution refrigerator cylinder, where the operating temperature is a warmer 3°K — the temperature of interstellar space. Signals are then routed to and from the top of the refrigerator and the 300°K environment of the machine room, from whence they are routed to conversion, measurement, and control systems in an adjacent rack.
The control systems include atomic clocks and signal generators to deliver the precisely timed bursts of microwave energy that modulate the quantum oscillators to implement the gate operations of a quantum program. When appropriate, the readout lines are interrogated, collapsing the superposition of qubits to sets of binary values that are measured, logged, and analysed.
NISQ Computers and Applications
Machines such as the Google Foxtail, Bristlecone, and Sycamore machines are often referred to as NISQ computers, for Noisy, Intermediate-Scale Quantum computers. They have enough of a quantum computational volume to run non-trivial programs, but not enough to do so whilst correcting for errors on-the-fly.
Given that eventual decoherence of qubits is inevitable and not just bad luck, error correction on quantum computers isn’t just a luxury or an added-cost feature for enterprise-grade systems, it is a necessity for broad classes of applications. However, depending on the degree of correction (e.g. bit-flip only, or bit-flip and phase-flip errors), and the underlying quality of the physical qubits, the number required to implement a single, error-corrected “logical qubit” could range from 10 to 100 to 1000. From a computer science perspective, the most important application of NISQ machines may well be in validating and advancing the theories of quantum error correction. But there are other application areas for which NISQ machines may prove useful.
While quantum information theory has its origins further back in the 20th century, the first proposal to build a quantum computer is generally attributed to Richard Feynman in a 1982 paper entitled “Simulating Physics with Computers”, in which he observed that if physics is fundamentally quantum mechanical, a computer based on quantum principles should logically be the most efficient means of accurately simulating physical systems. And so it is that molecular modeling looks to be one of the first areas where quantum computing may prove to be economically viable. Molecules are relatively simple quantum mechanical systems, but every additional subatomic particle in a molecule doubles the quantum state space, and many “interesting” molecules have too many simultaneous interacting quantum states to be simulated practically by a classical computer. The Google team performed the first scalable simulation of the simplest possible molecule, H₂, with a 5 qubit machine in 2016, but large molecules of heavy atoms will require large, error-corrected quantum computers to fully simulate. NISQ systems should be able to aid in the study and understanding of intermediate-scale molecules of interest, particularly if variational quantum hybrid techniques prove to be effective.
Other possible areas for useful NISQ applications include optimization and machine learning. Specialized quantum machines have been built to do “quantum annealing” optimization, analogous to simulated annealing algorithms used in classical operations research. These quantum annealing machines are not fully programmable quantum computers, but fully programmable gate-model machines can run adiabatic and other optimization algorithms.
The quest for quantum supremacy is perhaps more usefully thought of as a quest for quantum comparability. As has been seen, quantum computation has some significant advantages, as well as some disadvantages, over classical computation. When and how can one say that a quantum computing technology is “better” than classical? One of the first formulations of quantum supremacy was: when a problem is solved on a quantum machine that could never be solved on a classical computer. The problem there is that designers of classical computers have imaginations, too, and “never” is a very strong statement. A more nuanced definition might be: when a problem is solved on a quantum machine that is infeasible with current classical technology.
Seeking to demonstrate supremacy or advantage requires careful answers to two questions: Advantage doing what, precisely? And if a quantum computation is infeasible for a classical machine to reproduce, how can we verify that the quantum program was completely and correctly run?
Feynman’s original insight, that a quantum computer should be the best way to model quantum mechanics, is at the core of the “supremacy” benchmark proposed by Sergio Boixo et. al. If quantum computers are to be worth building, they must, at the very least, be more effective at running a quantum computer program than a simulation running on a classical machine. The idea is to generate pseudo-random gate-model programs of parametric size and depth, which generate high levels of entanglement across qubits, and randomly apply gates to them. Measurements of such a system provides a distribution of outputs that is not uniformly random, but instead exhibits a specific pattern of quantum interference. The challenge for the classical computer application is, from a given gate program, generate outputs with the same distribution as a quantum device. Classical simulation for programs on small numbers of qubits is quite simple, but for circuits of nontrivial depth, there is no sufficiently accurate classical simulation model that does not double its resource requirements with every additional qubit. Simulating 40–50 qubits to the necessary accuracy is a supercomputing problem.
Given that the objective is to demonstrate that the quantum computer has done more than a classical machine, how is it possible to verify that the results are correct? While the correct execution of a random gate program results in a distinctly non-uniform output distribution, the degree of entanglement is such that a single failed gate operator will result in a measured output distribution uncorrelated with the known ideal distribution. The difference between the expected and sampled distributions can be expressed as cross-entropy, a term more commonly used in machine learning. The cross-entropy fidelity of the system is the probability of a run of a given random program to a given depth producing an output bit string from the valid output distribution.
The larger and deeper a quantum gate program, the lower the achievable cross-entropy fidelity. As the fidelities of the constituent gate operations can be measured, they can be used to predict the fidelity of the full system. Even with high constituent fidelities, the multiplication of probabilities represented by a deep quantum gate program will result in a numerically low predicted fidelity. The largest random circuit tested used 53 qubits, and executed 1113 single-qubit gates and 430 two-qubit gates. The predicted fidelity for that circuit was 0.2%, which may seem numerically small, but which is well within the margin of confidence.
The experiment was run on the Google Sycamore machine, with simulations on Google servers, at the Jülich supercomputing center, and on the Summit supercomputer at the US Department of Energy’s Oak Ridge National Laboratory. At Google, a series of runs were done which established that measured cross-entropy fidelities tracked their predicted values to the limits of full simulation, and beyond.
53-qubit benchmark circuits were simulated on the Summit supercomputer to depths of 12 and 14 cycles. The classical computational cost of simulating to the same levels of fidelity increases non-linearly with circuit depth. Simulating one of the circuits for 12-cycles consumed 1.29 hours on 4550 nodes of Summit. The equivalent simulation to 14 cycles would consume 1624 hours, 67.7 days of Summit time. The Sycamore supremacy runs on 53-qubit circuits went to a depth of 20 cycles. Sampling this circuit a million times took 200 seconds on the quantum machine, but would require roughly 10,000 years to simulate on a million-core classical supercomputer. The circuits and the measured output bit strings from the quantum computations were archived, so that future classical simulation algorithms can be benchmarked against the experimental data.
In another ORNL experiment, a program over 49 qubits was simulated for 40 cycles on Summit, completing in 2.44 hours of elapsed time on 27,000 GPUs, or 281 petaflops. The classical run consumed an estimated 21 MWh of power. This compares to roughly 100 seconds (0.027 hours) and 0.00042 MWh on a Google quantum machine.
The diagram below shows how the operation of the Sycamore machine tracked full simulation (◯) to the limits of practical simulation, and partial simulation (✕, +) beyond.
A Quantum Kittyhawk
In thinking about the significance of the Google quantum supremacy experiment, I think that there’s a pretty strong analogy to the Wright brothers’ first manned, powered aircraft flights at Kittyhawk, NC in 1903. Not unlike the Google Quantum AI team, the Wright brothers were competing with other teams seeking to create the first practical airplane, including some who were far better known and better financed. Samuel Langley was Secretary of the Smithsonian Institution, with access to substantial US government funding and better engine technology than the Wrights.
But the Wrights had developed better means to estimate stress and loading, and above all, the Wrights understood that flying was unlike “classical” surface transport in that it requires simultaneous control over three axes, roll, pitch, and yaw. If Langley’s 1903 Aerodrome had survived launch, the pilot would have been in grave danger, as there were simply no mechanisms that could have enabled a landing maneuver.
The Wright flyer only made 4 flights on December 17, 1903. None lasted so long as a minute. None covered so much as 1000 feet. The basic Wright airframe design, with the horizontal and vertical stabilizers split fore and aft of the main wings, was proved impractical and abandoned within a few years, even by the Wrights. It was a decade before the first commercial passenger airplane flight. But human-piloted, heavier-than-air, powered flight had been shown to be more than just a dream that refused to die.
Somehow, more than a century later, there are echoes of the wind on the North Carolina dunes mixing with the sound of helium pumps in a Santa Barbara lab.