Every week there seems to be a new headline about quantum computers. At the time of writing, the latest big story was Google's claim of quantum supremacy [1]. Quantum computers have clearly captured the public imagination enough to hold a continual place in the news cycle. So what do we as data scientists (aspiring or otherwise) need to know? Luckily, my background in theoretical physics allowed me to dig into the literature and hopefully glean some insights.
What follows is my idiot’s guide to Quantum Computers and their role in Machine Learning.
To get a better understanding of what the quantum means in quantum computer, we must first build up a better idea of what a classical computer actually is. Very simply, a classical computer can be thought of as a device that performs sequential actions on stored information with the aim of returning some desired result. For instance, a calculator storing the numbers 5 and 8 might be told to perform the series of steps that multiply the two to return 40. Information is stored within a classical computer in an array of ‘bits’ that are either 0’s or 1’s but never both. Any programme that the computer runs will perform a series of steps, one by one, on these bits until a desired result is reached. The amount of information a classical computer can work with is directly related to the number of 0’s and 1’s it can store. So far so good, but you may have noticed where one of the limitations of classical computers comes from: the power of a classical computer is intrinsically linked to the amount of memory (space for 0’s and 1’s) it has available.
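The calculator example can be made concrete. In Python we can inspect the definite bit patterns a classical computer stores and operates on:

```python
# Classical bits: each number is stored as one definite pattern of 0's and 1's.
a, b = 5, 8

# bin() shows the binary representation of each number.
print(bin(a))  # 0b101
print(bin(b))  # 0b1000

# The multiplication is a fixed sequence of steps on those bit patterns.
product = a * b
print(product)       # 40
print(bin(product))  # 0b101000
```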
A quantum computer is, in effect, a device that uses the unintuitive properties of matter to perform calculations, and this is the main difference between classical and quantum computers. In a quantum computer, information is stored not in a bit but in a qubit; the difference is that a qubit is not just a 0 or a 1 but a linear combination of the two. What’s going on? This strange and powerful behaviour is down to two quantum properties of the system: superposition and entanglement.
Quantum superposition is the idea that a state (qubit) can exist as a combination of any of its possible ways of being. The state only definitely becomes one of these after a quantum measurement happens and the wavefunction collapses / the universe splits (depending on your particular interpretation of quantum mechanics). In our quantum computer this corresponds to our qubits being linear combinations of 0’s and 1’s (up until we get our result). What this means is that each qubit is both a little bit like a 0 and a little bit like a 1. As each qubit is a union of 0 and 1, taking multiple combinations of qubits results in many different possible states.
(0 + 1)(0 + 1) = 00 + 01 + 10 + 11
(0 + 1)(0 + 1)(0 + 1) = 000 + 001 + 010 + 011 + 100 + 101 + 110 + 111
This is what leads to a small number of qubits being equivalent to a very large number of bits; N qubits can loosely be thought of as spanning 2^N classical bit-strings. This is also what gets around a classical computer’s limitation of having to perform tasks one at a time: the superposition of the qubits effectively allows large numbers of computations to be explored simultaneously.
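The expansions above can be checked mechanically. A short sketch enumerating the basis states of an N-qubit register shows the count doubling with every qubit added:

```python
from itertools import product

def basis_states(n):
    """Enumerate the 2^n basis states of an n-qubit register as bit-strings."""
    return ["".join(bits) for bits in product("01", repeat=n)]

print(basis_states(2))        # ['00', '01', '10', '11']
print(len(basis_states(3)))   # 8 states, matching (0 + 1)^3
print(len(basis_states(10)))  # 1024 = 2^10
```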

Quantum states love to become entangled with their environment. When an entangled state undergoes a change, it prompts a change in any of the states it has become entangled with. For instance, if state A and state B have become entangled and we measure A, we will immediately be able to tell something about state B. We say that A and B are correlated with one another: learning something about one of them reveals information about the other. In the context of quantum computing, we start off with a collection of unentangled qubits; these become entangled as the programme runs (which is where the complexity comes from); finally, measuring the outcome of the programme causes a chain reaction of collapses, resulting in the output.
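We can see this correlation in a classical simulation of measuring a maximally entangled (Bell) pair, the two-qubit state 00 + 11. (This is a minimal numpy sketch of the measurement statistics, not real quantum hardware.)

```python
import numpy as np

rng = np.random.default_rng(0)

# Bell state (|00> + |11>) / sqrt(2): amplitudes over basis states 00, 01, 10, 11.
bell = np.array([1, 0, 0, 1]) / np.sqrt(2)

# Measurement: sample a basis state with probability |amplitude|^2.
probs = np.abs(bell) ** 2
outcomes = rng.choice(["00", "01", "10", "11"], size=1000, p=probs)

# The two qubits are perfectly correlated: measuring the first
# immediately tells us what the second must be.
assert all(o[0] == o[1] for o in outcomes)
print(set(outcomes))  # only '00' and '11' ever occur
```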
It also needs to be mentioned that we can’t just port our familiar code and algorithms across to quantum computers. In order to take advantage of their strange properties we need to write new quantum algorithms. A comprehensive list of all the ones that have been developed so far can be found here: http://quantumalgorithmzoo.org/.
So why bother with quantum computing? Well, aside from the raw parallelism that superposition and entanglement provide, there are three main areas in data science that quantum computers should be good at:
- Finding the global minimum of a loss function.
- Dimensional reduction of very large data sets.
- Quantum classical hybrids.
There’s another property of quantum systems that we’ve not yet touched on: quantum tunnelling. The principle is a simple yet strange one: leave a particle in a box for long enough and it will eventually tunnel out and be found outside the box. This is how quantum computers can effectively find the global minimum of a loss function. Rather than getting stuck, they sit in a local minimum until they ‘tunnel’ through into the global minimum.
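As a purely classical analogy of this escape behaviour (this is simulated annealing on an ordinary CPU, not quantum tunnelling), here is a sketch on a double-well loss with a shallow local minimum near x = +2 and the global minimum near x = -2; the made-up loss function and schedule are just for illustration:

```python
import numpy as np

rng = np.random.default_rng(42)

def loss(x):
    # Double-well loss: global minimum near x = -2, local minimum near x = +2.
    return (x**2 - 4) ** 2 + 2 * x

x = 2.0  # start trapped in the local minimum
best_x, best_loss = x, loss(x)

for step in range(20000):
    T = 10.0 * 0.9997**step             # slowly cooling "temperature"
    candidate = x + rng.normal(0, 1.0)  # random local move
    delta = loss(candidate) - loss(x)
    # Accept downhill moves always; uphill moves with Boltzmann probability,
    # which lets the walker climb over the barrier between the wells.
    if delta < 0 or rng.random() < np.exp(-delta / T):
        x = candidate
    if loss(x) < best_loss:
        best_x, best_loss = x, loss(x)

print(best_x)  # near -2: the walker escaped the local minimum
```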
A big problem in optimising machine learning is that the calculations involved often take place in high-dimensional spaces (typically each variable in your dataset adds a dimension). Historically, data scientists have handled non-linearly separable data using something called the kernel trick (a topic for another blog post), which implicitly maps the data into a higher-dimensional space where it becomes linearly separable, without ever computing the coordinates in that space explicitly. It was recently shown that this kind of computation can be performed using a quantum algorithm that results in an exponential speedup in computation time [2]. This is a huge speedup.
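To make the kernel trick concrete (a classical numpy sketch, nothing quantum): for the polynomial kernel k(x, y) = (x·y)², the kernel value computed directly in the original 2-D space equals an ordinary dot product after explicitly mapping into a 3-D feature space, so we get the benefit of the higher dimension without ever constructing it.

```python
import numpy as np

def kernel(x, y):
    # Polynomial kernel: computed directly in the original 2-D space.
    return np.dot(x, y) ** 2

def feature_map(x):
    # The explicit 3-D map the kernel lets us skip:
    # phi(x) = (x1^2, sqrt(2)*x1*x2, x2^2)
    return np.array([x[0] ** 2, np.sqrt(2) * x[0] * x[1], x[1] ** 2])

x = np.array([1.0, 2.0])
y = np.array([3.0, 0.5])

implicit = kernel(x, y)                           # (1*3 + 2*0.5)^2 = 16
explicit = np.dot(feature_map(x), feature_map(y))

print(implicit, explicit)  # identical values
assert np.isclose(implicit, explicit)
```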
Combining the two would result in a hybrid: a large classical computing system that relies on a smaller quantum component to handle parts of its data. One proposed use of this would be in building up the topology of a data set in order to perform topological data analysis.
I hope that I’ve managed to provide a brief glimpse into why quantum computers have the potential to become relevant in the field of data science. I believe that quantum computers are only going to become increasingly common and that we as data scientists need to stay relevant by at least having some heuristic understanding of how they work.
References
[1]: Arute, Frank, et al. “Quantum supremacy using a programmable superconducting processor.” Nature 574.7779 (2019): 505–510.
[2]: Cong, Iris, and Luming Duan. “Quantum discriminant analysis for dimensionality reduction and classification.” New Journal of Physics 18.7 (2016): 073011.
