Introduction to Quantum Support Vector Machines

Patrick Huembeli
Oct 19, 2019


Today we are giving a hands-on introduction to Quantum Machine Learning (QML) at the QML workshop at the Institute of Photonic Sciences (ICFO) in Barcelona. As an example we introduce the Quantum Support Vector Machine (QSVM) from [1] and show how to run it on a simple dataset with Qiskit. In this post I would like to give you a brief summary of QSVMs and share the main parts of the code. The whole Jupyter notebook can be found on my GitHub page.

I will assume here that you already know what a classical SVM is. If you don't, I suggest that you download the GitHub repo and go through the Jupyter notebook. Two of my colleagues, Alexandre Dauphin and Gorka Muñoz, wrote a nice introduction there, where classical SVMs are reviewed.

I will try to keep this post as simple as possible. If you want it a bit more rigorous and extensive, go to the Jupyter notebook on GitHub.

We also prepared a Google Colab file to run it directly in the cloud.

Quantum Classification

Before we start with quantum SVMs, how does supervised classification work in general? We have a set of points {𝑥⃗} that belong to one of two groups (which we label, e.g., y = {-1, 1}), and we want to find a line (or hyperplane in higher dimensions) that separates these two groups. This decision boundary can be linear, but it can also be much more complex; in the classical SVM we achieve more complexity with the use of kernels.

So what will we need for supervised classification on a quantum computer?

  • 1st, we have to translate the classical data point 𝑥⃗ into a quantum data point |Φ(𝑥⃗)⟩. This can be achieved by a circuit V(Φ(𝑥⃗)), where Φ(…) could be any classical function applied to the classical data 𝑥⃗.
  • 2nd, we need a parameterised quantum circuit W(Θ) with parameters Θ that processes the data in a way that in the end we…
  • 3rd, …can apply a measurement that returns a classical value -1 or 1 for each classical input 𝑥⃗, identifying its label.

Following these steps we can define an ansatz for this kind of problem: W(Θ)V(Φ(𝑥⃗))|0⟩.

These kinds of ansätze are called variational quantum circuits.
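The three steps above can be sketched end-to-end with plain numpy on a single qubit. This is a toy illustration, not the circuit from [1]; the specific gate choices (Rz for encoding, Ry for processing) are my own assumptions:

```python
import numpy as np

# Single-qubit gates used below.
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)   # Hadamard
Z = np.array([[1, 0], [0, -1]])                # Pauli-Z


def Rz(theta):
    """Rotation about the Z axis."""
    return np.array([[np.exp(-1j * theta / 2), 0],
                     [0, np.exp(1j * theta / 2)]])


def Ry(theta):
    """Rotation about the Y axis."""
    c, s = np.cos(theta / 2), np.sin(theta / 2)
    return np.array([[c, -s], [s, c]])


def classify(x, theta):
    """Toy one-qubit variational classifier:
    1. encode the classical value x with V(Phi(x)) = Rz(x) H,
    2. process with the parameterised circuit W(theta) = Ry(theta),
    3. measure <Z> and return its sign as the label."""
    state = np.array([1, 0], dtype=complex)        # |0>
    state = Rz(x) @ (H @ state)                    # V(Phi(x)) |0>
    state = Ry(theta) @ state                      # W(theta) V(Phi(x)) |0>
    expect_z = np.real(state.conj() @ Z @ state)   # <Z>
    return 1 if expect_z >= 0 else -1
```

In a real variational classifier the parameters Θ would be trained so that the measured labels match the training data.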

Quantum SVM

In the case of the quantum SVM in this paper, only the quantum feature map V(Φ(𝑥⃗)) is used to translate the classical data 𝑥⃗ into quantum states, and the kernel of the SVM is built out of these quantum states. After calculating the kernel matrix on the quantum computer, the quantum SVM can be trained in the same way as a classical SVM.

There are also QSVMs where the training itself is performed on a quantum computer, for example [2]. But these kinds of algorithms most likely won't work on near-term quantum computers.

Defining the Quantum Kernel

The idea of the quantum kernel is exactly the same as in the classical case. We take the inner product of the feature maps, 𝐾(𝑥⃗,𝑧⃗)=|⟨Φ(𝑥⃗)|Φ(𝑧⃗)⟩|², but now with quantum feature maps. The idea is that if we choose a quantum feature map that is not easy to simulate on a classical computer, we might obtain a quantum advantage.
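For intuition, here is the kernel definition evaluated for a hypothetical one-qubit feature map |Φ(x)⟩ = Rz(x)H|0⟩. This map is my own toy choice (and is trivially classically simulable, so it carries no quantum advantage); it only illustrates the formula:

```python
import numpy as np


def feature_state(x):
    """Toy one-qubit feature state |Phi(x)> = Rz(x) H |0>."""
    plus = np.array([1, 1], dtype=complex) / np.sqrt(2)       # H|0>
    rz_diag = np.array([np.exp(-1j * x / 2), np.exp(1j * x / 2)])
    return rz_diag * plus


def quantum_kernel(x, z):
    """K(x, z) = |<Phi(x)|Phi(z)>|^2; for this map it equals cos((x - z)/2)**2."""
    return abs(np.vdot(feature_state(x), feature_state(z))) ** 2
```

As expected for any kernel of this form, K(x, x) = 1, and the kernel decays as the two inputs move apart.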

Side note: there is no proof yet that the QSVM brings a quantum advantage, but the argument the authors of [1] make is that there is certainly no advantage if we use feature maps that are easy to simulate classically, because then we would not need a quantum computer to construct the kernel.

Feature map

For the feature map we use the ansatz V(Φ(𝑥⃗)) = U(Φ(𝑥⃗)) 𝐻^⊗ⁿ from [1], which simplifies a lot when we only consider up to Ising-like interactions, meaning we only let two qubits interact at a time. This leads to two-qubit ZZ interactions and non-interacting Z terms. 𝐻^⊗ⁿ denotes a Hadamard gate applied to each qubit, where n is the number of qubits.

And the classical functions are defined as Φᵤ(𝑥⃗) = xᵤ and
Φᵤᵥ(𝑥⃗) = (π-xᵤ)(π-xᵥ).
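In code, these coefficient functions are one-liners (the function names are mine):

```python
import numpy as np


def phi_single(x, i):
    """Phi_i(x) = x_i, the coefficient of the single-qubit Z_i term."""
    return x[i]


def phi_pair(x, i, j):
    """Phi_ij(x) = (pi - x_i)(pi - x_j), the coefficient of the Z_i Z_j term."""
    return (np.pi - x[i]) * (np.pi - x[j])
```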

If we write this ansatz for n=2 qubits with the classical data vector 𝑥⃗= (x₁, x₂) the feature map takes the form:

U(Φ(𝑥⃗)) = exp(i {x₁ Z₁ + x₂ Z₂ + (π - x₁)(π - x₂) Z₁Z₂})

We won't go into too much detail here about why we take this ansatz. It is simply an ansatz that is simple enough for near-term devices to run and still leads to good results.

Finally, we can define a depth for these circuits. Depth 2 means we repeat this ansatz two times, so our feature map becomes

V(Φ(𝑥⃗)) = U(Φ(𝑥⃗)) 𝐻^⊗ⁿ U(Φ(𝑥⃗)) 𝐻^⊗ⁿ
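Since all generators here (Z and ZZ) are diagonal, the unitary U(Φ(𝑥⃗)) can be written down directly in numpy without a general matrix exponential. A small sketch for n = 2, following the formulas above (the construction is my own, not Qiskit code):

```python
import numpy as np

I2 = np.eye(2)
Z = np.diag([1.0, -1.0])
H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)


def U_phi(x):
    """U(Phi(x)) = exp(i{x1 Z1 + x2 Z2 + (pi - x1)(pi - x2) Z1 Z2}) for n = 2.
    Every term is diagonal, so the exponential acts elementwise on the diagonal."""
    Z1 = np.kron(Z, I2)
    Z2 = np.kron(I2, Z)
    generator = (x[0] * Z1 + x[1] * Z2
                 + (np.pi - x[0]) * (np.pi - x[1]) * np.kron(Z, Z))
    return np.diag(np.exp(1j * np.diag(generator)))


def V_phi(x, depth=2):
    """Depth-d feature map: d repetitions of the layer U(Phi(x)) H^{(x)2}."""
    Hn = np.kron(H, H)                      # Hadamard on both qubits
    layer = U_phi(x) @ Hn
    out = np.eye(4, dtype=complex)
    for _ in range(depth):
        out = layer @ out
    return out
```

A quick sanity check is that V(Φ(𝑥⃗)) is unitary for any input vector.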

Measuring the Quantum Kernel

Finally, we need to extract the information about the quantum kernel from the quantum circuit to feed it into the classical SVM algorithm. This is actually a non-trivial task, because we want to measure the overlap of two states, 𝐾(𝑥⃗,𝑧⃗)=|⟨Φ(𝑥⃗)|Φ(𝑧⃗)⟩|². There is a smart way to estimate this overlap; in [1] the following circuit is used.

How to estimate the overlap |⟨Φ(𝑥⃗)|Φ(𝑧⃗)⟩|² for a circuit depth of S=2, where V(Φ(𝑥⃗)) = U(Φ(𝑥⃗)) 𝐻^⊗ⁿ U(Φ(𝑥⃗)) 𝐻^⊗ⁿ. (Image from [1])

It simply uses the fact that if V(Φ(𝑥⃗)) = V(Φ(𝑧⃗)), the whole unitary in the figure simplifies to the identity matrix and the input equals the output.
If we now measure the output, the frequency of the measurement string [0,0,0,0,…] gives an estimate of the overlap |⟨Φ(𝑥⃗)|Φ(𝑧⃗)⟩|². (Because initially all qubits were |0⟩, the output should also be |0⟩ for every qubit.)
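The procedure can be mimicked classically: apply V(Φ(𝑧⃗))†V(Φ(𝑥⃗)) to |0…0⟩ and count how often the all-zero string comes back. A one-qubit toy version, where the feature map is my own stand-in rather than the one from [1]:

```python
import numpy as np


def feature_circuit(x):
    """Hypothetical one-qubit V(Phi(x)) = Rz(x) H, as a 2x2 matrix."""
    H = np.array([[1, 1], [1, -1]]) / np.sqrt(2)
    Rz = np.diag([np.exp(-1j * x / 2), np.exp(1j * x / 2)])
    return Rz @ H


def estimate_overlap(x, z, shots=4096, seed=0):
    """Apply V(z)^dagger V(x) to |0> and estimate |<Phi(x)|Phi(z)>|^2
    as the frequency of measuring the all-zero string."""
    rng = np.random.default_rng(seed)
    Vx, Vz = feature_circuit(x), feature_circuit(z)
    state = Vz.conj().T @ Vx @ np.array([1, 0], dtype=complex)
    p0 = abs(state[0]) ** 2              # exact probability of outcome 0
    samples = rng.random(shots) < p0     # simulate `shots` measurements
    return samples.mean()
```

For identical inputs the estimate is 1, and for this particular map the overlap vanishes when the inputs differ by π.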

Build QISKIT circuit for feature map

Qiskit Aqua provides a pre-built function that constructs the quantum circuit for V(Φ(𝑥⃗)). The second-order expansion refers to the up-to-two-qubit (Ising-like) interactions we described above.

The function SecondOrderExpansion has the arguments feature_dimension, which is the dimension of the input data 𝑥⃗ and at the same time the number of qubits n, and depth, the number of repetitions of the feature map.

from qiskit.aqua.components.feature_maps import SecondOrderExpansion

feature_map = SecondOrderExpansion(feature_dimension=2,
                                   depth=1)

As a first test you can define an arbitrary classical vector x and print the circuit to see what the SecondOrderExpansion looks like.

import numpy as np

x = np.array([0.6, 0.3])
feature_map.construct_circuit(x).draw(line_length=100)

Which returns this circuit:

You might wonder now why there are no exp(i Z₁Z₂) and exp(i Z₁) matrices. This is simply because Qiskit does not provide these gates directly. They have to be translated into the gate set that Qiskit provides, which is mainly 1-qubit rotations (here U3) and CNOTs. Qiskit does this work for us: it translates our feature map directly to gates that can be run on their quantum computer. This is called compiling.
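The key piece of this compilation is the standard identity exp(−iθ Z⊗Z) = CNOT · (I⊗Rz(2θ)) · CNOT, which we can check numerically. This is a sketch of the identity itself, not of Qiskit's internal compiler (which may differ by global phases and gate conventions):

```python
import numpy as np

I2 = np.eye(2)
CNOT = np.array([[1, 0, 0, 0],
                 [0, 1, 0, 0],
                 [0, 0, 0, 1],
                 [0, 0, 1, 0]], dtype=complex)


def Rz(theta):
    """Single-qubit Z rotation."""
    return np.diag([np.exp(-1j * theta / 2), np.exp(1j * theta / 2)])


def zz_via_basis_gates(theta):
    """exp(-i theta Z(x)Z) built from CNOT . (I (x) Rz(2 theta)) . CNOT."""
    return CNOT @ np.kron(I2, Rz(2 * theta)) @ CNOT


def zz_exact(theta):
    """exp(-i theta Z(x)Z) computed directly (Z(x)Z is diagonal)."""
    ZZ = np.kron(np.diag([1, -1]), np.diag([1, -1]))
    return np.diag(np.exp(-1j * theta * np.diag(ZZ)))
```

Both constructions agree exactly, which is why a compiler can replace the two-qubit exponential with one rotation sandwiched between two CNOTs.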

QSVM Algorithm in QISKIT

Qiskit Aqua also provides a pre-defined function to train the whole QSVM. We only have to provide the feature map, a training set, and a test set, and Qiskit will do all the work for us.

Apart from evaluating the quantum kernel, the QSVM algorithm only performs classical optimization. In the end there is no difference to the classical SVM, except that the kernel is computed on a quantum device.

QSVM optimizes the standard SVM dual objective

𝐿(α) = ∑ᵤ αᵤ − ½ ∑ᵤᵥ 𝑦ᵤ𝑦ᵥ αᵤαᵥ 𝐾(𝑥⃗ᵤ,𝑥⃗ᵥ)

over the coefficients αᵤ.
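Given a precomputed kernel matrix K, this standard SVM dual objective is straightforward to evaluate classically. A minimal numpy sketch (the function name is mine):

```python
import numpy as np


def dual_objective(alphas, ys, K):
    """Standard SVM dual: sum_u alpha_u - 1/2 sum_uv y_u y_v alpha_u alpha_v K_uv."""
    alphas = np.asarray(alphas, dtype=float)
    ys = np.asarray(ys, dtype=float)
    w = ys * alphas                       # elementwise y_u * alpha_u
    return alphas.sum() - 0.5 * w @ K @ w
```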

After training we can predict a label 𝑦′ of a data instance 𝑠⃗ with 𝑦′=sign(∑ᵤ𝑦ᵤαᵤ𝐾(𝑥⃗ᵤ,𝑠⃗)).
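The prediction rule can be written out for an arbitrary kernel function. Here a classical RBF kernel stands in for the quantum kernel, and the unit coefficients and support vectors are made up for illustration:

```python
import numpy as np


def predict(support_x, support_y, alphas, kernel, s):
    """y' = sign(sum_u y_u alpha_u K(x_u, s)) for a new data instance s."""
    score = sum(y * a * kernel(x, s)
                for x, y, a in zip(support_x, support_y, alphas))
    return 1 if score >= 0 else -1


def rbf(x, z):
    """Classical RBF kernel used as a stand-in for the quantum kernel K."""
    d = np.asarray(x) - np.asarray(z)
    return float(np.exp(-d @ d))
```

With support vectors [0, 0] (label −1) and [2, 2] (label +1), a new point near the origin inherits the −1 label because its kernel value with the nearby support vector dominates the sum.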

from qiskit.aqua.algorithms import QSVM

qsvm = QSVM(feature_map, training_input, test_input)

As training and test sets we chose the sklearn datasets.load_breast_cancer() dataset. In the GitHub repository we provide a Python script that lets you load everything.

Run QSVM in QISKIT

Finally, we have to define where we would like to run this algorithm. For now we will run it on a local QASM simulator, but the algorithm could also be sent to IBM Q and evaluated on a real quantum computer.

We have to define shots, the number of measurements taken for each qubit. For better reproducibility we also set the random seeds seed_simulator and seed_transpiler.

from qiskit.aqua import run_algorithm, QuantumInstance
from qiskit import BasicAer
backend = BasicAer.get_backend('qasm_simulator')
quantum_instance = QuantumInstance(backend, shots=1024, seed_simulator=10598, seed_transpiler=10598)
result = qsvm.run(quantum_instance)

Analyze output

result is a dictionary that contains elements like the kernel matrix, the label predictions, and the classical weights of the QSVM. For example, you can print the testing accuracy.

print("testing success ratio: ", result['testing_accuracy'])

The qsvm object also has a predict method, with which we can predict the labels of a given test set:

y_test = qsvm.predict(test_set, quantum_instance)

And we can compare the labeling from the QSVM (first plot) with the correct labeling (second plot).

test_set = np.concatenate((test_input['Benign'], test_input['Malignant']))
plt.scatter(test_set[:, 0], test_set[:, 1], c=y_test)
plt.show()
plt.scatter(test_input['Benign'][:, 0], test_input['Benign'][:, 1])
plt.scatter(test_input['Malignant'][:, 0], test_input['Malignant'][:, 1])
plt.show()
Left: Label prediction by the QSVM, Right: Correct labeling

References

[1] Vojtech Havlicek, Antonio D. Córcoles, Kristan Temme, Aram W. Harrow, Abhinav Kandala, Jerry M. Chow, and Jay M. Gambetta, Supervised learning with quantum-enhanced feature spaces, Nature 567, 209–212 (2019).

[2] Rebentrost, P., Mohseni, M. & Lloyd, S., Quantum support vector machine for big data classification, Physical Review Letters 113, 130503 (2014).


Patrick Huembeli

I am a postdoctoral researcher in the Computational Quantum Science Lab at the École Polytechnique Fédérale de Lausanne (EPFL).