Darkwing Duck & Quantum Classification

Kathie Wang
6 min read · Apr 26, 2024


Beginner’s Guide to Quantum Support Vector Machines

Recently, we find ourselves lightning-struck by the terms “machine learning,” “classification,” and “supervised.” But many of us don’t know how these terms are related! They can quickly become jumbled in our minds and hinder us from fully understanding how these powerful algorithms are designed and built, charring us into a husk of confusion.

I initially thought that the three terms sat on the same “level” when describing an algorithm, meaning a model was either a machine learning algorithm or a classification algorithm. This especially confused me when I read about studies using classification and supervised machine learning together! It turns out machine learning is the general term, and classification is just one big category under machine learning. Supervised classification, in turn, can be broken down into two parts: the learning step and the classification step.

In the learning step, or training phase, the model is trained on the training data, a dataset that shows the model what the expected output should look like. The training data contains input objects along with each of their desired output values, which allows the classification algorithm to “learn” which characteristics are associated with which labels and outputs. By learning these relationships and “rules,” the classification algorithm can classify similar data points correctly. Next, in the classification step, the model evaluates new, unseen test data and makes predictions based on what it has previously learned.

Image from labellerr.com
Image from analytixlabs
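
As a quick, purely classical illustration of these two steps, here is a minimal scikit-learn sketch; the iris dataset and the model choice are placeholders I picked for illustration, not anything specific to QSVMs:

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

# A small labeled dataset: inputs X and their desired output labels y
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

model = SVC()                # a support vector classifier (more on these below!)
model.fit(X_train, y_train)  # learning step: the model "learns" the rules

predictions = model.predict(X_test)    # classification step: label unseen data
print((predictions == y_test).mean())  # fraction classified correctly
```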

There are even more subcategories within classification algorithms! Every classification algorithm shares the same goal of sorting new data into its correct group, but each reaches that goal through different methods. Today, I will focus specifically on support vector machines, and on how the quantum version differs from its classical counterpart.

Important Vocabulary

To explain how quantum support vector machines (QSVMs) work, I have compiled definitions of the important SVM terms below. Take your time reading these in order, and really try to visualize the descriptions!

Hyperplane: a boundary that separates different classes of data points. When the data points live in a high-dimensional space, the separating boundary is also high-dimensional, so it is called a hyperplane. Just as a 2D space (a standard graph) can be split by a 1D line, a 5-dimensional space can be split by a 4-dimensional hyperplane. For QSVMs, hybrid classical-quantum optimization techniques are used to find the hyperplane.

Importance: Finding the best hyperplane is the main goal of SVMs because the hyperplane decides how test data points are categorized!

This image from PennyLane shows how the original data can’t be separated by a line, but once transformed to a higher dimension, the red and blue data can be split by a hyperplane, which looks like a line!
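
We can recreate that idea with a tiny, purely classical example of my own (the numbers are made up for illustration): points on a line that no single threshold can separate become separable once we add a squared coordinate.

```python
import numpy as np

# 1D data: class 0 sits in the middle, class 1 on both sides.
# No single cut point on the line separates the two classes.
x = np.array([-2.0, -1.5, -0.2, 0.0, 0.3, 1.6, 2.1])
y = np.array([1, 1, 0, 0, 0, 1, 1])

# Map each point to 2D as (x, x^2). Now the classes can be split
# by the horizontal line x^2 = 1 (a 1D hyperplane in 2D space).
mapped = np.column_stack([x, x**2])
print("class 0 has x^2 < 1:", (mapped[y == 0, 1] < 1).all())
print("class 1 has x^2 > 1:", (mapped[y == 1, 1] > 1).all())
```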

Feature map: a map that transforms the raw features of a dataset by taking data points in a lower-dimensional feature space and mapping them onto a higher-dimensional space. Quantum feature maps, in particular, use quantum gates and circuits to transform classical data (typically in vector form) into quantum states, which are represented with qubits.

Importance: With the data in a higher dimension, the algorithm can capture more complex patterns in the data. Feature maps also allow SVMs to perform linear classification in a higher-dimensional space, even when the original data is not linearly separable in its original space.
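
To make the quantum version concrete, here is a minimal PennyLane sketch of a quantum feature map. I’m assuming angle encoding plus one entangling gate, which is just one common choice among many, not the specific map used by any of the referenced sources:

```python
import numpy as np
import pennylane as qml

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

@qml.qnode(dev)
def feature_map_state(x):
    # Encode a classical vector x into a quantum state:
    # each feature becomes a rotation angle on one qubit.
    qml.AngleEmbedding(x, wires=range(n_qubits))
    # Entangle the qubits so the map can capture feature interactions.
    qml.CNOT(wires=[0, 1])
    return qml.state()

x = np.array([0.5, 1.2])
print(feature_map_state(x))  # the quantum state that now represents x
```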

Margin: the distance between the hyperplane and the nearest data points of each class. In SVMs, the optimal hyperplane is the one that splits the data correctly AND has the greatest margin.

Importance: By maximizing the margin subject to the constraints of the optimization problem, we are closer to obtaining a model with good generalization ability.
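
In the classical linear case there is even a neat closed form: the margin width is 2/‖w‖, where w is the hyperplane’s weight vector. A quick sketch with toy data of my own invention:

```python
import numpy as np
from sklearn.svm import SVC

# Two well-separated clusters in 2D (toy data for illustration)
X = np.array([[0, 0], [0, 1], [1, 0], [3, 3], [3, 4], [4, 3]])
y = np.array([0, 0, 0, 1, 1, 1])

clf = SVC(kernel="linear", C=1e6).fit(X, y)  # large C approximates a hard margin
w = clf.coef_[0]
print("margin width:", 2 / np.linalg.norm(w))    # gap between the two classes
print("support vectors:", clf.support_vectors_)  # the points that set the margin
```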

Kernel function: a function that computes the similarity between two data points in the feature space. In SVMs, the “similarity” of two points is the inner product of their mapped representations, and in QSVMs, this inner product is estimated using quantum circuits and algorithms.

Importance: The kernel function ties all the vocab terms above together: its similarity scores are what the optimizer uses to pick the hyperplane that has the greatest margin and best separates the data points in the feature space, allowing the model to make predictions with high accuracy.
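
One standard way to estimate this inner product on a quantum computer is the “overlap” trick: apply the feature map for one point, then the inverse (adjoint) of the map for the other; the probability of measuring all zeros then equals the squared inner product of the two encoded states. A minimal PennyLane sketch, reusing the toy angle-encoding map assumed above:

```python
import numpy as np
import pennylane as qml

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

def feature_map(x):
    # Same toy feature map as before: angle encoding plus entanglement
    qml.AngleEmbedding(x, wires=range(n_qubits))
    qml.CNOT(wires=[0, 1])

@qml.qnode(dev)
def kernel_circuit(x1, x2):
    feature_map(x1)               # prepare |phi(x1)>
    qml.adjoint(feature_map)(x2)  # "undo" the map for x2
    return qml.probs(wires=range(n_qubits))

def quantum_kernel(x1, x2):
    # Probability of the all-zeros outcome = |<phi(x2)|phi(x1)>|^2
    return kernel_circuit(x1, x2)[0]

a, b = np.array([0.5, 1.2]), np.array([2.0, 0.1])
print(quantum_kernel(a, a))  # ~1.0: a point is maximally similar to itself
print(quantum_kernel(a, b))  # smaller: dissimilar points
```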

This image shows an (n-1)-dimensional hyperplane separating data in n-dimensional space (the “support vectors” are the data points closest to the hyperplane, which determine where it sits). There are many possible hyperplanes that categorize the data, but the plane with the largest margin is the best one.

Putting It All Together

To create the quantum support vector machine, the first step is to encode the classical data into quantum states using a quantum feature map. In the quantum circuit, the parameters of the quantum gates encode information about the classical data. The quantum circuits are then used to define the kernel function: quantum gates manipulate the encoded states described above, and measurements of the resulting states are used to compute the inner product between them.

Next is the training phase. The QSVM model is trained on the learning/training data, finding the best hyperplane separating the data points in the high-dimensional space, i.e. the plane that maximizes the margin. Finally, during the testing phase, new unlabeled data is classified according to the solution found in the training phase. A sketch of the whole pipeline follows below.
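
Putting the pieces into code, here is a minimal end-to-end sketch of this hybrid workflow, assuming PennyLane for the quantum kernel and scikit-learn for the classical optimization; the toy feature map, data, and labels are all placeholders of my own, not a definitive implementation:

```python
import numpy as np
import pennylane as qml
from sklearn.svm import SVC

n_qubits = 2
dev = qml.device("default.qubit", wires=n_qubits)

def feature_map(x):
    qml.AngleEmbedding(x, wires=range(n_qubits))
    qml.CNOT(wires=[0, 1])

@qml.qnode(dev)
def kernel_circuit(x1, x2):
    feature_map(x1)
    qml.adjoint(feature_map)(x2)
    return qml.probs(wires=range(n_qubits))

def kernel_matrix(A, B):
    # Quantum part: pairwise kernel values between two sets of points
    return np.array([[kernel_circuit(a, b)[0] for b in B] for a in A])

np.random.seed(0)
X_train = np.random.uniform(0, np.pi, size=(20, 2))
y_train = (X_train.sum(axis=1) > np.pi).astype(int)  # made-up labels
X_test = np.random.uniform(0, np.pi, size=(5, 2))

# Classical part: the SVM optimizer finds the maximum-margin hyperplane
# from the precomputed quantum kernel matrix.
qsvm = SVC(kernel="precomputed")
qsvm.fit(kernel_matrix(X_train, X_train), y_train)          # training phase
predictions = qsvm.predict(kernel_matrix(X_test, X_train))  # testing phase
print(predictions)
```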

In short, QSVMs classify data points into different classes by finding the optimal hyperplane that maximizes the margin between the classes, so unseen data can be correctly classified in the future. They use quantum computers to prepare the quantum states and evaluate the kernel function, and use classical computers to perform the optimization required to find the optimal hyperplane.

How does a quantum SVM differ from a classical SVM?

While classical support vector machines are already very well established, their quantum versions can be even more powerful! Below are three potential advantages of QSVMs:

  1. Higher-Dimensional Mapping: When encoding data into higher-dimensional feature maps, quantum computers can represent many states at once thanks to superposition. This can lead to better separation of data points in complex feature spaces compared to classical kernels. When evaluating the kernel function, quantum computers can also potentially compute the inner products faster than their classical counterparts by exploiting superposition and entanglement.
  2. Faster training times: Thanks to the computational power of quantum hardware, the optimal hyperplane can potentially be found faster than with a purely classical process. This can be especially useful for large-scale machine learning tasks where training time is a bottleneck.
  3. Better generalization performance: Because quantum computers use quantum mechanics (including superposition and entanglement) to represent the data, the representation can be more efficient. QSVMs have been shown to make more accurate predictions on unseen data than classical SVMs on certain problems.

Summary

Today, you learned about quantum support vector machines, a type of classification algorithm in supervised machine learning. You learned about how quantum feature maps, kernel functions, margins, and hyperplanes all play a role in training the model, so it can make accurate predictions on future data. I hope this article cleared up some confusion on the key words in quantum machine learning, and that you understand SVMs better!

Click here to check out Vidur’s article on neural networks!

References

https://www.mdpi.com/1424-8220/19/23/5219

https://www.geeksforgeeks.org/major-kernel-functions-in-support-vector-machine-svm/

https://dzone.com/articles/quantum-support-vector-machine-101

https://www.analytixlabs.co.in/blog/classification-in-machine-learning/

https://www.labellerr.com/blog/supervised-vs-unsupervised-learning-whats-the-difference/

https://pennylane.ai/qml/demos/tutorial_kernels_module/
