Hybrid Quantum-Classical Neural Network for Classification of Images in the FashionMNIST Dataset

Akif Baser
Published in CodeX · 7 min read · Aug 15, 2021

To kickstart my quantum journey after two weeks of the IBM Qiskit Global Summer School 2021 on Quantum Machine Learning, I explored the hybrid classical-quantum neural network architecture that Qiskit provides for PyTorch. For anyone who wants to explore quantum computing and kickstart their own quantum journey, it is a good starting point: it shows how to embed a quantum element into an existing AI/ML workflow and quickly obtain practical results from experimenting with quantum computing.

The core advantage of quantum computing is that it leverages quantum mechanical phenomena such as interference, superposition and entanglement for computational gain. For certain problems it promises exponential speedups over classical computing, making tractable computations that would otherwise be out of reach.

An illustration of the hybrid architecture from the Qiskit textbook

The scope of this case study is to establish a basic hybrid quantum-classical neural network architecture for the classification of images in the FashionMNIST dataset. Qiskit's TorchConnector will be used to integrate a Quantum Neural Network (QNN) layer into the PyTorch workflow. The QNN is then available as a PyTorch module and can be trained jointly without additional considerations. The picture above, from the Qiskit textbook, provides a nice illustration of the hybrid architecture. The code for this case study can be found on GitHub. If you want to take a deep dive into quantum computing, check out the literature in the references.

Quantum-Classical Neural Network

The PyTorch architecture used here is a feed-forward neural network. The inputs to the classical layer of nodes are real-valued vectors. The hidden layer of the network is a parameterized quantum circuit that takes the output of the classical layer; the rotation angle of each gate in the quantum circuit is specified by a component of the classical input vector³. A QNN works differently from a classical neural network: instead of passing data through neurons with weights and biases, the input data is encoded into qubits. The objective at the end is the same, to minimize the loss function, but it is achieved slightly differently. A sequence of quantum gates, the quantum analogue of the logic gates in conventional circuits, is applied to the qubits, and the gate parameters are adjusted so that the loss function, the difference between the network's predictions and the true labels, is minimized. These gate operations are unitary operators and form the building blocks of a quantum circuit².

Quantum data (blue/orange colors show the two mapped labels) represented on the Bloch sphere (left) and a representation of a classical bit vs. a qubit (right)

It is important to note that in quantum computing, the basic unit of information is the qubit. Unlike a classical bit, which is always either 0 or 1 during a computation, a qubit can be in a superposition, a linear combination of its basis states |ψ⟩ = α|0⟩ + β|1⟩. Its value only becomes definite when a measurement is made, collapsing to 0 with probability |α|² or to 1 with probability |β|². This might sound counterintuitive, since the qubit exists in a continuum between |0⟩ and |1⟩ before it is measured, but thanks to this property a register of qubits can represent exponentially many logical states at once. Entanglement is another interesting property: two highly correlated quantum systems whose joint state cannot be written as a product |ψ⟩ = |a⟩|b⟩ of individual states, which is what makes superdense coding possible¹. A linear entanglement pattern will be used in this case study. Measurement outcomes carry inherent randomness: a single measurement samples from the outcome distribution, so the most probable state is the one observed most often over repeated runs.

QNN Architecture

To go from the classical data in the hybrid architecture to quantum states, the classical data (xᵢ) is encoded into the quantum state space (|ϕ(xᵢ)⟩) by using a quantum feature map. The classical data is then represented in Hilbert space as a quantum state and can serve as input to our parameterized quantum circuit⁸. In this case, a ZZFeatureMap is constructed and used to map the classical feature vector to the quantum state.

A more detailed example of a ZZFeatureMap with its Hadamard and CNOT gate operations

A two-layer QNN is then created with Qiskit's TwoLayerQNN class and embedded into the PyTorch architecture as a hidden layer. The TwoLayerQNN is initialised with the constructed ZZFeatureMap, an ansatz (RealAmplitudes, so the quantum states have real amplitudes only) and the number of qubits.

Input gradients are set to True to allow hybrid gradient backpropagation. Linear entanglement is enabled and, for this case, the Pauli expectation is used to compute the expectation value of the observable with respect to a state function.

The complete FashionMNIST dataset contains 60,000 training images and 10,000 test images of fashion objects in 10 classes. For this case only two labels were filtered from the training dataset, namely the t-shirts and the trousers. Once a working hybrid framework is established, the next step could be to expand to multi-label prediction by adding more qubits and gate operations (and time: I worked on this case study in my spare time during the EuroPython 2021 conference, and multi-label prediction required more of it). The filtered data is loaded into PyTorch's DataLoader, which wraps an iterable around the dataset to enable easy access to the samples.
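Filtering the two classes comes down to a boolean mask over the labels. A minimal sketch, using a synthetic stand-in for FashionMNIST so it stays self-contained; with torchvision one would apply the same mask to `train.data` and `train.targets` of the downloaded dataset:

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

def filter_two_classes(images, labels, keep=(0, 1)):
    """Keep only the samples whose label is in `keep` (t-shirt/top=0, trouser=1)."""
    mask = (labels == keep[0]) | (labels == keep[1])
    return images[mask], labels[mask]

# synthetic stand-in: 10 fake 28x28 grayscale images with mixed labels
images = torch.randn(10, 1, 28, 28)
labels = torch.tensor([0, 1, 2, 3, 0, 1, 9, 0, 1, 5])

x, y = filter_two_classes(images, labels)
loader = DataLoader(TensorDataset(x, y), batch_size=4, shuffle=True)
```

The DataLoader then yields shuffled mini-batches containing only the two remaining classes.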

A classical Convolutional Neural Network (CNN) with fully connected layers is created, with the QNN embedded in it. To briefly define convolution in a neural network: ‘a mathematical operation on two functions to produce a third function, where the parameters are shared for computational efficiency’ (Raviv, E). At each operation, the size of the next layer is reduced with respect to the current layer. Each neuron of the CNN is fed an input vector multiplied by its distinct weight; the outcome of that operation is evaluated and forwarded to the next layer.

Max pooling is used for the downsampling operation and ReLU as the activation function. The output of the classical layers is forwarded into the QNN and then on to the output layer to obtain the final prediction. The TorchConnector class enables the integration of the QNN into the classical CNN. The measurement performed at the end of the QNN, along the Z-axis of the Bloch sphere, converts the quantum data back into classical data. The model is optimized with the Adam optimizer.

Results

The code is implemented in a Jupyter notebook. The hybrid classifier performs reasonably well, although the accuracy may be modest at first.

With linear entanglement enabled, the prediction accuracy of the model is 96.2%. A higher accuracy would be desirable, but that is left for further research.

Predicted images by the hybrid CNN-QNN architecture

The results may not show the great quantum advantage promised at first. The quantum layer is in fact simulated on a classical computer (not yet run on a quantum computer itself), calculating what a quantum computer would ideally do. The scope of this brief case study was to explore a starting hybrid architecture with TorchConnector and a two-layer QNN that could be optimised or advanced later with further research and development. Furthermore, the project could be extended to multi-class prediction.

With the current Noisy Intermediate-Scale Quantum (NISQ) devices a lot can already be done to create proofs of concept, while we wait for fault-tolerant quantum computers, machines with around a million qubits (see the roadmap of IBM), to deliver real quantum advantage. It is therefore a good time to start thinking about a quantum strategy for your business or organisation, and about how to adopt and integrate quantum computing to your advantage in time.

End of the brief journey…

As my quantum journey of the summer of 2021 comes to an end, it is at the same time the beginning of a bigger one. Considering the current developments and the significant investments by large enterprises in quantum computation, along with the thousands of researchers and developers passionate about the field, I am very excited about the future of quantum computing and the many possibilities and opportunities it will unlock. I am happy to join this exciting journey.

References

  1. https://qiskit.org/textbook/ch-machine-learning/machine-learning-qiskit-pytorch.html
  2. Sutor, R. (2019) “Dancing with Qubits: How Quantum Computing Works and How It Can Change the World”, Packt.
  3. Loredo, R. (2020) “Learn Quantum Computing with Python and IBM Quantum Experience”, Packt.
  4. https://medium.com/from-the-diaries-of-john-henry/qml-6a5b68fb95d9
  5. Knill, E. et al. (2002) “Introduction to Quantum Information Processing”, arXiv:quant-ph/0207171v1.
  6. https://www.ibm.com/blogs/research/2021/02/quantum-development-roadmap/
  7. https://medium.com/qiskit/building-a-quantum-variational-classifier-using-real-world-data-809c59eb17c2
  8. Schuld, M. (2018) “Quantum machine learning in feature Hilbert spaces”, https://arxiv.org/pdf/1803.07128.pdf
  9. Nielsen, M. & Chuang, I. (2010) “Quantum Computation and Quantum Information”, Cambridge University Press.
