A framework for identifying where empirical quantum advantage might exist: a demonstration with electronic health records (EHRs)

Published in Qiskit · 5 min read · Jun 20, 2022

By Zoran Krunic (Amgen), Frederik F. Flöther (IBM Quantum), Omar Shehab (IBM Quantum)

How do I know if I should try a quantum computer for a given problem — and can I answer this question quickly and easily?

Recent progress in quantum computing applications across industries has brought these widespread questions into focus. There is an increasing need to rapidly evaluate different computational models for the possibility of achieving verifiable, real-world quantum advantage for a given problem and for given classical and quantum approaches. We call this empirical quantum advantage (EQA). Industry organizations and their leaders would benefit from being able to rapidly assess use cases to determine which ones are the best candidates for the application of quantum computing. Once the best candidate use cases have been selected, the current IT infrastructure could be enriched with software tools that provide an interface to quantum computers. Ideally, this could be done quickly, would not require extensive quantum computing knowledge, and would entail minimal disruption of current processes.

In a recent paper [1], the Amgen and IBM collaboration developed a framework to help identify data sets a priori where EQA could exist, carrying out one of the largest quantum machine learning (ML) experiments to date. This was done by introducing an index called the phase space terrain ruggedness index (PTRI). Inspired by geophysics, PTRI quantifies the ruggedness of a performance metric surface, such as accuracy, as a function of the number of features and training samples in an ML configuration space; an illustration is shown in Figure 1.

Figure 1: Example of a 3D plot showing how the ruggedness of accuracy (or another performance metric) can be quantified with the phase space terrain ruggedness index (PTRI) as a function of the number of features and training samples for ML models.

The PTRI quantification allows us to test hypotheses about the characteristics of regions where quantum advantage is likeliest. For instance, one can see in Figure 1 that as the number of training samples increases, the accuracy of the classical models plateaus (PTRI is low). This could be a natural region to explore quantum advantage. Determining precise quantitative links between the PTRI values of a certain region and the region’s quantum advantage (or lack thereof) is a promising direction for future research.
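To make the idea of "ruggedness" concrete, here is a minimal sketch that applies the classic terrain ruggedness index from geophysics to a grid of accuracy values. It assumes the per-cell definition in which ruggedness is the square root of the summed squared differences between a cell and its neighbours; the exact PTRI definition used in the paper [1] may differ. The function name `terrain_ruggedness` and the accuracy values are hypothetical and chosen only for illustration.

```python
import numpy as np


def terrain_ruggedness(surface):
    """Per-cell ruggedness: sqrt of the summed squared differences
    between each cell and its (up to 8) neighbours.

    Assumption: this is the classic geophysics-style index, used here
    as a stand-in for the paper's PTRI.
    """
    z = np.asarray(surface, dtype=float)
    rows, cols = z.shape
    # Pad with NaN so that edge cells simply have fewer neighbours.
    padded = np.pad(z, 1, mode="constant", constant_values=np.nan)
    total = np.zeros_like(z)
    for dr in (-1, 0, 1):
        for dc in (-1, 0, 1):
            if dr == 0 and dc == 0:
                continue  # skip the cell itself
            neighbour = padded[1 + dr:1 + dr + rows, 1 + dc:1 + dc + cols]
            diff = neighbour - z
            # Missing neighbours (NaN) contribute nothing to the sum.
            total += np.where(np.isnan(diff), 0.0, diff ** 2)
    return np.sqrt(total)


# Hypothetical accuracy surface (not data from the paper):
# rows = increasing number of features,
# columns = increasing number of training samples.
accuracy = np.array([
    [0.60, 0.70, 0.71, 0.71],
    [0.62, 0.74, 0.75, 0.75],
    [0.61, 0.73, 0.75, 0.75],
])

tri = terrain_ruggedness(accuracy)
print(np.round(tri, 3))
```

In this toy surface the values in the last column, where accuracy has plateaued, are lower than those in the first column, which mirrors the low-ruggedness plateau described above.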

An example of one of the resulting plots for quantum-to-classical advantage for the F1 score — a performance metric for assessing ML models — is shown in Figure 2. Additional metrics were explored, as described in the paper.

Figure 2: Example of an empirical quantum-to-classical advantage plot as a function of the number of features. The metric here is the nonprobability-based F1 score — other metrics were also studied in the paper.
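As a rough illustration of the kind of comparison plotted in Figure 2, the sketch below computes the F1 score from hard (non-probability) class predictions and then forms a quantum-to-classical ratio. Expressing the advantage as a simple ratio is an assumption made here for illustration; the paper [1] should be consulted for the exact definition. The helper names and the toy labels are hypothetical and are not results from the study.

```python
def f1_score(y_true, y_pred, positive=1):
    """F1 score for binary labels: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def advantage_ratio(quantum_score, classical_score):
    """Assumed form of the comparison: values above 1 favour the quantum model."""
    return quantum_score / classical_score


# Toy labels and predictions (hypothetical, for illustration only).
y_true = [1, 0, 1, 1, 0, 1, 0, 0]
classical_pred = [1, 0, 0, 1, 0, 1, 1, 0]
quantum_pred = [1, 0, 1, 1, 0, 1, 1, 0]

f1_classical = f1_score(y_true, classical_pred)
f1_quantum = f1_score(y_true, quantum_pred)
ratio = advantage_ratio(f1_quantum, f1_classical)
print(f1_classical, f1_quantum, ratio)
```

Repeating this calculation for each number of features would give one point per configuration, which is the shape of plot described in the caption.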

When looking for quantum advantage, or at least equivalence, we need to start with existing classical ML models. For a candidate ML use case, we would characterize the problem with the final data set used to train the model, the trained model itself, and the set of metrics and results. Since quantum ML today is limited to small data sets, some level of data subsampling and downsizing is often required; this may be specified in a one-page, well-structured input file. Use cases whose data already fall in the target size range can skip this step. The input file contains the sizes of the training, test, and validation sets, the most important features ranked by relevance, the different numbers of these features to be used for experimentation, and key quantum experiment characteristics (such as feature map and backend). This information could be entered within a few hours by a subject matter expert familiar with the classical ML problem that has previously been studied. From there, the code generation component in the Amgen-IBM framework is executed to create the new script for the quantum processing of the full grid. The results are aggregated and presented with comparison tables and plots.
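The workflow above can be sketched in a few lines of plain Python. The framework's actual input file format and code generator are not public in this post, so everything below is a hypothetical stand-in: the `ExperimentConfig` fields, the `build_grid` helper, the feature names, and the backend name are all assumptions. The feature map entries are used only as string labels.

```python
from dataclasses import dataclass
from itertools import product


@dataclass(frozen=True)
class ExperimentConfig:
    """Hypothetical in-memory version of the one-page input file."""
    train_sizes: tuple       # training set sizes to try
    test_size: int
    validation_size: int
    ranked_features: tuple   # most important feature first
    feature_counts: tuple    # how many top features to use per run
    feature_maps: tuple      # labels for the quantum feature maps
    backend: str             # label for the target backend


def build_grid(cfg):
    """Expand the config into one entry per experiment in the full grid."""
    grid = []
    for n_features, n_train, feature_map in product(
        cfg.feature_counts, cfg.train_sizes, cfg.feature_maps
    ):
        if n_features > len(cfg.ranked_features):
            raise ValueError("more features requested than were ranked")
        grid.append({
            # Keep only the top-n features by relevance.
            "features": cfg.ranked_features[:n_features],
            "n_train": n_train,
            "n_test": cfg.test_size,
            "n_validation": cfg.validation_size,
            "feature_map": feature_map,
            "backend": cfg.backend,
        })
    return grid


# Hypothetical values, for illustration only.
config = ExperimentConfig(
    train_sizes=(50, 100),
    test_size=20,
    validation_size=20,
    ranked_features=("feature_a", "feature_b", "feature_c", "feature_d"),
    feature_counts=(2, 3, 4),
    feature_maps=("ZZFeatureMap", "PauliFeatureMap"),
    backend="hypothetical_backend",
)

grid = build_grid(config)
print(len(grid), grid[0])
```

Each grid entry holds what a generated script would need for one run: which features to keep, how many training samples to draw, and which quantum settings to use. Aggregating one metric per entry then yields the feature-count by training-size surface discussed earlier.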

The Amgen-IBM framework therefore lends itself to seamless integration and rapid prototyping. The framework can evaluate dozens of different use cases with minimal knowledge of quantum processing, minimal hands-on time, and minimal impact on current infrastructure. It serves as a bridge from classical to quantum computing, in essence a quantum plug-in, and executes on top of Qiskit Runtime and IBM Quantum Services, which provide the models with access to state-of-the-art quantum computers.

Moreover, the Amgen-IBM framework is applicable cross-industry to a wide range of machine learning and optimization problems, for example in demand forecasting, fraud detection, and reservoir modelling. The specific use case considered in the paper concerns predicting how long rheumatoid arthritis (RA) patients continue to take certain biologic medications. There are well over 10 million RA patients globally, with over 1 million in the USA alone. Biologic medications represent a key component of this multi-billion-dollar market. Every incremental improvement in predictive analytics in this area can have a large positive impact. The analysis was based on the Optum® de-identified Electronic Health Record data set, a collection of de-identified and aggregated clinical and medical administrative data, and suitably sized cohorts were created from the 100+ million longitudinal EHR lives, as shown in Figure 3. An early alpha version of the software framework was developed to support the classical and quantum computing experimentation and result aggregation, requiring minimal manual adjustments by different business use case owners.

Figure 3: The Amgen-IBM framework was applied to suitable subsets of the Optum® de-identified Electronic Health Record data set in order to study the persistence of rheumatoid arthritis patients on biologic medications.

While the framework does not guarantee an empirical quantum advantage, it goes a long way toward enabling significantly easier and less labor-intensive evaluations of a large number of use cases. It also integrates seamlessly into existing IT and cloud infrastructures. These are key considerations in the adoption of quantum computing [2], [3]. Possible next steps include exploring the aforementioned quantitative links between PTRI and EQA and supporting new types of ML and optimization problems not covered by the first release. Such enhancements should help achieve EQA sooner for a variety of use cases in the coming years.

All coding was performed using Python 3 and Qiskit (of course), and code sections may be obtained on request from the corresponding author, Zoran Krunic, whose contact information is given in the paper linked below.

[1] https://ieeexplore.ieee.org/document/9779984

[2] https://www.mckinsey.com/business-functions/mckinsey-digital/our-insights/quantum-computing-use-cases-are-getting-real-what-you-need-to-know

[3] https://www.ibm.com/thought-leadership/institute-business-value/report/quantum-decade
