Learn About Retworkx — The Graph Library Used by Qiskit — And How to Contribute

Published in

Qiskit

7 min readSep 21, 2021

By Matthew Treinish, Senior Software Engineer at IBM Research

The retworkx library is an open source, high performance, general purpose Python graph library that is written in Rust and is developed in tandem with Qiskit. It is Apache 2.0 licensed and development is hosted on Github at: https://github.com/Qiskit/retworkx. In Qiskit, we use retworkx to build, manipulate, and interact with graph data structures. Since we adopted the library, it has enabled us to significantly improve Qiskit’s runtime performance.

History

Around the time of the Qiskit 0.7 and 0.8 releases, in the first half of 2019, we were starting to see runtime performance scaling issues with the Qiskit transpiler. At that time, we were using the NetworkX library, which is a popular library for manipulating graphs and complex networks. When we looked at performance profiles when transpiling circuits, we saw that Qiskit was spending a large portion of time on graph operations, particularly in NetworkX. As the size of quantum systems started scaling, this increasingly became an issue.

It was around this time I started the retworkx project, initially mostly as an experiment in interfacing the Rust programming language with Python. At the start of the project I was intending for it to just be a replacement for Qiskit Terra’s internal NetworkX usage, one that offered higher performance with minimal API differences that made it easy to adapt Qiskit to use the new library. However, since it was initially adopted by Qiskit, the goal of retworkx has grown to be a general purpose graph library that strives to offer similar functionality to NetworkX while offering significantly improved performance and still retaining the ease of use that NetworkX provides.

How Qiskit uses graphs

Qiskit uses graphs at pretty much every level. The biggest example is the DAGCircuit class in qiskit-terra which is a directed acyclic graph (DAG) representation of a quantum circuit, where we represent operations as nodes, and edges represent the flow of data between nodes.

For example, consider this basic Bell circuit:

When built as a DAGCircuit, it looks like:

In the DAG representation it’s easy to see the flow of data between instructions in the circuit. The transpiler in Qiskit operates on a DAGCircuit object, each transpiler pass either performs analysis on the DAG or transforms it to make a change to the circuit. The simplest example of a transformation is basis gate translation, for which the BasisTranslator pass is used in qiskit-terra. The transpiler pass iterates over every node in the graph, and for any gates outside of the basis set of the target backend, it replaces it with an equivalent circuit employing the target basis set. For example, let's say our target basis set is that which is currently used by IBM Quantum's devices (['x', 'sx', 'rz', 'cx', 'id']). If the Bell circuit above was being transpiled with our target basis set, then the Hadamard gate in the DAG would be replaced with RZ(pi/2) SX RZ(pi/2) by the BasisTranslator pass. For example:

Qiskit uses graphs in other places as well. The most obvious example is the coupling map or connectivity graph, that shows the connectivity of qubits on a quantum processor. For example, the coupling map for the retired IBM Quantum Rochester system looks like this:

There are also additional uses of graphs throughout Qiskit, but these are the data structures used in the transpiler that have the largest effect on runtime performance.

Why Rust?

For those who aren’t familiar with the Rust programming language, it is a systems programming language designed for performance and safety. While I could talk at great length about all the general language advantages of Rust, this is about retworkx, so instead I’ll link to other articles that cover this:

In our case, we see performance gains because Rust is a compiled language that results in a machine code output that runs without the python interpreter.

Besides the inherent language qualities of Rust, the biggest advantage for using it to develop retworkx is the developer tooling and workflow, and how it integrates cleanly with Python.

From a developer tooling perspective, Rust provides a much simpler experience that involves using a single tool, cargo, for dependency management, project lifecycle, and build system. This simplifies the task of compiling the library and any dependencies when installing the package with Python standard packaging tools like pip. Additionally, by default, Rust will statically link all your Rust dependencies so that you end up with a single self-contained library file that is easier to package for Python, since you do not have to worry about ensuring users have all the Rust dependencies installed. Finally, in retworkx, we also rely on the setuptools-rust library. With the setuptools-rust extension, an end user from Python can use pip or setup.py to install from source (assuming they have the Rust compiler installed) just like with a pure python library regardless of what platform or system they're running on.

These are just a few examples of how using Rust has been advantageous for the development of rewtorkx for Qiskit. Most users don’t see these details as they install pre-compiled binaries from PyPI, but for development, this makes a large impact as you need to build every time you make a change to the source code, and being able to do this easily and reliably improves your productivity.

The Performance Improvement

At the time of first introduction for Qiskit’s DAGCircuit class, retworkx provided a ≥ 2x improvement in runtime, and graph operations were >10x faster than the NetworkX implementations.

Let’s take a look at the performance improvement that we observed for the following example, a transpile for a 53 qubit quantum volume circuit on the IBM Quantum Rochester system (which was the largest device available at the time) with current Qiskit code:

Looking at the runtime performance differences between NetworkX and retworkx In the Qiskit Terra 0.13.0 release (which is quite old at this point — 18 months old at the time of writing), yielded a substantial performance improvement:

The above table shows a subset of the internal calls from Qiskit to retworkx and NetworkX comparing the times spent in each call.

Then looking at a profile visualization (generated with SnakeViz) shows the overall performance difference between using NetworkX and retworkx more clearly:

Looking at NetworkX first, the above profile graph shows the call hierarchy vertically and horizontally to represent time spent in a function. Here we see that we’re spending most of our time in the StochasticSwap pass. However, at a high level view like this, it's hard to see exactly what parts of the StochasticSwap pass are causing the bottleneck. To show where the graph operation bottlenecks are, the highlighted pink boxes represent time spent in the lexicographical_topological_sort() function, which is the cumulatively the slowest graph function in the profile.

The following graph shows the performance when using retworkx instead of NetworkX. We can see that the overall time for running transpile() is roughly half of what it was with NetworkX. But more importantly, the pink boxes in this graph represent calls of lexicographical_topological_sort just like the NetworkX profile visualization. Here you can see that what was cumulatively one of the slowest function calls with NetworkX is only a tiny portion of the time with retworkx.

It’s worth noting that the performance of Qiskit has improved significantly since Qiskit Terra 0.13.0, when this analysis was run. This analysis also used the retworkx 0.3.4 release, while retworkx 0.10.2 is the latest release. However, this was the only time when side-by-side comparison was feasible — since then, Qiskit solely uses retworkx. While retworkx has gotten faster, most of that improvement was found by eliminating bottlenecks in Qiskit’s code and also adding additional features to the retworkx API to provide native operations that Qiskit can leverage. By introducing retworkx as the graph library, we basically eliminated graph operations as a bottleneck, which enabled us to find (and begin to fix) other bottlenecks in the transpiler. Prior to the introduction of retworkx it was difficult to find these performance issues in Qiskit’s transpiler because over 50% of the transpiler run time was spent in NetworkX.

Contributing to retworkx

The retworkx library continues to grow and has many applications outside of just Qiskit. In recent months we’ve been working on expanding the library to include more general graph applications and functionality. If you’re interested in helping out, come talk with us in the #retworkx channel on Qiskit slack so we can work together on improving retworkx.

You can also look at some good first issues or if you have a use case or functionality you’d like to see added to retworkx please open an enhancement issue here.