Published in

graalvm

Porting Matplotlib from C API to HPy

GraalVM is a truly polyglot runtime that supports multiple languages, including Python, JavaScript and Ruby. The work of the GraalVM team often goes beyond implementing and supporting languages: we also actively contribute to the ecosystems of the languages that run on our runtime. As part of these efforts, in the Python ecosystem we are contributing to the HPy project, which is an alternative to the existing CPython C API.

The CPython C API contributed greatly to the extensive ecosystem around the Python language. However, interacting directly with the C API, which is closely coupled to the CPython implementation of the language, has hindered a number of initiatives for Python. Extension modules that use the C API become tightly coupled to the CPython implementation, which makes it harder for CPython to evolve.

HPy proposes an alternative to the CPython C API that provides binary compatibility across Python versions and even across different Python implementations. A C extension binary built with HPy is therefore decoupled from individual Python releases and has no direct dependence on CPython internals, so a change in CPython internals should never break your binaries.

Port to HPy assurance

HPy permits an incremental porting process in which the C API and HPy work side by side during the transition. The following example shows a type that uses both the C API and HPy:

Example of a type implemented using both C API and HPy

As we can see in this example, foo is implemented using the C API, bar is implemented using HPy, and both methods are accessible on the type ClassA. This makes it possible to port to HPy piece by piece instead of rewriting everything at once. Moreover, HPy has a killer feature called debug mode that can detect memory leaks, use-after-free errors, and more; the HPy documentation describes it in detail.

Why Matplotlib?

Matplotlib is one of the most popular Python packages, particularly for data science applications. It's a comprehensive Python tool that produces graphs and animations through an easy-to-use API. Most of Matplotlib's heavy-lifting computations are implemented in C/C++, and part of that code interacts with the C API. Moreover, it uses NumPy's native API to accelerate some of its computations. Thus, porting Matplotlib to HPy poses a couple of unique challenges in terms of performance, compatibility, and complexity.

Porting Matplotlib C API to HPy

At the start of the porting journey, Matplotlib had 10 modules implemented using the C API: _backend_agg, _c_internal_utils, _contour, _image, _path, _tkagg, _ttconv, ft2font, _qhull, _tri. On top of those Matplotlib modules, there are two C extension module dependencies: NumPy and Kiwisolver. NumPy is very big in itself, so we take advantage of the incremental porting that HPy allows and keep the C API for the parts of Matplotlib that use the NumPy API. Kiwisolver, on the other hand, we ported to HPy entirely.

The porting process is fairly straightforward for the most part. HPy's naming convention makes it obvious what the HPy equivalent of any existing Python/C API function is. Here are a couple of examples:

Snippet of the Matplotlib HPy port

As shown in the example, the differences are mostly limited to function signatures and names. The changes can be summarized as follows:

  • Adding an H prefix to the name of the C API function.
  • Having ctx of type HPyContext * as the first argument.
  • Replacing the tuple argument with an array-style *args and its length nargs.
  • Replacing PyObject * with HPy.

The HPyContext * argument is provided automatically by HPy on each downcall from Python to C. HPyContext represents the Python interpreter that all the handles belong to, and it carries what would otherwise be Python's standard global variables. For example, the standard exception PyExc_ValueError is replaced with ctx->h_ValueError.
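The signature change can be illustrated without HPy itself. The following is only a toy model in plain C: toy_ctx, toy_arg and toy_sum are hypothetical names invented for this sketch and are not part of HPy. It shows the two ideas from the list above: the context travels as the first argument and carries what would otherwise be a global (here a stand-in for h_ValueError), and the arguments arrive as an array plus a length instead of a tuple.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-ins for illustration only; these are NOT HPy types. */
typedef struct { long value; } toy_arg;

typedef struct {
    int h_ValueError;  /* stand-in for a per-interpreter exception handle */
    int raised;        /* last error raised through this context, 0 = none */
} toy_ctx;

/* Array-style signature: context first, then the args array and its length. */
static long toy_sum(toy_ctx *ctx, const toy_arg *args, size_t nargs) {
    if (nargs == 0) {
        /* The error is reported through the context, not a process-wide global. */
        ctx->raised = ctx->h_ValueError;
        return -1;
    }
    long total = 0;
    for (size_t i = 0; i < nargs; i++) {
        total += args[i].value;
    }
    return total;
}

/* Runs the sketch and returns the sum of the three sample arguments. */
long toy_sum_demo(void) {
    toy_ctx ctx = { 1, 0 };
    toy_arg args[] = { {1}, {2}, {3} };

    long result = toy_sum(&ctx, args, 3);
    assert(ctx.raised == 0);

    /* Calling with no arguments sets the error on the context. */
    long failed = toy_sum(&ctx, args, 0);
    assert(failed == -1 && ctx.raised == ctx.h_ValueError);

    return result;
}
```

Because every function receives the context explicitly, nothing in the sketch depends on interpreter-wide state, which is the property that lets one binary serve different interpreters.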

Arguments Parser

Arguments in HPy are passed to the callee as an array instead of a tuple. This eases the parsing process for HPyArg_Parse and HPyArg_ParseKeywords. HPy parses the arguments using its own internal tools, with a much simpler routine for common use cases than the C API's PyArg_ParseTuple. This simplicity comes at the cost of a few non-critical features, which need to be implemented manually. Here is an example:

With the C API's PyArg_ParseTuple, the O& format unit converts a Python object with the help of a converter function that is passed as a function pointer. This feature is not available in HPy, so the & must be removed from the HPyArg_Parse format string and the conversion done manually after parsing the arguments, as with z.converter in the example above.
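The two-step pattern (parse first, convert afterwards) can be sketched in plain C. This is a toy model under stated assumptions: toy_value, toy_parse_one, toy_convert_double and toy_call are hypothetical names for this sketch, not HPy or C API functions. The converter follows the same return convention as an O& converter, 1 on success and 0 on failure.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for a Python object; NOT an HPy or C API type. */
typedef struct { int is_number; double number; } toy_value;

/* Converter in the O& style: returns 1 on success, 0 on failure. */
static int toy_convert_double(const toy_value *obj, double *out) {
    if (!obj->is_number) {
        return 0;
    }
    *out = obj->number;
    return 1;
}

/* Step 1: the parser only hands back the raw argument, unconverted. */
static int toy_parse_one(const toy_value *args, size_t nargs,
                         const toy_value **raw) {
    if (nargs != 1) {
        return 0;
    }
    *raw = &args[0];
    return 1;
}

/* Step 2: the caller invokes the converter itself after parsing. */
static int toy_call(const toy_value *args, size_t nargs, double *result) {
    const toy_value *raw = NULL;
    if (!toy_parse_one(args, nargs, &raw)) {
        return 0;
    }
    return toy_convert_double(raw, result);
}

/* Runs the sketch; returns 1 when all three cases behave as expected. */
int toy_converter_demo(void) {
    toy_value good[] = { {1, 2.5} };
    toy_value bad[]  = { {0, 0.0} };
    double z = 0.0;

    int ok = toy_call(good, 1, &z);
    assert(ok && z == 2.5);

    int rejected_type = !toy_call(bad, 1, &z);   /* converter fails */
    int rejected_count = !toy_call(good, 0, &z); /* parser fails */
    assert(rejected_type && rejected_count);

    return ok && rejected_type && rejected_count;
}
```

The only structural difference from the O& approach is who calls the converter: the parser in the C API, the extension code itself in this pattern.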

From Pointers to Handles

The HPy object represents a handle, hence the ‘H’ in HPy, which points to the Python object. Encapsulating the pointer within a handle rather than using the PyObject * directly has a couple of advantages:

  • It helps abstract away the implementation details of Python objects in the host Python interpreter, which allows, for example, a different GC implementation such as a moving GC.
  • It gives memory analysis tools more visibility, so they can detect leaked handles and handles that are used after being closed.

The HPy approach to allocating and releasing a Python object is notably different. Handles returned by a function are never borrowed, and handles passed as function arguments are never stolen. This means that when you get a handle from a call to an HPy API function, you are responsible for closing it using HPy_Close. On the other hand, if a handle comes in as an argument, you should never close it. Instead, if you need to return it, make a copy of the handle using HPy_Dup. One important point to keep in mind when dealing with an HPy object:

An HPy handle is short-lived: it lives no longer than the duration of one call from Python into an HPy extension function.

Another notable difference is that each handle acts independently, even if more than one handle points to the same Python object. Moreover, handles cannot be compared with each other directly. Use HPy_Is to check whether two handles refer to the same object, and HPy_IsNull to check whether a handle is null; the HPy documentation covers this in more detail.
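To make these rules concrete, here is a toy handle table in plain C. It is a sketch of the general idea only and says nothing about how HPy is actually implemented: toy_table, toy_open, toy_dup, toy_close and toy_is are hypothetical names. A handle is an index into a table rather than a pointer, duplicating a handle yields a new independent handle for the same object, identity is checked through the table rather than by comparing handles, and counting open slots is enough to spot a leak.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model for illustration only; none of these names are part of HPy. */
#define TOY_MAX_HANDLES 16

typedef struct { int payload; } toy_object;  /* stand-in for a Python object */
typedef struct { size_t slot; } toy_handle;  /* a table index, not a pointer */

typedef struct {
    toy_object *slots[TOY_MAX_HANDLES];  /* NULL marks a free (closed) slot */
    int open_count;                      /* handles opened but not yet closed */
} toy_table;

/* Opens a new handle for obj. Slot 0 is reserved as the null handle. */
static toy_handle toy_open(toy_table *t, toy_object *obj) {
    if (obj != NULL) {
        for (size_t i = 1; i < TOY_MAX_HANDLES; i++) {
            if (t->slots[i] == NULL) {
                t->slots[i] = obj;
                t->open_count++;
                return (toy_handle){ i };
            }
        }
    }
    return (toy_handle){ 0 };
}

/* Duplicating yields a second, independent handle to the same object. */
static toy_handle toy_dup(toy_table *t, toy_handle h) {
    return toy_open(t, t->slots[h.slot]);
}

/* Closing frees the slot; the assertion catches a double close. */
static void toy_close(toy_table *t, toy_handle h) {
    assert(h.slot != 0 && t->slots[h.slot] != NULL);
    t->slots[h.slot] = NULL;
    t->open_count--;
}

/* Identity is decided by the objects behind the handles, not the handles. */
static int toy_is(const toy_table *t, toy_handle a, toy_handle b) {
    return t->slots[a.slot] == t->slots[b.slot];
}

/* Runs the sketch and returns the number of leaked handles. */
int toy_handle_demo(void) {
    toy_table table = { {0}, 0 };
    toy_object obj = { 42 };

    toy_handle h1 = toy_open(&table, &obj);
    toy_handle h2 = toy_dup(&table, h1);

    assert(h1.slot != h2.slot);       /* the handles themselves differ */
    assert(toy_is(&table, h1, h2));   /* but they refer to one object */

    toy_close(&table, h1);            /* closing h1 leaves h2 usable */
    assert(table.slots[h2.slot]->payload == 42);
    toy_close(&table, h2);

    return table.open_count;
}
```

Since callers never see the object's address, the table is free to update a slot when an object moves, which is the property a moving GC relies on.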

Handling Type Fields

Some of the Python types that are implemented using the C API store field values within a struct for fast access and/or to store non-Python objects. Those types rely on having a pointer to the data structure, i.e. a struct that shares its layout with PyObject through the PyObject_HEAD macro. HPy, on the other hand, isolates the type's struct from the Python object and provides APIs to access it. This kind of isolation gives the underlying garbage collector (GC) of the host Python interpreter the freedom to move data without side effects. However, if the struct holds a reference to a PyObject *, the replacement must not be HPy but rather HPyField, because an HPy handle is short-lived and must not be stored. Here is an example of how this works:

In the example above, the type PyFT2Font has a couple of fields; some are Python objects and some are not. When a PyFT2Font object is instantiated, i.e. with HPy_New, HPy allocates the struct and associates it with the Python object. The referenced Python objects within the type's struct, name and py_file, need to be represented as HPyField and then accessed using HPyField_Store and HPyField_Load. With the help of the HPyType_HELPERS macro, HPy generates a helper function, PyFT2Font_AsStruct, that retrieves the type's struct from the HPy handle.
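The distinction between a short-lived handle and a long-lived field can be sketched with a toy model in plain C. All names here (mini_object, mini_handle, mini_field, mini_font_data and the functions) are hypothetical and only mirror the idea; they are not the HPy API and not Matplotlib's actual struct. The struct stores a field, and every access within a call loads a fresh handle from that field and closes it before the call returns.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model for illustration only; none of these names are HPy APIs. */
typedef struct { const char *text; } mini_object;           /* stand-in for a Python object */
typedef struct { mini_object *obj; int open; } mini_handle; /* valid within one call only */
typedef struct { mini_object *obj; } mini_field;            /* may be stored long-term */

/* Native data kept apart from the object itself. */
typedef struct {
    long face_index;  /* plain C data */
    mini_field name;  /* reference to another object, stored as a field */
} mini_font_data;

static int open_handles = 0;  /* lets the sketch check that handles balance */

static mini_handle mini_handle_new(mini_object *obj) {
    open_handles++;
    return (mini_handle){ obj, 1 };
}

static void mini_handle_close(mini_handle *h) {
    assert(h->open);  /* catches a double close */
    h->open = 0;
    open_handles--;
}

/* Storing copies the reference into the field; the handle stays the caller's. */
static void mini_field_store(mini_field *f, mini_handle h) {
    assert(h.open);
    f->obj = h.obj;
}

/* Loading always produces a fresh handle that the caller must close. */
static mini_handle mini_field_load(const mini_field *f) {
    return mini_handle_new(f->obj);
}

/* First call: the handle is stored into the field, then closed. */
static void mini_init(mini_font_data *data, mini_object *name_obj) {
    mini_handle h = mini_handle_new(name_obj);
    data->face_index = 0;
    mini_field_store(&data->name, h);
    mini_handle_close(&h);
}

/* Later call: load a new handle from the field, use it, close it. */
static const char *mini_get_name(const mini_font_data *data) {
    mini_handle h = mini_field_load(&data->name);
    const char *text = h.obj->text;
    mini_handle_close(&h);
    return text;
}

/* Runs the sketch and returns the number of handles left open. */
int mini_field_demo(void) {
    mini_object name = { "DejaVu Sans" };
    mini_font_data data;

    mini_init(&data, &name);
    assert(mini_get_name(&data)[0] == 'D');

    return open_handles;
}
```

No handle survives past the function that created it; only the field persists between calls, which is the discipline the paragraph above describes.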

What about NumPy usage on Matplotlib?

Matplotlib interacts with the NumPy native API, which is implemented using the C API. Although the NumPy port to HPy is still a work in progress, this is not an obstacle: HPy provides APIs for converting an HPy handle into a PyObject * and vice versa (HPy_AsPyObject and HPy_FromPyObject), so the parts of Matplotlib that call into NumPy can keep using the C API.

What about missing APIs needed for Matplotlib in HPy?

Before the port, HPy was missing some APIs that are important for Matplotlib. Most of the Matplotlib sources are implemented in C++, which has more restrictive pre-processing rules, while HPy's headers were using some C features that weren't compatible with C++ compilers. The HPy project welcomed the suggested adjustments to make the headers compatible. Moreover, some unicode, long and tuple operations were missing in the HPy API and have also been added. However, HPy's criterion for adding new APIs and functionality is how commonly those features are used within the top 4000 PyPI packages. Matplotlib uses some features that are not commonly used, so those features, such as parsing nested tuples in PyArg_ParseTuple, had to be implemented in Matplotlib as part of the port.

Performance

Kiwisolver

Kiwisolver was ported entirely to HPy as a dependency of Matplotlib, so we were able to run the benchmark included in the repository on the fully ported binaries. We ran the benchmarks using the following configurations:

  • Machine: Intel Core i9-10885H at 2.40GHz, Linux kernel version 5.10.60.1.
  • Kiwisolver (rev 1.3.2) built against the CPython C API using
    # git checkout 1.3.2
    # python setup.py install
  • Kiwisolver (rev HPy-1.3.2) built on CPython with the HPy CPython ABI using
    # git checkout HPy-1.3.2
    # python setup.py install
  • Kiwisolver (rev HPy-1.3.2) built on CPython with the HPy Universal ABI using
    # git checkout HPy-1.3.2
    # python setup.py --hpy-abi=universal install

In the following graph:

  • The X-axis represents the number of consecutive runs.
  • The Y-axis represents the time spent in seconds.

kiwi.suggestValue: CPython vs HPy

As shown, both the C API and HPy performed the same without any noticeable impact on performance.

Matplotlib

We ran the mpl-bench basic benchmark against multiple modes in CPython 3.8 with the following configurations:

  • Matplotlib (rev v3.4.x) built against the CPython C API using
    # git checkout v3.4.x
    # python setup.py install
  • Matplotlib (rev HPy-V3.4.x) built on CPython with the HPy CPython ABI using
    # git checkout HPy-V3.4.x
    # python setup.py install
  • Matplotlib (rev HPy-V3.4.x) built on CPython with the HPy Universal ABI using
    # git checkout HPy-V3.4.x
    # python setup.py --hpy-abi=universal install

In the following graphs:

  • The X-axis represents the number of consecutive runs.
  • The Y-axis represents the time spent in seconds.

Graphs (CPython 3.8 C API vs HPy): basic.time_plot; basic.time_subplots[1], [2] and [10]; basic.time_savefig; basic.time_projection['rectilinear'], ['polar'], ['mollweide'], ['lambert'], ['hammer'] and ['aitoff'].

As shown in the benchmark results, the Matplotlib port to HPy performed virtually the same as the C API version, for both the CPython ABI and the Universal ABI binaries.

Performance outcome on other Python implementations

HPy’s Universal ABI makes the produced binary portable, so that it can be loaded and executed by a variety of Python implementations, such as PyPy and GraalVM Python. Moreover, HPy can lead to significantly better performance on alternative Python implementations.

Both PyPy and GraalVM Python have a just-in-time (JIT) compiler that compiles the most frequently executed code, i.e. the hot path, into machine code on the fly. JIT compilers generally require warm-up runs, however, so that they can identify the hot path and produce the machine code needed to reach peak performance. Since GraalVM Python adopted the latest revision of HPy, we were able to run the Kiwisolver benchmark using the following configurations:

  • GraalVM Python HPy Universal ABI (Sulong), which ran the HPy port of Kiwisolver (rev HPy-1.3.2) using the LLVM backend. The module binary had to be recompiled into a bitcode format.
  • GraalVM Python HPy Universal ABI (Native), which ran the HPy port of Kiwisolver (rev HPy-1.3.2) using the native interface backend. The module binary is the same one we ran in the CPython HPy Universal ABI configuration.
kiwi.suggestValue: CPython, HPy and GraalVM Python

As shown, the results are very encouraging! They show that a complete port to HPy not only has the same performance on CPython as before, but also that GraalVM Python now achieves the same performance as CPython.

Matplotlib, on the other hand, isn’t a complete port to HPy, as it relies on NumPy for many of its calculations. The NumPy port to HPy is still a work in progress, as mentioned earlier, and GraalVM Python does not have full support for it yet. We were able to run basic.time_plot using the following configurations for GraalVM Python:

  • GraalVM Python C API, which ran a non-HPy version of Matplotlib (rev v3.4.x).
  • GraalVM Python HPy Universal ABI, which ran the HPy port of Matplotlib (rev HPy-V3.4.x).

Both GraalVM Python modes required a bitcode build of the Matplotlib binary and ran using the GraalVM LLVM (Sulong) backend. The following graph shows the benchmark results:

basic.time_plot: GraalVM Python and CPython 3.8

As shown in the graph, GraalVM Python gained a massive performance increase with the HPy version of Matplotlib compared to the C API version. The HPy implementation of Matplotlib was almost four times faster than the C API implementation for both peak and warm-up runs. GraalVM Python is still slower than CPython for this benchmark, however, because of the C API implementation of NumPy.

Final note

We hope this article has shown you the benefits of moving your modules to HPy. Not only does it let you decouple from CPython and thus target other implementations, it can also offer great performance.
