Scientific computing in a polyglot world with xtensor

5 min readOct 18, 2019

In the future, there will be actual wars about which language humans should use to speak to the machine.

The One Language to Rule Them All probably does not exist yet. In the world of interactive scientific computing, countless articles and blog posts argue on the merits of R, Python, Julia and other technologies. The truth is, while this diversity is a chance, efforts are duplicated in each community and collaboration is sparse, which hurts sustainability in the long run.

For numerical code, C-level efficiency can be achieved in different ways. Preferred approaches differ in Python and Julia. On the one hand, authors of Python extension modules often make use of tools such as cython, or write their code in a way that is amenable to vectorization with NumPy. On the other hand, Julia fully relies on just-in-time compilation and the LLVM stack. These two approaches are not easily reconcilable.

As library authors, we directly work with C++ for the numerical code, and provide bindings for the preferred languages of the end users (Julia, Python, R, etc…). This is a use case in which the xtensor C++ library particularly shines.

xtensor is a C++ lazy tensor algebra library providing an API similar to that of NumPy, including broadcasting and universal functions. You can check out the NumPy to xtensor cheat sheet. More importantly, xtensor is an expression system, which enables the use of tensor expressions backed by arbitrary backends such as

in-memory containers
database systems
filesystem operations
data structures from other libraries or programming languages providing a C API.

In other words, xtensor enables a uniform way of operating on data regardless of the source. Besides, it transparently makes use of lazy computing techniques, to prevent the computation of temporary variables, and allow the symbolic manipulation of large expressions. This is also what enabled language bindings such as xtensor-python, xtensor-julia, and xtensor-r which wrap native NumPy and Julia arrays as xtensor expressions.

xtensor-python makes use of the excellent pybind11, which was created by Wenzel Jakob.
xtensor-julia relies on CxxWrap, which was authored by Bart Janssens.
xtensor-r is built upon the popular Rcpp, which was written by Romain François and Dirk Eddelbuettel.

Learning by the example

Assume we have some Python module based on a process function that operates on two numpy arrays. After some time, we realize that the performance is not so good and that the bottleneck is the process function. To fix that, we decide to replace it with a C++ native implementation, exposed to Python as an extension module.

First implementation

Let’s start with a straightforward implementation, that is, a simple C++ module exposing one function to Python using pybind11 and xtensor-python, which can be done with a single cpp file, pyprocess.cpp. In this example, we provide all the boilerplate to produce a minimal extension module.

// pyprocess.cpp
#define FORCE_IMPORT_ARRAY
#include "xtensor-python/pyarray.hpp"

namespace py = pybind11;
using pyarray_type = xt::pyarray<double>;

double process(const pyarray_type& e1, const pyarray_type& e2)
{
    // Operate on `e1` and `e2` using the xtensor numpy-style API.
}

PYBIND11_PLUGIN(cppprocess)
{
    xt::import_numpy();
    py::module m("cppprocess");
    m.def("process", process);
    return m.ptr();
}

Once the module is built, we can use it in our Python code like any other Python function:

import cppprocess as xp
res = xp.process(e1, e2)  # e1 and e2 are numpy arrays

In the process C++ function implemented above, the pyarray C++ objects offer an API very similar to that of a numpy array, while following the idioms of the C++ standard library, including:

iterator pairs to iterate in various fashions over N-dimensional arrays
special functions, reducers, broadcasting, universal functions
linear algebra routines

A cheat sheet for the numpy to xtensor migration can be found here.

Now we would like to reuse this new implementation in other contexts:

a pure C++ context
a Julia module

A naive approach would be to make this C++ implementation standalone and use it both in a native C++ context and a Python extension. However, the resulting C++ program would then depend on numpy, since process operates on arguments of type pyarray, which is a C++ wrapper for numpy arrays. In a pure C++ module, we would rather make use of the xarray and xtensor containers, or any other implementation of the xtensor expression interface. The way to reconcile these use cases is to make use of the abstract xtensor expression type.

Enforcing the usage of expressions

Before separating the implementation of process, we make it a template function so it can work with any expression type. A naive approach would be a signature of the form

template <class E1, class E2>
double xprocess(const E1& e1, const E2& e2)

This xprocess template function is generic and allows the use of any type for its arguments. If we try to call it with something that does not actually implement the xexpression interface, we will end up with cryptic errors. To avoid this, we can modify the signature of xprocess so that the compiler complains when the function arguments are not xtensor expressions:

// in xprocess.hpptemplate <class E1, class E2>
double xprocess(const xt::xexpression<E1>& e1,
                const xt::xexpression<E2>& e2)
{
    const E1& m1 = e1.derived_cast();
    const E2& m2 = e2.derived_cast();
    // Operate on m1 and m2 using the numpy-style xtensor API.
}

The first line in the function retrieves the real type of the expression so we can keep the rest of the implementation unchanged.

// in pyprocess.cppPYBIND11_PLUGIN(cppprocess)
{
    xt::import_numpy();
    py::module m("cppprocess", "cppprocess extension module");
    m.def("process", xprocess<xt::xexpression<pyarray_type>,
                              xt::xexpression<pyarray_type>>);
    return m.ptr();
}

Splitting the project

Now we can split the project into:

A header-only C++ project containing the generic xprocess template function.
A Python extension depending on the C++ template library, exposing to Python a specialization of the abstract implementation for the pyarray and pytensor wrappers on numpy.

We can now use xprocess in pure C++ contexts without any dependency on pybind11 or xtensor-python. Since xprocess is generic, we can use it with other types of expressions, such as xtensor, xarray, or even expresions backed by database calls, filesystem operations, etc.

But we can go further. We can now expose xprocess to Julia with very little additional work:

// in jlprocess.cpp#include "jlcxx/jlcxx.hpp"
#include "xtensor-julia/jlarray.hpp
#include "xprocess/xprocess.hpp"using jlarray_type = xt::jlarray<double>;double jlprocess(jlarray_type e1, jlarray_type e2)
{
    return xprocess(e1, e2);
}

JLCXX_MODULE define_julia_module(jlcxx::Module& mod)
{
    mod.method("process", jlprocess);
}

Conclusion

Creating extension modules for Julia and Python based on a single implementation of the C++ engine is made easy by the xtensor, xtensor-python, and xtensor-julia projects.

In summary, these guiding principles help you reuse your code in many different languages and contexts:

separate the pure C++ implementation in a dedicated project that does not depend on xtensor-python, xtensor-julia, or other language bindings.
operate on abstract xexpression types rather than explicit types such as xtensor or xarray.
Python bindings and Julia bindings should be thin wrappers around the expression-based C++ implementation.