Calling C functions from Python

And how to interact with Numpy arrays in C

Matias Aravena Gamboa
spikelab

Introduction

I love Python because it’s a fantastic programming language. It’s simple to learn, it’s very flexible, and there are tons of libraries, tools and frameworks that make development easier. That’s why it’s one of the most beloved programming languages right now. It can be used in almost any project, from web development to machine learning.

But pure Python has a big problem. For some tasks it can be very slow, like veeeery slow 🐢, especially for math-related tasks that involve working with arrays or vector operations. There are optimising compilers like Cython or Numba that help us get something close to C speed without losing the feeling that we are programming in Python. However, sometimes this is not appropriate or convenient, and you just want to write a function in C and then call it directly from Python.

Working with arrays in Python

Numpy is one of the most popular packages for scientific computing in Python. It allows you to create very efficient matrices and vectors in Python with a C backend. In this post, we are going to create a C function that takes a numpy array, performs some operation on it and then returns a numpy array back to Python.

Using C functions in Python

Let’s suppose that we have an n x m matrix and we want to calculate, for each row, the sum of its elements. Thus, the function takes an n x m matrix and returns an n x 1 array. This operation, which is quite easy to implement, needs to iterate over each element of the matrix. If we do it in Python it would be something like this:
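A minimal sketch of such a pure-Python loop, assuming the matrix is given as a list of lists. The name py_sum and the matrix size are illustrative choices, so timings from this sketch will not match the numbers quoted in this post.

```python
import random

def py_sum(matrix, n, m):
    """Return a list with the sum of each row of an n x m matrix."""
    result = [0.0] * n
    for i in range(n):        # one pass per row
        for j in range(m):    # add every element of row i
            result[i] += matrix[i][j]
    return result

# Illustrative sizes (an assumption, not the ones used for the timings).
n, m = 4, 3
matrix = [[random.random() for _ in range(m)] for _ in range(n)]
row_sums = py_sum(matrix, n, m)
print(row_sums)
```

The same function also accepts a 2D numpy array, since numpy supports the matrix[i][j] indexing used here.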

On my computer this script takes about 55 seconds to run. Python is not fast at running for loops, so the time taken is not a surprise. You can read more about Python slowness here, but someone new to Python would probably write it this way if they need to work with numpy arrays or Python lists.
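A measurement like this can be taken with time.perf_counter from the standard library. The sketch below is only one possible setup: time_call and slow_row_sums are hypothetical helper names, and the 200 x 200 matrix is much smaller than whatever produced the 55 seconds.

```python
import time

def time_call(fn, *args):
    """Run fn(*args) once and return (result, elapsed_seconds)."""
    start = time.perf_counter()
    result = fn(*args)
    elapsed = time.perf_counter() - start
    return result, elapsed

def slow_row_sums(matrix):
    # Deliberately naive nested loops, as a pure-Python stand-in.
    sums = []
    for row in matrix:
        total = 0.0
        for value in row:
            total += value
        sums.append(total)
    return sums

matrix = [[1.0] * 200 for _ in range(200)]   # illustrative size
sums, seconds = time_call(slow_row_sums, matrix)
print(f"pure Python loop: {seconds:.4f} s")
```

The same time_call helper can wrap any other implementation, so both versions are timed in exactly the same way.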

Of course, this is a somewhat silly example, because you could just use numpy to compute it using matrix.sum(axis=1). However, when the procedure doesn’t exist in numpy, you’ll be forced to write it in pure Python and it will quite likely be very slow.
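As a quick check of that claim, the sketch below compares matrix.sum(axis=1) with explicit loops on a tiny array. The values are made up for illustration.

```python
import numpy as np

matrix = np.array([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])

# axis=1 collapses the columns, leaving one sum per row.
row_sums = matrix.sum(axis=1)

# The same result with explicit Python loops, for comparison.
loop_sums = np.zeros(matrix.shape[0])
for i in range(matrix.shape[0]):
    for j in range(matrix.shape[1]):
        loop_sums[i] += matrix[i, j]

print(row_sums, loop_sums)
```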

Let’s write the same function in C:

The C function takes a pointer to the numpy array’s data, then uses malloc to allocate enough space for the resulting array of n sums. Then we iterate over the matrix using a double for loop. Notice that the outer loop walks the flat buffer from 0 to n*m in steps of one row length, which is m for a row-major n x m array (the same as n when the matrix is square). This is because we need to iterate over the 2d array as if it were a 1d array.
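The indexing scheme is easier to see on a small example. The sketch below mirrors the loop structure in Python rather than C, assuming a row-major layout where row i starts at offset i*m. The name flat_row_sums is hypothetical, and a plain list stands in for the malloc’ed result buffer.

```python
def flat_row_sums(flat, n, m):
    """Row sums of an n x m matrix stored row by row in a flat sequence."""
    results = [0.0] * n                  # plays the role of the malloc'ed output
    index = 0
    for start in range(0, n * m, m):     # start = offset of each row
        for j in range(m):               # walk the m elements of that row
            results[index] += flat[start + j]
        index += 1
    return results

# A 2 x 3 matrix [[1, 2, 3], [4, 5, 6]] laid out as C would see it.
flat = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
sums = flat_row_sums(flat, 2, 3)
print(sums)
```

With a step of m the rows never overlap. A step of n would only give the right answer for square matrices.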

Once our C function is ready, we need to compile it like this:

cc -fPIC -shared -o src/c_sum.so src/c_sum.c

This will create an .so file that we can load into Python. For this we need to import some functions from the ctypes library, which allows us to interact with C. Then we need to load our .so file and bind our function to a Python variable. Next we need to specify the return type of the C function. Our C function returns a pointer to an array, so we use ndpointer as its restype.

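from ctypes import cdll, c_double, c_int, c_void_p
from numpy.ctypeslib import ndpointer

# load the compiled C library
lib = cdll.LoadLibrary("src/c_sum.so")
c_sum = lib.c_sum  # c_sum is the name of our C function
# n is the number of rows of the matrix, i.e. the length of the result
c_sum.restype = ndpointer(dtype=c_double, shape=(n,))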
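To see why restype matters without compiling anything, the sketch below applies the same pattern to atof from the C standard library, used here purely as a stand-in for c_sum. It assumes Linux or macOS, where ctypes.CDLL(None) exposes the symbols already loaded into the Python process. This does not work on Windows.

```python
import ctypes

# Assumption: a POSIX system. CDLL(None) gives access to the symbols of the
# running process, which include the C standard library.
libc = ctypes.CDLL(None)

atof = libc.atof                     # double atof(const char *s)
atof.argtypes = [ctypes.c_char_p]    # the argument is a C string
atof.restype = ctypes.c_double       # without this, ctypes assumes an int return

value = atof(b"2.5")
print(value)
```

If restype is left unset, ctypes interprets the return value as a C int, so a function that returns a double or a pointer would come back as a meaningless number.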

C is a statically typed language, so when we declare a variable or call a function we need to specify the types of the variables. Python, on the other hand, is a dynamically typed language, which means that we don’t tell Python which data type is used. In fact, everything in Python is an object. That is why we need to use the ctypes library to specify the C data types that we are passing to our C function:

result = c_sum(c_void_p(matrix.ctypes.data),c_int(n),c_int(m))

matrix.ctypes.data is an attribute of the numpy array that holds the memory address of its data as a Python integer. We wrap it in c_void_p to specify that we are passing a pointer to our function. In the same way, we use c_int to indicate that we are passing a value of type int.
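The sketch below shows what these conversions produce on a small array, without calling any C code. It assumes a C-contiguous float64 matrix, which is what a C function reading double values row by row would expect. The variable names are illustrative.

```python
import ctypes
import numpy as np

matrix = np.arange(6, dtype=np.float64).reshape(2, 3)

address = matrix.ctypes.data          # plain Python int: address of element [0, 0]
as_void = ctypes.c_void_p(address)    # the pointer argument passed to the C function
as_double = ctypes.cast(as_void, ctypes.POINTER(ctypes.c_double))

# Reading through the pointer shows the row-major layout that C will see.
flat = [as_double[k] for k in range(matrix.size)]

n_arg = ctypes.c_int(matrix.shape[0])  # a Python int wrapped as a C int
print(flat, n_arg.value)
```

An array with another dtype, or a non-contiguous view such as a transpose, would need np.ascontiguousarray(matrix, dtype=np.float64) first, otherwise the C side would read the wrong bytes.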

The C function takes 0.27 seconds to run, which is about 200 times faster than the 55 seconds of the pure Python version!!! 🤯 The full python code is in the following gist:

Conclusions

Calling C functions in Python is a great way to optimize bottlenecks in our code. Python lets us develop applications very quickly thanks to the flexibility of the language. If part of our code turns out not to be fast enough, we can then use C to make it faster. It is better to do this only once we know where the bottleneck is because, as Donald Knuth said:

Premature optimization is the root of all evil

Please feel free to make comments or suggestions.

Matias Aravena Gamboa
spikelab

Machine Learning Engineer and beer enthusiast. Valdivia, Chile