Performance gain by writing a C extension in Python

abhijeet gorhe
Oct 9, 2017 · 3 min read

An interpreted language will generally not match the performance of a compiled language on CPU-bound code. Ever since I moved to Python from C/C++, I have wanted to combine the best of both worlds by extending Python in C.

To gauge the performance benefits, I coded the same algorithm (a trivial sort) in Python, C and Cython, then ran each version with the same input sizes.

Let's start by writing some simple C code.

void swap(int *a, int *b) {
    int tmp = *a; *a = *b; *b = tmp;
}

void sort(int *array, int len) {
    for (int i = 0; i < len; i++) {
        for (int j = i; j < len; j++) {
            if (array[i] > array[j]) {
                swap(&array[i], &array[j]);
            }
        }
    }
}

We can build this and create a shared library (.so) with:

gcc -fPIC -c sort.c 
gcc -shared -o sort.o

This should create in your PWD

Now let's write the same algorithm in Cython and save this file as cysort.pyx

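def cy_sort(iarray, length):
    for i in range(length):
        for j in range(i, length):
            if iarray[i] > iarray[j]:
                # Swap the out-of-order pair, as swap() does in the C version
                iarray[i], iarray[j] = iarray[j], iarray[i]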

Write as

from distutils.core import setup
from Cython.Build import cythonize

setup(ext_modules=cythonize("cysort.pyx"))

Build the Cython extension:

python build_ext --inplace

This should create cysort.c and in your PWD


Now that we are done with the Cython and C parts, let's build the Python part.

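from ctypes import *
import random
import sys
from cysort import cy_sort

def convert_list_to_array(lll):
    # Copy a Python list into a ctypes array of C ints
    intarray = (c_int * len(lll))()
    for i, l in enumerate(lll):
        intarray[i] = l
    return intarray

def print_c_array(iarray, length):
    for i in range(0, length):
        print(iarray[i])

def py_sort(iarray, length):
    # Same trivial sort as the C version, in pure Python
    for i in range(length):
        for j in range(i, length):
            if iarray[i] > iarray[j]:
                iarray[i], iarray[j] = iarray[j], iarray[i]

if len(sys.argv) != 3:
    print("Incorrect number of arguments Arg1=sample size (< 10000000) ,Arg2=C/Python/Cython")
    sys.exit(1)
sample_size = int(sys.argv[1])
func_call = sys.argv[2]
lr = random.sample(range(10000000), sample_size)
iarray = convert_list_to_array(lr)
if func_call.upper() == "C":
    sort_lib = cdll.LoadLibrary("./")
    sort_lib.sort(iarray, sample_size)
elif func_call.upper() == "CYTHON":
    cy_sort(iarray, sample_size)
else:
    py_sort(iarray, sample_size)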
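The list-to-array conversion is the one ctypes-specific step in the script above, so here is a minimal sketch of it on its own. It assumes nothing beyond the standard library; the helper names list_to_c_array and c_array_to_list are made up for this illustration and are not part of the original script.

```python
from ctypes import c_int

def list_to_c_array(values):
    """Build a ctypes int array from a Python list in one step."""
    # (c_int * n) creates an array type holding n C ints; calling it
    # with the unpacked values initialises each element in order.
    return (c_int * len(values))(*values)

def c_array_to_list(carray):
    """Copy a ctypes array back into a regular Python list."""
    return list(carray)

arr = list_to_c_array([5, 3, 9, 1])
print(len(arr))              # number of elements in the C array
print(c_array_to_list(arr))  # the same values, back as a Python list
```

An array built this way can be passed directly to a C function that expects an int pointer, which is why the script converts the random sample before sorting it.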

Now it's time to test the code and profile the same algorithm in Python, C and Cython.

python -m profile 1000 python
python -m profile 1000 c
python -m profile 1000 cython

In the chart, the Y axis is response time in seconds and the X axis is the sample input size.
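A measurement of response time against input size can be sketched roughly as follows. This is only an illustration, not the script used for the chart: the time_sort helper, the fixed seed and the three sample sizes are assumptions, and the swap inside py_sort is assumed to mirror the C version. It times only the pure Python sort so that it runs without any compiled code.

```python
import random
import time

def py_sort(iarray, length):
    """The trivial sort in pure Python (swap assumed to match the C code)."""
    for i in range(length):
        for j in range(i, length):
            if iarray[i] > iarray[j]:
                iarray[i], iarray[j] = iarray[j], iarray[i]

def time_sort(sort_func, sample_size, seed=0):
    """Return the seconds sort_func takes on sample_size random ints."""
    rng = random.Random(seed)  # fixed seed so every variant sees the same input
    data = rng.sample(range(10000000), sample_size)
    start = time.perf_counter()
    sort_func(data, sample_size)
    elapsed = time.perf_counter() - start
    assert data == sorted(data)  # make sure we timed a correct sort
    return elapsed

# Hypothetical sample sizes; the X axis of a chart would use values like these.
for size in (100, 200, 400):
    print(size, time_sort(py_sort, size))
```

Passing a different sort function with the same signature would give comparable timings for the other variants, since the seed keeps the input identical.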


As we can clearly see, C called through ctypes outperforms both pure Python and Cython. Running the code with Cython improves performance by 35–40%, but C via ctypes is 33 times faster than pure Python!

So the choice is simple: if performance is your only concern, write the code in C and hook it into your Python code using ctypes. This, however, comes with a couple of caveats:

  1. You should be proficient in C/C++.
  2. Being C, it is inherently not portable. You need to write/build as many versions of the C code as the number of platforms you wish to support.

We should probably reach for a ctypes extension only when a small piece of CPU-bound code takes more than half of the total processing time and we have run out of other optimizations. As a last resort, that small piece of code can be moved to C.

If you are not familiar with C or don't want a C dependency in your ecosystem, Cython is more suitable. Most of your existing Python code can be moved to Cython by adding a simple build step. Again, I would recommend profiling your code and moving only the small CPU-bound pieces to Cython.
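To find out whether one small function really dominates the run time, a profile sorted by cumulative time is usually enough. The sketch below uses cProfile, the standard library's lower-overhead profiler, rather than the profile module used in the commands earlier. The functions hot_loop, cheap_step and workload are hypothetical stand-ins for your own code.

```python
import cProfile
import io
import pstats

def hot_loop(n):
    """Hypothetical CPU-bound function standing in for the slow piece."""
    total = 0
    for i in range(n):
        for j in range(50):
            total += i * j
    return total

def cheap_step(n):
    """Hypothetical inexpensive function."""
    return sum(range(n))

def workload():
    cheap_step(1000)
    return hot_loop(20000)

# Collect profiling data for a single run of the workload.
profiler = cProfile.Profile()
profiler.enable()
workload()
profiler.disable()

# Write the five entries with the highest cumulative time into a string.
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats("cumulative").print_stats(5)
report = buffer.getvalue()
print(report)
```

If a single function accounts for most of the cumulative time in such a report, it is a candidate for moving to C or Cython; if the time is spread across many functions, rewriting one of them is unlikely to help much.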
