Distributing Python packages protected with Cython

Artem Vasilyev
Published in The Startup, Nov 25, 2019

One way to protect the source code of a Python application is to compile it with Cython. Cython translates Python source code into C/C++ code and compiles it into extension modules. The resulting extensions can still be reverse-engineered, but not back into equivalent source code. The problem with distributing compiled extensions is that they are platform-specific. We will use the wheel packaging format to address this.

The final solution is available on my GitHub page.

Step 1. Sample application

Create a virtual environment:

$ virtualenv .venv --python=python3.6
$ source .venv/bin/activate

Create a simple hello-world application with the following structure:

.
├── app
│ ├── __init__.py
│ ├── core.py
│ └── main.py
└── setup.py

Our application is in the app directory. main.py is the application entry point:

https://gist.github.com/art-vasilyev/672a2384dc5d7be3631ddaffa1315c2e

core.py contains the application logic that we want to protect:

https://gist.github.com/art-vasilyev/4495eea756370c9bd832e8bc0f0340b5
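The gists are not reproduced here. As a rough stand-in, assuming the protected logic is a single greeting function (the name get_greeting is hypothetical and not taken from the original gists), the two modules could look like this, shown in one block for brevity:

```python
# --- app/core.py (hypothetical contents): the logic we want to protect ---
def get_greeting():
    """Return the message printed by the application."""
    return "Hello world"


# --- app/main.py (hypothetical contents): the entry point ---
# In the real two-file layout this part would begin with:
#     from app.core import get_greeting
def main():
    print(get_greeting())


if __name__ == "__main__":
    main()
```

Keeping the entry point thin, as assumed here, means that nothing worth protecting lives in the one file that stays uncompiled.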

setup.py is required to pack our application into a package. Without compilation, it can look like this:

https://gist.github.com/art-vasilyev/b7172d39eb71a4ecf63b43b9465cf368

Let’s build the package and look at what goes inside it:

$ python setup.py sdist
$ tar -xzf dist/app-0.1.0.tar.gz
$ tree app-0.1.0
app-0.1.0
├── app
│ ├── core.py
│ ├── __init__.py
│ └── main.py
├── app.egg-info
│ ├── dependency_links.txt
│ ├── PKG-INFO
│ ├── SOURCES.txt
│ └── top_level.txt
├── PKG-INFO
├── setup.cfg
└── setup.py
2 directories, 10 files

As you can see, the archive contains our package with its .py files. In the next step, we will compile the Python files.

Step 2. Compilation

We need to install Cython to translate and compile python files:

$ pip install Cython

Let’s update setup.py to add compilation:

https://gist.github.com/art-vasilyev/d4ce5b17583e35792ec072cc81f0ba84

The get_ext_paths function returns a list of files that need compilation. EXCLUDE_FILES is a list of files that we will include as-is. We exclude main.py from compilation because it is the application entry point.
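Since the gist is not shown inline, here is a minimal sketch of what such a helper might look like. The signature and the glob-based matching are assumptions, not necessarily the author's exact code; only the names get_ext_paths and EXCLUDE_FILES come from the article.

```python
import fnmatch
import os

# Files shipped as plain .py (paths relative to the project root; glob
# patterns are allowed). The value is an assumption based on the text above.
EXCLUDE_FILES = ["app/main.py"]


def get_ext_paths(root_dir, exclude_files):
    """Return paths of .py files under root_dir that should be compiled."""
    excluded = [os.path.normpath(pattern) for pattern in exclude_files]
    paths = []
    for dirpath, _dirnames, filenames in os.walk(root_dir):
        for filename in filenames:
            if not filename.endswith(".py"):
                continue
            file_path = os.path.normpath(os.path.join(dirpath, filename))
            # Skip anything that matches one of the exclusion patterns.
            if any(fnmatch.fnmatch(file_path, pattern) for pattern in excluded):
                continue
            paths.append(file_path)
    return sorted(paths)


# In setup.py the result would typically be handed to Cython, for example:
#   from Cython.Build import cythonize
#   setup(..., ext_modules=cythonize(
#       get_ext_paths("app", EXCLUDE_FILES),
#       compiler_directives={"language_level": 3},
#   ))
```

Returning a sorted list keeps the build order stable between runs, which makes build logs easier to compare.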

Let’s compile the application:

$ python setup.py build_ext --inplace
$ tree app
app
├── core.c
├── core.cpython-36m-x86_64-linux-gnu.so
├── core.py
├── __init__.c
├── __init__.cpython-36m-x86_64-linux-gnu.so
├── __init__.py
└── main.py
0 directories, 7 files

As you can see, there are now files ending in cpython-36m-x86_64-linux-gnu.so. These extensions are specific to the platform and Python version they were built for and won’t load elsewhere, so we need to build a separate package for each target.
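To see which suffix a given interpreter expects, you can ask the import machinery directly. This is only an illustrative check; the helper name extension_filename is made up for this example.

```python
from importlib.machinery import EXTENSION_SUFFIXES

# EXTENSION_SUFFIXES lists every file suffix the running interpreter accepts
# for compiled extension modules. On CPython the first entry is normally the
# most specific one, e.g. ".cpython-36m-x86_64-linux-gnu.so" for the build
# used in this article.


def extension_filename(module_name):
    """Return the file name a compiled module would get on this interpreter."""
    return module_name + EXTENSION_SUFFIXES[0]


print("Accepted suffixes:", EXTENSION_SUFFIXES)
print("core would be built as:", extension_filename("core"))
```

Running this under different interpreters or operating systems prints different names, which is exactly why one build cannot serve every target.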

Step 3. Packaging

We will use the wheel packaging format because wheel file names encode the target Python version, ABI, and platform. Let’s build the application package and extract it:

$ python setup.py bdist_wheel
$ unzip dist/app-0.1.0-cp36-cp36m-linux_x86_64.whl -d dist/app
$ tree dist/app
dist/app
├── app
│ ├── core.cpython-36m-x86_64-linux-gnu.so
│ ├── core.py
│ ├── __init__.cpython-36m-x86_64-linux-gnu.so
│ ├── __init__.py
│ └── main.py
└── app-0.1.0.dist-info
  ├── METADATA
  ├── RECORD
  ├── top_level.txt
  └── WHEEL
2 directories, 9 files

The extracted package contains the compiled extensions, but it also contains their source files, which is not what we want. A workaround is to override the setuptools build_py command. build_py is called by the bdist_wheel command and is responsible for collecting package files. Our custom command will filter out .py files that have a compiled extension with the same name.

Final version of setup.py:

https://gist.github.com/art-vasilyev/39b29d2aee6de0dbfa7a99d61e040f27
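The filtering rule itself is small. Below is a sketch of it as a standalone function, assuming the list of compiled paths is the same one passed to cythonize; the function name drop_compiled_sources is hypothetical, and the wiring into setuptools is shown only as a comment, so the author's actual gist may differ.

```python
import os


def drop_compiled_sources(modules, compiled_paths):
    """Remove modules whose source file is also built as an extension.

    modules: (package, module, filepath) tuples, the shape returned by
        build_py.find_package_modules().
    compiled_paths: paths of the .py files handed to cythonize().
    """
    compiled = {os.path.normpath(path) for path in compiled_paths}
    return [
        (package, module, filepath)
        for package, module, filepath in modules
        if os.path.normpath(filepath) not in compiled
    ]


# Possible wiring in setup.py (an assumption, not executed here):
#
#   from setuptools.command.build_py import build_py as build_py_orig
#
#   class build_py(build_py_orig):
#       def find_package_modules(self, package, package_dir):
#           modules = super().find_package_modules(package, package_dir)
#           return drop_compiled_sources(modules, EXT_PATHS)
#
#   setup(..., cmdclass={"build_py": build_py})
#
# EXT_PATHS stands for the list of files that are compiled.
```

Keeping the rule in a plain function makes it easy to test without running a full build.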

Note: we used the cmdclass argument to override the default command. This trick won’t work if you use pbr to collect package information. In that case, place the custom build_py in a separate module and reference it from the setup.cfg configuration file.

Build the package:

$ rm -rf app.egg-info dist build
$ python setup.py bdist_wheel

Now we have a wheel package named app-0.1.0-cp36-cp36m-linux_x86_64.whl. Let’s inspect it and try to run the application:

$ unzip dist/app-0.1.0-cp36-cp36m-linux_x86_64.whl -d dist/app
$ tree dist/app
dist/app
├── app
│ ├── core.cpython-36m-x86_64-linux-gnu.so
│ ├── __init__.cpython-36m-x86_64-linux-gnu.so
│ └── main.py
└── app-0.1.0.dist-info
  ├── METADATA
  ├── RECORD
  ├── top_level.txt
  └── WHEEL
2 directories, 7 files
$ cd dist/app
$ python -m app.main
Hello world

Finally, the package contains no source code for the compiled modules, and it works. The package is platform-specific, so you need to build one for every target platform and Python version and upload them to a PyPI server. When you install the package, pip picks the wheel built for the matching platform, or fails if no compatible wheel is available.
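Instead of inspecting the tree output by eye, the same check can be scripted. The sketch below, with the hypothetical helper name leaked_sources, lists any .py file in a wheel that sits next to a compiled module of the same name; it assumes extensions end in .so or .pyd.

```python
import zipfile
from pathlib import PurePosixPath

# Suffixes of compiled extension modules (Linux/macOS and Windows).
COMPILED_SUFFIXES = (".so", ".pyd")


def leaked_sources(wheel_path):
    """Return .py files in the wheel that duplicate a compiled module."""
    with zipfile.ZipFile(wheel_path) as wheel:
        names = [PurePosixPath(name) for name in wheel.namelist()]
    # "core.cpython-36m-x86_64-linux-gnu.so" -> (directory, "core")
    compiled = {
        (path.parent, path.name.split(".", 1)[0])
        for path in names
        if path.name.endswith(COMPILED_SUFFIXES)
    }
    return sorted(
        str(path)
        for path in names
        if path.suffix == ".py" and (path.parent, path.stem) in compiled
    )
```

An empty result for dist/app-0.1.0-cp36-cp36m-linux_x86_64.whl would confirm that only the intentionally excluded files, such as main.py, ship as source.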

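Note: when building packages for older interpreters, take into account that Python before 3.3 can be built with either the UCS-2 or the UCS-4 option (the number of bytes used for Unicode characters). UCS-4 builds carry a u flag in the ABI tag, as in cp27mu. Since Python 3.3 this distinction no longer exists. In our example, cp36m means CPython 3.6, and the m flag indicates a pymalloc build, not a Unicode width.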
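The tags that pip compares are all visible in the wheel file name. As an illustration, here is a simplified parser for the name layout defined by the wheel specification (name, version, optional build tag, Python tag, ABI tag, platform tag). The function and tuple names are made up for this sketch, and pip's real compatibility logic is more involved.

```python
from collections import namedtuple

WheelTags = namedtuple("WheelTags", "name version build python abi platform")


def parse_wheel_filename(filename):
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: %r" % filename)
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        name, version, python, abi, platform = parts
        build = None  # the build tag is optional
    elif len(parts) == 6:
        name, version, build, python, abi, platform = parts
    else:
        raise ValueError("unexpected wheel file name: %r" % filename)
    return WheelTags(name, version, build, python, abi, platform)


tags = parse_wheel_filename("app-0.1.0-cp36-cp36m-linux_x86_64.whl")
print(tags.python, tags.abi, tags.platform)
```

For the wheel built in this article, the three tags are cp36, cp36m, and linux_x86_64, so it is only a candidate for CPython 3.6 on 64-bit Linux.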
