Distributing Python packages protected with Cython
One way to protect the source code of a Python application is to compile it with Cython. Cython translates Python source code into C/C++ code, which is then compiled into native extension modules. The resulting extensions can still be reverse-engineered, but not back to equivalent source code. The problem with distributing compiled extensions is that they are platform-specific. We will use the wheel packaging format to deal with this.
The final solution is available on my GitHub page.
Step 1. Sample application
Create a virtual environment:
$ virtualenv .venv --python=python3.6
$ source .venv/bin/activate
Create a simple hello-world application with the following structure:
.
├── app
│ ├── __init__.py
│ ├── core.py
│ └── main.py
└── setup.py
Our application is in the app directory. main.py is the application entry point:
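A minimal sketch of main.py, assuming core.py exposes a function called get_message (a name chosen here for illustration, not taken from the original project):

```python
# app/main.py
# Entry point; it stays a plain .py file so it can be run with `python -m app.main`.
from app.core import get_message  # get_message is a hypothetical helper name


def main():
    print(get_message())


if __name__ == "__main__":
    main()
```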
core.py contains the application logic that we want to protect:
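As a stand-in for real business logic, core.py could be as small as the sketch below. The function name get_message is an assumption made for illustration; the returned text matches the output the application prints at the end of this article.

```python
# app/core.py
# Placeholder for the logic we want to ship only in compiled form.
def get_message():
    """Return the text the application prints."""
    return "Hello world"
```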
setup.py is required to build a package from our application. Without compilation, it can look like this:
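A minimal setup.py of this kind might look as follows. The name and version are taken from the archive name used below (app-0.1.0); everything else is a bare-bones assumption.

```python
# setup.py (no compilation yet)
from setuptools import find_packages, setup

setup(
    name="app",
    version="0.1.0",
    # find_packages() picks up the app directory because it has an __init__.py
    packages=find_packages(),
)
```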
Let’s build the package and look at what goes inside it:
$ python setup.py sdist
$ tar -xzf dist/app-0.1.0.tar.gz
$ tree app-0.1.0
app-0.1.0
├── app
│ ├── core.py
│ ├── __init__.py
│ └── main.py
├── app.egg-info
│ ├── dependency_links.txt
│ ├── PKG-INFO
│ ├── SOURCES.txt
│ └── top_level.txt
├── PKG-INFO
├── setup.cfg
└── setup.py

2 directories, 10 files
As you can see, the archive contains our package with plain .py files. In the next step, we will compile the Python files.
Step 2. Compilation
We need to install Cython to translate and compile Python files:
$ pip install Cython
Let’s update setup.py to add compilation:
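One way to write such a setup.py is sketched below. It assumes the app layout from Step 1 and uses Cython's cythonize helper; the helper get_ext_paths and the constant EXCLUDE_FILES are defined in the listing itself.

```python
# setup.py (with Cython compilation)
import os

from Cython.Build import cythonize
from setuptools import find_packages, setup

# Files shipped as plain Python, given relative to the directory of setup.py.
EXCLUDE_FILES = ["app/main.py"]


def get_ext_paths(root_dir, exclude_files):
    """Return the .py files under root_dir that should be compiled."""
    excluded = {os.path.normpath(path) for path in exclude_files}
    paths = []
    for root, _dirs, files in os.walk(root_dir):
        for filename in files:
            if not filename.endswith(".py"):
                continue
            file_path = os.path.normpath(os.path.join(root, filename))
            if file_path not in excluded:
                paths.append(file_path)
    return paths


setup(
    name="app",
    version="0.1.0",
    packages=find_packages(),
    # cythonize() translates each file to C and returns Extension objects.
    ext_modules=cythonize(
        get_ext_paths("app", EXCLUDE_FILES),
        compiler_directives={"language_level": 3},
    ),
)
```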
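The get_ext_paths function returns a list of files that need compilation. EXCLUDE_FILES is a list of files that we will include as-is. We exclude main.py from compilation because it is the application entry point: python -m needs a module it can execute as a script, and a compiled extension module cannot be run that way.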
Let’s compile the application:
$ python setup.py build_ext --inplace
$ tree app
app
├── core.c
├── core.cpython-36m-x86_64-linux-gnu.so
├── core.py
├── __init__.c
├── __init__.cpython-36m-x86_64-linux-gnu.so
├── __init__.py
└── main.py

0 directories, 7 files
As you can see, there are files ending with cpython-36m-x86_64-linux-gnu.so. These extensions are platform-specific: they won’t load on another platform or under another Python version, so we need to build a separate package for each target.
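To see which suffix your own interpreter expects for extension modules, you can ask the standard sysconfig module. This is a small illustrative check, not part of the build itself:

```python
import sysconfig

# EXT_SUFFIX is the file name suffix the running interpreter uses for
# compiled extension modules; it encodes implementation, version and platform.
ext_suffix = sysconfig.get_config_var("EXT_SUFFIX")
print(ext_suffix)
```

On the Linux and CPython 3.6 setup used in this article, the value is expected to match the .cpython-36m-x86_64-linux-gnu.so suffix of the files listed above; on other platforms or Python versions it will differ.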
Step 3. Packaging
We will use the wheel packaging format because wheel file names carry platform information. Let’s build the application package and extract it:
$ python setup.py bdist_wheel
$ unzip dist/app-0.1.0-cp36-cp36m-linux_x86_64.whl -d dist/app
$ tree dist/app
dist/app
├── app
│ ├── core.cpython-36m-x86_64-linux-gnu.so
│ ├── core.py
│ ├── __init__.cpython-36m-x86_64-linux-gnu.so
│ ├── __init__.py
│ └── main.py
└── app-0.1.0.dist-info
  ├── METADATA
  ├── RECORD
  ├── top_level.txt
  └── WHEEL

2 directories, 9 files
The extracted package contains the compiled extensions, but it also contains the source files, which is not what we want. The workaround is to override the setuptools build_py command. build_py is called from the bdist_wheel command and is responsible for collecting package files. Our custom command will filter out .py files if there are compiled extensions with the same name.
The final version of setup.py:
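A possible final setup.py along these lines is sketched below. It assumes the same app layout and that Cython names each extension after its dotted module path (for example app.core); the build_py subclass then drops every .py module whose dotted name matches one of the extensions.

```python
# setup.py (compilation + source filtering)
import os

from Cython.Build import cythonize
from setuptools import find_packages, setup
from setuptools.command.build_py import build_py as build_py_orig

# Files shipped as plain Python, given relative to the directory of setup.py.
EXCLUDE_FILES = ["app/main.py"]


def get_ext_paths(root_dir, exclude_files):
    """Return the .py files under root_dir that should be compiled."""
    excluded = {os.path.normpath(path) for path in exclude_files}
    paths = []
    for root, _dirs, files in os.walk(root_dir):
        for filename in files:
            if not filename.endswith(".py"):
                continue
            file_path = os.path.normpath(os.path.join(root, filename))
            if file_path not in excluded:
                paths.append(file_path)
    return paths


class build_py(build_py_orig):
    """Skip .py modules that are also built as extension modules."""

    def find_package_modules(self, package, package_dir):
        # Assumption: extension names are dotted module paths such as "app.core".
        ext_names = {ext.name for ext in self.distribution.ext_modules or []}
        modules = super().find_package_modules(package, package_dir)
        return [
            (pkg, mod, path)
            for pkg, mod, path in modules
            if "{}.{}".format(pkg, mod) not in ext_names
        ]


setup(
    name="app",
    version="0.1.0",
    packages=find_packages(),
    ext_modules=cythonize(
        get_ext_paths("app", EXCLUDE_FILES),
        compiler_directives={"language_level": 3},
    ),
    # Replace the default build_py command with the filtering one.
    cmdclass={"build_py": build_py},
)
```

Because this filter compares module names with extension names instead of looking for .so files on disk, it does not depend on an earlier build_ext --inplace run.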
Note: we used the cmdclass argument to override the default command. This trick won’t work if you use pbr for collecting package information. In that case, place the custom build_py in a separate module and reference it in the setup.cfg configuration file.
Build the package:
$ rm -rf app.egg-info dist build
$ python setup.py bdist_wheel
Now we have a wheel package named app-0.1.0-cp36-cp36m-linux_x86_64.whl. Let’s inspect it and try to run the application:
$ unzip dist/app-0.1.0-cp36-cp36m-linux_x86_64.whl -d dist/app
$ tree dist/app
dist/app
├── app
│ ├── core.cpython-36m-x86_64-linux-gnu.so
│ ├── __init__.cpython-36m-x86_64-linux-gnu.so
│ └── main.py
└── app-0.1.0.dist-info
  ├── METADATA
  ├── RECORD
  ├── top_level.txt
  └── WHEEL

2 directories, 7 files
$ cd dist/app
$ python -m app.main
Hello world
Finally, the package contains no source code apart from the main.py entry point, and it works. This package is platform-specific, so you need to build a package for every target platform and Python version and upload them to a PyPI server. When you install the package from PyPI, pip will choose the wheel built for the matching platform, or fail if none of the available wheels is compatible.
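The information pip relies on is encoded in the wheel file name as dash-separated tags. The sketch below only splits a file name into those parts; it does not reproduce pip's compatibility logic, and the names WheelName and parse_wheel_filename are hypothetical helpers introduced for illustration.

```python
from typing import NamedTuple, Optional


class WheelName(NamedTuple):
    distribution: str
    version: str
    build: Optional[str]  # optional build tag, usually absent
    python_tag: str
    abi_tag: str
    platform_tag: str


def parse_wheel_filename(filename: str) -> WheelName:
    """Split a wheel file name into its dash-separated components."""
    if not filename.endswith(".whl"):
        raise ValueError("not a wheel file name: {!r}".format(filename))
    parts = filename[: -len(".whl")].split("-")
    if len(parts) == 5:
        distribution, version, python_tag, abi_tag, platform_tag = parts
        build = None
    elif len(parts) == 6:
        distribution, version, build, python_tag, abi_tag, platform_tag = parts
    else:
        raise ValueError("unexpected wheel file name: {!r}".format(filename))
    return WheelName(distribution, version, build, python_tag, abi_tag, platform_tag)


info = parse_wheel_filename("app-0.1.0-cp36-cp36m-linux_x86_64.whl")
print(info.python_tag, info.abi_tag, info.platform_tag)
# prints: cp36 cp36m linux_x86_64
```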
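Note: when building packages for Python versions older than 3.3, take into account that the interpreter can be built with either the UCS-2 or the UCS-4 option (the number of bytes used per Unicode character). Wheels for such builds are distinguished by a u in the ABI tag, for example cp27m and cp27mu. Since Python 3.3 this build option no longer exists. In our example, cp36m means that the package is built for CPython 3.6; the m flag marks an interpreter built with pymalloc and says nothing about the Unicode width.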