Python code obfuscation

Arye
Published in Analytics Vidhya
Dec 6, 2020

You are using Python and deploying your code directly in Docker images or through pip-installable packages. Without extra precautions, your source code is easily accessible to anyone. Even if you don’t mind that, you might still want to prevent anyone from tampering with the code. Because Python is interpreted, doing so involves extra steps. Thankfully, as usual in the rich Python ecosystem, solutions exist. Below are the categories of existing solutions to this problem, followed by details on the one selected.

Disclaimer: I am told that with enough effort anything can be reverse engineered. Our goal here is to make that more difficult. Also, an opinionated selection is made based on trust. For once, a closed source solution is preferred because malicious users don’t have the opportunity to look at its implementation details.

  • Python to exe family
    The tools that ship your Python code as an executable bundle the interpreter and the source into one binary file. Without extra precautions, the source is still easy to extract. These tools should be used together with the other techniques described below!
  • Obfuscation
    Simple Python obfuscation, for example base64 encoding, is easily cracked: the encoding is quite easy to guess and to reverse.
  • Obfuscation through Encryption
    These tools are more advanced and allow extra features, such as setting an expiry time (TTL), on top of encrypting the code. (Enforcing an expiry time: not tested.)
  • Ad hoc
    Some more involved techniques (for example Dropbox’s) also ended up being reverse engineered. For a very interesting read on that please follow these links.
  • Cython
    A special mention for Cython, which deserves its own category. With Cython you compile your Python code to C and then to native binaries, which also makes it difficult enough to reverse engineer. However, choosing that route can involve changing the code and extra work.
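
To make the point about simple obfuscation concrete, here is a minimal sketch of how little base64 protects. The sample source string is made up for illustration; the only assumption is that the attacker guesses the encoding, which takes one standard library call to undo.

```python
import base64

# Hypothetical piece of source code we would like to hide.
original_source = "def price(x):\n    return x * 1.2  # secret margin\n"

# "Obfuscate" it: the result looks opaque at first glance...
encoded = base64.b64encode(original_source.encode("utf-8"))

# ...but anyone who guesses the encoding recovers it in one call.
recovered = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(recovered)
```

No key is involved at any point, which is why this category is considered weak.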

Encrypt Python source with SourceDefender

The selected solution is SourceDefender, for two reasons:

  • simplicity: you only need to run one command to encrypt the pieces of source code you wish to protect, from .py to .pye.
  • technical support: they solved a bug quickly.

SOURCEdefender can protect your plaintext Python source code with AES 256-bit Encryption. There is no impact on the performance of your running application as the decryption process takes place during the import of your module or when loading your script on the command-line. Encrypted code won’t run any slower once loaded from a .pye file compared to loading from a .py or .pyc file.
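
To illustrate the general idea of decrypting at import time, here is a toy sketch built on Python’s standard import machinery. It is not SourceDefender’s implementation (which is closed source): the names ToyPyeFinder, ToyPyeLoader and toy_secret are hypothetical, and base64 stands in for real encryption purely to keep the example self-contained.

```python
import base64
import importlib.abc
import importlib.util
import os
import sys
import tempfile


class ToyPyeLoader(importlib.abc.Loader):
    """Loads a module from an encoded .pye file (toy stand-in for decryption)."""

    def __init__(self, path):
        self.path = path

    def create_module(self, spec):
        return None  # use the default module creation

    def exec_module(self, module):
        with open(self.path, "rb") as f:
            # Assumption: base64 replaces the real decryption step here.
            source = base64.b64decode(f.read())
        # The decoding cost is paid once, when the module is imported.
        exec(compile(source, self.path, "exec"), module.__dict__)


class ToyPyeFinder(importlib.abc.MetaPathFinder):
    """Finds top-level modules stored as <name>.pye in one directory."""

    def __init__(self, directory):
        self.directory = directory

    def find_spec(self, fullname, path=None, target=None):
        candidate = os.path.join(self.directory, fullname + ".pye")
        if not os.path.isfile(candidate):
            return None
        return importlib.util.spec_from_loader(fullname, ToyPyeLoader(candidate))


# Write an encoded module to a temporary directory.
workdir = tempfile.mkdtemp()
with open(os.path.join(workdir, "toy_secret.pye"), "wb") as f:
    f.write(base64.b64encode(b"def greet():\n    return 'hello'\n"))

# Register the finder, then import the module like any other.
sys.meta_path.append(ToyPyeFinder(workdir))
import toy_secret

result = toy_secret.greet()
print(result)
```

Once imported, the module’s functions are ordinary compiled Python objects, which is consistent with the claim that running code is not slowed down after loading.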

As can be seen from the PyPI page, basic usage is straightforward. Once you are running the protected code, basic snooping tools such as getsource from Python’s inspect module return gibberish. Error tracebacks include file names and line numbers, but no source either.

Traceback (most recent call last):
  File "src/config/helper.pye", line 85, in start
    IxlqeEBtjE9CDDaHwLr395P6GeE1hjM07faWWV+D6ytu7lgbRXBeHGlRt...
  File "src/config/settings.pye", line 49, in get_service
    kIAsrHxOSEXRBwKAPMe7Ys9GY85aT5J+d9muxXUoMFeajjZFCsvPwG121...
ImportError: cannot import name '+++++'
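
For comparison, this is what the same kind of snooping reveals on unprotected code. The sketch below calls inspect.getsource on a function from the standard library’s json module, assuming a regular CPython install that ships its .py sources; any plain .py module you deploy is just as readable.

```python
import inspect
import json

# json.dumps is implemented in plain Python, so its source is on disk.
source_text = inspect.getsource(json.dumps)
first_line = source_text.splitlines()[0]

# The full, readable definition is returned as a string.
print(first_line)
print(f"{len(source_text.splitlines())} lines of source recovered")
```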

When you deploy your code by copying it into a Docker image, that’s all you need. If you deploy as Python packages, keep reading.

Python packages encryption

In order to create pip-installable packages with encrypted Python code, you need to trick setuptools into packaging *.pye files for you. Historically, distutils and setuptools rely on the presence of __init__.py files. Python 3’s implicit namespace packages allow packages with no __init__.py file, and setuptools offers the corresponding ability: find_namespace_packages. With those, all you need to do in your setup.py file is:

  • use find_namespace_packages instead of find_packages
  • bundle all *.pye files as package_data

Full setup.py example

# Import sourcedefender when it is available; carry on without it otherwise.
try:
    import sourcedefender
except ModuleNotFoundError:
    pass

import os
from setuptools import setup, find_namespace_packages

PROJECT = "hello_world"


def package_pye_files(directory):
    """Collect every *.pye file under `directory` for use as package_data."""
    paths = []
    for (path, directories, filenames) in os.walk(directory):
        for filename in filenames:
            if filename.endswith('.pye'):
                # package_data paths are resolved relative to the package
                # directory, hence the '..' prefix.
                paths.append(os.path.join('..', path, filename))
    return paths


pye_files = package_pye_files(f"./{PROJECT}")

PACKAGE_DATA = {
    PROJECT: ["./resources/*"] + pye_files
}

with open("requirements.txt", "r") as f:
    REQUIREMENTS = f.read().splitlines()

setup(
    name=PROJECT,
    version="0.0.1",
    description="hello_world",
    author="None",
    python_requires='>=3',
    # Namespace discovery also picks up directories without __init__.py.
    packages=find_namespace_packages(include=[f"{PROJECT}*"]),
    package_data=PACKAGE_DATA,
    include_package_data=True,
    install_requires=REQUIREMENTS
)
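
The '..' prefix in package_pye_files can look odd at first. The sketch below runs the same helper against a throwaway directory tree (the file names are made up for illustration) and shows that, once resolved relative to the package directory as setuptools does for package_data entries, each path points back at the intended .pye file.

```python
import os
import tempfile


def package_pye_files(directory):
    """Same helper as in the setup.py example above."""
    paths = []
    for path, _directories, filenames in os.walk(directory):
        for filename in filenames:
            if filename.endswith(".pye"):
                paths.append(os.path.join("..", path, filename))
    return paths


# Build a hypothetical project layout in a temporary directory.
project_root = tempfile.mkdtemp()
package_dir = os.path.join(project_root, "hello_world")
os.makedirs(os.path.join(package_dir, "config"))
for rel in ("main.pye", os.path.join("config", "settings.pye"), "README.txt"):
    with open(os.path.join(package_dir, rel), "w") as f:
        f.write("")

previous_cwd = os.getcwd()
os.chdir(project_root)  # setup.py runs from the project root
try:
    found = sorted(package_pye_files("./hello_world"))
    # Resolve each entry relative to the package directory.
    resolved = sorted(
        os.path.normpath(os.path.join("hello_world", p)) for p in found
    )
finally:
    os.chdir(previous_cwd)

print(found)
print(resolved)
```

Only the two .pye files are collected; README.txt is ignored because it does not match the suffix.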
