A Python Package Developer’s Cheat Sheet
Notes and thoughts on how to design and set up a clean Python package structure
According to developers, Python is among the top five programming languages in 2019. Given the strength of its open-source community and its high adoption in emerging fields such as big data, analytics, and machine learning, no one should be surprised if its popularity keeps growing in the coming years. The number of packages available to Python developers will keep growing, too. And maybe you’ll be responsible for some of them.
When that happens, keep in mind that Python is very flexible in terms of package setup, and there are lots of docs and blog posts on the subject. But so many options can be confusing, especially when you’re getting started with package development and distribution.
The goal of this article is to describe a clean package structure, making it easier for developers to test, build, and publish it, writing as few configurations as possible, while taking advantage of conventions.
A Clean Package Structure
The proposed structure comprises a src folder for the sources, a tests folder for the tests, and a handful of descriptor and configuration files in the package’s root folder.
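The layout can be sketched as a small scaffolding script. This is only an illustration assembled from the file names mentioned throughout this article; the sample repo’s exact tree may contain more (or differently arranged) files, and the `scaffold` helper is a hypothetical name.

```python
from pathlib import Path
import tempfile

# Files named in this article; the sample repo's real tree may differ.
PACKAGE_LAYOUT = (
    ".coveragerc",
    "pytest.ini",
    "README.md",
    "setup.cfg",
    "setup.py",
    "src/package_cheat_sheet/__init__.py",
    "src/package_cheat_sheet/string_formatter.py",
    "tests/package_cheat_sheet/string_formatter_test.py",
)


def scaffold(root):
    """Create empty placeholder files for the layout under `root`."""
    root = Path(root)
    created = []
    for relative in PACKAGE_LAYOUT:
        path = root / relative
        # Make sure nested folders such as src/package_cheat_sheet exist.
        path.parent.mkdir(parents=True, exist_ok=True)
        path.touch(exist_ok=True)
        created.append(path)
    return created


if __name__ == "__main__":
    # Demo in a throwaway directory so the working tree is left untouched.
    with tempfile.TemporaryDirectory() as tmp:
        for path in scaffold(tmp):
            print(path.relative_to(tmp))
```

Keeping sources under src and tests under tests is what lets the later configuration stay short: each tool only needs to be pointed at one folder.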
I’ll explore all of them except README.md, since such files are widely known. Please search the web in case you have questions about their contents.
Let me start with setup.py, which is the package’s descriptor file. It consists of a Python script where multiple properties can be set declaratively. The properties declared in this file are recognized by package managers such as pip and IDEs such as PyCharm, which means this is a must-have for any package.
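A minimal sketch of such a descriptor follows. It is not the sample repo’s actual file: the version, author, description, and dependency name are hypothetical placeholders, `python_requires` is assumed from the Python 3.6+ requirement mentioned later, and the `build_setup_kwargs` and `main` helpers exist only to keep the example importable without side effects.

```python
def build_setup_kwargs():
    """Return the declarative properties handed to setuptools.setup()."""
    return dict(
        name="python-package-cheat-sheet",
        version="0.0.1",               # hypothetical version
        author="Jane Doe",             # hypothetical author
        description="Sample package",  # hypothetical description
        # The sources root is src; packages found there keep their own names.
        package_dir={"": "src"},
        # Operational dependencies only; the name below is a placeholder.
        install_requires=("example-runtime-dependency",),
        python_requires=">=3.6",       # assumption based on the article
    )


def main():
    # Imported here so the properties above can be inspected without setuptools.
    import setuptools

    setuptools.setup(
        packages=setuptools.find_packages(where="src"),
        **build_setup_kwargs(),
    )


# A real setup.py would end with a module-level call: main()
```

Splitting the properties from the `setup()` call is a choice made for this sketch; a typical setup.py simply calls `setuptools.setup(...)` with the keyword arguments inline.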
Some properties’ meanings are pretty straightforward: author … but others require a bit of explanation:
package_dir: used to set where your package source files are located and the namespaces they declare. In the sample code’s setup.py, src is configured as the sources root folder (in case it has subfolders, they’re included in the package by default), and such files declare namespaces starting with the package’s top-level name (package_cheat_sheet in the sample code).
install_requires: a tuple with all dependencies your package needs to work. Warning: add only operational dependencies; nothing related to testing or building should be put here (test dependencies are covered in the next section). You can also think of this as a partial replacement for requirements.txt, in case you’re familiar with it.
These properties allow us to bundle only the files users need when working with the packages we provide them, which results in smaller distribution files. Also, they will download only the dependencies each package needs to run, avoiding unnecessary network and storage usage.
There’s a practical exercise to see it in action. Sample code for this article is available on GitHub (https://github.com/ricardolsmendes/python-package-cheat-sheet), and pip allows us to install packages hosted there. Please install Python 3.6+ and activate a virtualenv. Then:
pip install git+https://github.com/ricardolsmendes/python-package-cheat-sheet
pip freeze
The pip freeze output will list two installed packages. The first one is declared in setup.py, available on the GitHub repo. The second is a required (operational) dependency for that package.
Now, let’s call the package_cheat_sheet.StringFormatter.format_to_snakecase method using the Python Interactive Shell:
>>> from package_cheat_sheet import StringFormatter
>>> StringFormatter.format_to_snakecase('FooBar')
The call StringFormatter.format_to_snakecase('FooBar') returns foo_bar, which means the package installation works as expected. This is a quick demonstration of how you can set up a Python package and make it available for users with a few lines of code.
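For readers who want to picture what sits behind that call, here is a hypothetical stand-in for the class. The sample package’s real implementation is not shown in this article and may well delegate to its dependency; the regex approach below is only an assumption that reproduces the FooBar to foo_bar behavior.

```python
import re


class StringFormatter:
    """Hypothetical stand-in for the sample package's formatter class."""

    # Zero-width match before every uppercase letter except the first character.
    _WORD_BOUNDARY = re.compile(r"(?<!^)(?=[A-Z])")

    @classmethod
    def format_to_snakecase(cls, text):
        """Convert a CamelCase string to snake_case."""
        # Insert an underscore at each boundary, then lowercase everything.
        return cls._WORD_BOUNDARY.sub("_", text).lower()


if __name__ == "__main__":
    print(StringFormatter.format_to_snakecase("FooBar"))  # prints foo_bar
```

This naive version splits before every capital letter, so acronyms would be broken into single letters; a production formatter would need extra rules for them.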
Packages Also Need Automated Tests
Modern software relies on automated tests, and we can’t even think about starting the development of a Python package without them. pytest is one of the most widely used libraries for this purpose, so let’s see how to integrate it into the package setup.
In the first exercise we wore a user’s hat; now it’s time to wear a developer’s.
First of all, please uninstall the package gathered from GitHub, clone the full sample code, and reinstall from the local sources:
pip uninstall python-package-cheat-sheet
git clone https://github.com/ricardolsmendes/python-package-cheat-sheet.git
cd python-package-cheat-sheet
pip install --editable .
The command to trigger a test suite based on the setup file is python setup.py test. It does not use pytest by default, but there’s a way to replace the default testing tool: create a setup.cfg file in the package’s root folder, setting an alias for the test command. pytest dependencies are required from now on; otherwise, the command will fail after the alias is created. The dependencies will be added to setup.py, using distinct properties:
pytest-runner is responsible for adding pytest support for setup tasks
pytest-cov will help us to generate coverage statistics for our code, as we’ll see next
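The two pieces described above can be sketched as follows. The `[aliases]` section follows the pattern documented by pytest-runner; assigning pytest-runner to `setup_requires` and pytest-cov to `tests_require` is an assumption about which “distinct properties” the sample code uses. Both are real setuptools keywords, although the `test` command itself has since been deprecated in setuptools, so treat this as a sketch of the approach described here rather than a recommendation for new projects.

```python
import configparser

# Content for setup.cfg: `python setup.py test` becomes an alias for the
# `pytest` command that pytest-runner registers.
SETUP_CFG = """\
[aliases]
test = pytest
"""

# Extra keyword arguments for setuptools.setup() in setup.py (assumed split).
TEST_SETUP_KWARGS = dict(
    setup_requires=("pytest-runner",),  # needed to run setup tasks
    tests_require=("pytest-cov",),      # needed only while testing
)

# setup.cfg uses INI syntax, so the standard library can validate the sketch.
parser = configparser.ConfigParser()
parser.read_string(SETUP_CFG)
print(parser["aliases"]["test"])  # prints pytest
```

Because these dependencies live outside install_requires, users who only install the package never download them.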
Two more config files must be included in the package’s root folder:
pytest.ini contains additional parameters for pytest execution. For example, presenting coverage results both in the console and HTML files:
[pytest]
addopts=--cov --cov-report html --cov-report term-missing
.coveragerc controls the coverage script scope. This is pretty useful when you have folders in your project that don’t need to be monitored by the tool. In the proposed clean structure, only the src folder must be covered.
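A minimal sketch of such a file, assuming coverage.py’s `source` option in the `[run]` section is what restricts measurement to src; the sample repo’s actual .coveragerc may use different or additional options.

```python
import configparser

# Content for .coveragerc: measure only code that lives under src.
COVERAGERC = """\
[run]
source = src
"""

# .coveragerc uses INI syntax, so the standard library can validate the sketch.
parser = configparser.ConfigParser()
parser.read_string(COVERAGERC)
print(parser["run"]["source"])  # prints src
```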
And we’re ready to run python setup.py test, now powered by pytest. By default, pytest recursively searches the current directory for files named test_*.py or *_test.py, which includes those inside the tests folder. For the GitHub repo you just cloned, the expected output is:
collected 10 items
tests\package_cheat_sheet\string_formatter_test.py .......... [100%]
----------- coverage: platform win32, python 3.7.2-final-0 ---------
Name                                                 Stmts   Miss  Cover   Missing
src\package_cheat_sheet\__init__.py                      1      0   100%
src\package_cheat_sheet\string_formatter.py             17      0   100%
tests\package_cheat_sheet\string_formatter_test.py      31      0   100%
TOTAL                                                   49      0   100%
Coverage HTML written to dir htmlcov
Please also check htmlcov/index.html after running pytest, since HTML output helps a lot when you need a deeper understanding of the coverage reports.
This article presents a clean Python package structure, covering both general setup and testing instrumentation. It proposes an explicit separation of source and test files, using Python standards, convention over configuration, and common tools to get the job done while writing as little configuration as possible.
And that’s it!