Building and managing Python libraries
MeilleursAgents guidelines to make maintainable Python libraries.
This article is part of a series about backend web development practices at MeilleursAgents. These are internal documents that we are making public to share how we do things and to get feedback from the community.
We’ll take the example of a library named MA Super Lib. MA is for MeilleursAgents, and our libraries should have this prefix.
Naming
Root folder: MA-Super-Lib
Source package: ma_super_lib
Library name: ma-super-lib
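The three naming conventions above can be derived mechanically from the human-readable name. The following is a minimal sketch; the helper `library_names` and its dictionary keys are hypothetical names chosen for illustration, not part of any MeilleursAgents tooling.

```python
import re


def library_names(display_name: str) -> dict:
    """Derive the three naming variants from a human-readable library name."""
    # Split on whitespace, hyphens and underscores; drop empty fragments.
    words = [w for w in re.split(r"[\s_-]+", display_name.strip()) if w]
    return {
        "root_folder": "-".join(words),                        # keeps original case
        "source_package": "_".join(w.lower() for w in words),  # importable name
        "library_name": "-".join(w.lower() for w in words),    # distribution name
    }


names = library_names("MA Super Lib")
```

Here `names` holds the root folder, source package and library name listed above for MA Super Lib.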
Folder structure
MA-Super-Lib/
    dist/
    ma_super_lib/
    tests/
    Jenkinsfile
    Makefile
    MANIFEST.in
    README.md
    setup.cfg
    setup.py
Configuration
setup.py
setup.cfg
MANIFEST.in
See https://docs.python.org/3/distutils/sourcedist.html#specifying-the-files-to-distribute
We use packages=find_packages() in setup.py, so all files and packages in the directory of setup.py will be included. But we do not want to add the tests directory to the distribution: prune tests in MANIFEST.in removes the tests directory from the distribution.
⚠️ Why not use find_packages to exclude tests in the first place? As far as we know, if you do this you will not be able to lint the tests package.
Makefile
Versioning
Use semantic versioning: https://semver.org/
For version identifiers, follow this PEP specification: https://www.python.org/dev/peps/pep-0440
Should I pin my dependencies’ versions?
install_requires and extras_require
Use version ranges: the current version as the minimum version and the next major version as the exclusive maximum (for example D>=3.4.5,<4).
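As an illustration of this rule, here is a minimal sketch of a helper that builds such a range from the version currently in use. The function `version_range` is a hypothetical name, and `D` is the placeholder dependency used throughout this article; the sketch assumes plain `major.minor.patch` version strings.

```python
def version_range(name: str, current_version: str) -> str:
    """Build a requirement: current version as minimum, next major as exclusive maximum."""
    # Assumes the version starts with an integer major component, e.g. "3.4.5".
    major = int(current_version.split(".")[0])
    return f"{name}>={current_version},<{major + 1}"


# The resulting strings are what would go in install_requires in setup.py.
install_requires = [
    version_range("D", "3.4.5"),
]
```

With dependency D currently at 3.4.5, the library accepts any 3.x release from 3.4.5 onwards and rejects 4.0.0.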
Why?
Even though every MeilleursAgents project using your library will pin a version of your library and its dependencies, we don’t want a library breaking when new versions of its dependencies are available.
Here is a timeline we don’t want:
- day 1: there is a need for a library on project P, we decide to implement MA-Super-Lib
- day 10: MA-Super-Lib with dependency D is published, D is not a MeilleursAgents library and does not have a pinned version, MA-Super-Lib version is 1.0.0
- day 11: project P uses MA-Super-Lib; it pins MA-Super-Lib to version 1.0.0 and dependency D to version 3.4.5
- day 150: a new version of dependency D is published: 4.0.0 ⇒ risk of breaking changes
- day 200: project Q starts and its developers decide to use MA-Super-Lib, but to their surprise it doesn’t work anymore: version 4.0.0 of dependency D introduces a breaking change that makes it incompatible with the implementation of MA-Super-Lib
Why not pin versions instead of using version ranges?
It is too restrictive and will end up being a problem if your project or your project dependencies share dependencies with your library.
Here is a timeline we don’t want:
- day 1: there is a need for a library on project P, we decide to implement MA-Super-Lib
- day 10: MA-Super-Lib with dependency D is published, D is not a MeilleursAgents library, MA-Super-Lib version is 1.0.0 and dependency D version is 3.4.5
- day 11: project P uses MA-Super-Lib and library L; MA-Super-Lib and L share dependency D, but MA-Super-Lib requires D==3.4.5 and L requires D>3.5,<4 ⇒ version incompatibility
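The conflict in the last step of this timeline can be checked by hand or with a few lines of code. The sketch below is a simplified illustration that only handles dotted numeric versions; the helpers `parse` and `in_range` are hypothetical names, and real tooling would rely on pip's resolver instead.

```python
def parse(version: str, width: int = 3) -> tuple:
    """Turn '3.5' into (3, 5, 0) so versions compare numerically."""
    parts = [int(part) for part in version.split(".")]
    return tuple(parts + [0] * (width - len(parts)))


def in_range(version: str, above: str, below: str) -> bool:
    """Return True if above < version < below (both bounds exclusive)."""
    return parse(above) < parse(version) < parse(below)


pinned = "3.4.5"            # exact pin on D declared by one library
other_range = ("3.5", "4")  # D>3.5,<4 declared by the other library

# False: no single version of D satisfies both requirements.
compatible = in_range(pinned, *other_range)
```

Because the pin allows exactly one version, any range that does not contain it makes the two libraries impossible to install together.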
Drawbacks of using version ranges
In theory, version ranges allow us to make sure our libraries will work while avoiding version conflicts in projects using our libraries. BUT this is only true if library developers stick to semver. This is not always the case, and it is very much out of our hands.
setup_requires and tests_require
Do not use version ranges; get the latest versions of your dependencies.
Why?
These dependencies are used for development and will not be part of your library distribution. Use the latest versions of your development tools to get the latest changes: new flake8 style guidelines, new pytest options, etc.
When a setup or tests dependency breaks
YOU are responsible for alerting the team if this happens.
YOU do NOT have to fix all the libraries using this dependency.
If you can fix it, great: fix it and add a ticket to the tech backlog listing libraries using it.
If you can’t fix it: find help and add a ticket to the tech backlog listing libraries using it.
Our libraries are not all in an active state of development, meaning we have time to analyze and fix the problem across all of our libraries.
Development lifecycle
Locally
In projects using your library, install your library locally using pip install -e /local/path/to/your/library so that you do not have to publish a release for every change you make to your library.
Continuous integration
In projects using your library, update the project’s requirements with development releases and push your code so it goes through continuous integration.
Publish intermediate development versions and follow PEP 440 for developmental releases:
0.0.1.dev0 # first development release
0.0.1.dev1 # second development release
1.0.0 # first stable release
1.0.1.dev0 # bugfix development
1.0.1 # bugfix release
Always publish development releases before publishing stable versions. It is very useful for continuous integration as we can make sure the library is working for projects using it.
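Under PEP 440, a developmental release sorts before the final release with the same number, which is why the sequence above reads in chronological order. The following sketch illustrates that ordering for the `N.N.N[.devN]` subset used in this article only; `sort_key` is a hypothetical helper and not a full PEP 440 implementation.

```python
import re

# Only handles "major.minor.patch" with an optional ".devN" suffix.
_VERSION = re.compile(r"^(\d+)\.(\d+)\.(\d+)(?:\.dev(\d+))?$")


def sort_key(version: str) -> tuple:
    """Return a tuple that orders dev releases before their final release."""
    match = _VERSION.match(version)
    if match is None:
        raise ValueError(f"unsupported version: {version!r}")
    major, minor, patch, dev = match.groups()
    # (0, n) for ".devN" sorts before (1, 0) for the final release.
    stage = (0, int(dev)) if dev is not None else (1, 0)
    return (int(major), int(minor), int(patch), stage)


published = ["1.0.1", "0.0.1.dev1", "1.0.0", "1.0.1.dev0", "0.0.1.dev0"]
ordered = sorted(published, key=sort_key)
```

Sorting the shuffled list recovers the release sequence shown above, from 0.0.1.dev0 to 1.0.1.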
Let’s say you have to implement a new feature that requires you to edit your library:
⚠️ Do not merge development versions. Right now, there is nothing to prevent you from doing so, so be careful.
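One possible safeguard, which is an assumption on our part and not something currently in place, would be a small check run before merging that flags requirements pinned to a developmental release. The sketch below assumes the warning refers to requirements such as `ma-super-lib==1.0.1.dev0` in a project; `dev_requirements` is a hypothetical helper and only understands simple `name==version` lines.

```python
import re


def dev_requirements(lines):
    """Return the requirements that are pinned to a '.devN' release."""
    flagged = []
    for line in lines:
        # Drop trailing comments and surrounding whitespace.
        requirement = line.split("#", 1)[0].strip()
        name, separator, version = requirement.partition("==")
        # Only exact pins are inspected; the version must end in ".devN".
        if separator and re.search(r"\.dev\d+$", version.strip()):
            flagged.append(requirement)
    return flagged


requirements = [
    "ma-super-lib==1.0.1.dev0",
    "D==3.4.5",
    "# a comment mentioning 1.0.0.dev3",
    "",
]
flagged = dev_requirements(requirements)
```

A continuous integration job could fail the build whenever `flagged` is not empty on the branch being merged.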
Publishing
Continuous integration takes care of publishing releases.
For development releases, use make build and make distribute. We use packagecloud to manage our packages; all you need is a .packagecloud file in your home directory.
Miscellaneous
Why not use pip to handle requirements?
Use as few tools as possible! Setuptools is enough (for libraries).