Python Best Practices
The guidelines we follow at Edge Analytics for dependency management, linting, auto-formatting, testing, and publishing in Python
At Edge Analytics, we specialize in data science, machine learning, and algorithm development both on the edge and in the cloud. We provide end-to-end support throughout a product’s lifecycle, from quick exploratory prototypes to production-level AI/ML algorithms. While we use a variety of tools to get the job done, we often find ourselves using Python in at least some stages of the process.
We have put together this handy guide to share our best practices for using Python. Here you will find guidelines on dependency management, linting, auto-formatting, testing, building, and publishing packages!
Dependency Management — Poetry
We recommend using Poetry for installing dependencies and creating isolated environments between projects. There are a few reasons why we prefer Poetry over conda, pip with virtualenv, or other alternatives:
- It removes many of the pitfalls and traps found in alternative tools, which is especially important for newcomers. For example, it automatically creates a distinct virtual environment for each project and automatically adds dependencies to pyproject.toml when they are installed (no need to remember to run pip freeze).
- It drastically simplifies the process of building and distributing packages (e.g. on PyPI or a private GitHub repo). Instead of managing several different config files with duplicated information, it uses a single config file, and building is as simple as running poetry build.
- It uses a lockfile (called poetry.lock), which is critically important for guaranteeing reproducible builds and making sure things don’t break unexpectedly in production. In contrast, pip install -r requirements.txt does not always result in the same version of each dependency being installed.
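To make the lockfile point concrete, here is a small, self-contained sketch of why a version range on its own is not reproducible. It is a toy model and does not use Poetry's resolver; the function names, the version range, and the index contents are all hypothetical and chosen only for illustration.

```python
"""Toy model of range-based installs versus locked installs (not Poetry's resolver)."""

Version = tuple[int, int, int]


def parse(version: str) -> Version:
    """Turn '1.21.0' into (1, 21, 0) so versions compare numerically."""
    major, minor, patch = (int(part) for part in version.split("."))
    return (major, minor, patch)


def resolve_latest(available: list[str], lower: str, upper: str) -> str:
    """Pick the newest available version with lower <= version < upper."""
    candidates = [v for v in available if parse(lower) <= parse(v) < parse(upper)]
    return max(candidates, key=parse)


# Hypothetical package index contents at two points in time.
index_in_january = ["1.20.0", "1.21.0"]
index_in_june = ["1.20.0", "1.21.0", "1.22.0"]

# A bare range resolves to whatever is newest on the day you install.
january_install = resolve_latest(index_in_january, "1.20.0", "2.0.0")
june_install = resolve_latest(index_in_june, "1.20.0", "2.0.0")

# A lockfile records the resolved version once, and later installs reuse it.
locked_version = january_install
```

In this sketch the same range yields different versions in January and June, while the locked version stays fixed until someone deliberately updates it.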
Poetry Basic Usage
Note that it is not necessary to activate the virtual environment before running poetry add or poetry install. Poetry commands automatically work within the corresponding virtual environment for each project. No more accidentally using the wrong environment or forgetting to activate it 🎉
Installing required dependencies (analogous to pip install -r requirements.txt):
poetry install
Adding new dependencies (analogous to pip install numpy && pip freeze > requirements.txt):
poetry add numpy
Running a command from inside the virtual environment (analogous to source /.envs/myproject/bin/activate followed by python -m flake8):
poetry run flake8
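If you want to confirm which interpreter a command is using, a short script like the following can help. This is a minimal sketch: the file name check_env.py is hypothetical, and the check relies on the standard behavior that sys.prefix differs from sys.base_prefix inside a virtual environment.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the interpreter is running inside a virtual environment."""
    # In a virtual environment, sys.prefix points at the environment while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print(f"interpreter: {sys.executable}")
    print(f"inside a virtual environment: {in_virtualenv()}")
```

Saved as check_env.py in the project root, it could be run with poetry run python check_env.py and compared against running it with the system interpreter.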
Typically with Poetry, you don’t need to work with virtual environments directly. poetry add, poetry install, and poetry run each automatically run inside the virtual environment corresponding to the project. However, you still have the option to open a new shell inside the virtual environment if you need or want to (analogous to source /.envs/myproject/bin/activate):
poetry shell
Exiting the virtual environment shell (analogous to deactivate):
exit
See the Poetry documentation for a full list of commands.
Linting — Flake8
We recommend using Flake8 to lint Python code. Flake8 helps improve code quality and catches certain kinds of bugs early, before you even run any code.
You can install Flake8 and some recommended plugins in a new project with:
poetry add --dev flake8 flake8-unused-arguments
You can run Flake8 directly from the command line:
poetry run flake8
However, we also strongly recommend running Flake8 automatically in your editor. There are plugins available for most major editors/IDEs.
Use the following configuration file, which should be called .flake8 and be located in the project root directory:
[flake8]
; Minimal configuration for Flake8 to work with Black.
max-line-length = 88
extend-ignore = E203, E501
This config file ensures compatibility with Black and includes some other sensible defaults.
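As an illustration of what this setup reports, here is a small hypothetical module (the function names are made up for this example). The comments note the checks we would expect to fire: F401 from Flake8 itself, U100 from the flake8-unused-arguments plugin, and E203, which the configuration ignores because Black's slice formatting would otherwise trigger it.

```python
import os  # Flake8 reports F401 here: 'os' imported but unused.


def scale(values, factor, verbose):
    # flake8-unused-arguments reports U100 for 'verbose', which is never used.
    return [value * factor for value in values]


def window(items, center, radius):
    # Black puts spaces around the colon in slices with complex bounds.
    # pycodestyle would flag that as E203, which is why the config ignores it.
    return items[center - radius : center + radius + 1]
```

The code runs fine as written, which is the point: these are quality problems that only a linter will surface before review.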
Auto-formatting — Black
We recommend using Black to automatically format Python code. Black takes care of formatting for you so that there is one less thing to worry about and argue over. If needed, you can install Black in a new project with:
poetry add --dev black
You can run Black on the whole project directly from the command line:
poetry run black .
However, we also strongly recommend running Black automatically in your editor. There are plugins available for most major editors/IDEs.
Testing — PyTest
Testing is a critically important part of writing maintainable production-level code. Any code which is going to stick around and be used by others (either internally or externally) should be tested.
Types of Tests
There are many types of tests in the software development world, but the two main categories are “unit tests” and “integration tests”. This blog post goes into depth about the difference between the two. To quickly summarize:
- Unit tests operate on the smallest individual units of code: functions. They are used to ensure that a function does what it is supposed to do when given different inputs. Typically, unit testing should focus on functions without any side-effects.
- Integration tests operate at a higher level and test integrations between different functions or between different APIs and services (e.g. a database, mailserver, or third-party API). They can sometimes have similar entrypoints (e.g. calling a function), but may also require a different, more complicated testing process (e.g. spinning up a full environment with a local server and database).
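The distinction between the two kinds of tests can be sketched as follows. The functions normalize and save_scores are hypothetical, and an in-memory SQLite database stands in for whatever external service a real integration test would exercise.

```python
import sqlite3


def normalize(values):
    """Scale numbers so they sum to 1 (pure function, no side effects)."""
    total = sum(values)
    if total == 0:
        raise ValueError("cannot normalize values that sum to zero")
    return [value / total for value in values]


def save_scores(connection, scores):
    """Store normalized scores in a database table and return the row count."""
    connection.execute("CREATE TABLE IF NOT EXISTS scores (value REAL)")
    connection.executemany(
        "INSERT INTO scores (value) VALUES (?)", [(s,) for s in normalize(scores)]
    )
    connection.commit()
    return connection.execute("SELECT COUNT(*) FROM scores").fetchone()[0]


def test_normalize_unit():
    # Unit test: one function, fixed input, no external services.
    assert normalize([1, 1, 2]) == [0.25, 0.25, 0.5]


def test_save_scores_integration():
    # Integration-style test: exercises the function together with a database.
    connection = sqlite3.connect(":memory:")
    try:
        assert save_scores(connection, [1, 3]) == 2
        rows = connection.execute("SELECT value FROM scores ORDER BY rowid")
        assert [row[0] for row in rows] == [0.25, 0.75]
    finally:
        connection.close()
```

The unit test needs nothing but the function itself, while the integration-style test has to set up and tear down a resource, which is typical of the extra process such tests require.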
Installing and Using Pytest
We recommend using pytest for testing in Python. If needed, you can install pytest in a new project with:
poetry add --dev pytest
You can run pytest directly from the command line:
poetry run pytest
See the official docs for more information about how to use pytest.
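For orientation, a minimal test file might look like the sketch below. The file name and the write_report function are hypothetical; in a real project the function under test would live in your package and be imported. The example relies on two pytest behaviors: it collects functions named test_* from files named test_*.py, and it supplies the built-in tmp_path fixture as a fresh temporary directory.

```python
# tests/test_report.py (hypothetical file name)
from pathlib import Path


def write_report(directory: Path, lines: list[str]) -> Path:
    """Write lines to report.txt inside directory and return the file path."""
    path = directory / "report.txt"
    path.write_text("\n".join(lines) + "\n")
    return path


def test_write_report_creates_file(tmp_path):
    # tmp_path is a built-in pytest fixture: a fresh temporary directory
    # (a pathlib.Path) that pytest passes in for each test.
    path = write_report(tmp_path, ["alpha", "beta"])
    assert path.exists()
    assert path.read_text() == "alpha\nbeta\n"


def test_write_report_empty(tmp_path):
    # With no lines, only the trailing newline is written.
    path = write_report(tmp_path, [])
    assert path.read_text() == "\n"
```

Plain assert statements are all that is needed; pytest rewrites them to show the compared values when a test fails.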
Building and Publishing
If you are using Poetry for dependency management, building and publishing a package is dead simple and does not require any additional configuration files.
Any code which is published as a package or deployed as a server should follow semantic versioning.
The poetry version command automatically bumps the version number for a patch, minor, or major release. For example, when building and publishing a patch release, you should run:
poetry version patch
This will automatically bump the version in pyproject.toml.
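The bump rules themselves follow semantic versioning and can be sketched in a few lines. This is an illustration of the rules, not Poetry's implementation; bump_version is a hypothetical helper that assumes a plain MAJOR.MINOR.PATCH version string with no pre-release suffix.

```python
def bump_version(version: str, rule: str) -> str:
    """Return version bumped according to rule: 'major', 'minor', or 'patch'."""
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "major":
        # Incompatible API change: reset minor and patch.
        return f"{major + 1}.0.0"
    if rule == "minor":
        # Backwards-compatible feature: reset patch.
        return f"{major}.{minor + 1}.0"
    if rule == "patch":
        # Backwards-compatible bug fix.
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump rule: {rule!r}")
```

For example, a patch bump takes 1.0.1 to 1.0.2, a minor bump takes it to 1.1.0, and a major bump takes it to 2.0.0.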
After bumping the version, simply run:
poetry build
This will create both a source distribution (sdist) and a binary distribution (wheel) in the dist directory.
While not all code needs to be published, publishing code as a Python package is the best way to share and re-use it across different projects or even different companies/organizations. There are two different ways of publishing packages: public and private.
- Public packages are for open source code that is shared with everyone.
- Private packages are shared only with a select group of people (i.e. people from your company or organization).
To publish a public package to PyPI, simply run:
poetry publish
You can then install it the same way as any other public package:
poetry add my-package-name
For packages that you don’t want to be public, you can also publish to a private GitHub repository. To do that, simply push the changes to GitHub after building. We also recommend assigning a tag in GitHub corresponding to the version number, for example:
git tag v1.0.2
git push --tags
After pushing to the GitHub repository, the package can be installed with:
poetry add git+ssh://git@github.com/your-username-or-organization/your-package.git#main
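After installing by either route, it can be useful to confirm which version actually ended up in the environment. The sketch below uses the standard library's importlib.metadata; installed_version is a hypothetical helper, and my-package-name is the same placeholder used in this guide.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(distribution_name: str):
    """Return the installed version string, or None if it is not installed."""
    try:
        return version(distribution_name)
    except PackageNotFoundError:
        return None


if __name__ == "__main__":
    # "my-package-name" is the placeholder name used in this guide.
    print(installed_version("my-package-name"))
```

Running this through poetry run python checks the project's virtual environment rather than the system interpreter, and the result can be compared against the Git tag you pushed.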