Python Best Practices

The guidelines we follow at Edge Analytics for dependency management, linting, auto-formatting, testing, and publishing in Python

Alex Browne
Jul 15 · 6 min read

At Edge Analytics, we specialize in data science, machine learning, and algorithm development both on the edge and in the cloud. We provide end-to-end support throughout a product’s lifecycle, from quick exploratory prototypes to production-level AI/ML algorithms. While we use a variety of tools to get the job done, we often find ourselves using Python in at least some stages of the process.

We have put together this handy guide to share our best practices for using Python. Here you will find guidelines on dependency management, linting, auto-formatting, testing, building, and publishing packages!

Dependency Management — Poetry

We recommend using Poetry for installing dependencies and creating isolated environments between projects. There are a few reasons why we prefer Poetry over conda, pip and virtualenv, or other alternatives:

  • It removes many of the pitfalls and traps found in alternative tools, which is especially important for newcomers. For example, it automatically creates a distinct virtual environment for each project and automatically adds dependencies to pyproject.toml when they are installed (no need to remember to run pip freeze).
  • It drastically simplifies the process for building and distributing packages (e.g. on PyPI or a private GitHub repo). Instead of managing several different config files with duplicated information, it uses a single config file and building is as simple as poetry build.
  • It uses a lockfile (called poetry.lock) which is critically important for guaranteeing reproducible builds and making sure things don’t break unexpectedly in production. In contrast, pip install -r requirements.txt does not always result in the same version of each dependency being installed.
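To make the lockfile point concrete, here is a toy sketch in plain Python. It is not how pip or Poetry actually resolve dependencies; the helper names (`parse`, `resolve`), the package name, and the version numbers are all made up for illustration. It only shows why a loose requirement can install different versions on different days, while a recorded (locked) version stays the same.

```python
# Toy illustration only: this is not how pip or Poetry resolve dependencies.
# The package name and version numbers below are made up for the example.


def parse(version):
    """Turn '1.21.0' into (1, 21, 0) so versions compare numerically."""
    return tuple(int(part) for part in version.split("."))


def resolve(minimum, available):
    """Pick the newest available version that satisfies '>= minimum'."""
    candidates = [v for v in available if parse(v) >= parse(minimum)]
    return max(candidates, key=parse)


# The same loose requirement (">= 1.20.0") installed on two different days.
available_last_month = ["1.19.5", "1.20.0", "1.21.0"]
available_today = available_last_month + ["1.22.0"]

loose_then = resolve("1.20.0", available_last_month)
loose_now = resolve("1.20.0", available_today)

# A lockfile records the exact version chosen the first time...
lock = {"example-package": loose_then}

# ...so a later install reuses it instead of resolving again.
locked_now = lock["example-package"]

print(loose_then, loose_now, locked_now)
```

In this sketch the loose requirement drifts to a newer release once one appears, while the locked entry does not.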

To install Poetry, see https://python-poetry.org/docs/#installation.

Note that it is not necessary to activate the virtual environment before running poetry add or poetry install. Poetry commands automatically work within the corresponding virtual environment for each project. No more accidentally using the wrong environment or forgetting to activate it 🎉

Installing required dependencies (analogous to pip install -r requirements.txt):

poetry install

Adding new dependencies (analogous to pip install numpy && pip freeze > requirements.txt):

poetry add numpy

Running a command from inside the virtual environment (analogous to source /.envs/myproject/bin/activate followed by python -m flake8):

poetry run flake8

Typically with Poetry, you don’t need to work with virtual environments directly. poetry add, poetry install, and poetry run each automatically run inside the virtual environment corresponding to the project. However, you still have the option to open a new shell inside the virtual environment if you need or want to (analogous to source /.envs/myproject/bin/activate):

poetry shell

Exiting the virtual environment shell (analogous to deactivate):

exit

See the Poetry documentation for a full list of commands.

Linting — Flake8

We recommend using Flake8 to lint Python code. Flake8 helps improve code quality and catches certain kinds of bugs early, before you even run any code.

You can install Flake8 and some recommended plugins in a new project with the command below (on Poetry 1.2 and later, use --group dev in place of the deprecated --dev flag):

poetry add --dev flake8 flake8-unused-arguments

You can run Flake8 directly from the command line:

poetry run flake8

However, we also strongly recommend running Flake8 automatically in your editor. There are plugins available for most major editors/IDEs (see below).
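As a rough illustration of what Flake8 catches before you run anything, here is a small hypothetical module (the function and its arguments are invented for this example). It runs without errors, yet Flake8 with the flake8-unused-arguments plugin should flag the three commented lines.

```python
# Hypothetical module, written to trigger a few common Flake8 reports.
# The code runs fine, which is exactly why a linter is useful here.
import os  # F401: 'os' imported but unused


# U100 (flake8-unused-arguments): 'currency' is never used below.
def total_price(prices, currency):
    subtotal = sum(prices)
    discount = 0.1  # F841: local variable assigned but never used
    return subtotal


print(total_price([1.5, 2.5], "USD"))
```

None of these problems raise an exception at runtime, but each one often points to a forgotten step, such as a discount that was meant to be applied.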

Flake8 Configuration

Use the following configuration file, which should be called .flake8 and be located in the project root directory:

[flake8]
; Minimal configuration for Flake8 to work with Black.
max-line-length = 88
extend-ignore = E203, E501

This config file ensures compatibility with Black and includes some other sensible defaults.

Flake8 Editor Integration

Auto-formatting — Black

We recommend using Black to automatically format Python code. Black takes care of formatting for you so that there is one less thing to worry about and argue over. If needed, you can install Black in a new project with:

poetry add --dev black

You can run Black directly from the command line:

poetry run black .

However, we also strongly recommend running Black automatically in your editor. There are plugins available for most major editors/IDEs.
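To give a feel for what Black changes, here is a small sketch. The `scale` function is a hypothetical example; the "before" version is kept in a string, and the "after" version shows roughly the style Black produces for it.

```python
# Hand-written style, kept as a string so the example itself stays tidy.
BEFORE = """
def scale(values,factor = 2):
    return [ v*factor for v in values ]
"""


# The same function in the style Black produces: no spaces around the
# default-argument "=", spaces around binary operators, and no padding
# inside brackets.
def scale(values, factor=2):
    return [v * factor for v in values]


print(scale([1, 2, 3]))
```

The behavior of the code is unchanged; only the layout differs, which is why formatting can safely be left to the tool.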

Black Editor Integration

Testing — PyTest

Testing is a critically important part of writing maintainable production-level code. Any code which is going to stick around and be used by others (either internally or externally) should be tested.

There are many types of tests in the software development world, but the two main categories are “unit tests” and “integration tests”. This blog post goes into depth about the difference between the two. To quickly summarize:

  • Unit tests operate on the smallest individual units of code: functions. They are used to ensure that a function does what it is supposed to do when given different inputs. Typically, unit testing should focus on functions without any side-effects.
  • Integration tests operate at a higher level and test integrations between different functions or between different APIs and services (e.g. a database, mailserver, or third-party API). They can sometimes have similar entrypoints (e.g. calling a function), but may also require a different, more complicated testing process (e.g. spinning up a full environment with a local server and database).

We recommend using pytest for testing in Python. If needed, you can install pytest in a new project with:

poetry add --dev pytest

You can run pytest directly from the command line:

poetry run pytest
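To tie this back to the two categories of tests described above, here is a minimal sketch of a test file that pytest could collect, for example if saved as test_users.py. The file name, `normalize_email`, and `save_user` are hypothetical, and an in-memory SQLite database stands in for a real database so the example stays self-contained.

```python
import sqlite3


# --- Code under test (hypothetical) ---------------------------------
def normalize_email(email):
    """Pure function: no side effects, so it suits a unit test."""
    return email.strip().lower()


def save_user(conn, email):
    """Talks to a database, so testing it is an integration concern."""
    conn.execute(
        "INSERT INTO users (email) VALUES (?)", (normalize_email(email),)
    )
    conn.commit()


# --- Tests: pytest collects functions whose names start with "test_" ---
def test_normalize_email_unit():
    assert normalize_email("  Alice@Example.COM ") == "alice@example.com"


def test_save_user_integration():
    # An in-memory SQLite database stands in for a real database server.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (email TEXT)")
    save_user(conn, "Bob@Example.com")
    rows = conn.execute("SELECT email FROM users").fetchall()
    conn.close()
    assert rows == [("bob@example.com",)]
```

The unit test needs no setup at all, while the integration-style test has to create and tear down its environment, which mirrors the difference in effort between the two kinds of tests.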

See the official docs for more information about how to use pytest.

Building and Publishing

If you are using Poetry for dependency management, building and publishing a package is dead simple and does not require any additional configuration files.

Any code which is published as a package or deployed as a server should follow semantic versioning.

The poetry version command automatically bumps the version number for a patch, minor, or major release. For example, when building and publishing a patch release, you should run:

poetry version patch

This will automatically bump the version in pyproject.toml.
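For readers new to semantic versioning, the effect of a patch, minor, or major bump can be sketched in a few lines of Python. This is only an illustration with a hypothetical `bump` helper; it is not Poetry's implementation, which also handles cases such as pre-releases.

```python
# Illustrative only: Poetry's own implementation handles more cases
# (pre-releases, for example). `bump` is a hypothetical helper.
def bump(version, rule):
    """Return the next version string for a 'MAJOR.MINOR.PATCH' version."""
    major, minor, patch = (int(part) for part in version.split("."))
    if rule == "major":
        return f"{major + 1}.0.0"
    if rule == "minor":
        return f"{major}.{minor + 1}.0"
    if rule == "patch":
        return f"{major}.{minor}.{patch + 1}"
    raise ValueError(f"unknown bump rule: {rule}")


print(bump("1.0.1", "patch"))  # 1.0.2
print(bump("1.0.1", "minor"))  # 1.1.0
print(bump("1.0.1", "major"))  # 2.0.0
```

Note that a minor or major bump resets the lower-order numbers to zero.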

After bumping the version, simply run:

poetry build

This will create both a source code distribution and a binary distribution (wheel) in the dist/ directory.

While not all code needs to be published, publishing code as a Python package is the best way to share and re-use it across different projects or even different companies/organizations. There are two different ways of publishing packages: public and private.

  • Public packages are for open source code that is shared with everyone.
  • Private packages are shared only with a select group of people (i.e. people from your company or organization).

To publish a package to the public PyPI index, you first need to register for a PyPI account, then set up API tokens.

To publish, simply run:

poetry publish

You can then install it the same way as any other public package:

poetry add my-package-name

For packages that you don’t want to be public, you can also publish to a private GitHub repository. To do that, simply push the changes to GitHub after building. We also recommend assigning a tag in GitHub corresponding to the version number, for example:

git tag v1.0.2
git push --tags

After pushing to the GitHub repository, the package can be installed with:

poetry add git+ssh://git@github.com/your-username-or-organization/your-package.git#main

Edge Analytics is a company that specializes in data science, machine learning, and algorithm development both on the edge and in the cloud. We provide end-to-end support throughout a product’s lifecycle, from quick exploratory prototypes to production-level AI/ML algorithms. We partner with our clients, who range from Fortune 500 companies to innovative startups, to turn their ideas into reality. Have a hard problem in mind? Get in touch at info@edgeanalytics.io.
