venv-cli — A lightweight tool for managing virtual environments and dependencies in Python
Have you ever tried to manage a Python project and ended up in dependency hell, only to realize that it was actually the newest version of pipenv that was broken, or a package that was missing from Anaconda, or poetry was using the wrong version of Python?
If you have ever had to distribute or deploy a Python project, or replicate its dependencies on a different machine, you have probably found yourself debugging your deployment in frustration, trying to figure out why it works on your machine but not on your colleague's, or spent hours figuring out how to activate a conda environment inside a Docker container.
If you have, then venv-cli might be for you.
The creation of venv-cli was heavily inspired by a series of articles from bitecode.dev on Python, pip and venv. If you are unfamiliar with virtual environments or why it is important to use them, I highly suggest reading these:
- Relieving your Python packaging pain
- Why not tell people to “simply” use pyenv, poetry or anaconda
- Explaining why the Python installation process is such a mess
- Back to basics with pip and venv
What is venv-cli?
venv-cli is a command-line interface (CLI) tool which acts as a thin layer on top of venv and pip. It is built on the principles laid out in the bitecode.dev articles listed in the introduction, with a bit of developer convenience and quality-of-life added on top. It can be used to create virtual environments and to manage dependencies within those environments.
It was built after several years of frustration stemming from managing many different Python projects, each with their own set of requirements and support for different Python versions.
venv-cli is available for download here:
github.com/SallingGroup-AI-and-ML/venv-cli
Why another tool? Why not just use …
There are already plenty of tools that help with creating virtual environments, managing dependencies, or both: Anaconda, pipenv, poetry, pipx, virtualenv, etc. Over the years I have tried most of these, and while each has its upsides, they all ended up causing more issues than they solved.
In the ML&AI team at Salling Group, we used conda to manage our Python versions and pipenv to create virtual environments and manage dependencies. This has caused us several headaches over the years, including (but not limited to):
- conda being very slow to roll out their own compiled versions of Python, and missing packages in the anaconda package repository,
- pipenv often introducing bugs in their latest versions,
- managing conda and pipenv when using Docker containers, since they both rely on “activating” the shell; something that doesn’t work in Docker,
- pipenv’s dependency resolution taking forever and being very opaque when encountering errors, especially errors from pip itself, and
- problems arising from having two or three package managers interact.
In an attempt to solve these headaches, spend less time debugging tools, and leave more time for actual development, we decided to opt for the simplest¹ workflow possible that still follows the guiding principle: always work within a virtual environment, and never install packages outside of one.
Trading a bit of overhead for a lot more transparency
The simplest workflow we could find came from following the guidelines laid out in Relieving your Python packaging pain and Back to basics with pip and venv. The main points are:
- install Python from the (most) official source,
- always use virtual environments, and
- always be specific about which version of Python you are using.
Following these guidelines (and the articles), we end up with a workflow that looks something like:
$ python3.10 -m venv .venv
$ source .venv/bin/activate
$ python3.10 -m pip install -r requirements.txt
We use Python 3.10 to create a virtual environment linked to that specific version of Python. We then activate the virtual environment and use pip to install dependencies from requirements.txt.

Note that we do not use python -m venv .venv, since we could have several versions of Python installed, and it is not always clear which one is used when just invoking python. We also don't use pip but python3.10 -m pip, since again, pip by itself might refer to several different versions depending on your PATH.
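If you are ever unsure which interpreter a given python (or pip) invocation actually resolves to, Python can tell you itself. A minimal check that works in or outside a virtual environment:

```python
import sys

# The absolute path of the interpreter actually running this code.
# Inside an activated virtual environment, this points into .venv/bin/.
print(sys.executable)

# The interpreter's major/minor version, e.g. (3, 10).
print(sys.version_info[:2])
```

Running this through .venv/bin/python (or inside an activated environment) prints a path inside .venv, confirming which interpreter the environment is linked to.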
Using these commands every time we create a new virtual environment or install/uninstall packages makes the process very transparent. However, it is quite cumbersome to type them out every time, and when managing a lot of different projects, each with its own environment, this ends up being a point of irritation. This is where venv-cli comes in.
Okay, so what is it really?
I stated in the beginning that “venv-cli is a tool that can be used to create virtual environments and to manage dependencies within those environments.” When installed, it runs the same commands as those we just looked at, but without the same need for specificity. The idea is to be specific once, and then have venv-cli handle the rest.
Installation
You can install venv-cli by cloning the GitHub repository at github.com/SallingGroup-AI-and-ML/venv-cli, then running the install.sh script:
$ git clone https://github.com/SallingGroup-AI-and-ML/venv-cli.git
$ cd venv-cli
$ ./install.sh
After restarting your shell, you will have the venv command available. You can test it by running venv --version:
$ venv --version
venv-cli v1.2.0
To create a virtual environment, recall the preferred way according to our guidelines:
$ cd my-project
$ python3.10 -m venv .venv
With venv-cli this can be achieved with the command:
$ cd my-project
$ venv create 3.10
Creating virtual environment 'my-project' using Python 3.10.12
This command looks for a python3.10 executable on your system and, if found, uses that to run the original command. This creates a virtual environment in the folder .venv, tied to your python3.10 executable. The virtual environment can then be activated by running
$ venv activate
(my-project) $
This command looks for the .venv folder in your current folder, then runs the activation command. To deactivate the environment again, run
(my-project) $ venv deactivate
Now that the environment is activated, you can use the python command as you normally would, and it will use the python that is linked to your environment:
$ cd my-project
$ venv activate
(my-project) $ which python
/home/stefan/my-project/.venv/bin/python
You can also use the pip command in the same way; however, in the next section I will give a few reasons why that might not be a good idea, and why you should instead use the alternative command provided by venv-cli: venv install.
Managing dependencies
To keep in line with the guiding principles, we set out to solve another very common source of confusion and errors in package management, especially where a reproducible environment is needed: pip install.

Don't get me wrong, pip install is our bread and butter when installing packages, but installing a single package at a time can lead to some unpredictable behavior when you later try to recreate the environment, e.g. for deployment or when others want to work on the same project.
To illustrate why this is the case, let’s look at a typical workflow when developing a project:
package-a==1.0
- package-a-dep
  - common-lib < 2.0
package-a==2.0
- package-a-newdep
package-b
- package-b-dep
package-c
package-d
- common-lib >= 2.0
Diagram: Fictional packages and their dependencies
We start off by installing package-a and package-b by running pip install package-a package-b. Later on, we might need package-c, so we install that as well. Then, while developing, a new version 2.0 of package-a is released with just the bugfix we need, so we run pip install --upgrade package-a to get the new version. When the project is done and ready to deploy, we go through the code and realize that we no longer use package-b, so we run pip uninstall package-b. Finally, we run pip freeze > requirements.txt as all good guides tell us to do, and now we have a reproducible environment, right?
Technically, yes. It's true that if someone else now runs pip install -r requirements.txt, they will get the same environment as the one we ended up with: package-a and package-c, and no package-b. However, if they had instead installed package-a and package-c directly, even if no new versions of these had been released in between, they could very likely end up with a different environment than ours.
This is because the packages we installed might install dependencies of their own, and these are not necessarily removed when upgrading or uninstalling the package again. To illustrate this, let’s go through the example above again, but with some added dependencies of our dependencies, as illustrated in the above Diagram.
First, we ran pip install package-a package-b. This not only installed package-a and package-b, but also their dependencies: package-a-dep and package-b-dep. We then installed package-c, which has no additional dependencies.
Then, in the new version of package-a that we upgraded to, the maintainers of package-a decided they no longer needed to depend on package-a-dep, and instead replaced it with package-a-newdep. So when we ran pip install --upgrade package-a, we got package-a-newdep as well, but package-a-dep was not removed.
The same goes for uninstalling package-b: package-b-dep is also not removed. In the end, running pip freeze gives us:
package-a==2.0
package-a-dep==...
package-a-newdep==...
package-b-dep==...
package-c==...
(package versions omitted where not important)
We can see that both package-a-dep and package-b-dep still hang around even though they are not needed: no code in the project uses these packages, and the other packages no longer rely on them.
If we had instead run pip install package-a package-c, we would have gotten:
package-a==...
package-a-newdep==...
package-c==...
The extra packages are not just dead weight that is downloaded and installed without being needed; they can also cause problems later on, as they add unnecessary constraints on our environment and on which versions of packages can be installed, since these dependencies might themselves have dependencies².
venv install
To alleviate this issue using venv-cli, the workflow looks a bit different. Instead of directly running pip install, packages and their constraints should be added to a requirements.txt file, then installed from there.
Let’s redo the same process from the previous section, but now using venv-cli:
First, package-a and package-b are added to requirements.txt:
# requirements.txt
package-a==1.0
package-b
Then we run venv install requirements.txt. This will install all packages from the file, including any of their dependencies, and then lock the installed versions into a corresponding requirements.lock file:
# requirements.lock
package-a==1.0
package-a-dep==...
package-b==...
package-b-dep==...
We can see that pip installed version 1.0 of package-a.
Now, when version 2.0 of package-a is released, instead of running pip install --upgrade package-a, we update the requirement in requirements.txt, then rerun venv install requirements.txt:
# requirements.txt
package-a>=2.0
package-b
This will completely clear the existing environment, then re-install all packages according to the updated requirements.txt. This process makes sure that there are no “orphaned” packages left in the updated requirements.lock:
# requirements.lock
package-a==2.0
package-a-newdep==...
package-b==...
package-b-dep==...
The same applies when adding package-c and later removing package-b: they should be added to/removed from requirements.txt, and the environment then updated using venv install requirements.txt.
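For example, after adding package-c and removing package-b, the requirements file from before would simply read:

```
# requirements.txt
package-a>=2.0
package-c
```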
In the upcoming v2.0 of venv-cli, it will be possible to install/uninstall packages one at a time using the syntax venv install <package>. This will, however, still add the package to requirements.txt and then run venv install requirements.txt behind the scenes to make sure the environment is always updated and reproducible.
requirements.txt and requirements.lock
The reason for keeping these files separate is to make it easier to differentiate between direct and indirect dependencies of your project.
If your project only requires pandas, you probably only care about the constraints and versions of the pandas package, not about the five other packages that pandas itself requires.
This way you can specify only the packages you directly require, with their constraints (e.g. pandas >= 2.0), in requirements.txt, and the full environment will still be recorded in requirements.lock for reproducibility.
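The relationship between the two files can be illustrated with a small sketch that extracts the package names from each and diffs them. The locked package list below is illustrative, not pandas' exact dependency set:

```python
# Sketch: given the direct requirements and a locked environment,
# list the indirect dependencies pip pulled in.
def names(lines: list[str]) -> set[str]:
    result = set()
    for line in lines:
        line = line.split("#")[0].strip()
        if not line or line.startswith("-r"):
            continue  # skip comments, blank lines and file references
        for sep in ("==", ">=", "<=", "~=", ">", "<"):
            line = line.split(sep)[0]  # strip version specifiers
        result.add(line.strip().lower())
    return result

direct = names(["pandas >= 2.0"])                       # requirements.txt
locked = names(["numpy==1.26.4", "pandas==2.2.0",       # requirements.lock
                "python-dateutil==2.9.0", "pytz==2024.1", "tzdata==2024.1"])

print(sorted(locked - direct))
# ['numpy', 'python-dateutil', 'pytz', 'tzdata']
```

The set difference is exactly the indirect dependencies you usually don't want cluttering your hand-maintained requirements.txt.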
Then, when you deploy to production, or need to test a colleague's project and want to install the reproducible version of the environment, you can install the locked requirements:
$ venv install requirements.lock
This will make sure that you (or your colleague, or the production server) install exactly the same packages every time.
Development dependencies
Some projects might have dependencies that are only used during development, like mypy or black. These are not required when deploying the project to production, or when building the package for redistribution.
To help differentiate between these, you can create a separate dev-requirements.txt file which includes the line -r requirements.txt, followed by all the development requirements:
# requirements.txt
matplotlib
numpy
pandas >= 2.0
# dev-requirements.txt
-r requirements.txt
black
mypy
pytest
When running venv install dev-requirements.txt, this will install all packages from both requirements.txt and dev-requirements.txt and lock the resulting environment in dev-requirements.lock.
If you want to install the development dependencies but don't want to lock the environment afterwards, you can pass the flag --skip-lock to venv install:
(my-project) $ venv install dev-requirements.txt --skip-lock
Collecting matplotlib
Using cached matplotlib-3.8.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (11.6 MB)
Collecting numpy
[...]
Installing collected packages: [...]
Similarly, you can have a build-requirements.txt, test-requirements.txt, etc., and each of them can include references to one or more other requirements files.
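For instance, a build-requirements.txt could chain the base requirements and add build tooling on top (the package names here are just examples):

```
# build-requirements.txt
-r requirements.txt
build
twine
```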
venv-cli in production
Some of the great benefits of the fact that venv-cli is a quality-of-life wrapper around venv and pip are:
- Both pip and venv come built-in when you install python³.
- All dependencies are specified in the standard requirements.txt format, which is easily transferable to any other package management system, or to a setup.py/pyproject.toml file for package distribution.
- Creating and tearing down virtual environments is very easy and has very little overhead. This is a must, as the whole idea of a virtual environment is that it should be easy to remove and recreate.
- In contrast to other popular virtual environment managers, activating a venv virtual environment just adds some paths to your PATH variable; no new sub-shell is spawned, and nothing needs to be sourced for the environment to work.
That last point is especially important when deploying your project somewhere, e.g. in a Docker container that needs to install the project requirements. Other tools like conda and pipenv typically require some fiddling to get their virtual environments to work, since you cannot just “activate” the environment in a Dockerfile⁴.
In contrast, when your project just uses pip and venv, you don't have to activate the environment; in fact, you don't need venv-cli at all. All you need to do is use the specially linked versions of python and pip that exist in the virtual environment folder .venv, which automatically reference the packages within that virtual environment. These special versions are located at .venv/bin/python and .venv/bin/pip, respectively⁵.
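A quick way to convince yourself of this: the snippet below creates a throwaway environment with the standard venv module and calls its interpreter directly, without any activation (the path /tmp/demo-venv is just an example):

```python
import subprocess
import venv

# Create a virtual environment programmatically (equivalent to
# `python3 -m venv /tmp/demo-venv`, minus pip for speed).
venv.create("/tmp/demo-venv", with_pip=False)

# Call the environment's interpreter directly; sys.prefix shows it is
# rooted in the virtual environment, no activation needed.
out = subprocess.run(
    ["/tmp/demo-venv/bin/python", "-c", "import sys; print(sys.prefix)"],
    capture_output=True, text=True, check=True,
).stdout.strip()
print(out)  # the venv path, e.g. /tmp/demo-venv
```

This is exactly what the Dockerfile below relies on: the interpreter in .venv/bin carries its environment with it.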
Here’s an example of running a project in a Dockerfile using a requirements.lock file and these special versions of python and pip.
The project is structured as:
my-project/
src/
main.py
Dockerfile
requirements.txt
requirements.lock
And the Dockerfile:
FROM python:3.10 AS base
# Create the virtual environment
RUN python3.10 -m venv .venv
# Pip install requirements inside the virtual environment
COPY requirements.lock .
RUN .venv/bin/pip install -r requirements.lock
# Copy project files into container
RUN mkdir ./src
COPY ./src ./src
# Run your project
CMD [".venv/bin/python", "src/main.py"]
Conclusion
In summary, venv-cli simplifies Python project management by streamlining working with virtual environments and dependency handling. It offers a more predictable and reproducible workflow, making it easier to recreate environments among colleagues and in deployments. With venv-cli, you can save time on troubleshooting and focus on productive development, making it a valuable tool for Python developers.
I’m Stefan Mejlgaard, developer in the ML&AI team at Salling Group. As a team focused on delivering solutions that provide value to the business, we often spend time re-evaluating the tools we use, and whether or not they actively assist us in development or instead act as roadblocks.
This tool is our way of both removing an internal roadblock while hopefully adding something of value to the Python community at large, since we rely so heavily on the community in our day-to-day work.
Oh, and one last thing…
Managing multiple versions of Python
So how do you manage multiple versions of Python? I would actually venture so far as to say: you don’t. If you install from the official sources and are specific about which version you are installing, you can easily have multiple versions installed on your system at the same time. Where people usually run into problems is when they are not specific enough about which of the installed versions they want to invoke. With venv-cli that shouldn’t be a problem, since we are required to be specific once, when creating the environment, and then the rest is handled for us.
As an example, on Ubuntu³ we can install both Python 3.9 and 3.10, and create virtual environments using both without issues:
$ sudo apt install python3.9-venv
$ cd project_39
$ venv create 3.9
Creating virtual environment 'project_39' using Python 3.9.18
$ venv activate
(project_39) $ which python
/home/stefan/project_39/.venv/bin/python
(project_39) $ python --version
Python 3.9.18
(project_39) $ venv deactivate
$ sudo apt install python3.10-venv
$ cd ../project_310
$ venv create 3.10
Creating virtual environment 'project_310' using Python 3.10.12
$ venv activate
(project_310) $ which python
/home/stefan/project_310/.venv/bin/python
(project_310) $ python --version
Python 3.10.12
¹”Simplest” here does not mean the one with the prettiest CLI, the fewest commands, or the fewest required files in the project. ”Simplest” to us means having the least potential for spawning bugs from simply using the tool itself.
²Imagine package-a-dep had its own dependency on common-lib < 2.0. If we later try to install package-d, which has the dependency common-lib >= 2.0, these constraints cannot be solved and pip will fail to install package-d. If package-a-dep and its dependencies had been properly removed, this would not have been an issue, and the more dependencies you have, the more likely it is that this will become a problem at some point.
³Note that on Ubuntu, the python<version> package does not include the venv library, so you need to install the python<version>-venv package to be able to create virtual environments.
⁴https://pythonspeed.com/articles/activate-conda-dockerfile/, https://pipenv.pypa.io/en/latest/docker/
⁵These specially linked versions of python and pip are the ones that are called when you use python or pip within an activated virtual environment. You can see this by activating the environment, then running which python.