python deploy path series: virtual environment venv

Teddy Li
6 min read · Apr 27, 2023


A virtual environment provided by the venv module is, at its core, a file path structure. From the official venv documentation:

“…supports creating lightweight “virtual environments”, each with their own independent set of Python packages installed in their site directories. A virtual environment is created on top of an existing Python installation, known as the virtual environment’s “base” Python”

So what does that mean?

When you create a virtual environment using the venv module in Python, it creates a directory with the following internal file structure:

myenv/
├── bin/
│   ├── activate
│   ├── activate.csh
│   ├── activate.fish
│   ├── easy_install
│   ├── easy_install-3.8
│   ├── pip
│   ├── pip3
│   ├── pip3.8
│   ├── python -> /usr/bin/python3.8
│   └── python3 -> python
├── include/
│   └── python3.8/…
└── lib/
    └── python3.8/
        ├── site-packages/…
        └── venv/
  1. bin/: This directory contains the executable scripts used to activate the virtual environment and to install packages with pip. Sourcing the activate script enters the virtual environment; doing so also defines a deactivate command that returns you to the global Python environment.
  2. include/: This directory contains header files used by the Python interpreter and compiled extension modules.
  3. lib/: This directory contains the Python standard library and any third-party packages that have been installed using pip. The site-packages subdirectory contains all packages installed within the virtual environment, while the venv subdirectory contains files and scripts used by the venv module itself.

The bin/activate script prepends the environment's bin/ directory to the PATH environment variable (and sets VIRTUAL_ENV), so that the python and pip executables installed within the virtual environment take precedence over those installed globally on the system. When the virtual environment is activated, the python executable in the bin/ directory is used to run Python scripts, and the pip executable is used to install and manage packages within the environment.

Overall, the internal file structure of a virtual environment created using the venv module is designed to be self-contained and isolated from the global Python environment, so that you can manage packages and dependencies separately for each project you work on.
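If you want to check whether a given interpreter is really running inside a virtual environment, the standard library already exposes enough information. A minimal sketch (run it with the environment's own interpreter, e.g. myenv/bin/python; the file name check_env.py is just an example):

# check_env.py -- a small sanity check, not part of venv itself
import os
import sys

# Inside a venv, sys.prefix points at the environment directory,
# while sys.base_prefix still points at the base Python installation.
print("prefix:     ", sys.prefix)
print("base prefix:", sys.base_prefix)
print("inside a virtual environment:", sys.prefix != sys.base_prefix)

# The activate script also exports VIRTUAL_ENV; it is unset otherwise.
print("VIRTUAL_ENV:", os.environ.get("VIRTUAL_ENV"))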

There are several major benefits to using a virtual environment, and the first is isolation.

Isolation

Virtual environments in Python are used to create an isolated environment for a specific Python project. This is particularly useful when you have multiple Python projects with different dependencies that might conflict with each other.

For example, an old Flask application of mine depended on older builds of the NVIDIA CUDA packages, whereas a newer project needed torchvision, which pulled in upgraded versions of those same packages plus new dependencies:

$ python3 -m pip list
...
numpy 1.24.2
nvidia-cublas-cu11 11.10.3.61
nvidia-nccl-cu11 2.14.3
...
$ python3 -m pip install torchvision
$ python3 -m pip list
...
numpy 1.24.2
nvidia-cublas-cu11 11.10.3.66
nvidia-cuda-runtime-cu11 11.7.99
nvidia-nccl-cu11 2.14.3
torchvision 0.15.1
...

Since a single Python environment can only hold one version of a given package, the newer dependency overwrites the older one, leaving you able to work on only one of the two projects at a time.

When you create a virtual environment, you are essentially creating a new Python environment with its own set of installed packages and dependencies. Any packages or modules that you install within the virtual environment are isolated from the global Python environment, which means that you can have different versions of the same package installed in different virtual environments without any conflicts.
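Here is a quick way to see which copy of a package the currently active interpreter resolves. Running the same lines from two different activated environments prints two independent answers (this sketch assumes numpy is installed in each environment):

import numpy

# Each environment resolves its own copy under its own site-packages,
# so the versions can differ without conflicting.
print(numpy.__version__)   # e.g. 1.24.2 in one environment
print(numpy.__file__)      # .../myenv/lib/python3.X/site-packages/numpy/__init__.py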

To create a virtual environment in Python, you can use the built-in venv module, which is included with Python 3.3 and later versions. Here's how you can create a virtual environment:

  1. Open your terminal or command prompt and navigate to the directory where you want to create the virtual environment.
  2. Run the following command: python3 -m venv myenv (replace myenv with the name you want to give to your virtual environment).
  3. This will create a new directory called myenv in your current directory, which contains the virtual environment.

To activate the virtual environment, run the following command:

  • On Windows: myenv\Scripts\activate.bat
  • On Unix or Linux: source myenv/bin/activate

Once the virtual environment is activated, you can install packages using pip just like you would in the global Python environment. Any packages you install will be installed within the virtual environment and will not affect the global Python environment.

To deactivate the virtual environment, simply type deactivate in your terminal.
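The same steps can also be driven from Python itself through the standard-library venv API, which is handy in setup scripts. A minimal sketch, equivalent to python3 -m venv myenv with a couple of optional knobs shown:

from venv import EnvBuilder

builder = EnvBuilder(
    with_pip=True,    # bootstrap pip into the new environment
    clear=True,       # wipe myenv/ first if it already exists
    prompt="myenv",   # name shown in the shell prompt after activation
)
builder.create("myenv")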

Reproducibility

Let’s say you’re working on a project that requires the NumPy package. You want to ensure that the project always uses the same version of NumPy, even if you upgrade or reinstall Python on your system. By creating a virtual environment and installing NumPy within that environment, you can ensure that the project always uses the exact same version of NumPy, regardless of changes to your global Python environment.

# create a new virtual environment
python3 -m venv myenv

# activate the virtual environment
source myenv/bin/activate

# install the required packages
pip install numpy

# save the list of installed packages to a requirements file
pip freeze > requirements.txt

# deactivate the virtual environment
deactivate

This creates a new virtual environment called myenv, installs NumPy within that environment, and saves the list of installed packages to a requirements.txt file. This requirements.txt file can then be used to recreate the exact same environment on another machine or at a later time.
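Recreating the environment from requirements.txt can itself be scripted. A minimal sketch, assuming requirements.txt sits in the current directory and a Unix-like layout (on Windows the pip executable lives under myenv\Scripts\ instead of myenv/bin/):

import subprocess
import venv
from pathlib import Path

env_dir = Path("myenv")
venv.create(env_dir, with_pip=True)      # same as: python3 -m venv myenv

# Call the environment's own pip so packages land inside myenv, not globally.
pip = env_dir / "bin" / "pip"
subprocess.run([str(pip), "install", "-r", "requirements.txt"], check=True)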

Dependency management

Similar to the first advantage, isolation, a virtual environment reduces the risk of running into conflicting dependencies. Let's say you're working on a project that requires several packages, including NumPy, Pandas, and Matplotlib. You want to ensure that all of these packages are installed and available to your project. By creating a virtual environment and using pip to install each of these packages within the environment, you can easily manage the dependencies for your project and ensure that all required packages are installed and available.
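To verify what actually ended up in the active environment, you can list every installed distribution with the standard library (roughly what pip list shows). A small sketch using importlib.metadata, available since Python 3.8:

from importlib.metadata import distributions

# Iterate over every distribution visible to the active interpreter --
# inside a venv this is the environment's own site-packages.
for dist in sorted(distributions(), key=lambda d: (d.metadata["Name"] or "").lower()):
    print(dist.metadata["Name"], dist.version)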

Testing

You want to ensure that your tests are not affected by other packages or dependencies installed on your system. By creating a separate virtual environment for testing and running your tests within that environment, you can ensure that your tests are isolated from other packages and dependencies on your system, which can help make them more reliable and accurate.

# create a new virtual environment for testing
python3 -m venv testenv

# activate the virtual environment
source testenv/bin/activate

# install required packages
pip install pytest

# run tests
pytest myproject/tests.py

# deactivate the virtual environment
deactivate

This creates a new virtual environment called testenv, installs pytest within that environment, and runs tests located in myproject/tests.py. By running the tests within the virtual environment, you can ensure that they are isolated from other packages and dependencies on your system.

A virtual environment also helps with integration tests. When integrating with an external API, you want to make sure that your code works correctly with the API, but you also want to avoid accidentally making real requests to the API during development and testing. This is where virtual environments can be useful.

Here’s an example workflow for using virtual environments to perform isolation testing and integration testing.

  1. Create a new virtual environment for your project using the venv module and activate it:
python -m venv myproject-env
source myproject-env/bin/activate

2. Install your project dependencies using pip, including any testing frameworks you want to use:

pip install requests pytest requests-mock

3. Write isolation tests for your code using pytest, mocking out the external API using the requests-mock library.

import requests
import requests_mock

def test_my_api_isolated():
    # Intercept HTTP calls made with requests and return a canned response.
    with requests_mock.mock() as m:
        m.get('http://example.com/api/data', json={'result': 'success'})
        response = requests.get('http://example.com/api/data')
        assert response.json() == {'result': 'success'}

This test simulates making a request to the external API, but uses a mocked response instead of actually sending a request over the network.

4. Run your isolation tests:

pytest

This will run your isolation tests in the virtual environment, verifying that your code works correctly in isolation from the external API.

5. Write integration tests for your code using pytest, making real requests to the external API:

import requests

def test_my_api_integration():
    response = requests.get('http://example.com/api/data')
    assert response.json() == {'result': 'success'}

By using virtual environments to perform isolation testing and integration testing, you can ensure that your code works correctly both in isolation and when integrated with other systems, without accidentally making real requests to external systems during development and testing.

This wraps up a quick walk through the venv module and how to use it for isolation, reproducibility, package management, and testing. Next we will compare it with conda, another favorite environment management tool in the machine learning community.
