Python Virtual Environments: The Why, What, and How.

Dolamu Oludare
Towards Data Engineering
10 min read · Feb 11, 2023

Get proper insight into best practices in Python programming workflows.

Photo by Mhin Pham from Unsplash

“The first principle is that you must not fool yourself and you are the easiest person to fool” — Richard Feynman.

The Python programming language has gained massive popularity over the years because of its gentle learning curve, large and active user community, open-source nature, and rich ecosystem of standard and third-party libraries.

Maybe it’s because Python has several libraries that prevent programmers from reinventing the wheel, thus increasing their productivity and efficiency.

But it occurred to me that knowing the nitty-gritty of the language and being ignorant about the basics of project workflow and environment setup would only limit you as a developer. Sometimes it could even bring chaos and disorderliness to your project workflow, especially when you have to use third-party libraries, no matter how elegant and logically sound your code looked.

I have always heard of the term “Virtual Environment” but had little or no idea what it is meant for, why it was supposed to be used, and when it was meant to be used. In this article, you will learn the essence of virtual environments in Python, what they are, and how to initiate them for your Python projects.

Why are Python Virtual Environments Important?

“Knowing “why” (the idea) is far more important than learning “what” (the fact)” - James D. Watson

Before we go into why Python virtual environments are essential, it’s important that we understand what Python libraries are and how they are used.

What are Python Libraries?

Python libraries are reusable sets of Python code that developers can import instead of reinventing the wheel. They come in two kinds. The standard library ships with Python itself, so modules such as math are available as soon as Python is installed. Third-party libraries are written and open-sourced by other developers, are usually published on the Python Package Index (PyPI), and must be installed separately. The third-party libraries your project relies on to carry out a particular task are also called “dependencies”. For instance, suppose a mathematics student is told to write a program that finds the factorial of a number. There are many ways to go about this if there are no restrictions on the task.

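def factorial(a):
    """
    Function that calculates the factorial of a non-negative integer
    """
    if a < 0:
        # A factorial is undefined for negative numbers, so fail loudly
        # instead of printing a message and returning a misleading 1.
        raise ValueError("Factorial does not exist for negative numbers")
    fact = 1
    # For a == 0 the loop body never runs, so the result is 1.
    for i in range(1, a + 1):
        fact *= i
    return fact


print(factorial(5))
# Output
# 120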

In the above code, the student writes the code from first principles to calculate the factorial of the number. An alternative and more efficient way of solving this problem, without reinventing the wheel, is to use Python’s math module, which is part of the standard library and has this functionality already written, as in the following code block.

import math
print(math.factorial(5))
# Output
# 120

As we can see, we obtained the same result with both approaches. The difference is that the second approach bypasses the nitty-gritty of calculating the factorial and simply calls the predefined factorial function from the math library. The library approach saves time and is usually more efficient. Because math belongs to the standard library, it needs no installation. Third-party libraries are used the same way in your code, but they have to be installed first, and they are almost unavoidable in any real Python project. Common third-party libraries include requests, used for making HTTP requests (for example in web scraping), pandas, used for data manipulation, and NumPy, used for numerical and matrix calculations.

So now that we have a glimpse of what Python libraries are, we can go on to explore why Python virtual environments are important to our project workflow and structure as developers. I will give the basic analogy of a chef who wants to cook two different dishes, say “Jollof rice” and “Fried rice”. Though the two dishes have something in common (rice), they are entirely different and require different ingredients and approaches to prepare them.

Photo by ABDALLA M from Unsplash

The rational thing for the chef to do is to prepare the dishes in two different pots, to get a proper outcome and avoid a hodgepodge of ingredients. The pot, which is the framework for cooking each dish, could also vary in size depending on the quantity of food the chef wants to cook.

But the setup for cooking would remain the same for the two dishes, as the chef would probably make use of a gas cooker, a pot, a serving spoon, and the like.

The only difference between them would be the recipes. Python virtual environments work the same way: they separate, or isolate, two different Python projects, which correspond to the two dishes in our example, and the recipe for each dish is analogous to the Python dependencies we explained earlier.

The conventional Python development environment has a global Python interpreter and a folder of third-party modules (site-packages) that are installed using pip, the Python package installer. As I explained earlier, third-party modules are sets of code that your project depends on to carry out a particular task.

Illustration by Author

I will give two crucial instances of why a Python virtual environment is needed in the setup of our Python projects. From the image above, we have an idea of how our conventional Python environment is set up. The first project in the global Python interpreter is a web scraping project that uses Python version 2.7.9 and version 1.1.0 of the requests library, while the second project, on the right, is another web scraping project that uses Python version 3.8.9 and requests version 2.0.0.

The third-party module, requests, doesn't come preinstalled with Python. It is installed with the pip package manager and is regularly updated with new releases.

In our example from the image, the developer starts a web scraping project with requests 1.1.0, with all his code written according to the documentation of requests 1.1.0. Two months later, he decides to carry out another web scraping project in the same global Python environment. This time he updates his requests library to version 2.0.0. That is not a bad move in itself, but a single environment can hold only one version of a package at a time, so the first project now runs against requests 2.0.0 as well.

If the differences between requests 1.1.0 and 2.0.0 are large, he has two unattractive options. He would either need to change the code base of the first project to make it compatible with requests 2.0.0, or he would have to uninstall requests 2.0.0 and reinstall version 1.1.0 every time he wants to run his initial code without errors.
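One way to see which version of a dependency the current environment actually holds is to ask Python directly. The sketch below uses importlib.metadata from the standard library (Python 3.8+). The helper names installed_version and matches_pin are hypothetical, and the version numbers are the ones from the example above.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str) -> Optional[str]:
    """Return the installed version of a package, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


def matches_pin(package: str, expected: str) -> bool:
    """Check whether the environment holds exactly the expected version."""
    return installed_version(package) == expected


if __name__ == "__main__":
    found = installed_version("requests")
    print(f"requests installed: {found}")
    # The first project in the example was written against requests 1.1.0.
    print(f"safe for project one: {matches_pin('requests', '1.1.0')}")
```

In a shared global environment, this check can only succeed for one of the two projects at a time, which is exactly the conflict described above.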

Image showing central code base and different setups of the collaborating developers
Illustration by Author

Another instance is where you have the same web scraping project, built on requests version 1.1.0, but in this case the project is open-source and requires collaboration between several Python developers. There is a strong possibility that the developers have different project environments and different versions of the third-party module (requests) installed, while the code base of the project runs on requests version 1.1.0. Code that works on one developer's machine may then fail on another's, which causes friction among the developers and inconsistencies in the code base.

These issues can all be resolved using Python virtual environments. Each project gets its own isolated set of dependencies, and the exact versions can be recorded in a file conventionally called “requirements.txt”, so a new developer looking to collaborate on the project can easily see the versions of the libraries used, install them, and be up to speed with the project requirements and developments. I will show how to go about this later in the article.

Now that we clearly understand why we need Python virtual environments in our Python project workflow, it's safe to properly talk about what Python virtual environments actually are.

What are Virtual Environments?

According to Python’s official documentation,

“A virtual environment is a Python environment such that the Python interpreter, libraries and scripts installed into it are isolated from those installed in other virtual environments, and (by default) any libraries installed in a “system” Python, i.e., one which is installed as part of your operating system”

Virtual environments are like containers that allow developers to embed and separate their Python projects from other projects, and in the process prevent clashes between different versions of dependencies and other packages.
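The isolation described above is visible from inside Python. As a small sketch, assuming an environment created with the venv module (very old virtualenv releases used a different mechanism), a script can tell whether it is running inside a virtual environment by comparing sys.prefix with sys.base_prefix. The function name in_virtual_environment is made up for this example.

```python
import sys


def in_virtual_environment() -> bool:
    """Return True when the running interpreter belongs to a venv.

    Inside a venv, sys.prefix points at the environment directory while
    sys.base_prefix still points at the Python installation it was created from.
    """
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print(f"sys.prefix      : {sys.prefix}")
    print(f"sys.base_prefix : {sys.base_prefix}")
    print(f"inside a virtual environment: {in_virtual_environment()}")
```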

The instance regarding the web scraping projects we talked about in the why section of the article can be resolved using the virtual environments technique, basically by initiating a virtual environment for the two different projects as shown in the images below.

Virtual environment for first Web scraping project, Illustration by Author.
Virtual environment for second Web scraping project, Illustration by Author

The second instance can also be resolved by putting the project in a virtual environment and writing the libraries used for the project into a file traditionally called “requirements.txt”, so the developers can easily access this information and install the same versions before collaborating on the project. After that has been done, the collaborative setup looks like the image below, and seamless, harmonious collaboration becomes possible.

Collaborative setup with virtual environment, Illustration by Author

How to Use Python Virtual Environments

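In this section of the article, I will walk through how to create and work with virtual environments. We will use the “venv” module, which has been part of the Python standard library since version 3.3, so it cannot and need not be installed with pip. Other tools, such as virtualenv and conda, can also create isolated environments, but they are outside the scope of this article. On some Linux distributions, such as Debian and Ubuntu, venv is packaged separately; if the module is missing there, you can install it using the following command:

sudo apt install python3-venv

With venv available, let’s check the location of our Python interpreter with the following command (on the Windows Command Prompt, use where python instead):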

which python
Image Source: Author

So we can see from the output that the Python Interpreter accessible to us is the one in the global environment. Now, let's check the available third-party libraries we have in our global environment with the following command:

pip list
Image source: Author

We can see from the output that we have a lot of third-party libraries installed in our global environment, including Flask, beautifulsoup, Babel, and the like. Now, let us create our virtual environment in our project directory with the following command:

python -m venv <Environment Name>
Image Source: Author.
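The same step can also be done from Python code, since venv is an ordinary standard-library module. The sketch below is only an illustration: the helper name create_environment is made up, the environment is created in a temporary directory so nothing is left behind, and with_pip=False is an assumption chosen to keep the example fast (the command-line form installs pip into the environment by default).

```python
import tempfile
import venv
from pathlib import Path


def create_environment(env_dir: Path) -> Path:
    """Create a virtual environment and return the path to its config file."""
    # with_pip=False skips bootstrapping pip, which keeps this sketch quick
    # and offline; `python -m venv` installs pip unless told otherwise.
    venv.create(env_dir, with_pip=False)
    return env_dir / "pyvenv.cfg"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        config_file = create_environment(Path(tmp) / "Virtualenv")
        # pyvenv.cfg records which base interpreter the environment uses.
        print(config_file.read_text())
```

The pyvenv.cfg file is what marks a directory as a virtual environment; it records the base Python installation the environment was created from.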

Our virtual environment is named “Virtualenv”, and as we can see in the image, it has been created in the project directory. The next thing to do is to activate it. The command depends on your operating system and shell.

For Linux and macOS:

source <Environment Name>/bin/activate

For Windows (Git Bash):

source <Environment Name>/Scripts/activate

For Windows (Command Prompt):

<Environment Name>\Scripts\activate.bat

For Windows (PowerShell):

<Environment Name>\Scripts\Activate.ps1
Image source: Author
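Activation mainly puts the environment's own bin (or Scripts) directory at the front of your PATH, so that python and pip resolve to the copies inside the environment. As a rough sketch of where those copies live, the hypothetical helper below builds the path to an environment's interpreter for the current operating system; running that interpreter directly uses the environment even without activating it.

```python
import os
from pathlib import Path


def environment_python(env_dir: str) -> Path:
    """Return the expected interpreter path inside a venv directory."""
    # Windows environments keep executables in Scripts\, others in bin/.
    scripts_dir = "Scripts" if os.name == "nt" else "bin"
    executable = "python.exe" if os.name == "nt" else "python"
    return Path(env_dir) / scripts_dir / executable


if __name__ == "__main__":
    # "Virtualenv" is the environment name used in this walkthrough.
    print(environment_python("Virtualenv"))
```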

We can notice from the terminal that “(Virtualenv)” now appears beside our directory text, which simply means that the virtual environment for the project has been activated. Now if we check the location of our Python interpreter using the which python command again, we will notice that it has changed and now points to the interpreter inside the virtual environment “Virtualenv”, as we can see below:

Image source: Author.

Let’s go ahead and run pip list again to check the third-party libraries present in our environment at the moment and observe the changes.

Image source: Author
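For a programmatic view of the same information, the standard library's importlib.metadata (Python 3.8+) can enumerate the packages visible to the running interpreter. This is only a sketch of roughly what pip list reports, not how pip itself is implemented, and the function name installed_packages is made up.

```python
from importlib.metadata import distributions


def installed_packages() -> dict:
    """Map each installed distribution name to its version string."""
    packages = {}
    for dist in distributions():
        metadata = dist.metadata
        if metadata is None:
            continue  # skip installs with missing metadata
        name = metadata.get("Name")
        if name:
            packages[name] = metadata.get("Version")
    return packages


if __name__ == "__main__":
    for name, pkg_version in sorted(installed_packages().items()):
        print(f"{name} {pkg_version}")
```

Run with the global interpreter and then with the environment's interpreter, the two listings should differ in the same way as the two pip list outputs.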

You can see from the output of the pip list command that we have just a few libraries compared to the global environment we checked initially. Now we can go ahead and install the requests library using the following command:

pip install requests
Image source: Author.

We have successfully installed the requests library into our virtual environment. We can verify this by checking the list of third-party libraries present in our virtual environment using the pip list command.

We can see that the requests library is now present on the list of installed libraries in the virtual environment. Next, let's write the names and versions of our third-party libraries into a file called “requirements.txt” using the following command:

pip freeze > requirements.txt
Image source: Author.

From the image, we can see that the packages present in our virtual environment have been written to the requirements.txt file, so in the case of a collaborative project, collaborators can easily access this file and install the packages used for the project using the following command:

pip install -r requirements.txt
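The file that pip freeze writes is plain text, with one pinned package per line in the form name==version. As a minimal sketch of how such a file can be read, the hypothetical helper below parses only those simple pins and ignores comments and anything more advanced. The sample content is illustrative and reuses the version number from the earlier example, not the output in the screenshot.

```python
def parse_pinned_requirements(text: str) -> dict:
    """Extract name -> version pairs from simple `name==version` lines."""
    pins = {}
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "==" not in line:
            continue  # skip blanks, comments, and unpinned requirements
        name, pinned_version = line.split("==", 1)
        pins[name.strip()] = pinned_version.strip()
    return pins


if __name__ == "__main__":
    # Illustrative content only.
    sample = "# project dependencies\nrequests==2.0.0\n"
    print(parse_pinned_requirements(sample))
```

Because every line pins an exact version, each collaborator who installs from the file ends up with the same versions in their own environment.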

Lastly, we can deactivate the virtual environment and return to our global environment using the following command:

deactivate
Image source: Author.

We are now back in our global environment and can go back into our virtual environment when we need to by activating the virtual environment again.

Conclusion

In this article, we covered virtual environments: they act like containers that isolate the dependencies of each Python project, which supports a proper and efficient development workflow. We now know why virtual environments are important and how to integrate them into our workflow. I hope you start spinning up your own virtual environments for your data analysis, data science, and software development projects, and especially for projects that require collaboration.

Thank you for reading!
