Use virtual environments in Jupyter Notebooks in three easy steps
Data scientists often need to work in Python notebooks for data exploration and experiments. In this post, I will briefly explain what Python virtual environments are and how to use them in Jupyter Notebook.
If you already know about virtual environments and Jupyter, skip ahead to the section "Working inside virtual environments from Jupyter Notebooks".
Multiple Python projects need multiple virtual environments
When working on several projects (which is almost always the case), you need to keep reproducibility, deployment and independence between these projects in mind. One important aspect is avoiding conflicts between the versions of the same libraries used in different projects.
One project might need version 1.0 of a certain library, while another needs version 2.0. Sharing a single Python environment means only one of those versions can be installed at a time, so one of the two projects will break. Likewise, one project might need a library that another doesn't: why run the second project in an environment containing libraries (and their dependencies) that it doesn't need? This can also lead to conflicts in certain cases.
Virtual environments are the solution to the problem above. A virtual environment is a self-contained directory tree that contains a Python installation for a particular version of Python, plus a number of additional packages.
I will stop explaining the virtual environment concept here, as it is not the subject of this post. The main point is that we need to create separate virtual environments when working on unrelated or loosely connected projects.
More information about how to create virtual environments can be found in the official Python documentation.
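For reference, here is a minimal sketch of creating an environment from Python itself, assuming Python 3 with the standard library venv module available. The helper name create_env is hypothetical and only used for illustration; the usual command-line equivalent is python -m venv path/to/myenv.

```python
import venv
from pathlib import Path


def create_env(env_dir, with_pip=True):
    """Create a virtual environment at env_dir and return its path.

    Hypothetical helper for illustration; it only wraps the standard
    library call venv.create().
    """
    env_path = Path(env_dir)
    # with_pip=True asks venv to bootstrap pip into the new environment,
    # which is needed later to install ipykernel.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling create_env("myenv") should produce a myenv directory containing a pyvenv.cfg file and the activation scripts used in the steps further down.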
Working inside virtual environments from Jupyter Notebooks
Jupyter Notebook is probably the most famous tool among data scientists. It is an open-source web application that allows you to create and share documents that contain live code, equations, visualizations and narrative text. If you haven't installed Jupyter Notebook yet, follow the Jupyter installation instructions first.
We assume that you have pip installed on your machine; if not, install it first by following the pip installation instructions.
Let's suppose that you have already created a virtual environment called myenv and you want to use it from inside a Jupyter Notebook.
To use myenv, you need to do the following three steps:
1. In your command line interface, activate the virtual environment:
- On Windows, run:
path\to\myenv\Scripts\activate.bat
- On Unix or macOS, run:
source path/to/myenv/bin/activate
2. After the activation, run the following two commands (you can replace projectname with any meaningful name you choose):
(myenv) $ pip install ipykernel
(myenv) $ ipython kernel install --user --name=projectname
3. Start Jupyter Notebook and click the New button at the top right of the web UI. You will find projectname available in the list.
Congratulations! Clicking on projectname now creates a new notebook that uses myenv as its virtual environment.
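If projectname does not show up in the list, it can help to look at what the install command registered. The sketch below assumes the kernel was installed with --user and that user-level kernels live under ~/.local/share/jupyter/kernels, which is the usual location on Linux (other systems use a different folder; running jupyter kernelspec list shows the actual paths). The helper describe_kernel is a hypothetical name; it only reads the kernel.json file that Jupyter stores for each kernel.

```python
import json
from pathlib import Path


def describe_kernel(kernel_dir):
    """Return the display name and interpreter recorded in a kernel directory."""
    spec_file = Path(kernel_dir) / "kernel.json"
    spec = json.loads(spec_file.read_text(encoding="utf-8"))
    return {
        "display_name": spec.get("display_name"),
        # The first entry of argv is the executable Jupyter launches for
        # this kernel; it should point inside myenv.
        "interpreter": spec["argv"][0],
    }


if __name__ == "__main__":
    # Assumed Linux location for kernels installed with --user; adjust for your OS.
    kernels_root = Path.home() / ".local" / "share" / "jupyter" / "kernels"
    if kernels_root.is_dir():
        for kernel_dir in sorted(kernels_root.iterdir()):
            if (kernel_dir / "kernel.json").is_file():
                print(kernel_dir.name, describe_kernel(kernel_dir))
```

If the interpreter listed for projectname is not the one inside myenv, the kernel was most likely installed while a different environment was active.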
If you wish to use this environment from a previously created notebook, you can open that notebook and then select the virtual environment from the Kernel menu (Kernel → change kernel → projectname).
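To confirm that a notebook really runs inside myenv, you can paste something like the following into a cell. This is a minimal sketch, assuming the environment was created with the standard venv module; interpreter_info is a hypothetical helper name.

```python
import sys


def interpreter_info():
    """Report which interpreter is running and whether it is a virtual environment."""
    return {
        # Path of the Python executable backing the current kernel
        "executable": sys.executable,
        # Root directory of the active environment
        "prefix": sys.prefix,
        # In a venv, sys.prefix points at the environment while
        # sys.base_prefix points at the Python it was created from.
        "in_venv": sys.prefix != sys.base_prefix,
    }


print(interpreter_info())
```

With the projectname kernel selected, the executable path should sit inside the myenv directory and in_venv should be True.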
I hope it was clear!