Table of Contents
- Background and Motivation
- Getting Started with Conda Environments
- Setting Up New Conda Environment
- Setting Up Conda Environment for Jupyter Notebook
- Managing Jupyter Kernels
- Installing Packages in Jupyter Notebook
- Conclusion
Background and Motivation
Jupyter notebooks and Conda environments are great tools for developing data science applications in Python. Used together, they let each project have its own Jupyter kernel backed by its own isolated set of packages, which is why most of us want to work out how to manage Jupyter notebook kernels using Conda environments.
As a self-taught student of data science, I sometimes find myself confused by the many ways in which people manage this intricate relationship between kernels and environments. In this article, I compile the workflow that has worked best for me, keeping things as simple as possible so that anyone who is just starting out can follow along easily.
Getting Started with Conda Environments
Open Anaconda Prompt
Open the Windows Start Menu, search for “Anaconda Prompt”, and launch it. Make sure it is the right one, as shown in the picture below.
Once opened, we should see the command prompt window with our base Conda environment activated by default, indicated by the “(base)” prefix.
Inspect List of Available Conda Environments
To see the list of existing Conda environments on our PC, run the command:
> conda env list
If you have not created any Conda environments before, you should see only the base Conda environment. I have several Conda environments on my PC, so you can see them below. The asterisk (*) indicates the Conda environment that is currently activated.
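If we ever need the same list from a script, conda can also print it as JSON with conda env list --json. The following is a minimal sketch of reading that output; the function names env_names_from_json and list_conda_envs are my own, made up for illustration, and the sketch assumes that conda is available on the PATH.

```python
import json
import subprocess
from pathlib import PureWindowsPath


def env_names_from_json(text):
    """Return the folder name of each environment in `conda env list --json` output."""
    data = json.loads(text)
    # The JSON output holds the environment directories under the "envs" key.
    # PureWindowsPath splits on both "\\" and "/", so it copes with paths
    # from Windows as well as from Linux or macOS.
    return [PureWindowsPath(path).name for path in data.get("envs", [])]


def list_conda_envs():
    """Ask conda for its environments (assumes conda is on the PATH)."""
    result = subprocess.run(
        ["conda", "env", "list", "--json"],
        capture_output=True, text=True, check=True,
    )
    return env_names_from_json(result.stdout)
```

Note that the base environment shows up under the name of its installation folder (for example anaconda3 or miniconda3) rather than as “base”, because only the folder name is read here.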
Inspect List of Packages Installed in Activated Conda Environment
To see the list of packages installed in the currently activated Conda environment, run the command:
> conda list
The pic below shows the list of installed packages I have in my base Conda environment.
Setting Up New Conda Environment
You can skip this step if you already have a Conda environment which you want to use for the Jupyter notebook kernel.
Create New Conda Environment
Run the command:
> conda create --name myenv0
Replace “myenv0” with the desired name for your new Conda environment. Then enter y when prompted to proceed with creating the environment.
Once this succeeds, a new Conda environment named “myenv0” is created. We can check this by running the conda env list command again; myenv0 should now be in the list of environments.
Activate Conda Environment
The new Conda environment is not automatically activated right after creation. Activate it in order to perform operations on it by running the command:
> conda activate myenv0
Once done, we will see that the “(base)” prefix has changed to “(myenv0)”, indicating that the Conda environment named “myenv0” is now activated.
Install Packages in Activated Conda Environment
Making sure that the target Conda environment is activated, we can now install the packages we want in the Conda environment. Run the install command recommended by the package provider. For example, to install the NumPy package in the Conda environment, we run the command:
> conda install numpy
Then enter y when prompted to proceed with the installation.
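To confirm that the installation worked, we can start Python inside the activated environment and check whether the package is visible. Below is a small sketch of such a check; package_status is a hypothetical helper name, and it relies on importlib.metadata, which needs Python 3.8 or newer.

```python
from importlib import metadata, util


def package_status(import_name, dist_name=None):
    """Return the installed version, "unknown", or None if the package is absent."""
    # find_spec returns None when a top-level module cannot be found.
    if util.find_spec(import_name) is None:
        return None
    try:
        # The distribution name can differ from the import name,
        # so it may be passed separately.
        return metadata.version(dist_name or import_name)
    except metadata.PackageNotFoundError:
        # Importable, but without distribution metadata (e.g. the standard library).
        return "unknown"
```

For example, package_status("numpy") should return a version string once NumPy has been installed in the environment, and None otherwise.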
Setting Up Conda Environment for Jupyter Notebook
Once our Conda environment is ready, we have to set it up for Jupyter notebook.
Installing jupyter
Run the command:
> conda install -c anaconda jupyter
Installing ipykernel
ipykernel is the reference Jupyter kernel built on top of IPython, providing a powerful environment for interactive computing in Python. — Jupyter Docs
Run the command:
> conda install -c anaconda ipykernel
This will install ipykernel in the Conda environment along with a bunch of dependencies.
Add Conda Environment as Jupyter Kernel
After the installation has completed, run the following command:
> python -m ipykernel install --user --name=myJupyterKernel0
Replace “myJupyterKernel0” with the desired name for the Jupyter kernel; this is the name shown in the list of kernels in Jupyter notebook. The command installs the kernel spec for the Conda environment as a Jupyter kernel.
Once this is completed, a new Jupyter kernel is created and ready to be used when we restart our Jupyter notebook.
Managing Jupyter Kernels
Listing Jupyter Kernels
In Anaconda Prompt, to see the list of available Jupyter kernels, run the following command:
> jupyter kernelspec list
In the screenshot above, we can see that the newly created Jupyter kernel is included in the list of available kernels.
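The same command accepts a --json flag, which also prints the contents of each kernel spec. As a rough sketch, the function below (kernel_interpreters, a name I made up) takes that JSON text and maps every kernel name to the first entry of its argv, which for Python kernels is normally the interpreter that the kernel launches.

```python
import json


def kernel_interpreters(kernelspec_json):
    """Map each kernel name to the first argv entry of its kernel spec."""
    specs = json.loads(kernelspec_json).get("kernelspecs", {})
    # Each entry carries a "spec" dictionary that mirrors the kernel.json file.
    return {name: info["spec"]["argv"][0] for name, info in specs.items()}
```

We would feed it the text printed by jupyter kernelspec list --json, which makes it easy to see which Conda environment each kernel belongs to.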
Uninstalling Jupyter Kernel
In case we made a mistake and would like to uninstall our Jupyter kernel, run the command:
> jupyter kernelspec uninstall myjupyterkernel0
Replace “myjupyterkernel0” with the name of the kernel we wish to uninstall, exactly as it appears in the output of jupyter kernelspec list, and enter y when prompted to proceed with the uninstallation.
Note that this operation does not remove the Conda environment from which the Jupyter kernel was created.
Installing Packages in Jupyter Notebook
Often, we only realize that we need a package when we are midway through writing code in our Jupyter notebook, using the Jupyter kernel we created from our Conda environment.
We can install packages from inside Jupyter notebook by running the following code in a code cell:
import sys
!conda install --yes --prefix {sys.prefix} numpy
(Thanks to Min Ragan-Kelley for suggesting this approach.)
Replace “numpy” with the name of the package that we wish to install.
The --yes argument automatically answers y for us when the conda command pauses to ask for user confirmation, whereas the --prefix argument specifies the full path of the environment in which to install the package. {sys.prefix} gives the prefix of the Python environment behind the current Jupyter kernel, which ensures that the package is installed in the environment of the currently running kernel. Alternatively, we can find the path of the Python executable used by the Jupyter kernel in the kernel.json file located in the kernel directory (which is shown when running jupyter kernelspec list).
Note: this code execution will install the package in both the current Jupyter kernel and the corresponding Conda environment.
The above boilerplate code can also be adapted to other variants of Conda installation commands. For instance, the originally recommended command to install TensorFlow in a Conda environment is:
> conda install -c conda-forge tensorflow
We can do this in a Jupyter notebook by adapting the boilerplate code above:
import sys
!conda install -c conda-forge --yes --prefix {sys.prefix} tensorflow
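If we find ourselves adapting this pattern often, the argument list can be assembled in plain Python instead. This is only a sketch under my own naming (conda_install_args is not part of conda or Jupyter); it uses --channel, the long form of -c, and defaults the prefix to sys.prefix just like the cells above.

```python
import sys


def conda_install_args(package, channel=None, prefix=sys.prefix):
    """Build the argument list for installing a package into a given prefix."""
    args = ["conda", "install", "--yes", "--prefix", prefix]
    if channel:
        # --channel is the long form of -c.
        args += ["--channel", channel]
    args.append(package)
    return args
```

In a notebook cell, the result could then be passed to subprocess.run(conda_install_args("tensorflow", channel="conda-forge"), check=True), assuming that conda is on the PATH of the Jupyter process.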
On the other hand, running conda install directly as a shell command in Jupyter notebook will generally NOT install packages in the current Jupyter kernel, i.e. running the code:
!conda install --yes numpy
Jake VanderPlas gave a detailed (and wonderful!) explanation in his post of why the above code will generally NOT work, which comes down to the intricacies of Jupyter and Python package installation. It is a good read for those who want to understand the possible mismatch between the Jupyter shell environment and the Python executable, which causes the above code to fail.
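To see whether this mismatch affects our own setup, we can compare the Python that the notebook's shell would find with the Python that the kernel is actually running. The following is a rough diagnostic sketch rather than a definitive test; interpreter_report is a hypothetical helper name, and the comparison simply resolves both paths and checks that they are equal.

```python
import os
import shutil
import sys


def interpreter_report():
    """Compare the kernel's Python with the first `python` found on the PATH."""
    # shutil.which mimics what a shell command such as `!python` would resolve to.
    shell_python = shutil.which("python")

    def normalize(path):
        return os.path.normcase(os.path.realpath(path))

    return {
        "kernel_executable": sys.executable,
        "kernel_prefix": sys.prefix,
        "shell_python": shell_python,
        "same_interpreter": (
            shell_python is not None
            and normalize(shell_python) == normalize(sys.executable)
        ),
    }
```

If same_interpreter comes back False when we run interpreter_report() in a notebook cell, shell commands are probably talking to a different environment than the kernel, which is exactly the situation where passing --prefix {sys.prefix} helps.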
Conclusion
I hope this article helps to smooth out the learning curve a little for those of you who are exploring how to integrate Conda environments with Jupyter notebook. Happy coding, everyone!