AI Platform Notebooks with multiple “Docker” kernels
AI Platform Notebooks lets you use various machine learning frameworks via Deep Learning VM images and Deep Learning Containers. During creation, you can pick from frameworks such as R, RAPIDS, TensorFlow, PyTorch, scikit-learn, and XGBoost. Sometimes, though, you want more than one framework in the same AI Platform Notebook: if you are an ML engineer who develops in both PyTorch and TensorFlow, you would normally create one notebook with the PyTorch image and a second notebook for TensorFlow. Another option is to create virtual environments (virtualenv) inside your notebook, but that requires installing each dependency manually. This post shows you how to create different Jupyter kernels where each kernel uses a different Deep Learning Container: no need to create a second notebook, use virtualenv, or spin up additional instances.
We will create a new Notebook using the following kernels:
- PyTorch
- TensorFlow
- Spark via Apache Toree
Steps
1. Create a new AI Platform Notebook using the base Python 3 image, or Python 3 (CUDA Toolkit 11.0) if you want to use an NVIDIA GPU. (Don’t forget to select the “Install driver” checkbox.)
2. List existing kernels
Open a Terminal connection and enter:
jupyter kernelspec list
3. (Optional) Delete Python2 Kernel
echo y | jupyter kernelspec uninstall python2
rm -rf /opt/conda/envs/py27/share/jupyter/kernels/python2/
4. (Optional) Display only the kernels you want. This hides the kernels that show up as conda (env*)
Open a Terminal connection and enter:
# Edit the Jupyter config
vim /home/jupyter/.jupyter/jupyter_notebook_config.py

# Add the following line at the end of the file
c.KernelSpecManager.whitelist = set(['python3','pytorch','tensorflow', 'swift', 'apache_toree_scala', 'r'])

# Restart Jupyter
sudo service jupyter restart
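If you prefer to script this step, the whitelist line can be appended idempotently. A minimal sketch, assuming the config path from above; here it defaults to a local scratch file so you can try it safely first:

```shell
# Append the kernel whitelist only if no such line exists yet.
# CONFIG defaults to a local scratch file; on the VM it would be
# /home/jupyter/.jupyter/jupyter_notebook_config.py.
CONFIG=${CONFIG:-./jupyter_notebook_config.py}
LINE="c.KernelSpecManager.whitelist = set(['python3','pytorch','tensorflow','apache_toree_scala'])"
touch "$CONFIG"
grep -qF "KernelSpecManager.whitelist" "$CONFIG" || echo "$LINE" >> "$CONFIG"
```

Re-running the snippet leaves exactly one whitelist line in the file, so it is safe to include in a provisioning script.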
5. Install PyTorch kernel
cd /opt/conda/share/jupyter/kernels/
cp -r python3 pytorch
vim pytorch/kernel.json
Remove existing content and replace it with the following:
{
"argv": [
"/usr/bin/docker",
"run",
"--network=host",
"-v",
"{connection_file}:/connection-spec",
"gcr.io/deeplearning-platform-release/pytorch-xla.1-6:m59",
"python",
"-m",
"ipykernel_launcher",
"-f",
"/connection-spec"
],
"display_name": "pytorch",
"language": "python"
}
Pull the m59 version of the container:
docker pull gcr.io/deeplearning-platform-release/pytorch-xla.1-6:m59
Note: In this case the version is m59; replace it with the version that matches your environment.
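Hand-editing JSON is error-prone, so the kernelspec above can also be generated with a heredoc and validated before use. A sketch, using the same image tag as the post (KERNELS_DIR defaults to a local directory so you can try it safely; on the VM it would be /opt/conda/share/jupyter/kernels):

```shell
# Write the PyTorch Docker kernelspec and validate the resulting JSON.
KERNELS_DIR=${KERNELS_DIR:-./kernels}
NAME=pytorch
IMAGE=gcr.io/deeplearning-platform-release/pytorch-xla.1-6:m59  # tag from the post; may differ
mkdir -p "$KERNELS_DIR/$NAME"
cat > "$KERNELS_DIR/$NAME/kernel.json" <<EOF
{
  "argv": ["/usr/bin/docker", "run", "--network=host",
           "-v", "{connection_file}:/connection-spec",
           "$IMAGE",
           "python", "-m", "ipykernel_launcher", "-f", "/connection-spec"],
  "display_name": "$NAME",
  "language": "python"
}
EOF
# Fail fast if the generated file is not valid JSON.
python3 -m json.tool "$KERNELS_DIR/$NAME/kernel.json" > /dev/null && echo "kernel.json is valid"
```

Note that `{connection_file}` is left literal on purpose: Jupyter substitutes it at kernel launch time.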
6. Install TensorFlow kernel with GPU support
cd /opt/conda/share/jupyter/kernels/
cp -r python3 tensorflow
vim tensorflow/kernel.json
Remove existing content and replace it with the following:
{
"argv": [
"/usr/bin/docker",
"run",
"--runtime=nvidia",
"--network=host",
"-v",
"{connection_file}:/connection-spec",
"gcr.io/deeplearning-platform-release/tf2-gpu.2-4:latest",
"python",
"-m",
"ipykernel_launcher",
"-f",
"/connection-spec"
],
"display_name": "tensorflow",
"language": "python"
}
Pull the latest version of the container:
docker pull gcr.io/deeplearning-platform-release/tf2-gpu.2-4:latest
Note: This Docker image is ~15GB, make sure you have enough disk space.
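Since the TensorFlow kernelspec differs from the PyTorch one only by the extra --runtime=nvidia flag, the pattern generalizes to a small helper. A sketch (the function name make_kernel is illustrative, and KERNELS_DIR defaults to a local directory rather than /opt/conda/share/jupyter/kernels so it can be tried safely):

```shell
# make_kernel NAME IMAGE [EXTRA_DOCKER_FLAG...] writes a Docker-backed kernelspec.
make_kernel() {
  local name=$1 image=$2; shift 2
  local dir="${KERNELS_DIR:-./kernels}/$name"
  local flags=""
  for f in "$@"; do flags="$flags\"$f\", "; done   # extra flags become JSON strings
  mkdir -p "$dir"
  cat > "$dir/kernel.json" <<EOF
{
  "argv": ["/usr/bin/docker", "run", $flags"--network=host",
           "-v", "{connection_file}:/connection-spec",
           "$image",
           "python", "-m", "ipykernel_launcher", "-f", "/connection-spec"],
  "display_name": "$name",
  "language": "python"
}
EOF
}

# GPU kernel: identical to the CPU case plus the NVIDIA runtime flag.
make_kernel tensorflow gcr.io/deeplearning-platform-release/tf2-gpu.2-4:latest --runtime=nvidia
```

The same helper would cover the PyTorch kernel with `make_kernel pytorch <image>` and no extra flags.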
7. Apache Spark using Toree
pip install --upgrade toree
Note: You may want to pass --use-feature=2020-resolver to pip if you get a dependency error.
Download Apache Spark 3.x
wget https://archive.apache.org/dist/spark/spark-3.0.1/spark-3.0.1-bin-hadoop3.2.tgz
tar zxvf spark-3.0.1-bin-hadoop3.2.tgz
mv spark-3.0.1-bin-hadoop3.2 $HOME/spark-3.0.1
ln -s $HOME/spark-3.0.1 $HOME/spark
Setup Spark dependencies and environment
echo 'export SPARK_HOME=$HOME/spark' >> $HOME/.bashrc
echo 'export PATH=$SPARK_HOME/bin:$PATH' >> $HOME/.bashrc
source $HOME/.bashrc
Note: the single quotes defer variable expansion until .bashrc is sourced, and export makes the variables visible to child processes.
jupyter toree install --user --spark_home=$SPARK_HOME
8. Verify Kernel installations
jupyter kernelspec list
Available kernels:
apache_toree_scala /home/jupyter/.local/share/jupyter/kernels/apache_toree_scala
python3 /opt/conda/share/jupyter/kernels/python3
pytorch /opt/conda/share/jupyter/kernels/pytorch
tensorflow /opt/conda/share/jupyter/kernels/tensorflow
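A quick way to catch typos in the kernelspecs before restarting Jupyter is to parse every kernel.json. A sketch (KERNELS_DIR defaults to a local directory here, and a dummy python3 spec is created as a stand-in for the real /opt/conda/share/jupyter/kernels tree):

```shell
# Validate every kernel.json under the kernels directory.
KERNELS_DIR=${KERNELS_DIR:-./kernels}
mkdir -p "$KERNELS_DIR/python3"
# Stand-in spec so the loop has input when run outside the VM.
printf '{"argv": ["python"], "display_name": "python3", "language": "python"}\n' \
  > "$KERNELS_DIR/python3/kernel.json"
for spec in "$KERNELS_DIR"/*/kernel.json; do
  if python3 -m json.tool "$spec" > /dev/null 2>&1; then
    echo "OK  $spec"
  else
    echo "BAD $spec"
  fi
done
```

Any kernel reported as BAD will silently fail to launch, so fixing it here saves a round of restarts.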
Restart the Jupyter service
sudo service jupyter restart
Refresh your Jupyter page; you should now see the new kernels, ready to use.
Note: To see the PyTorch and TensorFlow icons, you need to upload the corresponding icon files.
Debugging commands
sudo journalctl -u jupyter.service --no-pager
NVIDIA GPU
nvidia-smi
Next steps
Please give it a try and let me know what you think. If this is useful, I can create a shell script that automates these steps for you. You may also want to take a look at dockernel, a utility that converts Docker images into Jupyter kernels automatically.