Containerized Jupyter notebooks on GPU on Google Cloud
In a previous post, I listed out the steps to run Jupyter notebooks on GPU instances on GCP Compute Engine. It turns out there is a much easier and more flexible way: Docker containers.
I am assuming your Google Cloud Platform account allows you to create GPU-based instances. If not, please follow step 1 from the previous post. Also, make sure you have the latest gcloud components:
$ gcloud components update && gcloud components install beta
This post on NVIDIA’s blog explains how this setup works. The GPU instances only need the NVIDIA drivers installed on the host (plus a thin wrapper around Docker called nvidia-docker). All other software, such as the CUDA toolkit, cuDNN, Python, Jupyter and any deep learning libraries, can simply be containerized into reusable Docker images. NVIDIA and the authors of most deep-learning frameworks (TensorFlow, Keras, PyTorch) provide ready-to-use Docker images that you can use directly or as a base image.
Step 1: Create GPU instance
You can, of course, use the Cloud Console to create a GPU-based instance, but I am going to use the gcloud command-line tool.
$ gcloud beta compute instances create gpu-docker-host --machine-type n1-standard-2 --zone us-east1-d --accelerator type=nvidia-tesla-k80,count=1 --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud --boot-disk-size 50GB --maintenance-policy TERMINATE --restart-on-failure
This creates an instance named gpu-docker-host in the us-east1-d zone with 1 Tesla K80 GPU, Ubuntu 16.04, and a 50GB persistent boot disk.
Once your GPU instance is ready, you can connect to it via your ssh client, or with the gcloud compute ssh gpu-docker-host --zone us-east1-d command.
Step 2: Install NVIDIA driver, docker and nvidia-docker
Once on the server, download this script to install the dependencies:
$ curl -O -s https://gist.githubusercontent.com/durgeshm/b149e7baec4d4508eb4b2914d63018c7/raw/798aadbb54b451abcaba9bfeb833327fa4b3d53b/deps_nvidia_docker.sh
The script automates the following tasks (it is always a good idea to take a look at a script before running it, rather than blindly executing a stranger’s code):
- Confirm that the instance has a GPU from NVIDIA, otherwise exit.
- Check and install NVIDIA driver if necessary (for Tesla K80).
- Check and install Docker if necessary.
- Check and install nvidia-docker if necessary.
$ sudo sh deps_nvidia_docker.sh
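For the curious, the script’s first task can be sketched roughly like this (an assumption based on the task list above; the gist itself is the authoritative version):

```shell
#!/bin/sh
# Rough sketch of the GPU check in deps_nvidia_docker.sh (the real
# script may differ in detail).

# Returns success if the given `lspci` listing mentions an NVIDIA device.
has_nvidia_gpu() {
  printf '%s\n' "$1" | grep -qi nvidia
}

if has_nvidia_gpu "$(lspci 2>/dev/null)"; then
  echo "NVIDIA GPU detected; proceeding with driver install."
else
  echo "No NVIDIA GPU found; exiting." >&2
fi
```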
Note: You can create the instance and install these dependencies in one single step: download the script locally and pass it as a startup script to the gcloud command by appending --metadata-from-file startup-script=deps_nvidia_docker.sh. But I have separated out the steps here.
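For reference, the combined command might look like the following (a sketch reusing the flags from Step 1; the command is echoed for review rather than executed):

```shell
#!/bin/sh
# Sketch: create the instance and install all dependencies in one step.
# Assumes deps_nvidia_docker.sh has already been downloaded locally.
CMD="gcloud beta compute instances create gpu-docker-host \
  --machine-type n1-standard-2 --zone us-east1-d \
  --accelerator type=nvidia-tesla-k80,count=1 \
  --image-family ubuntu-1604-lts --image-project ubuntu-os-cloud \
  --boot-disk-size 50GB --maintenance-policy TERMINATE --restart-on-failure \
  --metadata-from-file startup-script=deps_nvidia_docker.sh"
echo "$CMD"  # review the command, then run it with: eval "$CMD"
```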
Step 3: Ready to run any CUDA-enabled docker container!
Once the script finishes installing nvidia-docker, we are ready to run a simple test container from NVIDIA.
$ sudo nvidia-docker run --rm nvidia/cuda nvidia-smi
If you see GPU and driver information in the console, then your setup is ready. (Obviously, the first time you run this, it will take a few seconds to pull the nvidia/cuda image from Docker Hub.)
Now, let’s try a TensorFlow/Keras/Jupyter docker container (Dockerfile).
$ mkdir notebooks # to persist notebooks on the host
$ sudo nvidia-docker run -it --rm -d -v $(pwd)/notebooks:/notebooks -p 8888:8888 --name keras durgeshm/jupyter-keras-gpu
Check the container log to confirm that Jupyter is running:
~$ sudo docker logs keras
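If the image runs Jupyter with token authentication (the default for most Jupyter images), you can fish the login URL out of that log; here is a small helper sketch, assuming the log contains a line like http://localhost:8888/?token=...:

```shell
#!/bin/sh
# Helper sketch: read Jupyter's startup log on stdin and print the last
# tokenized login URL it contains (assumes token auth is enabled).
jupyter_url() {
  grep -o 'http://[^ ]*token=[A-Fa-f0-9]*' | tail -n 1
}

# Usage on the host:
#   sudo docker logs keras 2>&1 | jupyter_url
```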
To make it easier to start the container in the future, I have also added a script, run-keras.sh:
~$ echo 'sudo nvidia-docker run -it --rm -d -v $(pwd)/notebooks:/notebooks -p 8888:8888 --name keras durgeshm/jupyter-keras-gpu' > run-keras.sh && chmod u+x run-keras.sh
Step 4: SSH tunnel forwarding
Set up a tunnel from your local machine to access Jupyter over ssh.
If you have already started the keras container on the server, then run the following on your local machine.
$ ssh -i .ssh/ubuntu_gcp -L 8899:localhost:8888 -f -N ubuntu@<gpu-docker-host>
I have defined a handy alias for myself to start the keras container remotely and open the tunnel immediately from my local machine.
$ alias tf-gpu="ssh gpu-docker-host './run-keras.sh' && ssh -fNL 8899:localhost:8888 gpu-docker-host"
# ^ The first ssh command prints the id of the container it just started.
Step 5: Start using Jupyter locally in your browser
Navigate to http://localhost:8899/ and create a new notebook. Verify by importing keras or tensorflow.
Running ssh gpu-docker-host "sudo docker logs keras" can confirm whether the CUDA libraries are being loaded.
Once you are done, please remember to stop your instance to save costs. Thanks for reading.
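For reference, here is the matching stop command, using the instance name and zone from Step 1 (echoed for review rather than executed):

```shell
#!/bin/sh
# Sketch: stop (not delete) the instance. Stopping pauses GPU/vCPU
# billing while the boot disk, and thus your notebooks, persists.
STOP_CMD="gcloud compute instances stop gpu-docker-host --zone us-east1-d"
echo "$STOP_CMD"  # review, then run it (or just run gcloud directly)
```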
- I have created Docker images durgeshm/jupyter-keras-gpu (Dockerfile) and durgeshm/jupyter-pytorch-gpu (Dockerfile) for Keras/TensorFlow and PyTorch respectively.
- In addition, you can always use the official docker images (https://github.com/fchollet/keras/blob/master/docker/Dockerfile, https://github.com/pytorch/pytorch/blob/master/Dockerfile or https://hub.docker.com/r/tensorflow/tensorflow/) as a base image for your own customizations.
- Note that the /notebooks volume in the container is mounted from ~/notebooks on the host. This way, you can always remove old containers, but your notebooks will be persisted on the host.
- Only Step 1 is specific to Google Cloud Platform. The other steps should work on other cloud platforms with NVIDIA GPUs (the dependency script specifically targets the Tesla K80).