There have been many tutorials on building your own Jupyter server, but few focus on building a lasting one, that is, how to ensure Jupyter keeps running even after the ssh connection drops or the system reboots.
ML engineers have long cursed unstable ssh connections that send precious hours of training to waste. Here we summarize three popular solutions for a forever Jupyter server, all tested on Ubuntu (let me know if any CentOS user runs into problems). This post assumes you have experience spinning up a Jupyter server.
The easiest solution comes to the rescue. We get that not all machine learning researchers have experience with server provisioning, which is why I recommend tmux as our first pick. Both Ubuntu and CentOS come with tmux out of the box, so there’s no need for sudo to install it. What I love about tmux is that it doesn’t require writing a configuration file, nor does it need to manage deep learning dependencies, something both AWS and GCP already provide images for (DLAMI and DLVM respectively).
tmux is as easy as it gets: start a tmux session and run a Jupyter server in it. So long as you don’t exit the tmux session, the server will continue running, even after your ssh connection has dropped.
# Start a tmux session
$ tmux
# Run a jupyter server
$ jupyter notebook
# Leave tmux session without exiting the running jupyter server
Ctrl+b + d
# Check if jupyter server is running
$ ps aux | grep jupyter
# Reenter tmux session
$ tmux attach
# Exit tmux session while you are in it
$ exit
Downside: tmux doesn’t survive a system reboot. But the fact that it survives an unstable ssh connection should be enough for most use cases that would otherwise risk losing work in progress.
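If you would rather not type the tmux steps by hand every time, they can be wrapped in a small helper. This is only a sketch: the function name `start_jupyter_tmux` and the default session name `jupyter` are my own picks, not anything tmux or Jupyter prescribes.

```shell
# Start Jupyter in a named, detached tmux session unless one already exists.
# The default session name "jupyter" is an assumption; use any name you like.
start_jupyter_tmux() {
    local session="${1:-jupyter}"
    if tmux has-session -t "$session" 2>/dev/null; then
        echo "tmux session '$session' is already running"
    else
        # -d: start detached, -s: session name, last argument: command to run
        tmux new-session -d -s "$session" 'jupyter notebook --no-browser'
        echo "started tmux session '$session'"
    fi
}

# Usage:
#   start_jupyter_tmux          # create (or reuse) the session
#   tmux attach -t jupyter      # reattach later to look at the server output
```

Naming the session makes it easier to find again when several tmux sessions are open on the same machine.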
supervisor can be slightly more complicated to set up, but it beats tmux on one point: it restarts the specified program after a system reboot, so the program, in this case our Jupyter server, can seemingly run forever.
# Install supervisor
$ sudo apt install supervisor
$ sudo service supervisor start
To have supervisor run your Jupyter server, we first need to create a configuration file and place it in /etc/supervisor/conf.d. Below is an example of that configuration file (which must be named with a .conf extension):
[program:jupyter]
command = jupyter notebook --no-browser --config=/path/to/config
directory = /path/to/working/directory
user = ubuntu ; or whoever
autostart = true
autorestart = true
stdout_logfile = /var/log/your_log_file.log
redirect_stderr = true
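One way to avoid typos is to generate the file from the shell and inspect it before copying it into place. The sketch below assumes a section header of `[program:jupyter]` and a file name of `jupyter.conf`; both names are my own choices, and the paths are the same placeholders used in the example config.

```shell
# Write the supervisor program definition to <dir>/jupyter.conf.
# The section name "jupyter" and the file name are assumptions; the paths
# are placeholders that you need to replace with your own.
write_jupyter_conf() {
    local dir="$1"
    cat > "$dir/jupyter.conf" <<'EOF'
[program:jupyter]
command = jupyter notebook --no-browser --config=/path/to/config
directory = /path/to/working/directory
user = ubuntu
autostart = true
autorestart = true
stdout_logfile = /var/log/your_log_file.log
redirect_stderr = true
EOF
}

# Usage: write to a scratch directory first, review, then copy with sudo.
#   write_jupyter_conf /tmp
#   sudo cp /tmp/jupyter.conf /etc/supervisor/conf.d/
```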
After saving the configuration file to the directory /etc/supervisor/conf.d, don’t forget to ask supervisor to read in the configuration file and kickstart the program:
$ sudo supervisorctl reread
$ sudo supervisorctl update
$ sudo supervisorctl status # Check if Jupyter is running
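If you want to script that last check, something along these lines should work. It is a sketch that assumes your program section is named `jupyter` and that `supervisorctl status` prints the state word `RUNNING` for a healthy program; the function name is hypothetical.

```shell
# Return success if supervisor reports the given program as RUNNING.
# Assumes the program was declared as [program:jupyter]; pass another
# name as the first argument if yours differs.
jupyter_is_running() {
    local program="${1:-jupyter}"
    sudo supervisorctl status "$program" | grep -q 'RUNNING'
}

# Usage: show the tail of the program's log when it is not running.
#   jupyter_is_running || sudo supervisorctl tail jupyter
```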
Downside: the need to write a damn config file lol, though supervisor can also be super helpful for other automation tasks.
P.S. The supervisor section of this post draws on Albert Yang’s post. I do not claim credit for it. Albert’s post has a more detailed walk-through for setting up supervisor as well as using nginx as a reverse proxy, which provides easier solutions for routing and SSL connections.
Enter DevOps’ favorite choice: Docker. Docker’s advantage may not be best reflected on AWS’ DLAMI and GCP’s DLVM, since the dependencies have already been taken care of there. That said, Docker still comes in handy when you have to start off with a plain server environment. Almost all vendor-provided Linux images today come with Docker pre-installed, including DLAMI and DLVM; even if yours doesn’t, there are many tutorials for installing Docker (for example, on Ubuntu 18.04).
There are many approaches for using Docker in deep learning, but here we are only concerned with running a lasting Jupyter server.
First we pull a pre-built deep learning image from Docker Hub and run it on port 8888. For the sake of simplicity, we are pulling the TensorFlow image from the official Jupyter repository on Docker Hub.
$ docker pull jupyter/tensorflow-notebook
$ docker run -d -p 8888:8888 -e JUPYTER_ENABLE_LAB=yes -v "$PWD":/home/ubuntu jupyter/tensorflow-notebook
Most Docker images have their own setups, and this one is no exception. Since the container’s working directory doesn’t contain our Jupyter notebook folder, we need to symlink the folder into the working directory.
# Find your container name
$ docker ps
# Enter into the container
$ docker exec -it your_container_name bash
# Inside the docker container, symlink the folder to the working directory
$ ln -s /your/jupyter/notebook/folder /home/jovyan/whatever
Voila! A Jupyter server is now serving your folder at port 8888.
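Since the container runs detached, the login URL with its token never shows up in your terminal. A small helper can dig it out of the container log. This is a sketch: the function name is hypothetical, and it assumes the server logs its URL in the form `http://host:port/?token=<hex>`, which may vary between image versions.

```shell
# Print the tokenized login URL(s) that the Jupyter server wrote to the
# container log. Assumes log lines contain http://...token=<hex digits>.
jupyter_login_url() {
    local container="$1"
    # Jupyter logs to stderr, so merge both streams before filtering
    docker logs "$container" 2>&1 | grep -o 'http://[^ ]*token=[0-9a-f]*' | sort -u
}

# Usage:
#   jupyter_login_url your_container_name
```

If nothing is printed, reading the raw output of `docker logs your_container_name` is the fallback.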
- By default, a Docker container doesn’t survive a system reboot either.
- You will likely need different Docker images for different dependencies, and each image can be huge (~5GB).
- Third-party images come at a cost: the more an image customizes, the less applicable it is to general use. Of course you could build your own Docker image, or even build an image per project (check out repo2docker), but that’s outside the scope of this tutorial.
So here it goes: tmux wins in easiness, Docker in ability to customize. Even though we presented these tools as separate approaches, they can sometimes be used together. In fact, my personal favorite is docker in conjunction with supervisor:
[program:jupyter]
command = docker container start your_container_name
directory = /home/ubuntu
user = ubuntu
autostart = true
startsecs = 1
startretries = 0
exitcodes = 0
stdout_logfile = /var/log/your_log.log
redirect_stderr = true
The caveat here is that supervisor is designed to manage long-running jobs, not one-time jobs. Therefore, we need to specify startretries = 0 and exitcodes = 0 to tell supervisor to stop issuing retries after the command has exited, though this feels more like a workaround to me.
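If the workaround bothers you, Docker’s own restart policy is another option worth trying. This is not the setup described above, just an alternative sketch; it assumes the Docker daemon itself is started at boot, and the function name is hypothetical.

```shell
# Ask Docker to bring the container back after a daemon restart or reboot,
# then print the restart policy that is now in effect.
# Assumes the Docker daemon is enabled to start at boot.
enable_container_restart() {
    local container="$1"
    docker update --restart unless-stopped "$container" >/dev/null &&
        docker inspect -f '{{.HostConfig.RestartPolicy.Name}}' "$container"
}

# Usage:
#   enable_container_restart your_container_name
```

With `unless-stopped`, a container you stopped on purpose stays stopped, while one that was running before the reboot is started again.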
The three solutions above are by no means the only three for spinning up a lasting Jupyter server. If you find other ways to implement or other tools that are just as convenient, please share with us.