Creating a multi-user JupyterHub Instance
In this article I will show how to set up JupyterHub with SSL on Google Cloud Platform. Supervisor will be configured to manage the process. JupyterLab will also be installed. Finally, a dataset will also be shared with the users.
About JupyterHub
Jupyter Notebooks provide a way of creating and sharing documents containing live code, visualisations and descriptive text. Typically these would be run locally on the Python kernel on the user’s machine. However, with JupyterHub, the user accesses the Notebook through a browser but storage and computation take place on a multi-user server.
JupyterHub consists of a configurable HTTP proxy and a hub from which Notebooks are spawned. By default the hub spawns Notebooks on the server machine itself; more advanced setups can spawn Notebooks into containers using Docker and/or Kubernetes (out of scope for this article).
Creating the JupyterHub Instance
Here I’ll use a single, free Google Cloud f1 micro-instance running Debian 9 to demonstrate setting up a JupyterHub web application. Users will access the UI via a browser and log in via the underlying server’s PAM authentication. SSL will be set up and Supervisor will be used to manage and maintain the JupyterHub process.
1. Creating the VM
I created a free Google Cloud f1 micro-instance running Debian 9. Remember to select Allow HTTP traffic and Allow HTTPS traffic (these settings should be considered carefully when moving to a production environment). All other default options were used.
In this example we’ll use the default external IP address to access JupyterHub. This IP changes each time the machine is reset, so a permanent IP address should be assigned in production environments.
2. Installing the configurable proxy
I used the npm package manager to install the configurable HTTP proxy (a wrapper for node-http-proxy). First I downloaded and ran the NodeSource setup script, which adds the Node.js 12.x package repository:
curl -sL https://deb.nodesource.com/setup_12.x | sudo -E bash -
and then installed Node.js, which includes npm:
sudo apt install -y nodejs
Then I installed the proxy itself using npm:
sudo npm install -g configurable-http-proxy
3. Installing Jupyter and JupyterHub
I used pip to install the Python packages, so let’s get that first:
curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py
and now install it:
sudo python3 get-pip.py
Install Jupyter:
sudo pip3 install jupyter
Let’s install JupyterHub itself:
sudo pip3 install jupyterhub
Finally, let’s generate a configuration file for JupyterHub:
jupyterhub --generate-config
This writes jupyterhub_config.py to the current directory. We won’t make any changes to it initially.
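As a quick sanity check before moving on, something like the following can confirm that everything installed so far is on the PATH. This is only a sketch: the function name check_tools is my own, and it assumes each tool accepts a --version flag.

```shell
# Hypothetical helper: report the version of each tool installed in steps 2 and 3.
check_tools() {
    for tool in node npm configurable-http-proxy jupyterhub; do
        if command -v "$tool" >/dev/null 2>&1; then
            # Keep only the first line of output in case a tool prints more.
            printf '%s: %s\n' "$tool" "$("$tool" --version 2>&1 | head -n 1)"
        else
            printf '%s: not found on PATH\n' "$tool"
        fi
    done
}

check_tools
```

Any line reading "not found on PATH" points at the install step that needs repeating.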
4. Initial testing without SSL
Now let’s try running JupyterHub without SSL on port 80. First we’ll need to create an OS account with a password to log in with:
sudo adduser jupyterhubtest
Enter a password; the name, address etc. fields can be left blank.
Now start JupyterHub:
sudo jupyterhub --no-ssl --port 80
With luck this should start JupyterHub, and the UI should be accessible at http://<external_IP> (port 80):

Enter the login details created earlier and a server will be spawned. The view should be familiar if you’ve used Jupyter Notebooks before:

The OS user’s home directory on the server will be shown and any Notebooks and files will be saved there.
N.B. JupyterHub is unsecured at this stage so no important data should be stored on the server.
Now let’s install JupyterLab, the next-generation web interface for Project Jupyter:
sudo pip3 install jupyterlab
The JupyterLab interface should now be accessible at http://<external_IP>/user/jupyterhubtest/lab:
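If JupyterLab should be the default interface rather than something users reach by editing the URL, the spawner’s default URL can be set in jupyterhub_config.py. Here is a minimal sketch, assuming the config file generated earlier is in the current directory (the CONFIG_FILE variable is my own addition). JupyterHub needs to be restarted to pick up the change.

```shell
# Assumption: jupyterhub_config.py is in the current directory, as generated in step 3.
CONFIG_FILE="${CONFIG_FILE:-jupyterhub_config.py}"

# Append the setting only if an active (uncommented) one is not already present.
if ! grep -q "^c\.Spawner\.default_url" "$CONFIG_FILE" 2>/dev/null; then
    printf '%s\n' "c.Spawner.default_url = '/lab'" >> "$CONFIG_FILE"
fi

# Show every line mentioning the setting, including commented-out defaults.
grep -n "default_url" "$CONFIG_FILE"
```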

5. Adding SSL
In order to use secure communication rather than plain HTTP, I created a self-signed certificate/key pair using openssl:
openssl req -x509 -newkey rsa:1024 -keyout jhub.key -out jhub.crt -days 365 -nodes
Now these files can be referenced when starting JupyterHub, and port 443 can be specified:
sudo jupyterhub --ssl-key jhub.key --ssl-cert jhub.crt --port 443
Since the certificate is self-signed, the user will get a security warning when loading the page. In production a CA-signed certificate should be used.
Now JupyterHub can be accessed at https://<external_IP> (port 443 is the HTTPS default). Communication over port 80 can now be disabled via the Google Cloud Console.
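Before pointing JupyterHub at a certificate it can help to inspect it. The sketch below is a variation on the command above, not the exact one I ran: it works in a scratch directory, skips the interactive prompts with -subj, and uses a 2048-bit key, which newer OpenSSL configurations may insist on. The hostname jupyterhub.example.test is a placeholder.

```shell
# Work in a scratch directory so existing key/certificate files are not overwritten.
WORKDIR="$(mktemp -d)"

# Generate an unencrypted key and a self-signed certificate valid for 365 days.
# -subj supplies the subject non-interactively; the CN is a placeholder hostname.
openssl req -x509 -newkey rsa:2048 -nodes \
    -keyout "$WORKDIR/jhub.key" -out "$WORKDIR/jhub.crt" \
    -days 365 -subj "/CN=jupyterhub.example.test" 2>/dev/null

# The private key should only be readable by its owner.
chmod 600 "$WORKDIR/jhub.key"

# Print the subject and the validity window of the certificate.
openssl x509 -in "$WORKDIR/jhub.crt" -noout -subject -dates
```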
6. Creating an admin user
Admin users are defined in jupyterhub_config.py, which we generated earlier. The following line should be added to give the user we created earlier admin access:
c.JupyterHub.admin_users = set(["jupyterhubtest"])
Now the user can access the admin console, where other users’ processes can be stopped and users can be created:

7. Setting up Supervisor
Currently we are just running JupyterHub from the command line: if the process crashes or the server restarts, JupyterHub will no longer be available. Supervisor will run and monitor the JupyterHub process, restarting it automatically if it crashes or the server is rebooted.
First, let’s install Supervisor:
sudo apt-get install -y supervisor
Now we need to create a config file containing the command to start JupyterHub, at /etc/supervisor/conf.d/jupyterhub.conf, with the following contents:
[program:jupyterhub]
command = jupyterhub --ssl-key /path/to/cert/jhub.key --ssl-cert /path/to/cert/jhub.crt --port 443 -f /path/to/config/jupyterhub_config.py
autostart=true
autorestart=true
stopasgroup=true
killasgroup=true
N.B. Full paths to the key, certificate and config file are used because Supervisor runs the process as the root user from a different working directory, so the relative locations used before won’t work.
Reload the Supervisor config:
sudo supervisorctl reload
Supervisor should now start the JupyterHub process automatically. If there are issues, the Supervisor log can be checked at /var/log/supervisor/supervisord.log.
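Besides reading the log file, supervisorctl itself can report on the process. A minimal sketch, assuming the program name jupyterhub from the [program:jupyterhub] section above; the guard simply avoids an error on machines where Supervisor is not installed.

```shell
# Assumption: the program name matches the [program:jupyterhub] section.
PROGRAM="jupyterhub"

if command -v supervisorctl >/dev/null 2>&1; then
    # Show the state Supervisor reports for the process (RUNNING, FATAL, ...).
    sudo supervisorctl status "$PROGRAM"
    # Print the most recent stderr output Supervisor captured for the process.
    sudo supervisorctl tail "$PROGRAM" stderr
else
    echo "supervisorctl not found; install Supervisor first" >&2
fi
```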
8. Sharing data with users
Often it’s necessary to share data with users automatically, e.g. when creating Notebooks for students in a class, so that the class materials are available to every student.
To do this, we’ll create a shared folder and configure a symbolic link to it so that each newly created user has the data in their home directory (alternatively, there is also a plugin that uses Git to a similar end: nbgitpuller).
First, we’ll create a shared directory:
sudo mkdir -p /srv/data/my_shared_data_folder
Now we’ll create a file in this directory:
sudo touch /srv/data/my_shared_data_folder/sample.txt
We’ll add a symbolic link to the directory in the skeleton directory, so that it’s added to each new user’s home directory when created:
cd /etc/skel
sudo ln -s /srv/data/my_shared_data_folder my_shared_data_folder
Now when we create a new user, the symbolic link will be created in their home directory and the user will see the data in the JupyterHub UI:
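To see why the skeleton link works without touching the real /etc/skel, the same steps can be rehearsed in a scratch directory. This is only an illustration: the SANDBOX paths are stand-ins of my own, and cp -a imitates the copy of the skeleton directory that happens when a user is created.

```shell
# Scratch stand-ins for /srv/data, /etc/skel and a new user's home directory.
SANDBOX="$(mktemp -d)"
mkdir -p "$SANDBOX/srv/data/my_shared_data_folder" "$SANDBOX/skel"
touch "$SANDBOX/srv/data/my_shared_data_folder/sample.txt"

# Link the shared folder into the skeleton directory using an absolute target.
ln -s "$SANDBOX/srv/data/my_shared_data_folder" "$SANDBOX/skel/my_shared_data_folder"

# Copy the skeleton to a new "home"; -a keeps the symbolic link as a link.
cp -a "$SANDBOX/skel" "$SANDBOX/home_newuser"

# The link appears in the new home and the shared file is visible through it.
ls -l "$SANDBOX/home_newuser"
ls "$SANDBOX/home_newuser/my_shared_data_folder"
```

On the real server the equivalent check is ls -l on the new user’s home directory. Users created before the link was added to /etc/skel (jupyterhubtest here) will not have it; for them the link can be added by hand with sudo ln -s /srv/data/my_shared_data_folder /home/jupyterhubtest/my_shared_data_folder.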

Conclusion
We have created a secure JupyterHub instance that can be accessed from anywhere. Supervisor manages the process automatically and when a new user is created, they can automatically access a dataset in their home directory.
