Jupyter Notebook security in the cloud

Artem Trunov
Mar 18, 2019

Many data scientists, both students and practitioners, turn to the cloud to run Jupyter, one of the main data science tools. I’d like to discuss a good and secure way of running your Jupyter notebook in the cloud.

What usually happens: people launch an EC2 instance, add port 8888 to their AWS security group, log in to the instance with ssh, install Jupyter, and run it. They copy the Jupyter URL shown on stdout, edit it to include their instance’s IP address, and call it a job well done when everything works.

This approach has a serious security shortcoming. You have a web application running that is not really meant to be open to the world. It’s meant to run locally on a laptop or a desktop and be accessible via localhost:8888, hence its default setting is to listen on the localhost interface.
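To make the difference between “listening on localhost” and “open to the world” concrete, here is a minimal Python sketch using only the standard library. The helper name is_local_only is hypothetical and not part of Jupyter; it simply classifies a bind address given as an IP string.

```python
import ipaddress


def is_local_only(bind_address: str) -> bool:
    """Return True if a server bound to this IP address accepts only local connections."""
    # Loopback addresses (127.0.0.0/8 and ::1) are reachable only from the same machine.
    return ipaddress.ip_address(bind_address).is_loopback


print(is_local_only("127.0.0.1"))  # loopback: what "localhost" normally maps to
print(is_local_only("0.0.0.0"))    # all interfaces: reachable from the network
```

A server bound to 0.0.0.0 with port 8888 opened in the security group is exactly the exposed setup described above.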

Some advanced folks want Jupyter to start automatically when their instance is up, and discover that it’s then not easy to find the Jupyter console output and the URL with the access token they need to enter in the web interface. They choose to disable the token. They might also choose to disable password-based access, thus leaving no authentication method for their notebook. Some people even go as far as running their Jupyter server as the root user, because they could not properly configure, for example, a systemd service unit.
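The risky choices listed above can be expressed as a small checklist. The sketch below is only an illustration: it assumes the settings are available as a plain dictionary keyed by the classic Notebook server’s option names (NotebookApp.ip, NotebookApp.token, NotebookApp.password, NotebookApp.allow_root), and newer Jupyter releases may use different names. The function find_risky_settings is hypothetical.

```python
def find_risky_settings(settings: dict) -> list:
    """Return human-readable warnings for settings that weaken notebook security."""
    warnings = []
    # Anything other than a loopback address exposes the server to the network.
    if settings.get("NotebookApp.ip", "localhost") not in ("localhost", "127.0.0.1", "::1"):
        warnings.append("server listens on a non-loopback interface")
    # An empty token together with an empty password means no authentication at all.
    # A missing token key is treated as "auto-generated token", which is fine.
    if settings.get("NotebookApp.token") == "" and settings.get("NotebookApp.password", "") == "":
        warnings.append("token and password are both disabled")
    if settings.get("NotebookApp.allow_root", False):
        warnings.append("server is allowed to run as root")
    return warnings


# Example: the worst-case combination described in the text.
insecure = {"NotebookApp.ip": "0.0.0.0", "NotebookApp.token": "", "NotebookApp.allow_root": True}
for warning in find_risky_settings(insecure):
    print("WARNING:", warning)
```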

Your Jupyter notebook is powerful enough to do fatal damage to your system (it can be rooted, turned into a bot, or wiped out), since you can run any system command from a Jupyter cell. And because people often run Jupyter under their only account on the host system, that account usually has sudo privileges. Thus, if your Jupyter notebook gets hacked, the attacker has sudo/root access to your system.

I am not going to judge how vulnerable Jupyter is. It doesn’t matter, since we know that it is possible to hack pretty much anything, given enough interest from hackers. As the number of cloud Jupyter users increases, and a successful hack gives complete access to the instance, hacking Jupyter may become profitable, and hackers could eventually crack it and automate the process.

Here I show a simple and secure way to run your Jupyter notebook using an ssh tunnel. Let’s assume the following prerequisites/conventions:

  1. You configured an Ubuntu instance and have your ssh key downloaded, for example, notekey.pem. If you are on Linux, this key needs to be converted to RSA format: openssl rsa -in notekey.pem > notekey

What needs to be done:

If you added port 8888 to your security group, go back and remove it — it’s not needed.
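One way to confirm that port 8888 is really closed is to try a TCP connection from your laptop. This is a minimal sketch using Python’s standard socket module; port_is_open is a hypothetical helper name.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable or unknown hosts.
        return False
```

Calling port_is_open("instance.address", 8888) from outside the instance, with your real address in place of the placeholder, should return False once the rule is removed from the security group.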

If you are on a Linux desktop/laptop, open a terminal and run ssh with an extra option:

ssh -i /path/to/key/folder/notekey -L 9999:localhost:8888 ubuntu@instance.address

The new -L option makes your ssh client listen on port 9999 on your desktop or laptop, and anything you send to this port is tunneled to port 8888 on your cloud instance’s localhost interface. It is not strictly necessary to use a different port: you could map remote port 8888 to local port 8888. I changed the local port to avoid a conflict in case you also run a Jupyter instance on your laptop/desktop.
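The structure of the -L argument is easier to see when the command is assembled piece by piece. The following Python sketch rebuilds the ssh command from the article; build_tunnel_command is a hypothetical helper, and the key path, user, and host are the placeholders used above.

```python
import shlex


def build_tunnel_command(key_path, user, host, local_port=9999, remote_port=8888):
    """Build the ssh argument list for a local port forward (the -L option)."""
    # -L takes local_port:destination_host:destination_port. The destination
    # "localhost" is resolved on the remote side, so it means the cloud instance itself.
    forward = f"{local_port}:localhost:{remote_port}"
    return ["ssh", "-i", key_path, "-L", forward, f"{user}@{host}"]


command = build_tunnel_command("/path/to/key/folder/notekey", "ubuntu", "instance.address")
print(shlex.join(command))
```

Passing local_port=8888 gives the same-port mapping mentioned above.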

If you are on Windows 10, ssh is now available to you in a command shell as well.

If you are on other Windows versions, use PuTTY:

Configure your sessions

Configure key access

Configure the tunnel

Do not forget to go back to the session screen and save the session with a “Save” button!

Now, once you are logged in, you can run Jupyter (I run it from a conda environment here):

Copy the URL from the last line and paste it into your browser’s address bar, but don’t hit the “Go” button yet. Edit the URL to replace port 8888 with 9999, since this is the port exposed to you by the ssh tunnel. Then hit the “Go” button. Voilà:
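If you would rather not edit the URL by hand, the port swap can be scripted. This is a minimal sketch with Python’s standard urllib.parse; rewrite_port is a hypothetical helper and the token in the example is a made-up placeholder.

```python
from urllib.parse import urlsplit, urlunsplit


def rewrite_port(url: str, new_port: int = 9999) -> str:
    """Replace the port in a URL while keeping the path and token intact."""
    parts = urlsplit(url)
    netloc = f"{parts.hostname}:{new_port}"
    return urlunsplit((parts.scheme, netloc, parts.path, parts.query, parts.fragment))


# The token below is a placeholder, not a real Jupyter token.
print(rewrite_port("http://localhost:8888/?token=abc123"))
```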

In the next publication I’ll show a proper way to configure Jupyter notebook to start up automatically when you launch your instance.
