Set-up a remote machine for Deep Learning with Jupyter Notebook.

Miguel Romero Calvo
Jun 2, 2019

--

Introduction

Some people doing Deep Learning need to use a remote machine with a GPU, and sometimes the hardest part is the set-up. When I started, I spent two days fighting with the configuration just to get a nice conda set-up and a remote Jupyter Notebook up and running. This blog post aims to make it less painful for those in the same position I was in back then.

You do not need to read the whole blog post. Find what you are looking for in the index and just jump to that section.

As a quick summary, this post has the following sections:

1. BASIC SET-UP

1.1. Installing Anaconda3

1.2. Setting up Jupyter Notebook

1.3. Installing PyTorch

2. TMUX

3. WHAT COOL KIDS DO: EXTRAS

3.1. Custom Password

3.2. IP address -> name (MAC)

3.3. SSH TUNNEL

4. RUNNING & OPENING THE NOTEBOOK

In Section 1 the basic set-up is explained. In Section 2 we show a recommended way of running your Jupyter Notebook using Tmux. In Section 3 we close up with some recommendations that can make your life easier when working on a remote machine. Lastly, in Section 4 we review how to open Jupyter Notebook after launching it.

Throughout the blog post, I am going to use IP to refer to the IP address of the remote machine, for instance, 127.83.75.12 (or the ec2…. in case you are using AWS), and user to refer to the username you use to access the remote machine, for example, miguel.

I will be using

$

if the code is run in the terminal. We will be using vi as a file editor. Let's review some basic functionality that we are going to use:

  • To use the "edit" mode press i and to exit it press esc.
  • To save and exit press esc and type in :wq

I am assuming that you have your ssh private key in ~/.ssh/ and you are able to connect to the remote machine by running

ssh user@IP

in the terminal.

Note for those using AWS machines: Take into account that you are going to need to add an inbound rule for the port where the Jupyter Notebook is running. We will set the specific port in sections 1.2. and 3.3.

1. BASIC SET-UP

  • Ssh to your remote machine
ssh user@IP

1.1. Installing Anaconda3

If you are using AWS check first if the instance you are working on has anaconda installed. You can check with:

$ conda --version
  • Download Anaconda3 from one of the .sh installer links. Here, I will assume the remote machine is Linux x86_64.
$ wget -c https://repo.continuum.io/archive/Anaconda3-5.3.0-Linux-x86_64.sh
  • Execute the script
$ bash Anaconda3-5.3.0-Linux-x86_64.sh

While it executes you are going to need to answer some questions. Here I am giving the answers that you are (most often) going to want.

  • Say yes or press enter to have your anaconda installed in the remote home directory (default).
  • You want to initialize it in .bashrc. Say yes to that.
  • Say no to install vscode.

The installation will finish. To "refresh" your current terminal window run

$ source ~/.bashrc

Congratulations, Anaconda3 has been installed!

1.2. Setting up Jupyter notebook

I am assuming you are in your remote home directory.

$ cd ~/

Let's move on to setting up Jupyter Notebook. First things first, let's make sure you have the latest version.

$ conda update jupyter notebook

Once this has finished refresh the current terminal window.

$ source ~/.bashrc

Now it's time to tell Jupyter Notebook that you want to use it remotely. To do so we have to create a configuration file for Jupyter Notebook.

$ jupyter notebook --generate-config

The configuration file has been created. Now let's edit it.

$ vi .jupyter/jupyter_notebook_config.py

Now that you are inside it, add the following lines at the top of the file.

c = get_config()
c.NotebookApp.ip = '0.0.0.0'
c.NotebookApp.open_browser = False
c.NotebookApp.port = 8888

Save the file, exit, and return to your remote home directory.

Great! You have set up your Jupyter Notebook to work remotely!
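
If you want to double-check from your local machine that the notebook port is reachable (for example after adding the AWS inbound rule), a small Python check like the one below may help. This is only a minimal sketch: port_is_open is a name I made up for illustration, and it only tests whether a TCP connection can be opened, not whether Jupyter is the process answering.

```python
import socket


def port_is_open(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        # create_connection resolves the host and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and names that do not resolve.
        return False
```

Once the notebook is running (see Section 4), calling port_is_open("IP", 8888) from your local machine, with IP replaced by your own address, should return True.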

1.3. Installing PyTorch

This is straightforward. The PyTorch website provides a nice GUI that tells you which command to run. You want to use the command from the "start locally" tab. The main things you need to know are:

  • Am I using an Nvidia GPU? If so, which CUDA version? You can check by running the command below and reading the release number in its output.
$ nvcc --version
  • Which Python version are you using? You can check it by running the command below.
$ python --version
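
Once the install command from the PyTorch website has finished, you may want to confirm that PyTorch actually sees the GPU. Below is a minimal sketch, assuming you run it with the same Python you installed PyTorch into. torch_gpu_report is a hypothetical helper name; torch.cuda.is_available() is the PyTorch call doing the real work.

```python
import importlib.util


def torch_gpu_report() -> dict:
    """Summarise whether PyTorch is installed and whether it can see a GPU."""
    # find_spec lets us check for the package without crashing if it is missing.
    if importlib.util.find_spec("torch") is None:
        return {"installed": False, "cuda_available": False}
    import torch

    return {
        "installed": True,
        "version": torch.__version__,
        "cuda_available": torch.cuda.is_available(),
    }


print(torch_gpu_report())
```

If cuda_available comes back False on a machine with a GPU, the usual suspect is a mismatch between the CUDA version you selected on the website and the one installed on the machine.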

2. TMUX

To explain why tmux is useful, let me first give you some context. If you ssh to your remote machine and start a Jupyter Notebook, run a script or spin up any other process, it may work great, but it is killed if your ssh connection breaks for any reason.

Imagine that you have been training a model for 7 hours, with only 30 min left to finish, and all of a sudden your internet connection goes down 😭. You just lost those 7 hours!

Not cool, no?

Tmux works great in those scenarios. It allows you to create sessions so if you run your Jupyter Notebooks, scripts or anything in them they won't be killed if you are disconnected from the ssh.

From my experience, it is usually installed but if it's not please look at this webpage to see how to install it.

Let me review some basic tmux commands:

  • To create a new session k, run the command below, where k is the index of the session you want to create. For example, k can be 0, 1, 2, ….
$ tmux new -s k
  • To exit a session without killing it. Press ctrl+b (at the same time) and then press d (for detach).
  • To attach to a session that was already created, run the command below, where k is the index of the session.
$ tmux attach -t k
  • To kill a session run the command below where k is the index of the session to be killed.
$ tmux kill-session -t k
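
If you prefer to start the notebook inside tmux in one go, the commands above can be combined. The sketch below only builds the command line. tmux_launch_command is a hypothetical helper name, and it relies on tmux accepting a shell command after new-session -d -s NAME (new is a short alias of new-session).

```python
import shlex


def tmux_launch_command(session: str, command: str) -> list:
    """Build a tmux invocation that runs `command` in a new detached session."""
    # -d: do not attach to the new session, -s: name of the session.
    return ["tmux", "new-session", "-d", "-s", session, command]


cmd = tmux_launch_command("0", "jupyter notebook")
print(shlex.join(cmd))  # tmux new-session -d -s 0 'jupyter notebook'
```

You can paste the printed line into the remote terminal, or pass cmd to subprocess.run(cmd, check=True) on the remote machine. Afterwards tmux attach -t 0 takes you to the running notebook.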

3. WHAT COOL KIDS DO: EXTRAS

In this section, some extra configuration is provided. By no means are the following sub-sections necessary to run your Jupyter Notebook. However, in my experience, they are really handy if you work with the same machine on a daily basis.

3.1. Custom Password

Up until now, you had to copy and paste the URL with the token whenever initializing Jupyter Notebook. I find that to be annoying. There is a much better way to deal with it. You can set Jupyter to work with your custom password rather than a token. Let's see how to do so.

I am assuming you are in your remote home directory.

Let's first generate a hashed password from your custom password. To do so run the following:

$ ipython

Then a python console will appear and you have to run the following:

[0] from notebook.auth import passwd

And then run the following command to get the hashed version of your custom password (replace CUSTOM_PASSWORD with your own, keeping the quotes).

[1] passwd('CUSTOM_PASSWORD')

A string such as 'asfsadfasdfdgasdf' will appear. You need to copy it and paste it into the Jupyter configuration file afterwards. Let's do that step by step. Assuming that you have already copied the string, you first need to exit the console.

[2] exit

Assuming that you are in your remote home directory run the following.

$ vi .jupyter/jupyter_notebook_config.py

Then, insert the following line somewhere under c = get_config().

c.NotebookApp.password = u'asfsadfasdfdgasdf'

where 'asfsadfasdfdgasdf' is the string you have copied (i.e. your hashed password).

Note: The hashed password is wrapped in u' '

Now save and exit the configuration file and run the following line in the terminal.

$ source ~/.bashrc

Done!
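
In case you wonder why the configuration file stores a hash rather than the password itself, here is a rough illustration of the idea of a salted hash. To be clear, this is not Jupyter's actual implementation: the function names, the sha256 choice and the 'algorithm:salt:digest' layout are assumptions I picked for the example.

```python
import hashlib
import hmac
import secrets


def make_salted_hash(password: str) -> str:
    """Return 'algorithm:salt:digest' for password, using a fresh random salt."""
    salt = secrets.token_hex(6)
    digest = hashlib.sha256((password + salt).encode("utf-8")).hexdigest()
    return f"sha256:{salt}:{digest}"


def check_password(stored: str, candidate: str) -> bool:
    """Recompute the digest with the stored salt and compare it to the stored one."""
    algorithm, salt, digest = stored.split(":")
    recomputed = hashlib.new(algorithm, (candidate + salt).encode("utf-8")).hexdigest()
    # compare_digest avoids leaking information through comparison timing.
    return hmac.compare_digest(recomputed, digest)
```

The point is that the file only contains something that lets the server verify a password, so someone reading the configuration file does not directly learn what you type in the browser.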

3.2. IP address -> name (MAC)

Until now, every time you accessed the remote machine you had to do the following:

ssh user@IP

I don't know about you, but I am pretty bad at remembering IP addresses and I don't find it that useful. Thankfully, we can replace the IP with some name, for example "gpu", after some minor configuration. Then, every time we access the remote machine we run the command below rather than the one above.

ssh user@gpu

I am going to assume you are using a Mac as your local machine. You have to run:

$ sudo chmod 777 /etc/hosts
$ vi /etc/hosts

Then, assuming that you want to use the name "gpu" and your IP is 172.28.38.12, add the following line at the bottom (you can use any name you like, and you will need to change the IP to your own). There should be 3 spaces between the IP address and the name.

172.28.38.12   gpu

Great, now save the file and exit. Then run the following command to restore the file's default permissions.

$ sudo chmod 644 /etc/hosts

Done!
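
If you would rather not hand-type the entry, the line for /etc/hosts can be generated and checked with a few lines of Python. This is a minimal sketch: hosts_entry and lookup are hypothetical helpers of my own, and they only work on text, so they never touch /etc/hosts themselves.

```python
import ipaddress


def hosts_entry(ip: str, name: str) -> str:
    """Return a hosts-file line mapping name to ip, validating both."""
    ipaddress.ip_address(ip)  # raises ValueError for a malformed address
    if not name or any(ch.isspace() for ch in name):
        raise ValueError("host name must be non-empty and contain no whitespace")
    return f"{ip}   {name}"


def lookup(hosts_text: str, name: str):
    """Return the IP mapped to name in hosts-file text, or None if absent."""
    for line in hosts_text.splitlines():
        # Drop comments, then split the rest on whitespace.
        fields = line.split("#", 1)[0].split()
        if len(fields) >= 2 and name in fields[1:]:
            return fields[0]
    return None


print(hosts_entry("172.28.38.12", "gpu"))  # 172.28.38.12   gpu
```

Calling lookup(open("/etc/hosts").read(), "gpu") afterwards is a quick way to confirm that the entry was saved.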

3.3. SSH TUNNEL

Here we assume that you have successfully completed 3.1.

Up until now, you had to access your Jupyter Notebook by typing:

http://IP:PORT

where PORT is the port where the Jupyter Notebook is running (most often 8888). I find that to be less handy than "having" it in my localhost:8888. To get around that we can use what is called an ssh tunnel.

The idea of an ssh tunnel is that you are going to see in one of your local ports (e.g. localhost:8889) what is happening in a specific port of a remote machine that you can ssh to. In our case, this means you can type

localhost:8889

in your browser and see the notebook that is being executed in the remote machine.

How do we do that?

A requirement for doing that is that the port on the remote machine always has to be the same. If you are the only one using the machine you can leave port 8888 in your Jupyter Notebook configuration. Otherwise, we have to change the port. To change the port, ssh to your remote machine and edit the Jupyter Notebook configuration file.

$ ssh user@IP
$ vi .jupyter/jupyter_notebook_config.py

And replace 8888 in the line

c.NotebookApp.port = 8888

with some other number. One of the things to take into account when choosing that number is that you don't want that port to be used by anyone else or by any process other than your Jupyter Notebook. For this example assume that we choose port 1234. Then the above line should look as follows:

c.NotebookApp.port = 1234

Save the changes and exit the file. Then run:

$ source ~/.bashrc
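
Before settling on a port number such as 1234, you can check on the remote machine that nothing else is currently using it. Here is a minimal sketch, where port_is_free is a hypothetical helper that simply tries to bind to the port. It only tells you about this moment, so another user could still grab the port later.

```python
import socket


def port_is_free(port: int, host: str = "0.0.0.0") -> bool:
    """Return True if this process can bind to host:port right now."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        try:
            sock.bind((host, port))
        except OSError:
            # Already in use, or not permitted (e.g. ports below 1024).
            return False
    return True
```

Run it with the remote machine's Python, for example port_is_free(1234), and pick another number if it returns False.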

Now we are going to switch to the local machine (i.e. you should be in your local terminal). We want to edit your .bash_profile (Mac) or .bashrc (Linux). For those who are not familiar with them, those are scripts that are executed every time you open the terminal.

To automate the tunnel creation we will edit .bash_profile or .bashrc

$ vi ~/.bash_profile

and add the following lines

ssh -N -f -L localhost:LOCAL_PORT:localhost:1234 [user]@[IP/NAME(if configured)]
clear

Where LOCAL_PORT is the port we want to use when typing

localhost:LOCAL_PORT

in the browser and 1234 is the remote port we previously set up.

Note: When opening the terminal for the first time it may take some time but it gets faster after that.

Done!
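
To make the pieces of that ssh line easier to see, here is a small sketch that assembles it from its parts. tunnel_command is a hypothetical helper name, and the example values (8889, miguel, gpu) are just the ones used as examples in this post.

```python
import shlex


def tunnel_command(local_port: int, remote_port: int, user: str, host: str) -> list:
    """Build the ssh invocation that forwards a local port to a remote port."""
    # LOCAL_HOST:LOCAL_PORT:REMOTE_HOST:REMOTE_PORT, where the second
    # "localhost" is resolved on the remote machine.
    forward = f"localhost:{local_port}:localhost:{remote_port}"
    # -N: run no remote command, -f: go to the background, -L: local forwarding.
    return ["ssh", "-N", "-f", "-L", forward, f"{user}@{host}"]


print(shlex.join(tunnel_command(8889, 1234, "miguel", "gpu")))
# ssh -N -f -L localhost:8889:localhost:1234 miguel@gpu
```

With this tunnel open, typing localhost:8889 in your local browser reaches port 1234 on the remote machine.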

4. RUNNING & OPENING THE NOTEBOOK

After launching Jupyter notebook

$ jupyter notebook

the way in which you open it depends on how many of the previous steps you have completed.

If you haven't done 3.1 you have to copy the URL that appears in the output of the command above and replace localhost with the IP of the remote machine (or ec2-…. in case of using AWS).

Note: Make sure you use http (not https) when accessing that URL, since the configuration in this post does not set up an SSL certificate.

If you have done 3.1. you don't need to copy the token, and if you have done 3.3. you only have to type localhost:LOCAL_PORT and you are good to go.
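
If you are in the token case and get tired of editing the URL by hand, the replacement can be scripted. This is a minimal sketch: remote_url is a hypothetical helper, and the URL and TOKEN in the example are placeholders rather than real Jupyter output.

```python
from urllib.parse import urlsplit, urlunsplit


def remote_url(printed_url: str, remote_host: str) -> str:
    """Swap the host in the URL printed by Jupyter for the remote machine's address."""
    parts = urlsplit(printed_url)
    # Keep the port Jupyter printed, if any, and change only the host.
    netloc = remote_host if parts.port is None else f"{remote_host}:{parts.port}"
    return urlunsplit(parts._replace(netloc=netloc))


print(remote_url("http://localhost:8888/?token=TOKEN", "172.28.38.12"))
# http://172.28.38.12:8888/?token=TOKEN
```

The scheme, port and token are kept exactly as printed, so the result can be pasted straight into your local browser.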


Miguel Romero Calvo

MS in Data Science at USF. Expected graduation date June 2019. Currently open to full-time positions.