Work remotely with PyCharm, TensorFlow and SSH
Wouldn’t it be awesome to sit at a café with your laptop, creating large neural networks in TensorFlow, crunching data at speeds of several teraFLOPS, without even hearing your fan spin up? This is possible using a remote interpreter in PyCharm, and you get almost the same experience working remotely as working locally.
However, this is currently only possible in PyCharm Professional (the Community Edition will not do). If you are a student, your university should have an arrangement so you can download it for free; otherwise you’ll have to buy it. Here is how I set it up from scratch (you may want to skip some of the steps):
Remote data crunching machine
This is your stationary remote machine, perhaps fitted with one or several state-of-the-art GPUs from Nvidia! (I don’t like the current deep learning monopoly, but TensorFlow can only use Nvidia GPUs.) First, let’s install the latest Ubuntu; I recommend the desktop version, since you can always kill the GUI service later to free up graphics memory. Connect it to the Internet and check your LAN IP address by opening up a terminal and typing
ifconfig. I will assume it is
192.168.0.1 in the instructions later.
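If you prefer to grab the address programmatically instead of eyeballing the ifconfig output, a small sketch like this can parse it (the interface names and addresses below are made-up sample output; paste in your own):

```python
import re

# Hypothetical `ifconfig` output -- substitute the output from your own machine.
SAMPLE_OUTPUT = """
eth0      Link encap:Ethernet  HWaddr 00:11:22:33:44:55
          inet addr:192.168.0.1  Bcast:192.168.0.255  Mask:255.255.255.0
lo        Link encap:Local Loopback
          inet addr:127.0.0.1  Mask:255.0.0.0
"""

def lan_addresses(text):
    """Return the non-loopback IPv4 addresses found in ifconfig output.

    Handles both the old "inet addr:x.x.x.x" and the newer "inet x.x.x.x" format.
    """
    addrs = re.findall(r"inet (?:addr:)?(\d+\.\d+\.\d+\.\d+)", text)
    return [a for a in addrs if not a.startswith("127.")]

print(lan_addresses(SAMPLE_OUTPUT))  # ['192.168.0.1']
```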
In order to be able to communicate with your crunching machine, you need to install SSH on it. Open up a terminal on your stationary computer and get it:
sudo apt-get install ssh
Enable SSH X11 forwarding so that you can plot things. Open the configuration file like this:
sudo gedit /etc/ssh/sshd_config
Then locate the row that says
# X11Forwarding yes
Simply remove the hash sign to uncomment the line (if your file instead says X11Forwarding no, change it to yes as well), then save and close the file.
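If you prefer doing the edit from the command line, a one-line sed can uncomment the option. The sketch below rehearses it on a scratch copy; on the real machine you would run the sed line against /etc/ssh/sshd_config with sudo:

```shell
# Practice the edit on a scratch copy of sshd_config; on the real machine,
# run the sed line against /etc/ssh/sshd_config (with sudo).
CFG=$(mktemp)
printf '# X11Forwarding yes\nPort 22\n' > "$CFG"

# Uncomment the X11Forwarding line, tolerating both "#X11Forwarding" and "# X11Forwarding":
sed -i 's/^#[[:space:]]*X11Forwarding/X11Forwarding/' "$CFG"

grep '^X11Forwarding' "$CFG"   # X11Forwarding yes
```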
Next, install the graphics drivers. They are usually proprietary, so you need to add a new repository to your package manager. Which package you’ll need depends on your graphics card and Ubuntu version. As of writing, nvidia-367 is the latest one, see more on this page.
sudo add-apt-repository ppa:graphics-drivers/ppa
sudo apt-get update
sudo apt-get install nvidia-367
Cuda and cuDNN
Now it’s time to install the Cuda toolkit and cuDNN, which are required to run TensorFlow. They are available from Nvidia’s webpage, and to download cuDNN you are required to register. As of writing, Cuda 8.0 and cuDNN 5.1 are the latest versions. For Cuda I prefer using the built-in package manager, since it makes it easier to keep track of what you have installed:
sudo dpkg -i cuda-repo-ubuntu1604_8.0.44-1_amd64.deb
sudo apt-get update
sudo apt-get install cuda-toolkit-8.0
Make sure that the symlink is set up correctly:
readlink -f /usr/local/cuda
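The readlink command should print the versioned toolkit directory (for Cuda 8.0, presumably /usr/local/cuda-8.0). If it doesn’t, you can (re)create the link yourself; the sketch below rehearses the pattern on a scratch directory:

```shell
# The real check/fix would be:
#   readlink -f /usr/local/cuda                       # expect /usr/local/cuda-8.0
#   sudo ln -sfn /usr/local/cuda-8.0 /usr/local/cuda  # (re)create the link if needed
# Rehearsed here on a scratch directory:
ROOT=$(mktemp -d)
mkdir "$ROOT/cuda-8.0"
ln -sfn "$ROOT/cuda-8.0" "$ROOT/cuda"
readlink -f "$ROOT/cuda"   # prints the resolved .../cuda-8.0 path
```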
This is how to extract the cuDNN headers and libraries, copy them into the Cuda folder, and make them readable, all from the terminal (some of the filenames may be different for you):
tar xvzf cudnn-8.0-linux-x64-v5.1.tgz
sudo cp cuda/include/cudnn.h /usr/local/cuda/include
sudo cp cuda/lib64/libcudnn* /usr/local/cuda/lib64
sudo chmod a+r /usr/local/cuda/include/cudnn.h /usr/local/cuda/lib64/libcudnn*
Finally, add the environment variables you will need: append them to your
.bashrc file and then source it:
echo 'export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"' >> ~/.bashrc
echo 'export CUDA_HOME=/usr/local/cuda' >> ~/.bashrc
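To sanity-check the exports before touching your real ~/.bashrc, you can rehearse the same append-and-source on a scratch file (a sketch; on the real machine the target is ~/.bashrc as above):

```shell
# Rehearse the append-and-source on a scratch rc file;
# on the real machine the target is ~/.bashrc.
RC=$(mktemp)
cat >> "$RC" <<'EOF'
export CUDA_HOME=/usr/local/cuda
export LD_LIBRARY_PATH="$LD_LIBRARY_PATH:/usr/local/cuda/lib64:/usr/local/cuda/extras/CUPTI/lib64"
EOF
. "$RC"
echo "$CUDA_HOME"   # /usr/local/cuda
```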
Python and TensorFlow
Install some required Python libraries:
sudo apt-get install python-pip python-dev build-essential python-numpy python-scipy python-matplotlib
And then install GPU-enabled TensorFlow; check the version you need on this page (
TF_BINARY_URL is different for different systems):
pip install --ignore-installed --upgrade $TF_BINARY_URL
Verify that the installation is working by typing the following in your terminal:
python -c "import tensorflow"
You should get output similar to this if you have installed it on a GPU-enabled system:
>I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcublas.so locally
>I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcudnn.so locally
>I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcufft.so locally
>I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcuda.so.1 locally
>I tensorflow/stream_executor/dso_loader.cc:111] successfully opened CUDA library libcurand.so locally
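If the import instead fails with a missing-library error, a quick stdlib check like this (my own sketch, not part of TensorFlow) can tell you whether the CUDA libraries from the log above are resolvable on your loader path:

```python
from ctypes.util import find_library

def check_cuda_libs(names=("cublas", "cudnn", "cufft", "curand")):
    """Map each CUDA library TensorFlow dlopens to its resolved name (or None)."""
    return {name: find_library(name) for name in names}

for lib, found in sorted(check_cuda_libs().items()):
    # On a correctly configured machine each line shows a resolved library name.
    print("%-8s -> %s" % (lib, found or "not found on loader path"))
```

A None entry usually means LD_LIBRARY_PATH is missing the Cuda lib64 directory from the previous step.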
Did it work? Great! Let’s move on to your laptop.
Super sleek ultrabook
Open up your laptop and connect it to the same local network as your stationary machine.
Install Homebrew and Cask:
/usr/bin/ruby -e "$(curl -fsSL https://raw.githubusercontent.com/Homebrew/install/master/install)"
brew tap caskroom/cask
Get what you need, including the PyCharm IDE.
brew install cask ssh-copy-id python
brew cask install java pycharm xquartz
Generate an SSH key pair by executing the command below and then walking through the guide (if you haven’t done this already):
ssh-keygen -t rsa
Now copy the key to your remote machine so you can connect to it without typing a password every time. The first time you do this, you need to authenticate with the password of your remote machine:
ssh-copy-id [remote username here]@[remote IP here]
Enable compression and X11 forwarding (useful for plotting data) by appending this to the
~/.ssh/config file on your local machine.
echo 'ForwardX11 yes' >> ~/.ssh/config
echo 'Compression yes' >> ~/.ssh/config
Verify that everything is working by connecting to your remote machine from your laptop.
ssh [remote username here]@[remote IP here]
While still logged in, you should disable password login on your remote machine for security reasons. Open the configuration file with your favorite command-line editor.
sudo vim /etc/ssh/sshd_config
Then locate the PasswordAuthentication line, uncomment it by removing the hash sign, and make sure it is set to no:
PasswordAuthentication no
Restart your SSH server while still logged in on your remote (you have to authenticate yourself again):
sudo service ssh restart
The final thing you should do while still logged in with SSH on your remote is to find your display environment variable. This will be used later for plotting:
echo $DISPLAY
Remember the output of this command (typically something like localhost:10.0), we will use it later.
Remote interpreter in PyCharm
This is the fun part: setting up the remote interpreter so that your scripts execute on the remote machine. Let’s get started, start up PyCharm and create a new Python project.
Open “Preferences > Project > Project Interpreter”. Click on the “Dotted button” in the top-right corner and then “Add remote”.
Click on the “SSH Credentials” radio-button and input your information. Select “Key pair” on the “Auth type”, and select the “Private Key file”. It should be located in
/Users/&lt;your username&gt;/.ssh/id_rsa.
Click on “OK > Apply”. Notice the “R” for remote on the Project Interpreter.
The remote interpreter cannot execute a local file; PyCharm has to copy your source files (your project) to a destination folder on your remote server, but this is done automatically, so you don’t need to think about it! While still in the “Preferences” pane, open “Build, Execution, Deployment > Deployment > Options”. Make sure that “Create empty directories” is checked. This way PyCharm will automatically synchronize when you create folders:
Now go back to “Build, Execution, Deployment > Deployment” and click on the “Plus button”, select “SFTP” and give a name to your remote. Click on “OK”:
Set up the connection by first typing the IP of your remote in “SFTP host”, then selecting “Key pair” on the “Auth type”, and finally selecting the “Private Key file”. It should be located in
/Users/<your username>/.ssh/id_rsa, as shown in the screenshot below. You may then click on “Test SFTP connection”. Once you can connect successfully, you should set up mappings. If you’d like, you can click on “Autodetect” beside the “Root path”; it will then find your home directory on the remote. All paths you specify after this will be relative to this home path. Then go to the “Mappings” tab.
As soon as you save or create a file in your local path, it will be copied to the “Deployment path” on your remote server. Perhaps you want to deploy it in a
DeployedProjects/ folder as shown below. This will be relative to the “Root path” specified earlier, so in our case the absolute deployment path will be the DeployedProjects/ folder inside your remote home directory.
Now we are finished with the preferences. Click on “Apply > OK”, and then click “Tools > Deployment > Automatic Upload” and confirm that it is checked:
To do the initial upload, right-click on your project folder in the project explorer and click on “Upload to remote”:
You should get a “File transfer” tab on your bottom pane where you can see all the progress:
Then click on “Tools > Deployment > Browse Remote Host”. Drag and drop the window just beside the Project tab to the left. That way it will be really simple to switch between your local and remote project.
These deployment settings will work seamlessly whenever you save and run a file; it is done so quickly you won’t even notice it.
Set up the console
Open “Preferences > Build, Execution, Deployment > Console > Python console” and select the “Python interpreter” to be your remote one. Next, click on the “Dotted button” and input the environment variables that we added to
~/.bashrc when we set up the server. Notice that we also add the “DISPLAY” variable, with the value we found earlier when connecting to the server with SSH:
Then go back to “Build, Execution, Deployment > Console” and select “Always show the debug console”. It will be very handy when we’re debugging:
Create a run configuration
Create a simple test file called
test.py in your project, containing just this:
import tensorflow
print "Tensorflow Imported"
Now go to “Run > Edit Configurations…”, click on the “Plus button”, and create a new Python configuration. Name it and select the script to run:
Now enter the required environment variables as before. Tip: you can copy them all from the console settings we specified earlier by using Ctrl+A and then the copy/paste buttons in the lower left corner. You access them by clicking the “Dotted button” just to the right of the “Environment variables” line.
Click on “OK > OK”. It’s time for testing!
Testing the setup
Now we should be all done; it’s time to test our setup. First, open a terminal and make sure that you have at least one SSH channel with X-forwarding connected to your server. If you have had a connection open for a while, you may have to exit and restart it:
ssh [remote username here]@[remote IP here]
Then open the “Python Console” in the lower bar in PyCharm and type
import tensorflow. Then you may type
ls / to verify that you are actually executing the commands on your server! This is what the output should be:
Now go over to your
test.py script and select “Run > Run…” from the top toolbar. Select your newly created run configuration “Test”. It should output something like this:
Let’s do some plotting, change your
test.py file to this:
import matplotlib
matplotlib.use('GTKAgg')  # use an X11-capable backend
import matplotlib.pyplot as plt
import numpy as np
print "Tensorflow Imported"
plt.plot(np.random.rand(100)); plt.show()
And then run it again with your run configuration “Test”; you should get this plot.
The plot is actually rendered on your remote server, but the window data is forwarded to your local machine. Notice that we changed the backend with
matplotlib.use('GTKAgg'), because it is an X11-supported display backend. You can read more about Matplotlib backends here. You can also change the default behavior in your
matplotlibrc file. Remember that you need to have at least one open SSH connection in a separate terminal, with the correct value of the
DISPLAY environment variable, to get this to work. If it didn’t work, try restarting your SSH connection.
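For example, to make an X11-capable backend the default so you don’t need the matplotlib.use() call in every script, you can set it in that file (the location varies with your Matplotlib version, often ~/.matplotlib/matplotlibrc or ~/.config/matplotlib/matplotlibrc):

```
# matplotlibrc -- set the default backend once instead of per script
backend : GTKAgg
```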
Finally, do some debugging: click on the left bar to set a breakpoint, then go to “Run > Debug…” and select the “Test” configuration. You will see that execution has halted and you are debugging your script remotely.
In order to access your machine over the Internet you have to forward ports on your home router; how to do this differs between vendors. I recommend forwarding a port other than 22 on your router. There are plenty of bots out there trying to hack in, and they will check that port by default and might slow your connection (although you are fairly secure, since you have turned off password authentication). So you could, for example, forward port 4343 on your router to port
22 on IP
192.168.0.1 (the IP of our remote in this tutorial). Also, to speed up the plotting, you may change to a faster encryption cipher.
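All of these options can be collected in a Host block in your local ~/.ssh/config. The alias, host name, and cipher below are illustrative, not prescribed anywhere in this setup; check which ciphers your OpenSSH version actually supports with ssh -Q cipher:

```
# ~/.ssh/config -- hypothetical "crunch" alias for the remote machine
Host crunch
    HostName <your public IP or dynamic DNS name>
    Port 4343
    User <remote username>
    ForwardX11 yes
    Compression yes
    Ciphers aes128-ctr
```

After this, ssh crunch connects with all the options applied.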
Next, let’s do some more TensorFlow, perhaps experimenting with matrix multiplication on the CPU and GPU? (coming soon)