Developing in the Cloud from your local IDE

A walkthrough to set up a live development environment running on a Virtual Machine from within your local IDE

Marnix Koops
Oct 28 · 5 min read

Intro

Anyone interested in modelling large amounts of data or doing some kind of Deep Learning work will quickly reach the limits of their local machine. Once you have written some code, executing it on a Virtual Machine (VM) for one-time results or scheduled production purposes is relatively easy. However, remote development and live experimenting with code changes within a VM requires a couple of steps. A straightforward solution is to work in your browser through a Jupyter or Google Colab Notebook connected to the remote machine. Notebooks are great for a quick one-time analysis but do not offer much in terms of software engineering best practices. More specifically, robust and reproducible Machine Learning products or experiments require a better foundation for writing production-grade code than a notebook.

This leads to the simple wish to work in an Integrated Development Environment (IDE) of choice while using the computational power and resources of a Virtual Machine. I quickly discovered that RStudio and PyCharm Professional both have options for remote development and figured: how hard could it be? Yet, I was not able to find a clear end-to-end write-up of the required bits and pieces. I decided to document these steps for my future self and hope someone else will find them useful.

In this example, I will be developing in Python using my favorite hackable editor Atom, powered by Hydrogen, with the code running on a machine on Google Cloud Platform (GCP). There should be many similarities in the steps below for alternative setups. If your weapon of choice is GCP with RStudio or PyCharm (or another JetBrains product), your life is a bit easier.

Set up the VM

To communicate with a Virtual Machine we need some form of authentication. This can be done with any SSH tool. On a Mac we can create an SSH keypair with the included ssh-keygen through the terminal:

$ cd ~/.ssh/
$ ssh-keygen -m PEM -t rsa -C "GCP_username"

Make sure to place it somewhere logical (like ~/.ssh), give it a filename and think of a passphrase. You can view the generated public SSH key with:

$ cat ~/.ssh/filename.pub
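
The -C flag above stores the GCP username in the comment field of the public key, and as far as I understand GCP uses that comment as the login name. A minimal sketch, assuming the usual OpenSSH layout of "type, base64 key, comment" on one line, to double check what ended up in the file. The function names are my own and not part of any tool:

```python
from pathlib import Path


def parse_public_key(line: str) -> dict:
    """Split one OpenSSH public key line into type, key and comment."""
    parts = line.strip().split(None, 2)
    if len(parts) < 2:
        raise ValueError("expected '<type> <base64-key> [comment]'")
    comment = parts[2] if len(parts) == 3 else ""
    return {"type": parts[0], "key": parts[1], "comment": comment}


def read_public_key(path: str) -> dict:
    """Read and parse a .pub file; '~' is expanded to the home directory."""
    return parse_public_key(Path(path).expanduser().read_text())
```

Calling read_public_key("~/.ssh/filename.pub")["comment"] should then give back the username you passed to -C.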

Spin up the VM with your bash script or terminal commands and ensure Jupyter is installed on the machine:

$ pip install jupyter

Next, make sure the VM accepts authentication. On your local machine open the browser and go to the GCP console. Navigate to your VM: Compute Engine → VM Instances → Metadata → SSH Keys. Click edit, paste your public SSH key in the field and save it.

To access the VM with an external SSH connection we need to assign it an external IP address. Click the name of your VM → Edit → Network Interfaces → External IP. Don’t forget to click save all the way at the bottom of the page.

We can check if the connection and authentication are okay so far by SSHing into the Virtual Machine from a local terminal:

$ ssh -i ~/.ssh/filename gcp_username@external_ip_address

Create Jupyter Kernel & SSH tunnel

While connected to the VM in a terminal, launch a Jupyter server, which hosts the kernel, on a port of your choice; I use 8888. Also, no monitor means we don’t need to launch a browser in the VM ;)

$ jupyter-notebook --no-browser --port=8888

To communicate with the Jupyter kernel running on the VM we use port forwarding from our local machine to the VM. This can be achieved by creating an SSH tunnel. In a new local terminal window run:

$ ssh -i ~/.ssh/filename -N -L localhost:8888:localhost:8888 gcp_username@external_ip_address

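In my case no messages show up in the terminal, which is perhaps a bit confusing. This is expected: the -N flag tells ssh not to run a remote command, so the tunnel just stays open silently. Keep the terminal window open.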
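
Since the tunnel gives no feedback, a quick local check can help. The sketch below only tests whether something accepts TCP connections on the forwarded port; the function name and defaults are my own choice, with 8888 matching the port used above:

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8888, timeout: float = 2.0) -> bool:
    """Return True if something accepts TCP connections on host:port."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and unreachable hosts
        return False
```

Note that True only means the local end of the tunnel is listening. It does not prove Jupyter is actually running on the VM side.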

Connect to Kernel from IDE

Open Atom and install the Hydrogen package. Go to Preferences → Packages → Hydrogen → Settings and add the following to the Kernel Gateways field:

[{
  "name": "Jupyter Remote Kernel",
  "options": {"baseUrl": "http://localhost:8888"}
}]

In your code file, open the Atom command palette (cmd+shift+p) and run Hydrogen: Connect to Remote Kernel. Your Jupyter Remote Kernel should show up. You can connect with the token displayed in the VM terminal when you launched the Jupyter kernel. If you are working in multi-file projects you can simply connect these files to the kernel with Hydrogen: Connect to Existing Kernel. You can launch all kinds of Jupyter kernels to work in your favorite language, such as R or JavaScript.
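
The baseUrl in the Kernel Gateways field has to point at the local end of the SSH tunnel. If you forward a local port other than 8888, a small helper like this (hypothetical, not part of Hydrogen) can generate the matching JSON to paste into the settings:

```python
import json


def hydrogen_gateway(local_port: int = 8888, name: str = "Jupyter Remote Kernel") -> str:
    """Build the JSON text for Hydrogen's Kernel Gateways field."""
    gateways = [
        {"name": name, "options": {"baseUrl": f"http://localhost:{local_port}"}}
    ]
    return json.dumps(gateways, indent=2)


print(hydrogen_gateway(8888))
```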

In order to sync, edit or upload files between the local machine and the VM we need some rights. In a terminal connected to the VM go to your code directory in your user account and change the owner from root to the logged in user:

$ sudo chown -R gcp_username:gcp_username path/to/code/folder

You can check ownership and rights of all folders and files in the current directory with:

$ ls -l
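
For nested projects, scanning the ls output by eye gets tedious. A rough sketch you could run on the VM as the logged in user to list anything still owned by someone else (for example root). The function name is my own:

```python
import os
from pathlib import Path


def files_not_owned_by_me(folder: str) -> list:
    """Return paths under folder (folder included) not owned by the current user."""
    my_uid = os.getuid()
    root = Path(folder)
    # lstat() inspects symlinks themselves instead of following them
    return [str(p) for p in [root, *root.rglob("*")] if p.lstat().st_uid != my_uid]
```

An empty list means the chown above reached every file and folder.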

Set up File Sync

To develop locally and run code on the VM we need our code changes to be reflected in real time. This can be done by setting up an SSH file sync between our local machine and the VM. I use the Atom package Remote FTP, which works great, but you can use any other SSH package or software of your choice.

Configure the package for syncing. I use the following settings in my .ftpconfig file:

{
  "protocol": "sftp",
  "host": "external_ip_address",
  "port": 22,
  "user": "gcp_username",
  "pass": "",
  "promptForPass": false,
  "remote": "path/to/remote/code/folder/on/vm",
  "local": "path/to/local/code/folder",
  "agent": "",
  "privatekey": "~/.ssh/filename",
  "passphrase": "your_passphrase",
  "hosthash": "",
  "ignorehost": true,
  "connTimeout": 10000,
  "keepalive": 10000,
  "keyboardInteractive": false,
  "keyboardInteractiveForPass": false,
  "remoteCommand": "",
  "remoteShell": "",
  "watch": [],
  "watchTimeout": 500
}

You can enable file-syncing on save which will reflect all the changes in your code from local to the VM in real-time for live development.
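
If the sync refuses to connect, the cause is often a small mistake in .ftpconfig. Below is a rough sanity check, assuming the file is plain JSON with the field names from my config above. The checks themselves are my own choice and not something the Remote FTP package prescribes:

```python
import json
from pathlib import Path


def check_ftpconfig(config: dict) -> list:
    """Return a list of readable problems; an empty list means none were found."""
    problems = []
    if config.get("protocol") != "sftp":
        problems.append("protocol should be 'sftp' for SSH-based sync")
    for field in ("host", "user", "remote"):
        if not config.get(field):
            problems.append(f"'{field}' is empty")
    key = config.get("privatekey", "")
    if not key:
        problems.append("'privatekey' is empty")
    elif not Path(key).expanduser().is_file():
        problems.append(f"private key not found: {key}")
    return problems


def check_ftpconfig_file(path: str = ".ftpconfig") -> list:
    """Load a .ftpconfig file from disk and run the checks on it."""
    return check_ftpconfig(json.loads(Path(path).read_text()))
```

Running check_ftpconfig_file() from the project root prints nothing by itself; it returns the list of findings so you can inspect or print them.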

That’s it, you should be good to go!

Miscellaneous

You can make life easier for yourself and add some aliases to your ~/.bashrc file to speed up the terminal commands if you want to do this more often. For example:

alias gcp_ssh='ssh -i ~/.ssh/filename gcp_username@external_ip_address'
alias port_forward='ssh -i ~/.ssh/filename -N -L localhost:8888:localhost:8888 gcp_username@external_ip_address'
alias give_me_the_power='sudo chown -R gcp_username:gcp_username path/to/code/folder'

In case of authentication errors related to known hosts after re-launching your VM, you can regenerate your known_hosts file by removing it and SSHing into the machine again:

$ cd ~/.ssh/
$ rm -f known_hosts
$ ssh -i ~/.ssh/filename gcp_username@external_ip_address
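
Removing known_hosts also forgets every other server you have connected to. A gentler alternative is ssh-keygen -R, which removes only the entries for one host. A small sketch wrapping that call from Python; the helper names are hypothetical:

```python
import subprocess


def forget_host_command(host: str) -> list:
    """Build the ssh-keygen call that removes all keys for host from known_hosts."""
    return ["ssh-keygen", "-R", host]


def forget_host(host: str) -> None:
    """Run the command; raises CalledProcessError if ssh-keygen fails."""
    subprocess.run(forget_host_command(host), check=True)
```

After forget_host("external_ip_address"), the next SSH login will ask you to confirm the new host key.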

If you want to run your Python project as a package or for importing you can install the local code through pip on the VM:

$ cd path/to/code/folder
$ python3 -m pip install -e .
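
Keep in mind the editable install only works if the code folder contains a setup.py or pyproject.toml. To confirm afterwards that the package can be found, something like this can be run in the remote kernel. The package name you pass in is whatever your project is called; "my_project" below is just a placeholder:

```python
import importlib.util


def is_importable(package_name: str) -> bool:
    """Return True if Python can locate a top-level package or module."""
    return importlib.util.find_spec(package_name) is not None
```

For example, is_importable("my_project") should return True once the install has succeeded.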

You can browse the folder structure on your VM through the Jupyter kernel on your local machine by going to localhost:8888 in your browser and logging in with your token or password. You can download folders and files from the VM back to your local machine with gcloud commands or any SSH client.

You can also download to local from within the Jupyter browser. If you want to download complete folders instead of just files, you can launch a terminal window (New → Terminal) and zip the folder. Note that zip expects the name of the archive first and the folder second:

$ zip -r name_of_zipfile.zip path/to/folder

This creates a zip of your folder which you can select and download through the Notebook UI like any other file.
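
If the zip command is not installed on the VM, the same can be done from Python with the standard library. A minimal sketch using shutil.make_archive; the function name and arguments are my own:

```python
import shutil
from pathlib import Path


def zip_folder(folder: str, zip_name: str) -> str:
    """Zip folder into <zip_name>.zip and return the path of the archive."""
    src = Path(folder).resolve()
    # root_dir/base_dir keep the folder name as the top-level entry in the zip
    return shutil.make_archive(
        zip_name, "zip", root_dir=str(src.parent), base_dir=src.name
    )
```

Calling zip_folder("path/to/folder", "name_of_zipfile") from the remote kernel writes name_of_zipfile.zip to the current working directory.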

If you run into any issues feel free to drop me a message!

Analytics Vidhya

Analytics Vidhya is a community of Analytics and Data Science professionals. We are building the next-gen data science ecosystem https://www.analyticsvidhya.com
