Jupyter Notebook on GCP for Pythoners

It has been a long time since I last wrote. I hope this is not another confusing post. Okay, let's start.

I want to talk about Google Cloud Platform (GCP), which has been expanding its presence in Indonesia since last year (click here if you don’t know what it is). I tried out the free trial of this product, which is used by many big companies (such as Spotify). GCP offers many products, ranging from computing and storage to machine learning.

Jupyter on GCP

In this article, I will focus on the cloud VM service, Google Compute Engine (GCE): how to create a VM, how to configure it, and how to prepare it so that we can install Anaconda and run Jupyter Notebook on it.

The instructions start directly on the platform, because I assume you already have a GCP account and are ready to use it. You can go to this link to prepare the GCP free trial. (PS: activation requires a credit card.)

SPOILER ALERT: the installation is no different from installing on Linux or any other OS; the hardest part is setting up GCE.

Step 1: Create the Virtual Machine

The instructions start with building the VM on Compute Engine in GCP. Go to this link, then click Compute Engine -> VM instances. We will call this page the “VM Instances Page” (because the term “instances” is also used in other GCP products).

first step

After that you will see this UI; click the “Create Instance” button. This is just one way to create an instance on GCE; you can also create one with the command line or the API. However, I use the GUI here because it is easier to share.

create the instance

Step 2: Configure the Virtual Machine

GCE provides various types of VM, with different CPU types, operating systems, and memory sizes (PS: if you use Windows as your OS, you have to pay for the license every month). As you can see in the picture, I’m using 2 vCPUs with 7.5 GB of RAM, and the OS is Ubuntu 17.04 with a 20 GB standard disk (you can also choose SSD). Note that the choice of zone is crucial because each zone has different capabilities; you can click here for the details.

GCE Configuration

Because GCE is flexible, we can change the VM configuration any time we want, so we shouldn’t worry about getting the configuration wrong at the beginning. The exceptions are the connection settings (such as HTTP and HTTPS) and the access to the GCP APIs, which are worth getting right from the start.

Easy Configuration

Because we want to use Jupyter later, we allow all network traffic types so that we can reach the Jupyter Notebook, and we make the IP static. (Click on “Management, disks, networking, SSH keys” -> network tab.)

Static IP

After all the configuration, just click the “Create” button and we are ready to go. It is that easy to create. (PS: the green check mark on our instance means the instance is active. An active instance is charged by Google whether we use it or not, so stop the instance once we have finished our work.)

Ready to start
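For those who prefer the command line mentioned earlier, here is a minimal sketch of the same setup with the gcloud CLI. The instance name and zone are hypothetical placeholders, and the image flags are left out (you can pick an image from `gcloud compute images list`). The sketch only prints the commands unless you set DRY_RUN=0, so you can review them first.

```shell
# Hypothetical placeholders: replace with your own instance name and zone.
INSTANCE="jupyter-vm"
ZONE="us-central1-a"

# Print each command by default; set DRY_RUN=0 to really run it (needs the gcloud CLI).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# n1-standard-2 is the 2 vCPU / 7.5 GB machine type used in this article.
# The tags correspond to the "Allow HTTP/HTTPS traffic" checkboxes in the GUI.
run gcloud compute instances create "$INSTANCE" \
  --zone="$ZONE" \
  --machine-type=n1-standard-2 \
  --boot-disk-size=20GB \
  --boot-disk-type=pd-standard \
  --tags=http-server,https-server

# Stop the VM when the work is done so CPU and memory are no longer billed
# (disks and reserved static IPs are still charged).
run gcloud compute instances stop "$INSTANCE" --zone="$ZONE"
```

Running the script as-is just echoes the two gcloud commands with a leading "+".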

Step 3: Check the connection

Before we use our VM, it’s a good idea to check the connection. First of all, check the IP addresses and make sure the VM has a static IP.

IP addresses

Go to Networking -> External IP addresses -> find the IP attached to your VM -> look at its type -> make it static if it’s not.

Static IP

Secondly, we have to check the firewall rules to make our Jupyter easier to connect to.

Firewall Rules

You can go to Networking -> Firewall rules -> Edit default-allow-http

Edit Firewall Rules

Click on default-allow-http and edit it like this.

Firewall Rule Details
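The two checks in this step can also be sketched with the gcloud CLI. This assumes the goal is to keep the external IP fixed and to let TCP port 8888 (the Jupyter port used later) through the firewall. Instead of editing default-allow-http, it creates a separate rule; the names jupyter-ip and allow-jupyter and the region are hypothetical placeholders. As a precaution, the commands are only printed unless DRY_RUN=0 is set.

```shell
# Hypothetical placeholders: replace with your own region and external IP.
REGION="us-central1"
EXTERNAL_IP="146.148.111.153"   # example address from this article
JUPYTER_PORT=8888

# Print each command by default; set DRY_RUN=0 to really run it (needs the gcloud CLI).
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Promote the VM's current ephemeral external IP to a static one.
run gcloud compute addresses create jupyter-ip \
  --addresses="$EXTERNAL_IP" \
  --region="$REGION"

# Allow inbound TCP traffic on the Jupyter port for VMs tagged http-server.
# 0.0.0.0/0 means "from anywhere"; narrow it to your own IP range if you can.
run gcloud compute firewall-rules create allow-jupyter \
  --network=default \
  --allow="tcp:$JUPYTER_PORT" \
  --target-tags=http-server \
  --source-ranges=0.0.0.0/0
```

A dedicated rule keeps the default HTTP rule untouched and is easy to delete later.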

Step 4: Install Anaconda into VM

This step basically uses the terminal of the Linux system installed on our GCE instance. You can find this basic Anaconda installation on Linux in many resources, and it’s not that hard. Still, I will try to make it easy to understand for newbies (because I am a newbie as well).

Terminal Pops-up

Okay, we start by going back to our “VM Instances Page” and clicking the “SSH” button to the right of our VM. Then a terminal pops up.

After Loading

First we create a new folder called “installer”, then change into that directory.

mkdir installer
cd installer

After that, we download the Anaconda package from the web. I chose Anaconda 4.3.0 with Python 2.7, 64-bit.

wget http://repo.continuum.io/archive/Anaconda2-4.3.0-Linux-x86_64.sh

Download Anaconda “Installer”

The next step is to run the installer, “Anaconda2-4.3.0-Linux-x86_64.sh”, with bash. You can check that it is there by typing “ls” in the terminal, which lists all the files in the current directory.

ls
bash Anaconda2-4.3.0-Linux-x86_64.sh

Follow the instructions: read the license, type “yes”, then wait for the installation to finish.

Install The Anaconda

At the end of the installation there is another question (whether to add the Anaconda install location to PATH in your ~/.bashrc), which you must answer with “yes”. If you answer “no”, you can’t use Anaconda’s “conda”, “python”, or “pip” commands without typing their full path. To complete the configuration, type this command,

source ~/.bashrc

Check your installation by typing this command,

conda list

The command lists all the packages installed in our Anaconda.

Anaconda Package List
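If “conda list” reports that the command is not found, the PATH question was probably answered with “no”. A minimal sketch of a manual fix follows, assuming Anaconda was installed to the installer's default location, ~/anaconda2 (adjust ANACONDA_BIN if you chose another folder). It adds that folder to PATH for the current session and reports where each tool is found.

```shell
# Assumption: the installer's default location was accepted.
ANACONDA_BIN="$HOME/anaconda2/bin"

# Prepend the folder to PATH unless it is already there.
case ":$PATH:" in
  *":$ANACONDA_BIN:"*) ;;
  *) PATH="$ANACONDA_BIN:$PATH"; export PATH ;;
esac

# Report which conda, python, and pip the shell will use.
for tool in conda python pip; do
  if command -v "$tool" >/dev/null 2>&1; then
    echo "$tool: $(command -v "$tool")"
  else
    echo "$tool: not found on PATH"
  fi
done
```

To make the change permanent, the same PATH line can be added to ~/.bashrc by hand.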

If Anaconda has been installed successfully on the VM, we can play with the Jupyter Notebook.

Step 5: Start our Notebook

Before we start our notebook, I would like to create a new folder on our VM called “my_files”, in the same directory as “installer”, to keep our VM well organized. After that we start Jupyter in the “my_files” folder; this step is optional because we can change the directory through Jupyter.

cd ~
mkdir my_files
cd my_files
jupyter notebook --ip=0.0.0.0 --port=8888 --no-browser

You will get an address like this in the terminal. Select the whole address (including the token); the SSH window copies it automatically. Paste it into your web browser and replace “0.0.0.0” with the VM’s static external IP shown on the “VM Instances Page”.

Jupyter Address

For example, with my static external IP, the Jupyter address would be

http://146.148.111.153:8888/?token=(token code)

External IP
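If you prefer not to edit the address by hand, the replacement can be sketched in the shell. The token below is a made-up placeholder, and the external IP is the example address from this article; put in your own values.

```shell
# The URL as printed by Jupyter on the VM (placeholder token).
printed_url="http://0.0.0.0:8888/?token=abc123"
# The VM's static external IP from the "VM Instances Page" (example value).
external_ip="146.148.111.153"

# Swap the bind address 0.0.0.0 for the external IP; the rest stays the same.
public_url=$(printf '%s\n' "$printed_url" | sed "s/0\.0\.0\.0/$external_ip/")
echo "$public_url"
```

With these example values the script prints http://146.148.111.153:8888/?token=abc123.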

Copy and paste the address into your browser; it should look like this.

The Jupyter

Jupyter Notebook is ready to be used.

In conclusion, GCP provides an easy way to create any kind of VM with high flexibility. You can create a high-performance machine in minutes to run your notebooks or Python scripts on GCP without buying expensive equipment. You can even create a GPU VM for parallel jobs. Keep practicing and see the documentation here or other resources; then you can combine your GCE VM with other GCP products.

(PS: All of these steps can be shortened if you use Google Datalab. You can check the documentation here.)