Google Colab — The Beginner’s Guide

Whether you are a student exploring Machine Learning who struggles to run simulations on large datasets, or an expert who needs extra computational power, Google Colab is worth a look. Google Colab, or “Colaboratory”, is a free cloud service hosted by Google to encourage Machine Learning and Artificial Intelligence research, where the need for substantial computational power is often a barrier to learning and success.

Benefits of Colab

Besides being easy to use (which I’ll describe later), Colab is fairly flexible in its configuration and does much of the heavy lifting for you.

  • Python 2.7 and Python 3.6 support
  • Free GPU acceleration
  • Pre-installed libraries: Major Python libraries such as TensorFlow, scikit-learn, and Matplotlib, among many others, are pre-installed and ready to be imported.
  • Built on top of Jupyter Notebook
  • Collaboration feature (work with a team, just like in Google Docs): Google Colab allows developers to use and share Jupyter notebooks with each other without having to download, install, or run anything other than a browser.
  • Supports bash commands
  • Google Colab notebooks are stored in Google Drive

If you prefer to read more before getting started, I recommend the Google Colab FAQ, Google Colab Documentation and Code Snippets, and advice from the helpful community of users on Stack Overflow.

Let’s Begin!

Create a Colab Notebook

  1. Open Google Colab.
  2. Click on ‘New Notebook’ and select Python 2 notebook or Python 3 notebook.

OR

  1. Open Google Drive.
  2. Create a new folder for the project.
  3. Click on ‘New’ > ‘More’ > ‘Colaboratory’.

Setting GPU Accelerator

Google Colab runs on a CPU by default, but a GPU can be selected as the hardware accelerator.

  1. Click on ‘Edit’ > ‘Notebook Settings’ > ‘Hardware Accelerator’ > ‘GPU’.

OR

  1. Click on ‘Runtime’ > ‘Change runtime type’ > ‘Hardware Accelerator’ > ‘GPU’.

Running a Cell

  1. Make sure the runtime is connected. The notebook shows a green check and ‘Connected’ on the top right corner.
  2. Open the ‘Runtime’ menu and choose one of the run options, such as ‘Run all’.

OR

  1. To run the current cell, press SHIFT + ENTER.

Bash Commands

Bash commands can be run by prefixing the command with ‘!’.

  • Cloning a git repository
!git clone [git clone url]
  • Directory commands such as !ls and !mkdir
!ls

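This command lists the contents of the current folder, which is /content by default and includes the drive folder (if it has been mounted). Run the following snippet to add a folder to Python's module search path, so that scripts stored in it can be imported.

import sys
sys.path.append('[Folder name]')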
  • Download from the web
!wget [url] -P drive/[Folder Name]
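
Adding a folder to sys.path and changing the working folder are two different things: the first controls where import looks for modules, the second controls how relative file paths are resolved. A `!cd` command runs in its own subshell, so it does not carry over to later cells. The following is a minimal sketch of doing both from Python; the helper name enter_project_folder and the temporary demo folder are hypothetical stand-ins for a real project folder.

```python
import os
import sys
import tempfile


def enter_project_folder(folder):
    """Switch the working folder to `folder` and make it importable.

    Returns the previous working folder so the caller can switch back.
    """
    previous = os.getcwd()
    os.chdir(folder)             # relative file paths now resolve inside `folder`
    if folder not in sys.path:
        sys.path.append(folder)  # `import` now also searches `folder`
    return previous


# Hypothetical stand-in for a real project folder.
demo_folder = tempfile.mkdtemp()
previous_cwd = enter_project_folder(demo_folder)
print("Working folder:", os.getcwd())

# Switch back so later cells are unaffected.
os.chdir(previous_cwd)
```

In a notebook cell, the `%cd [Folder name]` magic has the same effect as os.chdir.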

Installing Libraries

Although most of the commonly used Python libraries are pre-installed, new libraries can be installed using the commands below:

!pip install [package name]

OR

!apt-get install [package name]
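
Before installing anything, it can be useful to check whether a library is already available in the runtime. One possible sketch uses importlib from the standard library. The helper name is_installed is hypothetical, and it expects the top-level import name (for example sklearn), which may differ from the pip package name (scikit-learn).

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be imported."""
    return importlib.util.find_spec(module_name) is not None


for name in ["json", "tensorflow"]:
    status = "available" if is_installed(name) else "missing, install it with !pip install"
    print(name, "->", status)
```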

Upload local files

from google.colab import files
uploaded = files.upload()

Select the files for upload

For multiple files, the individual key names can be obtained by looping through the uploaded files.

for file in uploaded.keys():
    print('Uploaded file "{name}" with length {length} bytes'.format(name=file, length=len(uploaded[file])))
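
Each value in uploaded holds the raw bytes of one file, keyed by its file name, so the contents can be processed directly in memory. As a rough sketch, assuming the uploaded file is a UTF-8 encoded CSV, the bytes can be decoded and parsed with the standard csv module. The sample_uploaded dictionary is a hypothetical stand-in for the result of files.upload(), and parse_uploaded_csv is an illustrative helper name.

```python
import csv
import io


def parse_uploaded_csv(uploaded, name):
    """Decode one uploaded file (bytes) and return its rows as lists of strings."""
    text = uploaded[name].decode("utf-8")  # assumption: the file is UTF-8 text
    return list(csv.reader(io.StringIO(text)))


# Hypothetical stand-in for the dictionary returned by files.upload().
sample_uploaded = {"example.csv": b"x,y\n1,2\n3,4\n"}

rows = parse_uploaded_csv(sample_uploaded, "example.csv")
header, records = rows[0], rows[1:]
print("Columns:", header)
print("Number of records:", len(records))
```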

Mounting Google Drive

Run the following code.

!apt-get install -y -qq software-properties-common python-software-properties module-init-tools
!add-apt-repository -y ppa:alessandro-strada/ppa 2>&1 > /dev/null
!apt-get update -qq 2>&1 > /dev/null
!apt-get -y install -qq google-drive-ocamlfuse fuse
from google.colab import auth
auth.authenticate_user()
from oauth2client.client import GoogleCredentials
creds = GoogleCredentials.get_application_default()
import getpass
!google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret} < /dev/null 2>&1 | grep URL
vcode = getpass.getpass()
!echo {vcode} | google-drive-ocamlfuse -headless -id={creds.client_id} -secret={creds.client_secret}

Click on the link, authorize access, and enter the verification code when prompted.

!mkdir -p drive
!google-drive-ocamlfuse drive

Your drive is now mounted. Any file or folder in your drive can be reached through the mounted path, as follows:

!ls /content/drive/[folder name]

/content is the default working folder of Google Colab, so absolute paths used in the notebook have to start with it.
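
Colab runtimes also include a built-in helper, google.colab.drive, which mounts Drive without installing any extra packages. The sketch below wraps it in a hypothetical function, mount_drive, that simply reports False when the code is not running inside Colab. It assumes the default mount point /content/drive.

```python
import os


def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive with Colab's built-in helper.

    Returns True after mounting, or False when not running inside Colab.
    """
    try:
        from google.colab import drive  # only importable inside a Colab runtime
    except ImportError:
        return False
    drive.mount(mount_point)  # asks for authorization on first use
    return True


if mount_drive():
    # With this helper, Drive contents appear under the "MyDrive" subfolder.
    print(os.listdir("/content/drive/MyDrive"))
else:
    print("Not running inside Colab; nothing was mounted.")
```

Note that with this helper a Drive folder lives at /content/drive/MyDrive/[folder name], so paths need the extra MyDrive component.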

Importing from existing .py scripts

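Upload any existing .py scripts to a folder in Google Drive. As an example, consider a script ‘script.py’ uploaded to the folder ‘Project’.

To import the script as a module, add its folder to the module search path first:

import sys
sys.path.append('/content/drive/Project')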
import script
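
To see the mechanism in isolation, here is a small self-contained sketch that needs no Drive at all. It writes a throwaway module into a temporary folder, adds that folder to sys.path, and imports it. The module name demo_script and its greet function are hypothetical and exist only for this illustration.

```python
import importlib
import os
import sys
import tempfile

# Create a throwaway folder containing a tiny module (stand-in for 'Project').
project_dir = tempfile.mkdtemp()
with open(os.path.join(project_dir, "demo_script.py"), "w") as handle:
    handle.write("def greet(name):\n    return 'Hello, ' + name\n")

# Make the folder importable, then import the module by its file name.
sys.path.append(project_dir)
importlib.invalidate_caches()  # ensure the freshly written file is picked up
demo_script = importlib.import_module("demo_script")

print(demo_script.greet("Colab"))  # prints "Hello, Colab"
```

If you edit the script after importing it, importlib.reload(demo_script) picks up the changes without restarting the runtime.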

Run an existing .py script

Pass the full path of the script to the Python interpreter:

!python3 /content/drive/Project/script.py
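
Running a script with !python3 starts a separate process, so its variables are not visible in the notebook afterwards. If you want to inspect what the script computed, one option is runpy.run_path from the standard library, which executes the file in the current process and returns its global variables. In this sketch, the file demo_job.py and its variable result are hypothetical.

```python
import os
import runpy
import tempfile

# Write a tiny stand-in script to a temporary folder.
script_dir = tempfile.mkdtemp()
script_path = os.path.join(script_dir, "demo_job.py")
with open(script_path, "w") as handle:
    handle.write("result = sum(range(5))\n")

# Execute the script in-process and collect its global variables.
script_globals = runpy.run_path(script_path)
print(script_globals["result"])  # prints 10
```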

Check CPU and RAM specifications

!cat /proc/cpuinfo
!cat /proc/meminfo
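
The output of these commands is long. If you only need the headline numbers, a short Python sketch can extract them. The helper parse_meminfo is a hypothetical name, and the sample text uses made-up values purely to show the format; the live reading only runs where /proc/meminfo exists, as it does on Linux runtimes.

```python
import os


def parse_meminfo(text):
    """Convert /proc/meminfo text into {field: number} (kB for most fields)."""
    info = {}
    for line in text.splitlines():
        key, separator, value = line.partition(":")
        fields = value.split()
        if separator and fields and fields[0].isdigit():
            info[key.strip()] = int(fields[0])
    return info


# Illustrative sample; the numbers are made up.
sample_text = "MemTotal:       13335276 kB\nMemFree:        10242040 kB\n"
sample_info = parse_meminfo(sample_text)
print("Sample fields:", sorted(sample_info))

print("CPU cores:", os.cpu_count())
if os.path.exists("/proc/meminfo"):
    with open("/proc/meminfo") as handle:
        live_info = parse_meminfo(handle.read())
    print("Total RAM (GB):", round(live_info["MemTotal"] / 1024 ** 2, 1))
```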

Check GPU specifications

from tensorflow.python.client import device_lib
device_lib.list_local_devices()

Colab provided the Tesla K80 GPU at the time of writing; the GPU model assigned to a session can vary.
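
A quick way to confirm which GPU, if any, is attached to the current session, without depending on TensorFlow, is to query the nvidia-smi tool. The sketch below is one possible approach. The helper name gpu_summary is hypothetical, and it returns None when nvidia-smi is not available, which is typically the case on a CPU-only runtime.

```python
import shutil
import subprocess


def gpu_summary():
    """Return a 'name, total memory' line per GPU, or None if no GPU tool is found."""
    if shutil.which("nvidia-smi") is None:
        return None
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip() or None


summary = gpu_summary()
print(summary if summary else "No GPU detected; check the hardware accelerator setting.")
```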

This should get you started with Google Colab. Feel free to ask questions!