Leverage Google Colab GPU Runtime for Non-Notebook Python Projects

A guide to running non-notebook Python projects in Google Colaboratory

Photo by Joseph Greve on Unsplash

Some months ago I was working with the members and contributors of a project where I needed to run things on a GPU, and a few friends suggested using Google Colab for it.

Google Colab provides a notebook interface for executing Python code. It offers CPU, GPU, or TPU-based runtimes free of charge.

The project had been written as conventional command-line Python scripts, so to use the Colab infrastructure we needed a notebook from which the non-notebook code could be executed. Being able to put such a notebook together quickly to leverage Colab’s GPU runtime can be a boon for GPU-intensive projects. In this article I walk through running a dummy project in Colab, which should help you do the same the next time you need it.

Let’s take this dummy Python Hello World! project.

Directory structure of the project from my terminal

main.py is a simple Python script that prints Hello World!. In a practical scenario where you want to leverage the GPU runtime, this would be the file that serves as the project’s entry point.
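Here is a minimal sketch of what such a main.py could look like. The main() function and the __main__ guard are assumptions about the layout; the actual file in the project may be arranged differently.

```python
def main():
    # The whole "project": print a greeting.
    print("Hello World!")


if __name__ == "__main__":
    # Runs only when the file is executed directly, e.g. `python3 main.py`.
    main()
```

In a real project, main() would be the place that imports and calls the rest of the code base.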

In a realistic project, main.py would use a lot of code from many other files. We want to run the project from a Colab notebook. One way would be to create a notebook, copy all the code from the project into it, and refactor it. As you may have realized while reading the previous sentence, that would be tedious.

Rather, get the project into the Colab runtime

We can avoid that by using the Google Drive integration[3]. We can upload the project into a Google Drive folder and then mount Google Drive into the runtime.

So let’s do that.

Project directory uploaded into My Google Drive as a folder

This Stack Overflow Q&A shows how to mount Google Drive into your Colab notebook.

Mounting Google Drive into the Colab runtime. It uses an OAuth link to grant Colab access to your Drive.
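A minimal sketch of the mount cell follows. drive.mount comes from the google.colab package, which is only available inside Colab; the mount_drive wrapper and its return value are hypothetical additions so that the cell does nothing harmful when run elsewhere.

```python
def mount_drive(mount_point="/content/drive"):
    """Mount Google Drive when running in Colab; return True if mounted."""
    try:
        # google.colab is only importable inside a Colab runtime.
        from google.colab import drive
    except ImportError:
        return False
    # Asks for authorization, then mounts Drive at mount_point.
    drive.mount(mount_point)
    return True


mounted = mount_drive()
print("Drive mounted:", mounted)
```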

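With that, your Drive is mounted at /content/drive in the runtime, and My Drive appears as a folder under it.

So the project’s main file would be at the path /content/drive/My\ Drive/py-hello-world-run-from-colab/main.py (the backslash escapes the space in My Drive when the path is used in a shell command).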

Dealing with package dependencies well

In practice we would be doing all this for a somewhat more complicated project that depends on packages which are not preinstalled. Use standard package management to make them available in the runtime, e.g. with a requirements file as below.

Package dependencies with requirements and %pip magic command[2]

The runtime changes from one session to another, so managing dependencies this way keeps repeated package installation simple.
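As a sketch of that install step: in a notebook cell it is a single %pip line, shown in the comment below. The function underneath builds the equivalent plain-Python command. pip_install_command is a hypothetical helper name, and the requirements path assumes the Drive folder used in this article.

```python
import sys

# Notebook form (a magic command, so it only works inside IPython/Jupyter/Colab):
#   %pip install -r "/content/drive/My Drive/py-hello-world-run-from-colab/requirements.txt"


def pip_install_command(requirements_path):
    """Return the pip command that installs everything listed in the file."""
    # sys.executable ties the install to the interpreter running this code.
    return [sys.executable, "-m", "pip", "install", "-r", str(requirements_path)]


requirements = "/content/drive/My Drive/py-hello-world-run-from-colab/requirements.txt"
command = pip_install_command(requirements)
print(" ".join(command))
# To actually install from a plain script: subprocess.run(command, check=True)
```

The sketch only prints the command; nothing is installed until the command is actually run.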

Running the main file

We want to run a Python file from the notebook the same way we would run it in a terminal. We can use the notebook’s ability to run shell commands and run main.py just as we normally would from a terminal.
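In the notebook this is one line, shown in the comment below. The rest of the sketch does the same thing with subprocess so it can be tried anywhere; it writes a stand-in main.py to a temporary folder, an assumption made purely to keep the example self-contained.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Notebook form, using the Drive path from earlier:
#   !python3 "/content/drive/My Drive/py-hello-world-run-from-colab/main.py"


def run_script(script_path):
    """Run a Python file as a separate process and return what it printed."""
    result = subprocess.run(
        [sys.executable, str(script_path)],
        capture_output=True,
        text=True,
        check=True,  # raise if the script exits with an error
    )
    return result.stdout


# Stand-in project: a temporary main.py that prints Hello World!
with tempfile.TemporaryDirectory() as project_dir:
    main_file = Path(project_dir) / "main.py"
    main_file.write_text('print("Hello World!")\n')
    output = run_script(main_file)

print(output, end="")
```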

Voilà! Now we are able to run the project on Colab and use the advantages of the runtime with a minimal amount of refactoring.

An alternative to !python3 is the %run magic command[2]. This is especially useful if you want to use the variables from the program in the notebook afterwards.
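In the notebook, the line in the comment below is all that is needed. To illustrate what having the variables afterwards means, this sketch uses runpy.run_path from the standard library as a stand-in for %run (it is not the same mechanism). The temporary script and the greeting variable are made up for the example.

```python
import runpy
import tempfile
from pathlib import Path

# Notebook form:
#   %run "/content/drive/My Drive/py-hello-world-run-from-colab/main.py"

with tempfile.TemporaryDirectory() as project_dir:
    script = Path(project_dir) / "main.py"
    script.write_text('greeting = "Hello World!"\nprint(greeting)\n')
    # run_name="__main__" makes the file behave as if it were run directly.
    # The returned dict holds the script's top-level variables.
    namespace = runpy.run_path(str(script), run_name="__main__")

# The script's variable is still available after it has finished.
print(namespace["greeting"])
```

With %run the variables land directly in the notebook namespace, so there is no dictionary lookup as there is here.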

Making the notebook usable outside Colab

There will be cases when we want to run the notebook in a local Jupyter environment. Compared with what we did before, two things would be different:

  1. We don’t need to mount gdrive
  2. The path would hence differ.

We can detect the Colab runtime[4] to make the Drive mount conditional, and use the same condition to choose the paths to main.py and requirements.txt.
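Put together, the conditional setup might look like the sketch below. Checking sys.modules for google.colab is one common way to detect the runtime. in_colab and project_dir are hypothetical helper names, and defaulting to the current directory is an assumption about where the notebook sits locally.

```python
import sys
from pathlib import Path


def in_colab():
    """Return True when the code is running inside a Colab runtime."""
    return "google.colab" in sys.modules


def project_dir(local_dir="."):
    """Return the project folder, mounting Drive first when in Colab."""
    if in_colab():
        from google.colab import drive

        drive.mount("/content/drive")
        return Path("/content/drive/My Drive/py-hello-world-run-from-colab")
    # Outside Colab, assume the project is available on the local disk.
    return Path(local_dir).resolve()


project = project_dir()
main_file = project / "main.py"
requirements_file = project / "requirements.txt"
print(main_file)
print(requirements_file)
```

The rest of the notebook can then refer to main_file and requirements_file without caring where it is running.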

You can find the dummy project and a notebook that puts these steps together in the repository linked below.

References

[1] Welcome to Colaboratory, Colaboratory

[2] Built-in magic commands, IPython documentation

[3] External data: Local Files, Drive, Sheets, and Cloud Storage, Colaboratory

[4] Test if notebook is running on Google Colab, Stack Overflow
