Primer for Learning Google Colab

elvis
Published in DAIR.AI
7 min read · Sep 8, 2018

Google Colaboratory is a free, in-browser, collaborative programming environment that gives deep learning researchers and engineers an interactive, easy-to-use platform for their data science projects.

If you are here because you want to use Google Colab and don’t know where to start, this primer is a simple guide to get you rolling with your data science projects and research ideas. You don’t need any knowledge of notebook technology; all you need is a little Python programming. Even if you have used Colab before, there is something for you in this guide as well. So let’s get started!

Launching a Google Colab Notebook

The first thing you need to do is to open a new tab in your browser and head over to this website: https://colab.research.google.com.

Once you have signed in with a Google account, you will be asked to grant Colab permission to access your Google Drive. Go ahead and do that, since you can store datasets there and make the most of the free space in your Drive 😉.

You will see the following spaces available to the Colab environment:

The “Examples” tab is a great place to start since there are many guides there. I highly recommend that you first read this notebook from Google Colab; it gives you a brief tour of how to use basic Python commands within the Colab environment. It also shows you how to use the notebook features and the built-in Colab notebook tricks. My guess is that once you have gone through this notebook, you will be ready to start coding in Colab in no time. However, there is a lot more you may want to know about Colab in order to use the environment efficiently.

Markdown

Similar to Jupyter notebooks, Google Colab offers rich Markdown support. If you spend time documenting your code, which I always advise data scientists to do, you can take advantage of Colab’s neat preview function. It shows a live preview of your Markdown as you type, which makes documenting faster. Do note that Colab doesn’t offer a spell-checker, but that is something you can live without. Take a look at the Markdown preview in action in the figure below:

Importing Data

There are various ways to import data into Google Colab, including Google Cloud, Google Sheets, and Google Drive, among other well-known choices. However, the simplest way is to upload data directly from your computer. You can import datasets by typing the following lines of code into your Colab environment. Make sure that you have added a “CODE” cell rather than a “TEXT” cell so that the code runs in the Python runtime. Below is a demonstration of how to upload your data.
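The upload step can be sketched as follows. `files.upload()` from the `google.colab` module opens a file picker in the cell output and returns a dictionary that maps each file name to its contents as bytes. The helper name `summarize_uploads` and the stand-in file `example.csv` are assumptions made for illustration, and the fallback branch exists only so the cell also runs outside Colab.

```python
def summarize_uploads(uploaded):
    """Return {filename: size_in_bytes} for a dict of uploaded files."""
    return {name: len(content) for name, content in uploaded.items()}

try:
    # The google.colab module is only available inside a Colab runtime.
    from google.colab import files
    uploaded = files.upload()  # opens a file picker below the cell
except ImportError:
    # Outside Colab: use stand-in data (hypothetical file) so the sketch runs.
    uploaded = {"example.csv": b"a,b\n1,2\n"}

sizes = summarize_uploads(uploaded)
print(sizes)
```

From there, something like `pandas.read_csv(io.BytesIO(uploaded["example.csv"]))` would load the uploaded bytes into a DataFrame.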

Once you have uploaded your data, it’s time to start your data exploration. It’s that fast and easy. I call this “Data Science on Steroids”. Google Colab goes the extra mile to make this process as smooth as possible. Keep in mind that you can also run terminal commands in the Colab environment itself. For instance, let’s say we want to know whether the file we have just uploaded is available to us within the environment. We can do this using the ls command as shown in the figure below:
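In a Colab code cell, a line that starts with `!` is passed to the shell, so `!ls` lists the files in the current working directory. As a rough plain-Python equivalent, the sketch below lists a directory with `os.listdir`. The helper name `list_files` and the file `example.csv` are hypothetical, and a temporary directory stands in for Colab's working directory.

```python
import os
import tempfile

def list_files(directory="."):
    """Return the sorted names of the entries in `directory`, like a bare `ls`."""
    return sorted(os.listdir(directory))

# Simulate an upload by writing a small file into a scratch directory.
with tempfile.TemporaryDirectory() as workdir:
    with open(os.path.join(workdir, "example.csv"), "w") as handle:
        handle.write("a,b\n1,2\n")
    listing = list_files(workdir)
    print(listing)  # the file written above should appear in the listing
```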

Inline Commands

One of the great benefits of Google Colab is that it comes with several notable scientific computing libraries pre-installed, such as NumPy, pandas, and TensorFlow. However, if you want to use something that’s not available in your environment, you can install it directly on Colab using the !pip install name_of_library command. For instance, if you wanted to install PyTorch, another great deep learning toolkit, you would install it using the following command:
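In a code cell, that command takes the form `!pip install torch`, since `torch` is the name of the PyTorch package on PyPI. Because the set of pre-installed libraries changes over time, it can be worth checking whether a library is already importable before installing it. The sketch below does that with the standard library only. The helper names `is_installed` and `pip_install_command` are made up for this example, and the command is only built, not executed.

```python
import importlib.util
import sys

def is_installed(module_name):
    """Return True if a top-level module called `module_name` can be imported."""
    return importlib.util.find_spec(module_name) is not None

def pip_install_command(package_name):
    """Build the pip command for the interpreter that runs this notebook."""
    return [sys.executable, "-m", "pip", "install", package_name]

for name in ["json", "torch"]:
    if is_installed(name):
        print(f"{name} is already available")
    else:
        # Only show the command here; nothing is installed by this sketch.
        print("would run:", " ".join(pip_install_command(name)))
```

To actually install a package, you could pass the list to `subprocess.check_call`, or simply use the `!pip install` form in a cell.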

Let me tell you the good news! You are ready to start rocking and rolling with Google Colab. You just learned how to import data and how to install a computing library like PyTorch. The next step is to start exploring your data and building your deep neural networks.

Cloning

I won’t get into full data exploration or modeling examples in this guide, but if you want to get started with data exploration, neural networks, and recurrent neural networks (RNNs) using Google Colab and PyTorch, here are a few code tutorials to get you going right away. You can clone any notebook to your Drive by opening the “File” menu and clicking the “Save a copy in Drive” option. The notebooks below will show you how to take advantage of Google Colab to build powerful neural networks for classification from scratch, and how to build recurrent neural networks for image classification:

  • Building Your First Wordcloud with Google Colaboratory and Python (Medium | Colab)
  • A Simple Neural Network from Scratch with PyTorch and Google Colab (Medium | Colab)
  • Building RNNs is Fun with PyTorch and Google Colab (Medium | Colab)

GPU all the Way!

As I see it, one of the most powerful features of Google Colab is the GPU capability that it offers for free. In the animation below, I demonstrate how to enable it:
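The setting lives under the “Runtime” menu: choose “Change runtime type” and pick GPU as the hardware accelerator. Once the runtime restarts, a quick check like the one below can confirm that a GPU is visible. This is a minimal sketch that assumes an NVIDIA GPU and relies on the `nvidia-smi` tool being on the path; the helper name `gpu_visible` is hypothetical.

```python
import shutil
import subprocess

def gpu_visible():
    """Return True if nvidia-smi is on PATH and exits successfully."""
    if shutil.which("nvidia-smi") is None:
        return False
    result = subprocess.run(
        ["nvidia-smi"],
        stdout=subprocess.DEVNULL,  # discard the status table, keep only the exit code
        stderr=subprocess.DEVNULL,
    )
    return result.returncode == 0

print("GPU visible:", gpu_visible())
```

If PyTorch is installed, `torch.cuda.is_available()` offers a similar check from inside the framework.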

Imagine this! You can upload data from your computer to Google Colab in the blink of an eye. You can install your Python packages of choice. And you have access to a free GPU. I have to say that there is no reason not to get started with data science.

Say hello to TPUs!

It appears that Google Colab is now offering TPUs, so you can play around with those as well.

Other Fun Stuff To Know

With all the instructions I have shared so far in this guide, you are ready to dive deep into the world of coding for good. There are other use cases for which Google Colab is useful. For instance, it’s super portable since you don’t need to install anything to fire up a Colab notebook. All you need is the browser. It also has a commenting feature, which I don’t see people taking advantage of, but which can serve as a nice way to give feedback to teams, or to learners in case you are using it to teach. By the way, while you are at it, check out this incredibly popular GPU tutorial for Colab users.

What I Would Love To See In The Future

As a heavy user of Google Colab, I would like it to become more alive, in the sense of becoming an active social platform for developers and learners. It already has everything a developer needs, but since it is accessible in the browser, this tool has the potential to become an important way to document, share, and teach data science topics. I feel that it is just not there yet in that regard, but I hope that the Colab team is working hard to make this happen.

I would also love more support for other deep learning and data science toolkits such as PyTorch. If you are a PyTorch user like me, you may run into trouble with the default CUDA version in Colab, in which case you can go over here to learn how to fix these issues.

Other Google Colab Developers

I have seen some great Google Colab influencers around the web. Here is a short list of developers who are frequent users of Google Colab:

Be sure to check out some of their amazing work. If you know any other frequent Colab developers or you are one of them, please comment below and I will incorporate the names here.

Notable Colab Notebooks

Besides the notebooks I shared above, here are a few other impressive and notable Colab notebooks I have come across:

If you know of any other Colab gems, share them in the comments below to help others find the cool stuff quickly.

Final Words

Congratulations! You have made it to the end of the Google Colab primer. Believe me! If you have clicked on every resource I shared above and followed every tutorial, you are definitely on the right track to becoming a Colab wizard. One last note: I will be updating this primer regularly to reflect future Colab updates, new Colab tutorials, and community suggestions. So be sure to bookmark this article 🔖 and check back regularly to find out more about the exciting work that’s going on behind Google Colab. To make this easier for you, I will post updates on my Twitter @omarsar0.
