How to use an IDE in Google Colab (and Kaggle Kernels!) instead of Jupyter or a simple script

Skander HADJ ROMDHAN
Published in Analytics Vidhya
5 min read · Sep 16, 2020

The data world is full of constraints that data practitioners face daily. One I have struggled with over the past years is hardware: good hardware for deep learning is not easy to find, and cloud solutions can be expensive. In some countries, practitioners may also face a lack of availability of DL-oriented hardware.

Kaggle and Google Colab Logos

Kaggle and Google introduced Kaggle Kernels and Google Colab, respectively, to address this issue and make hardware available to people who want to practice or experiment with their datasets. These products give users access to high-compute units such as GPUs and TPUs, which are essential for training models faster.

Available runtimes in Google Colab
Available runtimes in Kaggle kernels

The screenshots above show the runtimes available on the two platforms. Note that in Kaggle Kernels you can choose to write a single script instead of a notebook, and you can link Google Cloud services (BigQuery, Google Storage and AutoML). Kaggle Kernels also let you code in an R environment, which supports only the GPU runtime. The provided runtimes also have plenty of RAM (above 12GB in most cases).

These free hardware solutions give us a development environment, as shown above, but we can't use an IDE there and enjoy its features.

An IDE offers a developer many benefits:

  1. Split code across many files and structure the project as you want
  2. Easy connection to Git and version control
  3. Access to a debugger
  4. Easy access to a terminal

Now, everything has changed with colabcode, a Python module released by Abhishek Thakur, a data scientist and 4x Kaggle Grandmaster. With this package, we get VSCode in our web browser, connected to the environment provided by Colab or Kaggle Kernels, and so benefit from the hardware while working in an IDE. The package runs a continuous process that keeps your session active and does not let the environment shut down until the maximum quota is reached (8h for the free version of Google Colab, for example).

Hands-on!

Enough talking: let's see how to make this solution work and what features colabcode offers.

The module works with only two lines of code but first, you have to install it.
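A minimal sketch of those steps, assuming the `ColabCode` class and its `port`, `password` and `mount_drive` arguments as described in the project's README (source 1). The `launch_vscode` wrapper and the example password are hypothetical names used only for illustration. Install the package first from a notebook cell with `!pip install colabcode`.

```python
def launch_vscode(port=10000, password=None, mount_drive=False):
    """Start code-server through colabcode; the ngrok link is printed in the cell output.

    Assumes `pip install colabcode` has already been run in the notebook.
    """
    # Imported lazily so this cell can be defined before the package is installed.
    from colabcode import ColabCode

    # mount_drive is only useful in Google Colab, where Google Drive can be linked.
    return ColabCode(port=port, password=password, mount_drive=mount_drive)
```

In a Colab cell you would then call something like `launch_vscode(password="choose-your-own", mount_drive=True)`. The call keeps running for as long as the server is up, so put it in the last cell you execute.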

Simple usage of colabcode

With only these lines, you get a link (https://XXXX.ngrok.io) to access VSCode and enjoy your new experience on your dedicated virtual machine.

Drive path and GPU availability check

Note that the mount_drive variable works only in Google Colab, where there is a way to link your Google Drive to your virtual machine. This way, in addition to the compute power of your virtual machine, you benefit from permanent storage. As the screenshot above shows, we are using code-server (a server version of VSCode) with access to a GPU (Nvidia T4) and to my Google Drive folder at the path “/content/drive/My Drive/”.
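The same check can be scripted from the code-server terminal or a .py file. The sketch below uses only the standard library and assumes the `nvidia-smi` tool is on the PATH when a GPU runtime is selected; `check_environment` is a hypothetical helper name, and the Drive path is the one quoted above.

```python
import os
import shutil
import subprocess

# Drive path quoted in the article; it exists only after mounting in Colab.
DRIVE_PATH = "/content/drive/My Drive/"


def check_environment(drive_path=DRIVE_PATH):
    """Return the detected GPU names and whether the Drive folder is visible."""
    gpus = []
    # nvidia-smi ships with the NVIDIA driver; it is absent on CPU-only runtimes.
    if shutil.which("nvidia-smi"):
        result = subprocess.run(
            ["nvidia-smi", "--query-gpu=name", "--format=csv,noheader"],
            capture_output=True,
            text=True,
        )
        if result.returncode == 0:
            gpus = [line.strip() for line in result.stdout.splitlines() if line.strip()]
    return {"gpus": gpus, "drive_mounted": os.path.isdir(drive_path)}


print(check_environment())
```

An empty `gpus` list suggests the runtime has no GPU attached, and `drive_mounted` being `False` suggests the Drive was not mounted (or that you are on Kaggle).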

This module can even be used for remote work: if you have a workstation and want to access it remotely, you can leave a script running that gives you access to VSCode from any web browser. You just need the link and the password. The Python module also has a newer feature that lets you run it from a simple command line.
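As a rough illustration of that command line, the sketch below assembles the command in Python. It assumes the `colabcode` script accepts `--port`, `--password` and `--mount_drive`, as listed in the project's README (source 1); check `colabcode -h` on your installed version. `colabcode_command` and the example password are hypothetical.

```python
import shlex


def colabcode_command(port=10000, password=None, mount_drive=False):
    """Build the argument list for the `colabcode` command-line script."""
    cmd = ["colabcode", "--port", str(port)]
    if password:
        cmd += ["--password", password]
    if mount_drive:
        # Assumed flag name, taken from the README; only meaningful in Colab.
        cmd.append("--mount_drive")
    return cmd


cmd = colabcode_command(port=10000, password="choose-your-own")
print(shlex.join(cmd))  # the line you would type in a terminal
```

On a workstation you could type the printed line directly, or pass `cmd` to `subprocess.run` from a longer-running script.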

Usage in terminal

To end this article about colabcode, I want to thank the module's author for open-sourcing it and summarize the benefits of using it in Google Colab and Kaggle Kernels.

Benefits of using colabcode in Google Colab & Kaggle Kernels

With this game-changing module, hardware requirements for data enthusiasts become minimal: you only need a machine that can run a web browser.

Usage tips:

  • While using colabcode, please wait for the extensions to load. At the time of writing, the Python and Jupyter Notebooks extensions are pre-installed, and they may take about 40 seconds to load. (The extensions are activated after you open the first .py / .ipynb file.)
Extensions loading indicator
  • When working in Colab, try to use mainly Google Drive storage if you mounted your drive, as data can be lost after 8 hours if you use the default runtime storage.
  • This is a free solution based on the free version of ngrok, an application that exposes a local development server to the Internet with minimal effort. This version limits the number of requests per minute. So whenever you see Ngrok error 702, don't panic: just wait 1 or 2 minutes and refresh the page. Another solution I found is to wait 1 or 2 minutes before using it in the first place. This error is due to the first load, which demands a lot of requests.
Ngrok error screenshot
  • When you need more RAM in Google Colab, you can use a trick to jump from 12GB to 25GB. The solution is described in source 3 below.
High RAM runtime (Source: https://towardsdatascience.com/upgrade-your-memory-on-google-colab-for-free-1b8b18e8791d)
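For the storage tip above, one possible pattern is to pick the output folder once and fall back to local storage when Drive is not mounted. This is a sketch under stated assumptions: `output_dir` and the project name are hypothetical, and the Drive root is the path quoted earlier in the article.

```python
import tempfile
from pathlib import Path

DRIVE_ROOT = Path("/content/drive/My Drive")  # path quoted earlier in the article


def output_dir(project, drive_root=DRIVE_ROOT, fallback=None):
    """Return a writable folder for `project`, on Drive when it is mounted."""
    if Path(drive_root).is_dir():
        base = Path(drive_root)
    else:
        # Runtime-local storage: its contents are lost when the session ends.
        base = Path(fallback) if fallback else Path(tempfile.gettempdir())
    target = base / project
    target.mkdir(parents=True, exist_ok=True)
    return target


checkpoints = output_dir("colabcode-demo")
(checkpoints / "notes.txt").write_text("saved from code-server\n")
print(checkpoints)
```

Writing checkpoints and results through a single helper like this keeps the rest of the project unchanged whether it runs in Colab with Drive, in Kaggle, or on a workstation.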

Sources:

1- https://github.com/abhishekkrthakur/colabcode

2- https://www.youtube.com/watch?v=7kTbM3D02jU&feature=youtu.be

3- https://towardsdatascience.com/upgrade-your-memory-on-google-colab-for-free-1b8b18e8791d

4- https://amitness.com/vscode-on-colab/

Skander HADJ ROMDHAN

Data Scientist passionate about solving computer science problems