How to use Google Colab with GitHub via Google Drive
In this tutorial, we will discuss how Google Colab can be used with GitHub for data science and machine learning projects, with Google Drive as cloud storage for the data. First, a brief introduction to these products.
- Colaboratory, or Colab for short, is a product from Google Research. It allows anybody to write and execute arbitrary Python code through the browser, and is especially well suited to machine learning, data analysis and education.
- GitHub is a code hosting platform for version control and collaboration. It lets you and others work together on projects from anywhere, allowing seamless collaboration without compromising the integrity of the original project.
- Google Drive is a file storage and synchronization service that allows users to store files on Google's servers, synchronize files across devices, and share files. It offers 15 GB of free storage to users.
The picture below depicts how these products interact with each other.
Step 1: Use a Colab notebook as a shell
- Visit the Google Colaboratory website.
- Click the New Notebook button. A blank notebook is created and opened.
Step 2: Mount Google Drive to Google Colab Notebook
- Run the script below to mount your Google Drive:
from google.colab import drive
drive.mount('/content/drive')
- Click the link to authenticate with your Google account.
- Select the Google account whose Drive you want to mount and sign in.
- Copy the authentication code and paste it into the input field.
- Congrats! Your Google Drive is mounted.
Step 3: Change present working directory
- The magic command below sets the present working directory to /content/drive/MyDrive/Github:
%cd /content/drive/MyDrive/Github/
Note: Your Google Drive’s home directory is at /content/drive/MyDrive/.
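The %cd command fails if the target folder does not exist yet, for example when no Github folder has been created in your Drive. Here is a minimal sketch that creates the folder first, assuming the path used above; the helper name ensure_workdir is hypothetical and not part of Colab.

```python
import os

def ensure_workdir(path="/content/drive/MyDrive/Github"):
    """Create the working directory if it is missing, then switch into it."""
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    os.chdir(path)                    # changes the notebook's working directory, like %cd
    return os.getcwd()
```

Calling ensure_workdir() in a cell after mounting Drive should leave the notebook in the Github folder, so the later git commands run there.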
Step 4: Generate GitHub Access Token
Now it’s time to generate your GitHub token, which can be used to access the GitHub API.
- Visit the GitHub website and log in to your account.
- Go to Settings, navigate to Developer settings, and then click Personal access tokens. Your page should look something like the screenshot.
- Click the Generate new token button in the top right corner of the page.
- Select the repo checkbox under the Select scopes section, as shown.
Learn more about the scopes of access tokens.
- Now, click the Generate token button at the bottom of the page. The page should now look like the screenshot.
You have successfully generated an access token for your GitHub account. We will use this token to access the GitHub API. Note: Do not share your access token publicly as I did; I am using mine for the purpose of this tutorial only.
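One way to avoid pasting the token into a notebook cell, where it would be saved with the notebook, is to read it at run time. The sketch below is only an illustration: the function name load_github_token and the environment variable name GITHUB_TOKEN are assumptions, not something GitHub or Colab requires.

```python
import os
from getpass import getpass

def load_github_token(env_var="GITHUB_TOKEN"):
    """Return the token from an environment variable, or prompt for it."""
    token = os.environ.get(env_var)
    if not token:
        # getpass does not echo what you type, so the token stays out of the cell output
        token = getpass("GitHub token: ")
    return token.strip()
```

In a notebook, git_token = load_github_token() would then prompt once per session unless the variable is already set.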
Now, two different scenarios arise:
- Create a new Git repository from scratch
- Clone an existing Git repository from GitHub
Based on your requirements, follow the corresponding steps below.
Step 5.A: Create a new Git repository
Follow the steps below to create a new Git repository from scratch directly in your Google Drive.
Step 5.A.1: Initialize new Git repository
- Initialize Git using git init <directory>. In a Colab cell, shell commands such as git must be prefixed with !. In this tutorial, we will be using the titanic repository.
- Change your working directory to the created repository.
- List the files and folders using the ls command.
Step 5.A.2: Working with Git repository
- It’s time to add files and folders to your working directory. Use git status to view the state of the working directory and the staging area, and git add to add changes in the working directory to the staging area.
Learn more about How to download Kaggle datasets to Google Colab.
- After adding files and folders as per your requirements, commit your work using git commit -m "message".
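On a fresh Colab runtime, Git usually has no user name or email configured, and git commit refuses to run until an identity is set. The sketch below builds the commands for setting an identity, staging everything, and committing, and runs them with subprocess. The function names, and the choice to stage all files with git add --all, are assumptions for illustration; adjust them to your project.

```python
import subprocess

def commit_commands(repo_dir, name, email, message):
    """Build the Git commands that set an identity, stage everything, and commit."""
    git = ["git", "-C", repo_dir]  # -C runs Git as if it were started inside repo_dir
    return [
        git + ["config", "user.name", name],    # stored in the repository's own config
        git + ["config", "user.email", email],
        git + ["add", "--all"],                 # stage every change in the working tree
        git + ["commit", "-m", message],
    ]

def run_all(commands):
    """Run each command in order; raise CalledProcessError at the first failure."""
    for cmd in commands:
        subprocess.run(cmd, check=True)

# Example (not executed here):
# run_all(commit_commands("titanic", "Your Name", "you@example.com", "Initial commit"))
```

Because the identity is written to the repository's own configuration, which lives in Google Drive, it should survive a runtime restart. Note that git commit exits with an error when there is nothing to commit, so run_all raises in that case.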
Step 5.A.3: Create a new repository on GitHub
Once you are satisfied with your work and want to save your commits to GitHub, follow the steps below.
- Visit the GitHub website and log in to your account.
- Create a new repository on GitHub. Note: Do not initialize the repository with a README, .gitignore, or license file. This empty repository will await your code. Don’t worry, you can create those files in Google Colab (the local machine) or on GitHub after your first git push.
Step 5.A.4: Upload your commits to GitHub
Before pushing your commits to your GitHub account, you need to configure the Git repository in your Google Drive. To do this, follow these steps:
- Create a set of variables from your GitHub repository URL, https://github.com/<username>/<repository>:
  - username: your GitHub username. In our case, it is MohammedIsmailP.
  - repository: the repository you created. In our case, it is titanic.
  - git_token: your personal access token (do not share it publicly).
- Add a remote to your Git repository from the above variables:
git remote add <remote-name> https://{git_token}@github.com/{username}/{repository}.git
- Push your commits using the git push command:
git push -u <remote-name> <branch-name>
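The remote URL above can be assembled from the three variables in Python. This is a minimal sketch; the helper name build_remote_url is hypothetical, and percent-encoding the parts with quote is a precaution I have added, not something the steps above require.

```python
from urllib.parse import quote

def build_remote_url(username, repository, git_token):
    """Return the HTTPS remote URL with the token embedded as credentials."""
    # quote() percent-encodes any character that is not safe inside a URL
    return (
        f"https://{quote(git_token, safe='')}@github.com/"
        f"{quote(username, safe='')}/{quote(repository, safe='')}.git"
    )
```

In a Colab cell, a Python variable can be passed to a shell command in braces, for example url = build_remote_url(username, repository, git_token) followed by !git remote add origin {url}, where origin is just the conventional remote name. Keep in mind that the URL, token included, is saved in plain text in the repository's .git/config file on your Drive.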
Congratulations!! You have successfully pushed your commits to GitHub. Let’s verify it in GitHub.
Step 5.B: Clone an existing Git repository
Follow the steps below to clone an existing Git repository from GitHub into your Google Drive.
- Go to the GitHub repository you want to clone.
- Click the Code button and copy the URL, as shown.
- You’ll need your GitHub access token before cloning your GitHub repository. Also extract the following set of variables from your GitHub account:
  - username: your GitHub username. In our case, it is MohammedIsmailP.
  - repository: the repository you created. In our case, it is titanic.
  - git_token: your personal access token (do not share it publicly).
- Clone the repository using the git clone command:
git clone https://{git_token}@github.com/{username}/{repository}
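If you would rather start from the URL copied from the Code button than retype the user name and repository, the token can be inserted into that URL. This is a sketch under the assumption that the copied URL is the HTTPS one; the helper name add_token_to_url is hypothetical.

```python
from urllib.parse import urlsplit, urlunsplit

def add_token_to_url(clone_url, git_token):
    """Insert the token into an HTTPS clone URL copied from GitHub."""
    parts = urlsplit(clone_url)
    if parts.scheme != "https":
        raise ValueError("expected an https:// clone URL")
    host = parts.hostname  # host only, so credentials already in the URL are dropped
    return urlunsplit((parts.scheme, f"{git_token}@{host}", parts.path, "", ""))
```

The result can be passed to git clone in a Colab cell in the same way as the hand-built URL.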
Congratulations!! You have successfully cloned your GitHub repository. You can start working on your repository and save your work by committing it using git commit -m "message".
Once you are satisfied with your work and want to save your commits to GitHub, you can simply push them using the git push command.
Congratulations!! You have successfully pushed your commits to GitHub. You can verify them in GitHub.
That’s it! Our tutorial ends here!! We have successfully learned ‘How to use Google Colab with GitHub via Google Drive’. I hope you enjoyed following along with this tutorial. But wait, we never learned ‘How to run a notebook stored in Google Drive’. Don’t worry, I’ll cover this as well.
How to run a notebook stored in Google Drive
To run a notebook stored in Google Drive, follow these steps.
- Open Google Drive and log in to your account.
- Locate your notebook in Google Drive, right-click it to open the actions menu, hover over Open with, and select Google Colaboratory, as shown.
- The notebook will open in Google Colab in a new tab.
- Mount your Google Drive to Google Colab (refer to step 2).
- Change the working directory (refer to step 3).
- Then execute your code.
That’s it! Simple, isn’t it? I hope it helps. Happy Learning!!
Reference
- Scopes for OAuth Apps
- How to download Kaggle datasets to Google Colab
- Learn more about Git and GitHub from Git Handbook