How To Use Google AI Platform Notebooks For Your Data Science Team
In this article, I share my personal experience of using Google AI Platform Notebooks to reduce costs and increase data scientists' productivity.
Background
Jupyter Notebook is one of my team’s essential tools. 90% of our projects start with a prototype in a Jupyter Notebook.
Previously, we ran a single Jupyter Notebook server to meet our needs.
This approach came with a few drawbacks: (1) we couldn’t run multiple training processes in parallel due to resource limitations; (2) the backup procedure was complicated; (3) maintaining and hardening the security of the Jupyter Notebook server was a daunting task.
Using Google AI Platform Notebooks, each person on my team can easily spawn a new instance based on the project requirements.
To reduce cost, each instance is temporary and can be deleted without data loss. All notebooks are stored in a GitLab repository, and large data is stored in Google Cloud Storage.
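As a rough illustration of keeping large data outside the instance, the Python sketch below builds a `gsutil cp` command that copies a local file to Google Cloud Storage. The bucket name `my-team-datasets`, the prefix `project-a`, and the file path are hypothetical placeholders, and the sketch assumes `gsutil` is installed and authenticated on the instance.

```python
import subprocess
from pathlib import PurePosixPath
from typing import List


def gcs_uri(bucket: str, object_path: str) -> str:
    """Build a gs:// URI from a bucket name and an object path."""
    return f"gs://{bucket}/{object_path.lstrip('/')}"


def upload_command(local_file: str, bucket: str, prefix: str = "") -> List[str]:
    """Return the gsutil command that copies local_file into the bucket."""
    filename = PurePosixPath(local_file).name
    cleaned_prefix = prefix.strip("/")
    object_path = f"{cleaned_prefix}/{filename}" if cleaned_prefix else filename
    return ["gsutil", "cp", local_file, gcs_uri(bucket, object_path)]


def upload(local_file: str, bucket: str, prefix: str = "") -> None:
    """Run the copy; raises CalledProcessError if gsutil fails."""
    subprocess.run(upload_command(local_file, bucket, prefix), check=True)


if __name__ == "__main__":
    # Hypothetical names, printed only; call upload(...) to actually copy.
    print(upload_command("data/train.csv", "my-team-datasets", "project-a"))
```

Keeping the command construction separate from the call that runs it makes it easy to check the target path before any data is copied.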
A Guide
These are the steps for using Google AI Platform Notebooks.
Create New Git Repository
Create a new Git repository on GitHub or GitLab. We will use it to store all of your team's notebooks. You can also use an existing Git repository in your organization.
Launch New Instance
Go to the Google AI Platform dashboard.
Choose Notebooks in the sidebar.
Click New Instance.
Choose the instance type based on the project requirements.
Follow the instructions and launch the instance.
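The same launch can probably be scripted instead of clicked through. The Python sketch below only assembles a `gcloud notebooks instances create` command; it is an assumption that this command and these flags are available in your version of the Cloud SDK, and the instance name, zone, machine type, and image family shown are hypothetical examples rather than recommendations.

```python
import shlex
from typing import List


def create_instance_command(
    name: str,
    location: str,
    machine_type: str,
    image_family: str,
    image_project: str = "deeplearning-platform-release",
) -> List[str]:
    """Assemble (but do not run) a gcloud command that creates a notebook instance."""
    return [
        "gcloud", "notebooks", "instances", "create", name,
        f"--vm-image-project={image_project}",
        f"--vm-image-family={image_family}",
        f"--machine-type={machine_type}",
        f"--location={location}",
    ]


if __name__ == "__main__":
    # Hypothetical values; pick the machine type the project actually needs.
    cmd = create_instance_command(
        name="churn-prototype",
        location="asia-southeast1-b",
        machine_type="n1-standard-4",
        image_family="common-cpu",
    )
    # Print a copy-pasteable shell line instead of executing it.
    print(" ".join(shlex.quote(part) for part in cmd))
```

Printing the command rather than executing it lets each team member review the machine type, and therefore the cost, before anything is created.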
Open & Set Up JupyterLab
Click Open JupyterLab.
You will be redirected to a unique URL like the following:
https://RANDOM_STRING-dot-asia-southeast1.notebooks.googleusercontent.com/lab
Don’t worry, it’s secure by default.
Only authenticated users can access JupyterLab.
The next step is to set up the Git repository.
Select Git → Clone Repository to clone the Git repository.
Click Clone a Repository.
Enter the GitLab/GitHub URL.
If you are using HTTPS, enter your GitLab/GitHub account email and password (GitHub now requires a personal access token in place of the password).
For SSH, you need to set up an SSH key first.
Your repository will then be available inside the file explorer.
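To make the difference between the two URL styles concrete, here is a small Python sketch that builds the HTTPS and SSH forms of a clone URL, the matching `git clone` command, and an `ssh-keygen` command for creating a key. The host, namespace, project, and email values are hypothetical; the public key that `ssh-keygen` produces still has to be added to your GitLab/GitHub account by hand.

```python
from typing import List, Optional


def https_clone_url(host: str, namespace: str, project: str) -> str:
    """HTTPS form: authenticates with a username plus password or token."""
    return f"https://{host}/{namespace}/{project}.git"


def ssh_clone_url(host: str, namespace: str, project: str) -> str:
    """SSH form: authenticates with a key pair registered on the Git host."""
    return f"git@{host}:{namespace}/{project}.git"


def clone_command(url: str, dest: Optional[str] = None) -> List[str]:
    """Return the git command that clones url, optionally into dest."""
    cmd = ["git", "clone", url]
    if dest:
        cmd.append(dest)
    return cmd


def ssh_keygen_command(email: str) -> List[str]:
    """Return a command that generates an Ed25519 key labelled with email."""
    return ["ssh-keygen", "-t", "ed25519", "-C", email]


if __name__ == "__main__":
    # Hypothetical repository coordinates.
    print(clone_command(https_clone_url("gitlab.com", "my-team", "notebooks")))
    print(clone_command(ssh_clone_url("gitlab.com", "my-team", "notebooks")))
    print(ssh_keygen_command("me@example.com"))
```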
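Then create a new notebook and work on your project.
To back up your notebooks, follow the steps in the next section.
Back Up Your Notebooks
Select Git, then pull the latest changes first.
Look at the Untracked list, which shows the files that Git is not tracking yet.
Track your notebook file by hovering over the filename and clicking the “+” icon.
Ignore the files that you don’t want to back up.
Write the commit message, then click Commit. Push the commit afterwards so that it reaches the remote repository; a commit alone stays on the instance.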
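For those who prefer the JupyterLab terminal to the Git panel, the backup steps map roughly onto the command sequence sketched below in Python. The notebook filename and commit message are hypothetical, and the final `git push` is what sends the commit to the remote repository. The sketch assumes `git` is installed and the working directory is the cloned repository.

```python
import subprocess
from typing import List


def backup_commands(files: List[str], message: str) -> List[List[str]]:
    """Return the git commands for one backup: pull, stage, commit, push."""
    if not files:
        raise ValueError("nothing to back up: pass at least one file")
    if not message.strip():
        raise ValueError("commit message must not be empty")
    return [
        ["git", "pull"],                    # fetch teammates' changes first
        ["git", "add", "--", *files],       # track only the chosen files
        ["git", "commit", "-m", message],   # record the snapshot locally
        ["git", "push"],                    # send it to the remote repository
    ]


def run_backup(repo_dir: str, files: List[str], message: str) -> None:
    """Run each command inside repo_dir, stopping at the first failure."""
    for cmd in backup_commands(files, message):
        subprocess.run(cmd, cwd=repo_dir, check=True)


if __name__ == "__main__":
    # Hypothetical notebook name; printed only, nothing is executed here.
    for cmd in backup_commands(["analysis.ipynb"], "Add churn analysis prototype"):
        print(" ".join(cmd))
```

Listing files explicitly, instead of staging everything, mirrors the advice to ignore files you do not want to back up.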
Delete Your Instance
Once you have finished a project, it’s best practice to delete your instance immediately.
All notebooks are stored in Git and can be used from any instance, so you can spawn a new instance and continue working with your existing notebooks.
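Because deleting an instance discards anything that was never committed, a small guard can be useful. The Python sketch below parses the text produced by `git status --porcelain` and only returns a delete command when no modified or untracked files remain. The `gcloud notebooks instances delete` command and its `--location` flag are an assumption about your Cloud SDK version, the instance name and zone are hypothetical, and the check does not detect commits that were made but never pushed.

```python
from typing import List


def pending_files(porcelain_output: str) -> List[str]:
    """Extract paths from `git status --porcelain` output.

    Each line has a two-character status code, a space, then the path.
    """
    return [line[3:] for line in porcelain_output.splitlines() if line.strip()]


def delete_instance_command(name: str, location: str) -> List[str]:
    """Assemble (but do not run) a gcloud command that deletes an instance."""
    return ["gcloud", "notebooks", "instances", "delete", name,
            f"--location={location}"]


def safe_delete_command(porcelain_output: str, name: str, location: str) -> List[str]:
    """Return the delete command only if the working tree is clean."""
    pending = pending_files(porcelain_output)
    if pending:
        raise RuntimeError(f"uncommitted files would be lost: {pending}")
    return delete_instance_command(name, location)


if __name__ == "__main__":
    # Hypothetical status output with one modified and one untracked notebook.
    sample = " M analysis.ipynb\n?? scratch.ipynb\n"
    print(pending_files(sample))
    print(safe_delete_command("", "churn-prototype", "asia-southeast1-b"))
```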
Conclusion
Using the approach described above, you only pay for what you use, and your experiments and prototypes are safely stored in a Git repository.
You can easily spawn a new instance and use the existing notebooks in the Git repository.
Good luck.