Using Google Cloud AI Platform Notebooks as a Web-Based Python IDE

Michael Sherman
Apr 10, 2019 · 18 min read

This post contains instructions and advice on how to set up and use Google Cloud AI Platform Notebooks as a development environment. It is aimed at software engineers who want to know how to use and find common IDE features, but it is relevant to anyone interested in learning how to use Google Cloud AI Platform Notebooks or JupyterLab.

Table of Contents

  • Why Use a Data Science Tool as an IDE?
  • Step 0: Before you Begin
  • Step 1: Create an AI Platform Notebooks VM Instance
  • Step 2: Connect to JupyterLab on Your VM
  • Step 3: Get to Know the AI Platform Notebooks JupyterLab Interface
  • Step 4: Learn How to Stop, Start, and Resize your VM
  • Step 5: Get Root Access to Your VM via SSH
  • Step 6: Manage Your Python Environment
  • Step 7: Explore Extensions
  • Step 8: Learn and Do More

Why Use a Data Science Tool as an IDE?

Google Cloud AI Platform Notebooks are built on JupyterLab. JupyterLab is a web-based development environment that includes a Jupyter notebook editor, a file browser, a terminal, a text editor with syntax highlighting, a drag-and-drop tiled layout, and support for custom themes and extensions. All of these elements work side by side in an extremely responsive webpage.

A year ago I switched from PyCharm to JupyterLab (running on Google Cloud Deep Learning VM Images, which form the basis of AI Platform Notebooks) as my main development environment and I have not looked back.

In my day-to-day work as a Machine Learning Engineer working with Google Cloud customers, I continually ran into challenges with my previous setup of PyCharm running on a MacBook. Most of my work requires customer data, which cannot leave the cloud. This meant I spent a lot of time in vim on a VM, relied heavily on managed services (which are great for large jobs but too slow when you just want to poke around), or struggled to create suitable synthetic data. Even when I could fully use PyCharm, I'd run into RAM problems if a dataset was more than a couple of gigabytes, or I'd encounter security problems as I traveled and made API calls to cloud projects from mysterious hotel IP addresses.

I also work a lot with data scientists, who have their own preferred tooling and workflows. Trying to get data scientists to use software engineer tools and workflows is a struggle (and hurts their productivity). But asking software engineers to switch to data science tools is absurd — you cannot build software with notebooks. And forget about trying to get a development team to unify their workflows if they are using fundamentally different tools.

There are some PyCharm features I miss, mainly advanced code navigation features. But I gained far more than I lost:

  • Faster ramp up on new projects. It used to take a few days to connect my local IDE to a properly secured project, and then a few more days to get permission from the right people to take data samples off the cloud. It now takes me 15 minutes to start working in a new (to me) cloud project. Since I’m entirely in the cloud now, working securely requires much less effort.
  • A massive dev machine, but only when needed. I’m really lazy — I don’t want to code in a distributed framework like Beam or Spark because a dataset is 10 GB and my dev PC has 16 GB of RAM. Now I can be lazy and frugal with datasets up to 10s of GBs. I write my code on a small sample of data, resize my modest dev VM to a monster VM with 96 GB of RAM, run the code on the entire dataset, then resize back down. If I’m slightly less lazy and use Python packages that handle multiple cores well, I resize to dozens of cores and 100s of GB of RAM and handle datasets up to 100s of GBs. This applies to GPUs as well — once I know my TensorFlow code runs on two K80s, I switch to eight V100s for the full training job.
  • No more installing the Nvidia driver stack. AI Platform Notebooks have the correct Nvidia drivers and CUDA version for your chosen deep learning framework pre-installed. Installing and managing a GPU-enabled deep learning stack is hard, and I’m glad I don’t need to do it.
  • Full productivity with a tiny computer on airplane wifi. JupyterLab is very responsive even with a spotty, high-latency connection. As long as the wifi isn’t completely dead, some of my most productive coding time is when I’m crammed into an aluminum tube banging away at my Chromebook.
  • Increased security and easier security compliance. The safest place for data is where it already lives, which is usually somewhere on the cloud. Now I no longer need to copy a data sample to my local PC to use my main development environment. I also limit my local PC’s cloud connection to a single VM or web endpoint, so the cloud project can be locked up tighter than a development project normally could.
  • Everything is already on the cloud. Things are just easier. The SDKs are installed, moving data around is faster, and I’m less affected by the firewall or security policies on my local internet connection.
  • Better collaboration with data scientists. I don’t like notebooks, but I can’t stop data scientists from using them. Notebooks obviously work great in JupyterLab, and getting JupyterLab to work a bit more like a traditional IDE lets software engineers use the same tooling as the data scientists. This makes it easier for software engineers to support data scientists when things break, and better positions software engineers to steer data scientists towards more mature development practices.

Step 0: Before you Begin

To run AI Platform Notebooks, you need a Google Cloud Platform project (with an attached billing account if you want to use GPUs) with the Compute Engine API enabled. Instructions are here. If you’re already a Google Cloud user you can probably skip this step.

Many of the commands in this post use the Google Cloud gcloud command-line tool. Most of these steps are possible with the AI Platform Notebooks UI, but using gcloud gives more control and better reproducibility.

You can install gcloud on your local machine, but if you don’t already have a machine with gcloud just use Cloud Shell, a web-based terminal session with everything you need already installed. If you already have a Google Cloud project and are logged in, click this link to launch Cloud Shell.

If you are using gcloud for the first time, run gcloud init to authorize. You can specify project/region/zone defaults if you wish, but be aware AI Platform Notebooks and GPUs are not available in all regions yet. Currently, the best way to see what zones everything is available in is to look at the regions available when creating a new Notebooks instance.

Step 1: Create an AI Platform Notebooks VM Instance

First, create an AI Platform Notebooks VM instance. You can do this in the Notebooks UI but using gcloud gives more options:
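Here is a sketch of what that creation command can look like, assembled from the flags explained below. The variable values (instance name, zone, image family, and machine type) are placeholders; substitute your own.

export INSTANCE_NAME="my-notebooks-vm"
export ZONE="us-west1-b"
export IMAGE_FAMILY="tf-latest-gpu"
export INSTANCE_TYPE="n1-standard-8"

gcloud compute instances create $INSTANCE_NAME \
  --zone=$ZONE \
  --image-family=$IMAGE_FAMILY \
  --machine-type=$INSTANCE_TYPE \
  --image-project=deeplearning-platform-release \
  --maintenance-policy=TERMINATE \
  --accelerator='type=nvidia-tesla-v100,count=2' \
  --no-boot-disk-auto-delete \
  --boot-disk-device-name=$INSTANCE_NAME-disk \
  --boot-disk-size=500GB \
  --boot-disk-type=pd-ssd \
  --scopes=https://www.googleapis.com/auth/cloud-platform \
  --metadata='install-nvidia-driver=True,proxy-mode=project_editors'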

You may want to change the command to meet your specific needs.

gcloud compute instances create $INSTANCE_NAME: all VM creation gcloud commands start with this.

--zone=$ZONE: the zone you want your VM created in. Only some zones have GPUs, but any zone in a region that supports AI Platform Notebooks is fine. Currently, the best way to see what zones everything is available in is to look at the regions available when creating a new Notebooks instance.

--image-family=$IMAGE_FAMILY: specifies the image to use to create the VM. An image family is a group of related VM images; pointing to an image family instead of a specific image ensures the most up-to-date image in the family is installed. The tf-latest-gpu image family contains VM images for running TensorFlow on GPUs — all the necessary tools (including JupyterLab) are preinstalled and key binaries are custom built for increased performance. This image family is part of Google Cloud Deep Learning VM Images, the product Notebooks is based on. Deep Learning VM image families are available for popular deep learning frameworks, no deep learning framework at all, CPU-only, etc. If you want to use a specific version of an image, use --image instead of --image-family and specify a valid Deep Learning VM image version.

--machine-type=$INSTANCE_TYPE: determines the RAM and cores of your VM. Many different configurations are available. Note you can easily change this later.

--image-project=deeplearning-platform-release: the project in which to find the specified --image or --image-family. The deeplearning-platform-release project holds images provided by Google Cloud Deep Learning VM Images; don’t change this if you’re creating a Notebooks VM.

--maintenance-policy=TERMINATE: what happens to your VM during a maintenance event. Most VMs can be live migrated, but live migration does not work when GPUs are attached to your VM. If GPUs are not attached to your VM (and you don’t intend to ever attach GPUs) you can leave this line out.

--accelerator='type=nvidia-tesla-v100,count=2': the type of GPU to attach and how many to attach; see the documentation for available GPUs/counts. If you are not using GPUs, you can leave this line out.

--no-boot-disk-auto-delete: the default behavior on VM deletion is to delete the boot disk; this flag overrides that default so the boot disk is not deleted when the VM is deleted. This means if you accidentally delete your VM you can still recover your work. But it also means you need to delete the disk separately from your VM if you want to remove everything.

--boot-disk-device-name=$INSTANCE_NAME-disk: creates a disk based on the name of the VM.

--boot-disk-size=500GB: adjust to your taste; 100 GB or above is best.

--boot-disk-type=pd-ssd: makes your disk an SSD for better performance.

--scopes=https://www.googleapis.com/auth/cloud-platform: gives your VM the ability to connect to Google Cloud Platform APIs. This means you will be able to use services like BigQuery, Cloud Storage, AI Hub, etc. from your VM. It is also required to create a direct connection URL to your VM. You can name only the scopes you need, use service accounts, or leave this line out and use the Compute Engine default service account. But be aware with reduced scopes you will not get a direct connection URL to your VM, and then you’ll need to use an SSH tunnel to connect.

--metadata='install-nvidia-driver=True,proxy-mode=project_editors': metadata entries are used by Deep Learning VMs to pass parameters to installation and startup scripts. If you want to use GPUs, install-nvidia-driver=True installs the driver; you can leave this out if you are not using GPUs. proxy-mode=project_editors creates a direct connection URL to your VM and adds it to the Notebooks UI. You can forgo this line as well and use an SSH tunnel to connect to your VM.

One optional flag to add, especially if you want a beefy VM now or later and are not using GPUs, is --min-cpu-platform=Intel\ Skylake. This ensures your VM is on the Skylake CPU platform (if your zone supports it), which allows VMs with more cores and RAM than other platforms. You can also use other CPU platforms.

Step 2: Connect to JupyterLab on Your VM

Your VM will be created a few minutes after the gcloud command finishes. A few minutes after that, your VM is assigned a URL for direct connection and appears in the AI Platform Notebooks UI:

The AI Platform Notebooks UI

You can use the Notebooks UI to connect to your VM by clicking “OPEN JUPYTERLAB”, but you can also get the URL using gcloud:
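The URL is stored in the instance metadata (on Deep Learning VM images it should appear under a proxy-url key), so a describe plus grep is one way to surface it:

gcloud compute instances describe $INSTANCE_NAME --zone=$ZONE | grep proxy-url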

You’ll see something like this: numbersandletters-dot-datalab-vm-region.googleusercontent.com.

This URL provides a secure web connection to your VM. Just put it in your web browser and you’ll connect to your JupyterLab session. Be aware other project users with high privileges can use this URL as well — it is not accessible only by the VM creator.

If you created your VM without proxy-mode=project_editors or did not set the https://www.googleapis.com/auth/cloud-platform scope, you need to connect to your VM via an SSH tunnel.
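A minimal tunnel looks something like this, assuming JupyterLab is listening on port 8080 on the VM (the Deep Learning VM default) and that SSH access is open (see Step 5):

gcloud compute ssh $INSTANCE_NAME --zone=$ZONE -- -L 8080:localhost:8080

With the tunnel open, browse to http://localhost:8080 to reach JupyterLab.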

Once you’re connected to JupyterLab, you’ll see the default AI Platform Notebooks JupyterLab UI:

When you first open JupyterLab, you’ll see something like this.

Step 3: Get to Know the AI Platform Notebooks JupyterLab Interface

The JupyterLab interface is powerful and highly customizable.

Out of the box, JupyterLab comes with a Jupyter notebook viewer/editor, a terminal, viewers for many file types, a text editor with syntax highlighting, and a file browser. It’s worth browsing the entire JupyterLab User Guide to learn about the features, but at least check out the basics of the interface.

  • UI and shortcut key customization options are found under the Settings menu.
  • To get autocomplete when working in the text editor, you need to “connect” the text editor to a code console by right clicking in the text editor and choosing the kernel (runtime) you want. This gives you an autocomplete menu when you press tab, and runs the current text editor line in the code console when you press shift+enter. One caveat of autocomplete: you must import a module to get autocomplete functionality for that module. For example, to get autocomplete for numpy, type import numpy as np and press shift+enter. After that when you type “np.” and hit tab you will get the expected autocomplete functionality.
  • Help (including method signatures) is available in a code console (including a code console attached to a text editor) through the inspector. Right click in the code console input box to open the inspector, and then as you type in the console input box the inspector will update. To get method definition information, you have to put the ( after the method name before help appears. This image shows all of this:
The Inspector tab, up top, is opened from the popup menu when you right click in the console input. Once the inspector is open, it will update as you type in the console input box. You usually need to put the “(“ after a method definition for the inspector help to update.

AI Platform Notebooks include additional functionality beyond base JupyterLab thanks to a few pre-installed extensions: git, tensorboard, and nbdime (notebook diffs).

  • The git extension provides a lightweight GUI with basic git functionality: cloning, tracking and staging changes, push/pull, making commits, creating and changing branches, and viewing commit history. If you’re working with data scientists or researchers who are not experienced with git, it’s a great tool for steering them towards better version control practices.
The git extension lives in the left sidebar, and appears when you click the git icon.
  • The tensorboard extension adds the TensorBoard UI directly into the JupyterLab UI. Navigate the file browser to a summary logdir written by TensorFlow, and use the launcher (the plus button in the top left) to open a TensorBoard UI in a tab. Note the tensorboard extension only appears if you’re running a Notebooks VM with TensorFlow pre-installed.
Note the directory in the file browser on the left. If you’re in a TensorFlow logdir and you click the “Tensorboard” button in the launcher, everything “just works”.
The TensorBoard UI is created in a new JupyterLab tab.
  • If you are working with Jupyter notebooks, nbdime is a must. Standard text diffs on .ipynb files are noisy due to the large amount of metadata. The nbdime JupyterLab extension displays useful notebook git diffs in the JupyterLab UI. Once you edit and save a .ipynb file in a git repo, use the “git” button to view a workable diff:
The “git” button is at the top right of the toolbar in a notebook tab.
You can clearly see the different cells between the local notebook and the notebook in the git repo. You can also browse deeper into other kinds of changes.

Step 4: Learn How to Stop, Start, and Resize your VM

When you are not actively working on your VM, stop it to save money and project resources. When your VM is stopped your settings, environment, data, and work are still intact and waiting for you when you start the VM again.

You also need to stop your VM to resize it. Resizing your dev VM is a quick and dirty way to run something that would otherwise take too long or not fit in memory, or to attach more GPUs for a large training job. You can change your VM machine type with gcloud:
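For example, assuming you want to resize to the n1-highmem-96 machine type (96 vCPUs, 624 GB of RAM; pick whatever fits your job):

gcloud compute instances stop $INSTANCE_NAME --zone=$ZONE

gcloud compute instances set-machine-type $INSTANCE_NAME \
  --zone=$ZONE \
  --machine-type=n1-highmem-96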

To attach, detach, or change GPUs, you need to use the Notebooks UI.

Once you’ve changed your VM configuration, start your VM:
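Assuming the same instance name and zone variables as before:

gcloud compute instances start $INSTANCE_NAME --zone=$ZONE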

The ability to resize and respec stopped Notebooks VMs is extremely useful. You can prototype your model training on a pair of cheap GPUs and then switch to eight P100s once you know everything works. You can work effectively with datasets up to 100s of GB with standard Python packages (just make sure your packages can take advantage of multiple CPU cores; for Pandas on multiple cores check out modin). And when you no longer need the extra power you can save money by reducing the specs of your VM.

Note there are restrictions on VM size. Different CPU platforms support different machine types, and there are additional limits on cores/RAM when GPUs are attached to a VM.

Step 5: Get Root Access to Your VM via SSH

Update 4/27/19: the latest VM images should allow the jupyter user to sudo from the JupyterLab terminal.

When you’re working on your VM, you are logged in as user jupyter. For better security, this user has the permissions you’ll need for day-to-day development work but has no root access to the VM or sudoer privileges.

At some point you will need sudoer privileges. You can get sudoer privileges by connecting to the VM through an SSH connection. You can do this with the Compute Engine UI, or with gcloud:
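Assuming the same instance name and zone variables as before, something like:

gcloud compute ssh $INSTANCE_NAME --zone=$ZONE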

When you connect via SSH you connect as your default Google Cloud Platform user. This gives you sudoer privileges.

Depending on how your project is set up, you may not be able to SSH to your VM because port 22 is closed by default — this is a good security practice. To enable SSH, you’ll need to open port 22 by creating and applying a firewall rule.

First, create a high priority firewall rule that opens port 22:
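A sketch of such a rule, using allow-ssh as both the rule name and the target tag (the tag value and description are your choice):

gcloud compute firewall-rules create allow-ssh \
  --allow=tcp:22 \
  --priority=0 \
  --description="allow SSH to tagged instances" \
  --target-tags=allow-ssh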

allow-ssh is the name of the rule, --allow=tcp:22 opens TCP port 22, --priority=0 makes this firewall rule take precedence over other firewall rules (lower priority value = higher precedence), --description is how your firewall rule is described (optional but useful), and --target-tags specifies the VM tags that identify which VMs this rule applies to. If you’re extremely security conscious, consider using the --source-ranges parameter to only open port 22 to select IP addresses.

Next, apply the firewall rule to your instance by tagging your instance appropriately:
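For example, using the allow-ssh tag from the rule above:

gcloud compute instances add-tags $INSTANCE_NAME --zone=$ZONE --tags=allow-ssh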

Now you are able to connect via SSH. Once you’re done, keep your VM more secure by untagging your VM until you need SSH access again:
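Removing the tag mirrors adding it:

gcloud compute instances remove-tags $INSTANCE_NAME --zone=$ZONE --tags=allow-ssh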

Step 6: Manage Your Python Environment

Python environment management is an important consideration in a Python developer workflow. When using AI Platform Notebooks, you may not want to independently manage your environment. There are some under-the-hood optimizations you will lose (including optimized binaries of some packages) and you may run into problems with version compatibilities as you install the packages you need. You may find it easier to maintain separate VMs for each of your projects rather than separate environments.

But sometimes you need a tightly controlled environment, and the default Notebooks environment has many packages pre-installed (run pip list to see).

The most popular tools for managing Python environments are pipenv, virtualenv, and virtualenvwrapper (which extends virtualenv). If you have a preferred tool, you can use it with Notebooks (but you may need SSH access to set it up properly; see Step 5 above), and you may want to skip the content on virtualenv basics and jump down to “Creating a Jupyter Kernel for a Virtual Environment”.

virtualenv is the entry-level tool for environment management, and is pre-installed when a Notebooks VM is created. The general idea of virtualenv is you create a “virtual environment” with its own version of Python and its own installed packages. When you want to work in that environment you “activate” it, and then whenever you use pip only the active environment changes. When you’re done working you “deactivate” your environment. Virtual environments live on your disk in folders. You can have as many virtual environments as you want and you manage them yourself.

Creating a virtualenv virtual environment is as simple as running the virtualenv command followed by a target directory where your virtual environment will live:
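For example:

virtualenv venv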

This example creates a virtual environment in the venv directory.

Activate the virtual environment by running bin/activate in the environment directory:
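Continuing the venv example from above:

source venv/bin/activate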

You’ll see the shell prompt changes to have (venv) in front of it. This lets you know you’re in the virtual environment. Now anything you do with pip only changes the activated virtual environment.

When you’re done working in the environment, run the deactivate command to leave the environment:
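# run inside the activated environment
deactivate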

A popular virtualenv usage pattern is to create a virtual environment in every project you work on, in a venv directory in the project root. If you do this, consider putting venv in your .gitignore file so you do not check it into a remote git repo. Another popular approach is to keep a single ~/envs directory with subdirectories for your different environments. And once you understand the basics of virtual environments, it’s worth checking out virtualenvwrapper, which automates virtual environment management.

If you want to remove your virtual environment permanently, delete the directory:
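# removes the venv environment created above; double-check the path first
rm -rf venv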

virtualenv also supports the creation of virtual environments with different Python versions:
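# for example, a Python 3 environment (python3 must be on your path)
virtualenv -p python3 venv3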

Note that whatever comes after -p is the path to an executable Python interpreter. On a Notebooks VM python2 and python3 are in the path and point to the latest versions compatible with the pre-installed packages, but to create an environment with a very specific version of Python you may need to install that version yourself.

To make your code portable, create a requirements.txt file from inside your environment with pip freeze and share it with your code.
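For example, from inside the activated environment:

pip freeze > requirements.txt

Anyone else (or a future you) can then recreate the environment with pip install -r requirements.txt.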

You may want a Jupyter kernel associated with your virtual environment. This lets you run notebooks and consoles from your virtual environment. Run this code in your virtual environment to install the ipykernel package and create the kernel for the jupyter user. KERNEL_NAME is the internal IPython name, but DISPLAY_NAME is what you’ll see in the interface:
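Something like the following should work, with KERNEL_NAME and DISPLAY_NAME set to whatever you like:

pip install ipykernel
python -m ipykernel install --user --name=$KERNEL_NAME --display-name="$DISPLAY_NAME"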

You’ll need to refresh the JupyterLab web page before notebook and console icons for the new kernel appear in the launcher.

Be aware of your TensorFlow flavor when using virtual environments. TensorFlow has separate packages for GPU vs. non-GPU. Installing either version into your virtual environment should work, but make sure to install the right package to match your configuration (GPU vs. no GPUs). This also means you should keep separate virtual environments for both GPU and non-GPU if you plan to switch between the two. Cloud AI Platform Notebooks should handle switching automatically in the default environment (and install the correct binaries and Nvidia stack as appropriate), but you’ll need to handle it yourself if you are managing your own environments.

Step 7: Explore Extensions

JupyterLab supports custom extensions (and themes) and provides detailed documentation to developers to promote the creation of new extensions. Extensions are a rapidly evolving aspect of the JupyterLab experience and are prone to instability. Other than the pre-installed extensions, they are not officially supported by AI Platform Notebooks. Also note that JupyterLab’s built-in extension manager is currently not available in AI Platform Notebooks.

There is a simple extension worth adding called go-to-definition which jumps to the definition of a variable or function if the definition is in the same file. This is especially useful when trying to work through long or disorganized notebooks or .py files. Note this extension is not officially supported, and like all 3rd party code you should review it before installing.

You will need an SSH connection for sudoer access (see Step 5). Once connected, run the following:
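A sketch, assuming the extension’s npm package name is @krassowski/jupyterlab_go_to_definition (check the project’s README for the current name):

sudo jupyter labextension install @krassowski/jupyterlab_go_to_definition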

After the extension is installed refresh the JupyterLab webpage. Then you can hold ALT in any notebook or .py file and click to jump to a definition in the same file. ALT+O jumps backwards from the definition to where you first clicked.

There are many more extensions available, in various states of completion. There is also a tutorial and excellent documentation if you are interested in making your own extensions or fully customizing your theme.

Step 8: Learn and Do More

Your Notebooks VM includes a tutorials folder in the home folder, which is a good place to keep exploring.

I want to thank Viacheslav Kovalevskyi for answering my technical questions about AI Platform Notebooks and championing an excellent product, and Wesley Turner and Praneet Dutta for reviewing my post and offering excellent feedback.
