Step by step guide to get started with Google Cloud Platform for Data Scientists

Vivi E
6 min read · Jun 25, 2018


Hi, this tutorial is meant for data scientists (or anyone) who want to start using Google Cloud Platform’s cloud storage (buckets) and GPUs.

Google Cloud Platform Account

Create an account

It is very easy to create a GCP account. Sign up for a Gmail account and you are ready to go. You can also use your existing Gmail account, if you have one.

GCP Signup for Free Trial

Click TRY IT FREE. The first time, you have to agree to GCP’s terms and conditions and provide the necessary details. Continue and finish the setup.

At some point in the registration, you will be asked to provide your credit card details. You can skip this part for now if you are certain that you do not need GPUs, or if you are still unsure what to do with your GCP account. By completing the registration, you agree to the free trial account and are entitled to 300 USD in free credits.

Free trial account

Here are some limitations of the free trial account that I find relevant to us at this point:

  • Free trial credits will expire 12 months after your GCP signup
  • Your project can only have at most 8 cores (virtual CPUs) running at the same time
  • You cannot request GPUs
  • Read more here for more FAQs on limitations

Hello, GCP Console

Upon successful GCP signup, you will be redirected to the console. The console serves as the dashboard for everything related to Google Cloud. Feel free to click TOUR CONSOLE if you need to.

Successful GCP signup

Create a project

All GCP requests from this point on will be made under a project, so we need to create one in order to proceed. For more information about creating and managing GCP projects, read up here.

Create project
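Once the Cloud SDK is installed (covered later in this tutorial), a project can also be created from the terminal. Here is a minimal sketch, assuming a hypothetical project ID `my-ds-project`. Project IDs must be globally unique, so yours will differ, and the helper name `create_and_select_project` is made up for illustration.

```shell
# Sketch: create a project and make it the default for later gcloud commands.
# "create_and_select_project" is a hypothetical helper name.
create_and_select_project() {
  project_id="$1"     # e.g. my-ds-project (must be globally unique)
  display_name="$2"   # human-readable name shown in the console
  gcloud projects create "$project_id" --name="$display_name" &&
    gcloud config set project "$project_id"
}

# Usage (uncomment to run):
# create_and_select_project my-ds-project "My DS Project"
```

Setting the default project means later `gcloud` and `gsutil` commands do not need the project passed explicitly.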

Graphics Processing Unit (GPU) Resource

GPUs are essential to a specific kind of machine learning called deep learning. I encourage you to read more about GPUs, deep learning, and TensorFlow to know if this is the right tutorial for you. Read more about the pricing details here.

Request for GPU quota

How to go to Quotas page

To use the GPU resources of GCP, you first need to request a quota increase (the GPU quota starts at 0).

Go to
> Navigation Menu
> Compute Engine
> Quotas

You will then be asked to direct yourself to the IAM & Admin Quotas page.

As you might notice, the page requires us to upgrade the account. The most important thing to know about upgrading from your free trial account is that once you exceed the free 300 USD credits, your credit card will be charged automatically. One consolation is that GCP computes your current bill on a daily basis, so you can check your bill often.

To upgrade your account, click the Upgrade account button. This will prompt you for your credit card details if you have not provided them yet. If you have already provided them, refresh the page and the upgrade notice should be gone.
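To get a feel for how far the 300 USD goes, a rough calculation helps. Below is a minimal sketch that uses a made-up hourly rate (real prices are on the pricing page) and a hypothetical helper name, `credit_hours`.

```shell
# Sketch: how many whole hours of a resource a given credit covers.
# Amounts are in US cents so that integer shell arithmetic is enough.
credit_hours() {
  credit_cents="$1"         # e.g. 30000 for 300 USD
  rate_cents_per_hour="$2"  # hypothetical hourly price; check the pricing page
  echo $(( credit_cents / rate_cents_per_hour ))
}

# Example with a made-up rate of 50 cents per hour:
credit_hours 30000 50   # prints 600
```

At that made-up rate, the credits would cover 600 hours, or 25 days of a resource left running around the clock.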

IAM & Admin Quotas page — Free trial account

The Quotas page lets you filter by Quota type, Service, Metric, and Location. Location matters because, although Google does a good job of virtualising its resources, each resource still corresponds to physical hardware installed on some Google server around the world, and not every location offers GPUs.

IAM & Admin Quotas page — Upgraded account
Quotas Page — Metric

Set the other filters to:

Quota type: All quotas
Service: Compute Engine API

In short, preemptible GPUs are offered at a cheaper rate, but the resource can be taken away from you when there is demand for it elsewhere. You can read more here.

I will request the NVIDIA K80 GPU because it is the cheapest and the most sensible choice for first-timers. The location is us-west1 because GPUs are not available in most regions in Asia.

Select the resource for quota increase

Enter the new quota limit and a request description

Quota request details

After you submit the request, the request ID will be shown to you. Wait for the corresponding email; processing might take days. Sometimes Google will ask you to make a payment when you request GPU resources. These payments are usually added to your credits for future use.

Request processed successfully

An email for a granted quota increase looks like the image below (the original quota request for this image was 4).

Quota for NVIDIA K80 GPU granted
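If the Cloud SDK is installed (see the gsutil section below), the granted quota can also be checked from the terminal. This is a minimal sketch, assuming the quota metric is named `NVIDIA_K80_GPUS` and that `gcloud compute regions describe` prints each quota as adjacent `limit`, `metric`, and `usage` lines. The helper name `show_k80_quota` is hypothetical.

```shell
# Sketch: print the quota entries for K80 GPUs in a region.
# Assumes the Cloud SDK is installed and a default project is set.
show_k80_quota() {
  region="$1"   # e.g. us-west1
  # Keep one line of context on each side of the metric name so that the
  # limit and usage values are shown too. The pattern also matches related
  # metrics whose names contain NVIDIA_K80_GPUS (such as preemptible ones).
  gcloud compute regions describe "$region" | grep -B1 -A1 "NVIDIA_K80_GPUS"
}

# Usage (uncomment to run):
# show_k80_quota us-west1
```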

You can now start using the GPU for whatever purpose you have, whether that is Google Cloud ML Engine or your own Compute Engine instance (with GPU).
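For the Compute Engine route, an instance with one K80 attached might be created along these lines. Treat this as a sketch under assumptions rather than a recipe: the instance name `gpu-box`, the zone `us-west1-b`, and the machine type `n1-standard-4` are illustrative choices, and GPU models and zones on offer change over time. The helper name `create_gpu_instance` is hypothetical.

```shell
# Sketch: create a Compute Engine instance with a single K80 GPU.
# Assumes the Cloud SDK is installed and the GPU quota has been granted.
create_gpu_instance() {
  name="$1"   # e.g. gpu-box (hypothetical instance name)
  zone="$2"   # e.g. us-west1-b (assumed zone inside us-west1)
  # GPU instances cannot live-migrate, so the maintenance policy
  # has to be TERMINATE.
  gcloud compute instances create "$name" \
    --zone="$zone" \
    --machine-type=n1-standard-4 \
    --accelerator=type=nvidia-tesla-k80,count=1 \
    --maintenance-policy=TERMINATE
}

# Usage (uncomment to run):
# create_gpu_instance gpu-box us-west1-b
```

Remember that a running instance keeps consuming credits, so stop or delete it when you are done.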

Cloud Storage (Buckets) Resource

Buckets are Google’s cloud storage. You will use them heavily with other GCP resources because they hold both the input and the output data. You can check the pricing here.

gsutil Tool

The gsutil tool is a Python application that lets you access cloud storage from the command line. You can read about it here.

To set it up, install the Cloud SDK from here. Make sure to download the version appropriate for your OS and local machine. You also need to have Python installed on your machine. I recommend that you get used to working with virtual environments, which you can read more about here.

Cloud SDK

Extract the google-cloud-sdk-xxxx.tar.gz file to any location. To replace an existing installation, delete the existing google-cloud-sdk directory and extract the archive to the same location.

To use the gsutil tool from anywhere, run the provided install script:
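The extraction can be done from the terminal with `tar`. Here is a minimal sketch, where the helper name `extract_sdk` and the example paths are hypothetical and the archive name is left as the placeholder used above.

```shell
# Sketch: unpack the SDK archive into a target directory.
extract_sdk() {
  archive="$1"   # path to the downloaded google-cloud-sdk-xxxx.tar.gz
  dest="$2"      # directory that will contain google-cloud-sdk/
  mkdir -p "$dest" && tar -xzf "$archive" -C "$dest"
}

# Usage (uncomment and adjust the paths to run):
# extract_sdk ~/Downloads/google-cloud-sdk-xxxx.tar.gz ~/Documents
```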

./google-cloud-sdk/install.sh
gsutil install script

Continue with the installation by answering the questions or accepting the default settings.

Successful gsutil installation

After the successful installation, my ~/.bash_profile now has some entries added to it:

# The next line updates PATH for the Google Cloud SDK.
if [ -f '/Users/vivi/Documents/google-cloud-sdk/path.bash.inc' ]; then source '/Users/vivi/Documents/google-cloud-sdk/path.bash.inc'; fi

# The next line enables shell command completion for gcloud.
if [ -f '/Users/vivi/Documents/google-cloud-sdk/completion.bash.inc' ]; then source '/Users/vivi/Documents/google-cloud-sdk/completion.bash.inc'; fi

For the changes to take effect, open a new terminal and use that one.
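In the new terminal, it is worth confirming that the tools are actually reachable before going further. This minimal sketch uses a hypothetical helper, `require_tool`, built on the standard `command -v`. After the check, `gcloud init` walks you through signing in and picking a default project.

```shell
# Sketch: report whether a command is available in the current shell.
require_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: not found (open a new terminal or re-run install.sh)"
    return 1
  fi
}

# Usage in a new terminal (uncomment to run):
# require_tool gcloud && require_tool gsutil && gcloud init
```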

Create a bucket from the Google Cloud console

> Google Cloud console
> Choose a project
> Navigation menu
> Storage
> Browser

> Create a bucket

> Fill in the details and create your bucket
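The same bucket can be created from the terminal with `gsutil mb` ("make bucket"). Below is a minimal sketch, assuming the placeholder bucket name `my-bucket-name` and the location `us-west1`. Bucket names are globally unique, so pick your own. The helper name `make_bucket` is hypothetical.

```shell
# Sketch: create a bucket in a given location.
make_bucket() {
  bucket="$1"     # globally unique bucket name, e.g. my-bucket-name
  location="$2"   # e.g. us-west1
  gsutil mb -l "$location" "gs://${bucket}/"
}

# Usage (uncomment to run):
# make_bucket my-bucket-name us-west1
```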

gsutil cp: from local terminal to bucket

gsutil cp path/to/local_file.txt gs://my-bucket-name/path/to/folder/
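Datasets usually come as whole directories rather than single files. This minimal sketch uploads a directory recursively; the helper name `upload_dir` and the example paths are hypothetical.

```shell
# Sketch: upload a local directory to a folder inside a bucket.
upload_dir() {
  src_dir="$1"   # local directory, e.g. path/to/local_dir
  bucket="$2"    # e.g. my-bucket-name
  prefix="$3"    # folder inside the bucket, e.g. path/to/folder
  # -m runs the transfers in parallel; -r copies the directory recursively.
  gsutil -m cp -r "$src_dir" "gs://${bucket}/${prefix}/"
}

# Usage (uncomment to run):
# upload_dir path/to/local_dir my-bucket-name path/to/folder
```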

gsutil cp: from bucket to local computer

gsutil cp gs://my-bucket-name/path/to/folder/file.txt path/to/local/

For more commands, check this out.
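Two commands that tend to be useful right away are `gsutil ls` for listing a folder and `gsutil rsync` for mirroring a folder to your machine. This is a minimal sketch; the helper names `list_folder` and `sync_down` and the example paths are hypothetical.

```shell
# Sketch: list the objects in a bucket folder, with sizes and timestamps.
list_folder() {
  gsutil ls -l "gs://$1/$2/"
}

# Sketch: mirror a bucket folder into a local directory.
sync_down() {
  # rsync expects the local destination directory to exist already.
  mkdir -p "$3" && gsutil -m rsync -r "gs://$1/$2" "$3"
}

# Usage (uncomment to run):
# list_folder my-bucket-name path/to/folder
# sync_down my-bucket-name path/to/folder path/to/local
```

Unlike `cp`, `rsync` only transfers files that are missing or changed, which saves time on repeated downloads.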
