Manage and Set Up Google Cloud VM Instances Effectively for High-End Deep Learning

In this post I will go over how to cost-effectively manage and optimize your Google Compute Engine VM instances for high-end deep learning tasks.

If you are hungry for GPU resources, want to participate in Kaggle and other competitions, and do not have a big budget to spend on multiple GTX 1080 Ti cards, Google Compute Engine is a good option.

Check the following blog for the cost of a single 1080 Ti machine, which won't get you too far if you are planning to follow the caviar approach to deep learning of training multiple models in parallel.

I use multiple Google Compute Engine VM instances: a few with CPUs only, and a few with GPUs and TPUs as well.

I will go over a very basic setup that can keep your monthly bill below $200 or so on average if you are a beginner.

Steps:

  1. Create one CPU VM instance
  2. Create one GPU VM instance
  3. Set up the CPU instance and the GPU instance
  4. Code on the CPU instance and train on the GPU instance

Details:

  1. Create a VM instance. Here I am showing a config with 4 vCPUs, a 100 GB standard persistent disk, and Ubuntu 16.04.

For more info, refer to the link below.
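If you prefer the command line to the console, the CPU instance described above could be sketched with gcloud roughly like this. The instance name dl-cpu, the zone, the n1-standard-4 machine type (4 vCPUs) and the ubuntu-1604-lts image family are my assumptions, not values taken from the original setup, and the image family may no longer be offered. The function only prints the command, so you can review it before running it yourself.

```shell
# Sketch only: prints a gcloud command for the CPU instance instead of running it.
# Assumed (not from the post): instance name, zone, machine type, image family.

cpu_instance_cmd() {
  name="$1"   # hypothetical instance name, e.g. dl-cpu
  zone="$2"   # the CPU instance has no GPU constraint, so any zone works
  echo gcloud compute instances create "$name" \
    --zone="$zone" \
    --machine-type=n1-standard-4 \
    --image-family=ubuntu-1604-lts \
    --image-project=ubuntu-os-cloud \
    --boot-disk-size=100GB \
    --boot-disk-type=pd-standard
}

# Print the command for review; copy and paste it into a terminal to run it.
cpu_instance_cmd dl-cpu us-west1-b
```

Keeping the CPU instance in the same zone as the GPU instance is a convenience, not a requirement.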

2. Create one more VM instance, but this time use an SSD and also add GPUs.
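As a rough command-line sketch of this step, the function below prints a gcloud command for a GPU instance with an SSD boot disk. The instance name dl-gpu, the n1-standard-8 machine type, the 100 GB disk size and the image family are assumptions of mine; only the SSD and the attached GPU come from the step itself. GPU instances cannot live-migrate, which is why the maintenance policy is set to TERMINATE.

```shell
# Sketch only: prints a gcloud command for the GPU instance instead of running it.
# Assumed (not from the post): instance name, machine type, disk size, image family.

gpu_instance_cmd() {
  name="$1"        # hypothetical instance name, e.g. dl-gpu
  zone="$2"        # must be a zone that actually offers the chosen GPU
  gpu_type="$3"    # accelerator type, e.g. nvidia-tesla-v100
  gpu_count="$4"   # number of GPUs to attach
  echo gcloud compute instances create "$name" \
    --zone="$zone" \
    --machine-type=n1-standard-8 \
    --accelerator="type=$gpu_type,count=$gpu_count" \
    --maintenance-policy=TERMINATE \
    --image-family=ubuntu-1604-lts \
    --image-project=ubuntu-os-cloud \
    --boot-disk-size=100GB \
    --boot-disk-type=pd-ssd
}

# Print the command for review; copy and paste it into a terminal to run it.
gpu_instance_cmd dl-gpu us-west1-b nvidia-tesla-v100 1
```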

Tip: the zone is very important, because not every GPU is available in every zone.

Refer to the link below and choose your compute zone wisely. At the time of writing, the Tesla V100 is only available in the following zones: us-west1-a, us-west1-b, us-central1-a, us-central1-f, europe-west4-a, europe-west4-c, asia-east1-c
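To avoid picking a zone without the GPU you want, a small check like the one below can help. The zone list is copied from the list above and is only a snapshot, so treat the helper as a sketch; the function and variable names are made up for illustration. For live data, the accelerator-types command in the comment is the safer source.

```shell
# Zones listed in this post for the Tesla V100 (a snapshot; availability changes).
V100_ZONES="us-west1-a us-west1-b us-central1-a us-central1-f europe-west4-a europe-west4-c asia-east1-c"

# Succeeds (exit status 0) only if the given zone appears in the list above.
v100_zone_listed() {
  for z in $V100_ZONES; do
    [ "$z" = "$1" ] && return 0
  done
  return 1
}

if v100_zone_listed us-west1-b; then
  echo "us-west1-b: listed for V100"
fi

# To query current availability instead of relying on the snapshot:
#   gcloud compute accelerator-types list --filter="name=nvidia-tesla-v100"
```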

3. The following links should provide enough information about setting up your GPU machine with CUDA and cuDNN.

My combination: TensorFlow 1.10, CUDA 9, cuDNN 7.1.
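Once everything is installed, a quick sanity check on the GPU machine can save a failed training run. The sketch below assumes nvidia-smi and nvcc are on the PATH after the driver and CUDA install, and that TensorFlow is installed for the default python; the check_tool helper is a made-up name. tf.test.is_gpu_available() is the TensorFlow 1.x call for asking whether a GPU is visible.

```shell
# Report whether a command is available on this machine.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
  fi
}

check_tool nvidia-smi   # NVIDIA driver utility; run it to see the GPU and driver version
check_tool nvcc         # CUDA compiler; `nvcc --version` reports the CUDA release

# Ask TensorFlow itself whether it can see a GPU (TF 1.x API).
if command -v python >/dev/null 2>&1; then
  python -c 'import tensorflow as tf; print(tf.__version__, tf.test.is_gpu_available())' \
    2>/dev/null || echo "TensorFlow not importable from this python"
fi
```

If the last line prints a version followed by False, the usual suspects are a CUDA or cuDNN version that does not match what the TensorFlow build expects.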

4. Once you are done coding on the CPU machine and are ready to train, you can transfer files from one Compute Engine VM instance to another using gcloud compute scp.

Check the following Stack Overflow link for the steps to transfer files between instances.

Just make sure to log in with the same account (yourusername@gmail.com) on both the CPU VM instance and the GPU VM instance.

gcloud auth login

Now you can seamlessly transfer files from the CPU VM instance to the GPU VM instance, or vice versa.

gcloud compute scp hello.txt to-instance:/home/yourusername

gcloud compute scp --recurse dir to-instance:/home/yourusername

gcloud compute scp dir.tar to-instance:/home/yourusername
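Since the transfers go both ways, here is a small sketch that prints the scp command for each direction. The zone and the checkpoints directory name are assumptions for illustration; to-instance and /home/yourusername are the placeholders used in the commands above. Passing --zone explicitly avoids relying on a default zone when the two instances live in different zones.

```shell
# Prints gcloud compute scp commands for both directions instead of running them.
GPU_INSTANCE="to-instance"        # instance name placeholder used in this post
ZONE="us-west1-b"                 # assumed zone; must match the target instance
REMOTE_DIR="/home/yourusername"   # home directory placeholder used in this post

# push_cmd LOCAL_PATH: copy a file or directory to the GPU instance.
push_cmd() {
  echo gcloud compute scp --recurse "$1" "$GPU_INSTANCE:$REMOTE_DIR" --zone="$ZONE"
}

# pull_cmd REMOTE_NAME LOCAL_DIR: copy results back from the GPU instance.
pull_cmd() {
  echo gcloud compute scp --recurse "$GPU_INSTANCE:$REMOTE_DIR/$1" "$2" --zone="$ZONE"
}

push_cmd dir             # code going to the GPU instance
pull_cmd checkpoints .   # hypothetical results directory coming back

# Before copying, this prints the active account on each machine:
#   gcloud auth list --filter=status:ACTIVE --format="value(account)"
```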

Sharing a persistent disk between the two VM instances, with the condition that only one instance can run at a time, would have made life much easier, but it is not possible as of now; check the link below.

Happy coding! I hope this article helps you manage your cloud resources effectively.

Like what you read? Give Samrat Saha a round of applause.
