How to Create a GPU-enabled VM on GCP to Train Your Neural Networks

Nadav Rosen
Published in Analytics Vidhya
12 min read · Aug 18, 2020
Credit: https://unsplash.com/@urielsc26

One of the biggest booms in deep learning came around 2009, when researchers began training neural networks on NVIDIA GPUs. Training on capable GPU(s) cuts training time dramatically, so being able to do so is practically a necessity. Unfortunately, even if a machine has capable GPU(s), the necessary software and packages must be installed before training can begin. When I first started training neural networks on cloud-based virtual machine (VM) instances, I had a difficult time installing the correct software and packages, and I spent far too much time before everything finally worked. The purpose of this guide is to help anyone who would like to train their neural networks on a cloud-based VM instance on the Google Cloud Platform (GCP). As you go through the guide, please do not skip the notes I have written. They contain important information.

Step 1: Initial GCP Account Setup

At the time of writing this guide, any new GCP customer gets a $300 credit, so I recommend creating a new Gmail account and linking it to your GCP account. The free trial lasts for 12 months from the moment you start using it. After you sign up for a new Gmail account, go to GCP (https://cloud.google.com/) and click on “Go to console”. Follow the prompts and enter the information needed to use GCP. You do need to enter a valid credit card number, but your card will not be charged unless you go over the $300 credit*.

*NOTE: Make sure that you do not over-utilize your GCP account, as the free credit can be gone very quickly.
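To get a feel for how quickly the credit can run out, here is a rough back-of-the-envelope sketch in Python. The hourly rate below is a made-up placeholder, not a real GCP price; use the hourly estimate shown on the VM creation page instead.

```python
def hours_of_credit(credit_usd, hourly_rate_usd):
    """Return how many hours a VM can stay on before the credit is used up."""
    if hourly_rate_usd <= 0:
        raise ValueError("hourly rate must be positive")
    return credit_usd / hourly_rate_usd


# Hypothetical rate of $0.50/hour (an assumption for illustration only).
hours = hours_of_credit(300, 0.50)
print(f"{hours:.0f} hours, or about {hours / 24:.0f} days of continuous use")
```

Plugging in the real hourly estimate for your machine gives a quick sense of how long you can leave it running.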

Step 2: Creating a VM Instance

To create a VM instance, click on the top left button in the GCP console page:

Go to “Compute Engine” and click on “VM Instances”:

In your initial setup, you will have to enable billing on your account, so click on “Enable Billing”:

After billing is enabled, click on “Create”:

The following screen will appear:

Here, you can see how much free trial credit you have remaining, along with the monthly and hourly cost estimates for the machine you are about to create. Configure the VM as you like, but remember that the more powerful the VM is, the more it will cost to use. In our case, let’s create a simple VM. To add a GPU to your machine, click on “CPU platform and GPU” to expand these options.**

**IMPORTANT NOTE: You do need to create some activity in your account to be able to create a machine with GPU(s) on it, as you need to submit a request to change your GPU quota in your account. To do so, simply create a VM or two and leave it turned on for a few days, so that there will be some activity on the account. You do not really need to use these instances. You only need them to be “turned on”. The account will incur some costs as a result, but these costs will be deducted from your free credit. To see how you can increase the GPU quota on your account, go to the bottom of the guide.

To add a GPU to your instance, simply click on “Add GPU”:

Select the GPU type and number you desire to use (each GPU has different merits, so please research the different types to find more information about them). For now, we will select “NVIDIA Tesla P4” and “1”.***

***NOTE: Depending on the type, you can choose up to 8 GPUs, but it may cause some issues with the training, as I have experienced before. Also, you must have more than 1 GPU in your quota to request more than 1 GPU.

Under “Boot disk”, click on “Change”:

Change the “Operating system” to “Ubuntu”, the “Version” to “Ubuntu 18.04 LTS”, and increase the size of the boot disk as you would like:
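If you prefer the command line to the console, the same choices can in principle be passed to the gcloud CLI. The Python sketch below only builds and prints such a command; it does not run it. The instance name, zone, machine type, and disk size are my own placeholder assumptions, and GPU availability varies by zone, so check everything against your own project before using the output.

```python
import shlex


def build_create_command(name, zone, machine_type, gpu_type, gpu_count, disk_gb):
    """Build a `gcloud compute instances create` command as a list of arguments."""
    return [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        f"--accelerator=type={gpu_type},count={gpu_count}",
        # GPU instances cannot live-migrate, so they must terminate on host maintenance.
        "--maintenance-policy=TERMINATE",
        "--image-family=ubuntu-1804-lts",
        "--image-project=ubuntu-os-cloud",
        f"--boot-disk-size={disk_gb}GB",
    ]


# Every value here is a placeholder (assumption) except the GPU type and count,
# which match the "NVIDIA Tesla P4" and "1" selected in this guide.
command = build_create_command(
    name="gpu-vm", zone="us-central1-a", machine_type="n1-standard-4",
    gpu_type="nvidia-tesla-p4", gpu_count=1, disk_gb=50,
)
print(" ".join(shlex.quote(arg) for arg in command))
```

Keeping the command as a list of arguments makes it easy to change a single option, such as the GPU count, without retyping the rest.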

The next step is only necessary if you would like to connect to your instance from an SSH client, such as PuTTY, which I highly recommend.

Step 3: Download and Install PuTTY

Download PuTTY from the following link: https://www.chiark.greenend.org.uk/~sgtatham/putty/latest.html

After you download and install PuTTY, you need to generate public and private keys. These will allow only your computer to connect to your GCP VM instance from an SSH client. To do so, run PuTTYgen. This program should be installed automatically when you install PuTTY. When you run PuTTYgen, the following screen will appear:

To generate a public/private key pair, click on “Generate”, then move your mouse around the blank area of the window for a few seconds. PuTTYgen uses this movement as a source of randomness for the key pair. When it finishes, your public key* will appear in the “Key” box:

Copy everything in the “Public key for pasting into OpenSSH authorized_keys file” box to an empty notepad. We will use it later. As for your private key, click on “Save private key” to save it.

*NOTE: If you would like to have a specific username to be used later, change the “Key comment” to whatever you like. Here, I changed it to “nrosen”.

Now, we go back to our VM instance setup screen. To enable connection via SSH client on your VM instance, click on “Management, security, disks, networking, sole tenancy” to expand these options:

Click on “Security”:

Copy and paste your public key from your notepad to the “Enter public SSH key” box:

Lastly, your “username” will appear to the left of the box where you pasted your public key. Please make a note of it, as it will be used later when you use PuTTY and WinSCP to connect to your GCP VM instance. Next, click on “Create”.
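As noted earlier, the username comes from the key comment at the end of the public key. The Python sketch below pulls that comment out of an OpenSSH-style key line and builds the “username@IP” string that PuTTY will need. The key material is a fake placeholder and the helper names are my own; treat whatever the GCP console displays as the authoritative username.

```python
import ipaddress


def key_comment(public_key_line):
    """Return the trailing comment of an OpenSSH public key line, or None."""
    parts = public_key_line.strip().split(None, 2)  # key type, base64 key, comment
    return parts[2] if len(parts) == 3 else None


def ssh_target(username, external_ip):
    """Build the 'username@IP' host string after validating the IP address."""
    ipaddress.ip_address(external_ip)  # raises ValueError on a malformed address
    return f"{username}@{external_ip}"


# Placeholder key (not a real key); "nrosen" is the key comment used in this guide.
line = "ssh-rsa AAAAB3NzaC1yc2EAAAABJQAAAQEA_placeholder nrosen"
print(ssh_target(key_comment(line), "34.70.66.96"))
```

Validating the IP address up front catches copy-and-paste mistakes before you try to connect.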

Once you have created your machine, you will be transferred to the following page:

If your machine was created successfully, you should see a green check mark next to the name of your VM instance. If not, you will see a red symbol with a short explanation of why creation failed. Please make a note of the external IP here, as we will need it to connect to our VM instance later.

Step 4: Connecting to Your VM Instance with PuTTY

Now, we are ready to connect to our VM with PuTTY. First, run PuTTY:

Second, expand the “SSH” tab under “Category” by clicking on the “+” sign next to it. Once it is expanded, click on “Auth”.

Here, click on “Browse” and select the private key you previously saved. After you are done, scroll back up to the “Session” tab and click on it. Enter the “username” from before, followed by “@” and the external IP of your VM instance under “Host Name (or IP Address)”, like so:

In my case, it is “nrosen@34.70.66.96”. To avoid doing this step again in the future, you can save this session by typing a session name under “Saved Sessions” (“GPU-GCP” here), and clicking on “Save”.* To load the session, click on “Load”.

*NOTE: Every time you create a new VM instance, the external IP address will change, so that will be the only item that needs to be changed to connect to your VM.

Once you click on “Open”, an SSH connection will be established. A PuTTY Security Alert pop-up will appear. Click on “Yes” to continue.

If a window similar to the one above appears, you have successfully connected to your VM instance using PuTTY! Great job. Now, we will continue with the most fun part of the guide, which is installing the proper software and packages to enable training neural networks using GPU(s).

Step 5: Installing the Necessary GPU Software and Packages

We are now ready to install the necessary software and packages to use the instance’s GPU(s) to train our neural networks! Good job on making it this far.

To do so, access your VM via PuTTY, and enter the following commands to install CUDA:

wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.0.130-1_amd64.deb

sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub

sudo dpkg -i cuda-repo-ubuntu1804_10.0.130-1_amd64.deb

sudo apt-get update

sudo apt-get install cuda-10-0
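Before moving on, it may be worth checking that the NVIDIA driver installed alongside CUDA can see the GPU. The sketch below is one way to do that from Python by calling nvidia-smi. The function name is my own, and a result of None only means that nvidia-smi was missing or could not reach the driver, not why.

```python
import shutil
import subprocess


def gpu_summary():
    """Return the GPU name, driver version and total memory, or None if unavailable."""
    if shutil.which("nvidia-smi") is None:
        return None  # driver tools are not installed or not on the PATH
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,driver_version,memory.total",
         "--format=csv,noheader"],
        stdout=subprocess.PIPE, stderr=subprocess.PIPE, universal_newlines=True,
    )
    if result.returncode != 0:
        return None  # nvidia-smi exists but could not talk to the driver
    return result.stdout.strip()


print(gpu_summary() or "No NVIDIA GPU detected")
```

The script avoids newer subprocess options so that it should also run on the Python 3 version that ships with Ubuntu 18.04.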

We also need to install cuDNN. To do so, go to https://developer.nvidia.com/rdp/cudnn-download and click on “Join now” to create a new account. After signing up, log in and access the link again. After clicking on “I Agree to the Terms of the cuDNN Software License Agreement”, click on “Download cuDNN v7.6.5 (November 5th, 2019), for CUDA 10.0” and download the following three files:

cuDNN Runtime Library for Ubuntu18.04 (Deb)

cuDNN Developer Library for Ubuntu18.04 (Deb)

cuDNN Code Samples and User Guide for Ubuntu18.04 (Deb)

Now, we need to transfer these files to our VM instance using WinSCP. First, download WinSCP (https://winscp.net/eng/download.php). After installing WinSCP, we need to take similar steps to the ones in step 4 of this guide to allow file transfer between our computer and our VM instance.

Run WinSCP. To log in to your VM instance, first click on “New Site”:

The “Host name” is the external IP of your VM instance, and the “User name” is the same as before. Click on “Advanced”. Under the “SSH” tab, select “Authentication”, click on the three dots under “Private key file”, and select the same private key as before. When you are done, click on “OK” and click on “Login” in the “Login” window.

If everything was done correctly, you will get a Warning window. Once you click on “Yes”, you will be connected to your VM instance. Now, you can transfer the cuDNN files you downloaded. The left pane is your computer and the right pane is your VM instance. To transfer the files, open the folder where they are stored on your computer and drag them to the pane on the right.

Depending on the speed of your connection, it may take a little bit of time to upload the files to the VM. Once you are done, you can go back to PuTTY and install cuDNN using the following commands:

sudo dpkg -i libcudnn7_7.6.5.32-1+cuda10.0_amd64.deb

sudo dpkg -i libcudnn7-dev_7.6.5.32-1+cuda10.0_amd64.deb

sudo dpkg -i libcudnn7-doc_7.6.5.32-1+cuda10.0_amd64.deb
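The three dpkg commands above expect the .deb files to be in the directory you run them from. If one of them fails with a missing-file error, a quick sketch like the following can confirm whether the upload actually landed. The function name is my own, and I am assuming the files were dragged into your home directory on the VM.

```python
from pathlib import Path

# File names taken from the dpkg commands in this guide.
CUDNN_DEBS = [
    "libcudnn7_7.6.5.32-1+cuda10.0_amd64.deb",
    "libcudnn7-dev_7.6.5.32-1+cuda10.0_amd64.deb",
    "libcudnn7-doc_7.6.5.32-1+cuda10.0_amd64.deb",
]


def missing_debs(directory):
    """Return the cuDNN package files that are not present in the directory."""
    folder = Path(directory)
    return [name for name in CUDNN_DEBS if not (folder / name).is_file()]


# Assumes the files were uploaded to the home directory.
missing = missing_debs(Path.home())
print("All cuDNN files found" if not missing else f"Missing: {missing}")
```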

To test cuDNN’s installation, run the following commands:

cp -r /usr/src/cudnn_samples_v7/ $HOME

cd $HOME/cudnn_samples_v7/mnistCUDNN

make clean && make

./mnistCUDNN

If everything was installed correctly, “Test Passed!” should appear after running the last command.

Great! Now, we can install the Python libraries needed to test whether your neural networks are trained using the GPU(s) of your VM instance. Ubuntu 18.04 already ships with Python 3, so we only need its package manager, pip. To install it, run the following:

sudo apt-get install python3-pip

To install tensorflow-gpu & matplotlib, run the following:

pip3 install tensorflow-gpu==2.0.0a0

pip3 install matplotlib
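Once the packages are installed, a short check like the one below can tell you whether TensorFlow itself sees the GPU before you run a full training script. This is a sketch with a function name of my own: it uses tf.config.list_physical_devices when the installed version provides it and otherwise falls back to tf.test.is_gpu_available(), which I am assuming is present in the TensorFlow release installed above.

```python
def tensorflow_sees_gpu():
    """Return True/False if TensorFlow can or cannot use a GPU, None if it is not installed."""
    try:
        import tensorflow as tf
    except ImportError:
        return None
    if hasattr(tf.config, "list_physical_devices"):  # newer TensorFlow releases
        return len(tf.config.list_physical_devices("GPU")) > 0
    return bool(tf.test.is_gpu_available())  # fallback for older releases


status = tensorflow_sees_gpu()
if status is None:
    print("TensorFlow is not installed")
else:
    print("GPU available to TensorFlow:", status)
```

If this prints False even though the GPU shows up in nvidia-smi, the CUDA or cuDNN versions may not match what your TensorFlow build expects.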

Next, let’s create a sample Keras script called “test.py”. To do so, run the following:

nano test.py

Here, you will paste a sample Keras script. I used one from the following website: https://www.machinecurve.com/index.php/2020/02/09/how-to-build-a-convnet-for-cifar-10-and-cifar-100-classification-with-keras/

Copy the full model code into “test.py”. To paste it, right-click inside the “test.py” editing window. Then press Ctrl+O and Enter to save the file, and Ctrl+X to exit nano.

Lastly, open a new session via PuTTY to monitor whether or not the GPU is being properly used when training the sample neural network script we created. After opening up a new session, run the following command:

nvidia-smi -l 1

If everything was installed correctly, the following should appear:

In your original session, run the following to start training your sample neural network:

python3 test.py

Check if the GPU is being used efficiently by switching back to the new PuTTY session we created. If everything went smoothly, you should see the following:

As you can see, a process called “python3” is using almost all of the GPU’s memory. Success! We were able to create a GPU-enabled VM instance on GCP that can be used to train neural networks. If “No running processes found” appears under “Processes”, something went wrong during installation. Here is a comparison between the training times of one epoch of the sample neural network we created:

Without a GPU, it takes approximately 1 minute for 1 epoch.

With a GPU, it takes less than 10 seconds for 1 epoch.
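To put those two numbers in perspective, here is the arithmetic as a tiny Python sketch. The 50-epoch run is a hypothetical figure of mine, and since a GPU epoch takes less than 10 seconds, the speedup below is a lower bound.

```python
def speedup(cpu_seconds, gpu_seconds):
    """Return how many times faster the GPU run is than the CPU run."""
    return cpu_seconds / gpu_seconds


def total_minutes(seconds_per_epoch, epochs):
    """Return the total training time in minutes."""
    return seconds_per_epoch * epochs / 60


epochs = 50  # hypothetical number of epochs, for illustration only
print(f"Speedup: at least {speedup(60, 10):.0f}x")
print(f"CPU: about {total_minutes(60, epochs):.0f} min, "
      f"GPU: under {total_minutes(10, epochs):.1f} min")
```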

That’s it! Please do not hesitate to post any questions you may have.

Supplemental information: GPU Quota Requests

To request a GPU quota increase, click on the button in the top left corner of your GCP screen, hover over “IAM & Admin”, and click on “Quotas”.

In the following screen, type “GPUs (all regions)” in “filter table” and press Enter:

To edit the quota, click on the box next to “Compute Engine API” and then “EDIT QUOTAS”:

A tab on the right side of the screen will appear. The name and email should be filled in automatically. You will only need to enter your phone number:

After clicking on “NEXT”, fill in the “New limit” (it is better to start low with 1 and increase it as needed later) and the “Request description” (something along the lines of wanting to train your neural network on a GPU-capable VM) in the following screen:

After clicking on “DONE”, your request will be submitted. You should get an approval or rejection email relatively quickly. The result depends on your account history. If you have not built up any history, it will be rejected, so I recommend creating a VM instance and leaving it turned on for a few days to show some activity in your account. If you do, do not forget about it, as you will be charged if your free trial credits run out.
