Installing TensorFlow 2 with NVIDIA GPU on Google Cloud Instance
Artificial Intelligence, Deep Learning and Machine Learning are hot topics nowadays and will surely dominate the next few decades.
Whatever programming language you work in, you can choose from plenty of mature frameworks to develop and deploy your apps.
Artificial Intelligence and Machine Learning are growing just as energetically, and numerous libraries and models are being developed across the world.
Most ML developers face challenges while installing and setting up a Machine Learning platform. We created this guide while working on AI and ML projects at Quantrium, where we use TensorFlow 2 on Google Cloud Compute instances to train and serve our Machine Learning and AI solutions.
TensorFlow is an open source platform for Machine Learning maintained by Google. It provides a comprehensive and flexible ecosystem to develop and deploy your ML models.
Because Machine Learning involves vast amounts of data and is hungry for computation, using GPUs to train models has become necessary.
Thanks to cloud computing, you can try out various GPUs and pick the one that best suits your use case.
In this guide, we will walk you through the process of installing TensorFlow 2 on a Google Cloud Compute instance with an NVIDIA Tesla T4 GPU, running Ubuntu 18.04 LTS. I am assuming that you know how to create a GCP instance with a GPU.
First, download the driver for your NVIDIA GPU. Open NVIDIA's driver download page in your browser and select the details that match your GPU and operating system.
The site will then redirect you to a page with a download button; click it.
On the page that opens next, right-click the “Agree & Download” button and copy the URL.
You can ssh into your instance and execute the following commands.
$ sudo apt-get update
$ sudo apt-get upgrade
Go to your instance terminal, download the file and install the drivers as follows:
$ wget http://us.download.nvidia.com/tesla/418.126.02/NVIDIA-Linux-x86_64-418.126.02.run
$ sudo chmod +x NVIDIA-Linux-x86_64-418.126.02.run
$ sudo ./NVIDIA-Linux-x86_64-418.126.02.run
$ sudo reboot
Now let’s download and install NVIDIA package repositories and drivers.
$ wget https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
$ sudo apt-key adv --fetch-keys https://developer.download.nvidia.com/compute/cuda/repos/ubuntu1804/x86_64/7fa2af80.pub
$ sudo dpkg -i cuda-repo-ubuntu1804_10.1.243-1_amd64.deb
$ sudo apt-get update
$ wget http://developer.download.nvidia.com/compute/machine-learning/repos/ubuntu1804/x86_64/nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
$ sudo apt install ./nvidia-machine-learning-repo-ubuntu1804_1.0.0-1_amd64.deb
$ sudo apt-get update
$ sudo apt-get install --no-install-recommends nvidia-driver-430
$ sudo reboot
$ sudo apt-get update
$ sudo apt-get upgrade
Now let’s check our GPU information by running the following command, which is installed along with the NVIDIA driver.
$ nvidia-smi
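If you would rather script this check, the sketch below calls `nvidia-smi` (it ships with the NVIDIA driver) from Python and prints each GPU's name, driver version and total memory. The helper names are made up for illustration, and the minimum driver version of 418.39 for CUDA 10.1 is an assumption taken from NVIDIA's release notes, so treat this as a starting point rather than a definitive check.

```python
import subprocess

# Assumption: CUDA 10.1 needs a Linux driver of at least 418.39 according to
# NVIDIA's release notes; adjust this if you target another CUDA version.
MIN_DRIVER = (418, 39)


def parse_gpu_line(line):
    """Split one CSV line from nvidia-smi into (name, driver_version, memory)."""
    name, driver, memory = [field.strip() for field in line.split(",")]
    return name, driver, memory


def driver_is_new_enough(driver, minimum=MIN_DRIVER):
    """Compare the leading numeric parts of a driver version such as '430.50'."""
    parts = tuple(int(p) for p in driver.split(".")[:len(minimum)])
    return parts >= minimum


def query_gpus():
    """Ask nvidia-smi for every GPU's name, driver version and total memory."""
    result = subprocess.run(
        ["nvidia-smi",
         "--query-gpu=name,driver_version,memory.total",
         "--format=csv,noheader"],
        stdout=subprocess.PIPE, universal_newlines=True, check=True,
    )
    return [parse_gpu_line(l) for l in result.stdout.splitlines() if l.strip()]


if __name__ == "__main__":
    try:
        for name, driver, memory in query_gpus():
            status = "ok" if driver_is_new_enough(driver) else "too old for CUDA 10.1"
            print(f"{name}: driver {driver} ({status}), {memory}")
    except (FileNotFoundError, subprocess.CalledProcessError) as err:
        # Raised when the driver is missing or cannot talk to the GPU.
        print(f"nvidia-smi is not usable yet: {err}")
```

If the script reports that `nvidia-smi` is not usable, the driver installation above most likely did not complete, or the instance has not been rebooted since.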
Now let’s install development and runtime libraries.
$ sudo apt-get install --no-install-recommends cuda-10-1 libcudnn7=7.6.4.38-1+cuda10.1 libcudnn7-dev=7.6.4.38-1+cuda10.1
$ sudo apt-get install -y --no-install-recommends libnvinfer6=6.0.1-1+cuda10.1 libnvinfer-dev=6.0.1-1+cuda10.1 libnvinfer-plugin6=6.0.1-1+cuda10.1
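As a quick sanity check after this step, a small Python sketch like the one below can try to open the freshly installed libraries through the dynamic loader. The library file names are assumptions based on the package versions above (CUDA runtime 10.1, cuDNN 7, TensorRT 6), and a "NOT found" result may only mean that the library directory is not on the loader path yet.

```python
import ctypes

# Assumption: these are the shared-object names provided by the packages
# installed above (CUDA runtime 10.1, cuDNN 7, TensorRT 6).
EXPECTED_LIBRARIES = ["libcudart.so.10.1", "libcudnn.so.7", "libnvinfer.so.6"]


def can_load(soname):
    """Return True if the dynamic loader can find and open the library."""
    try:
        ctypes.CDLL(soname)
        return True
    except OSError:
        return False


if __name__ == "__main__":
    for soname in EXPECTED_LIBRARIES:
        print(f"{soname}: {'found' if can_load(soname) else 'NOT found'}")
```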
Let’s keep it clean: remove the obsolete and unnecessary libraries from your system by running the following command.
$ sudo apt autoremove
Let’s create a directory for our TensorFlow project and install the python virtual environment and TensorFlow 2 in it.
$ mkdir tfproject
$ cd tfproject/
$ sudo apt-get install python3-dev
$ sudo apt-get install python3-venv
$ python3 -m venv tfenv
$ source tfenv/bin/activate
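Before installing anything, it can help to confirm that the virtual environment is really active, so that packages do not end up in the system Python. The sketch below is one way to do that; the function name is hypothetical, and it relies on the fact that `venv` makes `sys.prefix` differ from `sys.base_prefix`.

```python
import sys


def in_virtualenv():
    """Return True when the interpreter runs inside a venv-created environment."""
    # In a venv, sys.prefix points at the environment while sys.base_prefix
    # still points at the system Python it was created from.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print("interpreter:", sys.executable)
    print("inside a virtual environment:", in_virtualenv())
```

With `tfenv` activated, the interpreter path should point inside the `tfenv` directory created above.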
Upgrading pip and installing setuptools are necessary before installing TensorFlow.
$ pip install --upgrade pip
$ pip install -U setuptools
$ pip install tensorflow
Let’s check that TensorFlow is installed properly and runs fine with the following command.
$ python -c "import tensorflow as tf;print(tf.reduce_sum(tf.random.normal([1000, 1000])))"
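The command above shows that TensorFlow imports and can run a computation, but not necessarily that it found the GPU. A minimal sketch for that extra check follows. It assumes TensorFlow 2.1 or later, where `tf.config.list_physical_devices` is available (TensorFlow 2.0 only offers it under `tf.config.experimental`), and the helper name is hypothetical.

```python
def visible_gpus(tf):
    """Return the names of the GPUs that the given TensorFlow module can see."""
    return [device.name for device in tf.config.list_physical_devices("GPU")]


if __name__ == "__main__":
    try:
        import tensorflow as tf
    except ImportError:
        print("TensorFlow is not installed in this environment")
    else:
        gpus = visible_gpus(tf)
        print("TensorFlow version:", tf.__version__)
        print("GPUs visible to TensorFlow:", gpus if gpus else "none")
```

If the list comes back empty even though the driver reports the GPU, the CUDA or cuDNN libraries installed earlier are the first thing to recheck.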
Hey! That’s wonderful, you have successfully configured your Machine Learning platform. Soon I will come up with a basic implementation of a Machine Learning application using this TensorFlow environment.