All things GCP: Machine Learning Decision Pyramid

Understand which Google Cloud tools are the best match for you.

Gaurav Chauhan
Analytics Vidhya


Google Cloud ML Decision Pyramid

Note: this article is inspired by a tweet from Sara Robinson. Google Cloud Platform has also uploaded a short 5-minute video explanation on YouTube.

This article will explain the tools and their usefulness in detail.

Introduction

Most cloud providers offer numerous tools and services that help users get started with their development process. These tools span many levels of abstraction to suit different user preferences, but the sheer number of options can also be overwhelming.

Many users find it hard to pick the best tools for their needs, and this is most visible when it comes to Machine Learning.

Many users want Machine Learning in their products, and cloud providers like GCP, AWS, Azure, etc. know this. Because of this demand, they provide machine learning tools out of the box, matched to the user's needs and expertise, so that users can work on complex Machine Learning tasks with a few lines of code or no code at all!

Understanding Google Cloud Platform ML tools based on expertise

For this article I have selected GCP, and why not? From an ML tools perspective, Google Cloud has some of the most useful ML toolkits, and they support a wide range of expertise.

Let’s jump into understanding each stack starting from the bottom.
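As a rough map of the sections that follow, the pyramid can be written down as a simple lookup from expertise level to tools. This is only an illustrative sketch: the level keys and the `recommend_tools` helper are names made up for this example and are not part of any GCP API. The tool names are the ones covered in this article.

```python
from typing import Dict, List

# Illustrative sketch only: the keys and the helper name are hypothetical,
# chosen for this example. The tool lists mirror the sections of this article,
# ordered from the bottom of the pyramid to the top.
PYRAMID: Dict[str, List[str]] = {
    "data_scientist": [
        "ML frameworks on Compute Engine",
        "Deep Learning VM Images",
        "Kubeflow",
        "Cloud ML Engine",
    ],
    "some_data_understanding": ["BQML"],
    "little_ml_understanding": ["AutoML", "ML APIs"],
}


def recommend_tools(level: str) -> List[str]:
    """Return the tools this article lists for a given expertise level."""
    try:
        return PYRAMID[level]
    except KeyError:
        expected = ", ".join(sorted(PYRAMID))
        raise ValueError(f"unknown level {level!r}; expected one of: {expected}") from None


if __name__ == "__main__":
    for level in PYRAMID:
        print(level, "->", ", ".join(recommend_tools(level)))
```

Keeping the mapping in plain data makes it easy to adjust as the product line-up changes.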

* ML that requires Data Scientists

ML Frameworks

  1. Frameworks like TensorFlow, PyTorch, scikit-learn, XGBoost, etc. have been popular with Data Scientists for quite a while.
  2. These frameworks provide out-of-the-box code/functions for building Machine Learning models or Neural Networks in a few lines of code.
  3. Good documentation for the most popular frameworks, along with many tutorials, has made them the go-to stack for Data Scientists.
  4. If needed, you can also easily modify the code and customize the framework to your needs.

Most popular GCP services to support these frameworks

Google Compute Engine

  1. You can create your own Virtual Machine Instance and launch a system as per your needs.
  2. You get granular control over your OS, its version, the Python version, which frameworks you need, etc.

To get started

Google Compute Engine: starting an instance
GCP Compute Engine: creating a new instance

Deep Learning VM Images

  1. You get pre-installed frameworks, with Python or R environments already configured.
  2. You can select VM images based on your frameworks or environments.
  3. As with Compute Engine, you can specify the number of cores and the amount of RAM for your instance.
  4. If needed, GPU support and JupyterLab are also provided out of the box.
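For readers who prefer the command line to the console, creating a Deep Learning VM can be sketched as a `gcloud compute instances create` call. The Python helper below only assembles the argument list and does not run anything. The instance name, zone, and machine type are placeholders, and the image family `tf-latest-cpu` is an assumption: family names change over time, so check the current Deep Learning VM image list before using it.

```python
import shlex
from typing import List, Optional


def deep_learning_vm_command(
    name: str,
    zone: str = "us-central1-a",          # placeholder zone
    machine_type: str = "n1-standard-4",  # placeholder: sets cores and RAM
    image_family: str = "tf-latest-cpu",  # assumed family name; verify before use
    gpu_type: Optional[str] = None,       # e.g. "nvidia-tesla-t4"
    gpu_count: int = 1,
) -> List[str]:
    """Assemble (but do not run) a gcloud command for a Deep Learning VM."""
    cmd = [
        "gcloud", "compute", "instances", "create", name,
        f"--zone={zone}",
        f"--machine-type={machine_type}",
        f"--image-family={image_family}",
        # Deep Learning VM images are published in this project.
        "--image-project=deeplearning-platform-release",
    ]
    if gpu_type:
        cmd += [
            f"--accelerator=type={gpu_type},count={gpu_count}",
            # GPU instances cannot live-migrate, so they must terminate on maintenance.
            "--maintenance-policy=TERMINATE",
            # Ask the image to install the NVIDIA driver on first boot.
            "--metadata=install-nvidia-driver=True",
        ]
    return cmd


if __name__ == "__main__":
    # Print the command so it can be reviewed before being run by hand.
    print(shlex.join(deep_learning_vm_command("my-dl-vm")))
```

Passing `gpu_type="nvidia-tesla-t4"` adds the accelerator flags; without it the command describes a CPU-only instance.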

To get started

Deep Learning VM images — Creation of the new instance with configurations

Kubeflow

  1. This is mostly used when we need to deploy our trained models to production.
  2. Because the configuration of our local environment differs from that of the production environment, many ML tasks break when we move to production to deploy and serve our ML model.
  3. Developers had already faced this issue in applications other than ML, and Kubernetes was created to address it.
  4. Because of the demand for an orchestration platform like Kubernetes for ML, Kubeflow was created.

To get started

Kubeflow UI

Cloud ML Engine

  1. Similar to Deep Learning VM Images, GCP has AI Notebooks: a one-click, easy-to-deploy AI environment with a notebook.
  2. An AI Notebook can seamlessly pull data from BigQuery, use Cloud Dataproc to transform it, and leverage AI Platform services or Kubeflow for distributed training and online prediction.
  3. You can run a Notebook instance on a container of your choice.

To get started

AI Notebook — One-click access to JupyterLab

* Some understanding of the data

BQML

  1. Say you have the data (if not, BigQuery has plenty of public datasets to play with) and you have loaded it into BigQuery. Now you want to perform ML on that data, but all you know is basic SQL. This is where BQML comes to the rescue.
  2. Inside BigQuery we have BQML (BigQuery Machine Learning), which lets you train ML models and run predictions from BigQuery itself.
  3. Sample code to train an ML model looks like this:

https://gist.github.com/2796gaurav/1663d6f7a2b3e366c256d0b2efa84508

  4. You can evaluate your model's metrics and performance and, if all looks good, use it for prediction inside the BigQuery console.

To get started
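In case the linked gist is not at hand, here is a minimal sketch of the three BQML steps described above (train, evaluate, predict). It is not necessarily the same query as in the gist. The helper only builds SQL text; the dataset, table, and label names are hypothetical, and `logistic_reg` is just one of the model types BQML supports. To actually run the statements you would paste them into the BigQuery console or send them through a client library, neither of which is done here.

```python
from typing import Tuple


def bqml_statements(model: str, table: str, label: str) -> Tuple[str, str, str]:
    """Build BQML SQL for training, evaluation, and prediction (text only)."""
    # Train: the model is created from the rows returned by the SELECT.
    train = (
        f"CREATE OR REPLACE MODEL `{model}`\n"
        f"OPTIONS(model_type='logistic_reg', input_label_cols=['{label}']) AS\n"
        f"SELECT * FROM `{table}`"
    )
    # Evaluate: with no table given, BigQuery uses the data it held out in training.
    evaluate = f"SELECT * FROM ML.EVALUATE(MODEL `{model}`)"
    # Predict: feed rows without the label column and read back the predictions.
    predict = (
        f"SELECT * FROM ML.PREDICT(MODEL `{model}`,\n"
        f"  (SELECT * EXCEPT({label}) FROM `{table}`))"
    )
    return train, evaluate, predict


if __name__ == "__main__":
    # Hypothetical names, for illustration only.
    for sql in bqml_statements("my_dataset.my_model", "my_dataset.my_table", "label"):
        print(sql, end="\n\n")
```

Everything stays in SQL, which is the point of this tier of the pyramid: no separate training environment is needed.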

BigQuery training using SQL

* Not much understanding at all

AutoML

  1. In AutoML you just provide the data on which you want to perform ML, and that's all!
  2. AutoML removes even the complications of BQML and provides state-of-the-art ML models that you can use for your dataset.
  3. AutoML has many tools in its bucket, such as:

Vision: derive vision detection features within applications, including image labeling, face and landmark detection, optical character recognition (OCR), and tagging of explicit content.

Video Intelligence: classify shots and segments in your videos according to your own defined labels, or track multiple objects in shots and segments.

Natural Language: build and deploy custom machine learning models that analyze documents by categorizing them, identifying entities within them, or assessing attitudes within them.

Translation: create your own custom translation models.

Tables: build and deploy state-of-the-art machine learning models on structured data at massively increased speed and scale.

To get started

How AutoML works

ML APIs

  1. These are APIs provided by Google that return ML predictions when given appropriate input.
  2. Some of the available ML APIs are shown in the image below.

To get started

Some GCP ML APIs
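To give a feel for what "appropriate input" looks like, here is a minimal sketch of the JSON body for a label-detection request to the Cloud Vision API's `images:annotate` REST method. The `build_label_request` helper is a hypothetical name for this example, and the image bytes are a stand-in. The code only builds the payload; actually sending it requires an API key or OAuth token, which is left out.

```python
import base64
import json

# REST endpoint of the Cloud Vision API's annotate method.
VISION_ENDPOINT = "https://vision.googleapis.com/v1/images:annotate"


def build_label_request(image_bytes: bytes, max_results: int = 5) -> str:
    """Return the JSON body for a Vision API label-detection request."""
    body = {
        "requests": [
            {
                # Image bytes travel base64-encoded inside the JSON body.
                "image": {"content": base64.b64encode(image_bytes).decode("ascii")},
                # Each feature names one kind of prediction to run on the image.
                "features": [{"type": "LABEL_DETECTION", "maxResults": max_results}],
            }
        ]
    }
    return json.dumps(body, indent=2)


if __name__ == "__main__":
    # Stand-in bytes; a real call would read an actual image file instead.
    print("POST", VISION_ENDPOINT)
    print(build_label_request(b"not-a-real-image"))
```

No model is trained or hosted by the caller here: the request carries the input, and the prediction comes back in the response.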

Conclusion

As more and more development and research is done in the ML/AI space, more people are interested in taking advantage of it.

And as people from different backgrounds become interested in this field, there is no doubt that ML service providers will build ever simpler yet “sophisticated” tools that appeal to a wide range of users.

If we take a step back from all of this, we can also wonder: if Machine Learning is becoming simpler day by day, there is a possibility that companies, organizations, and individuals will stop relying on Data Scientists to build ML models, and the “hype” this sector has created could burst.

This is just speculation, and even if it is true, we are still very far from relying completely on a computer for ML tasks… or are we?

For any clarification, comment, help, appreciation, or suggestion, just post in the comments and I will help you out.

If you liked it, you can follow me on Medium.

Also, you can connect with me on social media if you want more content on Data Science, Artificial Intelligence, or Data Pipelines.
