Serverless Machine Learning on GCP

Thursy Satriani
Google Cloud Platform by Cloud Ace
6 min read · Jan 2, 2020

#Serverless #Applied Machine Learning #GCP #BigQuery ML #DNN #TensorFlow #Keras

This article gives an overview of how to build a machine learning model in a serverless manner on GCP. It also briefly explains machine learning concepts and shows how to implement them using BigQuery Machine Learning or TensorFlow and Keras.

The following topics are covered in this article:

  1. Understand the Main Terms
  2. Build the Machine Learning Project
  3. Build an ML Model with BigQuery Machine Learning (BQML)
  4. Build an ML Model with TensorFlow and Keras

1. Understand the Main Terms

  • Serverless Concepts: Serverless means you don’t have to worry about provisioning compute instances to run your jobs. The services are fully managed, and you pay only for the resources you consume. It lets you write code your way without worrying about the underlying infrastructure.
  • Machine Learning: If you are familiar with Artificial Intelligence (AI), then Machine Learning (ML) is one of its subsets. ML is the scientific study of algorithms and statistical models that computer systems use to perform a specific task without explicit instructions, relying on patterns and inference instead. Deep Learning (DL), or Deep Neural Networks (DNN), is the subset of ML that uses networks with more layers and weighted connections. The relationship between AI, ML, and DL is pictured in the figure below.
AI, ML and Deep Learning
  • Google Cloud Platform (GCP): Google Cloud Platform, offered by Google, is a suite of cloud computing services that runs on the same infrastructure that Google uses internally for its end-user products, such as Google Search and YouTube. GCP is an ideal place to run machine learning workloads because machine learning needs a lot of on-demand compute resources and lots of training data.

2. Build the Machine Learning Project

Google Cloud Platform Dashboard

In this tutorial we aim to build an ML model using the NYC taxicab dataset. A GCP project is needed to build the model. If you don’t have one, you can sign up for free here.

First Step: Launch AI Platform

Navigate to AI Platform on the side menu bar and select Notebooks. If you are familiar with Jupyter Notebook or have been using Google Colab, these Notebooks use exactly the same concepts.

Menu Selection for Notebooks

Click New Instance and select TensorFlow 2.x without GPU. Wait a minute, then click Open JupyterLab to open the Notebook environment.

How to launch the Notebook instances

Second Step: Cloning from GitHub

JupyterLab Environment — Terminal option at the bottom

You can start writing your own code or clone a project from GitHub. To clone a repository from GitHub, open the Terminal and type the following command:

git clone REPO_PATH

For this project, you can clone Google’s training code from the GoogleCloudPlatform/training-data-analyst repository on GitHub.

This tutorial discusses two ways to build the ML model: using BQML, or using TensorFlow and Keras.

3. Build an ML Model with BigQuery Machine Learning (BQML)

In this project BQML is used for two things: first, to explore the dataset, create ML datasets, and create a benchmark; second, to build our first ML models.

Inside folder training-data-analyst

First Step: Deal with the dataset using BQML

To deal with the dataset in AI Platform, navigate to
training-data-analyst/quests/serverlessml/01_explore/solution
and open explore_data.ipynb.

Code for preparing the dataset can be found inside explore_data.ipynb. Clear the output by clicking the Clear button on the toolbar, then change the region, project, and bucket settings in the first cell to match your project. By clicking the Run button you will be able to see how to:

  • Access and explore a public BigQuery dataset on NYC Taxi Cab rides
  • Visualize your dataset using the Seaborn library
  • Inspect and clean up the dataset for future ML model training
  • Create a benchmark for judging future ML model performance

The data visualisation inside explore_data.ipynb
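
To make this concrete, here is a minimal sketch of the kind of exploration query the notebook runs, assuming the google-cloud-bigquery client library and an authenticated notebook environment (the notebook’s exact code may differ):

    # Sketch: query the public NYC Taxi dataset from a notebook cell.
    # Table and column names come from the public `nyc-tlc.yellow.trips` dataset.
    from google.cloud import bigquery

    client = bigquery.Client()  # uses the notebook's default project

    query = """
    SELECT
      pickup_datetime,
      pickup_longitude, pickup_latitude,
      dropoff_longitude, dropoff_latitude,
      passenger_count,
      fare_amount
    FROM `nyc-tlc.yellow.trips`
    WHERE fare_amount > 0
    LIMIT 10
    """

    df = client.query(query).to_dataframe()  # pull the results into a DataFrame
    print(df.head())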

Second Step: Build an ML model using BQML

BigQuery ML provides a fast way to build ML models on large structured and semi-structured datasets. To build our first models for taxifare prediction, navigate to
training-data-analyst/quests/serverlessml/02_bqml/solution
and open first_model.ipynb.

Clear the output by clicking the Clear button on the toolbar, then change the region, project, and bucket settings in the first cell to match your project. By clicking the Run button you will be able to see how to:

  • Train a model on raw data using BigQuery ML in minutes
  • Evaluate the forecasting model performance with RMSE
  • Create a second Linear Regression model with cleaned up data
  • Create a third model using a DNN
  • Evaluate model performance against our initial benchmark

RMSE resulting from the DNN with BQML

Root Mean Square Error (RMSE) indicates the accuracy of a model: RMSE = sqrt(mean((predicted - actual)^2)). The lower the RMSE, the better the model performs.
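
BigQuery ML models are created and evaluated with plain SQL, which a notebook can submit through the Python client. The following is a hedged sketch, not the notebook’s exact code: the serverlessml dataset and model name are assumptions, and the sampling clause only keeps the demo small.

    # Sketch: train and evaluate a BQML linear regression model from Python.
    # Assumes a BigQuery dataset named `serverlessml` already exists in your project.
    from google.cloud import bigquery

    client = bigquery.Client()

    create_model = """
    CREATE OR REPLACE MODEL serverlessml.model_linreg
    OPTIONS (model_type='linear_reg', input_label_cols=['fare_amount']) AS
    SELECT
      fare_amount,
      pickup_longitude, pickup_latitude,
      dropoff_longitude, dropoff_latitude,
      passenger_count
    FROM `nyc-tlc.yellow.trips`
    WHERE fare_amount > 0
      -- sample roughly 0.1% of rows so the demo trains quickly
      AND MOD(ABS(FARM_FINGERPRINT(CAST(pickup_datetime AS STRING))), 1000) = 1
    """
    client.query(create_model).result()  # blocks until training finishes

    evaluate = """
    SELECT SQRT(mean_squared_error) AS rmse
    FROM ML.EVALUATE(MODEL `serverlessml.model_linreg`)
    """
    print(client.query(evaluate).to_dataframe())

Swapping model_type='dnn_regressor' into the OPTIONS clause is how BQML builds the DNN variant mentioned in the list above.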

4. Build an ML Model with TensorFlow and Keras

First Step: Learn how to read large datasets using TensorFlow

First we need to build the data pipeline that will feed our Keras model, and then construct the model itself. To build the data pipeline, navigate to
training-data-analyst/quests/serverlessml/03_tfdata/solution
and open input_pipeline.ipynb.

Clear the output by clicking the Clear button on the toolbar, then change the region, project, and bucket settings in the first cell to match your project. By clicking the Run button you will be able to see how to (a minimal sketch follows the list below):

  • Use tf.data to read CSV files
  • Load the training data into memory
  • Prune the data by removing columns
  • Use tf.data to map features and labels
  • Adjust the batch size of our dataset
  • Shuffle the dataset to optimize for deep learning
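
As promised above, here is a minimal sketch of such an input pipeline. The column names match the dataset described earlier, but treat the details as assumptions rather than the notebook’s exact code:

    # Sketch: build a tf.data input pipeline that reads CSV files.
    import tensorflow as tf

    CSV_COLUMNS = ['fare_amount', 'pickup_longitude', 'pickup_latitude',
                   'dropoff_longitude', 'dropoff_latitude', 'passenger_count']
    DEFAULTS = [[0.0]] * len(CSV_COLUMNS)  # one default value per column
    LABEL_COLUMN = 'fare_amount'

    def features_and_labels(row):
        label = row.pop(LABEL_COLUMN)  # split the label out of the feature dict
        return row, label

    def create_dataset(pattern, batch_size=32, mode='train'):
        # make_csv_dataset globs the files, parses rows and batches them
        dataset = tf.data.experimental.make_csv_dataset(
            pattern, batch_size, CSV_COLUMNS, DEFAULTS)
        dataset = dataset.map(features_and_labels)
        if mode == 'train':
            dataset = dataset.shuffle(1000).repeat()  # shuffle only for training
        return dataset.prefetch(1)  # overlap preprocessing with training

    train_ds = create_dataset('gs://YOUR_BUCKET/taxi-train*.csv')  # path is illustrative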

Second Step: Create a Keras DNN and wide-and-deep model

The next step is to build the DNN model using Keras to predict the fare amount for NYC taxi cab rides. Navigate to
training-data-analyst/quests/serverlessml/04_keras/solution
and open keras_dnn.ipynb.

Clear the output by clicking the Clear button on the toolbar, then change the region, project, and bucket settings in the first cell to match your project. By clicking the Run button you will be able to see how to:

  • Use tf.data to read CSV files
  • Build a simple Keras DNN using its Functional API
  • Train the model using model.fit()
  • Predict with the model using model.predict()
  • Export the model and deploy it to Cloud AI Platform for serving

Visualisation of the DNN Model in Keras
The architecture of the DNN we build, with 465 trainable parameters

In this model there are four layers in the neural network. The first layer is the input layer, with 5 feature nodes. The second layer has 32 hidden nodes and 192 parameters to train; a parameter means a weight or a bias, so this is 5 × 32 weights plus 32 biases. The third layer has 8 hidden nodes and 264 parameters (32 × 8 + 8). It is connected to the output layer, which has a single output node and 9 parameters to train (8 + 1), giving 465 trainable parameters in total.
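
Here is a minimal Functional API sketch of exactly this architecture. The feature names are assumptions, and the notebook’s actual code may differ:

    # Sketch: the 5 -> 32 -> 8 -> 1 DNN described above, in the Keras Functional API.
    import tensorflow as tf

    INPUT_COLS = ['pickup_longitude', 'pickup_latitude',
                  'dropoff_longitude', 'dropoff_latitude', 'passenger_count']

    # one scalar input per feature, concatenated into a 5-node input layer
    inputs = {name: tf.keras.layers.Input(shape=(1,), name=name)
              for name in INPUT_COLS}
    x = tf.keras.layers.Concatenate()(list(inputs.values()))

    h1 = tf.keras.layers.Dense(32, activation='relu')(x)  # 5*32 + 32 = 192 params
    h2 = tf.keras.layers.Dense(8, activation='relu')(h1)  # 32*8 + 8  = 264 params
    out = tf.keras.layers.Dense(1)(h2)                    # 8*1  + 1  =   9 params

    model = tf.keras.Model(inputs=inputs, outputs=out)
    model.compile(optimizer='adam', loss='mse',
                  metrics=[tf.keras.metrics.RootMeanSquaredError(name='rmse')])
    model.summary()  # reports 465 trainable parameters, matching the figure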

Loss and RMSE of the model
The training parameters

The number of iterations (epochs) is set to only 5; increasing it is recommended. Another thing that can be done to improve the model is feature engineering. After training, the model can be deployed using the gcloud ai-platform command, which takes 5–10 minutes.

Code to deploy the ML model
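
Since the deployment code is shown as an image above, here is a hedged sketch of what a typical AI Platform deployment looks like. The model name taxifare, version v1, region, runtime version, and GCS path are all illustrative, not the notebook’s exact values:

    # Sketch: create a model resource, then deploy the exported SavedModel.
    gcloud ai-platform models create taxifare --regions us-central1

    gcloud ai-platform versions create v1 \
      --model taxifare \
      --origin gs://YOUR_BUCKET/taxifare/export/savedmodel \
      --runtime-version 2.1 \
      --framework tensorflow \
      --python-version 3.7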

Prediction can be done using the gcloud ai-platform predict command. Before making a prediction, the input should be written as a JSON file consisting of the 5 input features shown in the figure "Visualisation of the DNN Model in Keras". The output shows that the predicted fare will be $11.43.
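
As an illustration, a prediction request might look like the sketch below. The feature names are assumed to match the five training columns, and the taxifare model name carries over from the deployment sketch above:

    # Sketch: write one JSON instance per line, then request a prediction.
    echo '{"pickup_longitude": -73.98, "pickup_latitude": 40.76, "dropoff_longitude": -73.97, "dropoff_latitude": 40.75, "passenger_count": 2}' > input.json

    gcloud ai-platform predict --model taxifare --version v1 --json-instances input.json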
