Deploying ML models on Google Machine Learning Engine
Cloud ML Engine is a managed service to build and run machine learning models in the cloud.
In this article, I’ll walk you through the steps of serving and manually deploying a Tensorflow model to Google Cloud Machine Learning Engine.
I’m using the MNIST dataset to build a handwritten single-digit classification model. I won’t be discussing the model itself here; refer to this article for more details: https://www.tensorflow.org/tutorials/estimators/cnn
Preparing & Exporting the model
The MNIST model expects an image input of shape 28x28x1.
This means images need to be preprocessed before predictions can be made, and ideally that preprocessing should happen on the server side rather than on the client side.
This is where the serving input receiver function comes into play: TensorFlow’s tf.estimator.LatestExporter and export_savedmodel both expect a serving_input_receiver_fn parameter.
The role of this function is to tell the model what data to expect from the caller, which in our case is an encoded image.
It defines the inputs the exported model accepts; here the model expects a single string input named “image_bytes”.
The function then decodes that input and preprocesses it into the float tensor of shape 28x28x1 that the model expects. Without it, the model couldn’t make predictions on the encoded image, for two reasons: first, the image hasn’t been preprocessed into a tensor (you would otherwise have to do this on the client/caller side), and second, the model expects a float tensor while the provided input is an encoded string (the input has to be serialized before being sent over the web).
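A minimal sketch of such a function is shown below; the feature name “x” and the exact preprocessing are assumptions and have to match what your model saw during training:

import tensorflow as tf

def serving_input_receiver_fn():
    def decode_and_preprocess(image_bytes):
        # Decode the JPEG bytes into a single-channel image and scale to [0, 1]
        image = tf.image.decode_jpeg(image_bytes, channels=1)
        image = tf.image.resize_images(image, [28, 28])
        return tf.cast(image, tf.float32) / 255.0

    # The caller sends a batch of base64-encoded image strings; ML Engine
    # strips the base64 wrapper before the bytes reach the graph
    image_bytes = tf.placeholder(tf.string, shape=[None], name='image_bytes')
    images = tf.map_fn(decode_and_preprocess, image_bytes, dtype=tf.float32)

    return tf.estimator.export.ServingInputReceiver(
        features={'x': images},
        receiver_tensors={'image_bytes': image_bytes})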
The above function is then passed to tf.estimator.LatestExporter to export a model that is ready to be served, as shown below.
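A sketch of that wiring (the exporter name is arbitrary):

# LatestExporter re-exports a SavedModel from the most recent checkpoint
# every time evaluation runs; it is attached to the EvalSpec in the
# training code further down
exporter = tf.estimator.LatestExporter(
    name='mnist_export',
    serving_input_receiver_fn=serving_input_receiver_fn)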
To upload the model to Google Cloud Platform, we first need to create a storage bucket, which we’ll do next.
Create the Bucket
Let’s go ahead and create a project and a bucket to store our saved model:
- Log in to Google Cloud Platform at https://console.cloud.google.com, create a new project, and note your project ID
- Make sure your project is selected and navigate to Storage > Browser https://console.cloud.google.com/storage/browser
- Create a bucket with default settings (this can also be done from the command line, as shown below)
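If you prefer the command line, the same bucket can be created with gsutil; the bucket name and region below are placeholders (bucket names have to be globally unique):

gsutil mb -l us-central1 gs://your-mnist-bucket/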
Now we are ready to train our model and upload it to the bucket. That’s what we will do next.
Train the model
Since this is a small model and I wanted to explore serving a model that accepts images through Google Cloud ML Engine, the training was done in a Jupyter notebook; you can run it on your local machine or in the cloud.
The code below runs training and evaluation using the model and input functions defined previously:
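A minimal sketch of that step, assuming cnn_model_fn, train_input_fn and eval_input_fn from the linked tutorial, and reusing the exporter defined above:

import tensorflow as tf

# Build the estimator from the tutorial's model function
estimator = tf.estimator.Estimator(model_fn=cnn_model_fn, model_dir='mnist_model')

train_spec = tf.estimator.TrainSpec(input_fn=train_input_fn, max_steps=20000)

# Attaching the exporter makes every evaluation write a servable SavedModel
# under mnist_model/export/mnist_export/<timestamp>
eval_spec = tf.estimator.EvalSpec(input_fn=eval_input_fn, exporters=[exporter])

tf.estimator.train_and_evaluate(estimator, train_spec, eval_spec)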
Here I’m using the gsutil tool (“a Python application that lets you access Cloud Storage from the command line”) to upload the saved model to the bucket:
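Something along these lines, where the timestamped export directory and bucket name are placeholders for your own:

gsutil cp -r mnist_model/export/mnist_export/<timestamp> gs://your-mnist-bucket/mnist_model/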
Create the model
Creating the model is straightforward; we can do it via the browser or the gcloud command-line tool (a gcloud sketch follows the browser steps).
Follow the steps below to create the model via the browser:
- Navigate to the models page at https://console.cloud.google.com/mlengine/models and make sure your project is selected
- Click on New Model, give your model a name, and hit Create
- Select your model, click on “New Version”, enter the version details, and set the Model URI to the bucket folder that contains your saved model
- Finally, under Online prediction deployment, change scaling to Manual, set the number of nodes to 1, and click Create.
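The same can be done from the command line with gcloud; the model name, version, bucket path and runtime version below are placeholders:

gcloud ml-engine models create mnist_model --regions us-central1

# --origin must point to the folder that contains saved_model.pb
gcloud ml-engine versions create v1 \
    --model mnist_model \
    --origin gs://your-mnist-bucket/mnist_model/<timestamp> \
    --runtime-version 1.13 \
    --framework tensorflow \
    --python-version 3.5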
Test your model
Once your model is created and ready, serialize an image and create a JSON payload with the following format, as indicated here:
{'image_bytes': {'b64': base64.b64encode(jpeg_data).decode()}}
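For example, assuming the digit image is saved locally as digit.jpg:

import base64

# Hypothetical file name; any JPEG of a single handwritten digit works
with open('digit.jpg', 'rb') as f:
    jpeg_data = f.read()

# ML Engine interprets a {'b64': ...} value as base64-encoded binary data
instances = [{'image_bytes': {'b64': base64.b64encode(jpeg_data).decode()}}]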
Finally, run the predict_json function (source code) to get a prediction for your image; it accepts the project name, model ID, JSON content, and model version as parameters:
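For reference, the predict_json helper from Google’s online prediction samples looks roughly like this (it needs the google-api-python-client package and application default credentials):

import googleapiclient.discovery

def predict_json(project, model, instances, version=None):
    # Build a client for the Cloud ML Engine (ml, v1) REST API
    service = googleapiclient.discovery.build('ml', 'v1')
    name = 'projects/{}/models/{}'.format(project, model)
    if version is not None:
        name += '/versions/{}'.format(version)

    response = service.projects().predict(
        name=name,
        body={'instances': instances}
    ).execute()

    if 'error' in response:
        raise RuntimeError(response['error'])
    return response['predictions']

Calling it with the payload built above would look like predict_json('your-project-id', 'mnist_model', instances, version='v1'), where the project ID, model name and version are placeholders for your own.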
And that’s it!
You can check out the entire code (Notebook and Python code) for this post on GitHub.