Serverless Transfer Learning with Cloud ML Engine and Keras

Image classification with deep learning is well known today, but you hear surprisingly few success stories about it. I suppose the reason is that it takes an enormous amount of data and time to achieve good results.

Transfer learning addresses this problem neatly.
Transfer learning is a technique that takes a pre-trained network, changes a small part at the end of it, and re-trains only the part that changed. The network keeps its ability to extract image features and can be fine-tuned for a specific classification task, while needing less time and less data to re-train. However, training and operating such a model still involves many troublesome steps, so this article shows you how to use transfer learning easily on Cloud ML Engine.

TL;DR. Read my notebook :)

Transfer Learning using Keras

Keras is one of the easiest ways to build a transfer learning model. As mentioned above, a transfer learning model uses a pre-trained network, and Keras already ships with some great pre-trained models.

Inception-v3 in Keras
Using a pre-trained model in Keras is quite easy; it takes only two lines:

from keras.applications.inception_v3 import InceptionV3
model = InceptionV3(weights='imagenet')

This model was pre-trained on the ImageNet dataset, which has over one million images in 1000 classes.

Inception-v3 diagram via Google Research Blog

Let’s classify the following two images with this model. Since the Inception-v3 model accepts a 299x299 RGB image as input, you must convert your image before classifying it. Keras also has helpful modules for this.

from keras.preprocessing import image
from keras.applications.inception_v3 import preprocess_input, decode_predictions
import numpy as np

# Make input data from Jpeg file
img_path = 'seagull.jpg'
img = image.load_img(img_path, target_size=(299, 299))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

# Classify image
preds = model.predict(x)

# Print predicted classes
for p in decode_predictions(preds, top=5)[0]:
    print("Score {}, Label {}".format(p[2], p[1]))
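As a rough illustration of what the preprocessing step does, the sketch below mimics the scaling of the Inception-v3 `preprocess_input` helper with plain NumPy, assuming it maps 8-bit pixel values from [0, 255] to [-1, 1]. The function name `scale_to_unit_range` and the random dummy image are made up for this example; in real code you would keep calling the Keras helper.

```python
import numpy as np

def scale_to_unit_range(batch):
    """Map pixel values in [0, 255] to [-1, 1] (assumed Inception-style scaling)."""
    batch = batch.astype(np.float32)
    return batch / 127.5 - 1.0

# Dummy 299x299 RGB image standing in for a decoded JPEG
dummy = np.random.RandomState(0).randint(0, 256, size=(299, 299, 3))
batch = np.expand_dims(dummy, axis=0)      # add the batch dimension
scaled = scale_to_unit_range(batch)

print(batch.shape)                 # (1, 299, 299, 3)
print(scaled.min(), scaled.max())
```

The batch dimension is needed because `predict` works on a batch of images, even when the batch holds a single image.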

Images to classify and results:

Elephant image:
Score 0.965535342693, Label Indian_elephant
Score 0.0246694963425, Label tusker
Score 0.000626200810075, Label African_elephant
Score 0.000182053816388, Label Mexican_hairless
Score 0.000138055766001, Label hippopotamus

Seagull image:
Score 0.278156191111, Label albatross
Score 0.0422638729215, Label drake
Score 0.0255650430918, Label goose
Score 0.0211290325969, Label red-breasted_merganser
Score 0.019902639091, Label lakeside

The model classified the elephant correctly but failed to classify the seagull. The reason is simply that the dataset used to train the model doesn’t include a “gull” class, so the model picked similar candidates instead. You never get a result from outside the list of trained classes, and that’s why transfer learning is needed.
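To make the point about the fixed label list concrete, here is a minimal sketch of a top-k lookup. It is not the Keras `decode_predictions` implementation; the helper `top_k`, the four-label list, and the rounded scores are illustrative stand-ins for the 1000 ImageNet labels.

```python
def top_k(scores, labels, k=5):
    """Return the k (label, score) pairs with the highest scores."""
    ranked = sorted(zip(labels, scores), key=lambda pair: pair[1], reverse=True)
    return ranked[:k]

# Hypothetical miniature label set; the real model has 1000 ImageNet labels
labels = ['albatross', 'drake', 'goose', 'lakeside']
scores = [0.278, 0.042, 0.026, 0.020]   # rounded from the seagull result

best = top_k(scores, labels, k=2)
print(best)
```

Whatever the input image is, the answer can only come from `labels`, so a class the model was never trained on can never appear.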

Visualize intermediate layer outputs
Before moving on to transfer learning, let’s visualize intermediate layer outputs. To show the list of layers, run the code below.

import pandas as pd
pd.DataFrame(model.layers).tail()

The output is as follows.

308  <keras.layers.merge.Concatenate object at 0x7f...
309 <keras.layers.core.Activation object at 0x7fb7...
310 <keras.layers.merge.Concatenate object at 0x7f...
311 <keras.layers.pooling.GlobalAveragePooling2D o...
312 <keras.layers.core.Dense object at 0x7fb7a1a5b...

We want to visualize the output of layer 311, GlobalAveragePooling2D, so let’s construct a model that outputs this intermediate layer.

from keras.models import Model

# The model which outputs intermediate layer features
intermediate_layer_model = Model(inputs=model.input,
                                 outputs=model.layers[311].output)

To extract features and visualize, run the following code.

features = intermediate_layer_model.predict(x)
pd.DataFrame(features.reshape(-1,1)).plot(figsize=(12, 3))

The output of GlobalAveragePooling2D is a 2048-dimensional feature vector. The Inception-v3 model classifies 1000 classes with the Dense layer at the end of the network, which takes these features as input. But now we would like to classify other classes, so let’s remove this layer and put another one in its place.
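For intuition, global average pooling can be sketched in NumPy as a mean over the spatial axes. The 8x8 spatial grid below is an assumption about the last convolutional block for a 299x299 input, and the random array only stands in for real feature maps.

```python
import numpy as np

# Assumed shape of the last convolutional block for one 299x299 input
feature_maps = np.random.RandomState(0).rand(1, 8, 8, 2048).astype(np.float32)

# Global average pooling: mean over the two spatial axes
pooled = feature_maps.mean(axis=(1, 2))

print(pooled.shape)  # (1, 2048)
```

Each of the 2048 values summarizes one feature map, which is why the plot of these features has 2048 points.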

Add Dense layers for fine tuning
Let’s add dense layers. If we want to classify two classes, the code would look something like this.

from keras.layers import Dense

# Connect Dense layers at the end
x = intermediate_layer_model.output
x = Dense(1024, activation='relu')(x)
predictions = Dense(2, activation='softmax')(x)

# Transfer Learning model
transfer_model = Model(inputs=intermediate_layer_model.input, outputs=predictions)
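What the two new layers compute can be sketched in plain NumPy. This is only an illustration of a Dense(1024, relu) layer followed by a Dense(2, softmax) layer; the weights are random, the input is a stand-in for the pooled features, and none of the names below come from Keras.

```python
import numpy as np

rng = np.random.RandomState(0)

def relu(z):
    return np.maximum(z, 0.0)

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)   # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Randomly initialised weights, shaped like Dense(1024) and Dense(2)
w1, b1 = rng.randn(2048, 1024) * 0.01, np.zeros(1024)
w2, b2 = rng.randn(1024, 2) * 0.01, np.zeros(2)

features = rng.rand(1, 2048)                 # stand-in for the pooled features
hidden = relu(np.dot(features, w1) + b1)
probs = softmax(np.dot(hidden, w2) + b2)

print(probs.shape)   # (1, 2)
```

The softmax output always sums to one, so the two numbers can be read as the confidence for each of the two classes.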

At this point, the model would train all of its variables. But we want to train only the dense layers we added, so let’s freeze the pre-trained layers.

# Freeze all layers
for layer in transfer_model.layers:
    layer.trainable = False

# Unfreeze last dense layers
transfer_model.layers[312].trainable = True
transfer_model.layers[313].trainable = True


Done! Now we can fine-tune this model for a dedicated two-class classification task.
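To get a feel for how little is left to train, here is a back-of-the-envelope count of the parameters in the two added dense layers, assuming each layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` is made up for this example.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

hidden_layer = dense_params(2048, 1024)   # Dense(1024) on the 2048 pooled features
output_layer = dense_params(1024, 2)      # Dense(2) on top of it
trainable = hidden_layer + output_layer

print(hidden_layer, output_layer, trainable)  # 2098176 2050 2100226
```

Only these roughly 2.1 million parameters are updated during fine-tuning; everything in the frozen layers stays as it was.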

Fine tuning for two-classes classification
Let’s classify the images below. The dataset, which I made and named the Opera-Capitol dataset, contains 100 images each of the Opera House and the Capitol. You can download the code used to make this dataset.

Load dataset
The dataset is compressed in NumPy format and stored on GitHub; you can load it as follows.

import requests
from io import BytesIO

url = ''
response = requests.get(url)
dataset = np.load(BytesIO(response.content))

X_dataset = dataset['features']
y_dataset = dataset['labels']

Let’s split the dataset into training and test sets; here I use 80% for training and 20% for testing.

from keras.utils import np_utils
from sklearn.model_selection import train_test_split

X_dataset = preprocess_input(X_dataset)
y_dataset = np_utils.to_categorical(y_dataset)
X_train, X_test, y_train, y_test = train_test_split(
    X_dataset, y_dataset, test_size=0.2, random_state=42)
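If you are curious what these two helpers do, the sketch below reproduces the idea with NumPy only: one-hot encoding of the labels and a shuffled 80/20 split. It is not the scikit-learn or Keras implementation, and the label layout (100 zeros followed by 100 ones) is an assumption made for the example.

```python
import numpy as np

# Assumed labels: 100 images of class 0 and 100 of class 1
labels = np.array([0] * 100 + [1] * 100)

# One-hot encoding, equivalent in spirit to to_categorical
one_hot = np.eye(2)[labels]

# Shuffle indices with a fixed seed, then hold out 20% for testing
rng = np.random.RandomState(42)
indices = rng.permutation(len(labels))
n_test = len(labels) // 5
test_idx, train_idx = indices[:n_test], indices[n_test:]

y_train, y_test = one_hot[train_idx], one_hot[test_idx]
print(y_train.shape, y_test.shape)  # (160, 2) (40, 2)
```

Fixing the seed keeps the split reproducible, which is the role `random_state=42` plays in the call above.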

By the way, what happens if the plain Inception-v3 model classifies this dataset? Let’s see how it goes.

x = X_dataset[0]
x = np.expand_dims(x, axis=0)

preds = model.predict(x)
for p in decode_predictions(preds, top=5)[0]:
    print("Score {}, Label {}".format(p[2], p[1]))

The first image of the dataset is the Opera House, and here is the result: wreck. Again, this is because the original ImageNet classes don’t include the Opera House.

Score 0.110657587647, Label wreck
Score 0.0671983659267, Label lakeside
Score 0.0309968702495, Label seashore
Score 0.0249739717692, Label breakwater
Score 0.0229569561779, Label fountain

Fine tuning the model for Opera-Capitol
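To train the transfer learning model, just call the fit function. After that, let’s evaluate how well the model predicts.

# The optimizer is an assumed, typical choice; compile after freezing layers
transfer_model.compile(loss='categorical_crossentropy', optimizer='adam',
                       metrics=['accuracy'])
transfer_model.fit(X_train, y_train, epochs=20,
                   validation_data=(X_test, y_test))
loss, acc = transfer_model.evaluate(X_test, y_test)
print('Loss {}, Accuracy {}'.format(loss, acc))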

The evaluation result for test data is..

Loss 0.112133163214, Accuracy 0.975

Accuracy reached 97.5%!
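As a quick sanity check on that number, the arithmetic below works out how many test images it corresponds to, assuming the dataset holds all 200 images (100 per class) and 20% of them ended up in the test set.

```python
n_images = 200                 # 100 Opera House + 100 Capitol (assumed total)
n_test = int(round(n_images * 0.2))
accuracy = 0.975               # reported by evaluate()

n_correct = int(round(accuracy * n_test))
print(n_test, n_correct, n_test - n_correct)  # 40 39 1
```

Under these assumptions the model got one test image wrong out of 40, so keep in mind that the test set is small.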

Training the model on Cloud ML Engine

Training the transfer learning model takes roughly ten to twenty minutes on a local machine without a GPU. On Cloud ML Engine it takes only about a minute (plus a few minutes for staging).

Making package
To train your model on Cloud ML Engine, you must first package your code. This example uses keras, h5py, and Pillow as external libraries, so you must list them in your setup.py.

from setuptools import setup
if __name__ == '__main__':
    setup(name='trainer', packages=['trainer'],  # placeholder package name
          install_requires=['keras', 'h5py', 'Pillow'])

Cloud ML Engine from Jupyter Notebook
Packaging the code, uploading it to Google Cloud Storage, and running an ML Engine job is really tiring. Don’t you want to just run your Jupyter Notebook code on ML Engine? I made an extension to do that!

Online Prediction on Cloud ML Engine

Now you want to serve your trained model, so how do you do that? Implement an HTTP server, set up TensorFlow or Keras on the server, configure load balancing, and so on. That’s tough work. On Cloud ML Engine, the only thing you need to do is upload your model to GCS (Google Cloud Storage). It serves your model, accepts prediction requests through a REST API, and of course scales automatically.

Build a graph that converts image
Since the Keras model accepts only a raw image array as input, we should convert the JPEG or PNG data to a raw image array inside the graph; otherwise the payload of the REST API request would be too big.

import tensorflow as tf

with tf.Graph().as_default() as g_input:
    input_b64 = tf.placeholder(shape=(1,), dtype=tf.string, name='input')
    input_bytes = tf.decode_base64(input_b64[0])
    image = tf.image.decode_image(input_bytes)
    image_f = tf.image.convert_image_dtype(image, dtype=tf.float32)
    input_image = tf.expand_dims(image_f, 0)
    output = tf.identity(input_image, name='input_image')

# Convert to GraphDef
g_input_def = g_input.as_graph_def()
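To see why sending the compressed file is attractive, here is a rough size comparison between a raw array serialized as JSON and a base64-encoded JPEG. The 30 KB JPEG size is a hypothetical figure, and using `0.5` for every pixel gives a lower bound on the JSON size because real float values print with more digits.

```python
import json
import math

height, width, channels = 299, 299, 3
n_values = height * width * channels            # 268203 numbers per image

# Lower bound for a JSON list of floats: "0.5, " is 5 characters per value
raw_json_bytes = len(json.dumps([0.5] * n_values))

# Hypothetical JPEG size; base64 encodes every 3 bytes as 4 characters
jpeg_bytes = 30 * 1024
b64_bytes = 4 * int(math.ceil(jpeg_bytes / 3.0))

print(n_values, raw_json_bytes, b64_bytes)
```

Even with this optimistic estimate the raw array is more than a megabyte of JSON, while the base64 string stays close to the size of the original file.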

Next we convert the Keras model to tf.GraphDef, so that we can connect the above graph.

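from keras import backend as K
from tensorflow.python.framework import graph_util

sess = K.get_session()

# Make GraphDef of Transfer Model
g_trans = sess.graph
g_trans_def = graph_util.convert_variables_to_constants(sess,
    g_trans.as_graph_def(),
    [transfer_model.output.name.replace(':0', '')])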

Here is the combined graph. It accepts a base64-encoded JPEG or PNG file and outputs the classification result from the transfer learning model.

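with tf.Graph().as_default() as g_combined:
    x = tf.placeholder(tf.string, name="input_b64")

    # Image converter graph: base64 string in, image tensor out
    im, = tf.import_graph_def(g_input_def,
        input_map={'input:0': x},
        return_elements=["input_image:0"])

    # Transfer learning graph: image tensor in, class scores out
    pred, = tf.import_graph_def(g_trans_def,
        input_map={transfer_model.input.name: im,
        'batch_normalization_1/keras_learning_phase:0': False},
        return_elements=[transfer_model.output.name])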

Before uploading the model to GCS, we must convert it to the SavedModel format. The following code converts the graph to SavedModel format and saves it to GCS directly.

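# Run the session on the combined graph so that x and pred belong to it
with tf.Session(graph=g_combined) as sess2:
    inputs = {"inputs": tf.saved_model.utils.build_tensor_info(x)}
    outputs = {"outputs": tf.saved_model.utils.build_tensor_info(pred)}
    signature = tf.saved_model.signature_def_utils.build_signature_def(
        inputs=inputs,
        outputs=outputs,
        method_name=tf.saved_model.signature_constants.PREDICT_METHOD_NAME)

    # save as SavedModel
    b = tf.saved_model.builder.SavedModelBuilder('gs://{BUCKET}/mdl')
    b.add_meta_graph_and_variables(sess2,
        [tf.saved_model.tag_constants.SERVING],
        signature_def_map={'serving_default': signature})
    b.save()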

Let’s register the model with ML Engine. Just type the following two commands. It may take a few minutes, so have a cuppa and it will be done.

gcloud ml-engine models create OperaCapitol
gcloud ml-engine versions create v1 \
--model OperaCapitol \
--runtime-version 1.2 \
--origin gs://{BUCKET}/mdl

Classify an image by Online Prediction

To classify an image with Online Prediction, just call its REST API. If you are making requests from Python, the discovery API library makes this much easier. Here is how to initialize the discovery API for Online Prediction.

from oauth2client.client import GoogleCredentials
from googleapiclient import discovery
from googleapiclient import errors

projectID = 'projects/{}'.format(PROJECTID)
modelName = 'OperaCapitol'
modelID = '{}/models/{}'.format(projectID, modelName)

credentials = GoogleCredentials.get_application_default()
ml = discovery.build('ml', 'v1', credentials=credentials)

Let’s classify an image. The converter graph doesn’t have any resizing function, so you must resize the image to 299x299 yourself. Also, don’t forget to encode the image as base64.

import base64
import json

# Read the raw bytes of the (already resized) image file
with open('opera.jpg', 'rb') as f:
    b64_x = f.read()

# URL-safe base64, decoded to text so it can be serialized as JSON
b64_x = base64.urlsafe_b64encode(b64_x).decode('ascii')
input_instance = dict(inputs=b64_x)
input_instance = json.loads(json.dumps(input_instance))
request_body = {"instances": [input_instance]}

request = ml.projects().predict(name=modelID, body=request_body)
try:
    response = request.execute()
except errors.HttpError as err:
    print(err)
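If you want to check the encoding step without calling the API, the following sketch builds the same kind of request body from stand-in bytes and verifies that it survives a JSON round trip. The byte string is a made-up placeholder, not a real image.

```python
import base64
import json

# Stand-in bytes; in practice this would be the contents of a JPEG or PNG file
image_bytes = b'\xff\xd8\xff\xe0' + bytes(bytearray(range(256)))

encoded = base64.urlsafe_b64encode(image_bytes).decode('ascii')
request_body = {"instances": [{"inputs": encoded}]}

# The body must survive JSON serialisation, and decoding must restore the bytes
payload = json.dumps(request_body)
restored = base64.urlsafe_b64decode(json.loads(payload)["instances"][0]["inputs"])

print(restored == image_bytes)            # True
print('+' in encoded or '/' in encoded)   # False
```

The URL-safe alphabet replaces `+` and `/` with `-` and `_`, which is the variant `tf.decode_base64` expects.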

Here is the response from Online Prediction. The “outputs” list holds the confidence for the Opera House and the Capitol, respectively.
99.7% for the Opera House, that’s correct!

{u'predictions': [
{u'outputs': [0.9974665641784668, 0.00253341649658978]}
]}
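Turning such a response into a readable label takes only a few lines. This is a small sketch; the `labels` list simply follows the order described in the text (Opera House first, Capitol second), and the response is pasted in by hand.

```python
# Response copied from above; the label order is the one stated in the text
response = {'predictions': [
    {'outputs': [0.9974665641784668, 0.00253341649658978]}
]}
labels = ['Opera House', 'Capitol']

for prediction in response['predictions']:
    scores = prediction['outputs']
    best = max(range(len(scores)), key=lambda i: scores[i])
    print('{} ({:.1%})'.format(labels[best], scores[best]))  # Opera House (99.7%)
```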

You can find the code of this article here;

Enjoy Cloud ML Engine!