How to deploy your own ML model to GCP in 5 simple steps.

Daria Nguyen
Feb 26 · 7 min read

One day, you face the need to deploy a machine learning model on Google Cloud Platform (GCP). At first glance, you may assume that GCP is not super-friendly to external models built with sklearn or xgboost, because they are not “Google-native“. And even if you have to deploy a pre-trained TensorFlow model, you may expect to face some difficulties and/or compatibility problems between TensorFlow versions.

However, GCP is very friendly and you actually have very little work to do. In addition, GCP recently launched a beta version of the Continuous Evaluation tool, which lets you evaluate and track the performance of your model right after you deploy it. Just note that the Continuous Evaluation tool is still in beta. Also, if you want to track quality metrics before using the model in production, you can use MLflow in the meantime.

And now deploy your model to GCP in 5 simple steps:

Step 1: Package your model properly

Step 2: Create a Google Cloud Storage Bucket

Step 3: Upload your packaged model to a Cloud Storage Bucket

Step 4: Create an AI Platform Prediction Model Resource

Step 5: Create an AI Platform Prediction Version Resource

Finally, a few practical tips will help you make your model deployment as smooth and as pleasing as possible.

Step 1: Package your model properly

You have multiple options to do so. At the moment, if you deploy your model or pipeline using the procedure described here, GCP accepts the joblib, pickle or protobuf formats. That is to say, you have some freedom with regard to the choice of library/framework.

A few pieces of code in Appendix 1 will show you:

1.1 How to package your sklearn model

1.1.1 With joblib, also in case of the pipeline

1.1.2 With pickle

1.2 How to package your xgboost model with pickle

1.3 How to package your TF model as a file in protobuf format

GCP expects a model packaged as a single file, or as multiple files in the case of TensorFlow.

GCP expects nothing but files named model.*. This means you should always name your model file exactly model.joblib, model.pkl, etc.

Step 2: Create a Google Cloud Storage Bucket

As you can see, once packaged, your models are nothing but one or more files. To make them accessible to Google Cloud Platform resources, you should upload them to GCP and store them properly. The standard solution offered by GCP is to store your model files in buckets, so here you need to create one. An easy way to do so is the following gsutil command:

$ gsutil mb -l <region_name> gs://<bucket_name>

Keep in mind that all bucket names share a single global Google namespace, so your bucket name must be unique. For other details and instructions on bucket creation using the web UI, please refer to the simple instructions on Creating storage buckets.

Step 3: Upload your packaged model to the Cloud Storage bucket

3.1 In case you have built your model with TensorFlow 1.x

If your TensorFlow model is saved as a file in protobuf format, just copy the file into a dedicated bucket on GCP and proceed to the subsequent steps.

Alternatively, if you did not package your model as a single file in protobuf format, you should upload the whole directory, for example using gsutil. If your TF model was saved as a timestamped subdirectory of a base export directory, like your-model-export-dir/14879122020, you can upload the directory with the most recent timestamp as shown below:

SAVED_MODEL_DIR=$(ls ./your-model-export-dir | tail -1)
gsutil cp -r ./your-model-export-dir/$SAVED_MODEL_DIR gs://your-bucket

3.2 In case you already moved to TensorFlow 2.x

There are multiple ways to deploy your TensorFlow 2.x model:

  • Using Cloud AI Platform Prediction
  • Using a Compute Engine cluster with TF serving installed
  • Creating and using Cloud Functions

This article is focused on the use of Cloud AI Platform Prediction which corresponds to the case of TensorFlow 1.x (refer to the section above and the subsequent steps described below) and does not cover the cases based on “TF Serving” and “Cloud Functions”.

Step 4: Create an AI Platform Prediction model resource

An AI Platform Prediction model is a container for the versions of your machine learning model. It manages computing resources in the cloud to run the model, so that any app can request a model prediction.
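Assuming you have the gcloud CLI installed and authenticated against your project, creating the model resource takes a single command; the model name and region below are placeholder values:

```shell
# Create the model resource (name and region are placeholders)
gcloud ai-platform models create my_scoring_model \
    --regions=europe-west1
```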

Step 5: Create an AI Platform Prediction version resource

Versioning your model is a way to organize your work. It lets you develop and upgrade your model iteratively. Any time you improve or change your model, for example switching from RandomForest to XGBoost, you can keep multiple versions without having to change other application structures you may have built around your model.

Of course, depending on which framework/library your model employs, e.g. TensorFlow or sklearn, you should choose the appropriate value for the framework and the corresponding version as well.
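As a hedged sketch, assuming a joblib-packaged sklearn model was uploaded to gs://your-bucket/simple_scoring_model/ (the model name, bucket path and versions here are placeholders), the version resource can be created like this:

```shell
# Create a version pointing at the Cloud Storage directory
# that holds model.joblib (all names below are placeholders)
gcloud ai-platform versions create v1 \
    --model=my_scoring_model \
    --origin=gs://your-bucket/simple_scoring_model/ \
    --framework=scikit-learn \
    --runtime-version=1.15 \
    --python-version=3.7
```

The --framework and --runtime-version flags are where you declare the library and version your model was trained with, as discussed above.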

You are all set now! You can see your model deployed and ready for use.

And finally, here are a few tips to make sure your experience with model deployment on GCP is smooth and to help you avoid some bumps on this road:

TIP 1 — Model size is limited: The total file size of your model directory must be 500 MB or less if you use a legacy (MLS1) machine type or 2 GB or less if you use a Compute Engine (N1) machine type (beta).

Take a look at Machine types for online prediction.

TIP 2 — GCP AI Platform relies on you to take care of your model’s accessibility: If you’re using a bucket in a different project, you must ensure that your AI Platform Prediction service account can access your model in Cloud Storage. Without the appropriate permissions, your request to create an AI Platform Prediction model version will fail. See more about granting permissions for storage.

TIP 3 — Organize model versioning well: When you create subsequent versions of your model, organize them by placing each one into its own separate directory within your Cloud Storage bucket.

For example: when you started your project, you deployed a single version of a model to predict the score of an advertisement campaign, and you called your version simple_scoring_model. Later on, an additional version of the scoring model was demanded by the customer, and a new model got deployed under a folder advanced_scoring_model. Thus you can either maintain two versions of the application referring to the two different model versions, or easily switch between the two deployed versions without updating or resetting anything in your app infrastructure, access settings, etc.

TIP 4 — Check Google lib support for your region: When you are choosing the bucket location for your model, do not forget to review the availability of XGBoost or TensorFlow in various regions. Some regions may not have all the resources or library versions you need pre-installed! This Guide on GCP Regions will help you find what you are looking for. Also, pay attention to the availability of specific versions of the libraries/frameworks. For example, the highest version available at the moment for scikit-learn is 0.20.4, and you won’t be able to deploy a model trained with the most recent version 0.22.x by default. However, it is always possible to upload any additional packages you need during version creation under the “Custom code and dependencies“ section.

With all this, I wish you a happy deployment of your fantastic ML models on Google Cloud Platform.


Appendix 1

1.1 Package your sklearn model

1.1.1 With joblib (also in case of a pipeline)
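A minimal sketch, assuming scikit-learn is installed; the Iris data and the logistic-regression estimator are stand-ins for your own model:

```python
from joblib import dump, load
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# Plain estimator: train, then dump to a file named exactly "model.joblib"
clf = LogisticRegression(max_iter=1000).fit(X, y)
dump(clf, "model.joblib")

# A pipeline is packaged exactly the same way:
# the whole fitted object goes into one file
pipeline = Pipeline([
    ("scaler", StandardScaler()),
    ("clf", LogisticRegression(max_iter=1000)),
]).fit(X, y)
dump(pipeline, "model.joblib")

# Sanity check: the restored object predicts like the original
restored = load("model.joblib")
print((restored.predict(X) == pipeline.predict(X)).all())
```

Remember from Step 1 that the file name must be exactly model.joblib, not my_model.joblib or similar.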

1.1.2 With pickle
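The same idea with pickle from the standard library; again, the dataset and estimator are illustrative:

```python
import pickle

from sklearn.datasets import load_iris
from sklearn.svm import SVC

X, y = load_iris(return_X_y=True)
model = SVC(gamma="scale").fit(X, y)

# GCP expects the file to be named exactly "model.pkl"
with open("model.pkl", "wb") as f:
    pickle.dump(model, f)

# Round-trip check: the unpickled model behaves like the original
with open("model.pkl", "rb") as f:
    restored = pickle.load(f)
assert (restored.predict(X) == model.predict(X)).all()
```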

1.2 Package XGBoost model

1.3 Package your TensorFlow Model

Once created and trained, a typical TensorFlow model contains four files:

  1. model-ckpt.meta: contains the complete serialized MetaGraphDef protocol buffer, which describes the data flow, annotations for variables, input pipelines, and other relevant information.
  2. model-ckpt.data-00000-of-00001: contains the actual values of all variables (weights, biases, placeholders, gradients, hyper-parameters, etc.).
  3. model-ckpt.index: an immutable table (tensorflow.table.Table). Each key is the name of a tensor and its value is a serialized BundleEntryProto, which describes the metadata of a tensor.
  4. checkpoint: all checkpoint information.

Once we know where all our model files are located, we can package the model. For the example below, assume the model is saved in the ./my_model/ directory.

Note that for my example I created and trained a classifier for the Titanic dataset and saved the model right after training, using a tf.train.Saver.
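Below is a hedged sketch of that save step using the TF 1.x-style API (via tf.compat.v1, so it also runs under TensorFlow 2.x); the tiny graph is a placeholder for the actual Titanic classifier:

```python
import os

import tensorflow as tf

tf1 = tf.compat.v1
tf1.disable_eager_execution()

# Toy graph standing in for the trained Titanic classifier
x = tf1.placeholder(tf.float32, shape=[None, 4], name="x")
w = tf1.get_variable("w", shape=[4, 1])
logits = tf1.matmul(x, w, name="logits")

# Saver.save does not create parent directories, so make one first
os.makedirs("my_model", exist_ok=True)

saver = tf1.train.Saver()
with tf1.Session() as sess:
    sess.run(tf1.global_variables_initializer())
    # Writes model-ckpt.meta, model-ckpt.index, model-ckpt.data-*
    # and the "checkpoint" file into ./my_model/
    saver.save(sess, "./my_model/model-ckpt")
```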

Daria Nguyen is a Data Scientist/Data Engineer at Publicis Sapient Engineering. Publicis Sapient Engineering (formerly Xebia) is the tech community of Publicis Sapient, the digital transformation arm of the Publicis group.
