Custom Machine Learning Model Training using Vertex AI and BigQuery

Amit Srivastava
Brillio Data Science
3 min read · Jun 28, 2022

Vertex AI is Google's machine learning platform that helps organizations develop, deploy, and manage their ML projects easily. Vertex AI supports both AutoML and custom-container-based training, giving engineers the flexibility to control training parameters. With Vertex AI we can train models on image, tabular, text, and video datasets.

BigQuery ML supports multiple machine learning models such as regression, classification, clustering, dimensionality reduction, and time series forecasting. It also offers features such as preprocessing, model creation, hyperparameter tuning, inference, evaluation, and model export.
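As a quick illustration of the CREATE MODEL feature, the snippet below assembles a BigQuery ML training statement for a logistic regression and runs it with the Python client. The project, dataset, table, and column names are placeholders, not from a real project:

```python
def build_create_model_sql(model_id: str, table_id: str, label_col: str) -> str:
    """Assemble a BigQuery ML CREATE MODEL statement for a logistic regression."""
    return (
        f"CREATE OR REPLACE MODEL `{model_id}` "
        f"OPTIONS (MODEL_TYPE = 'LOGISTIC_REG', INPUT_LABEL_COLS = ['{label_col}']) AS "
        f"SELECT * FROM `{table_id}`"
    )

if __name__ == "__main__":
    # Placeholder identifiers -- replace with your own project/dataset names.
    from google.cloud import bigquery  # requires google-cloud-bigquery
    sql = build_create_model_sql("my_project.my_dataset.churn_model",
                                 "my_project.my_dataset.churn_data", "churned")
    client = bigquery.Client()
    client.query(sql).result()  # blocks until the model finishes training
```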

Custom Model Training

Vertex AI has a notebook feature (Vertex AI Workbench) where we train and build our models. To perform custom training, we will use several Vertex AI components, BigQuery for the dataset, and Google Container Registry.

Vertex AI Components:

· Workbench

· Training

· Experiments

· Model

BigQuery ML Features:

· Datasets

· CREATE MODEL

· ML.PREDICT

Below, I will elaborate on the steps we took to explore the Vertex AI components and BigQuery ML features of GCP:

How to Perform Model Training

To experiment with these modules, we used open-source tabular data stored in a BigQuery table. The training code takes care of loading this data for model building.
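The original snippet was an image; a minimal sketch of the BigQuery table integration, assuming the `google-cloud-bigquery` client and placeholder project/dataset/table names, might look like this:

```python
from typing import Optional

def build_training_query(table_id: str, limit: Optional[int] = None) -> str:
    """Assemble the SELECT statement that pulls training rows from BigQuery."""
    query = f"SELECT * FROM `{table_id}`"
    if limit is not None:
        query += f" LIMIT {limit}"
    return query

if __name__ == "__main__":
    # Placeholder table id -- replace with your own.
    from google.cloud import bigquery  # requires google-cloud-bigquery
    client = bigquery.Client()
    query = build_training_query("my_project.my_dataset.training_data")
    df = client.query(query).to_dataframe()  # pandas DataFrame for model building
    print(df.shape)
```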

We then build the model with TensorFlow-based training code on a Vertex AI Workbench notebook and containerize it on top of a pre-built training container image that includes the common dependencies required for training.
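A hedged sketch of what the TensorFlow training code might look like — the architecture, feature count, and dummy data here are illustrative assumptions, not the exact code we used. The one Vertex AI-specific detail is the `AIP_MODEL_DIR` environment variable, which Vertex AI injects into the container to tell the job where to export model artifacts:

```python
import os

def resolve_model_dir(env) -> str:
    """Vertex AI injects AIP_MODEL_DIR into the container; default for local runs."""
    return env.get("AIP_MODEL_DIR", "/tmp/model")

if __name__ == "__main__":
    # Illustrative architecture -- the real training code differs.
    import numpy as np
    import tensorflow as tf

    model = tf.keras.Sequential([
        tf.keras.layers.Dense(32, activation="relu", input_shape=(4,)),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")

    # In the real job, X and y come from the BigQuery table loaded earlier.
    X, y = np.random.rand(100, 4), np.random.rand(100)
    model.fit(X, y, epochs=5, verbose=0)
    model.save(resolve_model_dir(os.environ))  # artifacts land in GCS on Vertex AI
```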

The code snippet below builds a Docker image of the training code and pushes it to Google Container Registry.

Commands to build and push the Docker image:

IMAGE_URI="gcr.io/$PROJECT_ID/test:b1"

docker build ./ -t $IMAGE_URI

docker push $IMAGE_URI

After containerizing the code, navigate to the Vertex AI training console to start model training. In the dataset section, select "No managed dataset" because we are pulling the dataset from BigQuery. In the training container step, select "Custom container", provide the path of the previously built container image, set the model output directory (a Google Cloud Storage path) where the model artifacts will be exported, and start the training.
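The same console workflow can also be launched programmatically with the Vertex AI Python SDK (`google-cloud-aiplatform`). This is a hedged sketch, not what we ran: the project, region, machine type, and bucket values are placeholders.

```python
def training_job_config(image_uri: str, output_dir: str) -> dict:
    """Collect the handful of settings the custom training job needs."""
    return {
        "display_name": "custom-tf-training",
        "container_uri": image_uri,
        "machine_type": "n1-standard-4",
        "replica_count": 1,
        "base_output_dir": output_dir,  # GCS path for exported model artifacts
    }

if __name__ == "__main__":
    # Placeholder project/region/bucket -- replace with your own.
    from google.cloud import aiplatform  # requires google-cloud-aiplatform
    cfg = training_job_config("gcr.io/my-project/test:b1",
                              "gs://my-bucket/model-output")
    aiplatform.init(project="my-project", location="us-central1")
    job = aiplatform.CustomContainerTrainingJob(
        display_name=cfg["display_name"], container_uri=cfg["container_uri"])
    job.run(replica_count=cfg["replica_count"],
            machine_type=cfg["machine_type"],
            base_output_dir=cfg["base_output_dir"])
```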

Once training completes, navigate to the Vertex AI Models section to view the trained model.

Model Inferencing — BigQuery ML

BigQuery ML can import only TensorFlow models, and BigQuery ML's evaluation features cannot be used on such imported models. To run inference, we first import the model and then call the prediction function. Import the model using the BigQuery ML CREATE MODEL statement as shown below:

CREATE MODEL `model name`

OPTIONS (

MODEL_TYPE = 'TENSORFLOW',

MODEL_PATH = 'exported model path - Google Cloud Storage')

Using the ML.PREDICT function, we can then run inference on the imported model.
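A hedged sketch of calling ML.PREDICT from the Python client — the model and table names below are placeholders:

```python
def build_predict_sql(model_id: str, table_id: str) -> str:
    """Assemble an ML.PREDICT statement against the imported TensorFlow model."""
    return (
        f"SELECT * FROM ML.PREDICT("
        f"MODEL `{model_id}`, "
        f"(SELECT * FROM `{table_id}`))"
    )

if __name__ == "__main__":
    # Placeholder ids -- replace with your own.
    from google.cloud import bigquery  # requires google-cloud-bigquery
    client = bigquery.Client()
    sql = build_predict_sql("my_dataset.imported_tf_model",
                            "my_dataset.inference_rows")
    for row in client.query(sql):
        print(dict(row))  # each row carries the model's prediction columns
```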
