Step-by-Step Guide to Creating and Deploying Custom ML Pipelines with GCP Vertex AI (Part 2)

Warda Rahim
Mar 5, 2023


In the first part, we went through some of the prerequisites for deploying machine learning (ML) models using custom Vertex AI pipelines, including how to build custom training and serving container images. Once all these things are in place, we can head over to our Jupyter Notebooks and start creating our custom ML pipeline.

We are using the House Prices: Advanced Regression Techniques competition from Kaggle as our use case. The data consists of train.csv and test.csv files. The test data does not contain the target, so we will split the train data and use part of it for model training and part for model evaluation; finally, we will retrain the model on the entire dataset in train.csv.

- Import Required Libraries
- Define PIPELINE_ROOT Variable
- Creating Pipeline Components
  1- Data Ingestion
  2- Data Preprocessing
  3- Train-Test Split
  4- Training
  5- Evaluation
  6- Model Deployment
- Pipeline Compilation and Run
  - Define the Pipeline
  - Compile the Pipeline
  - Run the Pipeline
- Manual Deployment
- Predictions
- Conclusion

Import Required Libraries

The first step is to import the following libraries in your Jupyter Notebook:
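A minimal sketch of the imports (assuming the kfp v2 SDK and google-cloud-aiplatform are installed in your notebook environment):

from kfp.v2 import compiler, dsl
from kfp.v2.dsl import (component, Input, Output, Dataset, Model,
                        Artifact, Metrics, Markdown, HTML)

from google.cloud import aiplatform
from google.cloud.aiplatform import pipeline_jobs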

We are using kfp.v2 since it is the newer version of the Kubeflow Pipelines SDK and is compatible with Vertex AI. dsl stands for 'domain-specific language'; it is one of the main modules of the Kubeflow Pipelines SDK and is used to define and interact with pipelines and components. Next, we import Dataset, Model, Artifact, etc., which are used to pass objects between components. Lastly, we import the compiler from kfp.v2 and pipeline_jobs from google.cloud.aiplatform.

Define PIPELINE_ROOT Variable

The next step is to use the bucket we created in Part 1 to define the PIPELINE_ROOT variable. This is where your pipeline artifacts will be stored:
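For example (the bucket name below is a placeholder for the bucket created in Part 1):

BUCKET_NAME = "gs://your-bucket-name"  # bucket created in Part 1
PIPELINE_ROOT = f"{BUCKET_NAME}/pipeline_root/"  # where pipeline artifacts are written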

Creating Pipeline Components

Now, we move to the main part which is creating pipeline components and understanding how artifacts are passed from one pipeline component to another.

Pipeline Architecture:

We will create a pipeline with the following components:

1. Data Ingestion
2. Data Preprocessing
3. Train-Test Split
4. Model Training
5. Model Evaluation
6. Model Deployment to Vertex AI Endpoint

Below we define the BASE_IMAGE variable, which refers to the custom training Docker container image that we created in Part 1:
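For example (the project, repository, and image names below are placeholders; use the image you pushed to Artifact Registry in Part 1):

PROJECT_ID = "your-gcp-project"
REGION = "europe-west2"
BASE_IMAGE = f"{REGION}-docker.pkg.dev/{PROJECT_ID}/your-repo/houseprice-training:latest"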

1- Data Ingestion
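A minimal sketch of such a component (file and component names are illustrative; the author's actual component may differ):

@component(
    base_image=BASE_IMAGE,
    output_component_file="get_houseprice_data.yaml",
)
def get_houseprice_data(
    filepath: str,
    dataset_train: Output[Dataset],
):
    import pandas as pd

    # Read the raw training data from the given path (e.g. a GCS location)
    df = pd.read_csv(filepath)

    # Save the dataframe to the path provided by the Output[Dataset] artifact
    df.to_csv(dataset_train.path, index=False)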

Now let's try to understand this code. The first part, the @component decorator, defines the Docker image (the container with all the packages needed to run the code) and the output component file (a YAML specification of the component) that is written when the component is created.

The function is not returning anything, but it has two types of arguments:

- filepath (str): a string defining the path to the input file

- dataset_train (Output[Dataset]): when the function is run, we will be able to access an Output object of class Dataset with path and metadata attributes

At the end of the code, we save our pandas dataframe to a specific path. Once the component has been called, we can access its output attributes using

train_dataset = get_houseprice_data(filepath).outputs["dataset_train"]

This allows us to use the output of this component in the other components.

2- Data Preprocessing

After data ingestion, we need to clean our data before we move to the model training step:
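A sketch of the preprocessing component (the actual cleaning steps depend on your feature engineering; the point here is the Input/Output mechanics):

@component(
    base_image=BASE_IMAGE,
    output_component_file="preprocess_houseprice_data.yaml",
)
def preprocess_houseprice_data(
    train_df: Input[Dataset],
    dataset_train_preprocessed: Output[Dataset],
):
    import pandas as pd

    # Load the dataset produced by the data ingestion component
    df = pd.read_csv(train_df.path)

    # ... your cleaning / feature engineering steps go here ...
    df = df.dropna(axis=1, how="all")

    # Write the cleaned data to the output artifact's path
    df.to_csv(dataset_train_preprocessed.path, index=False)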

As with the data ingestion code, we first define the @component decorator and then the function. The function takes as input an Input object of class Dataset. When we define the full pipeline, this input argument will refer to the dataset created by the data ingestion component. The function also takes an Output object as an argument, and that is where it writes its output.

3- Train-Test Split

After data preprocessing, we split the dataset into train and test sets so that we can perform hyperparameter tuning and model training on the train set and model evaluation on the test set, before the final model is trained on the entire dataset.
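A sketch of the split component (argument names are illustrative):

@component(
    base_image=BASE_IMAGE,
    output_component_file="train_test_split.yaml",
)
def train_test_split(
    dataset_in: Input[Dataset],
    dataset_train: Output[Dataset],
    dataset_test: Output[Dataset],
    test_size: float = 0.2,
):
    import pandas as pd
    from sklearn.model_selection import train_test_split as sk_train_test_split

    df = pd.read_csv(dataset_in.path)

    # test_size controls the fraction of rows that go into the test set
    df_train, df_test = sk_train_test_split(df, test_size=test_size, random_state=42)

    df_train.to_csv(dataset_train.path, index=False)
    df_test.to_csv(dataset_test.path, index=False)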

Now, in addition to Input and Output objects of the class Dataset, we are passing test_size (type float) as an argument. It determines the test dataset size in sklearn’s train_test_split.

4- Training

The fourth step is the model training component, which takes as inputs the datasets output by the previous step and writes its output to an Output object of class Model. With this, we can save our model to a specific path and easily access it later.

This training component also takes the test dataset (from train_test_split) as an input because some evaluation steps are performed within the HousePriceModel class from train.py. The model is first trained on the training dataset and evaluated on the test dataset, followed by a final training run on the entire dataset (train + test), after which we can predict on the unseen data in test.csv. The inputs and outputs your components take therefore depend very much on how you have structured your code.
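A sketch of what such a training component could look like (HousePriceModel and train.py are the author's own code from Part 1, so the fit/save calls and the best_params attribute below are purely illustrative; joblib is assumed to be available in the base image):

@component(
    base_image=BASE_IMAGE,
    output_component_file="train_houseprice.yaml",
)
def train_houseprice(
    dataset_train: Input[Dataset],
    dataset_test: Input[Dataset],
    model: Output[Model],
    best_hyperparams: Output[Markdown],
    shap_summary_plot: Output[HTML],
):
    import joblib
    import pandas as pd
    from train import HousePriceModel  # author's training code baked into the image in Part 1

    df_train = pd.read_csv(dataset_train.path)
    df_test = pd.read_csv(dataset_test.path)

    # Hyperparameter tuning, training and interim evaluation happen inside HousePriceModel
    house_price_model = HousePriceModel()
    house_price_model.fit(df_train, df_test)

    # Persist the fitted model to the path provided by the Output[Model] artifact
    joblib.dump(house_price_model, model.path)

    # Write the best hyperparameters as a Markdown artifact (best_params is a hypothetical attribute)
    with open(best_hyperparams.path, "w") as f:
        f.write(f"### Best hyperparameters\n\n{house_price_model.best_params}")

    # Write an HTML artifact; see the visualisation sketch below for a fuller example
    with open(shap_summary_plot.path, "w") as f:
        f.write("<html><body><p>SHAP summary plot goes here</p></body></html>")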

Also, note that in addition to output objects of class Dataset and Model, we also have output objects of class Markdown and HTML. I have added them to demonstrate that we can also save plots and tables on Vertex AI UI.

Custom Visualisation Artifacts

To evaluate the results of a pipeline job using custom visualisation artifacts, Vertex provides two approaches: Markdown and HTML files.

We are outputting the best hyperparameters by defining the component with an Output[Markdown] artifact and then writing markdown content to the artifact's path. We can also export custom visualisation plots to an HTML file, as shown above, where we create a shap summary plot and write the image data to the HTML artifact's path.
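For illustration, the HTML export inside such a component might look roughly like this (assuming matplotlib and shap are installed in the base image, and that shap_values, X and the shap_summary_plot artifact are already defined):

import base64
import io

import matplotlib.pyplot as plt
import shap

# Create the SHAP summary plot without displaying it
shap.summary_plot(shap_values, X, show=False)

# Serialise the current figure as a base64-encoded PNG
buf = io.BytesIO()
plt.savefig(buf, format="png", bbox_inches="tight")
encoded = base64.b64encode(buf.getvalue()).decode("utf-8")

# Embed the image in an HTML page and write it to the artifact's path
with open(shap_summary_plot.path, "w") as f:
    f.write(f'<html><body><img src="data:image/png;base64,{encoded}"/></body></html>')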

5- Evaluation

The model evaluation component takes as input the model object from our training step and outputs the baseline, train, and test metrics. Note that, because of the way our training code is written, the metrics for the training and validation datasets were already computed during the training step; here we simply use that dictionary.

For a training component where no evaluation is done during the training step, you can define your evaluation function like below:
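A sketch of such an evaluation component (the target column and metric are placeholders; adapt them to your own model):

@component(
    base_image=BASE_IMAGE,
    output_component_file="evaluate_houseprice.yaml",
)
def evaluate_houseprice(
    dataset_train: Input[Dataset],
    dataset_test: Input[Dataset],
    model_in: Input[Model],
    metrics: Output[Metrics],
):
    import joblib
    import pandas as pd
    from sklearn.metrics import mean_squared_error

    df_train = pd.read_csv(dataset_train.path)
    df_test = pd.read_csv(dataset_test.path)

    # Load the fitted model saved by the training component
    trained_model = joblib.load(model_in.path)

    # Predict on both splits and log a metric for each
    for name, df in [("train", df_train), ("test", df_test)]:
        y_true = df["SalePrice"]
        y_pred = trained_model.predict(df.drop(columns=["SalePrice"]))
        metrics.log_metric(f"{name}_rmse",
                           mean_squared_error(y_true, y_pred, squared=False))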

The inputs would be the training and test datasets from the train_test_split component and the model from the training component. We then use the model to make predictions on the train and test datasets and compute evaluation metrics.

6- Model Deployment

The final component is model deployment to Vertex AI Endpoint. Before you can deploy your model, you must have a serving container image uri. The model deployed to Vertex AI endpoint will be wrapped within this serving container. We just need to specify the container image uri when uploading the model to Vertex AI model registry.

The component is using the same base image as other components. The function takes 6 arguments as input:

- uri of serving container image
- display name for the model
- display name for the endpoint
- GCP project name where you are running Vertex
- region
- the trained model

The outputs are the Vertex endpoint and the Vertex model deployed to that endpoint.

Before we investigate the code, you must familiarise yourself with the difference between a Model and an Endpoint. The Model refers to all the components related to model serving, such as the run environment, ML framework, saved model, etc. The Endpoint, on the other hand, is the compute resource where our model runs and where we request predictions. Therefore, to accept prediction requests, the Model must be deployed to an Endpoint.
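A sketch of such a deployment component (the endpoint-reuse logic, routes, and machine type below are illustrative; adapt them to the serving container you built in Part 1):

@component(
    base_image=BASE_IMAGE,
    output_component_file="deploy_houseprice.yaml",
)
def deploy_houseprice_model(
    serving_container_image_uri: str,
    model_display_name: str,
    endpoint_display_name: str,
    project: str,
    region: str,
    model: Input[Model],
    vertex_endpoint: Output[Artifact],
    vertex_model: Output[Model],
):
    from google.cloud import aiplatform

    aiplatform.init(project=project, location=region)

    def create_endpoint():
        # Reuse the most recently created endpoint with this display name, if one exists
        endpoints = aiplatform.Endpoint.list(
            filter=f'display_name="{endpoint_display_name}"',
            order_by="create_time desc",
        )
        if endpoints:
            return endpoints[0]
        return aiplatform.Endpoint.create(display_name=endpoint_display_name)

    endpoint = create_endpoint()

    # Upload the model to the Vertex AI Model Registry, wrapped in the serving container
    uploaded_model = aiplatform.Model.upload(
        display_name=model_display_name,
        artifact_uri=model.uri.rpartition("/")[0],  # directory containing the saved model
        serving_container_image_uri=serving_container_image_uri,
        serving_container_predict_route="/predict",  # placeholder: inference route from Part 1's serving app
        serving_container_health_route="/health",    # placeholder: health check route from Part 1
    )

    # Deploy the uploaded model to the endpoint with the chosen compute resources
    uploaded_model.deploy(
        endpoint=endpoint,
        machine_type="n1-standard-2",
        traffic_split={"0": 100},
    )

    vertex_endpoint.uri = endpoint.resource_name
    vertex_model.uri = uploaded_model.resource_name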

Now let's break down the code. Model deployment is a three-stage process.

  • The first stage is to create an endpoint. The create_endpoint() function first checks whether an endpoint already exists; if so, it uses the most recently created one, otherwise it creates a new one.
  • Next, we upload the model to the Vertex AI Model Registry. The upload requires the model display name, model artifact URI, serving container image URI, region, health check path, and inference path. If we already have a model uploaded to the Vertex AI Model Registry, we can use the parent_model argument, which allows us to upload the model as a new version.
  • Finally, we deploy the model to the endpoint. For this, we need to specify the compute resources required. Above, we specify the machine type and the traffic split. The default machine type is n1-standard-2. {"0": 100} means that all traffic to the model endpoint is routed to the current version of the deployed model: the key "0" refers to the newly deployed model and the value 100 is the percentage of traffic routed to it. You can also specify the number of replicas, the type and number of GPUs, etc.

Pipeline Compilation and Run

Define the Pipeline:

Now we have all our components created, each describing the inputs, outputs, and implementation of the component. These components will form steps in our pipeline. Below we stitch together these components to define our ML pipeline.
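A sketch of how the components above can be stitched together (the pipeline name, display names, and default parameter values are placeholders):

@dsl.pipeline(
    name="houseprice-pipeline",
    pipeline_root=PIPELINE_ROOT,
)
def houseprice_pipeline(
    filepath: str,
    test_size: float = 0.2,
    serving_container_image_uri: str = "",
    project: str = "your-gcp-project",
    region: str = "europe-west2",
):
    data_op = get_houseprice_data(filepath=filepath)

    preprocess_op = preprocess_houseprice_data(
        train_df=data_op.outputs["dataset_train"],
    )

    split_op = train_test_split(
        dataset_in=preprocess_op.outputs["dataset_train_preprocessed"],
        test_size=test_size,
    )

    train_op = train_houseprice(
        dataset_train=split_op.outputs["dataset_train"],
        dataset_test=split_op.outputs["dataset_test"],
    )

    eval_op = evaluate_houseprice(
        dataset_train=split_op.outputs["dataset_train"],
        dataset_test=split_op.outputs["dataset_test"],
        model_in=train_op.outputs["model"],
    )

    # Deploy only after evaluation has completed
    deploy_houseprice_model(
        serving_container_image_uri=serving_container_image_uri,
        model_display_name="houseprice-model",
        endpoint_display_name="houseprice-endpoint",
        project=project,
        region=region,
        model=train_op.outputs["model"],
    ).after(eval_op)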

It consists of six operators corresponding to the components we defined above. When the pipeline is run, the execution of each step (within a dockerised container environment) produces the outputs defined for that component, which are then passed to the downstream components.

Compile the Pipeline:

The next step after defining the pipeline is to compile it. The compiler takes our pipeline and outputs the compiled pipeline in JSON format with all information required to run it.
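In code, the compilation step could look like this (the output file name is arbitrary):

compiler.Compiler().compile(
    pipeline_func=houseprice_pipeline,
    package_path="houseprice_pipeline.json",
)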

Run the Pipeline:

Once the pipeline is compiled, we can run it using the code below:
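Something along these lines (display name and parameter values are placeholders):

pipeline_job = pipeline_jobs.PipelineJob(
    display_name="houseprice-pipeline",
    template_path="houseprice_pipeline.json",
    pipeline_root=PIPELINE_ROOT,
    parameter_values={"filepath": "gs://your-bucket-name/data/train.csv"},
    enable_caching=False,
)

pipeline_job.run()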

You can check your running pipeline by going to Vertex AI -> Pipelines and then clicking on your pipeline name, which will display the graph shown below.

Vertex AI pipeline in the console

After each step, you can see small boxes — these are artifacts generated from the steps. Clicking on Expand Artifacts at the top will give you more information about them.

Vertex AI pipeline in the console with expanded artifacts

You can click on an artifact to know more details (shown on the right-hand side). Each artifact will be sitting in the bucket you defined in the code. For example, if your artifact is a dataset, you can click on its URI which will take you to its bucket location from where you can download it.

Click on the artifact to view its details on the right-hand side

A successful pipeline run would have all its steps executed successfully — indicated by green ticks as in the image above. If any step fails, it will appear red instead of green. You can click on the component and then on the right-hand side, click on View Logs to check the error message.

Click “View Logs” to check the error message for a failed pipeline component

If the above pipeline ran successfully, the model will be uploaded to Model Registry and also deployed to Vertex AI Endpoint. Clicking on the artifact vertex_model and then on its uri on the right-hand side will take you to Model Registry where you can see all your models and their versions. Similarly, clicking on vertex_endpoint and then on its uri, we can see all the models along with their versions deployed to this vertex endpoint.

Model Registry with uploaded models and their versions
Vertex AI Endpoint with deployed models and their versions

Manual Deployment

Instead of deploying your model to the endpoint as part of the Vertex AI pipeline, you can also deploy it manually provided you have uploaded your model to Model Registry and have created an endpoint.

1. Click on your model's name in Model Registry.
2. Click on the version you want to deploy.
3. Click on DEPLOY AND TEST.

Here you can deploy your model by clicking on DEPLOY TO ENDPOINT.

Once you have deployed your model to the endpoint, you can also test it here by submitting JSON requests (the basic format being a list of data instances) and getting back your predictions as JSON responses. Refer to Google guidelines on formatting inputs for online predictions.

Predictions

We now have a model deployed to a Vertex AI endpoint, so how can we use it to get predictions? We can call the endpoint using the Python API within our Jupyter Notebooks. You need your endpoint URI (or resource name), and then running the following lines of code will give you predictions:
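A minimal sketch (the endpoint resource name and the test_df dataframe are placeholders for your own values):

from google.cloud import aiplatform

# Endpoint resource name, e.g. "projects/<project-number>/locations/<region>/endpoints/<endpoint-id>"
endpoint = aiplatform.Endpoint("projects/.../locations/.../endpoints/...")

# A list of input records in the format expected by the serving container (see Part 1)
instances = test_df.to_dict(orient="records")

response = endpoint.predict(instances=instances)
print(response.predictions)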

Note that the input to endpoint.predict() is an array of instances fulfilling the requirements we mentioned in Part 1 and the response would be a JSON containing the predictions for all those instances.

Google's custom container requirements state that the response cannot be more than 1.5 MB in size. Therefore, if we want to predict for a large dataset, we might have to send multiple requests, or we can run the prediction job as a batch prediction.
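If you go down the batch route, a sketch of a batch prediction job against the registered model (paths and names below are placeholders) might look like this:

from google.cloud import aiplatform

# Model resource name from the Vertex AI Model Registry
model = aiplatform.Model("projects/.../locations/.../models/...")

batch_job = model.batch_predict(
    job_display_name="houseprice-batch-prediction",
    gcs_source="gs://your-bucket-name/data/test_instances.jsonl",
    gcs_destination_prefix="gs://your-bucket-name/batch_predictions/",
    machine_type="n1-standard-2",
)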

Most machine learning models are deployed either as an on-demand (near real-time) prediction service or in batch prediction mode. The latter has the advantage of lower processing power requirements and less dependency on external data sources, and it is also easier to debug an offline model. On the other hand, we can use web services to provide live predictions and make the model available to other applications through API calls; running the model on a cloud service also makes CPU power less of an issue. Model deployment to an endpoint makes more sense for online predictions, where you make synchronous requests to a model endpoint. Batch predictions, on the other hand, are asynchronous requests that do not need an immediate response; in that case, you can request predictions directly from the model without deploying it to an endpoint.

Conclusion

In this article, I shared how to build a custom Vertex AI pipeline and deploy a custom-built model on Vertex AI using the custom container approach. Vertex AI makes life much more convenient for everything related to MLOps. Remember that custom pipelines and containers, as the name suggests, are meant to be custom, so make this pipeline and container your own and customise them the way you want. We outlined a general setup, and it will vary depending on your use case and code. Now it's time for you to experiment with Vertex AI pipelines, and I hope you find this article useful on that journey. Good luck!

References:

View the Code on Github:

Hi 👋 if you found this article useful, please support by buying me a coffee here. Thank you 😀
