Deploy, Don’t Delay: Streamlining model deployment with Vertex AI Deployer

Ivan Nardini
Google Cloud - Community
10 min read · Jan 15, 2024

Written in collaboration with Angel Montero.

Among the various stages of the MLOps life cycle, Model Deployment stands as one of the most challenging steps. Once a model has been successfully trained, validated, and registered to the model registry, it becomes ready for deployment. The deployment process entails packaging, testing, and deploying the model to a designated serving environment. This may necessitate the construction of a CI/CD pipeline, facilitating progressive integration and delivery.

Constructing a model deployment pipeline on Vertex AI using Cloud Build is a widely adopted pattern. The purpose of this release pipeline is to deploy registered models to Vertex AI. However, using Cloud Build for model deployment introduces certain limitations:

  • It does not provide the ability to approve the model release.
  • Implementing deployment strategies like canary deployments is not straightforward.
  • Additionally, Cloud Build lacks support for leveraging Vertex AI Model Registry aliases, which hinders the effective management of the deployment process.

To overcome these challenges, Google Cloud Deploy recently introduced custom target type support, now in Public Preview, which extends Cloud Deploy beyond container-based runtimes. Along with this launch, five sample custom target types were provided, including one for Vertex AI.

This blogpost provides an overview of Cloud Deploy and the Vertex AI custom target, and then guides you through the process of getting started with the Vertex AI Deployer sample. By the end of this post, you will know how to build a more robust deployment pipeline by leveraging both Cloud Deploy and Vertex AI.

Introduction to Cloud Deploy and the Vertex AI Deployer

Cloud Deploy is a managed continuous delivery service that covers the main steps of a continuous delivery process. In particular, you can:

  • Define a delivery pipeline and configure an ordered sequence of one or more targets, each of which corresponds to a particular deployment environment.
  • Create a release, which is associated with a specific version of software.
  • Start a rollout, a deployment of that release to a specific target in the pipeline sequence.
  • Promote this release to the next target in the sequence.

You define Cloud Deploy resources declaratively in YAML and create them through the gcloud CLI. Cloud Deploy performs deployments by applying declarative definitions to a destination target. When you create a release, you provide a configuration file that describes the desired state of the destination target. This file may include variables or other placeholders, which Cloud Deploy fills in during a rendering step. Then, during the deploy phase, the rendered file is applied to the target resource. The end goal of the deployment is for the resource configuration to match the configuration file.
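To make the render-then-deploy idea concrete, here is a toy sketch in plain Python. It is only an illustration of the concept, not how Cloud Deploy is implemented: the placeholder names, the project and location values, and the in-memory "target" are all made up for this example.

```python
from string import Template

# Hypothetical manifest with placeholders. The keys mirror the DeployedModel
# fields used later in this post, but the placeholder names are invented.
MANIFEST_TEMPLATE = Template(
    "model: $model\n"
    "dedicatedResources:\n"
    "  minReplicaCount: $min_replicas\n"
    "  maxReplicaCount: 9\n"
)


def render(template: Template, params: dict) -> str:
    """Render step: fill every placeholder; raises KeyError if one is missing."""
    return template.substitute(params)


def deploy(rendered: str, target_state: dict) -> dict:
    """Deploy step: make the (in-memory, pretend) target match the manifest."""
    target_state["applied_manifest"] = rendered
    return target_state


rendered_manifest = render(
    MANIFEST_TEMPLATE,
    {
        # Assumed example values, not taken from a real project.
        "model": "projects/my-project/locations/us-central1/models/test-model",
        "min_replicas": 3,
    },
)
target = deploy(rendered_manifest, {})
print(target["applied_manifest"])
```

The point of separating the two functions is that the rendered manifest is a complete, inspectable artifact before anything touches the target.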

Natively, Cloud Deploy was designed to deploy only to specific target types. Custom targets extend Cloud Deploy well beyond those types and give you more flexibility. A custom target takes over the rendering and deployment process, so you can deploy to almost any kind of environment, including a Vertex AI Endpoint, while still taking advantage of Cloud Deploy features such as approvals, promotions, and rollbacks. The following figure shows the relationship between Vertex AI and the Cloud Deploy custom target sample.

Figure 1 — When Cloud Deploy meets Vertex AI

Cloud Deploy offers a collection of open-source custom target examples, available on GitHub. These examples can be tailored to meet your specific requirements. For instance, let’s explore the Vertex AI Model Deployer, an example which demonstrates how to leverage Cloud Deploy custom target features to deploy a model to a Vertex AI Endpoint.

Get started with Vertex AI Model Deployer

Assuming you have successfully trained your model, you can leverage Vertex AI Deployer to automate its deployment for online serving. To do this, you need to register the model in Vertex AI Model Registry. This registry offers a centralized, browsable repository that facilitates the management of your model’s lifecycle, including versioning and evaluation.

You can register a version of your model using Vertex AI Python SDK as shown below.

from google.cloud import aiplatform as vertex_ai

MODEL_ID = 'test-model'

registered_model = vertex_ai.Model.upload(
    model_id=MODEL_ID,
    display_name='model_to_deploy',
    artifact_uri='gs://model-bucket/model/',
    serving_container_image_uri='xx-docker.pkg.dev/xx/tf_opt-cpu.nightly:latest'
)

print(registered_model)

# resource name: projects/<your-project-id>/locations/<your-location>/models/<your-model-id>

Here, artifact_uri is the Cloud Storage directory that contains the model artifact and any of its supporting files, and serving_container_image_uri is the URI of the model serving container, which is required to deploy the model on Vertex AI. This example uses the optimized TensorFlow runtime on Vertex AI.

To serve the model for real-time predictions, you also need to create a Vertex AI Endpoint, a managed resource for hosting ML models. The following code shows an example of creating a Vertex AI Endpoint with the Vertex AI Python SDK.

from google.cloud import aiplatform as vertex_ai

ENDPOINT_ID = 'prod_endpoint'

endpoint = vertex_ai.Endpoint.create(
    endpoint_id=ENDPOINT_ID,
    display_name='target_endpoint',
    project='prod-project',
    location='prod-region',
)

print(endpoint)

# resource name: projects/<your-project-id>/locations/<your-location>/endpoints/<your-endpoint-id>
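Both resource names above follow the same projects/…/locations/…/<collection>/<id> pattern, and you will need them again when configuring the target and the release. As a small convenience, the helpers below build and parse such names. They are a sketch written for this post, not part of the Vertex AI SDK, and the example project and location values are placeholders.

```python
def model_resource_name(project: str, location: str, model_id: str) -> str:
    """Build a full Vertex AI Model resource name."""
    return f"projects/{project}/locations/{location}/models/{model_id}"


def endpoint_resource_name(project: str, location: str, endpoint_id: str) -> str:
    """Build a full Vertex AI Endpoint resource name."""
    return f"projects/{project}/locations/{location}/endpoints/{endpoint_id}"


def parse_resource_name(name: str) -> dict:
    """Split a 'projects/P/locations/L/<collection>/<id>' name into its parts."""
    parts = name.split("/")
    if len(parts) != 6 or parts[0] != "projects" or parts[2] != "locations":
        raise ValueError(f"Unexpected resource name: {name!r}")
    return {
        "project": parts[1],
        "location": parts[3],
        "collection": parts[4],
        "id": parts[5],
    }


# Placeholder values for illustration only.
model_name = model_resource_name("my-project", "us-central1", "test-model")
parsed = parse_resource_name(model_name)
print(model_name)
print(parsed)
```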

Once you have successfully registered a model and created an associated endpoint, you can leverage the Vertex AI Deployer sample available on GitHub. This sample enables you to deploy your model to the endpoint using Cloud Deploy.

When you use the Vertex AI Deployer to deploy a registered model, you start by building and registering the container image associated with the Custom Target Type for Vertex AI. The image includes the functionality to deploy the model to the custom target, in this case the Vertex AI Endpoint. For this, Vertex AI Deployer offers a wrapper build_and_register.sh that helps to automate the process, as you can see in the following printout.

cd cloud-deploy-samples/custom-targets/vertex-ai/quickstart
./build_and_register.sh -p $PROJECT_ID -r $REGION


# Created Cloud Deploy resource: projects/prod-project/locations/prod-region/customTargetTypes/custom-target-id

Below you can see the resulting target in the Cloud Deploy UI.

Figure 2 — Cloud Deploy custom target

After you create and register the Cloud Deploy Custom target image, you can define the deployment process using both deployment parameters and custom actions.

When creating a Cloud Deploy release, the Vertex AI Model Deployer expects a YAML representation of a DeployedModel resource that specifies the model to deploy, the endpoint to deploy it to, and any resources required for serving. See the list of supported deploy parameters for details.

dedicatedResources:
  maxReplicaCount: 9

In addition to deployment representation, the Vertex AI Model Deployer sample supports custom actions through Skaffold. Custom actions provide the ability to implement deployment strategies, such as canary configurations. These configurations facilitate endpoint traffic splitting between the newly deployed model and the previously deployed one. Below you can find the Skaffold file in the Vertex AI Model Deployer sample.

apiVersion: skaffold/v4beta7
kind: Config
customActions:
- name: add-aliases
  containers:
  - name: add-aliases
    image: $REGION-docker.pkg.dev/$PROJECT_ID/cd-custom-targets/$_CT_IMAGE_NAME@$IMAGE_SHA
    args: ["/bin/vertex-ai-deployer", "--add-aliases-mode"]
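The canary idea mentioned above, splitting endpoint traffic between the newly deployed model and the previous one, comes down to computing a traffic split whose percentages add up to 100. The function below is a minimal sketch of that arithmetic and is not the logic used by the Vertex AI Deployer sample. It assumes the Vertex AI SDK convention that the key "0" stands for the model being deployed; the deployed model ID in the example is shortened and hypothetical.

```python
def canary_traffic_split(
    current_split: dict[str, int],
    canary_percent: int,
    new_model_key: str = "0",
) -> dict[str, int]:
    """Give `canary_percent` of traffic to the new model and scale the rest.

    `current_split` maps deployed model IDs to integer percentages.
    If nothing is serving traffic yet, the new model receives 100%.
    """
    if not 0 <= canary_percent <= 100:
        raise ValueError("canary_percent must be between 0 and 100")

    total = sum(current_split.values())
    if total <= 0:
        return {new_model_key: 100}

    remaining = 100 - canary_percent
    # Scale existing shares down proportionally, using integer percentages.
    scaled = {k: v * remaining // total for k, v in current_split.items()}
    # Hand any rounding leftover to the model with the largest share.
    leftover = remaining - sum(scaled.values())
    largest = max(scaled, key=scaled.get)
    scaled[largest] += leftover

    scaled[new_model_key] = canary_percent
    return scaled


print(canary_traffic_split({"166": 100}, 10))
print(canary_traffic_split({"a": 50, "b": 50}, 25))
```

Keeping the percentages as integers and assigning the rounding leftover explicitly guarantees the result always sums to exactly 100.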

So far, you defined the custom target and the deployment process. The next step involves creating a delivery pipeline on Cloud Deploy to orchestrate the deployment workflow. The example follows:

apiVersion: deploy.cloud.google.com/v1
kind: DeliveryPipeline
metadata:
  name: vertex-ai-cloud-deploy-pipeline
serialPipeline:
  stages:
  - targetId: prod-endpoint
    strategy:
      standard:
        postdeploy:
          actions: ["add-aliases"]
---
apiVersion: deploy.cloud.google.com/v1
kind: Target
metadata:
  name: prod-endpoint
customTarget:
  customTargetType: vertex-ai-endpoint
deployParameters:
  customTarget/vertexAIEndpoint: projects/$PROJECT_ID/locations/$REGION/endpoints/$ENDPOINT_ID
  customTarget/vertexAIConfigurationPath: "prod/deployedModel.yaml"
  customTarget/vertexAIMinReplicaCount: "3"
  customTarget/vertexAIAliases: "prod,champion"

The delivery pipeline defines a sequence of stages, enabling the deployment of the registered model to each target across different environments. The configuration also includes the specification of the custom target: its name, its type, and the deployment parameters associated with the Vertex AI Endpoint. You can add stages and custom target specifications according to the number of deployment environments you have.
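If you manage several environments, you might prefer to generate these definitions rather than copy them by hand. The sketch below builds the same structure as Python dictionaries for two targets. The staging target, the project, the endpoint names, and the staging alias are all hypothetical, and the output is printed as JSON only to stay within the standard library; you would still need to serialize it to YAML before passing it to Cloud Deploy.

```python
import json

API_VERSION = "deploy.cloud.google.com/v1"


def build_pipeline(name: str, target_ids: list[str]) -> dict:
    """One stage per target, each running the add-aliases postdeploy action."""
    return {
        "apiVersion": API_VERSION,
        "kind": "DeliveryPipeline",
        "metadata": {"name": name},
        "serialPipeline": {
            "stages": [
                {
                    "targetId": target_id,
                    "strategy": {
                        "standard": {"postdeploy": {"actions": ["add-aliases"]}}
                    },
                }
                for target_id in target_ids
            ]
        },
    }


def build_target(target_id: str, endpoint: str, config_path: str,
                 aliases: list[str], min_replicas: int = 3) -> dict:
    """A custom target pointing at one Vertex AI Endpoint."""
    return {
        "apiVersion": API_VERSION,
        "kind": "Target",
        "metadata": {"name": target_id},
        "customTarget": {"customTargetType": "vertex-ai-endpoint"},
        "deployParameters": {
            "customTarget/vertexAIEndpoint": endpoint,
            "customTarget/vertexAIConfigurationPath": config_path,
            "customTarget/vertexAIMinReplicaCount": str(min_replicas),
            "customTarget/vertexAIAliases": ",".join(aliases),
        },
    }


# Hypothetical environments: (target ID, endpoint, config path, aliases).
ENVIRONMENTS = [
    ("staging-endpoint",
     "projects/my-project/locations/us-central1/endpoints/staging_endpoint",
     "staging/deployedModel.yaml", ["staging"]),
    ("prod-endpoint",
     "projects/my-project/locations/us-central1/endpoints/prod_endpoint",
     "prod/deployedModel.yaml", ["prod", "champion"]),
]

pipeline = build_pipeline(
    "vertex-ai-cloud-deploy-pipeline", [env[0] for env in ENVIRONMENTS]
)
targets = [build_target(*env) for env in ENVIRONMENTS]
print(json.dumps([pipeline, *targets], indent=2))
```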

As with the custom target image, the Vertex AI Deployer sample offers a default delivery pipeline. You can use the replace_variables.sh script to substitute the placeholders in the Cloud Deploy and Skaffold configuration files with your actual values.

./replace_variables.sh -p $PROJECT_ID -r $REGION -e $ENDPOINT_ID

In either situation, you can apply the Cloud Deploy configuration defined in clouddeploy.yaml.

gcloud deploy apply --file=clouddeploy.yaml --project=$PROJECT_ID --region=$REGION
# Created Cloud Deploy resource: projects/prod-project/locations/prod-region/deliveryPipelines/delivery-pipeline-id
# Created Cloud Deploy resource: projects/prod-project/locations/prod-region/targets/target-id

Here you have the resulting delivery pipeline in the Cloud Deploy UI.

Figure 3— Cloud Deploy model delivery pipeline

As noted in the introduction, you have to create a Cloud Deploy release before beginning the rollout. The release captures the changes you wish to deploy in the Cloud Deploy configuration. In our case, you might want to alter the default deployment settings tied to your endpoint, such as the machine type and the number of replicas. Here’s an example that creates a release for rolling out the model on Vertex AI.

export RELEASE_ID='release-0001'
export DELIVERY_PIPELINE='vertex-ai-cloud-deploy-pipeline'
export MODEL_ID='test-model'
export DEPLOY_PARAMS="customTarget/vertexAIModel=projects/${PROJECT_ID}/locations/${REGION}/models/${MODEL_ID}"


gcloud deploy releases create $RELEASE_ID \
  --delivery-pipeline=$DELIVERY_PIPELINE \
  --project=$PROJECT_ID \
  --region=$REGION \
  --source=configuration \
  --deploy-parameters=$DEPLOY_PARAMS

# Created Cloud Deploy release release-0001.
# Creating rollout projects/prod-project/locations/prod-region/deliveryPipelines/delivery-pipeline-id/releases/release-id/rollouts/rollout-id

In this case, the following command line flags are used to supply the custom deployer with the necessary parameters for deployment:

  • --source: This flag tells gcloud where to look for the configuration files, relative to the working directory.
  • --deploy-parameters: This flag provides the custom deployer with the additional parameters it needs to perform the deployment. In this case, customTarget/vertexAIModel indicates the full resource name of the model to deploy.

As for the remaining parameters, --delivery-pipeline is the name of the delivery pipeline in which the release will be created. The project and region of the pipeline are specified by --project and --region, respectively.
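If you trigger releases from Python, for example from a training pipeline step, you could assemble the same command programmatically. The helper below is a sketch that uses only the flags described above and prints the command instead of running it. The function name and the example project and region values are assumptions made for this illustration.

```python
import shlex


def build_release_command(release_id: str, pipeline: str, project: str,
                          region: str, model_id: str,
                          source: str = "configuration") -> list[str]:
    """Assemble the `gcloud deploy releases create` call as an argument list."""
    model = f"projects/{project}/locations/{region}/models/{model_id}"
    return [
        "gcloud", "deploy", "releases", "create", release_id,
        f"--delivery-pipeline={pipeline}",
        f"--project={project}",
        f"--region={region}",
        f"--source={source}",
        f"--deploy-parameters=customTarget/vertexAIModel={model}",
    ]


# Placeholder project and region, for illustration only.
cmd = build_release_command(
    release_id="release-0001",
    pipeline="vertex-ai-cloud-deploy-pipeline",
    project="my-project",
    region="us-central1",
    model_id="test-model",
)
print(shlex.join(cmd))

# To actually create the release, you could pass the list to subprocess:
# import subprocess
# subprocess.run(cmd, check=True)
```

Passing an argument list rather than a single shell string avoids quoting problems if any value contains special characters.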

That command automatically initiates a rollout. Below is the running release process in the Cloud Deploy UI.

Figure 4— Cloud Deploy Release

The process deploys the initial model version to the target endpoint as you can see below.

Figure 5 — Vertex AI Endpoint view

The model rollout will require some time to complete. To get information about the release such as the rollout ID, you can use the following gcloud command.

gcloud deploy releases describe $RELEASE_ID --delivery-pipeline=$DELIVERY_PIPELINE --project=$PROJECT_ID --region=$REGION

#condition:
#...
#customTargetTypeSnapshots:
#...
#deliveryPipelineSnapshot:
#...
#deployParameters:
# customTarget/vertexAIModel:
#...
#targetArtifacts:
#...
#targetRenders:
#...
#targetSnapshots:
#...

Using the rollout ID, you can monitor the status of your rollout by utilizing the following CLI command.

export ROLLOUT_ID='release-0001-to-prod-endpoint-0001'

gcloud deploy rollouts describe $ROLLOUT_ID --release=$RELEASE_ID --delivery-pipeline=$DELIVERY_PIPELINE --project=$PROJECT_ID --region=$REGION

# renderState: SUCCEEDED
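Because the rollout takes a while, you may want to poll for its status instead of re-running the command by hand. The loop below is a generic sketch: it takes any function that returns the current state as a string, so how you fetch that state (for instance by wrapping the gcloud command above) is left to you. The set of terminal state names and the default polling interval are assumptions, so check them against what your rollout actually reports.

```python
import time
from typing import Callable

# Assumed terminal states; adjust to the values your rollout reports.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}


def wait_for_rollout(get_state: Callable[[], str],
                     poll_seconds: float = 30.0,
                     max_polls: int = 60,
                     sleep: Callable[[float], None] = time.sleep) -> str:
    """Poll `get_state` until it returns a terminal state or polls run out."""
    for attempt in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if attempt < max_polls - 1:
            sleep(poll_seconds)
    raise TimeoutError(f"Rollout not finished after {max_polls} polls")


# Demo with a fake state source instead of a real Cloud Deploy call.
fake_states = iter(["IN_PROGRESS", "IN_PROGRESS", "SUCCEEDED"])
final_state = wait_for_rollout(lambda: next(fake_states), poll_seconds=0)
print(final_state)
```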

After the rollout completes, you can inspect the deployed models and traffic splits of the endpoint with gcloud.

gcloud ai endpoints describe $ENDPOINT_ID --region $REGION --project $PROJECT_ID

# Using endpoint [https://xx-aiplatform.googleapis.com/]
# deployedModels:
#   dedicatedResources:
#     machineSpec:
#       machineType: n1-standard-2
#     maxReplicaCount: 9
#     minReplicaCount: 3
#   displayName: model_to_deploy
#   id: '166...'
#   model: projects/685.../locations/us-central1/models/681...
#   modelVersionId: '1'
# displayName: target_endpoint
# trafficSplit:
#   '166...': 100

The same can be achieved through the Vertex AI Endpoint view:

Figure 6— Vertex AI Deployed model

Additionally, since custom actions are used to modify model aliases, querying the rollout enables you to monitor the post-deployment operation.

gcloud deploy rollouts describe $ROLLOUT_ID --release=$RELEASE_ID --delivery-pipeline=$DELIVERY_PIPELINE --project=$PROJECT_ID --region=$REGION --format "(phases[0].deploymentJobs.postdeployJob)"

# phases:
# - deploymentJobs:
#     postdeployJob:
#       id: postdeploy
#       jobRun:
#       ...
#       postdeployJob:
#         actions:
#         - add-aliases
#       state: SUCCEEDED

Upon successful completion of the post-deployment job, you can examine the deployed model and its associated aliases. Specifically, the aliases “prod” and “champion” should be assigned. These aliases play a crucial role in managing the delivery process of your model.

gcloud ai models describe $MODEL_ID --region $REGION --project $PROJECT_ID --format "(versionAliases)"

# Using endpoint [https://xxx-aiplatform.googleapis.com/]
#versionAliases:
#- default
#- prod
#- champion
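If you want to automate this last check, you can compare the aliases requested in the customTarget/vertexAIAliases deploy parameter with the versionAliases list returned for the model. The helper below is a small sketch of that comparison. It works on plain Python values, so fetching the alias list from Vertex AI is left out, and the function name is made up for this post.

```python
def missing_aliases(expected_csv: str, version_aliases: list[str]) -> list[str]:
    """Return the expected aliases that are absent from the model version.

    `expected_csv` uses the same comma-separated form as the
    customTarget/vertexAIAliases deploy parameter, e.g. "prod,champion".
    """
    expected = [a.strip() for a in expected_csv.split(",") if a.strip()]
    present = set(version_aliases)
    return [alias for alias in expected if alias not in present]


# Alias list copied from the command output shown above.
print(missing_aliases("prod,champion", ["default", "prod", "champion"]))
# A version that has not been promoted to champion yet.
print(missing_aliases("prod,champion", ["default", "prod"]))
```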

Conclusion

Why settle for a custom ML deployment process when you can embark on a streamlined deployment journey on Vertex AI with Cloud Deploy? This blogpost shows how Cloud Deploy enables you to effortlessly deploy custom models on Vertex AI.

With Cloud Deploy, you can seamlessly optimize the delivery of your ML models by eliminating manual processes, thus enhancing efficiency. Additionally, the Custom Target for Vertex AI within Cloud Deploy provides a reliable integration with Vertex AI for deploying models to your ML environments.

To conclude, Cloud Deploy guarantees a streamlined and regulated deployment process through its continuous delivery capabilities such as approvals, promotions, and rollbacks. By leveraging these features, you can elevate your MLOps capabilities to new heights.

What’s Next

Do you want to know more about Cloud Deploy Vertex AI Custom Target and how to use it to implement continuous deployment? Check out the following resources:

Documentation

GitHub samples

This article solely focuses on model delivery. It does not cover the delivery of machine learning (ML) pipelines. I encourage you to investigate building a custom target for delivering ML pipelines on Vertex AI Pipelines. Please let me know if you decide to implement it 😉

Thanks for reading

I hope you enjoyed the article. If so, please clap or leave your comments. Also let’s connect on LinkedIn or X to share feedback and questions 🤗

Special thanks to both Cloud Deploy and Vertex AI Model Registry teams for their collaboration, support and contribution. Also thanks to Federico Iezzi for his feedback!

Ivan Nardini
Google Cloud - Community

Developer relations engineer at @GoogleCloud who is passionate about Machine Learning Engineering. The Lead of MLOps.community’s Engineering Lab.