Serving Machine Learning models with Google Vertex AI

Deploying and serving any kind of machine learning model at any scale.

Published in Google Cloud - Community

9 min read · Aug 15, 2022

Companies frequently deploy their models to virtual machines (Google Compute Engine or even on-prem machines). This should be avoided: Google Cloud provides a dedicated service, Vertex AI Endpoints, for deploying your models.

Vertex AI Endpoints combines great flexibility with ease of use. You can keep it simple or go all in and customize it to your needs using custom containers.

A Google data center reimagined by OpenAI’s DALL·E 2

This article covers everything needed to put your models into production and serve requests at scale, including a large section on how to scale your models properly and a few workarounds for the limitations of the service.

Jump Directly to the Notebook and Code

All the code for this article is ready to use in a Google Colab notebook. If you have questions, please reach out to me via LinkedIn or…

Written by Sascha Heyer