Train & Deploy a scikit-learn regression model on GCP VertexAI using Python SDK
In this article, I will show you how to do the following on GCP Vertex AI:
- Train a random forest regressor model using Vertex AI
- Upload the model artifact to the Vertex AI Model Registry
- Deploy the model to an endpoint for online inference
- Run batch inference against the model artifact
All of this takes only a few lines of simple Python code.
SOURCE CODE LINK : https://github.com/sidoncloud/gcp-use-cases/tree/main/gcp-vertexai-regression-example
NOTE: This article demonstrates how to train and deploy an end-to-end data science model on GCP Vertex AI. It focuses on getting started with Vertex AI on Google Cloud, emphasizing the training and deployment aspects rather than developing a model from scratch.
Step-1 : Setting up a Vertex AI Jupyter Workbench
As a first step, we will create a Workbench (a Jupyter notebook instance) on Vertex AI, which lets us execute all the necessary code on Google Cloud.
Start by heading over to Vertex AI from your GCP console. From the dashboard, select Workbench from the left navigation and select Create New.
We will create a simple workbench with a Python 3 environment running on the Debian operating system.
Once you've filled out the form options from the first step, click on Advanced Options at the bottom.
Here, we will select a smaller instance, e2-standard-2, and set the idle shutdown time to 30 minutes. This is important because the workbench will then shut down automatically after 30 minutes of inactivity.
Click on Create.
Give it a couple of minutes and you will have a Jupyter workbench up and running.
Step-2 : Model training code development
Note: We will not be executing this code in this step. The execution will happen on Vertex AI in the next step.
We are going to develop a model using scikit-learn's random forest regressor to predict the total count of bikes that will be rented out. The input dataset comes from the public UCI repository: https://archive.ics.uci.edu/dataset/275/bike+sharing+dataset.
Our output variable is called "cnt", which is what we will try to predict (the script renames it to count and trains on its natural logarithm). Below is the code that reads the input CSV file from a GCS bucket, applies transformations using pandas and performs model training.
NOTE: Make sure to change the bucket name and upload the input dataset bikeshare.csv to your GCS bucket before saving the code (as model-training-code.py) and using it in the next step.
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_squared_error
from google.cloud import storage
from joblib import dump
from sklearn.pipeline import make_pipeline

storage_client = storage.Client()
bucket = storage_client.bucket("bucket-name")


def load_data(filename):
    # Reading a gs:// path with pandas relies on the gcsfs package
    df = pd.read_csv(filename)
    return df


def preprocess_data(df):
    df = df.rename(columns={'weathersit': 'weather',
                            'yr': 'year',
                            'mnth': 'month',
                            'hr': 'hour',
                            'hum': 'humidity',
                            'cnt': 'count'})
    df = df.drop(columns=['instant', 'dteday', 'year'])
    cols = ['season', 'month', 'hour', 'holiday', 'weekday', 'workingday', 'weather']
    for col in cols:
        df[col] = df[col].astype('category')
    # The model is trained on the log of the rental count
    df['count'] = np.log(df['count'])
    df_oh = df.copy()
    for col in cols:
        df_oh = one_hot_encoding(df_oh, col)
    X = df_oh.drop(columns=['atemp', 'windspeed', 'casual', 'registered', 'count'])
    y = df_oh['count']
    return X, y


def one_hot_encoding(data, column):
    # Append the dummy columns, then drop the original categorical column
    data = pd.concat([data, pd.get_dummies(data[column], prefix=column, drop_first=True)], axis=1)
    data = data.drop([column], axis=1)
    return data


def train_model(x_train, y_train):
    model = RandomForestRegressor()
    pipeline = make_pipeline(model)
    pipeline.fit(x_train, y_train)
    return pipeline


def save_model_artifact(pipeline):
    # Save locally, then copy the file to gs://bucket-name/model-artifact/
    artifact_name = 'model.joblib'
    dump(pipeline, artifact_name)
    model_artifact = bucket.blob('model-artifact/' + artifact_name)
    model_artifact.upload_from_filename(artifact_name)


def main():
    filename = 'gs://bucket-name/input-dataset/bikeshare.csv'
    df = load_data(filename)
    X, y = preprocess_data(df)
    X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25, random_state=42)
    pipeline = train_model(X_train, y_train)
    y_pred = pipeline.predict(X_test)
    save_model_artifact(pipeline)
    # RMSE is computed on the log-transformed target
    rmse = np.sqrt(mean_squared_error(y_test, y_pred))
    print('RMSE:', rmse)


if __name__ == '__main__':
    main()
The output of this script is a model artifact, a joblib file, which will be placed in your GCS bucket at gs://your-bucket-name/model-artifact/model.joblib
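To make the model's expected input concrete, here is a minimal sketch that lists the feature columns the preprocessing above should produce. It assumes the category levels of the UCI hourly file (for example, season 1 to 4 and hour 0 to 23) and that every level appears in the training data. The names CATEGORY_LEVELS and expected_feature_columns are hypothetical and not part of the original script.

```python
# Hypothetical helper: lists the columns that preprocess_data() should produce.
# Assumed category levels, based on the UCI hourly bike sharing file.
CATEGORY_LEVELS = {
    "season": range(1, 5),
    "month": range(1, 13),
    "hour": range(0, 24),
    "holiday": range(0, 2),
    "weekday": range(0, 7),
    "workingday": range(0, 2),
    "weather": range(1, 5),
}


def expected_feature_columns():
    # The numeric columns that survive the final drop() come first.
    columns = ["temp", "humidity"]
    for name, levels in CATEGORY_LEVELS.items():
        # drop_first=True removes the dummy column of the first level.
        columns += [f"{name}_{level}" for level in list(levels)[1:]]
    return columns


columns = expected_feature_columns()
print(len(columns))  # 50
```

Under these assumptions each prediction instance needs 50 values, in this column order, which matches the length of the INSTANCE list used in Step-6.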
Step-3 : Execute Model training on VertexAI compute resources
Now we run the model training code using the Vertex AI Python SDK, submitting the script from the previous step to Vertex AI training.
By the end of the execution, you should have a model.joblib file sitting inside your GCS bucket (as per the bucket name in the model training script).
Start by opening JupyterLab on your workbench.
Once opened, upload the model-training-code.py script to your workbench.
Now let's start with the Vertex AI Python SDK. Open a new Jupyter notebook and execute the lines of code below.
from google.cloud import aiplatform

project_id = "your-project-id"
region = "us-central1"
staging_bucket = "gs://your-bucket-name"

aiplatform.init(project=project_id, location=region, staging_bucket=staging_bucket)

job = aiplatform.CustomTrainingJob(
    display_name="bikeshare-training-job",
    script_path="model-training-code.py",
    container_uri="us-docker.pkg.dev/vertex-ai/training/scikit-learn-cpu.0-23:latest",
    requirements=["gcsfs"]
)

job.run(
    replica_count=1,
    machine_type="n1-standard-4",
    sync=True
)
job.wait()
Look at the aiplatform.CustomTrainingJob class being instantiated here. Below are the parameters passed to its constructor.
1: display_name, which is a custom name for the job
2: script_path, which expects model-training-code.py to be present in the same working directory (I assume you've uploaded the script)
3: container_uri, which is a managed training container offered by Google. We are using a prebuilt scikit-learn container here.
4: requirements, where we pass the gcsfs library because the code reads from GCS buckets.
Upload the input CSV file required for model training to your GCS bucket, then go ahead and execute this code block inside the Jupyter notebook.
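If you prefer to upload the dataset from the notebook rather than through the console, the following is a minimal sketch using the google-cloud-storage client. The helper names upload_to_bucket and upload_dataset are hypothetical, and the bucket name and object path are placeholders that should match the ones in your training script.

```python
def upload_to_bucket(bucket, local_path, destination):
    """Upload a local file to a bucket object and return its gs:// URI."""
    blob = bucket.blob(destination)
    blob.upload_from_filename(local_path)
    return f"gs://{bucket.name}/{destination}"


def upload_dataset():
    # Imported here so the helper above stays usable without the client library.
    from google.cloud import storage

    client = storage.Client()
    bucket = client.bucket("bucket-name")  # placeholder: use your own bucket
    return upload_to_bucket(bucket, "bikeshare.csv", "input-dataset/bikeshare.csv")
```

Calling upload_dataset() from a notebook cell should return the gs:// path that the training script reads from, assuming the workbench's service account can write to the bucket.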
Once the code block is executed, navigate to the Training section under MODEL DEVELOPMENT on your Vertex AI dashboard. You should see the training is in progress.
This step should take less than 5 minutes to complete. Upon successful completion, you will see the model.joblib file in your GCS bucket.
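Instead of checking the bucket by hand, you could verify the artifact from the notebook. This is only a sketch: split_gcs_uri and artifact_exists are hypothetical helpers, and the client argument is assumed to be a google.cloud.storage.Client instance.

```python
def split_gcs_uri(uri):
    """Split 'gs://bucket/path/to/object' into (bucket, object_path)."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Not a GCS URI: {uri}")
    bucket_name, _, object_path = uri[len(prefix):].partition("/")
    if not bucket_name or not object_path:
        raise ValueError(f"Expected gs://bucket/object, got: {uri}")
    return bucket_name, object_path


def artifact_exists(client, uri):
    """Return True if the object behind a gs:// URI exists.

    `client` is expected to be a google.cloud.storage.Client instance.
    """
    bucket_name, object_path = split_gcs_uri(uri)
    return client.bucket(bucket_name).blob(object_path).exists()
```

For example, artifact_exists(storage.Client(), "gs://your-bucket-name/model-artifact/model.joblib") should return True once the training job has finished. The URI check also rejects malformed paths such as one with an empty bucket name.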
You have successfully implemented the model training part.
Step-4 : Upload the model artifact to Vertex AI Model Registry
Next, we will upload the model.joblib file to the model registry using the below lines of code.
The artifact_uri is the path where your joblib file resides. So make sure to change the path according to your bucket.
serving_container_image_uri is a prebuilt docker container for model serving.
We invoke the Model.upload function, passing these values as arguments.
display_name = "bikeshare-model-sdk"
artifact_uri = "gs://nl-datascience/model-artifact/"
serving_container_image_uri = "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"

model = aiplatform.Model.upload(
    display_name=display_name,
    artifact_uri=artifact_uri,
    serving_container_image_uri=serving_container_image_uri,
    sync=False
)
This code block should return right away. If you now head over to Model Registry on your Vertex AI dashboard, you should be able to see the uploaded model.
Step-5 : Model Deployment to Vertex Endpoint
Finally, we deploy this model to an endpoint for online inference. We do that by running the below lines of code.
This code is quite simple to understand, so I won't explain every line of it. Basically, we invoke the model.deploy method from the Vertex AI SDK. We pass the instance count (min and max) for model serving and the machine type (n1-standard-4 here) used for hosting the model serving app.
deployed_model_display_name = "bikeshare-model-endpoint"
traffic_split = {"0": 100}
machine_type = "n1-standard-4"
min_replica_count = 1
max_replica_count = 1

endpoint = model.deploy(
    deployed_model_display_name=deployed_model_display_name,
    traffic_split=traffic_split,
    machine_type=machine_type,
    min_replica_count=min_replica_count,
    max_replica_count=max_replica_count
)
As you can see from the output, this operation is an "LRO", which stands for long-running operation. Unfortunately, the deployment can take anywhere between 20 and 45 minutes depending on your luck and the day (hopefully the superstars of Google's engineering team do something about this in the near future).
So you can grab a coffee in the meantime ;-). You can monitor the progress by going to the dashboard, selecting Online Prediction and then clicking on the endpoint name.
After 20–25 minutes, the endpoint should show up as deployed.
NOTE: In case your Workbench times out or you lose the connection, you don't need to execute all the code blocks again. Just run the line below and you should be able to use your endpoint again once it has been successfully deployed.
endpoint = aiplatform.Endpoint('projects/{project-number}/locations/us-central1/endpoints/{endpoint-id}')
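The placeholders in that resource name have to be filled in by hand. A small helper like the one below can make that less error-prone; it is a sketch only, endpoint_resource_name is a hypothetical function, and the project number and endpoint ID shown are made-up values.

```python
def endpoint_resource_name(project_number, endpoint_id, location="us-central1"):
    """Build the full resource name of a Vertex AI endpoint."""
    return f"projects/{project_number}/locations/{location}/endpoints/{endpoint_id}"


# Made-up identifiers for illustration; copy the real ones from the console.
name = endpoint_resource_name("123456789012", "1234567890")
print(name)  # projects/123456789012/locations/us-central1/endpoints/1234567890

# With the SDK initialised, the endpoint can then be re-attached:
# endpoint = aiplatform.Endpoint(name)
```

The endpoint ID is shown on the endpoint's details page under Online Prediction.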
Step-6 : Run Inference against the deployed endpoint
Alright, we are almost done. Create a Python list called INSTANCE which will hold some input values. These values correspond to the feature columns produced by the preprocessing step in the training script (the two numeric features followed by the one-hot encoded categorical columns). The output variable cnt is not included.
Now we invoke the endpoint by calling the endpoint.predict method and passing a list of lists to it.
INSTANCE = [0.24, 0.81, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 0, 1, 0, 0, 0, 0]
instances_list = [INSTANCE]
prediction = endpoint.predict(instances_list)
print(prediction)
You should see the predicted output against the input values which were passed.
You can try different input values to see the predicted output against them.
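Keep in mind that the training script applies np.log to the target, so the values returned by the model are on a log scale. The sketch below converts them back to rental counts; log_predictions_to_counts is a hypothetical helper and the sample values are made up, not real model output.

```python
import math


def log_predictions_to_counts(log_predictions):
    """Invert the np.log() applied to the target in the training script."""
    return [math.exp(value) for value in log_predictions]


# Made-up log-scale values for illustration (not real model predictions).
sample = [0.0, math.log(16)]
print([round(value) for value in log_predictions_to_counts(sample)])  # [1, 16]
```

With a live endpoint, you would pass prediction.predictions to this helper instead of the sample list.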
Congratulations, you have successfully implemented model training, model upload using model registry and model endpoint deployment.
Step-7 : Run Batch inference against the deployed model
You don't always need to invoke the model and get the predicted output in real time. Oftentimes, you might want to run batch inference against bulk data and store the predicted values somewhere.
For batch inference, you don't need to invoke the endpoint at all. At this point, feel free to delete the endpoint created in the previous steps.
Now, to run batch inference, you will first need the input data against which you want to run inference. So upload the CSV file batch-data.csv to your GCS bucket.
Next, execute the below lines of code.
gcs_input_uri is where your input batch data resides.
BUCKET_URI is where the output of batch inference will be stored.
We will invoke the model.batch_predict method, passing the self-explanatory arguments below. The output of the batch inference run will be a jsonl file.
Note: Make sure to change the bucket name to your own before executing the code below.
gcs_input_uri = "gs://bucket-name/input-dataset/batch-data.csv"
BUCKET_URI = "gs://bucket-name/bikeshare-batch-prediction-output"

batch_predict_job = model.batch_predict(
    job_display_name="bikeshare_batch_predict",
    gcs_source=gcs_input_uri,
    gcs_destination_prefix=BUCKET_URI,
    instances_format="csv",
    predictions_format="jsonl",
    machine_type="n1-standard-4",
    starting_replica_count=1,
    max_replica_count=1,
    # Explanations need an explanation spec on the model; set to False if it has none
    generate_explanation=True,
    sync=False
)
The code block does not necessarily output anything, but after executing it you can head over to Batch Predictions in the Vertex AI console, where you should be able to see your job.
Note: After a couple of minutes, the status should change to “Running”. The execution will take up to 30 minutes to complete.
Upon completion, you should see the output file (jsonl format) inside the path: gs://bucket-name/bikeshare-batch-prediction-output
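Once the output file has been downloaded, it can be read line by line. The sketch below assumes each line is a JSON object with "instance" and "prediction" keys, which you should check against your actual output file. parse_prediction_lines is a hypothetical helper and the sample line is made up.

```python
import json


def parse_prediction_lines(lines):
    """Yield (instance, prediction) pairs from batch prediction JSONL lines."""
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        record = json.loads(line)
        # Assumed keys; check them against your actual output file.
        yield record["instance"], record["prediction"]


# Made-up sample line for illustration (not real job output).
sample = ['{"instance": [0.24, 0.81, 0], "prediction": 2.77}', ""]
for instance, prediction in parse_prediction_lines(sample):
    print(len(instance), prediction)
```

To process a real file, pass an open file handle, for example parse_prediction_lines(open("predictions.jsonl")), where the file name is a placeholder for whatever you downloaded from the output folder.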
Step-8 : Clean-up
Make sure to clean up by first "undeploying" the model from the endpoint and then deleting the endpoint.
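The clean-up can also be done from the notebook. The following is a minimal sketch: cleanup_endpoint is a hypothetical helper, and its argument is assumed to be the aiplatform.Endpoint object from Step-5.

```python
def cleanup_endpoint(endpoint):
    """Undeploy every model from the endpoint, then delete the endpoint.

    `endpoint` is expected to be an aiplatform.Endpoint instance.
    """
    # An endpoint cannot be deleted while models are still deployed to it.
    endpoint.undeploy_all()
    endpoint.delete()
```

Calling cleanup_endpoint(endpoint) stops the serving machine from accruing charges. Remember to also stop or delete the Workbench instance when you are done.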