ML from Local to Production (Vertex AI)

Jesus
Jun 8, 2022 · 5 min read


Moving from testing ML locally to production is easy! This step-by-step guide will walk you through building an ML model from scratch and handing it over to the cloud.

If you are like me and hate reading to do easy stuff, just go to “the code”!

About Data:

  1. “Automobile miles per gallon” data was taken from the StatLib library which is maintained at Carnegie Mellon University.
  2. Label (what you predict): MPG (Miles per Gallon)
  3. Features (what you use to predict): cylinders, displacement, horsepower, weight, acceleration, model year, origin.
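Want to peek at the raw data before writing any code? A quick sketch (this is the same CSV the training script below reads from):

curl -s https://storage.googleapis.com/jchavezar-public-datasets/auto-mpg.csv | head -n 5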

Frameworks:

  1. Pandas
  2. Tensorflow [keras]
  3. Docker
  4. fastapi https://fastapi.tiangolo.com/

  5. gcloud SDK + Vertex AI

First off we’ll use Google Cloud (why not?).

Notes:

  1. Get ready! We’ll use bash in your Mac or Linux terminal, etc., because I’m lazy.
  2. We’ll encapsulate everything with containers, because they are portable!

Security Comes First:

Install the Google Cloud SDK in your terminal and authenticate yourself using Application Default Credentials.

If you want to know more about security behind containers, I loved this story.

This is the way (run these steps in your terminal):

gcloud auth login
gcloud auth application-default login
gcloud auth configure-docker us-central1-docker.pkg.dev

The first two commands trigger an OAuth2 authentication flow (synchronous or asynchronous) and store a refresh token locally; the third configures Docker to push and pull images from Artifact Registry using your gcloud credentials. Thanks guillaume blaquiere!
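To sanity-check that Application Default Credentials are in place, this quick sketch prints a short-lived access token (if it errors out, re-run the login commands above):

gcloud auth application-default print-access-token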

Local Topology

Set your variables, friend!

Check your variables (this might help): we need the Google Cloud project ID, a storage bucket, and your username.
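If you don’t remember your project ID, the SDK can look it up:

gcloud config get-value project
gcloud projects list --format='value(projectId)'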

REGION=something  # [us-central1]                                                            
PROJECT_ID=something
BUCKET=gs://something # Bucket URI @string (keep the gs:// prefix; gsutil and model.save need it)
BUCKET_FOLDER_ARTIFACTS=$BUCKET/something
USERNAME=something
IMAGE_URI=$REGION-docker.pkg.dev/$PROJECT_ID/repo-models/something
ADC=/home/$USERNAME/.config/gcloud/application_default_credentials.json

First: Create the training code, wrap it in a container, and store the model in Google Cloud Storage.

Create your bucket

What the hell is a bucket? Buckets are the basic containers that hold your data in Google Cloud Storage.

gsutil mb -l $REGION $BUCKET

Create folder structure:

if [ ! -d train ]; then
  mkdir train;
fi
cd train
if [ ! -d trainer ]; then
  mkdir trainer;
fi

Create Dockerfile:

Huh? A Dockerfile is a declarative file that describes how to build a Docker image, which you then run as a container.

cat << EOF > Dockerfile
FROM gcr.io/deeplearning-platform-release/tf2-cpu.2-6
WORKDIR /
# Copies the trainer code to the docker image.
COPY trainer /trainer
# Sets up the entry point to invoke the trainer.
ENTRYPOINT ["python", "-m", "trainer.train"]
EOF

Create Code for Machine Learning Training (train.py):

It’s ugly, I know, but these are the steps.

  1. Read miles per gallon data.
  2. Clean and normalize it.
  3. Build the neural network.
  4. Train the model and store it.
cat << EOF > trainer/train.py
import warnings
import pandas as pd
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

warnings.filterwarnings('ignore')
print(tf.__version__)

BUCKET = '$BUCKET_FOLDER_ARTIFACTS'

# Extraction process
dataset = pd.read_csv('https://storage.googleapis.com/jchavezar-public-datasets/auto-mpg.csv')

# Clean the data and one-hot encode the categorical 'Origin' column
dataset = dataset.dropna()
dataset['Origin'] = dataset['Origin'].map({1: 'USA', 2: 'Europe', 3: 'Japan'})
dataset = pd.get_dummies(dataset, prefix='', prefix_sep='')

# Split into train/test sets
train_dataset = dataset.sample(frac=0.8, random_state=0)
test_dataset = dataset.drop(train_dataset.index)

train_stats = train_dataset.describe()
train_stats.pop("MPG")
train_stats = train_stats.transpose()

train_labels = train_dataset.pop('MPG')
test_labels = test_dataset.pop('MPG')

# Normalize features with the training-set statistics
def norm(x):
    return (x - train_stats['mean']) / train_stats['std']

normed_train_data = norm(train_dataset)
normed_test_data = norm(test_dataset)

# Build the neural network
def build_model():
    model_ai = keras.Sequential([
        layers.Dense(64, activation='relu', input_shape=[len(train_dataset.keys())]),
        layers.Dense(64, activation='relu'),
        layers.Dense(1)
    ])
    optimizer = tf.keras.optimizers.RMSprop(0.001)
    model_ai.compile(loss='mse',
                     optimizer=optimizer,
                     metrics=['mae', 'mse'])
    return model_ai

model = build_model()
model.summary()

EPOCHS = 1000
# The patience parameter is the number of epochs to check for improvement
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=10)
early_history = model.fit(normed_train_data, train_labels,
                          epochs=EPOCHS, validation_split=0.2,
                          callbacks=[early_stop])

# Export the model and save it to GCS
print(BUCKET)
model.save(BUCKET)
EOF

Build the training image and run it locally:

docker build -t train .
docker run -ti --name train -e GOOGLE_APPLICATION_CREDENTIALS=/tmp/temp.json -v ${ADC}:/tmp/temp.json train

Awesome, you have trained your first machine learning model and stored it in Google Cloud Storage.
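To confirm the artifacts landed in the bucket, a quick check (a TensorFlow SavedModel shows up as saved_model.pb plus variables/ and assets/ folders):

gsutil ls $BUCKET_FOLDER_ARTIFACTS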

Second: Create the web server that handles predictions, wrap it in a container, and test it locally.

Preparing stage (create your folders):

cd ..
if [ ! -d prediction ]; then
  mkdir prediction;
fi
cd prediction
if [ ! -d app ]; then
  mkdir app;
fi

Create Dockerfile

cat << EOF > Dockerfile
FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7

COPY app /app
WORKDIR /app
RUN pip install scikit-learn joblib pandas tensorflow
CMD ["uvicorn", "main:app", "--host", "0.0.0.0", "--port", "8080"]

EXPOSE 8080
EOF

Create Logic Behind Web server

cat << EOF > app/main.py
from fastapi import Request, FastAPI
from tensorflow import keras
import json
import os

app = FastAPI()

# Vertex AI injects AIP_* variables; fall back to local values for testing
if os.environ.get('AIP_STORAGE_URI') is not None:
    BUCKET = os.environ['AIP_STORAGE_URI']
else:
    BUCKET = '$BUCKET_FOLDER_ARTIFACTS'
print(BUCKET)

model = keras.models.load_model(BUCKET)

@app.get('/')
def get_root():
    return {'message': 'Welcome mpg API: miles per gallon prediction'}

@app.get('/health_check')
def health():
    return 200

if os.environ.get('AIP_PREDICT_ROUTE') is not None:
    method = os.environ['AIP_PREDICT_ROUTE']
else:
    method = '/predict'
print(method)

@app.post(method)
async def predict(request: Request):
    print("----------------- PREDICTING -----------------")
    body = await request.json()
    instances = body["instances"]
    outputs = model.predict(instances)
    response = outputs.tolist()
    print("----------------- OUTPUTS -----------------")
    return {"predictions": response}
EOF

Create a Container Repo

You should be authenticated with GCP by now, so let’s use the SDK:

gcloud artifacts repositories create repo-models --repository-format=docker \
--location=$REGION --description="Models repository"
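To double-check that the repository was created, a quick sketch:

gcloud artifacts repositories list --location=$REGION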

Build / Tag Container

docker build -t $IMAGE_URI .

Run Container Locally

docker run --name predict \
-e GOOGLE_APPLICATION_CREDENTIALS=/tmp/keys/FILE_NAME.json \
-v ${ADC}:/tmp/keys/FILE_NAME.json \
-p 732:8080 $IMAGE_URI
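While it’s running, you can smoke-test the root and health-check routes we defined in main.py from another terminal (we mapped local port 732 to the container’s 8080):

curl http://localhost:732/
curl http://localhost:732/health_check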

You can stop it with Ctrl+C.

For predictions, open a new terminal and make an HTTP request with the data in JSON format (the nine values are the normalized features: six numeric columns plus the three one-hot Origin columns):

curl -X POST -H "Content-Type: application/json" http://localhost:732/predict -d '{
"instances": [[1.4838871833555929,
1.8659883497083019,
2.234620276849616,
1.0187816540094903,
-2.530890710602246,
-1.6046416850441676,
-0.4651483719733302,
-0.4952254087173721,
0.7746763768735953]]
}'

DONE

You have trained, deployed and tested your first prediction engine.

Wait, that’s it? Nope…

If you want to take your model to the next level (production), follow the next steps.

Third: Upload and Deploy on Vertex Endpoints.

Again, VARIABLES:

ENDPOINT_NAME=something # Endpoint's name
MODEL_NAME=something # Model's name
DEPLOYED_MODEL_NAME=something # Deployed model's name
MACHINE_TYPE=n1-standard-2 # Machine type (more types)

Push the Local Image Tested

docker push $IMAGE_URI

Upload your Model

gcloud ai models upload \
--region=$REGION \
--display-name=$MODEL_NAME \
--container-image-uri=$IMAGE_URI \
--container-ports=8080 \
--container-health-route=/health_check \
--artifact-uri=$BUCKET_FOLDER_ARTIFACTS

Create the Endpoint

Wait, what for? Because with endpoints you can deploy multiple model versions without losing the URL. (Smart.)

gcloud ai endpoints create \
--display-name=$ENDPOINT_NAME \
--region=$REGION

List the model and endpoint IDs (again, I’m lazy):

MODEL_ID=$(gcloud ai models list \
--region=$REGION \
--filter=displayName:$MODEL_NAME \
--format='value(name)')
ENDPOINT_ID=$(gcloud ai endpoints list \
--region=$REGION \
--filter=displayName:$ENDPOINT_NAME \
--format='value(name)')

Deploy Endpoint

gcloud ai endpoints deploy-model $ENDPOINT_ID \
--region=$REGION \
--model=$MODEL_ID \
--display-name=$DEPLOYED_MODEL_NAME \
--machine-type=$MACHINE_TYPE \
--min-replica-count=1 \
--max-replica-count=1 \
--traffic-split=0=100
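A note on --traffic-split: the key 0 refers to the model being deployed in this very request, so 0=100 sends it all the traffic. When you deploy a second version later, you can shift traffic gradually. A hedged sketch (the deployed-model IDs here are hypothetical; get the real ones from gcloud ai endpoints describe $ENDPOINT_ID):

gcloud ai endpoints update $ENDPOINT_ID \
--region=$REGION \
--traffic-split=OLD_DEPLOYED_MODEL_ID=90,NEW_DEPLOYED_MODEL_ID=10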

Test it, Test it:

Are you bored of the CLI?

Check this out; it’s the graphical way.
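If you’d rather stay in the terminal, gcloud can call the endpoint too (a sketch; request.json holds the same {"instances": [[...]]} payload we sent with curl earlier):

gcloud ai endpoints predict $ENDPOINT_ID \
--region=$REGION \
--json-request=request.json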
