Deploy a Scalable Machine Learning Model API using Artifact Registry and Cloud Run
This article provides a step-by-step guide on how to deploy a machine learning API using FastAPI and the GCP stack.
Please make sure that you have enabled these GCP Services:
- Artifact Registry
- Cloud Build
- Cloud Run
- Cloud Storage
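If any of these are not yet enabled for your project, they can also be turned on from the terminal; a minimal sketch, assuming the standard API identifiers for these four services:

```shell
# Enable the required GCP APIs (Artifact Registry, Cloud Build,
# Cloud Run, Cloud Storage) for the current project
gcloud services enable \
  artifactregistry.googleapis.com \
  cloudbuild.googleapis.com \
  run.googleapis.com \
  storage.googleapis.com
```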
Dependencies:
- google-cloud-cli (installation guide)
- Docker Desktop (Windows, Mac, Linux)
Then install the Python dependencies:
pip install -r requirements.txt
This article assumes that you already have a trained model. For the sake of demonstration, a skeleton repo is used, with some modifications of my own. Clone it to your local machine:
git clone https://github.com/andreaschandra/aomori
Artifact Registry: Build Docker Image
Create environment variables in your terminal
export PROJECT_ID=$(gcloud config get-value project)
export PROJECT_NUMBER=$(gcloud projects describe $PROJECT_ID --format='value(projectNumber)')
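A quick sanity check that both variables were set correctly; each echo should print a non-empty value:

```shell
# Print the resolved values; both should be non-empty
echo "Project ID: $PROJECT_ID"
echo "Project number: $PROJECT_NUMBER"
```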
Create an artifact repository
Now, we need to create an artifact repository to store the Docker image. We set the repository format to docker and the location to asia-southeast1 (Singapore).
gcloud artifacts repositories create aomori --repository-format=docker --location=asia-southeast1 --description="FastAPI Skeleton"
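To confirm the repository was created, it can be listed or described from the terminal with the standard gcloud subcommands:

```shell
# List repositories in the region, then show details for aomori
gcloud artifacts repositories list --location=asia-southeast1
gcloud artifacts repositories describe aomori --location=asia-southeast1
```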
Build image
To build an image, we can use the command gcloud builds submit. We set the region to asia-southeast1 and store the Docker image in the aomori repository, with image name aomori and tag latest. Each new build pushed with the latest tag automatically replaces the previous one.
Don't forget to replace PROJECT_NAME with your own project ID:
gcloud builds submit --region=asia-southeast1 --tag asia-southeast1-docker.pkg.dev/PROJECT_NAME/aomori/aomori:latest
The build process, and whether it succeeded or failed, is also visible in Google Cloud Build. Check the build details to see what happened inside.
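To inspect the result from the terminal instead of the console, something like the following should work, assuming the same region and repository names as above:

```shell
# Show recent Cloud Build runs and their status
gcloud builds list --region=asia-southeast1 --limit=5

# List the images now stored in the aomori repository
gcloud artifacts docker images list \
  asia-southeast1-docker.pkg.dev/$PROJECT_ID/aomori
```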
Cloud Run: Deploy Container
gcloud run deploy aomori --image asia-southeast1-docker.pkg.dev/jakartaresearch/aomori/aomori:latest --region asia-southeast1 --platform managed \
--args aomori.main:app,--host,0.0.0.0,--port,8080 --cpu 1 --memory 256Mi --timeout 300 --concurrency 80 \
--set-env-vars IS_DEBUG=False,API_KEY=1103371a-e057-4874-b5b9-e96417c711f3,DEFAULT_MODEL_PATH=./sample_model/lin_reg_california_housing_model.joblib \
--allow-unauthenticated
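Once the deploy finishes, Cloud Run prints the service URL; it can also be retrieved afterwards and probed with curl. The /docs path is FastAPI's built-in Swagger UI, which a FastAPI skeleton serves by default; if your app disables it or uses different paths, adjust accordingly:

```shell
# Retrieve the public URL of the deployed service
SERVICE_URL=$(gcloud run services describe aomori \
  --region asia-southeast1 --format='value(status.url)')

# FastAPI serves interactive docs at /docs by default;
# print the HTTP status code (200 means the service is up)
curl -s "$SERVICE_URL/docs" -o /dev/null -w "%{http_code}\n"
```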
Load Testing
Locust was used for HTTP load testing. This tool measures API performance in terms of request throughput and response time.
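As a sketch, Locust can be pointed at the deployed service from the command line. The locustfile.py here is a hypothetical file defining your test tasks, and the user counts and duration are just example values:

```shell
# Resolve the Cloud Run service URL, then run a headless load test:
# 50 users, spawning 10 per second, for 1 minute
SERVICE_URL=$(gcloud run services describe aomori \
  --region asia-southeast1 --format='value(status.url)')
locust -f locustfile.py --headless -u 50 -r 10 -t 1m --host "$SERVICE_URL"
```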
Conclusion
I hope this step-by-step guide helps you learn machine learning deployment.