How to Deploy AI and Computer Vision Containers on Google Cloud Run
Ever since its introduction, containerization has taken the industry by storm. Even giants like Google acknowledge that products ranging from “Gmail to YouTube to Search”*(1) run in containers now. Containerization simplifies distributed applications, reduces cross-deployment issues, and allows development teams to move fast and operate at an unprecedented scale.
At Trueface we looked at containers from day one as a way to distribute offline secure software to our clients with minimal deployment overhead. Below, I describe how to deploy general AI and computer vision containers (including Trueface Visionbox containers) on Google Cloud Run.
What is Google Cloud Run and why use it?
Cloud Run lets you run request-driven or event-driven stateless workloads without having to worry about servers. It takes you from container to production in seconds and supports scaling to zero: you pay nothing while the container is idle, yet the service can automatically scale and load balance across up to 1,000 containers during peak times.
How to deploy a containerized API on Google Cloud Run:
Step 1 — Install the GCloud SDK & Configure Docker
Install the GCloud SDK and log in:
https://cloud.google.com/sdk/install
To log in: gcloud auth login
After authenticating your GCloud installation, configure Docker to work with GCR and GCloud: gcloud auth configure-docker
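The two commands above only work once both gcloud and docker are on your PATH, so it can help to check for them first. The following is a minimal sketch of that check; the function names missing_tools and setup_gcloud_docker are hypothetical helpers introduced here for illustration, not part of the GCloud SDK.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the name of every given tool that is not on PATH.
missing_tools() {
  local tool
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}

# Hypothetical helper: run the two Step 1 commands only if both tools exist.
setup_gcloud_docker() {
  local missing
  missing="$(missing_tools gcloud docker)"
  if [ -n "$missing" ]; then
    echo "Install these first: $missing" >&2
    return 1
  fi
  gcloud auth login
  gcloud auth configure-docker
}
```

After sourcing the snippet, calling setup_gcloud_docker runs the login and Docker configuration in one go, or tells you which tool is missing.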
Step 2 — Download Trueface Agebox and Tag it
If you plan to use a Trueface Visionbox, make sure to request a test token from the Trueface team before continuing. Otherwise, feel free to continue with the tutorial using any containerized REST API you can deploy.
sudo docker pull trueface/agebox:latest
Tag the image with your Google Cloud project ID (held here in $project_name) so that it can be pushed to GCR (Google Container Registry):
sudo docker tag trueface/agebox gcr.io/$project_name/agebox
Step 3 — Push the image
sudo docker push gcr.io/$project_name/agebox
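Steps 2 and 3 can be combined into one small script. This is a sketch under a few assumptions: gcr_image_ref and push_agebox are hypothetical helper names, the image is tagged latest, and docker is run without sudo (add it back if your Docker setup requires it).

```shell
#!/usr/bin/env bash
# Hypothetical helper: build the GCR image reference for a project and image.
# The tag defaults to "latest" when no third argument is given.
gcr_image_ref() {
  local project="$1" image="$2" tag="${3:-latest}"
  echo "gcr.io/${project}/${image}:${tag}"
}

# Hypothetical helper: pull, tag, and push trueface/agebox for one project.
# Each step runs only if the previous one succeeded.
push_agebox() {
  local project="$1"
  local target
  target="$(gcr_image_ref "$project" agebox)"
  docker pull trueface/agebox:latest &&
    docker tag trueface/agebox:latest "$target" &&
    docker push "$target"
}
```

One way to call it is push_agebox "$(gcloud config get-value project)", which reads the project currently configured in gcloud instead of relying on a $project_name variable being set.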
Step 4 — Create A Cloud Run Service
- Navigate to the Google Cloud Console.
- Navigate to Cloud Run within your Google Cloud project and click “create service”
- Input the container image URL
(gcr.io/$project_name/agebox)
- Select region and authentication type (allow unauthenticated for this demo)
- Set your token as an environment variable
Since you’ll be working with computationally intensive deep learning models, I recommend using 2GB of RAM for each deployment. Set the max number of instances; this is the maximum number of container instances Cloud Run will spin up to handle load increases. As mentioned previously, Cloud Run auto-scales from zero to 1,000 containers, making it a more flexible and efficient solution than traditional instance rental.
Click Create.
That’s it. Once the service is done initializing, you’ll get an API URL.
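The console steps in Step 4 can also be approximated from the command line with gcloud run deploy. The sketch below makes several assumptions that are not in the original walkthrough: the service is named agebox, the token is passed in an environment variable called TOKEN (check with the Trueface team for the name the container actually expects), the max instance count of 10 is an arbitrary placeholder, and deploy_agebox and DRY_RUN are hypothetical names introduced here.

```shell
#!/usr/bin/env bash
# Hypothetical helper: deploy the pushed image as a Cloud Run service.
# Arguments: project ID, region, token.
# With DRY_RUN=1 the gcloud command is printed instead of executed.
deploy_agebox() {
  local project="$1" region="$2" token="$3"
  # Rebuild the positional parameters as the full gcloud command line.
  set -- gcloud run deploy agebox \
    --image "gcr.io/${project}/agebox" \
    --region "$region" \
    --platform managed \
    --allow-unauthenticated \
    --memory 2Gi \
    --max-instances 10 \
    --set-env-vars "TOKEN=${token}"  # TOKEN is an assumed variable name
  if [ "${DRY_RUN:-0}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}
```

Running it with DRY_RUN=1 first lets you inspect the exact command before anything is deployed. The flags mirror the console choices: unauthenticated access for the demo, 2GB of memory, and a cap on instances.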
Test your API:
curl "https://api_url/predict?url=https://image_url"
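If the image URL itself contains query parameters or other special characters, it is safer to let curl encode it. The sketch below assumes the same /predict endpoint and url parameter as the request above; predict_url is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: send a GET request to the /predict endpoint.
# Arguments: the service's API URL and the URL of the image to analyze.
# --get turns the --data-urlencode value into an encoded query string.
predict_url() {
  local api_url="$1" image_url="$2"
  curl --silent --show-error --fail --get \
    --data-urlencode "url=${image_url}" \
    "${api_url}/predict"
}
```

For example, predict_url "https://api_url" "https://image_url" sends the same request as the plain curl command, with the image URL percent-encoded. The --fail flag makes curl exit with a non-zero status on HTTP errors, which is convenient in scripts.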
Live demos:
Live Object Detection Running on Stateless Containers Using The SDK
Live Face detect + landmarks + headpose estimation with Stateless Containers (GCR)
Get the frontend used in the age demo above
Conclusion
Deploying high-TPS AI and computer vision APIs at scale keeps getting easier with solutions like Google Cloud Run. Trueface Visionbox and other containerized solutions offer turnkey APIs you can self-host and deploy with minimal research, development, and maintenance overhead.
Nezare Chafni
CTO @ Trueface