How to Deploy AI and Computer Vision Containers on Google Cloud Run

Published in Trueface, Jul 10, 2020

Demo of Agebox running with sample frontend

Ever since its introduction, containerization has taken the industry by storm. Even giants like Google acknowledge that products ranging from “Gmail to YouTube to Search”*(1) now run in containers. Containerization simplifies distributed applications, eases cross-deployment issues, and lets development teams move fast and operate at unprecedented scale.

At Trueface, we have looked at containers from day one as a way to distribute secure, offline software to our clients with minimal deployment overhead. Below, I describe how to deploy general AI and computer vision containers (including Trueface Visionbox containers) on Google Cloud Run.

What is Google Cloud Run and why use it?

Cloud Run lets you run request-driven or event-driven stateless workloads without having to worry about servers. It takes you from container to production in seconds and supports scaling to zero, which means you pay nothing while the container is idle, yet it can automatically scale and load balance across up to 1,000 containers during peak times.

How to deploy a containerized API on Google Cloud Run:

Step 1 — Install the GCloud SDK & Configure Docker

Install the GCloud SDK and log in:

https://cloud.google.com/sdk/install

To log in: gcloud auth login

After authenticating your GCloud installation, configure your Docker to work with GCR and GCloud: gcloud auth configure-docker

Step 2 — Download Trueface Agebox and Tag it

If you plan to use a Trueface Visionbox, make sure to request a test token from the Trueface team before continuing. Otherwise, feel free to continue with the tutorial using any containerized REST API you can deploy.

sudo docker pull trueface/agebox:latest

Tag the image with your Google Cloud project ID (shown here as $project_name) to enable pushing it to GCR (Google Container Registry):

sudo docker tag trueface/agebox gcr.io/$project_name/agebox

Step 3 — Push the image

sudo docker push gcr.io/$project_name/agebox
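Steps 2 and 3 can also be collected into one small script. The following is a minimal sketch, not an official Trueface or Google script: the project ID my-project is a placeholder, and the run helper with its DRY_RUN switch is a hypothetical convenience added here so the commands are only printed until you opt in. Prefix the Docker commands with sudo if your setup requires it, as in the commands above.

```shell
#!/bin/sh
# Sketch: pull, tag, and push the Agebox image in one pass.
# PROJECT_ID is a placeholder; replace it with your own Google Cloud project ID.
PROJECT_ID="my-project"
SOURCE_IMAGE="trueface/agebox:latest"
TARGET_IMAGE="gcr.io/${PROJECT_ID}/agebox"

# DRY_RUN is a hypothetical switch for this sketch: by default each command
# is only printed; run the script with DRY_RUN=0 to actually execute them.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

run docker pull "$SOURCE_IMAGE"
run docker tag "$SOURCE_IMAGE" "$TARGET_IMAGE"
run docker push "$TARGET_IMAGE"
```

Running it as-is prints the three Docker commands so you can check the target image path before anything is pushed.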

Step 4 — Create A Cloud Run Service

Navigate to the Google Cloud Console.
Navigate to Cloud Run within your Google Cloud project and click “Create Service”.
Input the container image URL (gcr.io/$project_name/agebox).
Select a region and authentication type (allow unauthenticated for this demo).
Set your token as an environment variable.

Since you’ll be working with computationally intensive deep learning models, I recommend allocating 2 GB of RAM to each deployment. Then set the maximum number of instances, which caps how many containers (API replicas) Cloud Run will spin up to handle load increases. As mentioned previously, Cloud Run auto-scales from zero to 1,000 containers, making it more flexible and efficient than traditional instance rental.

Setting a Token as an environment variable

Click Create.

That’s it. Once the service is done initializing, you’ll get an API URL.
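If you prefer the command line to the console, Step 4 has a rough counterpart in gcloud run deploy. This is a sketch under several assumptions: the project ID, region, service name, and instance cap are placeholders, and the environment variable name TOKEN is a guess on my part, so use whatever name your container actually expects. The run helper and DRY_RUN switch are hypothetical and only print the command by default.

```shell
#!/bin/sh
# Sketch: the console settings from Step 4 expressed as one gcloud command.
# Every value below is a placeholder to replace with your own.
PROJECT_ID="my-project"
REGION="us-central1"
SERVICE="agebox"
IMAGE="gcr.io/${PROJECT_ID}/agebox"
MAX_INSTANCES=10
TRUEFACE_TOKEN="your-token-here"

# DRY_RUN is a hypothetical switch for this sketch: by default the command
# is only printed; run the script with DRY_RUN=0 to actually execute it.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# "TOKEN" as the variable name is an assumption, not confirmed by the article.
run gcloud run deploy "$SERVICE" \
  --image "$IMAGE" \
  --platform managed \
  --region "$REGION" \
  --allow-unauthenticated \
  --memory 2Gi \
  --max-instances "$MAX_INSTANCES" \
  --set-env-vars "TOKEN=${TRUEFACE_TOKEN}"
```

The flags mirror the console choices: unauthenticated access for the demo, 2 GB of memory, a cap on instances, and the token passed as an environment variable.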

Test your API:

curl "https://api_url/predict?url=https://image_url"
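Image URLs often contain characters such as & or ? that would otherwise be read as part of the API request itself, so it can help to let curl encode the parameter for you. The sketch below assumes the same /predict endpoint and url parameter shown above; API_URL and IMAGE_URL are placeholders, and the run helper with its DRY_RUN switch is a hypothetical addition that only prints the command by default.

```shell
#!/bin/sh
# Sketch: call the deployed API with a URL-encoded image URL.
# Both values are placeholders; use your Cloud Run URL and a real image URL.
API_URL="https://api_url"
IMAGE_URL="https://image_url"

# DRY_RUN is a hypothetical switch for this sketch: by default the command
# is only printed; run the script with DRY_RUN=0 to actually send the request.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# --get sends the data as a query string; --data-urlencode escapes the value.
run curl --silent --show-error --get \
  --data-urlencode "url=${IMAGE_URL}" \
  "${API_URL}/predict"
```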

Live demos:

Age Detection Demo

Live Object Detection Running on Stateless Containers Using The SDK

Live Face detect + landmarks + headpose estimation with Stateless Containers (GCR)

Get the frontend used in the age demo above

Conclusion

Deploying high-TPS AI and computer vision APIs at scale keeps getting easier with solutions like Cloud Run. Trueface Visionbox and other containerized solutions offer turnkey APIs you can self-host and deploy with minimal research, development, and maintenance overhead.

Nezare Chafni

CTO @ Trueface
