Deploying Machine Learning Apps on GCP with Fast API, Docker, Google Cloud Run and API Gateway

Sheng Chai
The Centre for Net Zero Tech Blog
6 min read · Sep 29, 2022


Our last blog post talked about the Faraday tool we’re working on. In this blog post, we will demonstrate how we’ve deployed our Faraday API on Google Cloud Platform, with some code examples to illustrate the key ideas. The three main steps of deploying our API are:

  1. Creating an API using FastAPI
  2. Dockerising the app and publishing it to Google Artifact Registry
  3. Hosting the Docker container on Google Cloud Run and securing it behind Google API Gateway

1️⃣ Creating an API using FastAPI

FastAPI is a modern web framework for building APIs with Python. With FastAPI, you can quickly develop an app and use the @app decorator to define the path and the operation (e.g. GET, POST, etc.).

Here is an example of a code snippet to create a FastAPI app:

fastapidemo.py

Where Model is your Machine Learning model. This app has two operations:

  • A GET operation at / which returns a generic welcome message to test for successful calls
  • A POST operation at /predict/ which accepts an input and returns an output from your model

More details of what FastAPI is capable of are on FastAPI’s website, but here are some of its most powerful features:

  1. Automatic API documentation generated using OpenAPI and SwaggerUI (or ReDoc)
  2. Support for Typing and Data Validation with typing and pydantic
  3. Autocompletions and type hints on most IDEs e.g. VS Code, PyCharm etc

2️⃣ Dockerise the app and publish it to Google Artifact Registry

Docker is the most popular way of wrapping apps in containers. In this step we will:

  1. Define a Dockerfile, which is a recipe for building the Docker image
  2. Host the image in a cloud repository — in this example, Google Artifact Registry
  3. Let our Cloud Run service download the image and spin up a Docker container

Dockerfile
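A minimal sketch of what this Dockerfile might contain (the Python version and file names are assumptions):

```dockerfile
# Dockerfile — builds an image for the FastAPI app (sketch)
FROM python:3.10-slim

WORKDIR /app

# Install dependencies first so this layer is cached between builds
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy the rest of the application code into the image
COPY . .

# Cloud Run expects the container to listen on 0.0.0.0:8080
EXPOSE 8080
CMD ["uvicorn", "fastapidemo:app", "--host", "0.0.0.0", "--port", "8080"]
```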

This Dockerfile copies all the code in our repo into a Docker image. We can then build the Docker image in our CI/CD pipeline and publish it to Google Artifact Registry.

Running Docker Locally

To build the docker image and run the container locally:

  • docker build -t image_name . → this builds a Docker image from the Dockerfile with the name image_name
  • docker run -p 8080:8080 image_name → this launches a Docker container from the image called image_name, mapping port 8080 on your machine to port 8080 in the container
  • Paste http://localhost:8080 into your browser and you should see the welcome message from your FastAPI app

Publishing to Google Artifact Registry via Github Actions

This YAML snippet shows the steps in our GitHub Actions workflow that build and push the Docker image to Google Artifact Registry using the gcloud CLI, where $ARTIFACT_REGISTRY is the URL of our Google Artifact Registry.

docker_push.yaml
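A sketch of the relevant workflow steps, assuming a service-account key stored in GitHub secrets and an illustrative image name and region:

```yaml
# docker_push.yaml — build and push steps (sketch; names and region are assumptions)
steps:
  - uses: actions/checkout@v3

  - uses: google-github-actions/auth@v1
    with:
      credentials_json: ${{ secrets.GCP_SA_KEY }}

  - name: Configure Docker for Artifact Registry
    run: gcloud auth configure-docker europe-west2-docker.pkg.dev

  - name: Build and push image
    run: |
      docker build -t $ARTIFACT_REGISTRY/faraday-api:latest .
      docker push $ARTIFACT_REGISTRY/faraday-api:latest
```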

There are two main ways to host Docker images on Google Cloud Platform: Google Container Registry and Google Artifact Registry. We chose Google Artifact Registry at Centre for Net Zero for reasons outlined in this blog post from Google.

3️⃣ Deploying the container on Google Cloud Run

In this step, we will deploy our container on Google Cloud Run via GitHub Actions, then secure our Cloud Run service behind Google API Gateway with API keys. There are many ways of deploying an app on Google Cloud Platform:

  1. Google App Engine
  2. Google Cloud Run
  3. Google Kubernetes Engine
  4. Google Compute Engine

You can read more about the best type of infrastructure to run your application in this blog post from Google. We’ve chosen Google Cloud Run because it’s the simplest way of deploying containerised apps without the overhead of Kubernetes, and also provides a lot more flexibility compared to Google App Engine. We could also customise things like memory and CPU limits and autoscaling policies.

Deploying to Google Cloud Run using Google’s GitHub Action

Google provides a GitHub Action which makes it easy to deploy a container to Google Cloud Run.

cloud_run_deploy.yaml
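A sketch of the deploy step using Google’s deploy-cloudrun action (the service name, region, and image are assumptions):

```yaml
# cloud_run_deploy.yaml — deploy the container to Cloud Run (sketch)
steps:
  - uses: google-github-actions/auth@v1
    with:
      credentials_json: ${{ secrets.GCP_SA_KEY }}

  - uses: google-github-actions/deploy-cloudrun@v1
    with:
      service: faraday-api
      region: europe-west2
      image: $ARTIFACT_REGISTRY/faraday-api:latest
```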

A couple of notes on Google Cloud Run:

  1. Google Cloud Run requires that your container listens on 0.0.0.0 on port 8080. This is why, in our app.py, we have instructed uvicorn to host the app on 0.0.0.0 on port 8080.
  2. By default, Google Cloud Run services are private, i.e. not accessible to the general public. We could change this via IAM to allow unauthenticated access, but a more secure way is to put the service behind API Gateway.

To put a Google Cloud Run service behind API Gateway, you will need:

  1. An OpenAPI definition
  2. An API config

Recall that FastAPI comes with automatic documentation. You can access the OpenAPI spec from your FastAPI app by calling the /openapi.json endpoint. This generates an OpenAPI 3.0 JSON spec of your API.

Currently, Google API Gateway only supports the Swagger 2.0 spec in a YAML file, so you’ll need to convert the OpenAPI 3.0 JSON spec from your FastAPI app. There are tools online that you could use, but we wrote a custom parser at CNZ.

api_config.yaml
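A sketch of what the converted Swagger 2.0 config might look like (the title, path, and backend address are assumptions); the x-google-backend extension is what points API Gateway at the Cloud Run service:

```yaml
# api_config.yaml — Swagger 2.0 spec for API Gateway (sketch)
swagger: "2.0"
info:
  title: faraday-api
  version: "1.0.0"
schemes:
  - https
produces:
  - application/json
paths:
  /predict/:
    post:
      operationId: predict
      x-google-backend:
        # cloud_run_service_url — the URL of the Cloud Run service
        address: https://cloud-run-service-url.a.run.app
      responses:
        "200":
          description: Model prediction
```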

A couple of things to note about the api_config.yaml file:

  1. cloud_run_service_url: this is the URL of the Cloud Run service, which you can find via the GCP console.
  2. You should explicitly define each endpoint you want to expose to end users in this YAML file. For instance, if your API has an endpoint called endpoint1/ that is not described in this YAML file, it will not be accessible to your end users.

Securing Google Cloud Run Service behind API Gateway with API Keys

The two main ways of restricting access to APIs are via:

  1. API keys: users are given a key which grants them access to the application
  2. JSON Web Tokens (JWT): users must authenticate to gain access to the application

Whilst API keys provide an easy way to restrict access to APIs, using API keys isn’t ‘secure’ and is not considered a form of authentication. Anyone with access to the API key can access the application, and we can only monitor which API keys are being used; we cannot identify which users are accessing our application.

Google API Gateway allows authentication via different mechanisms. The advantages of API keys, however, are that admins can easily create and administer them, and there’s no need to build a user login mechanism to generate the JWTs required for JWT-based authentication.

To enable API keys for your API Gateway, you just need to add the security definitions to your Swagger 2.0 spec (see the Swagger docs).
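For example, a sketch of the security definitions to add to the spec, here assuming the key is supplied in an x-api-key header:

```yaml
# Security definitions for API key access (sketch; header name is an assumption)
securityDefinitions:
  api_key:
    type: apiKey
    name: x-api-key
    in: header
# Apply the API key requirement to every endpoint in the spec
security:
  - api_key: []
```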

Deploy API config to Google API Gateway

Unfortunately, there are no open-source GitHub Actions for deploying API configs to API Gateway like the one we used to deploy Docker images to Google Cloud Run, so we’ll have to use gcloud commands. To deploy API configs to API Gateway:

  1. Create API Config: This step creates an API config based on the api_config.yaml we’ve created when converting FastAPI’s OpenAPI 3.0 JSON spec to the Swagger 2.0 YAML spec.
  2. Deploy API Gateway: This step creates an API Gateway and deploys the config to the API Gateway.
  3. Enabling the API Gateway: Once config is deployed, we need to enable the API Gateway. To do this in the same Github workflow, we can grab the MANAGED_SERVICE_URL from the previous step using sed on the command line outputs.
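The three steps above might look like this with gcloud (the API, config, gateway names, and region are assumptions):

```shell
# 1. Create an API config from the Swagger 2.0 spec (sketch)
gcloud api-gateway api-configs create faraday-config \
  --api=faraday-api --openapi-spec=api_config.yaml

# 2. Create a gateway and deploy the config to it
gcloud api-gateway gateways create faraday-gateway \
  --api=faraday-api --api-config=faraday-config --location=europe-west2

# 3. Grab the managed service name and enable it
MANAGED_SERVICE_URL=$(gcloud api-gateway apis describe faraday-api \
  --format="value(managedService)")
gcloud services enable "$MANAGED_SERVICE_URL"
```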

For a full run-down of the Github action YAML file, check out our cnz-tech-blog Github repository.

Creating API Keys for your users

You can then create API Keys manually using the Console by going to:

API & services > Credentials > Create Credentials > API Key

⚠️ You should also restrict your API Key to only access the cloud run service that you’ve created, so that users with the API Key can only call your API service and nothing else in your GCP project.

End users can supply the API Key in the header when they call the endpoint like so:
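For example, with curl (the gateway URL, key, and request body are placeholders):

```shell
# Call the gateway endpoint, passing the API key in the x-api-key header (sketch)
curl -X POST "https://faraday-gateway-abc123.ew.gateway.dev/predict/" \
  -H "x-api-key: YOUR_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{"features": [1.0, 2.0]}'
```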

The URL here is the default URL that API Gateway creates. We could, however, customise this using Load Balancers. We’ll cover this in a future blog post.

And there you go! You’ve learnt how to deploy your Machine Learning app on Google Cloud Platform using FastAPI and Docker, and how to secure it with an API key.
