Use AI Platform for functions that take longer than a couple of minutes

Lak Lakshmanan
Sep 30

Cloud Functions, Cloud Run, App Engine, and similar services are not a good choice for long-lived functions, i.e., anything that takes longer than a couple of minutes to run. The services themselves impose a limit of 10 or 15 minutes, but that budget includes errors and retries, so you should aim for 2–3 minutes at most. If you want to run a function that takes longer than this, what are your options?

What if you want to run a long-running batch job in a serverless way?

Put your code in a Docker container. Run it using AI Platform. Schedule it using Cloud Scheduler.

Custom containers in AI Platform Training

You can use AI Platform Training to run any arbitrary Docker container; it doesn’t have to be a machine learning job. To run an arbitrary container on a GPU, submit it as a training job:

# REGION and PROJECT_ID must be set in your shell before running this.
gcloud ai-platform jobs submit training gpu_function \
    --scale-tier BASIC_GPU \
    --region "$REGION" \
    --master-image-uri "gcr.io/$PROJECT_ID/some-image-name"
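
If you submit the same container repeatedly (on a schedule, say), each submission needs its own job name, because a name that has already been used in the project is rejected. Here is a minimal sketch in Python that assembles the command above with a timestamped name. The helper names (`unique_job_name`, `build_submit_command`, `submit`) and the sample project and region values are illustrative assumptions, not part of gcloud.

```python
import re
import subprocess
from datetime import datetime, timezone


def unique_job_name(prefix: str) -> str:
    """Append a UTC timestamp so repeated submissions get distinct names."""
    # Job names may contain only letters, digits, and underscores,
    # so replace anything else (the prefix should start with a letter).
    safe_prefix = re.sub(r"[^A-Za-z0-9_]", "_", prefix)
    stamp = datetime.now(timezone.utc).strftime("%Y%m%d_%H%M%S")
    return f"{safe_prefix}_{stamp}"


def build_submit_command(job_name, region, project_id, image_name,
                         scale_tier="BASIC_GPU"):
    """Return the gcloud invocation as an argument list (no shell quoting needed)."""
    return [
        "gcloud", "ai-platform", "jobs", "submit", "training", job_name,
        "--scale-tier", scale_tier,
        "--region", region,
        "--master-image-uri", f"gcr.io/{project_id}/{image_name}",
    ]


def submit(command):
    """Run the command; requires an installed and authenticated gcloud."""
    subprocess.run(command, check=True)


if __name__ == "__main__":
    # Hypothetical project and region values, for illustration only.
    cmd = build_submit_command(
        unique_job_name("gpu_function"),
        region="us-central1",
        project_id="my-project",
        image_name="some-image-name",
    )
    print(" ".join(cmd))
    # submit(cmd)  # uncomment to actually launch the job
```

The timestamp has one-second resolution, so two submissions within the same second would still collide; add a random suffix if that matters for your workload.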

AI Platform Training is just a REST API, so client libraries in many programming languages can invoke it. The container has only two requirements: it needs an entry point, and it must be published in the container registry. Custom machine types are also possible; see the documentation for details.
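
To make the REST angle concrete, here is a rough sketch in Python of the request that corresponds to the gcloud command: a POST to the `projects.jobs.create` endpoint with a job spec as the JSON body. The function name `training_job_request` and the sample values are assumptions for illustration. Authentication and the HTTP call itself are left out.

```python
import json


def training_job_request(project_id, job_id, region, image_uri,
                         scale_tier="BASIC_GPU"):
    """Return (url, body) for an AI Platform Training job submission."""
    url = f"https://ml.googleapis.com/v1/projects/{project_id}/jobs"
    body = {
        "jobId": job_id,  # must be unique within the project
        "trainingInput": {
            "scaleTier": scale_tier,
            "region": region,
            # The custom container to run on the master node.
            "masterConfig": {"imageUri": image_uri},
        },
    }
    return url, body


if __name__ == "__main__":
    # Hypothetical values, for illustration only.
    url, body = training_job_request(
        project_id="my-project",
        job_id="gpu_function_20200930_120000",
        region="us-central1",
        image_uri="gcr.io/my-project/some-image-name",
    )
    print("POST", url)
    print(json.dumps(body, indent=2))
```

One way to send this is the Google API Python client, passing `body` to `projects().jobs().create(parent="projects/my-project", body=body)` on a client built for the `ml` service at version `v1`. Any HTTP client with an OAuth access token should work equally well.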

Concurrent autoscaling?

Being able to launch a custom container on a job-specific cluster satisfies many use cases for serverless functions, but not all of them. One that it does not cover is concurrent autoscaling: we want to receive multiple requests and route them to the same machine, and once that machine starts to get overwhelmed, add more machines. Because each submitted job runs on its own cluster, AI Platform Training cannot share a machine across requests this way. If you need concurrent autoscaling and your tasks last longer than 2–3 minutes, the AI Platform Training solution will not work. You’ll need Kubernetes in that case, and it won’t be serverless.

Google Cloud Platform - Community

A collection of technical articles published or curated by Google Cloud Platform Developer Advocates. The views expressed are those of the authors and don't necessarily reflect those of Google.

Written by Lak Lakshmanan

Professional Services @ Google

