Background Processing with GCP Cloud Tasks and Its Limitations

Khureltulga Dashdavaa
4 min read · May 8, 2022

Cloud Tasks is a great service for simple background processing. However, it requires the workers to expose a public endpoint, which raises security concerns.

Photo by Ilya Pavlov on Unsplash

Background process

Many projects need some kind of background processing at their core. It could be payment processing, sending email, or offloading time-consuming work for a better customer experience. There are two ways to invoke a background process: scheduled (time-based batches) and event-based. A scheduled process is triggered periodically or at a specific time. An event-based process is triggered by some event, typically one that involves a user.

GCP Services for a Background process

The traditional approach to background processing is to run a Compute Engine instance and deploy the code onto it. However, if the background process runs for only a short time (several hours or less per day), this is inefficient and costly. A serverless approach is the better choice here because we use resources only when they are needed. GCP provides several services we can use for serverless background processing: Cloud Tasks, Cloud Scheduler, and Pub/Sub invoke the process, while App Engine, Cloud Run, GKE, and others run the actual work. In addition, GCP offers features to monitor and control these services, which makes them easy to use.

In this article, I only cover building a manually invoked batch process using Cloud Tasks with Cloud Run.

Cloud Tasks

Cloud Tasks works as a message broker. When an event pushes a message (a task) to Cloud Tasks, it makes sure the message is processed by the default or a specified computing resource (the worker).

Retry method

Cloud Tasks has a retry mechanism that reruns the process on the worker if the last execution ended with an error. You can configure how many times it retries. Cloud Tasks can use various types of resources as a worker: Cloud Functions, Cloud Run, Compute Engine, GKE, and even on-premises resources.
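
To make the retry behavior concrete, here is a minimal local simulation. It is not the Cloud Tasks API: the function name dispatch_with_retries and the flaky worker are hypothetical, and I am assuming that the attempt limit counts the first try as well.

```python
from typing import Callable, Tuple


def dispatch_with_retries(
    worker: Callable[[], bool], max_attempts: int
) -> Tuple[bool, int]:
    """Call `worker` until it succeeds or the attempts run out.

    Returns (succeeded, attempts_made). This only imitates the idea of a
    retrying queue locally; it does not talk to Cloud Tasks.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(1, max_attempts + 1):
        if worker():  # True means the worker finished without an error
            return True, attempt
    return False, max_attempts


# Hypothetical flaky worker: fails twice, then succeeds on the third call.
outcomes = iter([False, False, True])
succeeded, attempts = dispatch_with_retries(lambda: next(outcomes), max_attempts=5)
print(succeeded, attempts)
```

A real queue also waits between attempts (backoff), which this sketch leaves out to keep the idea visible.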

Invoke method

Cloud Tasks invokes workers over HTTP, which makes it very flexible. We can even use compute resources outside GCP as long as they provide an HTTP endpoint. If the worker returns a 2XX status, Cloud Tasks assumes the background process finished successfully. On a timeout or any other status, Cloud Tasks assumes the background process failed.
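
From the worker's point of view, this contract means "answer the POST with a 2XX when the job is done, and with anything else when it is not". Below is a minimal sketch using only the Python standard library. The names process_task, TaskHandler, and make_server are my own, and the choice of 204 and 400 as status codes and of port 8080 is an assumption for illustration.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer


def process_task(body: bytes) -> int:
    """Run the (hypothetical) background job and return an HTTP status code."""
    try:
        # An empty body is treated like an empty JSON object.
        payload = json.loads(body or b"{}")
    except ValueError:
        return 400  # not 2XX, so the caller sees this task as failed
    if not isinstance(payload, dict):
        return 400
    # ... the real background work would go here ...
    return 204  # any 2XX status reports success


class TaskHandler(BaseHTTPRequestHandler):
    def do_POST(self) -> None:
        # Read exactly the number of bytes the client announced.
        length = int(self.headers.get("Content-Length", 0))
        status = process_task(self.rfile.read(length))
        self.send_response(status)
        self.end_headers()


def make_server(port: int = 8080) -> HTTPServer:
    """Build the server; call serve_forever() on the result to start it."""
    return HTTPServer(("0.0.0.0", port), TaskHandler)
```

Calling make_server().serve_forever() starts the worker locally. Keeping the job logic in process_task, separate from the HTTP handler, makes it easy to test without opening a socket.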

Invoke control

We can set how many tasks run at the same time, and the maximum number of tasks invoked per second. These settings let us control task throughput and the load on the workers.
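
As a rough sketch, these two settings (plus the retry count) map to the rateLimits and retryConfig fields of a queue in the Cloud Tasks REST API. The helper below only builds the JSON body and does not send it anywhere. The project, location, and queue names and all numbers are made-up values.

```python
import json


def build_queue_body(
    queue_name: str,
    max_concurrent: int,
    max_per_second: float,
    max_attempts: int,
) -> dict:
    """Build a queue definition in the JSON shape of the Cloud Tasks REST API."""
    if max_concurrent < 1 or max_per_second <= 0 or max_attempts < 1:
        raise ValueError("limits must be positive")
    return {
        "name": queue_name,
        "rateLimits": {
            # How many tasks may be running at the same time.
            "maxConcurrentDispatches": max_concurrent,
            # How many tasks may be invoked per second at most.
            "maxDispatchesPerSecond": max_per_second,
        },
        "retryConfig": {
            # Total attempts per task, including the first one.
            "maxAttempts": max_attempts,
        },
    }


# Hypothetical names and limits, for illustration only.
queue_body = build_queue_body(
    "projects/my-project/locations/asia-northeast1/queues/batch-queue",
    max_concurrent=5,
    max_per_second=10,
    max_attempts=3,
)
print(json.dumps(queue_body, indent=2))
```

A low maxConcurrentDispatches protects a worker that cannot handle many jobs in parallel, while maxDispatchesPerSecond smooths out bursts of new tasks.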

Cloud Run

Cloud Run can be one of the workers of Cloud Tasks. I have written about Cloud Run and how to deploy a Flask application to it on this blog (Deploy Flask application on GCP Cloud Run).

Cloud Tasks + Cloud Run

Here (https://cloud.google.com/run/docs/triggering/using-tasks) is a nice article from GCP on how to use Cloud Tasks with Cloud Run. What I struggled with while building this system was that Cloud Tasks could not deliver its requests to Cloud Run properly. As a result, Cloud Tasks assumed that Cloud Run had failed and retried again and again.

My mistake was in the message body of the POST request.

  • Even though the worker did not need any message in the body, I had to add a dummy body
  • As the example code for publishing a message to Cloud Tasks shows, the message body must be encoded
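
The two points above can be sketched as follows. This is not the code from the GCP example: it builds the JSON body of a create-task request for the REST API by hand, where the body bytes are sent as a base64 string. With the Python client library, the equivalent step is passing bytes (for example payload.encode()) as the body. The function name, the worker URL, and the use of an empty JSON object as the dummy body are my assumptions.

```python
import base64
import json
from typing import Optional


def build_create_task_body(url: str, payload: Optional[dict] = None) -> dict:
    """Build the JSON body for creating an HTTP task (REST shape)."""
    # Point 1: never send an empty body. Fall back to a dummy JSON object.
    body_bytes = json.dumps(payload if payload is not None else {}).encode("utf-8")
    return {
        "task": {
            "httpRequest": {
                "httpMethod": "POST",
                "url": url,
                "headers": {"Content-Type": "application/json"},
                # Point 2: the body must be encoded. In the JSON form of the
                # API, byte fields are written as base64 text.
                "body": base64.b64encode(body_bytes).decode("ascii"),
            }
        }
    }


# Hypothetical Cloud Run URL, for illustration only.
request_body = build_create_task_body("https://worker-example.a.run.app/run-batch")
print(json.dumps(request_body, indent=2))
```

Decoding the body field with base64.b64decode gives back the original JSON bytes, which is what the worker receives in the POST request.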

Caveats of Cloud Tasks

Cloud Tasks is easy to use because it includes both queue management and a task handler. Compared to SQS on AWS, we do not have to implement fetching, deleting, or retrying messages on the worker. All we need to do is publish messages, and Cloud Tasks takes care of the rest.

However, Cloud Tasks requires workers to have a public endpoint to receive messages over HTTP. This can leave the workers exposed to the outside. Even though we can add authentication such as IAM or OAuth to the workers, I think that is still not enough security. In addition, Cloud Tasks and the workers communicate over the public network, so we should keep the request data as small as possible.
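
For reference, one way the IAM option can look in practice is to ask Cloud Tasks to attach an OIDC token to each request, so that the worker can verify who is calling. The sketch below only adds the oidcToken field to a create-task body in the REST shape. The function name, the service account, and the URL are hypothetical, and the service account would also need permission to invoke the worker.

```python
import copy


def with_oidc_token(create_task_body: dict, service_account_email: str, audience: str) -> dict:
    """Return a copy of the create-task body with an OIDC token request added."""
    body = copy.deepcopy(create_task_body)  # leave the caller's dict untouched
    body["task"]["httpRequest"]["oidcToken"] = {
        # Identity that Cloud Tasks uses when it calls the worker.
        "serviceAccountEmail": service_account_email,
        # Usually the URL of the worker that checks the token.
        "audience": audience,
    }
    return body


# Hypothetical values, for illustration only.
plain_body = {
    "task": {
        "httpRequest": {
            "httpMethod": "POST",
            "url": "https://worker-example.a.run.app/run-batch",
        }
    }
}
secured_body = with_oidc_token(
    plain_body,
    service_account_email="tasks-invoker@my-project.iam.gserviceaccount.com",
    audience="https://worker-example.a.run.app",
)
```

This rejects unauthenticated callers, but the endpoint itself is still reachable from the public network, which is the limitation described above.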

Summary

I have briefly described what a background task is and how it can be implemented on GCP. Cloud Tasks is simple to use and a great service for typical cases. However, if the project requires a high level of security, for example batch processes that must not have a public endpoint, Cloud Tasks may not be the best solution. In that case, I would actually recommend AWS.

--

Khureltulga Dashdavaa

Love infrastructure solutions on AWS, GCP. Working at an IT start-up as Software Engineer in Tokyo. Candidate for Master's Degree in CS from Osaka Univ.