Cloud Functions & Cloud Tasks for Controlled Throughput

Sandeep Manocha
Google Cloud - Community
3 min read · Jul 2, 2023

Many Google Cloud APIs have quotas and limits. In this example I am going to use the Document AI Warehouse service to illustrate the solution, but you can extend it to other, similar problem spaces. My intention in this blog is not to introduce any specific product, but to focus on handling API quotas and limits using Cloud Functions and Cloud Tasks.

Let's first define the problem we solved. We were working on a project to upload files from an on-premises server to Document AI Warehouse as part of a migration, but the upload API imposed a limit of 500 QPS.

After some analysis, we listed the following characteristics of the problem and solution:

  1. Each file is an independent unit of work; the success or failure of one upload does not affect the others
  2. Each document in Document AI Warehouse is a combination of two files: the actual data file and a manifest file that contains the business properties of the file
  3. Data files and manifest files can arrive in any order
  4. The upload activity is an asynchronous back-end process
  5. Transfer to GCS can exceed 500 QPS, but uploads to Document AI Warehouse must stay below 500
  6. Once the upload starts, there should be no human-in-the-loop step
  7. The status of each file must be tracked so that we can reconcile the files
  8. Logging and monitoring are needed

Solution

To solve this problem we developed the following solution.

Describing the full solution with code examples would require a multi-part blog series, but here is a simple explanation: we used the Transfer Service for on-premises data (TSOP) to send files from on-premises servers to a Google Cloud Storage (GCS) bucket.

As soon as a file arrives in this bucket, it triggers the log-file-entry function, which does three things: first, it records in Cloud SQL which file was received, manifest or data file; second, if both files have been received, it creates a Cloud Task; and third, it updates Cloud SQL with a READY status. A sketch of this function is shown below.
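Here is a minimal sketch of what log-file-entry might look like as a Python background Cloud Function. The project, queue, and URL values, the .json manifest convention, and the Cloud SQL helpers (record_arrival, both_files_received, mark_ready) are hypothetical stand-ins; only the Cloud Tasks client calls reflect the real API.

```python
import json
import os

from google.cloud import tasks_v2

# All names below are hypothetical stand-ins for your own values.
PROJECT = os.environ.get("GCP_PROJECT", "my-project")
LOCATION = "us-central1"
QUEUE = "upload-queue"
LOAD_FILE_URL = "https://us-central1-my-project.cloudfunctions.net/load-file"

tasks_client = tasks_v2.CloudTasksClient()


def record_arrival(name, file_type):
    """Placeholder: INSERT an arrival row into Cloud SQL."""


def both_files_received(name):
    """Placeholder: check in Cloud SQL whether data + manifest both arrived."""
    return True


def mark_ready(name):
    """Placeholder: UPDATE the Cloud SQL row to status READY."""


def log_file_entry(event, context):
    """Background Cloud Function triggered by a GCS object-finalize event."""
    bucket, name = event["bucket"], event["name"]
    file_type = "manifest" if name.endswith(".json") else "data"

    record_arrival(name, file_type)

    if both_files_received(name):
        # Enqueue a Cloud Task that will call the load-file function over HTTP.
        task = {
            "http_request": {
                "http_method": tasks_v2.HttpMethod.POST,
                "url": LOAD_FILE_URL,
                "headers": {"Content-Type": "application/json"},
                "body": json.dumps({"bucket": bucket, "name": name}).encode(),
            }
        }
        parent = tasks_client.queue_path(PROJECT, LOCATION, QUEUE)
        tasks_client.create_task(parent=parent, task=task)
        mark_ready(name)
```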

Cloud Tasks then calls the load-file function, which reads the data file and manifest file from the GCS bucket, uploads them to Document AI Warehouse, and updates the status in Cloud SQL to COMPLETED.
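And a matching sketch of load-file as an HTTP-triggered Python function. The parent path, the user id in the request metadata, the assumption that data files are PDFs, and the read_manifest/mark_completed helpers are all placeholders; the upload itself uses the google-cloud-contentwarehouse client.

```python
import functions_framework
from google.cloud import contentwarehouse_v1

# Hypothetical values; use your own project number, location, and user id.
PARENT = "projects/123456789/locations/us"
USER_ID = "user:admin@example.com"

doc_client = contentwarehouse_v1.DocumentServiceClient()


def read_manifest(bucket, name):
    """Placeholder: load the manifest's business properties from GCS."""
    return {"display_name": name}


def mark_completed(name):
    """Placeholder: UPDATE the Cloud SQL row to status COMPLETED."""


@functions_framework.http
def load_file(request):
    """HTTP Cloud Function invoked by Cloud Tasks with a JSON body."""
    payload = request.get_json()
    bucket, name = payload["bucket"], payload["name"]
    manifest = read_manifest(bucket, name)

    document = contentwarehouse_v1.Document(
        display_name=manifest["display_name"],
        raw_document_path=f"gs://{bucket}/{name}",
        # Assumes PDF data files; adjust the file type for your data.
        raw_document_file_type=(
            contentwarehouse_v1.RawDocumentFileType.RAW_DOCUMENT_FILE_TYPE_PDF
        ),
    )
    doc_client.create_document(
        request=contentwarehouse_v1.CreateDocumentRequest(
            parent=PARENT,
            document=document,
            request_metadata=contentwarehouse_v1.RequestMetadata(
                user_info=contentwarehouse_v1.UserInfo(id=USER_ID)
            ),
        )
    )
    mark_completed(name)
    return "OK", 200
```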

During the whole process we collected stats and routed them to BigQuery for analysis and reconciliation. We also exported GCS events to BigQuery for use in the reconciliation process, to make sure no file went unaccounted for.
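As a rough illustration, the reconciliation pass can be a simple anti-join between the two exports. The dataset and table names below are hypothetical: one table holds the exported GCS events, the other mirrors the Cloud SQL status table.

```python
from google.cloud import bigquery

bq = bigquery.Client()

# Find files that arrived in GCS but never reached COMPLETED status.
query = """
SELECT e.file_name
FROM `my-project.migration.gcs_events` AS e
LEFT JOIN `my-project.migration.upload_status` AS s
  ON e.file_name = s.file_name AND s.status = 'COMPLETED'
WHERE s.file_name IS NULL
"""

for row in bq.query(query).result():
    print("Unaccounted file:", row.file_name)
```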

Let's focus on the portion that is unique to this solution: controlled QPS.

To control how many REST API calls we make to the upload endpoint, we used Cloud Tasks. We started with a single queue with its maximum throughput set to 500, which is as simple as it gets. The sketch below shows how little code this takes.
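For instance, creating a queue capped at 500 dispatches per second with the google-cloud-tasks client looks roughly like this (project and queue names are placeholders):

```python
from google.cloud import tasks_v2

client = tasks_v2.CloudTasksClient()
PROJECT, LOCATION = "my-project", "us-central1"  # hypothetical values

queue = tasks_v2.Queue(
    name=client.queue_path(PROJECT, LOCATION, "upload-queue"),
    rate_limits=tasks_v2.RateLimits(
        max_dispatches_per_second=500,  # cap at the 500 QPS API quota
        max_concurrent_dispatches=500,
    ),
)
client.create_queue(
    parent=client.common_location_path(PROJECT, LOCATION), queue=queue
)
```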

Trick with Cloud Tasks

We realized that Cloud Tasks does not scale up as fast as we wanted; in other words, a single queue was not accelerating to its full throughput. To solve this we created four queues instead of one, each with a maximum throughput of 125, as sketched below. We tried many combinations before settling on these numbers, so when you try a similar solution, experiment with different combinations of queues and tasks per queue.
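A sketch of the four-queue setup, again with hypothetical names, where each new task is assigned to a queue round-robin:

```python
import itertools

from google.cloud import tasks_v2

client = tasks_v2.CloudTasksClient()
PROJECT, LOCATION = "my-project", "us-central1"  # hypothetical values

# Four queues at 125 dispatches/sec each still respect the 500 QPS total.
queue_ids = [f"upload-queue-{i}" for i in range(4)]
for queue_id in queue_ids:
    client.create_queue(
        parent=client.common_location_path(PROJECT, LOCATION),
        queue=tasks_v2.Queue(
            name=client.queue_path(PROJECT, LOCATION, queue_id),
            rate_limits=tasks_v2.RateLimits(max_dispatches_per_second=125),
        ),
    )

# Spread task creation across the queues round-robin.
queue_cycle = itertools.cycle(queue_ids)


def next_queue_path():
    """Return the parent path of the next queue in rotation."""
    return client.queue_path(PROJECT, LOCATION, next(queue_cycle))
```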

Conclusion

Although we used Cloud Tasks with Cloud Functions, you can replicate a similar solution for any system where you are calling a REST API endpoint. For example, you can create a custom REST API endpoint in App Engine or Kubernetes and call it using Cloud Tasks.
