Send Messages From Pub/Sub To BigQuery Cheaper with Cloud Run

Martin Beranek · Published in Ackee · Apr 19, 2022 · 2 min read

WARNING!

On 28 August 2022, GCP introduced a new option for pushing messages from Pub/Sub directly to BigQuery; see the announcement at https://cloud.google.com/blog/products/data-analytics/pub-sub-launches-direct-path-to-bigquery-for-streaming-analytics. If you are interested in customizing the data transfer, this blog post might still be useful.

The last time I wrote about this topic, I mentioned how expensive the Dataflow template is for streaming messages from a Pub/Sub subscription into a BigQuery table. My solution was to use a Cloud Function subscribed to the Pub/Sub topic. That works fine in most cases, but we eventually hit the maximum size of a submitted message, which for Cloud Functions is only 10 MB.

For some reason, we had issues even with messages of only about 300 KB. The log looks like this:

Function execution could not start, status: 'request too large'

The log does not say how large the message actually was, and it is hard to determine the real size. Is there anything you can do about the issue? Currently, there seems to be no way to raise the message size limit; the only option is to use a different consumer.

For this reason, I have prepared an implementation of the same pipeline with Cloud Run, where the maximum message size is 32 MB. The implementation uses an OIDC token for service account authorization, so the endpoint is not open to the public and can be considered as safe as the Cloud Function approach.

The Terraform code that creates the push subscription and enables the service account looks like this:

resource "google_pubsub_subscription" "default" {
name = "pubsub_to_bq_${split(".", var.bigquery_table)[2]}_${lower(random_string.random.result)}"
topic = var.topic_name
push_config {
oidc_token {
service_account_email = google_service_account.sa.email
}
push_endpoint = one(google_cloud_run_service.default.status)["url"]
}
}
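
The subscription above references a service account and a Cloud Run service that have to exist elsewhere in the configuration. A minimal sketch of those resources might look as follows; the resource names match the snippet above, while the image path, region variable, and scaling annotations are my assumptions, not taken from the original setup:

resource "google_service_account" "sa" {
  account_id   = "pubsub-to-bq-pusher" # assumed name
  display_name = "Pub/Sub push to BigQuery"
}

resource "google_cloud_run_service" "default" {
  name     = "pubsub-to-bq" # assumed name
  location = var.region     # assumed variable

  template {
    metadata {
      annotations = {
        # Scale to zero between pushes so idle time costs nothing.
        "autoscaling.knative.dev/minScale" = "0"
        "autoscaling.knative.dev/maxScale" = "3"
      }
    }
    spec {
      containers {
        # The image has to live in a GCP registry such as gcr.io (see below).
        image = "gcr.io/${var.project_id}/pubsub-to-bq:latest"
      }
    }
  }
}

# Only the subscription's service account may invoke the service; without
# this binding the OIDC-authenticated pushes would be rejected with 403s.
resource "google_cloud_run_service_iam_member" "invoker" {
  service  = google_cloud_run_service.default.name
  location = google_cloud_run_service.default.location
  role     = "roles/run.invoker"
  member   = "serviceAccount:${google_service_account.sa.email}"
}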

One last thing: you have to provide the Docker image yourself and push it to the gcr.io registry, because GCP does not allow Cloud Run to execute images from Docker Hub. I also checked the possible billing increase; with such short container runtimes, you may well stay within the free tier.
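
If you prefer to drive the image push from Terraform as well, one option is a null_resource with a local-exec provisioner. This is only a sketch under the assumption that Docker is installed and authenticated against gcr.io on the machine running Terraform, and var.image_tag is a hypothetical variable; a CI pipeline is usually the better place for this step:

resource "null_resource" "image" {
  # Re-run the build and push whenever the tag changes.
  triggers = {
    tag = var.image_tag
  }

  provisioner "local-exec" {
    command = "docker build -t gcr.io/${var.project_id}/pubsub-to-bq:${var.image_tag} . && docker push gcr.io/${var.project_id}/pubsub-to-bq:${var.image_tag}"
  }
}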

Hopefully, this is Terraform you might enjoy, and if you are interested, the whole thing is on GitHub.

Originally published at https://www.ackee.agency on April 19, 2022.


Martin Beranek, writer for Ackee

I am an Infra Team Lead at Shipmonk. My main interest is Terraform, mostly on GCP. I am also enthusiastic about backend and related topics: Golang, TypeScript, ...