Reducing your Google Pub/Sub costs by over 95% through micro-batching with Google Cloud Storage

Using streams to separate component logic

Currently even the smallest applications can find themselves having to connect multiple components, either through API calls or queues. Many situations that require control flow or asynchronous communication might need streams or queues, and when working on GCP (Google Cloud Platform) you can take advantage of the managed service Google Pub/Sub as an event ingestion and delivery system.

As a simple example, imagine we want to index logs into Elasticsearch. We wouldn’t want the syslog servers (which collect the logs) to also run the logic that indexes them: they would then need to deal with local queues whenever Elasticsearch was down, and with the performance impact of indexing, among other issues. Instead we will use a separate set of boxes running Logstash (which writes the logs to Elasticsearch), and Google Pub/Sub will let us decouple them from the syslog servers.

In this case we would be considering the following load:

  • Logs sent per second: 10,000
  • Size of each log (after compression): 5 KB

How the setup might look to get the logs from the syslog servers into Elasticsearch.

How does this affect your budget?

As with any cloud deployment, estimating the cost is straightforward using the pricing information published by the cloud vendor. In this case, since there is no cost associated with the network traffic, we only need to calculate the cost of using Pub/Sub to send the messages.

You can find the Pub/Sub pricing here. To simplify the calculation, we will use their new pricing model, which starts in June and which you can find here.

In this scenario we would send about 50 MiB per second to Pub/Sub, which adds up to about 4 TiB per day or 120 TiB per month. Our Pub/Sub cost would therefore be around 4,800 dollars per month, or 57,600 per year. This cost grows linearly, so sending 10 times the amount of data would cost 10 times as much.
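
The arithmetic above can be sketched in a few lines of Python. The price of 40 dollars per TiB is not quoted in the text; it is an assumption implied by the figures (4,800 dollars for 120 TiB). The sketch also assumes that 5 KB means 5 KiB and that a month has 30 days, and all names are illustrative.

```python
# Rough streaming-cost estimate for sending every log through Pub/Sub.
SECONDS_PER_DAY = 86_400
DAYS_PER_MONTH = 30            # assumption: 30-day month
PRICE_PER_TIB_USD = 40.0       # assumption: implied by 4,800 dollars / 120 TiB


def streaming_cost(logs_per_second, log_size_kib=5, price_per_tib=PRICE_PER_TIB_USD):
    """Return throughput and cost figures for the direct streaming model."""
    mib_per_second = logs_per_second * log_size_kib / 1024
    tib_per_day = mib_per_second * SECONDS_PER_DAY / 1024**2
    tib_per_month = tib_per_day * DAYS_PER_MONTH
    monthly_usd = tib_per_month * price_per_tib
    return {
        "mib_per_second": mib_per_second,
        "tib_per_day": tib_per_day,
        "tib_per_month": tib_per_month,
        "monthly_usd": monthly_usd,
        "yearly_usd": monthly_usd * 12,
    }


if __name__ == "__main__":
    for name, value in streaming_cost(10_000).items():
        print(f"{name}: {value:,.2f}")
```

Because every step is a multiplication, the result scales linearly with the number of logs per second, which is the behaviour described above.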

Micro batching in GCP

While the previous example might be acceptable for medium to large companies, the cost grows with the amount of data (either the size of the logs or the number of logs per second). For instance, at 100 thousand logs per second we would expect the cost to grow to more than half a million dollars per year (10 times 57,600). A way to dramatically reduce the cost is to change the architecture from streaming the logs directly through Pub/Sub to working with micro batches, taking advantage of the lack of regional network transfer cost in Google Cloud Storage (GCS).

Micro batching changes the way we work: instead of handling single logs, we handle small batches of them. In this example the syslog servers would create files of 1,000 logs (about 5 MB per file), write them to GCS, and publish the GCS filenames to Pub/Sub. The Logstash servers would then read the names of new objects from Pub/Sub and request those objects from GCS for processing. The added latency depends on the implementation and the underlying applications (in my experience it adds 2 to 5 seconds of delay).
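
As a rough illustration of the producer side, here is a minimal sketch of the batching step. It is not the code used on the syslog servers: `upload` and `publish` are hypothetical callables standing in for a GCS write and a Pub/Sub publish, and the gzip-compressed, newline-delimited file format is an assumption.

```python
import gzip
import uuid
from typing import Callable, Iterable, Iterator, List

BATCH_SIZE = 1000  # logs per file, as in the example above


def batches(logs: Iterable[str], size: int = BATCH_SIZE) -> Iterator[List[str]]:
    """Group an iterable of log lines into lists of at most `size` lines."""
    batch: List[str] = []
    for line in logs:
        batch.append(line)
        if len(batch) == size:
            yield batch
            batch = []
    if batch:  # flush the final, partially filled batch
        yield batch


def object_name() -> str:
    """Build an object name that starts with random characters."""
    return f"{uuid.uuid4().hex}.log.gz"


def ship(logs: Iterable[str],
         upload: Callable[[str, bytes], None],
         publish: Callable[[str], None],
         size: int = BATCH_SIZE) -> int:
    """Write each batch as one object, then announce its name. Returns the batch count."""
    count = 0
    for batch in batches(logs, size):
        name = object_name()
        payload = gzip.compress("\n".join(batch).encode("utf-8"))
        upload(name, payload)  # store the object first...
        publish(name)          # ...so the name is only announced once it exists
        count += 1
    return count


if __name__ == "__main__":
    store, queue = {}, []  # in-memory stand-ins for GCS and Pub/Sub
    shipped = ship((f"log line {i}" for i in range(2500)), store.__setitem__, queue.append)
    print(shipped, "objects written,", len(queue), "names published")
```

A real producer would probably also flush on a timer, so that a quiet period does not hold a partially filled batch back indefinitely.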

Here you can find the code for an example container that runs a Python script which reads new filename entries from Pub/Sub, requests the file contents from GCS and sends them to Logstash through a local TCP socket.
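
To give an idea of the shape of such a consumer, here is a minimal sketch. It is not the linked script: `download` and `send` are hypothetical callables standing in for a GCS read and a write to the Logstash TCP socket, and the gzip-compressed, newline-delimited file format is an assumption.

```python
import gzip
from typing import Callable


def handle_message(name: str,
                   download: Callable[[str], bytes],
                   send: Callable[[bytes], None]) -> int:
    """Fetch the object named in a Pub/Sub message and forward its log lines.

    Returns the number of lines forwarded.
    """
    payload = gzip.decompress(download(name))
    lines = [line for line in payload.decode("utf-8").split("\n") if line]
    for line in lines:
        # One newline-terminated log per write, as a line-oriented TCP input expects.
        send(line.encode("utf-8") + b"\n")
    return len(lines)


if __name__ == "__main__":
    # In-memory stand-ins. In a real deployment `send` could be the `sendall`
    # method of a socket from socket.create_connection(("127.0.0.1", 5000)),
    # where the host and port are hypothetical.
    objects = {"ab12/batch.log.gz": gzip.compress(b"first log\nsecond log\n")}
    forwarded = []
    count = handle_message("ab12/batch.log.gz", objects.__getitem__, forwarded.append)
    print(count, "lines forwarded")
```

In a real subscriber the message would normally be acknowledged only after this function returns without raising, so that a failed download or a broken socket leads to a redelivery rather than to lost logs.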

How the setup would change if we were using micro batching. In this case there are extra calls to Google Cloud Storage to store objects and retrieve them after getting their location from Pub/Sub.

Reducing your cost with micro batching

In the micro batching architecture, we would need to consider the following costs:


Google Pub/Sub

In this architecture Pub/Sub would only be used to send and receive GCS filenames (with a randomised prefix to avoid hot spots). We can estimate 10 filenames per second of up to 100 bytes each, which comes to around 2.5 GiB per month and costs about 10 cents per month.

Google Cloud Storage

In this case we would store the batches in GCS regional storage and keep them for up to one day. For this we need to consider the write operations that create the files, the read operations that retrieve them, and the cost of storing the files for one day.

Write operations to GCS
This would be a class A operation with a cost of 0.05 dollars per 10 thousand operations.
Write operations: 10 per second / 25,920,000 per month.
Cost of write operations per month: ≈130 dollars

Read operations to GCS
This would be a class B operation with a cost of 0.004 dollars per 10 thousand operations.
Read operations: 10 per second / 25,920,000 per month.
Cost of read operations per month: ≈10.4 dollars

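Storage in GCS
Pricing: $0.020 per GB per month, so roughly $0.0006 per GB for one day (0.020 / 30 ≈ 0.00067, rounded down here).
Amount stored over the month: 120 TiB, with each batch kept for one day.
Storage cost per month: ≈74 dollars (about 82 dollars without the rounding)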

In this case the micro batching strategy would cost about 215 dollars per month, or 2,580 per year. That is a monthly cost reduction of more than 95% compared with the streaming model described before, in exchange for less than 5 seconds of extra latency. If you can tolerate more latency, you can reduce the cost further by increasing the batch size: fewer, larger files mean fewer Pub/Sub messages and fewer read and write operations to Google Cloud Storage, and larger batches can also compress better, which lowers the storage cost. The cost can also be reduced if you are willing to keep less than one day of data available at any time (reducing the GCS storage cost).
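
To see how the total reacts to the batch size, the estimate can be put into a small Python model. This is a sketch built from the prices quoted in this section, with a few assumptions: a 30-day month, 5 KB read as 5 KiB, GB treated as GiB, the rounded figure of 0.0006 dollars per GB per day, and a Pub/Sub price of 40 dollars per TiB implied by the streaming estimate. All names are illustrative.

```python
SECONDS_PER_MONTH = 30 * 86_400   # assumption: 30-day month
CLASS_A_PER_10K_USD = 0.05        # GCS write operations
CLASS_B_PER_10K_USD = 0.004       # GCS read operations
STORAGE_PER_GB_DAY_USD = 0.0006   # rounded daily figure used in this section
PUBSUB_PER_TIB_USD = 40.0         # assumption: implied by 4,800 dollars / 120 TiB


def micro_batch_monthly_cost(logs_per_second=10_000, batch_size=1000,
                             log_size_kib=5, name_bytes=100, retention_days=1):
    """Estimate the monthly cost components of the micro batching model."""
    objects = logs_per_second / batch_size * SECONDS_PER_MONTH
    writes = objects / 10_000 * CLASS_A_PER_10K_USD
    reads = objects / 10_000 * CLASS_B_PER_10K_USD
    # Every GiB written during the month is billed for `retention_days` days.
    gib_written = logs_per_second * log_size_kib * SECONDS_PER_MONTH / 1024**2
    storage = gib_written * STORAGE_PER_GB_DAY_USD * retention_days
    # Pub/Sub only carries one filename per object.
    pubsub = objects * name_bytes / 1024**4 * PUBSUB_PER_TIB_USD
    return {
        "writes": writes,
        "reads": reads,
        "storage": storage,
        "pubsub": pubsub,
        "total": writes + reads + storage + pubsub,
    }


if __name__ == "__main__":
    for size in (1000, 5000, 10_000):
        total = micro_batch_monthly_cost(batch_size=size)["total"]
        print(f"batch size {size}: about {total:.0f} dollars per month")
```

In this model the batch size only affects the operation and Pub/Sub components, so once the batches are large the storage term dominates and shortening the retention period becomes the more effective lever. The model does not capture any compression gain from larger batches.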

When scaling the solution, keep in mind that above 1,000 write requests per second (about 100x the example in this article) you would need to look into Google’s best practices for scaling GCS, found here. The Logstash and syslog servers would also do some additional processing, but not enough to noticeably increase the budget.

While this post focuses on Google Cloud’s Pub/Sub, the same strategy could be used with other cloud services such as SQS and S3, or even in your own data centre, for example by placing less load on a Kafka cluster and storing the files in HDFS (with a much larger impact on latency than the cloud counterparts).


When architecting solutions we should always strive for simplicity and to get the best performance possible, but the tradeoffs of complexity and performance impact should be balanced with their associated cost impact. If you happen to have a solution that requires a large amount of data to be streamed and can accept a couple of extra seconds of latency, micro batching can drastically reduce your required budget.

Let me know if you are doing something similar and how it went, or if you have any comments on this approach!

Example code for micro batching with Pub/Sub, GCS and Logstash: