Things I wish I knew about Google Cloud Pub/Sub: Part 2

Alex Hong
Google Cloud - Community
5 min read · Dec 11, 2020

This is a continuation of our three-part series providing useful tidbits on using Pub/Sub effectively. In Part 1, Megan Potter gave a high-level overview of Pub/Sub’s concepts and how to call the Pub/Sub APIs. This post dives a bit deeper into the lifecycle of a message, as well as the publisher and subscriber settings you can tune to optimize your application when using the official client libraries.

What is the lifecycle of a message in Pub/Sub?

A key part of using Pub/Sub effectively is understanding how a message travels from the moment a message is published to when it is consumed.

First, a publisher sends a message to the topic. Once the Pub/Sub server receives the message, it adds fields to the message such as publish timestamp and message ID. Pub/Sub delivers this message to all subscriptions. From each subscription, subscribers pull messages via one of two pull calls (more on this in a bit). Each subscription has an acknowledgement deadline, which determines how much time is allocated for the client to acknowledge (ack) this message before the server redelivers it. If the client holds the message beyond the agreed deadline without acking, or if the client sends a nack (negative acknowledgement) request, the message will be redelivered (potentially to a different subscriber on the same subscription).

[Figure: Typical subscriber flow for receiving messages]

Sometimes the acknowledgement deadline is not enough time for the client to ack a message, so a ModifyAckDeadline request is used to extend the deadline. The official client libraries handle this transparently under the covers, optimizing ack deadline extensions with a frequency distribution model based on previous processing times.

What publisher settings are available, and what do they do?

Our Pub/Sub client libraries provide many configuration options to control the publishing of messages, including batching functionality.

Batch publishing helps decrease overhead by publishing multiple messages as part of a single Publish RPC. When you call Publish using the official client libraries, your message is not published right away, but instead added to a batcher to be published alongside other messages. You can configure this behavior using the following:

  • The maximum number of messages per batch
  • The maximum size of a batch in bytes
  • How long to hold on to a batch of messages before sending if a batch is not yet filled (delay).

Increasing the number of messages per batch or size of a batch decreases the number of publish requests you need to make, which translates to higher throughput per publish request. However, you may face increased latency (time between when publish is called in the client and when that publish is acknowledged by the server) if you are not filling your batches fast enough. Conversely, while decreasing batch delay can help with latency, your publish requests might end up queueing if they can’t be pushed out fast enough (lower throughput).

What does acknowledging a message mean? Why do duplicates happen?

Acknowledging (acking) a message tells Pub/Sub that your subscriber has received and processed it, so it does not need to be delivered again on that subscription. Pub/Sub offers at-least-once delivery: a message is redelivered until it is acked, which means your application can see the same message more than once.

Duplicates most commonly happen when the acknowledgement deadline (which defaults to 10 seconds) expires before the client acks. If the client holds a message beyond the deadline without acking, or explicitly nacks it, Pub/Sub redelivers the message, potentially to a different subscriber on the same subscription. An ack can also fail to reach the server, in which case the message is redelivered even though it was already processed. For these reasons, message processing should be idempotent.

As noted above, the client libraries use ModifyAckDeadline requests to extend the deadline automatically while your callback is still working on a message, which reduces (but does not eliminate) deadline-expiry duplicates.
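Because delivery is at-least-once, handlers are often made idempotent by skipping work for messages already seen. This is a minimal in-memory sketch; a real deployment would use a durable store (e.g. a database keyed on `message_id` or an application-level ID), since a per-process set does not survive restarts or help across multiple subscribers:

```python
# Minimal sketch of idempotent message handling: skip work for message
# IDs we have already processed. In-memory only; illustrative, not
# production-ready.
_seen_ids: set[str] = set()

def handle(message) -> bool:
    """Process a message; returns True if work was done, False if skipped."""
    if message.message_id in _seen_ids:
        message.ack()  # duplicate delivery: already processed, just ack
        return False
    # ... do the actual work here ...
    _seen_ids.add(message.message_id)
    message.ack()
    return True
```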

What are the different ways to pull messages?

Streaming pull is the optimal and recommended way of receiving messages from Pub/Sub.

Streaming pull establishes a long-running streaming connection to the Pub/Sub service for receiving messages. This differs from synchronous Pull, which is a unary RPC call made each time the client needs to receive messages.

Client libraries that support streaming pull will establish a StreamingPull connection (over a gRPC stream). This is usually preferred over the unary Pull method, since there is less overhead. However, there are a few “gotchas” that might cause issues for users of streaming pull.

First, streaming pull connections always terminate with a non-OK status, which can look alarming when viewing logs. This is expected: error codes are emitted whenever a stream is broken, so it does not necessarily mean your application is unhealthy. The client libraries provide an abstraction layer that prevents these transient disconnections from reaching your application code.

In addition, the Pub/Sub servers periodically close streaming pull connections to avoid long-running sticky connections. The client libraries automatically reopen these StreamingPull connections.

What subscriber settings are available, and what do they do?

Client libraries provide settings to adjust the subscriber’s behavior: flow control and concurrency control.

Flow control is useful for limiting how many messages your subscribers will pull from Pub/Sub. If a client pulls too many messages at once, it may not be able to process them all, potentially leading to expired messages and a growing backlog. Messages expire after a certain amount of time so that they can be redelivered to healthy subscribers if one subscriber fails. You can limit both the number of messages and the total size of messages held by the client at one time, so as not to overburden a single client.

Concurrency control allows you to configure how many threads or streams the client library uses to pull messages. Increasing the number of threads or streams allows your client to pull messages from the Pub/Sub server more rapidly. However, because each stream requires additional overhead, having too many streams open can actually decrease performance. Each streaming pull connection can handle 10 MB/s, so consider your message throughput when changing this value. Increasing the number of callback threads allows you to process more messages concurrently.

More to come

Thanks for reading and stay tuned for Part 3 of this series! Please leave a comment below if you have other topics you would like to hear about.

Edit September 2023 — streaming pull previously had a limitation which prevented publish message batches from being split properly on the subscribe side, which could result in flow control not being respected in all scenarios. This limitation no longer exists and this article was updated to reflect that.
