Things I wish I knew about Google Cloud Pub/Sub: Part 2
This is a continuation of our three-part series of useful tidbits on using Pub/Sub effectively. In Part 1, Megan Potter gave a high-level overview of Pub/Sub’s concepts and how to call the Pub/Sub APIs. This post dives a bit deeper into the lifecycle of a message, as well as the publisher and subscriber settings you can use to optimize your application when using the official client libraries.
What is the lifecycle of a message in Pub/Sub?
A key part of using Pub/Sub effectively is understanding how a message travels from the moment it is published to the moment it is consumed.
First, a publisher sends a message to the topic. Once the Pub/Sub server receives the message, it adds fields to the message such as publish timestamp and message ID. Pub/Sub delivers this message to all subscriptions. From each subscription, subscribers pull messages via one of two pull calls (more on this in a bit). Each subscription has an acknowledgement deadline, which determines how much time is allocated for the client to acknowledge (ack) this message before the server redelivers it. If the client holds the message beyond the agreed deadline without acking, or if the client sends a nack (negative acknowledgement) request, the message will be redelivered (potentially to a different subscriber on the same subscription).
Sometimes, the acknowledgement deadline does not give the client enough time to ack a message, so a ModifyAckDeadline request is used to extend the deadline. With the official client libraries, this is handled transparently under the covers: the library optimizes ack deadline extensions using a frequency distribution model based on previous processing times.
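The deadline-extension logic can be approximated as "extend to a high percentile of recent processing times." Here's a stdlib-only Python sketch of that idea (the class name and the window/percentile choices are illustrative, not the client libraries' actual implementation; 10 and 600 seconds are Pub/Sub's minimum and maximum ack deadlines):

```python
import math

class AckDeadlineEstimator:
    """Toy model of how a client library might pick an ack deadline:
    track recent message-processing times and use a high percentile,
    clamped to Pub/Sub's allowed range of 10-600 seconds."""

    MIN_DEADLINE, MAX_DEADLINE = 10, 600  # seconds

    def __init__(self, percentile=0.99, window=1000):
        self.percentile = percentile
        self.window = window          # keep only the most recent samples
        self.samples = []

    def record(self, processing_seconds):
        """Record how long one message took to process."""
        self.samples.append(processing_seconds)
        self.samples = self.samples[-self.window:]

    def deadline(self):
        """Deadline to request via ModifyAckDeadline for the next message."""
        if not self.samples:
            return self.MIN_DEADLINE
        ordered = sorted(self.samples)
        idx = min(len(ordered) - 1, math.ceil(self.percentile * len(ordered)) - 1)
        return max(self.MIN_DEADLINE, min(self.MAX_DEADLINE, math.ceil(ordered[idx])))
```

The point of the percentile (rather than the mean) is that a deadline covering the slowest ~1% of messages avoids most spurious redeliveries without pinning every message to the 600-second maximum.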
What publisher settings are available, and what do they do?
Our Pub/Sub client libraries provide many configuration options to control the publishing of messages, including batching functionality.
Batch publishing helps decrease overhead by publishing multiple messages as part of a single Publish RPC. When you call Publish using the official client libraries, your message is not published right away, but instead added to a batcher to be published alongside other messages. You can configure this behavior using the following:
- The maximum number of messages per batch
- The maximum size of a batch in bytes
- How long to hold on to a partially filled batch before sending it (the batch delay)
Increasing the number of messages or bytes per batch decreases the number of publish requests you need to make, which translates to higher throughput. However, you may see increased publish latency (the time between when Publish is called in the client and when the server acknowledges it) if you are not filling your batches fast enough. Conversely, while decreasing the batch delay can help with latency, your publish requests might end up queueing if they can’t be pushed out fast enough, lowering throughput.
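To make the three knobs concrete, here is a toy, stdlib-only Python batcher that flushes when any threshold is hit. This is illustrative only: the names are made up, and real client libraries also flush partial batches from a background timer rather than only checking at publish time.

```python
import time

class Batcher:
    """Toy publisher-side batcher: buffers messages and flushes when any
    of the three thresholds (messages, bytes, delay) is reached."""

    def __init__(self, max_messages=100, max_bytes=1_000_000, max_latency=0.01):
        self.max_messages = max_messages
        self.max_bytes = max_bytes
        self.max_latency = max_latency      # seconds to hold a partial batch
        self.batch, self.batch_bytes = [], 0
        self.oldest = None                  # when the current batch started
        self.flushed = []                   # batches "sent" to the server

    def publish(self, data: bytes):
        if self.oldest is None:
            self.oldest = time.monotonic()
        self.batch.append(data)
        self.batch_bytes += len(data)
        if (len(self.batch) >= self.max_messages
                or self.batch_bytes >= self.max_bytes
                or time.monotonic() - self.oldest >= self.max_latency):
            self.flush()

    def flush(self):
        if self.batch:
            self.flushed.append(self.batch)  # stand-in for one Publish RPC
        self.batch, self.batch_bytes, self.oldest = [], 0, None
```

Notice the trade-off directly in the code: larger `max_messages`/`max_bytes` means fewer `flushed` entries (fewer RPCs) but messages wait longer in `batch`, while a small `max_latency` sends batches quickly at the cost of more, smaller RPCs.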
What does acknowledging a message mean? Why do duplicates happen?
Acknowledging (acking) a message tells Pub/Sub that your subscriber has processed it and that the server should not deliver it again. Each subscription’s acknowledgement deadline (which defaults to 10 seconds) is the window the client has to ack before the server redelivers the message.
Duplicates happen because Pub/Sub offers at-least-once delivery: if the client holds a message past the deadline without acking, if an ack does not reach the server in time, or if the client nacks the message, Pub/Sub redelivers it, potentially to a different subscriber on the same subscription. For this reason, message processing should be idempotent wherever possible.
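Because delivery is at-least-once, subscriber code should tolerate redelivery. Here is a minimal stdlib-only Python sketch of idempotent handling keyed on the server-assigned message ID (the class and the in-memory persistence are illustrative, not part of the client libraries):

```python
class IdempotentHandler:
    """Toy at-least-once consumer: since Pub/Sub may redeliver a message,
    skip the work for message IDs we've already processed. A real system
    would persist seen IDs (e.g. in a database), not keep them in memory."""

    def __init__(self, handler):
        self.handler = handler
        self.seen = set()
        self.processed = 0

    def on_message(self, message_id: str, data: bytes):
        if message_id in self.seen:
            return "duplicate-acked"   # already handled; just ack again
        self.handler(data)
        self.seen.add(message_id)
        self.processed += 1
        return "acked"
```

Note that acking a duplicate is still important: it tells the server to stop redelivering that copy.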
What are the different ways to pull messages?
Streaming pull is the optimal and recommended way of receiving messages from Pub/Sub.
Streaming pull establishes a long-running streaming connection to the Pub/Sub service for receiving messages. This differs from synchronous Pull, which is a unary RPC call made each time the client needs to receive messages.
Client libraries that support streaming pull will establish a StreamingPull connection (over a gRPC stream). This is usually preferred over the unary Pull method, since there is less overhead. However, there are a few “gotchas” that might cause issues for users of streaming pull.
First, a streaming pull connection will always terminate with a non-OK status, which can look alarming in your logs at first. This is expected: these are the error codes emitted when streams are broken, and they don’t necessarily mean your application is misbehaving. The client libraries provide an abstraction layer that keeps these transient disconnections from reaching your application code.
In addition, streaming pull connections will only live for about 30 minutes before they are closed, since long-lived streams are periodically terminated. The official client libraries will reopen the stream automatically, so this should be transparent to your application.
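The reconnect behavior can be sketched as a loop that reopens the stream after transient failures, with jittered backoff. Everything here (`open_stream`, `handle`, the use of `ConnectionError`, the backoff constants) is a hypothetical stand-in for the real transport, not the client libraries' implementation:

```python
import random
import time

def streaming_pull_loop(open_stream, handle, max_reconnects=5):
    """Toy reconnect loop: streaming pull connections terminate periodically
    with a non-OK status, so the client reopens the stream and resumes."""
    backoff = 0.01
    reconnects = 0
    while reconnects < max_reconnects:
        try:
            for message in open_stream():   # raises when the stream breaks
                handle(message)
        except ConnectionError:
            reconnects += 1
            time.sleep(random.uniform(0, backoff))  # jittered backoff
            backoff = min(backoff * 2, 1.0)
        else:
            return  # stream ended cleanly (e.g. subscriber shutdown)
```

The key property is that stream breakage is treated as routine, not fatal; your callback (`handle`) never sees the disconnections.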
What subscriber settings are available, and what do they do?
Client libraries provide two main groups of settings for tuning subscribers: flow control and concurrency control.
Flow control is useful for limiting how many messages your subscribers will pull from Pub/Sub. If your client pulls in more messages than it can process, messages sit past their acknowledgement deadlines and expire; expiration exists so that messages can be redelivered to healthy subscribers when a single subscriber fails. You can limit both the number of messages and the total size of messages held by the client at one time, so as not to overburden a single client.
Note, streaming pull only guarantees flow control on a best-effort basis. Say you’ve noted your application can only handle 100 messages in any one period, so you set max outstanding messages to 100. The client will pause once it has pulled in 100 messages, which works most of the time. However, if you then publish 500 messages in a single publish batch, the client will receive all 500 messages at once but only be able to process 100 at a time, potentially leading to a growing backlog of expired messages. This is because streaming pull can’t split up messages from a single publish batch. To avoid this, either increase your number of subscribers, or decrease your batch sizes to match subscriber message processing capacity while publishing.
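Conceptually, client-side flow control is a counting limit on outstanding messages and bytes: the receive path blocks once either limit is hit and resumes as messages are acked. A stdlib-only Python sketch of that mechanism (illustrative, not the client libraries' implementation):

```python
import threading

class FlowController:
    """Toy flow controller: blocks the receive path once the number of
    outstanding (unacked) messages or their total size hits a limit."""

    def __init__(self, max_messages=100, max_bytes=100 * 1024 * 1024):
        self.max_messages, self.max_bytes = max_messages, max_bytes
        self.messages, self.bytes = 0, 0
        self.cond = threading.Condition()

    def acquire(self, size: int):
        """Called before handing a pulled message to a callback."""
        with self.cond:
            while (self.messages + 1 > self.max_messages
                   or self.bytes + size > self.max_bytes):
                self.cond.wait()          # pause pulling until capacity frees
            self.messages += 1
            self.bytes += size

    def release(self, size: int):
        """Called when a message is acked or nacked."""
        with self.cond:
            self.messages -= 1
            self.bytes -= size
            self.cond.notify_all()
```

This also illustrates the best-effort caveat above: the limit is enforced where `acquire` is called, so anything delivered in one indivisible chunk (like a single publish batch over streaming pull) can arrive before the check applies.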
Concurrency control allows you to configure how many threads or streams the client library uses to pull messages. Increasing the number of threads or streams allows your client to pull messages from the Pub/Sub server more rapidly. The C++, Go, Java, and Ruby clients also let you configure how many threads are used for callbacks, though some clients, like Node.js, don’t support this. Increasing the number of callback threads allows you to process more messages concurrently.
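The effect of callback concurrency can be sketched with a plain thread pool: more workers means more messages in flight at once. This is an illustrative stand-in, not the client libraries' actual API:

```python
import concurrent.futures

def receive(messages, callback, num_threads=4):
    """Toy subscriber executor: dispatch each message's callback onto a
    pool of worker threads, analogous to configuring callback threads in
    the client libraries."""
    with concurrent.futures.ThreadPoolExecutor(max_workers=num_threads) as pool:
        for msg in messages:
            pool.submit(callback, msg)
    # leaving the with-block waits for all in-flight callbacks to finish
```

With `num_threads=1` this degenerates to serial processing; raising it helps when callbacks spend their time waiting on I/O (databases, downstream APIs) rather than CPU.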
More to come
Thanks for reading and stay tuned for Part 3 of this series! Please leave a comment below if you have other topics you would like to hear about.