Taager’s foray in Messaging: Part 3 — Options when Pulsar Messages are not on the Happy path

Muhammad Hassan
Published in Taager Tech Blog
4 min read · Apr 25, 2022

Usually, messages in a Pulsar system are created by producers on a topic and consumed by consumers on the same topic. The broker orchestrates this whole flow. In this post, we will explore our options in Pulsar to deal with messages that don’t follow this happy path scenario.

The diagram below describes the usual path a Pulsar message takes. First, a producer sends the message to the broker. The broker maintains a list of subscriptions, which are essentially named configurations describing how consumers receive messages from a topic. These subscriptions determine how the broker delivers messages to the consumers. Once a consumer has processed a message, it sends an acknowledgment to the broker to complete the cycle.

Typical operational path for a message
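
To make the flow in the diagram concrete, here is a minimal sketch of it as a toy in-memory model. It is not the Pulsar client API: the ToyBroker class, its methods, and the subscription names are hypothetical and exist only to illustrate publish, deliver, and acknowledge.

```python
class ToyBroker:
    """Toy in-memory broker with a single topic and named subscriptions."""

    def __init__(self):
        # subscription name -> messages not yet acknowledged on that subscription
        self.backlogs = {}

    def subscribe(self, subscription: str) -> None:
        self.backlogs.setdefault(subscription, [])

    def publish(self, message: str) -> None:
        # Each subscription tracks delivery of the message independently.
        for backlog in self.backlogs.values():
            backlog.append(message)

    def receive(self, subscription: str) -> str:
        # Deliver the oldest unacknowledged message for this subscription.
        return self.backlogs[subscription][0]

    def acknowledge(self, subscription: str, message: str) -> None:
        self.backlogs[subscription].remove(message)

    def is_fully_acknowledged(self, message: str) -> bool:
        return all(message not in backlog for backlog in self.backlogs.values())


broker = ToyBroker()
broker.subscribe("billing")
broker.subscribe("shipping")
broker.publish("order-created")

first = broker.receive("billing")
broker.acknowledge("billing", first)
done_after_one_ack = broker.is_fully_acknowledged("order-created")

second = broker.receive("shipping")
broker.acknowledge("shipping", second)
done_after_both_acks = broker.is_fully_acknowledged("order-created")
```

In this sketch the cycle for a message is only complete once every subscription has acknowledged it, which is why done_after_one_ack is still False.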

What happens if the broker is down/unreachable?

The producer has several options for dealing with the broker being down (i.e., the broker not acknowledging sent messages), with the exact implementation depending on the client. For example, we can store unsent messages in a queue of configurable size inside the producer. Unfortunately, once that queue is full, we can’t evict the older messages in it to make room, so newly produced messages are the ones at risk of being lost. To avoid losing them silently, we can configure how the producer behaves in this scenario (e.g., in Java using the blockIfQueueFull parameter and in Go using the DisableBlockIfQueueFull parameter): calls to send more messages either block until space frees up or return immediately with a failure that the application can handle.
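
The fail-fast variant of this behavior can be sketched with a bounded queue. This is a toy model under stated assumptions, not the Pulsar client: ToyProducer, ProducerQueueFullError, max_pending, and on_broker_ack are hypothetical names chosen for illustration.

```python
from collections import deque


class ProducerQueueFullError(Exception):
    """Raised when the pending queue is full (hypothetical error type)."""


class ToyProducer:
    """Toy model of a producer's pending-message queue."""

    def __init__(self, max_pending: int):
        self.max_pending = max_pending
        self.pending = deque()  # sent, but not yet acknowledged by the broker

    def send(self, message: str) -> None:
        # Fail fast instead of silently dropping the message when the queue is full.
        if len(self.pending) >= self.max_pending:
            raise ProducerQueueFullError(f"pending queue full, rejected {message!r}")
        self.pending.append(message)

    def on_broker_ack(self) -> str:
        # The broker confirmed the oldest pending message, so its slot is freed.
        return self.pending.popleft()


producer = ToyProducer(max_pending=2)
producer.send("order-1")
producer.send("order-2")

rejected = []
try:
    producer.send("order-3")  # broker unreachable, queue already full
except ProducerQueueFullError:
    rejected.append("order-3")  # the caller decides what to do: retry, persist, alert

producer.on_broker_ack()   # broker is back and acknowledges "order-1"
producer.send("order-3")   # there is room again
```

The point of failing loudly is that the application, not the client library, gets to decide what happens to a message that could not be queued.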

Potential Issues on the Broker Side

There can be problems even when the messages reach the broker. For example, by default the broker discards messages published to topics that nobody has subscribed to (i.e., topics with no subscriptions). However, due to the push-based nature of Pulsar, message retention on the broker kicks in once a consumer subscribes to the topic.

Once consumers have subscribed, the broker retains each message until all subscriptions on that topic have acknowledged it. If any of the consumers disconnect or fail to acknowledge messages, the broker holds these messages in an internal queue (the subscription’s backlog).

By default, the Pulsar broker will persist the unacknowledged messages indefinitely (depending on the storage) while the broker is running. Conversely, it will delete the messages as soon as all the subscribed consumers have acknowledged these messages. We can essentially override these settings by modifying the retention policies on the broker, allowing us to temporarily store undelivered messages or keep them even after they are delivered. The two relevant settings are:
- Message retention: Specifies the time and size limits within which already acknowledged messages are kept on the broker. These policies are applied on a per-namespace basis.
- Message expiry / TTL: Time To Live (TTL) specifies how long unacknowledged messages are stored before the broker treats them as acknowledged.
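
How the two settings interact can be sketched with a small toy model. It is a simplification and not broker code: StoredMessage, sweep, and the parameters retention_seconds and ttl_seconds are hypothetical names, and a real broker also applies size limits that are left out here.

```python
from dataclasses import dataclass


@dataclass
class StoredMessage:
    payload: str
    published_at: float  # seconds on an arbitrary clock
    acknowledged: bool   # True once every subscription has acknowledged it


def sweep(messages, now, retention_seconds, ttl_seconds):
    """Return the messages a toy broker would still hold at time `now`.

    ttl_seconds=None means unacknowledged messages never expire.
    retention_seconds=0 means acknowledged messages are dropped right away.
    """
    kept = []
    for msg in messages:
        age = now - msg.published_at
        acknowledged = msg.acknowledged
        # TTL: an unacknowledged message that outlives its TTL is treated as acknowledged.
        if not acknowledged and ttl_seconds is not None and age > ttl_seconds:
            acknowledged = True
        # Retention: acknowledged messages survive only inside the retention window.
        if acknowledged and age >= retention_seconds:
            continue
        kept.append(msg)
    return kept


messages = [
    StoredMessage("acked-recent", published_at=900.0, acknowledged=True),
    StoredMessage("acked-old", published_at=0.0, acknowledged=True),
    StoredMessage("unacked-recent", published_at=950.0, acknowledged=False),
    StoredMessage("unacked-stale", published_at=100.0, acknowledged=False),
]

# Defaults described above: keep unacknowledged forever, drop acknowledged at once.
defaults = sweep(messages, now=1000.0, retention_seconds=0, ttl_seconds=None)
# Tuned: keep acknowledged messages for 600 s, expire unacknowledged ones after 300 s.
tuned = sweep(messages, now=1000.0, retention_seconds=600, ttl_seconds=300)
```

With the defaults only the two unacknowledged messages remain; with the tuned values the recently acknowledged message is kept and the stale unacknowledged one is dropped.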

Potential Issues on Consumer Side

Pulsar provides two common ways of dealing with messages that fail on the consumer side. First, if the consumer fails to process a message for any reason (e.g., schema failures), it has the option of routing that message to a dead letter topic, typically after a configurable maximum number of redelivery attempts. A dead letter topic lets us deal with the troublesome messages separately. For example, we can use dead letter topics to debug troublesome messages in a running application. Of course, this option only works if redelivery is triggered for the failed message (usually by sending a negative acknowledgment).
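
The dead letter idea can be sketched as follows. This is a toy loop rather than the Pulsar consumer API: consume_with_dead_letter, parse_quantity, and max_redeliveries are hypothetical names, and a ValueError stands in for any processing failure that leads to a negative acknowledgment.

```python
def consume_with_dead_letter(messages, handler, max_redeliveries):
    """Process messages; route those that keep failing to a dead letter list."""
    processed, dead_letters = [], []
    for message in messages:
        attempts = 0
        while True:
            attempts += 1
            try:
                handler(message)
            except ValueError:
                # Negative acknowledgment: ask for redelivery until the budget is spent.
                if attempts > max_redeliveries:
                    dead_letters.append(message)
                    break
                continue
            processed.append(message)  # positive acknowledgment
            break
    return processed, dead_letters


def parse_quantity(message: str) -> int:
    return int(message)  # raises ValueError for malformed payloads


processed, dead_letters = consume_with_dead_letter(
    ["3", "oops", "7"], parse_quantity, max_redeliveries=2
)
```

The malformed message is attempted once plus two redeliveries and then parked in dead_letters, so it no longer blocks the healthy messages and can be inspected later.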

The other method is enabling retry letter topics. In this mode, the consumer subscribes to both the original topic and a retry topic. A message that could not be processed on the original topic is republished to the retry topic and consumed again only after a delay. This mechanism allows us to deal with expected failures in business logic without relying on the broker to redeliver the messages repeatedly.
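
A delayed retry topic can be sketched with a priority queue ordered by the time at which each message becomes consumable again. Everything here is a toy model with hypothetical names (run_with_retry_topic, inventory_ready, retry_delay, max_retries) and a simulated integer clock; it does not use the Pulsar client.

```python
import heapq


def run_with_retry_topic(messages, handler, retry_delay, max_retries):
    """Toy event loop: failed messages are re-queued on a delayed retry 'topic'."""
    retry_topic = []  # heap of (ready_at, sequence, message, retries_so_far)
    log = []          # (time, message, outcome)
    sequence = 0
    clock = 0

    def attempt(now, message, retries):
        nonlocal sequence
        if handler(message, now):
            log.append((now, message, "acked"))
        elif retries < max_retries:
            sequence += 1
            heapq.heappush(retry_topic, (now + retry_delay, sequence, message, retries + 1))
            log.append((now, message, "retry scheduled"))
        else:
            log.append((now, message, "gave up"))

    # Messages from the original topic are attempted once, in arrival order.
    for message in messages:
        attempt(clock, message, 0)
        clock += 1

    # Messages on the retry topic are consumed only once their delay has elapsed.
    while retry_topic:
        ready_at, _, message, retries = heapq.heappop(retry_topic)
        clock = max(clock, ready_at)
        attempt(clock, message, retries)

    return log


def inventory_ready(message, now):
    # Hypothetical business rule: "order-2" cannot be fulfilled before t=10.
    return message != "order-2" or now >= 10


log = run_with_retry_topic(
    ["order-1", "order-2"], inventory_ready, retry_delay=5, max_retries=3
)
```

In this run "order-2" fails at t=1 and t=6 and succeeds at t=11, while "order-1" is acknowledged immediately and never waits behind the failing message.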

We have only looked at a limited set of ways to handle disruptions when using Pulsar as our message transmission mechanism. However, the Apache Pulsar website has exhaustive documentation that goes into more detail and covers the edge cases we may encounter. These built-in redundancies allow us to be confident in using Pulsar as our event streaming and messaging system.
