Apache Kafka Guide #30 Safe Producer Settings
Hi, this is Paul, and welcome to part #30 of my Apache Kafka guide. Today we will discuss how to configure a safe Apache Kafka producer.
Safe Kafka Producer Settings
Here is a summary of the default settings for Kafka producers and the steps needed for a safe producer setup. Since Kafka 3.0, producers are safe by default and need no additional adjustments: the defaults are acks=all (equivalent to acks=-1) and enable.idempotence=true, which in Java corresponds to producerProps.put("enable.idempotence", true). In Kafka 2.8 and earlier, the defaults are different: acks=1 and enable.idempotence=false, that is, producerProps.put("enable.idempotence", false). Whichever version you run, I strongly advise using a safe producer configuration whenever possible to prevent data loss.
Furthermore, I recommend always using the latest Kafka client version, so that your application sends data to Apache Kafka with the safest defaults and the most reliable delivery.
- Since Kafka 3.0, the producer is safe by default: acks=all and enable.idempotence=true
- Kafka 2.8 and lower: acks=1 and enable.idempotence=false
- Always use upgraded Kafka clients
Safe Kafka Producer Summary
With Kafka 3.0, the producer is safe by default, so upgrading your clients is the simplest route. If you use a Kafka version before 3.0, you need to adjust the configuration manually in your application to guarantee data integrity and replication. Specifically, the acknowledgment setting (acks) should be set to all, so that data replication is verified before an acknowledgment is received. Additionally, set the minimum number of in-sync replicas to two (min.insync.replicas=2), a parameter that can be adjusted at either the broker or topic level. In practice this calls for a replication factor of at least three: one broker can then be down while two in-sync replicas still hold the data before an acknowledgment is sent.
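The relationship between the replication factor and min.insync.replicas is simple arithmetic, and a small sketch may make it concrete. The class and method names below are hypothetical and only illustrate the reasoning; this is not part of the Kafka API.

```java
// Hypothetical helper illustrating the arithmetic behind min.insync.replicas.
final class ReplicaMath {

    // Number of replicas that can be down (or out of sync) while a producer
    // using acks=all can still write to the partition.
    static int tolerableReplicaFailures(int replicationFactor, int minInsyncReplicas) {
        // This sketch rejects combinations under which no write could ever succeed.
        if (minInsyncReplicas < 1 || minInsyncReplicas > replicationFactor) {
            throw new IllegalArgumentException(
                "need 1 <= min.insync.replicas <= replication factor");
        }
        return replicationFactor - minInsyncReplicas;
    }

    public static void main(String[] args) {
        // Replication factor 3 with min.insync.replicas=2: one broker may be down.
        System.out.println(tolerableReplicaFailures(3, 2)); // prints 1
        // Replication factor 2 with min.insync.replicas=2: no broker may be down.
        System.out.println(tolerableReplicaFailures(2, 2)); // prints 0
    }
}
```

With a replication factor of two and min.insync.replicas=2, losing a single broker would already block acks=all writes, which is why three replicas is the usual companion to this setting.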
Moreover, enabling idempotence (enable.idempotence=true) is crucial to prevent duplicate records caused by network retries. To complement this, setting the number of retries to the maximum integer value (retries=MAX_INT) ensures that the producer keeps retrying until the delivery timeout (delivery.timeout.ms, measured in milliseconds) is reached. A practical value for this timeout is two minutes (delivery.timeout.ms=120000).
To optimize performance, set the maximum number of in-flight requests per connection to five (max.in.flight.requests.per.connection=5). This gives high throughput while still preserving message ordering, provided idempotence is enabled (enable.idempotence=true).
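The settings above depend on each other: the Kafka documentation for enable.idempotence states that it requires at most five in-flight requests per connection, a non-zero number of retries, and acks=all. The following is a minimal sketch of a pre-flight check for those three constraints. The class and method names are hypothetical, and the fallback values used when a key is absent are this sketch's assumption (they match the 3.0+ defaults, not the older ones).

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Properties;

// Hypothetical pre-flight check; it only inspects string-valued properties
// and does not talk to Kafka.
final class IdempotenceCheck {

    static List<String> violations(Properties props) {
        List<String> problems = new ArrayList<>();

        // Assumed fallback: 5 in-flight requests when the key is not set.
        int inFlight = Integer.parseInt(
            props.getProperty("max.in.flight.requests.per.connection", "5"));
        if (inFlight > 5) {
            problems.add("max.in.flight.requests.per.connection must be <= 5");
        }

        // Assumed fallback: MAX_INT retries when the key is not set.
        int retries = Integer.parseInt(
            props.getProperty("retries", Integer.toString(Integer.MAX_VALUE)));
        if (retries <= 0) {
            problems.add("retries must be greater than 0");
        }

        // Assumed fallback: acks=all when the key is not set; "-1" is the same setting.
        String acks = props.getProperty("acks", "all");
        if (!acks.equals("all") && !acks.equals("-1")) {
            problems.add("acks must be all");
        }
        return problems;
    }
}
```

An empty list means the three settings are compatible with idempotence; on a pre-3.0 client you would set all of them explicitly rather than rely on the fallbacks.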
In conclusion, while Kafka 3.0 simplifies the process with no additional configurations required, versions below 3.0 necessitate the implementation of these settings to ensure data reliability, prevent duplicates, and optimize performance.
Since Kafka 3.0 the producer is safe by default. Otherwise, upgrade your clients or set the following settings:
- acks=all: ensures the data is properly replicated before the ack.
- min.insync.replicas=2 (broker/topic level): ensures at least 2 brokers in the ISR have the data after an ack.
- enable.idempotence=true: no duplicates due to network retries.
- retries=MAX_INT (producer level): retry until delivery.timeout.ms is reached.
- delivery.timeout.ms=120000: fail after retrying for 2 minutes.
- max.in.flight.requests.per.connection=5: maximum performance while keeping message ordering.
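Put together, the producer-level settings from the list could look roughly like this in Java. This is a sketch under assumptions: the class name, the bootstrap address, and the choice of string serializers are illustrative and not taken from the guide. Note that min.insync.replicas is absent because it is configured on the broker or topic, not on the producer.

```java
import java.util.Properties;

// Hypothetical builder for the producer-level settings listed above.
final class SafeProducerConfig {

    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        // Connection and serialization (illustrative choices, not from the guide).
        props.put("bootstrap.servers", bootstrapServers);
        props.put("key.serializer", "org.apache.kafka.common.serialization.StringSerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.StringSerializer");

        // Wait until all in-sync replicas have the record before the ack.
        props.put("acks", "all");
        // Avoid duplicates when a batch is resent after a network error.
        props.put("enable.idempotence", "true");
        // Keep retrying; delivery.timeout.ms is what bounds the total time.
        props.put("retries", Integer.toString(Integer.MAX_VALUE));
        // Give up on a record after two minutes.
        props.put("delivery.timeout.ms", "120000");
        // Highest value that still preserves ordering with idempotence enabled.
        props.put("max.in.flight.requests.per.connection", "5");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting configuration; the address is a placeholder.
        build("localhost:9092").forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting Properties object is what you would pass to the KafkaProducer constructor in an application that has the kafka-clients library on its classpath.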