Kafka Replication & Min In-Sync Replicas
Kafka is a distributed messaging system that provides resiliency, high availability, and fault tolerance. One of the means it uses to achieve this is data replication across broker nodes. If a broker node fails, the replicated data in a topic partition is not lost, and it can still be consumed from a surviving replica. The level of redundancy is configurable, but the cost of redundancy is an increase in latency while the data is replicated. Understanding this configuration, from producer acks to replication factor and minimum in-sync replicas, is therefore essential.
Minimum In-sync Replicas
When a Kafka producer writes a message to a topic, it writes it to the leader replica of the target partition. This is the replica that the cluster controller has elected leader from the partition’s list of in-sync replicas, which are distributed across the cluster’s broker nodes. The data written to the leader by the producer is then replicated to the partition’s follower replicas. The number of replicas is controlled by the topic’s replication factor. A value of 3 means that the data is replicated from the leader to two followers, so that three replicas in total hold the data.
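The relationship between the replication factor, the leader, and the followers can be sketched with a small in-memory model. This is a toy illustration only, not Kafka's implementation: the names Replica, Partition, create_partition, the broker ids, and the message are all hypothetical, and the first replica is simply assumed to be the leader. In a real cluster the followers fetch records from the leader asynchronously, whereas here the copy happens immediately.

```python
from dataclasses import dataclass, field


@dataclass
class Replica:
    broker_id: int
    log: list = field(default_factory=list)


@dataclass
class Partition:
    replicas: list  # assumption: the first replica acts as the leader

    @property
    def leader(self):
        return self.replicas[0]

    @property
    def followers(self):
        return self.replicas[1:]

    def produce(self, message):
        # The producer writes to the leader replica only.
        self.leader.log.append(message)
        # Simplification: followers receive the record immediately.
        # Real followers pull it from the leader asynchronously.
        for follower in self.followers:
            follower.log.append(message)


def create_partition(broker_ids, replication_factor):
    # Each replica lives on a different broker, so the replication
    # factor cannot exceed the number of brokers available.
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds number of brokers")
    return Partition([Replica(b) for b in broker_ids[:replication_factor]])


# Hypothetical four-broker cluster, replication factor 3.
partition = create_partition(broker_ids=[101, 102, 103, 104], replication_factor=3)
partition.produce("order-created")

# Count how many replicas now hold the message.
copies = sum(1 for r in partition.replicas if "order-created" in r.log)
print(len(partition.followers), copies)
```

With a replication factor of 3 the model ends up with one leader and two followers, and all three replicas hold the message once replication has completed.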
While the data is replicated across the follower partitions, the min.insync.replicas configuration parameter controls the minimum number of these replicas (including the leader) that…