Replication in Kafka
This article describes the principles of replication in Kafka and how to configure it.
NOTE
|
Replication is a mechanism that creates and distributes exact copies of each topic partition on brokers in Kafka.
Replication ensures that messages are available in the event of failures or maintenance.
Replication factor in Kafka
Replication factor is a parameter that determines the number of copies (replicas) of each partition.
Each partition replica is recorded on a separate follower broker. There cannot be more replicas of each partition than there are brokers.
The figure below shows how replicas of one partition are distributed on brokers, messages are written by the producer, and messages are read by the consumer.
Message writing operations are performed through the leader broker of the partition. Message reading operations can be performed either through the leader (default) or through followers (since Kafka 2.4).
Leader election for each partition is carried out using ZooKeeper in accordance with the consensus algorithm protocol ZAB (ZooKeeper Atomic BroadCast).
Partition leaders are evenly distributed among brokers. Each broker can be a leader for one partition and a follower for another.
The replication factor in Kafka for a cluster is determined by the default.replication.factor parameter. All topics created in the cluster will be created with the default replication factor specified for the cluster.
It can also be installed individually for a topic:
-
when creating a topic via the command line;
-
when creating a topic via Kafka Streams applications (available starting from Kafka 2.4);
-
when creating a remote topic via the Mirror Maker 2 mechanism.
If the topic configuration specifies a replication factor value greater than the number of brokers in the cluster, the topic will not be created.
The replication factor for the created topic cannot be overridden. If the replication factor value is changed for the entire cluster, only new topics will be created with the new replication factor.
Replica synchronization
Replica synchronization is relevant when the replication factor value is greater than 1
and the all
value is assigned to the acks producer parameter.
This combination of parameters is a step towards exactly once message delivery guarantee.
The interaction between leader and followers is based on messaging model used in ZooKeeper.
Follower logs tend to be a copy of the leader’s log, having the same offsets and messages in the same order. A replica that contains all messages written to the leader’s log is called a synchronized replica (In-Sync Replica, ISR).
The figure below shows how replica synchronization is achieved.
The main parameter that controls replica synchronization is replica.lag.time.max.ms, the time period after which the leader accepts decision that the replica wrote (if a write acknowledgment occurred) or did not write (a write acknowledgment did not occur) a message.
The success criterion for writing a message as a whole is determined by the min.insync.replicas parameter — the minimum number of replicas that must confirm the write (can also be set and changed for each topic individually using the min.insync.replicas) parameter. Only when this parameter is reached, the leader transmits data about the recorded partition (metadata) and about the created ISRs to the controller.
After receiving the message, the leader offers a recording of the message to the replicas and waits for confirmation from them. Having received confirmation from the required number of replicas (the min.insync.replicas
parameter), the leader sends a signal to commit messages to the replicas and a confirmation to the producer. The message becomes available only after that.
Recommendations for configuring replication settings
Replication factor
There is a general recommendation to set the replication factor to 3
.
In fact, the exact value of the parameter depends on the requirements and resources of your system. When choosing a value, pay attention to the following:
-
High replication factor provides better system stability.
-
High replication factor can cause higher latency experienced by producers as data needs to be replicated to all replica brokers before an acknowledgment is returned.
-
High replication factor requires large disk space due to the large number of copies of data on disks for high availability (HA).
Number of topic partitions
When setting the replication factor, you should remember that the operation of the system is also affected by the num.partitions parameter — the number of partitions of all topics in the cluster (can be configured via the --partitions
option for an individual topic).
The replication factor and the number of partitions are parameters that work in conjunction, and the performance and durability of the system depend on their settings.
When choosing the value of the num.partitions
parameters, it is worth paying attention to what capabilities and limitations a large number of partitions causes:
-
Provides the ability to combine more consumers into a group for scaling.
-
Allows you to use existing cluster brokers more efficiently.
-
Increases the load on ZooKeeper — leader election, support for a list of synchronized replicas.
-
Increases the number of files created and opened on brokers.
Synchronized replicas
The min.insync.replicas
parameter in combination with the acks producer parameter, which is set to all
, affects the exactly once delivery guarantee of messages.
It is recommended to set the min.insync.replicas
parameter less than the replication factor, otherwise even if one broker fails, the message record will not be confirmed due to insufficient synchronized replicas.
Standard recommended scenario for HA guarantees:
-
acks =
all
Additional features
Starting with Apache Kafka 2.4, you can configure consumers to read from nearby synchronized replicas.
To select the closest synchronized replicas, use the RackAwareReplicaSelector plugin. For it to work the following parameters must be configured:
-
broker.rack — indication of the broker’s affiliation with a specific rack. Each broker must be assigned a rack. For the plugin to work effectively, it is recommended to configure the same number of brokers per rack — this way the replicas will be evenly distributed among the brokers.
-
client.rack — setting up a rack ID for connecting consumers.
The RackAwareReplicaSelector
plugin attempts to match the consumer’s client.rack
to the available broker.racks
. It then selects a replica that has the same rack ID as the consumer.
Configure replication
Configure a cluster
After adding and installing the Kafka service as part of ADS cluster, you can configure the replication factor and the number of partitions on the configuration page of the Kafka service via ADCM. You need to expand the Main node in the configuration settings tree and enter a new value for the parameters.

To change the default value for the min.insync.replicas parameter and other broker parameters that are not available in the ADCM interface, you need to enable the Show advanced switch and expand the server.properties node in the configuration settings tree. Using the Add key,value field, select Add property and enter the name of the parameter and its value.

After changing the settings using the ADCM interface, restart the Kafka service. To do this, apply the Restart action by clicking
in the Actions column.
After new values and parameters are filled in, all changes are reflected in the configuration file /usr/lib/kafka/config/server.properties.
Configure a topic
Configuration of the replication factor for an individual topic is performed via the command line. Use the kafka-topics.sh script via the --replication-factor
option when a new topic is created.
The value of the min.insync.replicas
parameter can be changed via the --config
option, which overrides the configuration parameters for an existing topic or sets the configuration parameters when a new topic is created.
$ /usr/lib/kafka/bin/kafka-topics.sh --create --topic new-topic --replication-factor 3 --config min.insync.replicas=2 --bootstrap-server localhost:9092
It is also possible to configure replication for a specific topic via the CMAK user interface. The interface becomes available after adding and installing the Kafka-Manager service in the ADS cluster.
Configuration of replication factor can be done when a new topic is created.
By selecting Topic → Create in the top menu, proceed to creating a topic. Set the parameter values, including the Replication Factor parameter, and click Create.


Setting the min.insync.replicas
parameter can be done for an existing topic or when a new topic is created.
Select Topic → List in the top menu and go to the list of topics. Select the desired topic from the list by clicking the title. On the topic page that opens, click Update Config.


Set the required value for the min.insync.replicas
parameter and click Update Config.


NOTE
For information about the basic principles of working with Kafka and Kafka-Manager services, refer to the Quick start with Kafka article. |