Replication in Kafka

This article describes the principles of replication in Kafka and how to configure it.

NOTE

Replication is a mechanism that creates and distributes exact copies of each topic partition on brokers in Kafka.

Replication ensures that messages are available in the event of failures or maintenance.

Replication factor in Kafka

Replication factor is a parameter that determines the number of copies (replicas) of each partition.

Each partition replica is recorded on a separate follower broker. There cannot be more replicas of each partition than there are brokers.

The figure below shows how replicas of one partition are distributed on brokers, messages are written by the producer, and messages are read by the consumer.

Replicas distributed on brokers
Replicas distributed on brokers
Replicas distributed on brokers
Replicas distributed on brokers

Message writing operations are performed through the leader broker of the partition. Message reading operations can be performed either through the leader (default) or through followers (since Kafka 2.4).

Leader election for each partition is carried out using ZooKeeper in accordance with the consensus algorithm protocol ZAB (ZooKeeper Atomic BroadCast).

Partition leaders are evenly distributed among brokers. Each broker can be a leader for one partition and a follower for another.

The replication factor in Kafka for a cluster is determined by the default.replication.factor parameter. All topics created in the cluster will be created with the default replication factor specified for the cluster.

It can also be installed individually for a topic:

  • when creating a topic via the command line;

  • when creating a topic via Kafka Streams applications (available starting from Kafka 2.4);

  • when creating a remote topic via the Mirror Maker 2 mechanism.

If the topic configuration specifies a replication factor value greater than the number of brokers in the cluster, the topic will not be created.

The replication factor for the created topic cannot be overridden. If the replication factor value is changed for the entire cluster, only new topics will be created with the new replication factor.

Replica synchronization

Replica synchronization is relevant when the replication factor value is greater than 1 and the all value is assigned to the acks producer parameter. This combination of parameters is a step towards exactly once message delivery guarantee.

The interaction between leader and followers is based on messaging model used in ZooKeeper.

Follower logs tend to be a copy of the leader’s log, having the same offsets and messages in the same order. A replica that contains all messages written to the leader’s log is called a synchronized replica (In-Sync Replica, ISR).

The figure below shows how replica synchronization is achieved.

Replica synchronization
Replica synchronization
Replica synchronization
Replica synchronization

The main parameter that controls replica synchronization is replica.lag.time.max.ms, the time period after which the leader accepts decision that the replica wrote (if a write acknowledgment occurred) or did not write (a write acknowledgment did not occur) a message.

The success criterion for writing a message as a whole is determined by the min.insync.replicas parameter — the minimum number of replicas that must confirm the write (can also be set and changed for each topic individually using the min.insync.replicas) parameter. Only when this parameter is reached, the leader transmits data about the recorded partition (metadata) and about the created ISRs to the controller.

After receiving the message, the leader offers a recording of the message to the replicas and waits for confirmation from them. Having received confirmation from the required number of replicas (the min.insync.replicas parameter), the leader sends a signal to commit messages to the replicas and a confirmation to the producer. The message becomes available only after that.

Recommendations for configuring replication settings

Replication factor

There is a general recommendation to set the replication factor to 3.

In fact, the exact value of the parameter depends on the requirements and resources of your system. When choosing a value, pay attention to the following:

  • High replication factor provides better system stability.

  • High replication factor can cause higher latency experienced by producers as data needs to be replicated to all replica brokers before an acknowledgment is returned.

  • High replication factor requires large disk space due to the large number of copies of data on disks for high availability (HA).

Number of topic partitions

When setting the replication factor, you should remember that the operation of the system is also affected by the num.partitions parameter — the number of partitions of all topics in the cluster (can be configured via the --partitions option for an individual topic). The replication factor and the number of partitions are parameters that work in conjunction, and the performance and durability of the system depend on their settings.

When choosing the value of the num.partitions parameters, it is worth paying attention to what capabilities and limitations a large number of partitions causes:

  • Provides the ability to combine more consumers into a group for scaling.

  • Allows you to use existing cluster brokers more efficiently.

  • Increases the load on ZooKeeper — leader election, support for a list of synchronized replicas.

  • Increases the number of files created and opened on brokers.

Synchronized replicas

The min.insync.replicas parameter in combination with the acks producer parameter, which is set to all, affects the exactly once delivery guarantee of messages.

It is recommended to set the min.insync.replicas parameter less than the replication factor, otherwise even if one broker fails, the message record will not be confirmed due to insufficient synchronized replicas.

Standard recommended scenario for HA guarantees:

Additional features

Starting with Apache Kafka 2.4, you can configure consumers to read from nearby synchronized replicas.

To select the closest synchronized replicas, use the RackAwareReplicaSelector plugin. For it to work the following parameters must be configured:

  • broker.rack — indication of the broker’s affiliation with a specific rack. Each broker must be assigned a rack. For the plugin to work effectively, it is recommended to configure the same number of brokers per rack — this way the replicas will be evenly distributed among the brokers.

  • client.rack — setting up a rack ID for connecting consumers.

The RackAwareReplicaSelector plugin attempts to match the consumer’s client.rack to the available broker.racks. It then selects a replica that has the same rack ID as the consumer.

Configure replication

Configure a cluster

After adding and installing the Kafka service as part of ADS cluster, you can configure the replication factor and the number of partitions on the configuration page of the Kafka service via ADCM. You need to expand the Main node in the configuration settings tree and enter a new value for the parameters.

Configure the replication factor and the number of partitions for the cluster
Configure the replication factor and the number of partitions for the cluster

To change the default value for the min.insync.replicas parameter and other broker parameters that are not available in the ADCM interface, you need to enable the Show advanced switch and expand the server.properties node in the configuration settings tree. Using the Add key,value field, select Add property and enter the name of the parameter and its value.

Adding the min.insync.replicas parameter
Configure the min.insync.replicas parameter

After changing the settings using the ADCM interface, restart the Kafka service. To do this, apply the Restart action by clicking actions default dark actions default light in the Actions column.

After new values ​​and parameters are filled in, all changes are reflected in the configuration file /usr/lib/kafka/config/server.properties.

Configure a topic

Configuration of the replication factor for an individual topic is performed via the command line. Use the kafka-topics.sh script via the --replication-factor option when a new topic is created.

The value of the min.insync.replicas parameter can be changed via the --config option, which overrides the configuration parameters for an existing topic or sets the configuration parameters when a new topic is created.

An example of replication factor and min.insync.replicas parameters configuration for an existing topic via the command line
$ /usr/lib/kafka/bin/kafka-topics.sh --create --topic new-topic --replication-factor 3 --config min.insync.replicas=2 --bootstrap-server localhost:9092

It is also possible to configure replication for a specific topic via the CMAK user interface. The interface becomes available after adding and installing the Kafka-Manager service in the ADS cluster.

Configuration of replication factor can be done when a new topic is created.

An example of replication factor configuration when creating a topic via the Kafka-Manager service

By selecting Topic → Create in the top menu, proceed to creating a topic. Set the parameter values, including the Replication Factor parameter, and click Create.

Configure the replication factor for the topic
Configure the replication factor for the topic
Configure the replication factor for the topic
Configure the replication factor for the topic

Setting the min.insync.replicas parameter can be done for an existing topic or when a new topic is created.

An example of min.insync.replicas configuration for an existing topic via the Kafka-Manager service

Select Topic → List in the top menu and go to the list of topics. Select the desired topic from the list by clicking the title. On the topic page that opens, click Update Config.

Switch to configuring a topic in Kafka-Manager
Switch to configuring a topic in Kafka-Manager
Switch to configuring a topic in Kafka-Manager
Switch to configuring a topic in Kafka-Manager

Set the required value for the min.insync.replicas parameter and click Update Config.

Configure a topic in Kafka-Manager
Configure a topic in Kafka-Manager
Configure a topic in Kafka-Manager
Configure a topic in Kafka-Manager
NOTE

For information about the basic principles of working with Kafka and Kafka-Manager services, refer to the Quick start with Kafka article.

Found a mistake? Seleсt text and press Ctrl+Enter to report it