Geo-distributed cluster
A geo-distributed cluster is a cluster whose nodes are located in several geographically distant data centers. This solution ensures high reliability of the system, which will remain operational even if one of the data centers fails.
This article contains examples of geo-distributed ADQM cluster topologies. The most suitable topology depends on the required balance between reliability, low latency, disk storage, and maintainability. For example, the Stretched cluster, Stretched cluster with local reads, and Stretched cluster with local writes topologies are often used to maintain cluster performance in high-latency environments.
In each of the examples below, cluster servers are located in two data centers — West and East.
Stretched cluster
Topology features:
- SELECT and INSERT queries can be routed to East or West, but it is not guaranteed that all reads are local (it is possible that a query will be executed by replicas in different data centers).
- ZooKeeper nodes are placed in the East and West data centers, as well as in a third Nord data center, as ZooKeeper requires an odd number of nodes (see the example ZooKeeper configuration after this list).
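A minimal sketch of the corresponding zookeeper section in config.xml is shown below. The host names (zk1.west, zk2.east, zk3.nord) and the port are assumptions used for illustration; replace them with the addresses of your ZooKeeper ensemble.
<zookeeper>
    <!-- One ZooKeeper node per data center; an odd total preserves the quorum -->
    <node>
        <host>zk1.west</host>
        <port>2181</port>
    </node>
    <node>
        <host>zk2.east</host>
        <port>2181</port>
    </node>
    <node>
        <host>zk3.nord</host>
        <port>2181</port>
    </node>
</zookeeper>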
Configure a logical cluster
In each server configuration (config.xml), define the stretched_cluster cluster that contains two shards, each with two replicas located in different data centers. For shards, set the internal_replication parameter to true to replicate data through a replicated table (data is written to any available replica, and the other replica receives its copy automatically).
<remote_servers>
<stretched_cluster>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1.west</host>
<port>9000</port>
</replica>
<replica>
<host>shard1.east</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2.west</host>
<port>9000</port>
</replica>
<replica>
<host>shard2.east</host>
<port>9000</port>
</replica>
</shard>
</stretched_cluster>
</remote_servers>
Create distributed tables
On all nodes of the cluster, create a replicated table (replicated_table) based on the ReplicatedMergeTree engine.
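For example, such a table could be created as follows. This is a minimal sketch: the column set, the ZooKeeper path, and the {shard}/{replica} macros are assumptions for illustration, so adjust them to the schema and macros configured in your cluster.
-- Hypothetical schema; the ZooKeeper path and macros must match your cluster configuration
CREATE TABLE replicated_table ON CLUSTER stretched_cluster
(
    id UInt64,
    event_date Date,
    value String
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/replicated_table', '{replica}')
ORDER BY id;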
Use the Distributed engine to create distributed tables for both reads and writes on all nodes of the cluster. To do this, run the following query on any node:
CREATE TABLE table_distributed ON CLUSTER stretched_cluster AS replicated_table
ENGINE = Distributed(stretched_cluster, default, replicated_table);
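After the tables are created, clients can read and write through the distributed table from any node in either data center. The queries below are only an illustration and assume the hypothetical schema from the example above.
-- Write through the distributed table: the row is routed to a shard,
-- and the second data center receives a copy via replication.
INSERT INTO table_distributed VALUES (1, today(), 'example');

-- Read through the distributed table: the query may be executed
-- by replicas in either data center.
SELECT count() FROM table_distributed;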
Stretched cluster with local reads
Topology features:
- SELECT and INSERT queries can be routed to East or West, and it is guaranteed that all reads are local.
- ZooKeeper nodes are placed in the East and West data centers, as well as in a third Nord data center, as ZooKeeper requires an odd number of nodes.
Configure logical clusters
In each server configuration (config.xml), define the following clusters:
- stretched_cluster_writes — hosts of both West and East data centers for writes (two shards with two replicas in each one);
- stretched_cluster_west_reads — hosts of the West data center for reads (two shards with one replica in each one);
- stretched_cluster_east_reads — hosts of the East data center for reads (two shards with one replica in each one).
For shards, set the internal_replication parameter to true to replicate data through a replicated table (data is written to any available replica, and the other replica receives its copy automatically).
<remote_servers>
<stretched_cluster_writes>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1.west</host>
<port>9000</port>
</replica>
<replica>
<host>shard1.east</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2.west</host>
<port>9000</port>
</replica>
<replica>
<host>shard2.east</host>
<port>9000</port>
</replica>
</shard>
</stretched_cluster_writes>
<stretched_cluster_west_reads>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1.west</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2.west</host>
<port>9000</port>
</replica>
</shard>
</stretched_cluster_west_reads>
<stretched_cluster_east_reads>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1.east</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2.east</host>
<port>9000</port>
</replica>
</shard>
</stretched_cluster_east_reads>
</remote_servers>
Create distributed tables
On all nodes of the cluster, create a replicated table (replicated_table) based on the ReplicatedMergeTree engine.
Create distributed tables for reads separately in each data center:
- West:
CREATE TABLE table_distributed_west_reads ON CLUSTER stretched_cluster_west_reads AS replicated_table
ENGINE = Distributed(stretched_cluster_west_reads, default, replicated_table);
- East:
CREATE TABLE table_distributed_east_reads ON CLUSTER stretched_cluster_east_reads AS replicated_table
ENGINE = Distributed(stretched_cluster_east_reads, default, replicated_table);
Create identical distributed tables for writes in a logical cluster that combines both data centers:
CREATE TABLE table_distributed_writes ON CLUSTER stretched_cluster_writes AS replicated_table
ENGINE = Distributed(stretched_cluster_writes, default, replicated_table);
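With this layout, an application sends reads to the distributed table of its local data center and writes to the shared write table. A minimal sketch for a client in the West data center, assuming the hypothetical schema shown earlier:
-- Writes go through the shared table and may be processed by either data center.
INSERT INTO table_distributed_writes VALUES (2, today(), 'west client');

-- Reads go through the West-only table, so they are served by West replicas.
SELECT count() FROM table_distributed_west_reads;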
Stretched cluster with local writes
Topology features:
- SELECT and INSERT queries can be routed to East or West, and it is guaranteed that all writes are local.
- ZooKeeper nodes are placed in the East and West data centers, as well as in a third Nord data center, as ZooKeeper requires an odd number of nodes.
Configure logical clusters
In each server configuration (config.xml), define the following clusters:
- stretched_cluster_reads — hosts of both West and East data centers for reads (two shards with two replicas in each one);
- stretched_cluster_west_writes — hosts of the West data center for writes (two shards with one replica in each one);
- stretched_cluster_east_writes — hosts of the East data center for writes (two shards with one replica in each one).
For shards, set the internal_replication parameter to true to replicate data through a replicated table (data is written to any available replica, and the other replica receives its copy automatically).
<remote_servers>
<stretched_cluster_reads>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1.west</host>
<port>9000</port>
</replica>
<replica>
<host>shard1.east</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2.west</host>
<port>9000</port>
</replica>
<replica>
<host>shard2.east</host>
<port>9000</port>
</replica>
</shard>
</stretched_cluster_reads>
<stretched_cluster_west_writes>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1.west</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2.west</host>
<port>9000</port>
</replica>
</shard>
</stretched_cluster_west_writes>
<stretched_cluster_east_writes>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1.east</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2.east</host>
<port>9000</port>
</replica>
</shard>
</stretched_cluster_east_writes>
</remote_servers>
Create distributed tables
On all nodes of the cluster, create a replicated table (replicated_table) based on the ReplicatedMergeTree engine.
Create distributed tables for writes separately in each data center:
- West:
CREATE TABLE table_distributed_west_writes ON CLUSTER stretched_cluster_west_writes AS replicated_table
ENGINE = Distributed(stretched_cluster_west_writes, default, replicated_table);
- East:
CREATE TABLE table_distributed_east_writes ON CLUSTER stretched_cluster_east_writes AS replicated_table
ENGINE = Distributed(stretched_cluster_east_writes, default, replicated_table);
Create identical distributed tables for reads in a logical cluster that combines both data centers:
CREATE TABLE table_distributed_reads ON CLUSTER stretched_cluster_reads AS replicated_table
ENGINE = Distributed(stretched_cluster_reads, default, replicated_table);
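Here the roles are reversed: a client writes to the distributed table of its own data center and reads through the shared table. A minimal sketch for a client in the East data center, assuming the hypothetical schema shown earlier:
-- Writes go through the East-only table, so the initial insert is local;
-- the West copy is created asynchronously via the replicated table.
INSERT INTO table_distributed_east_writes VALUES (3, today(), 'east client');

-- Reads go through the shared table and may be served by either data center.
SELECT count() FROM table_distributed_reads;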
Asymmetric cluster
Topology features:
- Reads are possible in both the West and East data centers, while writes are possible in West only.
- Servers of a single ZooKeeper ensemble are located in the two data centers (a third data center is not needed), while only the servers in West participate in the quorum. Replicas in the East data center cannot become leaders (this behavior is controlled by the replicated_can_become_leader=0 setting in the merge_tree section of config.xml; see the example after this list).
- The insert_quorum parameter is set to 1, as the number of servers in the quorum should be less than or equal to the number of replicas in the main data center (in this example, the West data center, to which data is inserted).
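A minimal sketch of this setting for the config.xml of the East replicas is shown below (only the relevant section is shown; the rest of the configuration is omitted).
<!-- config.xml on East replicas: forbid them to become replication leaders -->
<merge_tree>
    <replicated_can_become_leader>0</replicated_can_become_leader>
</merge_tree>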
Configure logical clusters
In each server configuration (config.xml), define the following clusters:
- asymmetric_cluster_west — hosts of the West data center (two shards with one replica in each one);
- asymmetric_cluster_east — hosts of the East data center (two shards with one replica in each one).
For shards, set the internal_replication parameter to true to replicate data through a replicated table (data is written to any available replica, and the other replica receives its copy automatically).
<remote_servers>
<asymmetric_cluster_west>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1.west</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2.west</host>
<port>9000</port>
</replica>
</shard>
</asymmetric_cluster_west>
<asymmetric_cluster_east>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1.east</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2.east</host>
<port>9000</port>
</replica>
</shard>
</asymmetric_cluster_east>
</remote_servers>
Create distributed tables
On all nodes of the cluster, create a replicated table (replicated_table) based on the ReplicatedMergeTree engine.
Create distributed tables for reads separately in each data center:
- West:
CREATE TABLE table_distributed_reads ON CLUSTER asymmetric_cluster_west AS replicated_table
ENGINE = Distributed(asymmetric_cluster_west, default, replicated_table);
- East:
CREATE TABLE table_distributed_reads ON CLUSTER asymmetric_cluster_east AS replicated_table
ENGINE = Distributed(asymmetric_cluster_east, default, replicated_table);
Create a distributed table for writes in the West data center only:
CREATE TABLE table_distributed_writes ON CLUSTER asymmetric_cluster_west AS replicated_table
ENGINE = Distributed(asymmetric_cluster_west, default, replicated_table);
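In this topology, all writes have to be sent to a node of the West data center, since table_distributed_writes exists only there, while reads use the local table_distributed_reads of each data center. A short sketch, assuming the hypothetical schema shown earlier:
-- Executed on a West node: writes are accepted in West only.
INSERT INTO table_distributed_writes VALUES (4, today(), 'asymmetric');

-- Executed on any node: reads from the replicas of the local data center.
SELECT count() FROM table_distributed_reads;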
Cluster with local reads, without native geo replication
Topology features:
- Native ClickHouse replication between data centers is not supported. Replication between data centers is provided by distributed tables.
- The insert_quorum parameter is set to 2: when executing an INSERT query, a distributed table writes data to two replicas located in different data centers (see the example setting after this list).
- SELECT and INSERT queries can be routed to East or West, and it is guaranteed that all reads are local.
- High latency between West and East is possible.
- ZooKeeper is available in one data center only.
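insert_quorum is a user-level setting; a minimal sketch of enabling it in the default profile of users.xml is shown below. The file and profile names are assumptions for illustration, and the setting can also be passed per session or per query.
<profiles>
    <default>
        <!-- An INSERT must be confirmed by two replicas -->
        <insert_quorum>2</insert_quorum>
    </default>
</profiles>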
Configure logical clusters
In each server configuration (config.xml), define the following clusters:
- cluster_writes — includes hosts of the West and East data centers for writes (two shards with two replicas in each one);
- cluster_west_reads — includes hosts of the West data center for reads (two shards with one replica in each one);
- cluster_east_reads — includes hosts of the East data center for reads (two shards with one replica in each one).
The internal_replication parameter for shards of the cluster_writes cluster, to which data is written, is set to false so that the distributed table writes data to both replicas located in different data centers.
<remote_servers>
<cluster_writes>
<shard>
<internal_replication>false</internal_replication>
<replica>
<host>shard1.west</host>
<port>9000</port>
</replica>
<replica>
<host>sh1r1.east</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>false</internal_replication>
<replica>
<host>shard2.west</host>
<port>9000</port>
</replica>
<replica>
<host>sh2r1.east</host>
<port>9000</port>
</replica>
</shard>
</cluster_writes>
<cluster_west_reads>
<shard>
<replica>
<host>shard1.west</host>
<port>9000</port>
</replica>
</shard>
<shard>
<replica>
<host>shard2.west</host>
<port>9000</port>
</replica>
</shard>
</cluster_west_reads>
<cluster_east_reads>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard1r2.east</host>
<port>9000</port>
</replica>
</shard>
<shard>
<internal_replication>true</internal_replication>
<replica>
<host>shard2r2.east</host>
<port>9000</port>
</replica>
</shard>
</cluster_east_reads>
</remote_servers>
Create distributed tables
On all nodes of the cluster, create a replicated table (replicated_table) based on the ReplicatedMergeTree engine.
Create distributed tables for reads on each node of the West data center individually (ZooKeeper is required for ON CLUSTER to work):
CREATE TABLE table_distributed_reads AS replicated_table
ENGINE = Distributed(cluster_west_reads, default, replicated_table);
To create a distributed table for reads in East, it is enough to run the following query on any node:
CREATE TABLE table_distributed_reads ON CLUSTER cluster_east_reads AS replicated_table
ENGINE = Distributed(cluster_east_reads, default, replicated_table);
A distributed table for writes should be created individually on the shard1.west, shard2.west, sh1r1.east, and sh2r1.east hosts:
CREATE TABLE table_distributed_writes AS replicated_table
ENGINE = Distributed(cluster_writes, default, replicated_table);
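A short usage sketch, assuming the hypothetical schema from the first example: an INSERT into the write table is sent to a replica in West and a replica in East, so the data becomes available for local reads in both data centers.
-- Executed on any host that has table_distributed_writes:
-- the row is written to a replica in West and to a replica in East.
INSERT INTO table_distributed_writes VALUES (5, today(), 'no native geo replication');

-- Executed on any host that has table_distributed_reads: reads local replicas only.
SELECT count() FROM table_distributed_reads;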