ADB ClickHouse Connector usage example
This article describes how to transfer data from ADB to the distributed table in ADQM via ADB ClickHouse Connector. The following prerequisites are met:
-
The ADQM cluster is installed according to the Online installation guide on the following hosts: dev-adqm-01, dev-adqm-02, dev-adqm-03, dev-adqm-04.
-
The
default
database exists in ADQM. -
The
default
user exists in ADQM. -
The
default_cluster
cluster is defined in theremote_servers
section of the config.xml file in ADQM. For more information, see Typical cluster in the ADQM documentation. -
All ADB nodes have access to the
8123
HTTP port on all ADQM nodes through which you plan to load data.
Follow the steps:
-
Connect to the
default
database of the ADQM cluster under thedefault
user (for example, via clickhouse-client). Create the following tables in the given order.-
The
test_local
local table:CREATE TABLE test_local ON CLUSTER default_cluster ( id Int32, value_string String ) ENGINE = ReplicatedMergeTree('/clickhouse/test/shard-{shard}', 'replica-{replica}') ORDER BY id;
-
The
test_distr
distributed table:CREATE TABLE test_distr ON CLUSTER default_cluster AS default.test_local ENGINE = Distributed(default_cluster, default, test_local, rand());
-
-
Connect to the
adb
database of the ADB cluster under thegpadmin
user (for example, via psql). Create the following objects in the given order.-
The
clickhouse_server
server:CREATE SERVER clickhouse_server FOREIGN DATA WRAPPER tkh_fdw OPTIONS ( database 'default', hosts 'dev-adqm-01:8123, dev-adqm-02:8123, dev-adqm-03:8123, dev-adqm-04:8123', distribution_type 'random', lines_batch_size '100000', send_compressed 'true', send_delay '300', insert_distributed_sync 'true', use_staging 'true', staging_table_name_format '$_tmp_$');
-
The user mapping for the
gpadmin
user:CREATE USER MAPPING FOR gpadmin SERVER clickhouse_server OPTIONS (clickhouse_username 'default', clickhouse_password '');
-
The
tkhemali_test
foreign table:CREATE FOREIGN TABLE tkhemali_test (id INT, value_string TEXT) SERVER clickhouse_server OPTIONS (resource 'test_distr', hosts 'dev-adqm-01:8123', cluster 'default_cluster');
IMPORTANTSince the
cluster
option is defined, the work will be essentially produced with the hosts obtained from the ADQM cluster metadata. Due to this, remember to overwrite thehosts
option value defined forclickhouse_server
at the server level and leave only one host in the option value at the table level.
-
-
In ADB, insert some data into the
tkhemali_test
foreign table:INSERT INTO tkhemali_test VALUES (1, 'test1'), (2, 'test2'), (3, 'test3'), (4, 'test4'), (5, 'test5'), (6, 'test6'), (7, 'test7'), (8,'test8'), (9,'test9'), (10,'test10');
The result:
WARNING: skipping "themali_test" --- cannot analyze this foreign table INSERT 0 10
-
In ADQM, check that data from ADB has been successfully transferred:
SELECT * FROM test_distr;
The result:
┌─id─┬─value_string─┐ │ 2 │ test2 │ │ 5 │ test5 │ │ 8 │ test8 │ │ 10 │ test10 │ └────┴──────────────┘ ┌─id─┬─value_string─┐ │ 1 │ test1 │ │ 3 │ test3 │ │ 4 │ test4 │ │ 6 │ test6 │ │ 7 │ test7 │ │ 9 │ test9 │ └────┴──────────────┘ 10 rows in set. Elapsed: 0.005 sec.