ADB ClickHouse Connector usage examples
This article describes how to transfer data between ADB and a distributed table in ADQM via ADB ClickHouse Connector. The following prerequisites are met:
-
The ADQM cluster is installed according to the Online installation guide on the following hosts:
dev-adqm-01,dev-adqm-02,dev-adqm-03,dev-adqm-04. -
The
defaultdatabase exists in ADQM. -
The
defaultuser exists in ADQM. -
The
default_clustercluster is defined in theremote_serverssection of the config.xml file in ADQM. For more information, see Typical cluster in the ADQM documentation. -
All ADB nodes have access to the
8123HTTP port on all ADQM nodes through which you plan to load data.
Insert data
To check that you can add data to ADQM from ADB, follow the steps:
-
Connect to the
defaultdatabase of the ADQM cluster under thedefaultuser (for example, via clickhouse-client). Create the following tables in the given order.-
The
test_locallocal table:CREATE TABLE test_local ON CLUSTER default_cluster ( id Int32, value_string String ) ENGINE = ReplicatedMergeTree('/clickhouse/test/shard-{shard}', 'replica-{replica}') ORDER BY id; -
The
test_distrdistributed table:CREATE TABLE test_distr ON CLUSTER default_cluster AS default.test_local ENGINE = Distributed(default_cluster, default, test_local, rand());
-
-
Connect to the
adbdatabase of the ADB cluster under thegpadminuser (for example, viapsql). Create the following objects in the given order.-
The
clickhouse_serverserver:CREATE SERVER clickhouse_server FOREIGN DATA WRAPPER tkh_fdw OPTIONS ( database 'default', hosts 'dev-adqm-01:8123, dev-adqm-02:8123, dev-adqm-03:8123, dev-adqm-04:8123', distribution_type 'random', lines_batch_size '100000', send_compressed 'true', send_delay '300', insert_distributed_sync 'true', use_staging 'true', staging_table_name_format '$_tmp_$'); -
The user mapping for the
gpadminuser:CREATE USER MAPPING FOR gpadmin SERVER clickhouse_server OPTIONS (clickhouse_username 'default', clickhouse_password ''); -
The
tkhemali_testforeign table:CREATE FOREIGN TABLE tkhemali_test (id INT, value_string TEXT) SERVER clickhouse_server OPTIONS (resource 'test_distr', hosts 'dev-adqm-01:8123', cluster 'default_cluster');IMPORTANTSince the
clusteroption is defined, the work will be essentially produced with the hosts obtained from the ADQM cluster metadata. Due to this, remember to overwrite thehostsoption value defined forclickhouse_serverat the server level and leave only one host in the option value at the table level.
-
-
In ADB, insert some data into the
tkhemali_testforeign table:INSERT INTO tkhemali_test VALUES (1, 'test1'), (2, 'test2'), (3, 'test3'), (4, 'test4'), (5, 'test5'), (6, 'test6'), (7, 'test7'), (8,'test8'), (9,'test9'), (10,'test10');Result:
WARNING: skipping "themali_test" --- cannot analyze this foreign table INSERT 0 10
-
In ADQM, check that data from ADB has been successfully transferred:
SELECT * FROM test_distr;Result:
┌─id─┬─value_string─┐ │ 2 │ test2 │ │ 5 │ test5 │ │ 8 │ test8 │ │ 10 │ test10 │ └────┴──────────────┘ ┌─id─┬─value_string─┐ │ 1 │ test1 │ │ 3 │ test3 │ │ 4 │ test4 │ │ 6 │ test6 │ │ 7 │ test7 │ │ 9 │ test9 │ └────┴──────────────┘ 10 rows in set. Elapsed: 0.005 sec.
Select data
|
NOTE
ADB ClickHouse Connector supports the ability to select data from ADQM/ClickHouse to ADB 6 starting with 6.24.3.47. However, to use a foreign table in ADB for reading, you should specify for this table column names that match exactly the corresponding names in a source table in ADQM/ClickHouse. |
Correct example
To check that ADQM data can be read from ADB, do the following:
-
Perform steps 1—2 that are listed above (if you have not passed them yet).
-
Delete data from the
test_localtable in ADQM:ALTER TABLE default.test_local ON CLUSTER default_cluster DELETE WHERE true; -
In ADQM, add some data to the
test_distrtable:INSERT INTO test_distr VALUES (1, 'test1'), (2, 'test2'), (3, 'test3'), (4, 'test4'), (5, 'test5'), (6, 'test6'), (7, 'test7'), (8,'test8'), (9,'test9'), (10,'test10');Result:
Query id: 430a158f-1374-470d-a8c2-7b4dabdefbc7 Ok. 10 rows in set. Elapsed: 0.048 sec.
-
Check that data added to ADQM can be successfully selected in ADB:
SELECT * FROM tkhemali_test ORDER BY id;Result:
id | value_string ----+-------------- 1 | test1 2 | test2 3 | test3 4 | test4 5 | test5 6 | test6 7 | test7 8 | test8 9 | test9 10 | test10 (10 rows)
Error example
The following example shows that reading data from ADQM/ClickHouse to ADB is impossible if column names in the foreign ADB table and the source ADQM/ClickHouse table do not match:
-
Perform all steps from the previous example with correct data reading (if you have not passed them yet).
-
Ensure that reading data based on the current foreign table structure is successful in ADB:
SELECT * FROM tkhemali_test ORDER BY id;Result:
id | value_string ----+-------------- 1 | test1 2 | test2 3 | test3 4 | test4 5 | test5 6 | test6 7 | test7 8 | test8 9 | test9 10 | test10 (10 rows)
-
Rename some columns of the
tkhemali_testtable in ADB:ALTER FOREIGN TABLE tkhemali_test RENAME value_string TO wrong_string; -
Check again the ability to select data from
tkhemali_testin ADB:SELECT * FROM tkhemali_test ORDER BY id;The following error is displayed:
ERROR: CH easy datum. HTTP error: 404; CH message: Code: 47. DB::Exception: Missing columns: 'wrong_string' while processing query: 'SELECT id, wrong_string FROM default.test_local', required columns: 'id' 'wrong_string', maybe you meant: 'id'. (UNKNOWN_IDENTIFIER) (version 23.3.2.37) (ch_easy_datum.c:142) (seg0 slice1 10.92.17.182:10000 pid=6414) (ch_easy_datum.c:142)
|
NOTE
If names are the same only for part of columns in the foreign table in ADB and the source table in ADQM/ClickHouse — the SQL query succeeds while the resulting dataset uses only the specified columns. |