ADB ClickHouse Connector configuration
To send data from ADB to ClickHouse via ADB ClickHouse Connector, you should first create the following objects on the ADB cluster side:
-
Server — encapsulates connection information that a foreign data wrapper uses to access an external data source.
-
User mapping — provides authentication to an external data source by mapping users.
-
Foreign table — a table in ADB that defines the remote data structure. A foreign table has no storage in ADB, but can be used in queries just like a normal table.
To see how to use the listed objects to send data from ADB to ADQM, refer to ADB ClickHouse Connector usage examples.
IMPORTANT
|
Server
To create a server, use the CREATE SERVER
command. The basic command syntax is listed below:
CREATE SERVER <server_name> [ TYPE '<server_type>' ] [ VERSION '<server_version>' ]
FOREIGN DATA WRAPPER <fdw_name>
[ OPTIONS ( [ <option> '<value>' [, ... ]] ) ]
where:
-
<server_name>
— a server name in ADB. Should be unique within the current ADB database. -
<server_type>
— an optional server type. -
<server_version>
— an optional server version. -
<fdw_name>
— a foreign data wrapper name. You should use thetkh_fdw
foreign data wrapper, which is automatically created after the connector installation (see step 8 in ADB ClickHouse Connector installation). -
<option>
— server options that define the connection details. The options that are available for ADB ClickHouse Connector are listed in the Server options table. Note that these options can be defined both at the server level and at the foreign table level. Options marked as required should be specified at least in one definition (server or table). Table-level options take precedence. -
<value>
— option values.
NOTE
|
Name | Type | Description | Default | Required |
---|---|---|---|---|
database |
TEXT |
A name of the ClickHouse database where the target table is located. Should match the following regular expression: |
— |
Yes |
hosts |
TEXT |
A list of ClickHouse hosts where the target table is located. A ClickHouse cluster does not have a single entry point (Master node), so the requests can be sent to any host in the cluster. To reduce the load, data batches are sent to hosts from the connector according to the Validation rules for the |
— |
Yes |
lines_batch_size |
INT |
The maximum number of rows in the data batch sent to ClickHouse by the connector. Positive integers are allowed |
100000 |
No |
bytes_batch_size_mb |
INT |
The maximum batch size of data sent to ClickHouse by the connector (in MB). If the Positive integers are allowed |
— |
No |
send_compressed |
BOOL |
The flag that indicates whether to use compression on the connector side before sending data to ClickHouse. Possible values:
|
false |
No |
send_delay |
INT |
A delay between connector requests to ClickHouse (in milliseconds). Positive integers are allowed |
300 |
No |
insert_distributed_sync |
BOOL |
Enables or disables synchronous data load to distributed tables. Possible values:
|
true |
No |
distribution_type |
TEXT |
Defines the strategy of load distribution between ClickHouse hosts (see Possible values:
|
random |
No |
use_staging |
BOOL |
The flag that indicates whether to create staging tables in ClickHouse before inserting data into the target table. Used to emulate transactions that are not present in ClickHouse. Staging tables are supported only for the following ClickHouse engine families: MergeTree (including Replicated* engines) and Distributed. For other engines, an error will be returned, since not all engine types support partitioning in ClickHouse. Possible values:
|
— |
Yes |
staging_table_name_format |
TEXT |
Defines the staging table name format (see |
— |
Yes if |
clickhouse_properties |
TEXT |
A list of ClickHouse custom options in the |
— |
No |
fdw_startup_cost |
DOUBLE |
The cost estimate for obtaining the first tuple from the data source. Positive double values are allowed. Can be used only in |
1000.0 |
No |
fdw_tuple_cost |
DOUBLE |
The cost estimate for obtaining one tuple from the data source. Positive double values are allowed. Can be used only in |
0.01 |
No |
fdw_tuples_count |
DOUBLE |
The estimate for the number of all rows in the data source. Does not take into account the query selectivity, Positive double values >= Can be used only in |
100000.0 |
No |
User mapping
To create a user mapping, use the CREATE USER MAPPING
command. The basic command syntax is listed below:
CREATE USER MAPPING FOR { <username> | USER | CURRENT_USER | PUBLIC }
SERVER <server_name>
[ OPTIONS ( <option> '<value>' [, ... ] ) ]
where:
-
<username>
— a user name in ADB. The following constants are allowed:-
USER
orCURRENT_USER
— the current user who established the ADB connection. -
PUBLIC
— all present and future ADB users.
-
-
<server_name>
— a server name. -
<option>
— user mapping options, which typically include a user name and a password to connect to an external data source. The options that are available for ADB ClickHouse Connector are listed in the User mapping options table. -
<value>
— option values.
NOTE
|
Name | Type | Description | Default | Required |
---|---|---|---|---|
clickhouse_username |
TEXT |
A user name in ClickHouse |
— |
Yes |
clickhouse_password |
TEXT |
A user password in ClickHouse. If there is no password (for example, in the test environment), an empty string is specified |
— |
Yes |
Foreign table
To create a foreign table, use the CREATE FOREIGN TABLE
command. The basic command syntax is listed below:
CREATE FOREIGN TABLE [ IF NOT EXISTS ] <table_name> ( [
<column_name> <data_type> [ COLLATE <collation> ] [ <column_constraint> [ ... ] ]
[, ... ]
] )
SERVER <server_name>
[ OPTIONS ( <option> '<value>' [, ... ] ) ]
where:
-
<table_name>
— a foreign table name in ADB. -
<column_name>
— a column name. -
<data_type>
— a column data type. -
<collation>
— a column collation. -
<column_constraint>
— a constraint defined at the column level. The name<constraint_name>
is given optionally. The syntax:[ CONSTRAINT <constraint_name> ] { NOT NULL | NULL | DEFAULT <default_expr> }
Possible constraints:
-
NOT NULL
— specifies that the column is not allowed to contain null values. -
NULL
— specifies that the column is allowed to contain null values. It is the default behavior (unlessNOT NULL
is specified). -
DEFAULT
— sets the default column value equal to<default_expr>
.
-
-
<server_name>
— a server name. -
<option>
— foreign table options. For ADB ClickHouse Connector, all options defined at the server level can be overwritten at the foreign table level (partially or fully). When you specify an option for a server and a foreign table at the same time, the table level takes precedence. Additionally, at the foreign table level, two more options are available (see Foreign table options below). -
<value>
— option values.
NOTE
|
Name | Type | Description | Default | Required |
---|---|---|---|---|
resource |
TEXT |
A table name in ClickHouse. If you need your own strategy for distribution data across hosts (sharding), you should insert data into the distributed ClickHouse table with the sharding key configured. To do this, specify the distributed table name in the The option value should match the following regular expression: |
— |
Yes |
cluster |
TEXT |
A cluster name in ClickHouse. For more information, see Typical cluster in the ADQM documentation. If you specify the The The maximum |
— |
No |