Named collections
Named collections in ADQM/ClickHouse are predefined sets of parameters (specified as key/value pairs) that you can reuse in SQL queries when connecting to external data sources (such as PostgreSQL, Kafka, S3, MySQL, and others) via table functions, dictionaries, integration table and database engines.
This functionality allows you to:
-
simplify SQL queries and make them easier to understand and manage by avoiding re-definition of the same parameters — you can specify values of parameters once within a named collection and then reference that collection in different queries;
-
manage parameter sets centrally — a configuration for integration with an external system is stored in a single location where you can, for example, analyze connection errors if they occur and update the necessary parameters;
-
protect sensitive credentials (for example, passwords) for integration with external systems by hiding them from users without administrator access rights.
NOTE
This article describes how to create and apply named collections using the example of integrating ADQM and ADPG. To run queries provided in examples below, first prepare ADPG as described in the Integration between ADQM and ADPG tutorial: create a PostgreSQL user for connecting to ADPG from ADQM and two PostgreSQL databases with test tables. |
Manage named collections
To manage named collections (create/modify/delete), you can use:
-
DDL queries. To be allowed to configure named collections with DDL, a user should have the
named_collection_control
privilege. To grant this permission, set thenamed_collection_control
parameter to1
in user settings within the /etc/clickhouse-server/users.xml file (or another configuration file located in the /etc/clickhouse-server/users.d/ directory).Example
Assign thenamed_collection_control
privilege to thedefault
user in the users.xml file:<users> <default> ... <named_collection_control>1</named_collection_control> ... </default> </users>
-
XML files. Named collections can also be defined in the /etc/clickhouse-server/config.xml file or other XML files located in the /etc/clickhouse-server/config.d/ directory. A named collection created or modified in a configuration file can be applied after the ClickHouse server restart.
You can view a list of existing named collections (both created via DDL queries and defined in XML files) in the named_collections
system table.
Override parameters
Parameters set in a named collection can be overridden in an SQL query that uses this collection. For each parameter in a named collection, you can restrict the ability to change its value using the special flag:
-
OVERRIDABLE
(oroverridable="true"
in XML configuration) — parameter value can be overridden; -
NOT OVERRIDABLE
(oroverridable="false"
in XML configuration) — parameter value cannot be overridden; -
no flag is set — the ability to override the parameter value is defined by the allow_named_collection_override_by_default setting.
CAUTION
If you use named collections to hide credentials for connecting to external systems from users without administrative access, it is recommended to limit the ability to override the corresponding parameters.
|
Create named collections
DDL queries
You can create a named collection using the CREATE NAMED COLLECTION
query:
CREATE NAMED COLLECTION [IF NOT EXISTS] <collection_name> [ON CLUSTER <cluster_name>] AS
key1 = 'value1' [[NOT] OVERRIDABLE],
key2 = 'value2' [[NOT] OVERRIDABLE],
...;
The keys in a named collection should match the parameter names of the corresponding function, dictionary data source, table or database engine. You can see the exact names of the parameters in the ClickHouse documentation — for example, in the articles postgresql, mysql, s3, remote for the corresponding table functions or PostgreSQL, MySQL, Kafka for table engines.
When referencing a named collection that contains an invalid key, clickhouse-client displays an error message with a list of valid keys (required and optional). For example, if the user_password
key is specified in a named collection and the postgresql
function is called with a link to this collection, clickhouse-client
reports that the function does not support this key:
DB::Exception: Unexpected key user_password in named collection. Required keys: database, db, password, table, user, username, optional keys: addresses_expr, host, hostname, on_conflict, port, schema, use_table_cache. (BAD_ARGUMENTS)
Example
The following query creates the my_adpg
named collection with parameters for connecting ADQM to ADPG:
CREATE NAMED COLLECTION my_adpg AS
host = '<adpg_host>',
port = 5432,
database = 'adpg_db1' NOT OVERRIDABLE,
user = 'adqm_user',
password = '<adqm_user_secret_password>';
Values of the parameters in this example (the adpg_db1
database and adqm_user
PostgreSQL user) are specified according to test data from the Integration between ADQM and ADPG article — see the Prepare ADPG for test examples section.
XML files
To define named collections in the XML configuration of the ClickHouse server, use the named_collections
section:
<clickhouse>
<named_collections>
<named_collection1>
<key1 overridable="true">value1</key1>
<key2 overridable="false">value2</key2>
<key3>value3</key3>
...
</named_collection1>
<named_collection2>
...
</named_collection2>
...
</named_collections>
</clickhouse>
Example
The following configuration defines the my_adpg_xml
named collection with parameters for connection to the adpg_db2
ADPG database (also see Integration between ADQM and ADPG for details on this database creation):
<clickhouse>
<named_collections>
<my_adpg_xml>
<host>adpg_host</host>
<port>5432</port>
<database>adpg_db2</database>
<user>adqm_user</user>
<password>adqm_user_secret_password</password>
</my_adpg_xml>
</named_collections>
</clickhouse>
Use named connections
After you have created a named collection, you can use it in SQL queries to reference the predefined parameters when connecting to an external data source through a table function, integration table engine, database engine, or dictionary.
The following examples demonstrate different ways how you can access PostgreSQL data from ADQM using the named collections with ADPG connection parameters created above.
-
The postgresql table function
Select data from the
adpg_db1.table1
ADPG table using the postgresql table function (a table has not been defined in the named collection, so it should be specified separately in the table function parameters):SELECT id, name FROM postgresql(my_adpg, table = 'table1');
Write data to the ADPG table:
INSERT INTO TABLE FUNCTION postgresql(my_adpg, table = 'table1') (id, name) VALUES (3, 'three');
-
The PostgreSQL table engine
Create an ADQM table based on the PostgreSQL table engine:
CREATE TABLE psql_table (id Int32, name String) ENGINE = PostgreSQL(my_adpg, table = 'table1');
The engine copies data from the named collection when creating a table. Subsequent changes to the named collection do not affect existing tables.
-
The PostgreSQL database engine
Create a database in ADQM based on the PostgreSQL database engine (this database automatically gets tables from the
adpg_db1
PostgreSQL database — in this example, there is only onetable1
):CREATE DATABASE psql_database ENGINE = PostgreSQL(my_adpg);
-
The MaterializedPostgreSQL table engine
Create a MaterializedPostgreSQL table to replicate data of the
adpg_db2.table1
ADPG table using the my_adpg_xml named collection that lists connection parameters:SET allow_experimental_materialized_postgresql_table=1;
CREATE TABLE psql_table_replica (id Int32, name String) ENGINE = MaterializedPostgreSQL(my_adpg_xml, table = 'table1');
-
The MaterializedPostgreSQL database engine
Create a MaterializedPostgreSQL database to which the
table1
andtable2
tables from theadpg_db2
ADPG database are replicated, using themy_adpg_xml
named collection to pass connection parameters for this replication:SET allow_experimental_database_materialized_postgresql=1;
CREATE DATABASE psql_db_replica ENGINE = MaterializedPostgreSQL(my_adpg_xml) SETTINGS materialized_postgresql_tables_list = 'table1,table2';
-
The dictionary data source of the POSTGRESQL type
Create a dictionary in ADQM that reads data from the
adpg_db2.roles_dict
ADPG table (use thename
parameter in the dictionary data source configuration to specify the named collection with settings for connecting to ADPG):CREATE DICTIONARY roles_dict_psql (id Int32, role String DEFAULT 'no role assigned') PRIMARY KEY id SOURCE(POSTGRESQL(name my_adpg_xml table roles_dict)) LIFETIME(MIN 10 MAX 20) LAYOUT(FLAT());
Modify and delete named collections
Named collections created through DDL queries can also be altered or dropped with DDL. To modify or delete named collections defined in XML files, edit or remove their corresponding XML descriptions.
Modify DDL named collections
You can change an existing named collection with the ALTER NAMED COLLECTION
query, using the SET
keyword to add/change parameters and the DELETE
keyword to delete existing ones. The basic syntax of a query is:
ALTER NAMED COLLECTION [IF EXISTS] <collection_name> [ON CLUSTER <cluster_name>]
[SET key1 = 'value1' [[NOT] OVERRIDABLE], key2 = 'value2' [[NOT] OVERRIDABLE], ... ] |
[DELETE key3, key4, ... ];
Below are examples of using the ALTER NAMED COLLECTION
query to modify the my_adpg named collection.
-
Add/change parameters
Change the
database
parameter value and add thetable
parameter (this does not change theNOT OVERRIDABLE
flag initially set for thedatabase
parameter, i.e. the database still cannot be overridden in an SQL query):ALTER NAMED COLLECTION my_adpg SET database = 'adpg_db2', table = 'table1';
Change the
database
parameter value and allow overriding it in SQL:ALTER NAMED COLLECTION my_adpg SET database = 'adpg_db1' OVERRIDABLE;
-
Remove parameters
Remove the
table
parameter from themy_adpg
collection:ALTER NAMED COLLECTION my_adpg DELETE table;
If you need to reconfigure a parameter so that it uses the default setting (
allow_named_collection_override_by_default
) for the flag that allows/restricts overriding the parameter value, remove this parameter from the named collection and re-add it with no flag:ALTER NAMED COLLECTION my_adpg DELETE database;
ALTER NAMED COLLECTION my_adpg SET database = 'adpg_db1';
Change the
password
parameter value and remove thedatabase
parameter with a single query:ALTER NAMED COLLECTION my_adpg SET password = '<new_password>' DELETE database;