Named collections

Named collections in ADQM/ClickHouse are predefined sets of parameters (specified as key/value pairs) that you can reuse in SQL queries when connecting to external data sources (such as PostgreSQL, Kafka, S3, MySQL, and others) via table functions, dictionaries, integration table and database engines.

This functionality allows you to:

  • simplify SQL queries and make them easier to understand and manage by avoiding re-definition of the same parameters — you can specify values of parameters once within a named collection and then reference that collection in different queries;

  • manage parameter sets centrally — a configuration for integration with an external system is stored in a single location where you can, for example, analyze connection errors if they occur and update the necessary parameters;

  • protect sensitive credentials (for example, passwords) for integration with external systems by hiding them from users without administrator access rights.

NOTE

This article describes how to create and apply named collections using the example of integrating ADQM and ADPG. To run queries provided in examples below, first prepare ADPG as described in the Integration between ADQM and ADPG tutorial: create a PostgreSQL user for connecting to ADPG from ADQM and two PostgreSQL databases with test tables.

Manage named collections

To manage named collections (create/modify/delete), you can use:

  • DDL queries. To be allowed to configure named collections with DDL, a user should have the named_collection_control privilege. To grant this permission, set the named_collection_control parameter to 1 in user settings within the /etc/clickhouse-server/users.xml file (or another configuration file located in the /etc/clickhouse-server/users.d/ directory).

    Example

     
    Assign the named_collection_control privilege to the default user in the users.xml file:

    <users>
        <default>
            ...
            <named_collection_control>1</named_collection_control>
            ...
        </default>
    </users>
  • XML files. Named collections can also be defined in the /etc/clickhouse-server/config.xml file or other XML files located in the /etc/clickhouse-server/config.d/ directory. A named collection created or modified in a configuration file can be applied after the ClickHouse server restart.

You can view a list of existing named collections (both created via DDL queries and defined in XML files) in the named_collections system table.

Override parameters

Parameters set in a named collection can be overridden in an SQL query that uses this collection. For each parameter in a named collection, you can restrict the ability to change its value using the special flag:

  • OVERRIDABLE (or overridable="true" in XML configuration) — parameter value can be overridden;

  • NOT OVERRIDABLE (or overridable="false" in XML configuration) — parameter value cannot be overridden;

  • no flag is set — the ability to override the parameter value is defined by the allow_named_collection_override_by_default setting.

CAUTION
If you use named collections to hide credentials for connecting to external systems from users without administrative access, it is recommended to limit the ability to override the corresponding parameters.

Create named collections

DDL queries

You can create a named collection using the CREATE NAMED COLLECTION query:

CREATE NAMED COLLECTION [IF NOT EXISTS] <collection_name> [ON CLUSTER <cluster_name>] AS
key1 = 'value1' [[NOT] OVERRIDABLE],
key2 = 'value2' [[NOT] OVERRIDABLE],
...;

The keys in a named collection should match the parameter names of the corresponding function, dictionary data source, table or database engine. You can see the exact names of the parameters in the ClickHouse documentation — for example, in the articles postgresql, mysql, s3, remote for the corresponding table functions or PostgreSQL, MySQL, Kafka for table engines.

When referencing a named collection that contains an invalid key, clickhouse-client displays an error message with a list of valid keys (required and optional). For example, if the user_password key is specified in a named collection and the postgresql function is called with a link to this collection, clickhouse-client reports that the function does not support this key:

DB::Exception: Unexpected key user_password in named collection. Required keys: database, db, password, table, user, username,
optional keys: addresses_expr, host, hostname, on_conflict, port, schema, use_table_cache. (BAD_ARGUMENTS)

Example

The following query creates the my_adpg named collection with parameters for connecting ADQM to ADPG:

CREATE NAMED COLLECTION my_adpg AS
host = '<adpg_host>',
port = 5432,
database = 'adpg_db1' NOT OVERRIDABLE,
user = 'adqm_user',
password = '<adqm_user_secret_password>';

Values of the parameters in this example ​(the adpg_db1 database and adqm_user PostgreSQL user) are specified according to test data from the Integration between ADQM and ADPG article — see the Prepare ADPG for test examples section.

XML files

To define named collections in the XML configuration of the ClickHouse server, use the named_collections section:

<clickhouse>
    <named_collections>
        <named_collection1>
            <key1 overridable="true">value1</key1>
            <key2 overridable="false">value2</key2>
            <key3>value3</key3>
            ...
        </named_collection1>
        <named_collection2>
            ...
        </named_collection2>
        ...
    </named_collections>
</clickhouse>

Example

The following configuration defines the my_adpg_xml named collection with parameters for connection to the adpg_db2 ADPG database (also see Integration between ADQM and ADPG for details on this database creation):

<clickhouse>
    <named_collections>
        <my_adpg_xml>
            <host>adpg_host</host>
            <port>5432</port>
            <database>adpg_db2</database>
            <user>adqm_user</user>
            <password>adqm_user_secret_password</password>
        </my_adpg_xml>
    </named_collections>
</clickhouse>

Use named connections

After you have created a named collection, you can use it in SQL queries to reference the predefined parameters when connecting to an external data source through a table function, integration table engine, database engine, or dictionary.

The following examples demonstrate different ways how you can access PostgreSQL data from ADQM using the named collections with ADPG connection parameters created above.

  • The postgresql table function

    Select data from the adpg_db1.table1 ADPG table using the postgresql table function (a table has not been defined in the named collection, so it should be specified separately in the table function parameters):

    SELECT id, name FROM postgresql(my_adpg, table = 'table1');

    Write data to the ADPG table:

    INSERT INTO TABLE FUNCTION postgresql(my_adpg, table = 'table1') (id, name) VALUES (3, 'three');
  • The PostgreSQL table engine

    Create an ADQM table based on the PostgreSQL table engine:

    CREATE TABLE psql_table (id Int32, name String) ENGINE = PostgreSQL(my_adpg, table = 'table1');

    The engine copies data from the named collection when creating a table. Subsequent changes to the named collection do not affect existing tables.

  • The PostgreSQL database engine

    Create a database in ADQM based on the PostgreSQL database engine (this database automatically gets tables from the adpg_db1 PostgreSQL database — in this example, there is only one table1):

    CREATE DATABASE psql_database ENGINE = PostgreSQL(my_adpg);
  • The MaterializedPostgreSQL table engine

    Create a MaterializedPostgreSQL table to replicate data of the adpg_db2.table1 ADPG table using the my_adpg_xml named collection that lists connection parameters:

    SET allow_experimental_materialized_postgresql_table=1;
    CREATE TABLE psql_table_replica (id Int32, name String) ENGINE = MaterializedPostgreSQL(my_adpg_xml, table = 'table1');
  • The MaterializedPostgreSQL database engine

    Create a MaterializedPostgreSQL database to which the table1 and table2 tables from the adpg_db2 ADPG database are replicated, using the my_adpg_xml named collection to pass connection parameters for this replication:

    SET allow_experimental_database_materialized_postgresql=1;
    CREATE DATABASE psql_db_replica
    ENGINE = MaterializedPostgreSQL(my_adpg_xml)
    SETTINGS materialized_postgresql_tables_list = 'table1,table2';
  • The dictionary data source of the POSTGRESQL type

    Create a dictionary in ADQM that reads data from the adpg_db2.roles_dict ADPG table (use the name parameter in the dictionary data source configuration to specify the named collection with settings for connecting to ADPG):

    CREATE DICTIONARY roles_dict_psql (id Int32, role String DEFAULT 'no role assigned')
    PRIMARY KEY id
    SOURCE(POSTGRESQL(name my_adpg_xml table roles_dict))
    LIFETIME(MIN 10 MAX 20)
    LAYOUT(FLAT());

Modify and delete named collections

Named collections created through DDL queries can also be altered or dropped with DDL. To modify or delete named collections defined in XML files, edit or remove their corresponding XML descriptions.

Modify DDL named collections

You can change an existing named collection with the ALTER NAMED COLLECTION query, using the SET keyword to add/change parameters and the DELETE keyword to delete existing ones. The basic syntax of a query is:

ALTER NAMED COLLECTION [IF EXISTS] <collection_name> [ON CLUSTER <cluster_name>]
[SET key1 = 'value1' [[NOT] OVERRIDABLE], key2 = 'value2' [[NOT] OVERRIDABLE], ... ] |
[DELETE key3, key4, ... ];

Below are examples of using the ALTER NAMED COLLECTION query to modify the my_adpg named collection.

  • Add/change parameters

    Change the database parameter value and add the table parameter (this does not change the NOT OVERRIDABLE flag initially set for the database parameter, i.e. the database still cannot be overridden in an SQL query):

    ALTER NAMED COLLECTION my_adpg SET database = 'adpg_db2', table = 'table1';

    Change the database parameter value and allow overriding it in SQL:

    ALTER NAMED COLLECTION my_adpg SET database = 'adpg_db1' OVERRIDABLE;
  • Remove parameters

    Remove the table parameter from the my_adpg collection:

    ALTER NAMED COLLECTION my_adpg DELETE table;

    If you need to reconfigure a parameter so that it uses the default setting (allow_named_collection_override_by_default) for the flag that allows/restricts overriding the parameter value, remove this parameter from the named collection and re-add it with no flag:

    ALTER NAMED COLLECTION my_adpg DELETE database;
    ALTER NAMED COLLECTION my_adpg SET database = 'adpg_db1';

    Change the password parameter value and remove the database parameter with a single query:

    ALTER NAMED COLLECTION my_adpg SET password = '<new_password>' DELETE database;

Drop DDL named collections

To remove a named collection created via a DDL query, use the DROP NAMED COLLECTION query:

DROP NAMED COLLECTION <collection_name>;
Found a mistake? Seleсt text and press Ctrl+Enter to report it