Конференция Arenadata
Новое время — новый Greenplum
Мы приглашаем вас принять участие в конференции, посвященной будущему Open-Source Greenplum 19 сентября в 18:00:00 UTC +3. Встреча будет проходить в гибридном формате — и офлайн, и онлайн. Онлайн-трансляция будет доступна для всех желающих.
Внезапное закрытие Greenplum его владельцем — компанией Broadcom - стало неприятным сюрпризом для всех, кто использует или планирует начать использовать решения на базе этой технологии. Многие ожидают выхода стабильной версии Greenplum 7 и надеются на её дальнейшее активное развитие.
Arenadata не могла допустить, чтобы разрабатываемый годами Open-Source проект Greenplum прекратил своё существование, поэтому 19 сентября мы представим наш ответ на данное решение Broadcom, а участники сообщества получат исчерпывающие разъяснения на все вопросы о дальнейшей судьбе этой технологии.

На конференции вас ждёт обсуждение следующих тем:

  • План возрождения Greenplum;
  • Дорожная карта;
  • Экспертное обсуждение и консультации.
Осталось до события

ADB ClickHouse Connector overview

Unlike the previous version Tkhemali Connector 1.X based on the mechanism of external tables, ADB ClickHouse Connector is implemented on the basis of a foreign data wrapper and foreign tables. Consequently, the new version of the ADQM/ClickHouse connector has the following features:

  • More convenient and secure user management. While in the previous connector version you had to write user credentials in the LOCATION option at the level of each external table or in a separate file explicitly, in ADB ClickHouse Connector the user login and password are stored in a separate ADB database object user mapping. It is enough to create a user mapping object once for one ClickHouse cluster with which you plan to connect. If necessary, the data can be modified later via the ALTER USER MAPPING command.

  • Multi-level connection settings. General settings that will be the same for all or most ClickHouse target tables (load distribution strategy, need to use a staging layer, restrictions for batch data, and so on) can be specified at the server object level, and private settings (e.g. specific table name) — at the foreign table level. In the previous version of the connector, all settings had to be specified at the external table level.

  • New load balancing mechanism. In the new connector version, the list of ClickHouse hosts (through which you plan to load data from ADB) is specified in the hosts option. The separate option distribution_type is used to declare the load distribution strategy. Two types of distribution are supported now: random and round-robin. The second method is preferable in terms of uniformity of load. At the same time, when writing to distributed tables, it is possible to specify the ClickHouse cluster name in the cluster option at the foreign table level. In this case, it is not necessary to define explicitly all hosts: one entry point is sufficient, and the full list of hosts will be automatically obtained from the cluster metadata. In the previous version of the connector, it was also possible to get a list of hosts for distributed tables, but the load on ClickHouse cluster hosts could be uneven.

  • Flexible management of custom settings. Custom settings of connection and execution of queries to ClickHouse can now be defined in the clickhouse_properties option in the following format: <custom_option>=<value>;[…​]. To change the settings, it is enough to execute the ALTER SERVER or the ALTER FOREIGN TABLE command. In the previous version of the connector, the custom settings for ClickHouse were stored in the XML file that needed to be copied to all ADB segment hosts.

  • More control over the resource release. A foreign data wrapper supports "out of box" transactions and allows subscribing to transaction control hooks. Due to this, the new connector version provides more control over the progress of requests and resource release, which is especially relevant for transactional data loading via staging tables. In the previous version of the connector, a staging layer might not have been removed under certain circumstances and it was necessary to remove it manually.

  • Simple query syntax. There is no need to use the txn function to enable a transaction mode (staging layer usage) anymore. In the new connector version for data transmission from ADB to ClickHouse, it is enough to execute the ordinary INSERT query against the corresponding foreign table. Staging tables are now managed by a separate flag use_staging. In addition, in the previous connector version, it was necessary to specify full names of staging tables when creating external tables. In ADB ClickHouse Connector, only the target table name in ClickHouse is required to be filled in the resource option at the foreign table level. Names for staging tables are automatically generated based on the template that is specified in the staging_table_name_format option.

IMPORTANT
  • ADB ClickHouse Connector is available in the Enterprise version of ADB 6 (starting with 6.23.3.44).

  • In ADB 6 starting with 6.24.3.47, ADB ClickHouse Connector supports both writing and reading data from ADQM/ClickHouse. However, to use a foreign table in ADB for reading, you should specify for this table column names that match exactly the corresponding names in a source table in ADQM/ClickHouse.

The high-level architecture view of the new connector version is shown below.

ADB ClickHouse Connector: INSERT execution schema
INSERT execution schema
ADB ClickHouse Connector: INSERT execution schema
INSERT execution schema
ADB ClickHouse Connector: SELECT execution schema
SELECT execution schema
ADB ClickHouse Connector: SELECT execution schema
SELECT execution schema
Found a mistake? Seleсt text and press Ctrl+Enter to report it