Конференция Arenadata
Новое время — новый Greenplum
Мы приглашаем вас принять участие в конференции, посвященной будущему Open-Source Greenplum 19 сентября в 18:00:00 UTC +3. Встреча будет проходить в гибридном формате — и офлайн, и онлайн. Онлайн-трансляция будет доступна для всех желающих.
Внезапное закрытие Greenplum его владельцем — компанией Broadcom - стало неприятным сюрпризом для всех, кто использует или планирует начать использовать решения на базе этой технологии. Многие ожидают выхода стабильной версии Greenplum 7 и надеются на её дальнейшее активное развитие.
Arenadata не могла допустить, чтобы разрабатываемый годами Open-Source проект Greenplum прекратил своё существование, поэтому 19 сентября мы представим наш ответ на данное решение Broadcom, а участники сообщества получат исчерпывающие разъяснения на все вопросы о дальнейшей судьбе этой технологии.

На конференции вас ждёт обсуждение следующих тем:

  • План возрождения Greenplum;
  • Дорожная карта;
  • Экспертное обсуждение и консультации.
Осталось до события

ADB to ADB Connector configuration

To send data from the local (source) ADB cluster to the remote (target) ADB cluster via ADB to ADB Connector, you should first create the following objects on the local cluster side:

  1. Server — encapsulates connection information that a foreign data wrapper uses to access an external data source.

  2. User mapping — provides authentication to an external data source by mapping users.

  3. Foreign table — a table in ADB that defines the remote data structure. A foreign table has no storage in ADB, but can be used in queries just like a normal table.

To see how to use the listed objects to send data, refer to ADB to ADB Connector usage examples.

IMPORTANT
  • For each target ADB cluster to which you plan to send queries from the source ADB cluster, it is sufficient to create one server and one user mapping. However, there may be several foreign tables if you need to send queries against various ADB tables.

  • All options that are listed in the tables below (Server options, User mapping options, Foreign table options) should be defined in the text format (i.e. in single quotes '). The Type column in the tables shows what data type will be used when validating input values.

Server

To create a server, use the CREATE SERVER command. The basic command syntax is listed below:

CREATE SERVER <server_name> [ TYPE '<server_type>' ] [ VERSION '<server_version>' ]
    FOREIGN DATA WRAPPER <fdw_name>
    [ OPTIONS ( [ <option> '<value>' [, ... ]] ) ]

where:

  • <server_name> — a server name in ADB. Should be unique within the current ADB database.

  • <server_type> — an optional server type.

  • <server_version> — an optional server version.

  • <fdw_name> — a foreign data wrapper name. You should use the adb_fdw foreign data wrapper, which is automatically created after the connector installation (see step 4 in ADB to ADB Connector installation).

  • <option> — server options that define the connection details. The options that are available for ADB to ADB Connector are listed in the Server options table. Note that these options (except host, port, num_segments) can be defined both at the server level and at the foreign table level. Options marked as required should be specified at least in one definition (server or table). Table-level options take precedence.

  • <value> — option values.

NOTE
  • To create a server, you need the USAGE privilege on the adb_fdw foreign data wrapper. The user who creates the server becomes its owner.

  • The full description of the CREATE SERVER command syntax is available in the Greenplum documentation.

  • To edit a server definition, use the ALTER SERVER command. To delete a server, use DROP SERVER.

Server options
Name Type Description Default Required

host

TEXT

An IP address or name of the remote ADB master host.

Validation rules for the host option are listed below

 — 

Yes

port

TEXT

A port number that is used to connect to the remote ADB master host.

Validation rules for the port option are listed below

 — 

Yes

dbname

TEXT

A name of the target database on the remote ADB cluster side

 — 

Yes

mpp_execute

TEXT

The distribution strategy for query execution. The only possible value is master.

The mpp_execute option cannot be changed. The execution strategy (master, all segments, or any) is selected automatically by the ADB to ADB Connector, depending on the type of operation requested and the result of the query planning

master

No

use_remote_estimate

BOOL

Indicates whether to use an external cluster for planning a query. Based on the flag, query costs are estimated and the number of executors on segments is calculated.

If use_remote_estimate = true, a parallel cursor is created and the number of executors is determined according to the number of parallel cursor endpoints. Otherwise, the number of executors is taken from num_segments, and if this parameter is not filled — the current number of segments is used. For more information, see ADB to ADB Connector overview.

Possible values: true, false

false

No

num_segments

INT

The number of executors on segments that are used to process data received from the parallel cursor endpoints.

If the number of local cluster segments is smaller than the number of endpoints, some local cluster executors will get more endpoints (which will cause processing skew).

Positive integers >= 1 are allowed

 — 

No

fdw_startup_cost

DOUBLE

The cost estimate for obtaining the first tuple from the data source.

Positive double values are allowed

100.0

No

fdw_tuple_cost

DOUBLE

The cost estimate for obtaining one tuple from the data source.

Positive double values are allowed

0.01

No

fetch_size

INT

The number of rows in the data batch read by the connector.

Positive integers >= 1 are allowed

100

No

updatable

BOOL

Indicates the ability to change tables on the remote ADB cluster side via the INSERT command.

Possible values: true, false

true

No

Validation rules for the host and port options
  • The maximum length of the host value is 255 characters.

  • You can use in the host value digits 0-9, ASCII characters a-zA-Z, hyphens -, underscores _, and dots ..

  • The maximum length of the port value is 5 characters.

  • Only digits 0-9 are allowed in the port value.

User mapping

To create a user mapping, use the CREATE USER MAPPING command. The basic command syntax is listed below:

CREATE USER MAPPING FOR { <username> | USER | CURRENT_USER | PUBLIC }
    SERVER <server_name>
    [ OPTIONS ( <option> '<value>' [, ... ] ) ]

where:

  • <username> — a user name in ADB. The following constants are allowed:

    • USER or CURRENT_USER — the current user who established the ADB connection.

    • PUBLIC — all present and future ADB users.

  • <server_name> — a server name.

  • <option> — user mapping options, which typically include a user name and a password to connect to an external data source. The options that are available for ADB to ADB Connector are listed in the User mapping options table.

  • <value> — option values.

NOTE
User mapping options
Name Type Description Default Required

user

TEXT

A user name in the remote ADB database

 — 

Yes

password

TEXT

A user password in the remote ADB database

 — 

Yes

Provide access via pg_hba.conf

 

In addition to a user mapping creation, you should provide an access of the selected user to the remote ADB database from the local cluster. Add the following record to the pg_hba.conf file on the remote ADB cluster side:

host    <database_name>  <remote_user>       <local_master_address>      password

where:

  • <database_name> — a name of the target database in the remote ADB cluster. The all value means all databases.

  • <remote_user> — a user name in the remote ADB database (see the user option above).

  • <local_master_address> — an IP address of the master host in the local ADB cluster (from which you plan to send queries). Use a subnet number. For example, 10.92.17.84/32.

You can modify the pg_hba.conf file via the ADCM web interface. To do this, fill in the Custom pg_hba section parameter on the Configuration tab of the ADB service in the remote ADB cluster. To apply changes, click Save and run the service action Reconfigure.

Foreign table

To create a foreign table, use the CREATE FOREIGN TABLE command. The basic command syntax is listed below:

CREATE FOREIGN TABLE [ IF NOT EXISTS ] <table_name> ( [
    <column_name> <data_type> [ COLLATE <collation> ] [ <column_constraint> [ ... ] ]
      [, ... ]
] )
    SERVER <server_name>
  [ OPTIONS ( <option> '<value>' [, ... ] ) ]

where:

  • <table_name> — a foreign table name in ADB.

  • <column_name> — a column name.

  • <data_type> — a column data type.

  • <collation> — a column collation.

  • <column_constraint> — a constraint defined at the column level. The name <constraint_name> is given optionally. The syntax:

    [ CONSTRAINT <constraint_name> ]
    { NOT NULL |
      NULL |
      DEFAULT <default_expr> }

    Possible constraints:

    • NOT NULL — specifies that the column is not allowed to contain null values.

    • NULL — specifies that the column is allowed to contain null values. It is the default behavior (unless NOT NULL is specified).

    • DEFAULT — sets the default column value equal to <default_expr>.

  • <server_name> — a server name.

  • <option> — foreign table options. For ADB to ADB Connector, all options defined at the server level (except host, port, num_segments) can be overwritten at the foreign table level (partially or fully). When you specify an option for a server and a foreign table at the same time, the table level takes precedence. Additionally, at the foreign table level, three more options are available (see Foreign table options below).

  • <value> — option values.

NOTE
  • To create a foreign table, you need the USAGE privilege on the corresponding server and on all column data types that are used in the table definition. The user who creates the foreign table becomes its owner.

  • The full description of the CREATE FOREIGN TABLE command syntax is available in the Greenplum documentation.

  • To edit a foreign table definition, use the ALTER FOREIGN TABLE command. To drop a foreign table, use DROP FOREIGN TABLE.

Foreign table options
Name Type Description Default Required

schema_name

TEXT

A schema name that is used to access a table on the remote ADB cluster side. If the parameter is undefined, a schema of the table_name table is used

 — 

No

table_name

TEXT

A target table name on the remote ADB cluster side

 — 

Yes

column_name

TEXT

A column name of the target table table_name

 — 

No

Found a mistake? Seleсt text and press Ctrl+Enter to report it