Конференция Arenadata
Новое время — новый Greenplum
Мы приглашаем вас принять участие в конференции, посвященной будущему Open-Source Greenplum 19 сентября в 18:00:00 UTC +3. Встреча будет проходить в гибридном формате — и офлайн, и онлайн. Онлайн-трансляция будет доступна для всех желающих.
Внезапное закрытие Greenplum его владельцем — компанией Broadcom - стало неприятным сюрпризом для всех, кто использует или планирует начать использовать решения на базе этой технологии. Многие ожидают выхода стабильной версии Greenplum 7 и надеются на её дальнейшее активное развитие.
Arenadata не могла допустить, чтобы разрабатываемый годами Open-Source проект Greenplum прекратил своё существование, поэтому 19 сентября мы представим наш ответ на данное решение Broadcom, а участники сообщества получат исчерпывающие разъяснения на все вопросы о дальнейшей судьбе этой технологии.

На конференции вас ждёт обсуждение следующих тем:

  • План возрождения Greenplum;
  • Дорожная карта;
  • Экспертное обсуждение и консультации.
Осталось до события

Configuration parameters

This article describes the parameters that can be configured for ADPG services via ADCM. To read about the configuring process, refer to the relevant article Configure services.

NOTE
Some of the parameters become visible in the ADCM UI after the Show advanced flag being set.

ADPG

Parameter Description Default value

Data directory

Directories that are used to store data on the ADPG hosts

/pg_data1

Sysctl parameters

This section describes Linux kernel parameters. To add a new parameter, click the child node Add property of the Sysctl parameters [key:value] option.

Sysctl parameters
Sysctl parameters

In the window that appears, specify the parameter name and value and click Apply.

The window with a Sysctl parameter
The window with a Sysctl parameter

You can add multiple parameters.

External Etcd cluster

ADPG uses etcd as the Patroni Distributed Configuration Store (DCS). If you use an external etcd cluster, switch on the corresponding toggle button, expand the External Etcd [ip_address:port] node, and click Add property.

External Etcd settings
External Etcd settings

In the window that appears, specify the IP address of the etcd server in the first row and the port in the second row.

The window with external Etcd settings
The window with external Etcd settings

You can add multiple external etcd servers.

Enable PgBouncer

To use PgBouncer, switch on the Enable PgBouncer toggle button. When this toggle button is enabled, you can expand the corresponding node and specify settings listed in the table below.

PgBouncer settings
PgBouncer settings
Parameter name Description Default value

listen_port

The port that PgBouncer listens on. This port cannot be changed for a running cluster. Execute the Reconfigure & Restart action to apply changes

15432

pool_mode

Specifies when a server connection can be reused by other clients. It can contain the following values:

  • session — a server is released back to the pool after a client disconnects.

  • transaction — a server is released back to the pool after a transaction finishes.

  • statement — a server is released back to the pool after a query finishes. Transactions that include multiple statements are prohibited in this mode.

session

max_client_conn

The maximum number of client connections allowed

100

default_pool_size

The number of connections to the server that are allowed for each user/database pair

20

min_pool_size

The minimum number of connections to the server to keep in the pool

0

reserve_pool_size

The number of additional connections allowed to the pool if a client connection exceeds the time specified in the reserve_pool_timeout option

0

reserve_pool_timeout

If a client needs to wait longer than the specified time in seconds, additional connections from reserve_pool_size are used

5

max_db_connections

The maximum number of connections to a server for a database

0

max_user_connections

The maximum number of connections to a server for a user

0

Enable all users

When enabled, this option allows access to all users with password authentication. If the option is disabled, utilize Users list to allow specified users to access ADPG with PgBouncer

Enabled

Users list

List of users who are allowed to access ADPG using PgBouncer, when the Enable all users option is disabled. For example: postgres,user1

Empty

Enable all databases

When enabled, this option allows access to all existing databases with PgBouncer. If the option is disabled, utilize Databases list to allow access to specified databases using PgBouncer

Databases list

List of databases that are allowed to access using PgBouncer when the Enable all databases option is disabled. It has the same syntax as the databases PgBouncer section. For example, postgres = host=localhost port=5432

Empty

Use custom pg_hba.conf

When disabled, the general PG_HBA section of the ADPG service will be used for PgBouncer. If the option is enabled, it is necessary to add required settings to Custom pg_hba.conf. In this case, PgBouncer will use settings from Custom pg_hba.conf

Disabled

Custom pg_hba.conf

Custom pg_hba.conf section for PgBouncer

Empty

After you specify required settings, click Save and execute the Reconfigure & Restart action to apply changes.

To make PgBouncer monitoring work correctly, call the cluster action Reconfigure monitoring agents after changing settings from the Enable pgbouncer configuration section.

Enable backups

To enable creating backups, switch on the Enable backups toggle button. When this toggle button is enabled, you can expand the corresponding node and specify settings listed in the table below. Note that the S3 options only have an effect if Repo type is set to s3.

Enable backups
Enable backups
Parameter name Description Default value

Archive timeout(s)

The amount of time (in seconds) to wait before forcing a switch to the next WAL file

1800

Backup stanza

The name of the current stanza. The stanza is created with the specified name and the added prefix: adpg<version>-. For example, if the specified name is mystanza, the result — adpg14-mystanza

 — 

Backup repo

The path to the repository where pgBackRest should store backups and archives WAL segments

 — 

Repo type

A storage type. You can use the following values:

  • posix — a storage that is compatible with POSIX.

  • s3 — an S3 storage. If you select the S3 type, specify S3 configuration settings listed below.

  • cifs — a storage that supports the CIFS protocol.

posix

S3 URI style

An S3 URI style. The following URI styles are supported:

  • host — <bucket>.<endpoint>.<host>. For example, mybucket.us-east-1.amazonaws.com.

  • path — <endpoint>.<host>/<bucket>. For example, us-east-1.amazonaws.com/mybucket.

path

S3 Region

An S3 repository region where the bucket was created

 — 

S3 Bucket

An S3 bucket used to store the repository with backups

 — 

S3 Endpoint

An S3 repository endpoint. The endpoint should be valid for the specified region

 — 

S3 Key

An S3 repository access key used to access the bucket

 — 

S3 Key Secret

An S3 repository secret access key used to access this bucket

 — 

Retention full type

It is possible to choose whether a time (days) or a number will be used to determine which backups are expired. Retention full type determines whether the Retention full setting represents a time period (days) or a number of full backups to keep. The following Retention full type values are available:

  • count — the number of backups to keep.

  • time — full backups older than the Retention full value (in days) will be removed from the repository if there is at least one backup that expires in a number of days equal to or greater than Retention full.

count

Retention full

A number of full backups to retain or the number of days to keep each full backup depending on Retention full type

7

Retention diff

A number of differential backups to retain

 — 

Enable compression

Enables the backup compression

false

Compress type

A backup compress type. You can use one of the following values: bz2, gz, lz4, and zst

gz

Compress level

A file compression level (from 0 to 9) used when Compression type does not equal none

The following default compression levels are used based on Compression type:

  • bz2 — 9;

  • gz — 6;

  • lz4 — 1;

  • zst — 3.

Log path

A path to the directory with log files

/var/log/adpg14-pgbackrest

Log level

A logging level of messages written to log files. Possible values are: off, error, warn, info, detail, debug, and trace.

info

Use custom config

If checked, all the pgBackRest configuration settings except the stanza name (Backup stanza) are applied from the Global options and Custom options parameters. Specify these parameters to add settings to the pgBackRest configuration file

false

Global options

Settings to add to the [global] section of the pgBackRest configuration file

 — 

Custom options

Settings to add to the pgBackRest configuration file

 — 

NOTE
If you check Use custom config and specify custom pgBackRest settings, ADPG writes these settings to the pgBackRest configuration file without checking. In case of incorrect settings, an error message will not be displayed in the ADCM UI. You need to verify the settings and check if the directory for storing backups exists and is accessible by yourself.

For more information, see Backup and restore using pgBackRest.

Patroni ADPG configurations

This section describes Patroni configuration settings that can be changed in ADCM UI. For more information on these settings, refer to ADPG High Availability overview.

Parameter name Description Default value

synchronous_mode

Enables Patroni synchronous mode

false

synchronous_node_count

Specifies the number of synchronous standby nodes. The parameter value should not be more than the count of hosts with the ADPG component of the ADPG service installed on. This parameter is ignored if synchronous_mode is disabled

1

synchronous_mode_strict

Prevents Patroni from switching off the synchronous replication on the primary when no synchronous standby candidates are available. Enables synchronous_mode_strict in addition to synchronous_mode to guarantee that each entry is stored on at least two nodes

false

maximum_lag_on_failover

Specifies the amount of transactions in bytes that can be lost

1048576

ttl

The TTL to acquire the leader lock (in seconds). It defines the length of time before initiation of the automatic failover process

30

loop_wait

The time that Patroni waits (sleeps) before starting a new loop iteration (in seconds)

10

retry_timeout

The timeout for DCS and PostgreSQL operation retries (in seconds). DCS or network issues shorter than this value do not cause Patroni to demote the leader

10

patroni_listen_port

The port that Patroni listens to for the REST API. Can only be changed before the ADPG service installation

8008

use_custom_location

When enabled, you can specify custom logging settings for Patroni. The parameter requires to execute the Reconfigure & Restart action to apply changes

false

patroni_log_dir

An existing folder to store Patroni logs. The parameter only has an effect if use_custom_location is set to true. Execute the Reconfigure & Restart action to apply changes

/var/log/adpg14/patroni

patroni_log_file_size

The size limit of a single Patroni log file, in bytes. The parameter only has an effect if use_custom_location is set to true. Execute the Reconfigure & Restart action to apply changes

25000000

patroni_log_file_num

The number of log files to retain. The parameter only has an effect if use_custom_location is set to true. Execute the Reconfigure & Restart action to apply changes

4

patroni_log_level

The level of logging. It can contain the following values: critical, error, warning, info, debug, and notset. The parameter only has an effect if use_custom_location is set to true. Execute the Reconfigure & Restart action to apply changes

info

ADPG configurations

The ADPG configurations section allows you to specify parameters from the postgresql.conf file (the default path is /pg_data1/adpg14/postgresql.conf). Use ADCM to set these parameters. When the ADPG service executes the Reconfigure & Restart or Reconfigure & Reload action, postgresql.conf is rewritten with settings specified on the Primary configuration tab of the ADPG service. If you make changes directly to postgresql.conf, these changes will be lost.

After modifying ADPG configuration parameters listed below, execute the Reconfigure & Reload or Reconfigure & Restart action to apply changes.

Parameter name Description Default value

port

The TCP port the server listens on. Execute the Reconfigure & Restart action to apply changes

5432

max_connections

Determines the maximum number of concurrent connections to the server. For a replica host, the value of this parameter must be greater than or equal to the value on the leader host. If this requirement is not met, the replica host will reject all requests. Execute the Reconfigure & Restart action to apply changes

100

shared_buffers

Sets the amount of memory the server used for the shared memory buffer. The higher the value of this parameter, the less the load on the host hard drives will be. Execute the Reconfigure & Restart action to apply changes

128 MB

max_worker_processes

Sets the maximum number of background processes that the system can support. Execute the Reconfigure & Restart action to apply changes

8

max_parallel_workers

Sets the maximum number of workers that the system can support for parallel operations. Execute the Reconfigure & Restart action to apply changes

8

max_parallel_workers_per_gather

Sets the maximum number of workers that can be started by a single Gather or Gather Merge node. Execute the Reconfigure & Restart action to apply changes

2

max_parallel_maintenance_workers

Sets the maximum number of parallel workers that can be started by a single utility command. Execute the Reconfigure & Restart action to apply changes

2

effective_cache_size

Sets the planner’s assumption about the effective size of the disk cache that is available to a single query. This is taken into account when estimating the cost of using the index. A higher value makes it more likely index scans will be used, a lower value makes it more likely sequential scans will be applied. When setting this parameter, you should consider both PostgreSQL shared buffers and the portion of the kernel’s disk cache that will be used for PostgreSQL data files, though some data might exist in both places. Also, take into account the expected number of concurrent queries on different tables, since they will have to share the available space. This parameter does not affect the size of shared memory allocated by PostgreSQL, and it does not reserve kernel disk cache. It is used only for estimation purposes. The system also does not assume data remains in the disk cache between queries. If this value is specified without units, it is taken as blocks, that is BLCKSZ bytes, typically 8 KB

4096 MB

maintenance_work_mem

Specifies the maximum amount of memory to be used by maintenance operations, such as VACUUM, CREATE INDEX, and ALTER TABLE ADD FOREIGN KEY. If this value is specified without units, it is taken as kilobytes. Since only one of these operations can be executed at a time by a session, and they are usually not executed in parallel in a cluster, it is safe to set this value significantly larger than work_mem. Larger settings might improve performance of vacuuming and restoring database dumps

64 MB

work_mem

Sets the base maximum amount of memory to be used by a query operation (such as a sort or hash table) before writing to temporary disk files. Note that for a complex query, several sort or hash operations might be running in parallel. Each operation will be allowed to use as much memory as this value specifies before it starts to write data into temporary files. Several running sessions can also do such operations concurrently. Therefore, the total memory used can be many times greater than the value of work_mem. If this value is specified without units, it is taken as kilobytes

4 MB

min_wal_size

Until WAL disk usage stays below the min_wal_size value, old WAL files are recycled for future use at a checkpoint instead of removing. This ensures that enough WAL space is reserved to handle spikes in WAL usage, for example, when running large batch jobs

80 MB

max_wal_size

Sets the memory limit to which the log size can grow between automatic checkpoints. Increasing this setting may increase the recovery time after a failure. The specified limit can be exceeded automatically with a high load on ADPG

1024 MB

wal_keep_size

Sets the minimum size of segments retained in the pg_wal directory, in case a standby server needs to fetch them for streaming replication. If a standby server connected to the sending server falls behind by more than wal_keep_size megabytes, the sending server might remove a WAL segment still needed by the standby. In this case, the replication connection is terminated. Downstream connections also fail as a result. If WAL archiving is enabled, the standby server can fetch the segment from archive and recover. The wal_keep_size parameter sets only the minimum size of segments retained in pg_wal. The system might need to retain more segments for WAL archival or to recover from a checkpoint. If wal_keep_size is zero, the system does not keep any extra segments for standby purposes. In the Enterprise version, if the wal_keep_size value is set to less than 16 MB, Patroni uses 128 MB instead in its configuration. The value in the ADCM UI is not changed

0 MB

huge_pages

Defines whether huge pages can be requested for the main shared memory area. The following values are valid:

  • Try — the server tries to request huge pages. If this operation fails, the server falls back to the default page.

  • On — if a huge page request fails, the server does not start.

  • Off — huge pages are not used.

try

superuser_reserved_connections

Determines the number of connection "slots" that are reserved for PostgreSQL superuser connections. Execute the Reconfigure & Restart action to apply changes

3

logging_collector

Enables the logging collector. The logging collector is a background process that captures log messages sent to stderr and redirects them into log files. Execute the Reconfigure & Restart action to apply changes

The logging collector is enabled (true)

log_directory

Determines the directory that contains log files. It can be specified as an absolute path, or relative to the cluster data directory

log
(the absolute path is /pg_data1/adpg14/log)

log_filename

Specifies the log file name pattern. The value can include strftime %-escapes to define time-varying file names. If you specify a file name pattern without escapes, use a log rotation utility to save disk space

postgresql-%a.log

log_rotation_age

Determines the maximum period of time to use a log file, after which a new log file is created. If this value is specified without units, it is taken as minutes. Set log_rotation_age to 0 to disable time-based log file creation

1d

log_rotation_size

Determines the maximum size of a log file. After a log file reaches the specified size, a new log file is created. If the value is set without units, it is taken as kilobytes. Set log_rotation_size to 0 to disable size-based log file creation

0

log_min_messages

Specifies the minimum severity level of messages that are written to a log file. Valid values are debug5, debug4, debug3, debug2, debug1, info, notice, warning, error, log, fatal, and panic (see Severity levels table). Messages with the specified severity or higher are included in the log file. For example, if you set log_min_messages to warning, the log file will include the warning, error, log, fatal, and panic messages

warning

log_min_error_statement

Specifies which SQL statements that cause errors are logged. Valid values are debug5, debug4, debug3, debug2, debug1, info, notice, warning, error, log, fatal, and panic see Severity levels table). The log file includes SQL statements for messages of the specified severity or higher. To disable error statement logging, set log_min_error_statement to panic

error

postgresql.conf custom section

You can use postgresql.conf custom section to set additional parameters that are written to the postgresql.conf file. To add parameters, switch on the corresponding toggle button, expand the postgresql.conf [key:value] node and click Add properties node.

The postgresql.conf custom section option
The "postgresql.conf custom section" option

In the window that appears, specify the parameter name and value and click Apply.

The window with a parameter
The window with a parameter

You can add multiple parameters.

Parameters from postgresql.conf custom section have lower priority than other parameters. ADCM does not validate these parameters. Changes in the postgresql.conf file is reverted if the Start action of the ADPG service fails.

PG_HBA

The section allows you to add lines to the pg_hba.conf file. The pg_hba.conf file configures the client authentication.

Balancer

Parameter Description Default value

leader as replica

Specifies if the leader handles read only transactions

False

leader_port

Sets a port on the host with the HAProxy component for write transactions

6432

replica_port

Specifies a port on the host with the HAProxy component for read only transactions

6433

balancer_stats_port

Defines a port on which a web page with a HAProxy statistics report is available

7000

pgbouncer_leader_port

All connections to this port are transferred to a PgBouncer instance of an ADPG leader node

16432

pgbouncer_replica_port

All connections to this port are transferred to PgBouncer instances of ADPG replica nodes

16433

Chrony

Parameter Description Default value

NTP servers

Addresses of valid NTP servers. For example, 0.ru.pool.ntp.org, 1.ru.pool.ntp.org. If this parameter is set, then Master takes the time from NTP servers, Secondary Master — from Master or NTP servers, Segment hosts — from Master or Secondary Master. If the parameter is not set or all NTP servers become unavailable, then Master uses its local clocks, Secondary Master uses the Master or local clocks, and Segment Hosts get the time from Master or Secondary Master

 — 

Etcd

IMPORTANT
The Etcd configuration settings cannot be changed after installation.
Parameter Description Default value

listen_peer_urls_port

Specifies the port for peer communication

2380

listen_client_urls_port

Specifies the port for client requests

2379

Metrics storage

Parameter Description Default value

Monitoring db name

Defines a database name that the Metrics storage service should use

adpg_metrics

Monitoring username

Specifies a username to connect to an ADPG monitoring cluster

adpg_metrics_sender

Monitoring user’s password

Specifies a password, which the adpg_metrics_sender user utilizes to connect to an ADPG cluster. The password can include uppercase [A-Z] and lowercase [a-z] English letters, digits [0-9], and special characters

 — 

Grafana TCP port

Defines a TCP port on which Grafana listens for connections

12012

Grafana admin’s password

Specifies a password for the admin user, which is the default username in Grafana

 — 

Found a mistake? Seleсt text and press Ctrl+Enter to report it