ADPG architecture overview

Eugenia Kuzina

Arenadata Postgres (ADPG) is a DBMS designed to handle workloads of different profiles efficiently, primarily OLTP (Online Transaction Processing). ADPG works with data sets of various sizes and supports a wide range of data types (including JSON and custom types), along with the ability to create your own data processing mechanisms.

ADPG architecture

The main components of the ADPG cluster in the diagram:

BackupAgent — creates a backup and sends it to the repository. BackupAgent is based on the pgBackRest utility.
BackupManager — manages the creation and storage of backups.
ADPG leader — the main host that has the right to perform write transactions.
ADPG replica — additional hosts that are replicas of the ADPG leader.
etcd — distributed configuration storage.
Proxies — HAProxy instances that distribute read/write transactions between the leader host and the replica hosts.
Connection Poolers — the PgBouncer instances installed on each ADPG node that manage the number of connections and reuse them.

ADPG uses Patroni to build the cluster on top of PostgreSQL streaming replication and to implement High Availability (HA). An ADPG cluster that supports HA should contain several ADPG nodes. Each node consists of a PostgreSQL server and a Patroni agent that serves the local PostgreSQL instance. One ADPG node is the leader; the others are replicas. The leader serves read-write transactions (unless otherwise specified in the load balancing settings), and replicas process only read-only requests. If the leader fails, Patroni elects a new leader from the replicas, and the ADPG cluster continues to operate.
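The failover behavior described above can be illustrated with a toy model. This is a minimal sketch and not Patroni's actual election algorithm: the `Node` class, the `replayed_lsn` field, and the rule "promote the healthy replica that has applied the most WAL" are assumptions made for illustration only.

```python
from dataclasses import dataclass


@dataclass
class Node:
    name: str
    role: str               # "leader" or "replica"
    healthy: bool = True
    replayed_lsn: int = 0   # hypothetical: amount of WAL the node has applied


def elect_leader(nodes):
    """Return the current leader, promoting a replica if the leader is down."""
    # A healthy leader keeps its role; nothing changes.
    for node in nodes:
        if node.role == "leader" and node.healthy:
            return node

    # Otherwise choose among healthy replicas (assumed rule: most WAL applied).
    candidates = [n for n in nodes if n.healthy and n.role == "replica"]
    if not candidates:
        raise RuntimeError("no healthy replica available for promotion")
    new_leader = max(candidates, key=lambda n: n.replayed_lsn)

    # The failed leader loses its role, and the chosen replica is promoted.
    for node in nodes:
        if node.role == "leader":
            node.role = "replica"
    new_leader.role = "leader"
    return new_leader
```

For example, with one leader and two replicas, marking the leader as unhealthy and calling `elect_leader` again promotes the replica with the larger `replayed_lsn`, and the cluster still has exactly one leader.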

Patroni saves the cluster configuration in the Distributed Configuration Storage (DCS). ADPG requires an etcd cluster, which serves as the Patroni DCS.

To implement load balancing, ADPG uses HAProxy (High Availability Proxy). HAProxy is a software TCP/HTTP load balancer. It listens on ports in pairs: connections to one port of the pair are forwarded to the leader (directly or through PgBouncer), and connections to the other port are distributed across ADPG nodes (directly or through PgBouncer). For more information, see Load balancing.
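The port-pair idea can be sketched as a small routing function. This is an illustration only, not HAProxy configuration or code: the class name, the port numbers in the usage note, and the round-robin distribution are assumptions, and the real balancing policy is defined in the load balancing settings.

```python
import itertools


class PortPairRouter:
    """Toy model of one port pair: a leader port and a balanced port."""

    def __init__(self, leader, nodes, leader_port, balanced_port):
        self.leader = leader
        self.leader_port = leader_port      # hypothetical port number
        self.balanced_port = balanced_port  # hypothetical port number
        # Assumption: plain round-robin over the given nodes.
        self._next_node = itertools.cycle(nodes)

    def route(self, port):
        """Return the node that should receive a connection made to `port`."""
        if port == self.leader_port:
            return self.leader
        if port == self.balanced_port:
            return next(self._next_node)
        raise ValueError(f"port {port} does not belong to this pair")
```

With `PortPairRouter("node1", ["node1", "node2", "node3"], 5000, 5001)`, every connection to port 5000 goes to `node1`, while successive connections to port 5001 go to `node1`, `node2`, `node3`, and then `node1` again.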

PgBouncer is used as a connection pooler. An application connects to PgBouncer as if it were an ADPG server, and PgBouncer either creates a connection to the ADPG server or reuses one of its existing connections. PgBouncer is designed to reduce the performance impact of opening new connections. To avoid compromising transaction semantics, PgBouncer supports several pooling modes that determine when a server connection can be reused:

Session pooling — when a client connects, a server connection is assigned to it for the entire time the client is connected. When the client disconnects, the server connection returns to the pool. This is the default method.
Transaction pooling — a server connection is assigned to a client only during a transaction. When PgBouncer notices that the transaction is complete, the server connection is put back into the pool.
Statement pooling — the server connection is returned to the pool immediately after a query completes. Multi-statement transactions are prohibited in this mode, because they may be disrupted.
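The difference between the three modes comes down to when a server connection goes back to the pool. The toy model below makes that visible. It is a sketch under stated assumptions, not PgBouncer's implementation: the `ToyPool` class and its methods are hypothetical, and server connections are represented by plain integers.

```python
class ToyPool:
    """Toy connection pool with session, transaction, and statement modes."""

    def __init__(self, size, mode):
        self.mode = mode              # "session", "transaction" or "statement"
        self.free = list(range(size))  # idle server connections
        self.assigned = {}            # client -> server connection
        self.in_txn = set()           # clients inside an explicit transaction

    def _acquire(self, client):
        if client not in self.assigned:
            if not self.free:
                raise RuntimeError("pool exhausted")
            self.assigned[client] = self.free.pop()
        return self.assigned[client]

    def _release(self, client):
        conn = self.assigned.pop(client, None)
        if conn is not None:
            self.free.append(conn)

    def connect(self, client):
        # Session mode: the server connection is held from connect onward.
        if self.mode == "session":
            self._acquire(client)

    def begin(self, client):
        if self.mode == "statement":
            raise RuntimeError("multi-statement transactions are prohibited")
        self._acquire(client)
        self.in_txn.add(client)

    def query(self, client):
        conn = self._acquire(client)
        # Statement mode releases after every query; transaction mode
        # releases after a query that runs outside an explicit transaction.
        if self.mode == "statement" or (
            self.mode == "transaction" and client not in self.in_txn
        ):
            self._release(client)
        return conn

    def commit(self, client):
        self.in_txn.discard(client)
        # Transaction mode: the connection returns to the pool at commit.
        if self.mode == "transaction":
            self._release(client)

    def disconnect(self, client):
        self.in_txn.discard(client)
        self._release(client)
```

With a pool of size 1, two clients can stay connected at the same time in transaction mode as long as their transactions do not overlap, whereas in session mode the second client cannot get a server connection until the first one disconnects.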

Another part of ADPG is Storage, a host that stores backups. You can use any type of storage supported by the pgBackRest utility.

ADPG cluster management is performed via ADCM. The following actions are available in the ADCM UI:

Cluster actions that allow you to manage the entire cluster with all services.
Service actions that manage a single service.
