ADB monitoring metrics
This article describes metrics for monitoring an ADB cluster. For information on how to install monitoring, refer to the sections:
Collect metrics
The monitoring cluster collects metrics from all cluster hosts on which the Monitoring Clients service is installed. Refer to the ADCM Mapping tab to check which hosts are currently monitored. Metrics are collected on hosts using Diamond and custom scripts. Both tools send metrics to Graphite, which is installed on the monitoring cluster.
Diamond
Diamond is a Python daemon that collects system metrics. The Diamond configuration file is /etc/diamond/diamond.conf.
Diamond uses program components, called collectors, to gather system metrics such as CPU utilization, disk space usage, and input/output load. Collector configuration files reside in the /etc/diamond/collectors directory on each monitoring client host. Refer to the Diamond documentation for more information about Diamond and collectors. Besides common collectors, additional collectors can be used:
- If the PXF service is installed on a host, the PXFCollector component collects metrics related to PXF as a Java service.
- If ADBM or ADB Control is monitored, the DockerCollector component collects metrics related to the ADBM and ADB Control Docker containers.
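To see which collectors are switched on for a host, the files in /etc/diamond/collectors can be read directly. The sketch below assumes the common Diamond layout: one `<CollectorName>.conf` file per collector containing flat `key = value` lines such as `enabled = True`. The helper names are illustrative and are not part of Diamond.

```python
from pathlib import Path


def parse_collector_conf(text):
    """Parse flat `key = value` lines; skip blanks, comments, and [sections]."""
    settings = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("["):
            continue
        key, sep, value = line.partition("=")
        if sep:
            settings[key.strip()] = value.strip().strip("\"'")
    return settings


def is_enabled(settings):
    """Treat a collector as enabled only when `enabled` is explicitly true."""
    return settings.get("enabled", "false").lower() == "true"


def enabled_collectors(conf_dir="/etc/diamond/collectors"):
    """Return names of collectors whose config file sets enabled = True."""
    names = []
    for path in sorted(Path(conf_dir).glob("*.conf")):
        if is_enabled(parse_collector_conf(path.read_text())):
            names.append(path.stem)
    return names


if __name__ == "__main__":
    print(enabled_collectors())
```

On a monitoring client host where PXF is installed, PXFCollector would be expected in the printed list. On a host without Diamond, the list is empty.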
Custom scripts
Custom ADB scripts extend monitoring capabilities by providing data on the ADB cluster and database activity. By default, monitoring scripts are located in the /home/gpadmin/arenadata_configs directory on the master host. These are:
- arenadata_segments_monitor.sh: collects information about the cluster, such as the number of segments and their mirrors, their state, and replication lag.
- db_datfrozenxid_alerter.sh: monitors transaction ID wraparound risk.
- pxf-monitor.sh: monitors the status and uptime of the PXF service. It is available only on hosts where the PXF service is installed.
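The segment counters reported for the cluster (the segments metric group described later in this article) can be pictured as simple counts over rows of the gp_segment_configuration catalog table. The sketch below is not the actual implementation of arenadata_segments_monitor.sh. It assumes the rows have already been fetched as dictionaries with the `role`, `preferred_role`, and `status` columns, and it counts every row it is given (whether the real script includes master entries is not stated here). It formats the result as Graphite plaintext lines (`path value timestamp`). The metric prefix is hypothetical.

```python
import time


def segment_metrics(rows):
    """Count segments by configured and current role.

    Each row is a dict with the gp_segment_configuration columns
    role, preferred_role ('p' or 'm') and status ('u' or 'd').
    """
    return {
        "TOTAL_SEGMENTS": len(rows),
        "TOTAL_PRIMARY_SEGMENTS": sum(r["preferred_role"] == "p" for r in rows),
        "UP_SEGMENTS": sum(r["status"] == "u" for r in rows),
        # Assumption: a mirror acting as primary has preferred_role 'm' and role 'p'.
        "MIRRORS_AS_PRIMARY": sum(
            r["preferred_role"] == "m" and r["role"] == "p" for r in rows
        ),
    }


def to_plaintext(prefix, metrics, timestamp=None):
    """Format metrics as Graphite plaintext lines: '<path> <value> <timestamp>'."""
    ts = int(time.time() if timestamp is None else timestamp)
    return [f"{prefix}.{name} {value} {ts}" for name, value in sorted(metrics.items())]


if __name__ == "__main__":
    sample = [
        {"role": "p", "preferred_role": "p", "status": "u"},
        {"role": "m", "preferred_role": "m", "status": "u"},
        {"role": "m", "preferred_role": "p", "status": "d"},
        {"role": "p", "preferred_role": "m", "status": "u"},
    ]
    # "cluster_1.database.segments" is a made-up prefix for illustration.
    for line in to_plaintext("cluster_1.database.segments", segment_metrics(sample), 0):
        print(line)
```

In the sample, one primary is down and its mirror has taken over, so MIRRORS_AS_PRIMARY is non-zero while UP_SEGMENTS is lower than TOTAL_SEGMENTS.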
View metrics
Graphite
Graphite gets metrics from Diamond and monitoring scripts.
To view metrics in the Graphite UI, enter the address of the host with the monitoring cluster and the Graphite port (80 by default) in a browser URL bar, for example, 192.0.2.5:80.
You can check and modify this IP address and port in the ADCM user interface, in the Graphite service configuration, using the ip_and_ports → Host IP address and ip_and_ports → Web-interface TCP port parameters.
These values are set when you configure the Graphite service during installation of the monitoring cluster.
On the left side of the window that opens, expand the Metrics → Arenadata → DB → <Cluster_id> node. Two groups of metrics are available:
- System_metrics: shows general characteristics of hosts, usually related to resource consumption.

System metrics
Metrics group | Description |
---|---|
cpu | CPU utilization |
diskspace | Disk space usage |
docker | Metrics related to Docker containers. They are available if ADBM or ADB Control is being monitored |
files | File statistics |
iostat | Input/output operation performance |
loadavg | System load averages |
memory | Memory usage |
netstat | Network connection statistics |
network | Network interface performance |
pxfjson | PXF service metrics. The group is available only for hosts where the PXF service is installed |
uptime | Time elapsed since the system was last restarted |

- database: provides data on the database, segments, and transactions.

Database metrics
Metric group | Metric name | Description |
---|---|---|
available | is_available | Whether a database is available |
db_datfrozenxid | Existing database names | The oldest transaction age |
replication | REPLICATION_LAG | Sync delay (in bytes) between the master and the standby master |
replication | REPLICATION_STATE | The state of the WAL streaming replication process. Possible values: streaming (the master is streaming changes after its connected standby server has caught up with the primary); startup (the master is starting up); catchup (the master's connected standby is catching up with the primary); backup (the master is sending a backup); inactive (replication is disabled) |
segments | MIRRORS_AS_PRIMARY | The number of mirror segments that are currently running in the primary role |
segments | TOTAL_PRIMARY_SEGMENTS | The number of segments that are configured to operate in the primary role (preferred_role = 'p' in gp_segment_configuration) |
segments | TOTAL_SEGMENTS | The total number of configured primary and mirror segments |
segments | UP_SEGMENTS | The number of segments that are currently online and operational (status = 'u' in gp_segment_configuration) |
sessions | LONGEST_XACT_SESS_ID | The session ID of the longest-running active transaction |
sessions | LONGEST_XACT_TIME | The duration (in seconds) of the longest-running active transaction |
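Besides the UI, Graphite serves the same metrics over its HTTP render API (`/render` with the `target`, `from`, and `format` parameters). The following is a minimal sketch of reading one metric from a script. It assumes that the metric tree shown in the UI maps to a dot-separated path (`Arenadata.DB.<Cluster_id>...`) and uses a made-up cluster ID of 1. Check the exact path in the Graphite UI before relying on it.

```python
import json
from urllib.parse import urlencode
from urllib.request import urlopen


def render_url(host, port, target, period="-1h"):
    """Build a Graphite render API URL that returns JSON datapoints."""
    query = urlencode({"target": target, "from": period, "format": "json"})
    return f"http://{host}:{port}/render?{query}"


def latest_value(series):
    """Return the most recent non-null value from one render API series."""
    for value, _timestamp in reversed(series["datapoints"]):
        if value is not None:
            return value
    return None


def fetch(url, timeout=10):
    """Download and decode the JSON list of series."""
    with urlopen(url, timeout=timeout) as response:
        return json.load(response)


if __name__ == "__main__":
    # The cluster ID (1) and the dotted path are assumptions derived from the tree above.
    target = "Arenadata.DB.1.database.segments.UP_SEGMENTS"
    print(render_url("192.0.2.5", 80, target))
```

Calling `fetch()` on the resulting URL returns a list of series. Each series holds `[value, timestamp]` pairs in `datapoints`, which `latest_value()` scans from the newest entry backwards.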


Grafana
Grafana allows you to visualize metrics stored in Graphite, create your own dashboards, or modify existing ones.
To view Grafana dashboards, enter the address of the host with the monitoring cluster and the Grafana port (3000 by default) in a browser URL bar, for example, 192.0.2.5:3000.
You can check and modify this IP address and port in the ADCM user interface, in the Grafana service configuration, using the ip_and_ports → Host IP address and ip_and_ports → Port parameters.
To log in to Grafana, use the values of the security → Username and security → Password parameters of the Grafana service configuration.
These values are set when you configure the Grafana service during installation of the monitoring cluster.
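The same address and credentials can also be used from a script through Grafana's HTTP API. The sketch below builds a request to the dashboard search endpoint (`/api/search`) with basic authentication. It assumes basic authentication is enabled on the Grafana instance. The host, port, and credentials shown are placeholders, not values from any real installation.

```python
import base64
import json
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def search_request(host, port, username, password, query=""):
    """Build a GET request for Grafana's dashboard search endpoint."""
    params = urlencode({"type": "dash-db", "query": query})
    token = base64.b64encode(f"{username}:{password}".encode()).decode()
    return Request(
        f"http://{host}:{port}/api/search?{params}",
        headers={"Authorization": f"Basic {token}"},
    )


def dashboard_titles(request, timeout=10):
    """Send the request and return the titles of the dashboards found."""
    with urlopen(request, timeout=timeout) as response:
        return [item["title"] for item in json.load(response)]


if __name__ == "__main__":
    # Placeholder credentials; use the security -> Username/Password values.
    req = search_request("192.0.2.5", 3000, "admin", "secret", "Arenadata")
    print(req.full_url)
```

Passing the request to `dashboard_titles()` on a reachable Grafana host would list the dashboards whose titles match the query.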
Grafana dashboards
By default, the following dashboards are available in Grafana:
Arenadata DB system cluster <Cluster name>
The Arenadata DB system cluster <Cluster name> dashboard consists of two sections: Database and System.
The Database section contains panels that show information about the cluster, such as the master replication state and segment statuses.

The table below describes the Grafana dashboard panels available in the Database section.
Panel name | Description | Graphite source |
---|---|---|
Database is | The database state | <Cluster_id>/database/available/is_available |
Mirrors as primaries | The number of mirror segments that are running in the primary role | <Cluster_id>/database/segments/MIRRORS_AS_PRIMARY |
Database segments | The total number of segments and their status | <Cluster_id>/database/segments/TOTAL_SEGMENTS |
Longest Transaction (sec) | The duration (in seconds) of the longest-running active transaction | <Cluster_id>/database/sessions/LONGEST_XACT_TIME |
Longest transaction (sess_id) | The session ID of the longest-running active transaction | <Cluster_id>/database/sessions/LONGEST_XACT_SESS_ID |
Master replication state is | The state of the WAL streaming replication process | <Cluster_id>/database/replication/REPLICATION_STATE |
Replication delay | Sync delay (in bytes) between the master and the standby master | <Cluster_id>/database/replication/REPLICATION_LAG |
Wraparound warn percentage | Shows how close the current transaction ID age is to the warning limit | <Cluster_id>/database/db_datfrozenxid |
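The exact formula behind the Wraparound warn percentage panel is not given here. One plausible reading, sketched below purely as an assumption, expresses a database's oldest transaction age as a percentage of the age at which warnings begin. Both function names and the `warn_age` threshold are hypothetical. The real threshold depends on the cluster's transaction ID limit settings.

```python
def wraparound_warn_percent(xid_age, warn_age):
    """Return xid_age as a percentage of warn_age, capped at 100."""
    if warn_age <= 0:
        raise ValueError("warn_age must be positive")
    return min(100.0, 100.0 * xid_age / warn_age)


def worst_database(ages, warn_age):
    """Return (name, percent) for the database closest to the warning limit."""
    name = max(ages, key=ages.get)
    return name, wraparound_warn_percent(ages[name], warn_age)


if __name__ == "__main__":
    # Made-up ages per database and a made-up warning threshold.
    ages = {"postgres": 150_000_000, "adb": 600_000_000}
    print(worst_database(ages, warn_age=1_500_000_000))
```

Because db_datfrozenxid is reported per database, taking the maximum across databases gives a single figure for the cluster.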
The System section shows the system performance metrics.

The table below describes the Grafana dashboard panels available in the System section.
Panel name | Description | Graphite source |
---|---|---|
CPU usage | CPU utilization rate in percent | System_metrics/<host>/cpu |
IOPS | The number of input/output operations per second | System_metrics/<host>/iostat/<disk_device>/iops |
IO % | Percentage of time the disk is performing I/O operations | System_metrics/<host>/iostat/<disk_device>/util_percentage |
Mb per sec | Read/write data transfer rate | System_metrics/<host>/iostat/<disk_device>/read_byte_per_second, System_metrics/<host>/iostat/<disk_device>/write_byte_per_second |
Await | The average time (in milliseconds) for I/O requests issued to the device to be served | System_metrics/<host>/iostat/<disk_device>/await |
Service time | The average service time (in milliseconds) for I/O requests that were issued to the device | System_metrics/<host>/iostat/<disk_device>/service_time |
Network receive bytes | Bytes received by a network interface | System_metrics/<host>/network/<network_interface>/rx_byte |
Network transmit bytes | Bytes transmitted by a network interface | System_metrics/<host>/network/<network_interface>/tx_byte |
Available memory | How much memory is available for starting new applications without triggering swapping | System_metrics/<host>/memory/MemAvailable |
Memory free | Absolute amount of unused physical memory | System_metrics/<host>/memory/MemFree |
Disk Space Usage - datadirs | Available disk space in the data directories as a percentage of the total | System_metrics/<host>/diskspace/<data_directory>/byte_percentfree |
Disk Space Usage - / | Available disk space in the root (/) file system as a percentage of the total | System_metrics/<host>/diskspace/root/byte_percentfree |
LoadAVG | 1-minute load average | System_metrics/<host>/loadavg/01 |
Processes running | The number of processes running | System_metrics/<host>/loadavg/processes_running |
Processes total | The total number of processes on the system | System_metrics/<host>/loadavg/processes_total |
Arenadata DB cluster <Cluster name> ADCC
This dashboard shows information about ADB Control and ADBM agents that are installed on the master host and on every segment host. The dashboard also includes information about ADB Control and ADBM Docker containers.

Panel name | Description | Graphite source |
---|---|---|
Agents uptime | How long ADB Control and ADBM agents have been running since they were last restarted | System_metrics/<host>/adbm_agent_uptime, System_metrics/<host>/adcc_agent_uptime |
Agent CPU usage | CPU utilization rate (in percent) by ADB Control and ADBM agents | System_metrics/<host>/adbm_agent_cpu, System_metrics/<host>/adcc_agent_cpu |
Agent memory usage | Memory used (in MB) by ADB Control and ADBM agents | System_metrics/<host>/adbm_agent_mem, System_metrics/<host>/adcc_agent_mem |
Containers uptime | Uptime of ADBM and ADB Control Docker containers | System_metrics/<host>/docker/containers/<container_name>/uptime |
Container memory usage | Memory utilization rate by ADBM and ADB Control Docker containers | System_metrics/<host>/docker/containers/<container_name>/RSS_byte |
Container CPU usage | CPU utilization rate by ADBM and ADB Control Docker containers | System_metrics/<host>/docker/containers/<container_name>/cpu/cpuperc |
Arenadata DB cluster <Cluster name> PXF
This dashboard monitors the uptime of PXF hosts and performance metrics for the PXF application.

Panel name | Description | Graphite source |
---|---|---|
PXF UPTIME | How long the PXF service has been running since it was last restarted | System_metrics/<host>/pxfjson/pxf/pxf_uptime |
Active threads | The number of threads that are actively executing tasks | System_metrics/<host>/pxfjson/pxf/executor/active |
Queue capacity | The maximum number of threads to be added to the queue. Configured using the | System_metrics/<host>/pxfjson/pxf/executor/queue/capacity |
Bytes recieved | The number of bytes per second received by PXF from ADB | System_metrics/<host>/pxfjson/pxf/bytes/received |
Records recieved | The number of records per second received by PXF from ADB | System_metrics/<host>/pxfjson/pxf/records/received |
Bytes sent | The number of bytes per second sent by PXF to ADB | System_metrics/<host>/pxfjson/pxf/bytes/sent |
Records sent | The number of records per second sent by PXF to ADB | System_metrics/<host>/pxfjson/pxf/records/sent |
JVM memory committed | The amount of committed memory (in bytes) for the Java virtual machine that runs the PXF application | System_metrics/<host>/pxfjson/jvm/memory/committed |
JVM memory max | The maximum amount of memory (in bytes) that can be used for memory management by the JVM that runs the PXF application | System_metrics/<host>/pxfjson/jvm/memory/max |
JVM memory used | The amount of memory used by the JVM that runs the PXF application | System_metrics/<host>/pxfjson/jvm/memory/used |
Arenadata System metrics
This dashboard contains the same set of panels as the System section of the Arenadata DB system cluster <Cluster name> dashboard described above, but allows you to monitor hosts from multiple clusters and Arenadata products. Use the Arenadata product list to select which products to display.
