Configure job policies

The Configuration → Job policy page in the ADB Control web interface allows you to configure schedules for different system jobs. This page includes two tabs, each of which is described in detail below.

Data cleanup

On the Configuration → Job policy → Data cleanup tab, you can configure a schedule for removing old metrics from ADB Control.

adbc config jobs dark
The "Configuration → Job policy → Data cleanup" tab
adbc config jobs light
The "Configuration → Job policy → Data cleanup" tab

To configure the job, follow the steps:

  1. Activate the Enable switcher.

  2. Fill in the following parameters in the Main parameters section.

    Field Description Default value

    Batch size

    A maximum number of rows in the data batch

    500

    Expire duration

    A maximum time period that is used to store metrics in ADB Control. If this period expires, the metrics will be removed during the next job launch. Use the following format — <value><unit> (without spaces), where <unit> can have the following values:

    • ms — milliseconds;

    • sec — seconds;

    • min — minutes;

    • hr — hours;

    • d — days;

    • w — weeks.

    Examples: 5min, 1hr, 1d. The minimal value is 100ms; the maximum is 1w

    5d

    Data cleanup schedule

    A schedule for automatic data cleanup (cron expression). The cron dark cron light icon allows you to edit the schedule in the separate form Cron expression generator

    0 0 23 * * ? *

  3. Click Apply. The Revert button is designed to undo the changes that have not been yet saved by clicking Apply.

If all steps are completed successfully, the next launch time of the Data cleanup job is updated on the Jobs → ADB Control → Schedule page.

CAUTION
  • The data cleanup job is enabled in ADB Control by default. It is not recommended to disable the job, since it can cause a lack of disk space.

  • For long-term storage of historical data, you can configure loading metrics to an external database.

Metrics offload

Configuration

CAUTION

Ensure that you do not set up for metrics offload a database of the same ADB cluster that is monitored by ADB Control (where metrics are collected). It is necessary to avoid monitoring and exporting records about export itself.

On the Configuration → Job policy → Metrics offload tab, you can configure a schedule for uploading ADB Control metrics and events to an external database (PostgreSQL/TimescaleDB/ADB). To configure the job, follow the steps:

  1. On the target host, ensure you have a database to store the uploaded metrics. Its name will be used when configuring the JDBC connection later.

  2. Provide the ADB Control access to the selected external database. For example, PostgreSQL requires the following record in the pg_hba.conf file:

    host    <database_name>  <user_name>       <adbc_address>      trust

    where:

    • <database_name> — a database name in PostgreSQL. For example, adbc_metrics.

    • <user_name> — a user name in PostgreSQL. For example, postgres.

    • <adbc_address> — an IP address of the host where ADB Control is deployed (with a subnet number). For example, 10.92.40.57/32.

  3. Go to the Configuration → Job policy → Metrics offload tab in the ADB Control interface.

    adbc config jobs2 dark
    The "Configuration → Job policy → Metrics offload" tab
    adbc config jobs2 light
    The "Configuration → Job policy → Metrics offload" tab
  4. Activate the Service enabled switcher.

  5. In the JDBC URL field, enter a URL of the JDBC connection to the host where the target database is located. For PostgreSQL, use the following format:

    jdbc:postgresql://<external_ip>:5432/<db_name>

    where:

    • <external_ip> — an IP address of the host where the target database is located. For example, 10.92.6.225.

    • <db_name> — a target database name. For example, adbc_metrics.

    adbc config jobs3 dark
    JDBC connection string
    adbc config jobs3 light
    JDBC connection string
  6. Click Connect. In the window that opens, fill in the following fields:

    • User — a user name. In the following example, the default user postgres is used.

    • Password —  a user password.

      adbc config jobs6 dark
      JDBC connection parameters
      adbc config jobs6 light
      JDBC connection parameters
  7. Click Connect.

  8. Fill in the following fields in the Main parameters section.

    Field Description Default value

    Export resource groups

    The flag that indicates whether to export metrics that are related to resource groups (see resgroup_config_audit, resgroup_status, resgroup_status_per_segment)

    False

    Export sessions

    The flag that indicates whether to export information on sessions (see session)

    False

    Batch size

    A maximum number of rows in the data batch that is sent to an external database

    1000

    Interval

    A maximum date range for metrics offload (in days)

    5d

    Delay

    The time delay that is used to look for new metrics to export (in minutes)

    10min

    Offload metrics job schedule

    A schedule for the automatic data export to an external database (cron expression). The cron dark cron light icon allows you to edit the schedule in the separate form Cron expression generator

    0 0 * * * ? *

  9. Click Apply. The Revert button is designed to undo the changes that have not been yet saved by clicking Apply.

    adbc config jobs4 dark
    Data is filled
    adbc config jobs4 light
    Data is filled
  10. Get the following message on the successful result.

    adbc config jobs5 dark
    The successful result
    adbc config jobs5 light
    The successful result
  11. As a result, the adcc schema should be created in the external database. The tables in this schema will store metrics and events collected from ADB Control. To check whether all necessary objects are created, you can run the following query in PostgreSQL:

    SELECT table_name
    FROM information_schema.tables
    WHERE table_schema = 'adcc' ORDER BY  table_name;

    Result:

             table_name
    -----------------------------
     audit_auth
     audit_auth_adb
     audit_operation
     cluster
     cluster_query_metric
     query
     query_error
     resgroup_config_audit
     resgroup_status
     resgroup_status_per_segment
     session
     transaction
    (12 rows)

If all steps are completed successfully, the new scheduled job Export job should be added to the Jobs → ADB Control → Schedule page.

IMPORTANT

In addition to the job schedule, you should specify the clusters and databases which metrics should be exported to the external database. For more information, see Cluster management.

Structure of offloaded data

The following tables are used in external databases to store metrics and events offloaded from ADB Control. Note that the conditions for offloading data to the listed tables are different:

audit_auth

 

This table stores user login attempts to ADB Control, as well as failed ADB authorizations. For more information on the below mentioned parameters, see the Authorizations section of the Audit article.

Column Description

id

A unique identifier of the authorization event

time

An event timestamp

username

A user name

session_time

A start timestamp of the session within which the current event took place

session

An identifier of the session within which the current event took place

host

IP address of the host from which the event was initiated

auth_type

An authorization type:

  • ADCC — ADB Control login attempt via basic authentication;

  • ADCC LDAP — ADB Control login attempt via LDAP authentication;

  • ADB — failed ADB login attempt;

  • ADB LDAP — failed ADB login attempt via LDAP authentication.

type

An event type. Possible values:

  • Auth

  • Logout

  • Session expired

result

An event result. Possible values:

  • Success

  • Failed

message

An error message

audit_auth_adb

 

This table contains additional information about failed login attempts to ADB.

Column Description

auth_id

An identifier of the authorization event. Refers to audit_auth.id (see audit_auth)

pid

An identifier of the process within which the current event took place

database

A name of the database to which the connection was made

port

The port number that was used for the current connection

segment

The content identifier of a segment (or master) instance. Corresponds to gp_segment_configuration.content. For more information, see the System catalog tables → gp_segment_configuration section of the Tables article

transaction

A transaction identifier

cmd_count

A command identifier

cluster

A name of the ADB cluster to which the connection was made

audit_operation

 

This table contains information about user operations in ADB Control. For more information on the below mentioned parameters, see the Operations section of the Audit article.

Column Description

id

A unique operation identifier

object_name

A name of the object on which the operation was performed

object_type

A type of the object on which the operation was performed. Possible values:

operation_type

A type of the operation that was applied to the object_name object. Possible values:

  • Cancel

  • Create

  • Delete

  • Resource group change

  • Terminate

  • Update

  • Archive

  • Activate

operation_result

An operation result. Possible values:

  • Success

  • Failed

  • Denied

operation_time

An operation end timestamp

username

A name of the user who performed the operation

session

An identifier of the ADB Control user session

hostname

IP address of the host from which the operation was initiated

diffs

Information about changed attributes of the object_name object in the JSON format. For each changed attribute, the following data is stored:

  • name — an attribute name;

  • oldValue — a previous value;

  • newValue — a new value.

The example:

[
    {
        "name":"CLUSTER_ARCHIVED",
        "oldValue":true,
        "newValue":false
    }
]
cluster

 

This table contains information on ADB clusters selected in ADB Control for export of metrics and events to the external database.

Column Description

id

A cluster unique identifier

name

A cluster name

cluster_query_metric

 

This table stores statistics of system resource consumption by commands at the cluster level. For more information, see the System metrics collected for commands section of the Monitoring of commands article.

Column Description

query_id

A command identifier. Refers to query.id (see query)

cpu_usage_total

The total CPU consumption during the command execution (in seconds)

cpu_usage_percent_sum

The CPU amount consumed at the end of the command execution (percentage)

cpu_usage_percent_avg

The average amount of CPU consumed across cluster segments at the end of the command execution (percentage)

cpu_usage_percent_skew

The indicator that displays the CPU consumption skew across cluster segments at the end of the command execution (percentage)

mem_sum

The RAM amount consumed at the end of the command execution (in bytes)

mem_max

The maximum amount of RAM consumed at the end of the command execution (in bytes)

mem_avg

The average amount of RAM consumed across cluster segments at the end of the command execution (in bytes)

mem_skew

The indicator that displays the RAM consumption skew across cluster segments at the end of the command execution (percentage)

vmem_sum

The virtual memory amount consumed at the end of the command execution (in bytes)

vmem_max

The maximum amount of virtual memory consumed at the end of the command execution (in bytes)

vmem_avg

The average amount of virtual memory consumed across cluster segments at the end of the command execution (in bytes)

vmem_skew

The indicator that displays the virtual memory consumption skew across cluster segments at the end of the command execution (percentage)

spill

The amount of spill files on cluster segment hosts (excluding master) at the end of the command execution (in bytes)

spill_hosts

The amount of spill files on cluster hosts (including master) at the end of the command execution (in bytes)

spill_max

The maximum amount of spill files produced on cluster segments at the end of the command execution (in bytes)

spill_skew

The indicator that displays the spill file amount skew at the end of the command execution (percentage)

read_bytes_total

The total amount of data read during the command execution (in bytes)

read_bytes_sum

The amount of data read at the end of the command execution (in bytes)

read_bytes_max

The maximum amount of data read at the end of the command execution (in bytes)

read_bytes_avg

The average amount of data read across cluster segments per second at the end of the command execution (in bytes)

read_bytes_skew

The indicator that displays the data reading skew across cluster segments at the end of the command execution (percentage)

write_bytes_total

The total amount of data written during the command execution (in bytes)

write_bytes_sum

The amount of data written at the end of the command execution (in bytes)

write_bytes_max

The maximum amount of data written at the end of the command execution (in bytes)

write_bytes_avg

The average amount of data written across cluster segments per second at the end of the command execution (in bytes)

write_bytes_skew

The indicator that displays the data writing skew across cluster segments at the end of the command execution (percentage)

query

 

This table contains information on completed SQL commands that were registered in ADB Control.

Column Description

query_id

A unique command identifier

status

A command status. Possible values:

  • Done — successfully completed commands.

  • Cancelled — cancelled commands (as a part of the transaction cancellation or termination).

  • Error — the commands during which some errors occurred.

  • Unknown — the commands, the final status of which is undefined. It is set on a timeout for queries in the Queued, Running, Cancelling statuses, for which there are no corresponding rows in the pg_stat_activity view.

pid

An identifier of the process, within which the command was started

submitted

The timestamp when the user submitted the command

run_at

A command start timestamp

finish_at

A command end timestamp

username

A name of the user who started the command

database

A name of the database where the command was started

resource_group

A name of the resource group that was used for the command

planner

A name of the query optimizer that was used to produce the query execution plan. Possible value:

  • GPORCA — GPORCA is used (optimizer = on).

  • Fallbacked — after the attempt to produce a plan via GPORCA (optimizer = on) failed, Postgres query optimizer is used.

  • Legacy — Postgres query optimizer is used (optimizer = off).

querytext

A text of the SQL query

plantext

A text of the command execution plan

session_id

An identifier of the session, within which the command was started

cluster_id

An identifier of the ADB cluster where the command was started. Refers to cluster.id (see cluster)

queued_time

The total time that the command was queued (in hours, minutes, seconds)

run_time

The total command duration (in hours, minutes, seconds)

sql_id

A common identifier for SQL commands with the same structure

tags

The query tags

query_error

 

This table contains information about errors that occurred during execution of SQL commands.

Column Description

id

A unique identifier of the error record

query_id

A command identifier. Refers to query.id (see query)

seg_id

The content identifier of a segment (or master) instance. Corresponds to gp_segment_configuration.content. For more information, see the System catalog tables → gp_segment_configuration section of the Tables article

slice_id

An identifier of the slice during which execution the current error occurred

message

An error message text

resgroup_config_audit

 

This table stores information on configuration changes of resource groups.

Column Description

cluster_id

An identifier of the ADB cluster. Refers to cluster.id (see cluster)

group_id

A resource group identifier

datetime

A timestamp of the resource group configuration modification

group_name

A resource group name

limit_type

A type of the modified resource group parameter. Possible values:

  • CONCURRENCY

  • CPU_RATE_LIMIT

  • MEMORY_LIMIT

  • MEMORY_SHARED_QUOTA

  • MEMORY_SPILL_RATIO

  • CPUSET

For more details on resource group parameters, see the Configuration section of the Work with resource groups article

value_from

An original value of the modified resource group parameter

value_to

A new value of the modified resource group parameter

username

A name of the user who changed the resource group configuration

resgroup_status

 

This table stores the statistics of the resource group usage. For more information, see the Statistic section of the Work with resource groups article.

Column Description

cluster_id

An identifier of the ADB cluster. Refers to cluster.id (see cluster)

group_id

A resource group identifier

datetime

The timestamp when the measurements were taken

group_name

A resource group name

num_running

A number of transactions running at the datetime moment within the current resource group

num_queueing

A number of transactions queued at the datetime moment within the current resource group

num_queued

A total number of transactions queued within the current resource group over the entire observation period

num_executed

A total number of transactions executed within the current resource group over the entire observation period

total_queue_duration

The total time that all transactions were queued within the current resource group over the entire observation period (in hours, minutes, seconds)

resgroup_status_per_segment

 

This table contains statistics on the consumption of system resources by resource groups on each cluster segment. For more information, see the Resource group details → Usage section of the Work with resource groups article.

Column Description

cluster_id

An identifier of the ADB cluster. Refers to cluster.id (see cluster)

group_id

A resource group identifier

hostname

A name of the host for which the measurements were taken

segment_id

The content identifier of a segment (or master) instance. Corresponds to gp_segment_configuration.content. For more information, see the System catalog tables → gp_segment_configuration section of the Tables article

datetime

The timestamp when the measurements were taken

cpu

The percentage of CPU usage by the resource group on the current segment

memory_used

The total RAM value used by the resource group on the current segment (in bytes). This value includes the resource group fixed and shared memory

memory_available

The total RAM value available to the resource group on the current segment (in bytes). This value includes the resource group fixed and shared memory

memory_quota_used

The RAM value used by the resource group on the current segment — from the amount that is guaranteed fixed for resource group transactions (in bytes)

memory_quota_available

The RAM value available to the resource group on the current segment — from the amount that is guaranteed fixed for resource group transactions (in bytes)

memory_shared_used

The RAM value used by the resource group on the current segment — from the amount that is commonly used by all resource group transactions (in bytes)

memory_shared_available

The RAM value available to the resource group on the current segment — from the amount that is commonly used by all resource group transactions (in bytes)

session

 

This table contains information on sessions that were registered in ADB Control.

Column Description

cluster_id

An identifier of the ADB cluster. Refers to cluster.id (see cluster)

datetime

The timestamp when the data was received

datid

A database OID

datname

A database name

pid

A backend process identifier

sess_id

A session identifier

usesysid

An OID of the user who connected to the backend

usename

A name of the user who connected to the backend

application_name

A name of the application that is connected to the backend

client_addr

A client IP address. The null value indicates the use of a Unix socket on the server or the execution of an internal process (e.g. autovacuum)

client_hostname

A host name of the connected client, as reported by a reverse DNS lookup of the client_addr value. The field is not null only for IP connections, and only when the server configuration parameter (GUC) log_hostname is set to on

client_port

The TCP port number that is used by the client to communicate with the backend, or -1 if a Unix socket is used

backend_start

The time when the backend process was started

xact_start

A transaction start timestamp

query_start

A command execution start timestamp

state_change

The last time the value of the state field was changed

waiting

true in case of waiting on a lock, otherwise false

state

The current session state. Possible values:

  • active — the backend is running a query.

  • idle — the backend is waiting for a new client command.

  • idle in transaction — the backend is in a transaction, but there are no currently running queries.

query

A text of the most recent SQL query that was run at the backend. If state is active, the field shows the currently running query

waiting_reason

The reason why the server process is waiting. Possible values:

  • lock;

  • replication;

  • resgroup — resource group queue.

rsgid

A resource group OID or 0 if definition is not possible (for more information, see pg_stat_activity in the Greenplum documentation)

rsgname

A resource group name or unknown if definition is not possible (for more information, see pg_stat_activity in the Greenplum documentation)

rsgqueueduration

The total time the query has been queued (if applicable)

transaction

 

This table contains information on completed transactions that were registered in ADB Control.

Column Description

txn_id

A unique transaction identifier in ADB Control

status

A transaction status. Possible values:

  • Committed — successfully completed transactions.

  • Aborted — cancelled/terminated transactions.

  • Unknown — the transactions, the final status of which is undefined. It is set on a timeout for transactions in the Active status, for which there are no corresponding rows in the pg_stat_activity view.

pid

An identifier of the process, within which the transaction was started

start_time

A transaction start timestamp

end_time

A transaction end timestamp

username

A name of the user who started the transaction

database

A name of the database where the transaction was started

resource_group

A name of the resource group that was used for the transaction

session_id

An identifier of the session, within which the transaction was started

cluster_id

An identifier of the ADB cluster where the transaction was started. Refers to cluster.id (see cluster)

tags

The transaction tags

Found a mistake? Seleсt text and press Ctrl+Enter to report it