Configure job policies

The Configuration → Job policy page in the ADB Control web interface allows you to configure schedules for different system jobs. This page includes two tabs, each of which is described in detail below.

Data cleanup

On the Configuration → Job policy → Data cleanup tab, you can configure a schedule for removing old metrics from ADB Control.

adbc config jobs dark
The "Configuration → Job policy → Data cleanup" tab
adbc config jobs light
The "Configuration → Job policy → Data cleanup" tab

To configure the job, follow the steps:

  1. Activate the Enable switcher.

  2. Fill in the following parameters in the Main parameters section.

    Field Description Default value

    Batch size

    A maximum number of rows in the data batch

    500

    Expire duration

    A maximum time period that is used to store metrics in ADB Control. If this period expires, the metrics will be removed during the next job launch. Use the following format — <value><unit> (without spaces), where <unit> can have the following values:

    • ms — milliseconds;

    • sec — seconds;

    • min — minutes;

    • hr — hours;

    • d — days;

    • w — weeks.

    Examples: 5min, 1hr, 1d. The minimal value is 100ms; the maximum is 1w

    5d

    Data cleanup schedule

    A schedule for automatic data cleanup (cron expression). The cron dark cron light icon allows you to edit the schedule in the separate form Cron expression generator

    0 0 23 * * ? *

  3. Click Apply. The Revert button is designed to undo the changes that have not been yet saved by clicking Apply.

If all steps are completed successfully, the next launch time of the Data cleanup job is updated on the Jobs → ADB Control → Schedule page.

CAUTION
  • The data cleanup job is enabled in ADB Control by default. It is not recommended to disable the job, since it can cause a lack of disk space.

  • For long-term storage of historical data, you can configure loading metrics to an external database.

Metrics offload

Configuration

On the Configuration → Job policy → Metrics offload tab, you can configure a schedule for uploading ADB Control metrics and events to an external database (PostgreSQL/TimescaleDB/ADB). To configure the job, follow the steps:

  1. On the target host, ensure you have a database to store the uploaded metrics. Its name will be used when configuring the JDBC connection later.

  2. Provide the ADB Control access to the selected external database. For example, PostgreSQL requires the following record in the pg_hba.conf file:

    host    <database_name>  <user_name>       <adbc_address>      trust

    where:

    • <database_name> — a database name in PostgreSQL. For example, adbc_metrics.

    • <user_name> — a user name in PostgreSQL. For example, postgres.

    • <adbc_address> — an IP address of the host where ADB Control is deployed (with a subnet number). For example, 10.92.40.57/32.

  3. Go to the Configuration → Job policy → Metrics offload tab in the ADB Control interface.

    adbc config jobs2 dark
    The "Configuration → Job policy → Metrics offload" tab
    adbc config jobs2 light
    The "Configuration → Job policy → Metrics offload" tab
  4. Activate the Service enabled switcher.

  5. In the JDBC URL field, enter a URL of the JDBC connection to the host where the target database is located. For PostgreSQL, use the following format:

    jdbc:postgresql://<external_ip>:5432/<db_name>

    where:

    • <external_ip> — an IP address of the host where the target database is located. For example, 10.92.6.225.

    • <db_name> — a target database name. For example, adbc_metrics.

    adbc config jobs3 dark
    JDBC connection string
    adbc config jobs3 light
    JDBC connection string
  6. Click Connect. In the window that opens, fill in the following fields:

    • User — a user name. In the following example, the default user postgres is used.

    • Password —  a user password.

      adbc config jobs6 dark
      JDBC connection parameters
      adbc config jobs6 light
      JDBC connection parameters
  7. Click Connect.

  8. Fill in the following fields in the Main parameters section.

    Field Description Default value

    Export resource groups

    The flag that indicates whether to export metrics that are related to resource groups (see resgroup_config_audit, resgroup_status, resgroup_status_per_segment)

    False

    Batch size

    A maximum number of rows in the data batch that is sent to an external database

    1000

    Interval

    A maximum date range for metrics offload (in days)

    5d

    Delay

    The time delay that is used to look for new metrics to export (in minutes)

    10min

    Offload metrics job schedule

    A schedule for the automatic data export to an external database (cron expression). The cron dark cron light icon allows you to edit the schedule in the separate form Cron expression generator

    0 0 * * * ? *

  9. Click Apply. The Revert button is designed to undo the changes that have not been yet saved by clicking Apply.

    adbc config jobs4 dark
    Data is filled
    adbc config jobs4 light
    Data is filled
  10. Get the following message on the successful result.

    adbc config jobs5 dark
    The successful result
    adbc config jobs5 light
    The successful result
  11. As a result, the adcc schema should be created in the external database. The tables in this schema will store metrics and events collected from ADB Control. To check whether all necessary objects are created, you can run the following query in PostgreSQL:

    SELECT table_name
    FROM information_schema.tables
    WHERE table_schema = 'adcc' ORDER BY  table_name;

    Result:

             table_name
    -----------------------------
     audit_auth
     audit_auth_adb
     audit_operation
     cluster
     cluster_query_metric
     query
     query_error
     resgroup_config_audit
     resgroup_status
     resgroup_status_per_segment
     transaction
    (11 rows)

If all steps are completed successfully, the new scheduled job Export job should be added to the Jobs → ADB Control → Schedule page.

IMPORTANT

In addition to the job schedule, you should specify the clusters and databases which metrics should be exported to the external database. For more information, see Cluster management.

Structure of offloaded data

The following tables are used in external databases to store metrics and events offloaded from ADB Control. Note that the conditions for offloading data to the listed tables are different:

audit_auth

 

This table stores user login attempts to ADB Control, as well as failed ADB authorizations. For more information on the below mentioned parameters, see the Authorizations section of the Audit article.

Column Description

id

A unique identifier of the authorization event

time

An event timestamp

username

A user name

session_time

A start timestamp of the session within which the current event took place

session

An identifier of the session within which the current event took place

host

IP address of the host from which the event was initiated

auth_type

An authorization type:

  • ADCC — ADB Control login attempt via basic authentication;

  • ADCC LDAP — ADB Control login attempt via LDAP authentication;

  • ADB — failed ADB login attempt;

  • ADB LDAP — failed ADB login attempt via LDAP authentication.

type

An event type. Possible values:

  • Auth

  • Logout

  • Session expired

result

An event result. Possible values:

  • Success

  • Failed

message

An error message

audit_auth_adb

 

This table contains additional information about failed login attempts to ADB.

Column Description

auth_id

An identifier of the authorization event. Refers to audit_auth.id (see audit_auth)

pid

An identifier of the process within which the current event took place

database

A name of the database to which the connection was made

port

The port number that was used for the current connection

segment

The content identifier of a segment (or master) instance. Corresponds to gp_segment_configuration.content. For more information, see the System catalog tables → gp_segment_configuration section of the Tables article

transaction

A transaction identifier

cmd_count

A command identifier

cluster

A name of the ADB cluster to which the connection was made

audit_operation

 

This table contains information about user operations in ADB Control. For more information on the below mentioned parameters, see the Operations section of the Audit article.

Column Description

id

A unique operation identifier

object_name

A name of the object on which the operation was performed

object_type

A type of the object on which the operation was performed. Possible values:

operation_type

A type of the operation that was applied to the object_name object. Possible values:

  • Cancel

  • Create

  • Delete

  • Resource group change

  • Terminate

  • Update

  • Archive

  • Activate

operation_result

An operation result. Possible values:

  • Success

  • Failed

  • Denied

operation_time

An operation end timestamp

username

A name of the user who performed the operation

hostname

IP address of the host from which the operation was initiated

diffs

Information about changed attributes of the object_name object in the JSON format. For each changed attribute, the following data is stored:

  • name — an attribute name;

  • oldValue — a previous value;

  • newValue — a new value.

The example:

[
    {
        "name":"CLUSTER_ARCHIVED",
        "oldValue":true,
        "newValue":false
    }
]
cluster

 

This table contains information on ADB clusters selected in ADB Control for export of metrics and events to the external database.

Column Description

id

A cluster unique identifier

name

A cluster name

cluster_query_metric

 

This table stores statistics of system resource consumption by commands at the cluster level. For more information, see the System metrics collected for commands section of the Monitoring of commands article.

Column Description

query_id

A command identifier. Refers to query.id (see query)

cpu_usage_total

The total CPU consumption during the command execution (in seconds)

cpu_usage_percent_sum

The CPU amount consumed at the end of the command execution (percentage)

cpu_usage_percent_avg

The average amount of CPU consumed across cluster segments at the end of the command execution (percentage)

cpu_usage_percent_skew

The indicator that displays the CPU consumption skew across cluster segments at the end of the command execution (percentage)

mem_sum

The RAM amount consumed at the end of the command execution (in bytes)

mem_max

The maximum amount of RAM consumed at the end of the command execution (in bytes)

mem_avg

The average amount of RAM consumed across cluster segments at the end of the command execution (in bytes)

mem_skew

The indicator that displays the RAM consumption skew across cluster segments at the end of the command execution (percentage)

vmem_sum

The virtual memory amount consumed at the end of the command execution (in bytes)

vmem_max

The maximum amount of virtual memory consumed at the end of the command execution (in bytes)

vmem_avg

The average amount of virtual memory consumed across cluster segments at the end of the command execution (in bytes)

vmem_skew

The indicator that displays the virtual memory consumption skew across cluster segments at the end of the command execution (percentage)

spill

The amount of spill files on cluster segment hosts (excluding master) at the end of the command execution (in bytes)

spill_hosts

The amount of spill files on cluster hosts (including master) at the end of the command execution (in bytes)

spill_max

The maximum amount of spill files produced on cluster segments at the end of the command execution (in bytes)

spill_skew

The indicator that displays the spill file amount skew at the end of the command execution (percentage)

read_bytes_total

The total amount of data read during the command execution (in bytes)

read_bytes_sum

The amount of data read at the end of the command execution (in bytes)

read_bytes_max

The maximum amount of data read at the end of the command execution (in bytes)

read_bytes_avg

The average amount of data read across cluster segments per second at the end of the command execution (in bytes)

read_bytes_skew

The indicator that displays the data reading skew across cluster segments at the end of the command execution (percentage)

write_bytes_total

The total amount of data written during the command execution (in bytes)

write_bytes_sum

The amount of data written at the end of the command execution (in bytes)

write_bytes_max

The maximum amount of data written at the end of the command execution (in bytes)

write_bytes_avg

The average amount of data written across cluster segments per second at the end of the command execution (in bytes)

write_bytes_skew

The indicator that displays the data writing skew across cluster segments at the end of the command execution (percentage)

query

 

This table contains information on completed SQL commands that were registered in ADB Control.

Column Description

query_id

A unique command identifier

status

A command status. Possible values:

  • Done — successfully completed commands.

  • Cancelled — cancelled commands (as a part of the transaction cancellation or termination).

  • Error — the commands during which some errors occurred.

  • Unknown — the commands, the final status of which is undefined. It is set on a timeout for queries in the Queued, Running, Cancelling statuses, for which there are no corresponding rows in the pg_stat_activity view.

pid

An identifier of the process, within which the command was started

submitted

The timestamp when the user submitted the command

run_at

A command start timestamp

finish_at

A command end timestamp

username

A name of the user who started the command

database

A name of the database where the command was started

resource_group

A name of the resource group that was used for the command

planner

A name of the query optimizer that was used to produce the query execution plan. Possible value:

  • GPORCA — GPORCA is used (optimizer = on).

  • Fallbacked — after the attempt to produce a plan via GPORCA (optimizer = on) failed, Postgres query optimizer is used.

  • Legacy — Postgres query optimizer is used (optimizer = off).

querytext

A text of the SQL query

plantext

A text of the command execution plan

session_id

An identifier of the session, within which the command was started

cluster_id

An identifier of the ADB cluster where the command was started. Refers to cluster.id (see cluster)

queued_time

The total time that the command was queued (in hours, minutes, seconds)

run_time

The total command duration (in hours, minutes, seconds)

sql_id

A common identifier for SQL commands with the same structure

query_error

 

This table contains information about errors that occurred during execution of SQL commands.

Column Description

id

A unique identifier of the error record

query_id

A command identifier. Refers to query.id (see query)

seg_id

The content identifier of a segment (or master) instance. Corresponds to gp_segment_configuration.content. For more information, see the System catalog tables → gp_segment_configuration section of the Tables article

slice_id

An identifier of the slice during which execution the current error occurred

message

An error message text

resgroup_config_audit

 

This table stores information on configuration changes of resource groups.

Column Description

cluster_id

An identifier of the ADB cluster. Refers to cluster.id (see cluster)

group_id

A resource group identifier

datetime

A timestamp of the resource group configuration modification

group_name

A resource group name

limit_type

A type of the modified resource group parameter. Possible values:

  • CONCURRENCY

  • CPU_RATE_LIMIT

  • MEMORY_LIMIT

  • MEMORY_SHARED_QUOTA

  • MEMORY_SPILL_RATIO

  • CPUSET

For more details on resource group parameters, see the Configuration section of the Work with resource groups article

value_from

An original value of the modified resource group parameter

value_to

A new value of the modified resource group parameter

username

A name of the user who changed the resource group configuration

resgroup_status

 

This table stores the statistics of the resource group usage. For more information, see the Statistic section of the Work with resource groups article.

Column Description

cluster_id

An identifier of the ADB cluster. Refers to cluster.id (see cluster)