Configure alerts in ADQM Control via web UI

The Alerts tab on the Settings page of the ADQM Control web interface allows you to configure the parameters that serve as criteria for generating alerts.

To change the default settings, edit the required fields and click Save. To discard changes that have not been saved yet, click Revert all.

Alerts in ADQM Control are grouped into modules. Currently, ADQM Control can send alerts of two modules, System alerts and Internal alerts, and allows you to configure them.

System alerts

The System alerts module groups system alerts: alerts generated based on the values of system metrics that describe general characteristics of ADQM cluster hosts, usually related to resource consumption (see the System alerts table). For each system metric, you can set thresholds against which ADQM Control compares the current metric value to decide whether to generate an alert and at which severity level.

Configure system alerts

Use the System alerts switch on the Settings/Alerts page to enable or disable the generation of alerts based on system metrics. When the switch is enabled, the System alerts form on the right allows you to configure system alert settings.

System alert settings
| Parameter | Description | Default value |
| --- | --- | --- |
| Update frequency | Specifies how often to compare a metric value with thresholds. Once the metric exceeds a threshold, an alert is generated in ADQM Control (but not sent for processing). The parameter value should be in the range of 1-5 minutes. | 1m (m stands for minutes) |
| Firing at least for | Time during which a metric value should exceed a threshold for the corresponding alert to be sent for processing, after which it appears in the ADQM Control interface. The parameter value should be in the range of 1-15 minutes. | 5m (m stands for minutes) |
| Cool down period | Period (after sending an alert) during which ADQM Control ignores an update when a metric value no longer exceeds a threshold. If the metric still does not exceed the threshold after this period, the alert is considered no longer valid. The period restarts after each update in which the metric exceeds the threshold during the current period; in other words, it is extended. The parameter value should be in the range of 2-15 minutes and greater than the value of the Update frequency parameter. | 2m (m stands for minutes) |
| Warning | Metric value at which a medium-significance alert is generated. This severity level indicates a potential problem on an ADQM cluster host caused by an increase in the corresponding system metric that has not yet reached the critical level. | See default thresholds in the System alerts table |
| Critical | Metric value at which a high-significance alert is generated, indicating that a critical problem has been detected on an ADQM cluster host. | See default thresholds in the System alerts table |
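The interplay of the timing parameters can be easier to follow as a small state machine. The sketch below is only an illustration under stated assumptions, not ADQM Control's actual implementation: the class name `AlertTracker` and the state labels `inactive`, `pending`, and `firing` are hypothetical, time is counted in whole minutes, and it assumes that a pending alert is dropped as soon as the metric falls back below the threshold.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class AlertTracker:
    """Hypothetical model of one alert; all times are in minutes."""
    threshold: float
    firing_for: int = 5   # "Firing at least for"
    cool_down: int = 2    # "Cool down period"
    state: str = "inactive"
    pending_since: Optional[int] = None
    last_exceeded: Optional[int] = None

    def update(self, now: int, value: float) -> str:
        """Process one check (performed every "Update frequency" minutes)."""
        if value > self.threshold:
            # Every update above the threshold restarts the cool down period.
            self.last_exceeded = now
            if self.state == "inactive":
                # Alert is generated but not yet sent for processing.
                self.state = "pending"
                self.pending_since = now
            if self.state == "pending" and now - self.pending_since >= self.firing_for:
                self.state = "firing"  # sent for processing
        elif self.state == "pending":
            # Assumption: a pending alert is dropped when the metric recovers.
            self.state = "inactive"
        elif self.state == "firing" and now - self.last_exceeded >= self.cool_down:
            # Metric stayed below the threshold for the whole cool down period.
            self.state = "inactive"
        return self.state


# One sample per minute (Update frequency = 1m), threshold of 90 percent.
tracker = AlertTracker(threshold=90)
samples = [95, 96, 97, 98, 99, 99, 80, 95, 80, 80, 80]
states = [tracker.update(minute, value) for minute, value in enumerate(samples)]
```

In this run the alert stays pending for the first five minutes, fires at minute 5, survives the brief dip at minute 6 because the cool down period has not elapsed, and is cleared at minute 9 after two minutes below the threshold.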

You can set up the Update frequency and Firing at least for parameters in two ways:

  • enter the required values at the top of the System alerts form and enable the Use for all system metrics option to apply the specified values to all system alerts;

  • specify parameter values separately for each alert type.
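As a rough illustration of the allowed ranges and of the two ways to apply timing values, the following sketch validates a set of timing settings and picks the values that apply to one alert type. All names here (`Timing`, `validate`, `effective_timing`) are hypothetical and do not correspond to any ADQM Control API. Bundling the cool down period into the same record and falling back to the common values when no per-alert override exists are simplifying assumptions.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Timing:
    """Timing settings in minutes; defaults follow the settings table above."""
    update_frequency: int = 1  # allowed range: 1-5
    firing_for: int = 5        # "Firing at least for", allowed range: 1-15
    cool_down: int = 2         # allowed range: 2-15, greater than update_frequency


def validate(t: Timing) -> List[str]:
    """Return a list of rule violations (empty if the settings are valid)."""
    errors = []
    if not 1 <= t.update_frequency <= 5:
        errors.append("Update frequency must be 1-5 minutes")
    if not 1 <= t.firing_for <= 15:
        errors.append("Firing at least for must be 1-15 minutes")
    if not 2 <= t.cool_down <= 15:
        errors.append("Cool down period must be 2-15 minutes")
    if t.cool_down <= t.update_frequency:
        errors.append("Cool down period must exceed Update frequency")
    return errors


def effective_timing(alert_type: str, common: Timing,
                     per_alert: Dict[str, Timing], use_for_all: bool) -> Timing:
    """Pick the settings that apply to one alert type."""
    if use_for_all:
        # "Use for all system metrics" is enabled: common values win.
        return common
    # Assumption: fall back to the common values if no override is given.
    return per_alert.get(alert_type, common)
```

For example, `validate(Timing(update_frequency=5, cool_down=3))` reports that the cool down period must exceed the update frequency, even though both values are inside their individual ranges.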

Types of system alerts

The table below describes the types of alerts that ADQM Control can generate while monitoring the corresponding system metrics on ADQM cluster hosts and comparing their values to specified thresholds.

System alerts
| Alert type | Condition for generating an alert | Default thresholds |
| --- | --- | --- |
| Load average | The system load average exceeds a threshold: (LA15 + LA5)/2 > threshold. The value is in the range [0, 1] for a single CPU and can be higher for multi-core systems. Exceeding the threshold indicates either high CPU usage or disk I/O operations that take too long. | Warning: 0.9, Critical: 0.95 |
| CPU utilization | The CPU usage (as a percentage) exceeds a threshold | Warning: 90, Critical: 95 |
| Memory usage | The RAM usage (as a percentage) exceeds a threshold | Warning: 90, Critical: 95 |
| Disk usage | The used disk space (as a percentage of capacity) exceeds a threshold | Warning: 90, Critical: 95 |

For each alert type, the System alerts form provides a switch that you can use to disable the generation of alerts based on the corresponding metric.
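The threshold comparison from the System alerts table can be sketched as follows. This is a minimal illustration, not ADQM Control code: the function names are hypothetical, and it assumes that a value must strictly exceed a threshold to raise an alert. The standard `os.getloadavg()` call (available on Unix-like systems) returns the 1-, 5-, and 15-minute load averages.

```python
import os
from typing import Optional


def load_average_metric(la5: float, la15: float) -> float:
    """Combine the 5- and 15-minute load averages as (LA15 + LA5) / 2."""
    return (la15 + la5) / 2


def severity(value: float, warning: float, critical: float) -> Optional[str]:
    """Map a metric value to a severity level, or None if no alert applies."""
    if value > critical:
        return "critical"
    if value > warning:
        return "warning"
    return None


def current_load_severity(warning: float = 0.9,
                          critical: float = 0.95) -> Optional[str]:
    """Evaluate the Load average alert for the local host (Unix only)."""
    _la1, la5, la15 = os.getloadavg()
    return severity(load_average_metric(la5, la15), warning, critical)
```

With the default thresholds, `severity(load_average_metric(0.92, 0.92), 0.9, 0.95)` yields `"warning"`, and `severity(96, 90, 95)` yields `"critical"` for any of the percentage-based metrics.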

Internal alerts

In the Internal alerts module settings, you can enable or disable the generation of alerts that are raised when the system.query_log table, which stores information about executed queries, does not exist in an ADQM cluster.

Configure internal alerts

As with system alerts, you can control the following parameters for internal alerts:

  • Update frequency — frequency of checking for existence of the system.query_log table;

  • Firing at least for — time during which the log table should be missing for the corresponding alert to be added to ADQM Control;

  • Cool down period — period after the alert is sent, during which ADQM Control ignores an update when logging to the system.query_log table has started.
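To check manually whether the table behind this alert exists, you can run the ClickHouse statement `EXISTS TABLE system.query_log`, which returns 1 or 0. The sketch below wraps that check in Python. It assumes the cluster exposes the ClickHouse HTTP interface, and the base URL in the usage note is hypothetical. It does not describe how ADQM Control itself performs the check, and a real cluster will typically also require credentials.

```python
from typing import Callable
from urllib.parse import urlencode
from urllib.request import urlopen

QUERY = "EXISTS TABLE system.query_log"


def http_runner(base_url: str) -> Callable[[str], str]:
    """Build a function that sends a query over the HTTP interface."""
    def run(query: str) -> str:
        url = f"{base_url}/?{urlencode({'query': query})}"
        with urlopen(url, timeout=5) as response:
            return response.read().decode()
    return run


def query_log_exists(run_query: Callable[[str], str]) -> bool:
    """Return True if the server reports that system.query_log exists."""
    return run_query(QUERY).strip() == "1"
```

A call such as `query_log_exists(http_runner("http://adqm-host:8123"))` would perform the check against a live host. Passing the query runner as an argument keeps the logic testable without a network connection.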
