Arenadata Monitoring overview

Dmitry Bulgakov

Contents

Features
Architecture
Monitoring workflow

Features

Arenadata Monitoring (ADM) is a centralized monitoring and observability system for the Arenadata ecosystem, built on a modern stack of open-source components.

The main functional capabilities of ADM are listed below:

- Full metrics lifecycle support for observability and monitoring, from data collection to analysis and visualization.
- Centralized long-term metrics storage in a fault-tolerant distributed time-series database (TSDB) based on VictoriaMetrics Cluster.
- Active monitoring and a unified health model for Arenadata products (healthy, degraded, or critical).
- Support for product-specific dashboards delivered together with Arenadata products.
- Open architecture with integration points for external metrics collection and visualization systems.

Architecture

The main components of the ADM architecture are:

- VictoriaMetrics. A high-performance TSDB and monitoring storage platform that includes a distributed storage architecture and auxiliary components for collecting, storing, and querying metrics. ADM uses the following components of VictoriaMetrics:
  - VictoriaMetrics Insert: data ingestion.
  - VictoriaMetrics Storage: persistent storage, compression, and retrieval of time-series data.
  - VictoriaMetrics Select: database query execution.
- Grafana. A visualization and monitoring platform used to display metrics, logs, and traces collected from various data sources.

To make metrics available for export from the source cluster, the source cluster must have the following components installed:

- Metrics exporters (Node Exporter, JMX Exporter). Agents that collect and expose operating system and service metrics.
- One of:
  - VictoriaMetrics Agent. A lightweight agent for collecting metrics from various sources, labeling and filtering them, and forwarding them for storage to VictoriaMetrics Storage or any other metrics storage system via the Prometheus-compatible remote_write protocol.

    NOTE

    VictoriaMetrics Agent is currently not available in a number of Arenadata bundles.

    Figure: Monitoring architecture (recommended components)
  - Prometheus. A monitoring and alerting system that collects and stores metrics from applications and infrastructure.

    Figure: Monitoring architecture (alternative components)

Monitoring workflow

The monitoring process begins with metrics generation on infrastructure nodes and services. Node Exporter components on each host collect OS-level metrics (CPU, RAM, disk, and network I/O). JMX Exporter retrieves metrics from JMX MBeans exposed by JVM-based applications and exposes them in Prometheus format, which makes metrics from Java services available for scraping.
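To illustrate what an exporter exposes, here is a minimal Python sketch that renders and parses one sample in the Prometheus text format. The label values are illustrative and are not taken from an ADM installation. The parser is deliberately simplified: it handles only plain sample lines, without escaped characters in label values or optional timestamps.

```python
import re

# One sample line looks like: name{label="value",...} value
SAMPLE_RE = re.compile(
    r'^(?P<name>[a-zA-Z_:][a-zA-Z0-9_:]*)(?:\{(?P<labels>.*)\})?\s+(?P<value>\S+)$'
)
LABEL_RE = re.compile(r'([a-zA-Z_][a-zA-Z0-9_]*)="([^"]*)"')


def render_sample(name, labels, value):
    """Render one sample as a line of Prometheus text format."""
    if labels:
        body = ",".join(f'{k}="{v}"' for k, v in sorted(labels.items()))
        return f"{name}{{{body}}} {value}"
    return f"{name} {value}"


def parse_sample(line):
    """Parse a simple sample line into (name, labels, value).

    Simplification: escaped label values and timestamps are not supported.
    """
    match = SAMPLE_RE.match(line.strip())
    if match is None:
        raise ValueError(f"not a sample line: {line!r}")
    labels = dict(LABEL_RE.findall(match.group("labels") or ""))
    return match.group("name"), labels, float(match.group("value"))


# Illustrative sample resembling an OS-level CPU metric
line = render_sample("node_cpu_seconds_total", {"cpu": "0", "mode": "idle"}, 1234.5)
print(line)
print(parse_sample(line))
```

A collector that scrapes an exporter over HTTP receives a body made of many such lines, one per time series.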
One of the following components can be used to collect metrics and forward them to long-term storage:
- VictoriaMetrics Agent periodically scrapes exporters over HTTP, can apply relabeling and filtering, and then forwards the data to the central VictoriaMetrics cluster using the remote_write mechanism. In this architecture, VictoriaMetrics Agent does not store data locally and acts as a lightweight metrics collector forming a unified edge layer. This approach reduces resource consumption on the cluster side and simplifies monitoring operations.
- An alternative component for metrics collection is Prometheus. Prometheus independently scrapes exporters, stores metrics in its local TSDB, and simultaneously forwards them to the central VictoriaMetrics cluster via remote_write. The presence of local storage preserves cluster autonomy and ensures compatibility with existing Prometheus-compatible infrastructures. It also enables local metrics analysis with the help of a local Grafana instance connected to the local Prometheus server.
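The filtering and labeling step that a collector performs before forwarding can be modeled conceptually as follows. This is a sketch only, not the implementation of VictoriaMetrics Agent or Prometheus, both of which are configured declaratively with relabeling rules. The function name, the `cluster` label, its value, and the dropped prefix are all hypothetical.

```python
def relabel_and_filter(samples, extra_labels, drop_prefixes=()):
    """Return the samples that would be forwarded to central storage.

    samples: iterable of (name, labels, value) tuples from a scrape.
    extra_labels: labels attached to every sample (for example, the source cluster).
    drop_prefixes: metric name prefixes that are filtered out.
    """
    forwarded = []
    for name, labels, value in samples:
        if name.startswith(tuple(drop_prefixes)):
            continue  # filtering: this metric is not forwarded
        # labeling: add identifying labels; labels already on the sample are kept
        merged = {**extra_labels, **labels}
        forwarded.append((name, merged, value))
    return forwarded


# Hypothetical scrape result from one host
scraped = [
    ("node_memory_MemAvailable_bytes", {"instance": "host1:9100"}, 8.0e9),
    ("go_gc_duration_seconds", {"instance": "host1:9100"}, 0.001),
]
forwarded = relabel_and_filter(
    scraped, {"cluster": "example-cluster"}, drop_prefixes=("go_",)
)
print(forwarded)
```

Attaching a label that identifies the source cluster is what lets the central storage tell apart series with the same name that arrive from different clusters.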
After transmission, all metrics are received by the VictoriaMetrics Insert component of the VictoriaMetrics cluster. It accepts the incoming data stream and distributes time series across storage shards according to the cluster sharding algorithm.
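The text above does not describe the sharding algorithm itself, so the following sketch uses plain hash-modulo sharding as a stand-in. It is not the actual VictoriaMetrics algorithm. It only illustrates the key property: a series identified by its metric name and labels is routed deterministically to one storage node. The node names are hypothetical.

```python
import hashlib


def series_key(name, labels):
    """Build a canonical key for a time series: metric name plus sorted labels."""
    parts = [name] + [f"{k}={v}" for k, v in sorted(labels.items())]
    return "\x00".join(parts)


def pick_storage_node(name, labels, nodes):
    """Pick a storage node by hashing the series key (hash-modulo stand-in)."""
    digest = hashlib.sha256(series_key(name, labels).encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(nodes)
    return nodes[index]


nodes = ["vmstorage-1", "vmstorage-2", "vmstorage-3"]  # hypothetical node names
first = pick_storage_node("node_load1", {"instance": "host1:9100"}, nodes)
second = pick_storage_node("node_load1", {"instance": "host1:9100"}, nodes)
print(first, first == second)
```

A known weakness of plain hash-modulo is that changing the number of nodes remaps most series, which is why production systems typically prefer a consistent hashing scheme.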
The data is then stored in VictoriaMetrics Storage. This component is responsible for long-term metrics storage, data compression, retention policies, and, when configured, replication. The high compression efficiency of VictoriaMetrics allows large volumes of time-series data to be stored with relatively low disk space and memory consumption.
When a user opens a dashboard in Grafana, the request is sent to the VictoriaMetrics Select component. It performs a distributed query, aggregates data from all storage nodes, and returns the result to the client. For the user, the system appears as a single centralized metrics storage regardless of which cluster the data originated from or how it was collected. Grafana connects to VictoriaMetrics as a unified data source. Dashboards display both infrastructure metrics collected by Node Exporter and application metrics exposed through JMX Exporter. This provides centralized observability for the entire distributed system through a single interface.
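As an illustration of what a dashboard request looks like on the wire, the sketch below builds a Prometheus-compatible instant query URL. It assumes the URL layout and default port described in the VictoriaMetrics cluster documentation (`/select/<tenant>/prometheus/api/v1/query` on port 8481). The host name `vmselect.example` and tenant `0` are placeholders, and the endpoints of a particular ADM installation may differ.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def build_instant_query_url(select_base, tenant_id, promql, time=None):
    """Build a URL for an instant query against a Select node.

    select_base: base address of the Select component (assumed, not from ADM).
    tenant_id: tenant identifier in the URL path (assumed to be 0 here).
    promql: query expression, such as the ones a dashboard panel sends.
    time: optional evaluation timestamp.
    """
    params = {"query": promql}
    if time is not None:
        params["time"] = str(time)
    base = select_base.rstrip("/")
    return f"{base}/select/{tenant_id}/prometheus/api/v1/query?{urlencode(params)}"


url = build_instant_query_url("http://vmselect.example:8481", 0, "avg(node_load1)")
print(url)
print(parse_qs(urlparse(url).query))
```

Because the query API is Prometheus-compatible, Grafana can address such an endpoint as a single data source regardless of how many storage nodes hold the data.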
