Arenadata Documentation

Arenadata Orchestrator

Arenadata Orchestrator (ADO) is a platform for setting up and operating data pipelines in a production environment.
The central component of the platform is Apache Airflow, an open-source tool for programmatically creating, scheduling, and monitoring workflows, which are defined as directed acyclic graphs (DAGs) of tasks.
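To make the DAG idea concrete, here is a minimal sketch that uses only the Python standard library, not the Airflow API. The task names (`extract`, `transform`, `load`, `report`) and the helper `execution_order` are hypothetical; the point is only that each task runs after the tasks it depends on.

```python
from graphlib import TopologicalSorter

# Hypothetical pipeline: each key is a task, each value is the set of
# tasks that must finish before that task can start.
dependencies = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "report": {"load"},
}


def execution_order(deps):
    """Return one valid run order for the tasks in a dependency graph."""
    # TopologicalSorter raises graphlib.CycleError if the graph has a cycle,
    # which is why a workflow must be acyclic to be schedulable.
    return list(TopologicalSorter(deps).static_order())


order = execution_order(dependencies)
print(order)
```

In Airflow itself, the same kind of dependency is declared between task objects, typically with the `>>` operator, and the scheduler decides when each task runs.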
Use cases
Work process orchestration

The flexible capabilities of the scheduler, combined with its reliability, fault tolerance and scalability, make the platform indispensable for planning and orchestrating processes of any complexity.

Machine learning and AI

Machine learning applications require careful data organization to manage the entire lifecycle of models, from creation to deployment and monitoring.

ADO is a powerful data management platform built on Apache Airflow. It provides robust tools to streamline MLOps workflows by simplifying model development, deployment, and maintenance.

Infrastructure management

Because data pipelines are written in Python, it is easy to turn custom functions into tasks and to interact with any API. This makes ADO a good fit for managing infrastructure such as Kubernetes clusters.

Data integration

ADO has built-in integration with a variety of data and analytics platforms and tools, and supports all the popular Python libraries used by data scientists, data engineers, and other professionals. This provides teams with an easy-to-configure orchestration framework that can be integrated into their preferred tools. In this way, ADO makes it easy to collaborate on designing, debugging, and maintaining data pipelines as code, accelerating the development process and making deployment and maintenance easier.

Creating ETL/ELT pipelines

Extract-Transform-Load (ETL) and Extract-Load-Transform (ELT) data pipelines are the most common use cases for Apache Airflow due to the following features:

  • Tool agnostic. Airflow can be used to orchestrate ETL/ELT pipelines for any data source or destination.
  • Extensions. Airflow supports a variety of modules and also allows you to create your own operators and hooks for specific use cases.
  • Dynamic pipelines. The platform allows new data pipelines to be generated dynamically from input parameters or metadata.
  • Scalability. Airflow scales horizontally, so the number of tasks and workflows it can handle is limited mainly by the available computing power.
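The dynamic creation of pipelines from metadata can be sketched as follows. This is an illustration in plain Python, not Airflow code: `PipelineSpec`, `SOURCES`, `STEP_ORDER`, and `build_pipelines` are hypothetical names, and the source list stands in for whatever metadata a team actually keeps. It also shows the only structural difference between ETL and ELT, which is the order of the transform and load steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PipelineSpec:
    """Hypothetical description of one generated pipeline."""
    pipeline_id: str
    schedule: str
    steps: tuple


# Hypothetical metadata: one entry per data source to be processed.
SOURCES = [
    {"name": "orders", "schedule": "@hourly", "mode": "etl"},
    {"name": "customers", "schedule": "@daily", "mode": "elt"},
]

# ETL transforms before loading; ELT loads first and transforms afterwards.
STEP_ORDER = {
    "etl": ("extract", "transform", "load"),
    "elt": ("extract", "load", "transform"),
}


def build_pipelines(sources):
    """Generate one pipeline specification per metadata entry."""
    return [
        PipelineSpec(
            pipeline_id=f"{src['name']}_{src['mode']}",
            schedule=src["schedule"],
            steps=STEP_ORDER[src["mode"]],
        )
        for src in sources
    ]


pipelines = build_pipelines(SOURCES)
for spec in pipelines:
    print(spec.pipeline_id, spec.schedule, " -> ".join(spec.steps))
```

Adding a new source then means adding one metadata entry rather than writing a new pipeline by hand. In an Airflow deployment the same loop would typically create DAG objects instead of `PipelineSpec` instances.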
ADO is available in two editions: Enterprise and Community.
Enterprise
Built-in Airflow functionality
Automated management and monitoring tools
Technical support 24/7
High availability and disaster recovery features
Deploy & upgrade automation
Corporate training courses
Offline installation
Tailored solutions
Advanced Airflow features including improved security and integration with external systems
Available integrations
ADQM
ADB
ADH
ADPG
ADS
HashiCorp Vault
Native integration for secrets management
DuckDB
Oracle
MS SQL
AWS S3
Azure Storage
Azure Datalake
GCS
MySQL
SFTP/FTP
DBT
*In development

Native integration with dbt Core with support for all necessary adapters for Arenadata EDP

Git
*In development

Integration with the Git version control system for easy deployment of workflows

Operating systems
Alt Linux
  • Alt Linux 8.4 SP is supported
  • Alt Linux 10 SP is supported
CentOS
CentOS 7 is supported
RedHat
RedHat 7 is supported
Astra Linux
  • Astra Linux SE 1.7 Orel is supported
  • Astra Linux SE 1.7 Voronezh is supported
Ubuntu
Ubuntu 22.04.2 LTS is supported
RedOS
RedOS 7.3 is supported
Community
Built-in Airflow functionality
Automated management and monitoring tools
Technical support 24/7
High availability and disaster recovery features
Deploy & upgrade automation
Corporate training courses
Offline installation
Tailored solutions
Advanced Airflow features including improved security and integration with external systems
Available integrations
ADQM
ADB
ADH
ADPG
ADS
HashiCorp Vault
Available only for Enterprise
DuckDB
Oracle
MS SQL
AWS S3
Azure Storage
Azure Datalake
GCS
MySQL
SFTP/FTP
DBT
Available only for Enterprise
Git
Available only for Enterprise
Operating systems
Alt Linux
Available only for Enterprise
CentOS
CentOS 7 is supported
RedHat
RedHat 7 is supported
Astra Linux
Available only for Enterprise
Ubuntu
Ubuntu 22.04.2 LTS is supported
RedOS
Available only for Enterprise
Features
Time-saving
Less time spent on installation and configuration than with a manual installation
Production readiness
Built-in features such as maintenance mode, high availability, and centralized dependency management make ADO suitable for production environments
Monitoring
The kit includes everything you need to set up a monitoring system, so you can verify that ADO is operating correctly
Standardization
Standardized installation across multiple machines, reducing the risk of errors and inconsistencies
Scalability
Ability to quickly scale ADO horizontally
Expertise
Our team has strong expertise in developing new features and in evaluating bug fixes from the broader Hadoop community to determine which ones to incorporate into the product
Integration
Many providers are available out of the box and have been tested with all Arenadata products. ADO also supports sharing hosts between ADCM-managed clusters, which enables installation of external service clients (for example, a Spark client for ADH) directly on ADO worker nodes
Secrets management
Integration with HashiCorp Vault and secure handling of sensitive data
Releases
2023
ADO 2.10.5.1
  • Enhanced Python dependencies management
  • Support for HA Metastore and repository proxying
  • Integration with external services via shared hosts
  • Upgraded Airflow and monitoring components
  • Support for Alt Linux 10 and Ansible 2.16
  • Improved stability and user experience
ADO 2.6.3.2
  • Python dependencies management via ADCM
  • Support for Astra Linux "Voronezh"
  • Enhanced monitoring configuration and SSL management
  • Support for service maintenance mode
  • Stability improvements and internal optimizations
ADO 2.6.3.1
  • First release maintaining compatibility with Airflow (ADH)
  • Advanced service management capabilities
  • Additional features available out of the box
  • Security improvements