Arenadata Hyperwave
Arenadata Hyperwave (ADH) is a universal hybrid platform based on open-source components and proprietary developments, designed for storing, processing, and analyzing data of any structure and volume.
Top 10 popular articles
An overview of Apache Iceberg architecture, benefits, and use cases. Iceberg is an open-source table format for data lakes that enables ACID transactions, time travel, schema evolution, partition evolution, and more.
The article shows how to create and run your first Airflow DAG to process CSV files.
An overview of HDFS (Hadoop Distributed File System) — a highly fault-tolerant distributed file system designed for deployment on low-cost hardware.
A cheatsheet that describes the most common HDFS commands with examples.
An overview of the Trino service, a distributed SQL query engine that processes in parallel data spread across multiple storage systems, such as object stores, databases, and file systems.
An overview of Airflow concepts (DAG, task, operator) and architectural components. Airflow is a platform for developing, scheduling, running, and monitoring complex workflows.
A list of software requirements for ADH operation.
An overview of the Kerberos authentication mechanism used for secure access to ADH clusters.
An overview of Apache Ozone, a distributed key-value object store optimized for working with both Hadoop services and S3 storage. The article covers the major components and concepts and describes the flow of read and write operations.
Tables listing the Arenadata Hyperwave network requirements: ADH service ports, JMX ports, ports redefined by Kerberos, and client ports.