Arenadata Hyperwave

Arenadata Hyperwave (ADH) is a universal hybrid platform based on open-source components and proprietary developments, designed for storing, processing, and analyzing data of any structure and volume.

Top 10 popular articles

Iceberg tables

An overview of Apache Iceberg — an open-source table format for data lakes that enables ACID transactions, time travel, schema evolution, partition evolution, and other features.

Create a simple DAG

The article shows how to create and run your first DAG to process CSV files.

HDFS command cheatsheet

A cheatsheet that describes the most common HDFS commands with examples.

Work with Iceberg tables in Spark

Apache Iceberg is an open, high-performance format for large analytic tables. The ADH Spark3 service supports this format, allowing you to work with Iceberg tables through Spark.

HDFS architecture

An overview of HDFS (Hadoop Distributed File System) — a highly fault-tolerant distributed file system designed for deployment on low-cost hardware.

Ozone architecture

An overview of Apache Ozone, a distributed key-value object store optimized for working with both Hadoop services and S3 storage. The article covers the major components and concepts, as well as the flow of read and write operations.

Analyze queries in Hive

Hive execution plan analysis using the EXPLAIN and ANALYZE commands.

Software requirements

A list of software requirements for ADH operation.

Trino architecture

An overview of the Trino service, an SQL query engine that processes, in parallel, data distributed across multiple storage systems, such as object stores, databases, and file systems.

Kerberos

An overview of the Kerberos authentication mechanism used for secure access to ADH clusters.
