Arenadata Hadoop
Arenadata Hadoop is a full-fledged enterprise distribution package based on Apache Hadoop and designed for storing and processing semi-structured and unstructured data.
TOP-10 popular articles
A cheatsheet that describes the most common HDFS commands with examples.
The article shows how to create and run your first DAG to process CSV files.
ADH release notes. Learn about new features, improvements, bug fixes, etc.
An overview of Apache Iceberg architecture, benefits, and use cases. Iceberg is an open-source table format for data lakes that enables ACID transactions, time travel, schema evolution, partition evolution, and more.
The tables with Arenadata Hadoop network requirements: ADH service ports, JMX ports, ports redefined by Kerberos, client ports.
Hive provides several ways to work with tables. You can use data manipulation language (DML) queries to import or add data to a table. Also, you can directly ingest data to a Hive table using HDFS commands.
An overview of HDFS (Hadoop Distributed File System) — a highly fault-tolerant distributed file system designed for deployment on low-cost hardware.
An overview of ADH (Arenadata Hadoop) — a full-fledged enterprise distribution package based on Apache Hadoop and designed for storing and processing semi-structured and unstructured data.
The tutorial guides you through the process of installing an Arenadata Hadoop (ADH) cluster using the online and offline installation types.
The section provides reference information on configuration parameters that can be used to configure ADH services via ADCM.