Specific features
Arenadata QuickMarts (ADQM) is a database management system for online analytical processing (OLAP). It suits tasks that require high-speed processing of non-aggregated data that is continuously added in real time, for example:
- web and mobile application analytics;
- analytics of metrics in real time;
- quick data marts;
- monitoring and analysis of structured logs and events;
- real-time data monitoring and analysis.
ADQM is built on ClickHouse, an open-source DBMS that generates analytical reports in real time. Specific features of ADQM/ClickHouse:
- Column-oriented data storage: each column's data is stored independently. This allows reading data quickly (only the required columns are read, not the entire table) and compressing data effectively (since each column contains data of the same type).
- Primary indexes: ADQM/ClickHouse keeps data physically sorted by primary key. This makes it possible to quickly retrieve data based on key values or value ranges.
- Vector calculations: data is processed in vectors (parts of columns), which achieves high CPU efficiency.
- Parallel query processing: large queries are parallelized across multiple cores and use all the necessary resources available on the current server.
- Distributed query processing: thanks to the sharding mechanism, queries are processed on multiple servers of a cluster.
- Sampling (executing a query on a part of the data) and approximate calculations provide an acceptable trade-off between accuracy and performance.
- Linear scalability: you can add new nodes to a cluster to extend it so that it can store and process petabytes of data.
- Working on hard drives: ADQM/ClickHouse can work on regular hard drives, unlike column-oriented DBMSs that can work only in RAM. This allows you to optimize data storage cost, as hard drives are cheaper than RAM. SSDs and additional RAM can also be used if available.
- Asynchronous multi-master replication: after data has been written to any available replica, all the remaining replicas get their copy in the background. As the system keeps data identical on different replicas, it can automatically recover data after most failures.
- SQL query syntax: ADQM/ClickHouse supports an SQL dialect that is similar to ANSI SQL but includes some extensions: arrays and nested data structures, probabilistic structures, the ability to connect an external key/value store, etc.
- Support for multiple database engines and table engines: the Database Engines and Table Engines articles of the ClickHouse documentation list and describe all database and table engines that ADQM/ClickHouse can use.
- Real-time data inserts: the MergeTree table engine allows ADQM/ClickHouse to add data to tables continually in real time (without locks when inserting new data).
- Integration with external systems: ADQM/ClickHouse can be integrated with different external data sources such as Kafka, RabbitMQ, Hadoop (HDFS), MySQL, PostgreSQL, MongoDB, etc.
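The benefit of column-oriented storage from the list above can be illustrated with a small Python sketch. This is a toy in-memory model, not how ADQM/ClickHouse stores data on disk: the table, its column names, and the helper functions are all hypothetical. It only counts how many values an aggregate over one column has to touch in each layout.

```python
# Hypothetical toy table: page views with three columns.
rows = [
    {"user_id": 1, "url": "/home", "duration_ms": 120},
    {"user_id": 2, "url": "/docs", "duration_ms": 340},
    {"user_id": 1, "url": "/docs", "duration_ms": 90},
    {"user_id": 3, "url": "/home", "duration_ms": 250},
]


def to_columns(rows):
    """Pivot a list of row dicts into one list of values per column."""
    columns = {}
    for row in rows:
        for name, value in row.items():
            columns.setdefault(name, []).append(value)
    return columns


def total_row_store(rows, column):
    """Sum one column in a row layout: every field of every row is loaded."""
    values_read = sum(len(row) for row in rows)
    return sum(row[column] for row in rows), values_read


def total_column_store(columns, column):
    """Sum one column in a column layout: only that column is loaded."""
    values = columns[column]
    return sum(values), len(values)


columns = to_columns(rows)
print(total_row_store(rows, "duration_ms"))        # (800, 12)
print(total_column_store(columns, "duration_ms"))  # (800, 4)
```

Both layouts return the same total, but the column layout touches 4 values instead of 12. The gap grows with the number of columns in the table, which is why wide analytical tables favor this layout.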
ADQM/ClickHouse also has some limitations:
- Transactions are not supported.
- Updating or deleting previously inserted data is slow.
- Point queries that retrieve specific rows by their keys are slow because the primary index is sparse.
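Why a sparse index makes point queries comparatively expensive can be sketched in Python. This is a simplified model under stated assumptions, not ClickHouse's implementation: keys are unique and sorted, the index keeps only the first key of each block of rows (a granule), and the granule size of 4 is a toy value (ClickHouse's default `index_granularity` is 8192). The function names are hypothetical.

```python
from bisect import bisect_right

GRANULE_SIZE = 4  # toy value for illustration only


def build_sparse_index(sorted_keys, granule_size=GRANULE_SIZE):
    """Keep only the first key of each granule (one mark per granule)."""
    return sorted_keys[::granule_size]


def point_lookup(sorted_keys, marks, key, granule_size=GRANULE_SIZE):
    """Return (found, rows_scanned) for a single-key lookup."""
    # Binary search over the marks picks the last granule whose first key <= key.
    granule = bisect_right(marks, key) - 1
    if granule < 0:
        return False, 0  # key is smaller than every stored key
    start = granule * granule_size
    block = sorted_keys[start:start + granule_size]
    # The index cannot narrow the search further, so the whole granule is read.
    return key in block, len(block)


keys = list(range(0, 40, 2))      # 20 sorted keys: 0, 2, ..., 38
marks = build_sparse_index(keys)  # [0, 8, 16, 24, 32]

print(point_lookup(keys, marks, 18))  # (True, 4)
print(point_lookup(keys, marks, 19))  # (False, 4)
```

Even a lookup for a single key reads a full granule, and a miss costs the same as a hit. For range scans this overhead is negligible, while the index itself stays small enough to keep in memory.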