Arenadata Streaming
Arenadata Streaming can ingest data in real-time from various sources, including databases, sensors, and IoT devices.
The platform can process and transform data streams in real-time using Apache Kafka's stream processing capabilities.
Arenadata Streaming provides tools for real-time data analytics, including machine learning, predictive analytics, and anomaly detection.
The platform offers integration with other data systems, such as Hadoop, Spark, and NoSQL databases.
MiNiFi can also be integrated with the MQTT (Message Queuing Telemetry Transport) protocol, a lightweight messaging protocol designed for IoT devices. This integration allows MiNiFi to receive data from and publish data to MQTT brokers, which can be used for real-time data streaming and processing at the edge.
- Kafka Connect. This is a tool for scalable and reliable data streaming between Kafka and databases in both directions, allowing real-time streaming of data for processing and analysis.
- Kafka JDBC Connector. This is an open-source Kafka connector that provides a simple way to connect Kafka with a database using JDBC. The Kafka JDBC Connector can be used to stream data from Kafka topics into a database in real-time, or to stream data from a database into Kafka topics.
- NiFi Database Connection Pooling Service. This is a built-in NiFi service that allows NiFi to connect to a database using JDBC.
- ExecuteSQL Processor. This is a NiFi processor that can be used to execute SQL statements and queries against a database using JDBC.
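As an illustration, a JDBC sink connector could be registered with Kafka Connect using a configuration along the following lines. This is a minimal sketch: the connector name, topic, connection URL, and credentials are placeholders, and the exact property names depend on the connector version in use.

```json
{
  "name": "jdbc-sink-example",
  "config": {
    "connector.class": "io.confluent.connect.jdbc.JdbcSinkConnector",
    "tasks.max": "1",
    "topics": "orders",
    "connection.url": "jdbc:postgresql://db-host:5432/sales",
    "connection.user": "connect_user",
    "connection.password": "********",
    "insert.mode": "upsert",
    "pk.mode": "record_key",
    "auto.create": "true"
  }
}
```

Such a configuration is typically submitted to the Kafka Connect REST API (POST to the /connectors endpoint), after which Connect continuously copies records from the topic into the target table.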
- Kafka Connect ClickHouse Sink. This is a Kafka Connect plugin that provides a way to sink data from Kafka to ADQM in near real-time. The ClickHouse Sink Connector can be used to stream data from Kafka topics into ADQM tables, either as individual rows or as batches of rows.
- Kafka JDBC Connector. This is a Kafka Connect plugin that provides a way to connect to a JDBC-compliant database, such as ClickHouse. The JDBC Connector can be used to stream data from Kafka topics into ClickHouse tables, enabling data to be analyzed and processed in real-time.
- NiFi Database Connection Pooling Service. This is a built-in NiFi service that allows NiFi to connect to a database using JDBC.
- ExecuteSQL Processor. This is a NiFi processor that can be used to execute SQL statements and queries against a database using JDBC.
- Kafka Connect. This is a tool for scalable and reliable data streaming between Kafka and ADPG in both directions, allowing real-time streaming of data into a database for processing and analysis.
- Kafka JDBC Connector. This is an open-source Kafka connector that provides a simple way to connect Kafka with a database using JDBC. The Kafka JDBC Connector can be used to stream data from Kafka topics into a database in real-time, or to stream data from a database into Kafka topics.
- NiFi Database Connection Pooling Service. This is a built-in NiFi service that allows NiFi to connect to a database using JDBC.
- ExecuteSQL Processor. This is a NiFi processor that can be used to execute SQL statements and queries against a database using JDBC.
- Kafka Connect. This is a tool for scalable and reliable data streaming between Kafka and Oracle in both directions, allowing real-time streaming of data into a database for processing and analysis.
- Kafka JDBC Connector. This is an open-source Kafka connector that provides a simple way to connect Kafka with a database using JDBC. The Kafka JDBC Connector can be used to stream data from Kafka topics into a database in real-time, or to stream data from a database into Kafka topics.
- NiFi Database Connection Pooling Service. This is a built-in NiFi service that allows NiFi to connect to a database using JDBC.
- ExecuteSQL Processor. This is a NiFi processor that can be used to execute SQL statements and queries against a database using JDBC.
- Kafka Connect. This is a tool for scalable and reliable data streaming between Kafka and MS SQL in both directions, allowing real-time streaming of data into a database for processing and analysis.
- Kafka JDBC Connector. This is an open-source Kafka connector that provides a simple way to connect Kafka with MS SQL using JDBC. The Kafka JDBC Connector can be used to stream data from Kafka topics into a database in real-time, or to stream data from a database into Kafka topics.
- NiFi Database Connection Pooling Service. This is a built-in NiFi service that allows NiFi to connect to a database using JDBC.
- ExecuteSQL Processor. This is a NiFi processor that can be used to execute SQL statements and queries against a database using JDBC.
- Kafka Connect S3 Sink. This is a Kafka Connect plugin that provides a way to sink data from Kafka to S3 in near real-time. The S3 Sink Connector can be used to stream data from Kafka topics into S3 buckets, either as individual objects or as batches of objects. This integration can be particularly useful for long-term storage and archiving of data from Kafka.
- Kafka Connect S3 Source. This is a Kafka Connect plugin that provides a way to source data from S3 to Kafka. The S3 Source Connector can be used to stream data from S3 objects into Kafka topics, enabling data to be analyzed and processed in real-time.
- S3 Object Processor. This is a NiFi processor that can be used to perform CRUD (create, read, update, delete) operations on S3 objects. The S3 Object Processor can be configured to interact with S3 using access keys or roles, and can be used to transfer data between NiFi and S3 in real-time.
- Amazon S3 Put/Fetch Object processors. The PutS3Object processor can be used to write data from NiFi to S3, while the FetchS3Object processor can be used to read data from S3 into NiFi.
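A minimal sketch of an S3 sink connector configuration might look like the following. The bucket, region, and topic names are placeholders, and the property names follow the Confluent S3 sink connector, so they may differ by version.

```json
{
  "name": "s3-sink-example",
  "config": {
    "connector.class": "io.confluent.connect.s3.S3SinkConnector",
    "tasks.max": "1",
    "topics": "events",
    "s3.bucket.name": "kafka-archive",
    "s3.region": "us-east-1",
    "storage.class": "io.confluent.connect.s3.storage.S3Storage",
    "format.class": "io.confluent.connect.s3.format.json.JsonFormat",
    "flush.size": "1000"
  }
}
```

Here `flush.size` controls how many records are batched into each S3 object, which is the main lever for balancing object count against write latency.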
- Kafka Connect MongoDB Sink. This is a Kafka Connect plugin that provides a way to sink data from Kafka to MongoDB in near real-time. The MongoDB Sink Connector can be used to stream data from Kafka topics into MongoDB collections, either as individual documents or as batches of documents.
- Kafka MongoDB Source Connector. This is a Kafka Connect plugin that provides a way to source data from a MongoDB replica set into Kafka topics in near real-time.
- PutMongoRecord Processor. This is a built-in NiFi processor that can be used to write data from NiFi to MongoDB in near real-time. The PutMongoRecord Processor can be configured to connect to MongoDB using a MongoDB client and credentials, and can be used to insert data into MongoDB collections from NiFi.
- GetMongo Processor. This is a built-in NiFi processor that can be used to read data from MongoDB and bring it into NiFi for further processing. The GetMongo Processor can be configured to connect to MongoDB using a MongoDB client and credentials, and can be used to retrieve data from MongoDB collections for further processing in NiFi.
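For the MongoDB direction, a sink connector configuration could be sketched as follows. The connection URI, database, collection, and topic names are placeholders for your environment.

```json
{
  "name": "mongodb-sink-example",
  "config": {
    "connector.class": "com.mongodb.kafka.connect.MongoSinkConnector",
    "tasks.max": "1",
    "topics": "sensor-readings",
    "connection.uri": "mongodb://mongo-host:27017",
    "database": "telemetry",
    "collection": "readings"
  }
}
```

With this in place, each record consumed from the topic is written as a document into the named collection.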
Apache ZooKeeper is a distributed coordination service used by Arenadata Streaming to manage the configuration and coordination of its clusters. It is a crucial component of the system as it helps to ensure high availability and fault tolerance in Arenadata Streaming clusters.
ZooKeeper provides a hierarchical namespace that allows Arenadata Streaming to store configuration data, manage distributed locks, and coordinate distributed processes. It provides a consistent view of the system state across all nodes in the cluster, which helps to prevent data inconsistencies and ensure data integrity.
For example, Arenadata Streaming uses ZooKeeper to manage its Kafka brokers, topics, and partitions. When a new broker is added to the cluster, ZooKeeper is used to assign it a unique identifier and to coordinate the distribution of data across the cluster.
Apache Kafka is a distributed streaming platform used by Arenadata Streaming to manage the ingestion, processing, and analysis of real-time data streams. It provides a scalable, fault-tolerant, and highly available infrastructure for processing and storing real-time data.
Arenadata Streaming leverages Kafka's capabilities to handle large volumes of data and support multiple data sources. It provides a real-time data processing platform that enables businesses to analyze data as it flows through the system, providing near-instant insights into business operations.
Schema Registry is a centralized repository used by Arenadata Streaming to store and manage schemas for data produced and consumed by Apache Kafka. It allows users to define, evolve, and share schemas across different applications and systems that use Kafka.
In Arenadata Streaming, Schema Registry enables users to ensure data compatibility across different versions of their applications and systems. It provides a way to enforce data validation and to ensure that all data produced and consumed by Kafka conforms to a predefined schema.
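As a sketch of how this works in practice, an Avro schema can be registered for a subject through the Schema Registry REST API. The host name and subject below are placeholders.

```shell
# Register an Avro schema for the subject "orders-value"
# (host name and subject are placeholders for your environment)
curl -X POST http://schema-registry-host:8081/subjects/orders-value/versions \
  -H "Content-Type: application/vnd.schemaregistry.v1+json" \
  -d '{"schema": "{\"type\":\"record\",\"name\":\"Order\",\"fields\":[{\"name\":\"id\",\"type\":\"long\"},{\"name\":\"amount\",\"type\":\"double\"}]}"}'
```

Producers and consumers then reference the registered schema by ID, and the registry rejects new versions that violate the configured compatibility mode.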
KSQL is a streaming SQL engine used by Arenadata Streaming to process real-time data streams. It allows users to write SQL queries to transform, aggregate, and analyze data in real-time, making it easy to create real-time data processing pipelines without the need for complex programming.
In Arenadata Streaming, KSQL provides a simple yet powerful way to interact with data streams, enabling users to query, join, and filter data as it flows through the system. It supports a wide range of SQL operations, including windowing, aggregations, and joins, allowing users to create complex processing logic without the need for custom code.
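For example, a continuous windowed aggregation over a stream can be expressed in a few SQL statements. The stream, topic, and column names below are illustrative.

```sql
-- Declare a stream over an existing Kafka topic
-- (topic and column names are placeholders)
CREATE STREAM orders (customer_id VARCHAR, amount DOUBLE)
  WITH (KAFKA_TOPIC='orders', VALUE_FORMAT='JSON');

-- Count orders per customer over a 1-minute tumbling window
CREATE TABLE orders_per_minute AS
  SELECT customer_id, COUNT(*) AS order_count
  FROM orders
  WINDOW TUMBLING (SIZE 1 MINUTE)
  GROUP BY customer_id
  EMIT CHANGES;
```

The resulting table is updated incrementally as new records arrive in the underlying topic, with no custom consumer code required.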
Kafka Connect is a data integration framework used by Arenadata Streaming to move data between Apache Kafka and other systems. It provides a scalable and fault-tolerant infrastructure for ingesting and exporting data to and from Kafka, making it easy to integrate different systems and technologies with Kafka.
In Arenadata Streaming, Kafka Connect enables users to integrate data from various sources such as databases, file systems, and messaging systems with Kafka. It provides connectors that can be configured to extract data from different systems and write it to Kafka topics, or to read data from Kafka topics and write it to external systems.
The Kafka Connect framework is also used to run MirrorMaker 2, a tool that Arenadata Streaming uses to replicate data between Apache Kafka clusters. MirrorMaker 2 replaces the original MirrorMaker tool and provides several new features and improvements over its predecessor.
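A MirrorMaker 2 deployment is driven by a properties file. A minimal sketch that replicates all topics from one cluster to another could look like this; the cluster aliases and broker addresses are placeholders.

```properties
# Minimal MirrorMaker 2 configuration sketch
# (cluster aliases and bootstrap servers are placeholders)
clusters = primary, backup
primary.bootstrap.servers = primary-broker:9092
backup.bootstrap.servers = backup-broker:9092

# Replicate all topics from primary to backup
primary->backup.enabled = true
primary->backup.topics = .*
```

Replicated topics appear on the target cluster prefixed with the source cluster alias (for example, `primary.orders`), which avoids name collisions in active-active setups.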
Kafka REST Proxy is a tool used by Arenadata Streaming to expose Apache Kafka functionality as a RESTful API. It provides a simple and scalable way to integrate Kafka with other systems and technologies that support RESTful APIs.
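For instance, a message can be produced to a topic with a plain HTTP request; the host name and topic below are placeholders.

```shell
# Produce a JSON message to the "events" topic through the REST Proxy
# (host name and topic are placeholders for your environment)
curl -X POST http://rest-proxy-host:8082/topics/events \
  -H "Content-Type: application/vnd.kafka.json.v2+json" \
  -d '{"records": [{"value": {"sensor": "t-101", "reading": 21.5}}]}'
```

This makes Kafka accessible to clients that cannot use the native Kafka protocol, at the cost of some throughput compared to a native producer.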
Apache NiFi is an open-source data integration tool used by Arenadata Streaming to automate the flow of data between different systems and technologies. It provides a visual drag-and-drop interface for designing and configuring data flows, making it easy for users to build complex data pipelines without writing any code.
In Arenadata Streaming, Apache NiFi enables users to build and manage data flows across different systems and technologies. It provides a wide range of processors and connectors that can be used to integrate with various data sources and destinations, including databases, message queues, and cloud platforms.
Apache MiNiFi is a lightweight data collection tool used by Arenadata Streaming to collect and preprocess data at the edge of the network. It is designed to run on resource-constrained devices, such as sensors and IoT devices, and enables users to collect and process data in real-time, without relying on a central server.
In Arenadata Streaming, Apache MiNiFi enables users to collect and preprocess data at the edge of the network, before sending it to a central server for further processing and analysis. It provides a wide range of processors and connectors that can be used to collect data from various sources, including sensors, cameras, and other IoT devices.
Apache NiFi Registry is a version control and management system used by Arenadata Streaming to manage and version data flows and other assets created using Apache NiFi. It provides a central repository for storing and managing NiFi flows, templates, and other artifacts, enabling users to easily version, deploy, and reuse them across different environments.
Kafka Manager (also known as CMAK) is a web-based management tool used to manage Apache Kafka clusters. It is designed to simplify the administration of Kafka clusters, providing a user-friendly interface for managing and monitoring Kafka topics, partitions, and brokers.
In Arenadata Streaming, Kafka Manager enables users to easily manage and monitor their Kafka clusters. It provides a web-based interface for performing administrative tasks, such as creating and deleting topics, reassigning partitions, and managing broker configurations. It also provides real-time metrics and monitoring of Kafka clusters, allowing users to easily identify and troubleshoot issues.
A single tool for managing the lifecycle of all Arenadata products.
ADCM is installed with one command and only requires Docker.
Two options for cluster deployment and management.
The Self-managed option for on-premises requires manual installation and configuration.
Cloud Managed allows you to control the cluster via cloud interfaces.
By using infrastructure bundles, ADS supports installation on physical and virtual servers (on-premises) and in private and public clouds according to the IaaS model.
Additionally, infrastructure bundles provide automatic installation on existing nodes, as well as node creation "on the fly" for some cloud providers (YC, VK).
Supported.
ADS supports a number of proprietary solutions for integration:
- ADB Kafka Connector;
- ADQM Kafka Connector;
- Kafka Picodata Connector;
- NiFi Hive streaming processor;
- Kafka Connect Mirror Maker 2.
The Kafka clusters are managed via Kafka Manager based on CMAK.
In addition, there is a proprietary solution, ADS Control, whose current functionality allows you to manage the Kafka Connect service using a convenient interface.
Proprietary interface for configuration and administration of all components.
ADS supplies Kafka Connect, Schema Registry, ksqlDB, Kafka REST Proxy, Kafka Manager, NiFi, MiNiFi.
Additionally, you can install ADS Control, a solution for managing Kafka clusters. It supports the management of multiple Kafka and Kafka Connect clusters with the ability to create, edit, and remove Kafka connectors.
Includes the entire Apache Kafka ecosystem: Kafka Connect, Schema Registry, ksqlDB, Kafka REST Proxy.
Via ADCM.
Flexible settings with Ranger in a separate ADPS product, which can serve multiple instances of ADS and other Arenadata products.
Knox as a part of ADPS.
Full training on working with Arenadata products.
ADS has a free version that is available for direct download.
Detailed documentation in Russian and English for all services, covering their installation, configuration, and operation.
Publicly available.
ADS has been used for hundreds of thousands of hours in more than 20 leading Russian companies as a streaming platform.
Complete release history with service versions and description of the upgraded functionality is available in the open domain.
Separate documentation for Cloud and Self-managed versions.
ADS
Confluent
The “Product comparison” section is current as of 15.09.2024.
- Implemented the consumer groups overview
- Implemented support for RedOS
- Implemented message reading
- Added support for Java 17
- Updated all services
- Developed the kafka-rest-security plugin
- Improved security management
- Added support for Ubuntu 22.04 LTS
- Upgraded NiFi to 1.23.2
- Added support for Ubuntu 22.04 LTS
- Added the ability to create/delete/edit topics
- Added the page to overview topics
- Added the page to overview brokers/clusters
- Upgraded the Kafka version to 3.3.2 and other Kafka services
- Package checking on cluster actions is now optional
- Improved Kerberos management
- Implemented RedOS support
- Implemented encryption of NiFi configuration parameters
- Upgraded version of NiFi to 1.20.0
- Implemented support for Astra Linux
- Revised user management to use secretmap entries in the ADCM interface
- Implemented support for Astra Linux in the Enterprise version of the ADS bundle
- Implemented showing Confluent licenses in ADS when adding services during cluster installation
- Implemented hostgroup settings for all services
- Reworked user authentication via SASL PLAINTEXT and Basic Auth to use secretmap entries in the ADCM interface
- Changed versions:
  - ZooKeeper up to 3.5.10
  - Kafka Connect up to 2.8.1
- Added actions for Schema Registry service for expand and shrink operations
- Added installation of the MiNiFi Toolkit
- Added the ability to use configuration groups for MiNiFi service
- Reworked log4j templates for the ksqlDB service
- Changed versions:
  - NiFi Server and NiFi Registry components up to Apache NiFi 1.18
  - MiNiFi service up to Apache MiNiFi 1.18
- Added the ability to delete a service from a cluster in the ADCM interface
- Implemented support for Alt 8 SP in minifi.sh for NiFi service version 1.18
- For the NiFi Registry (a component of the NiFi service), Ranger authorization is implemented to protect access when storing and managing shared resources in one or more NiFi instances
- Added the ability to manage all parameters using the ADCM user interface in all configuration files
- Basic authentication is now available for the following ADS services:
  - Schema Registry
  - Kafka REST Proxy
  - KSQL
- Added support of Alt Linux 8.4 operating system for ADS
- For the NiFi service, the Ranger authorization plugin has been added, the ability to add or remove permissions for processing messages in NiFi has been implemented
- Added support for Kafka Connect service and Mirror Maker 2 mechanism for ADS
- Updated package versions:
  - Kafka 2.8.1
  - Nifi 1.15.0
  - Nifi-Registry 1.15.0
  - Schema-Registry 6.2.1
  - Kafka REST Proxy 6.2.1
  - KSQL 6.2.1
  - MiNiFi 1.15.0
- Added LDAP/AD authentication for NiFi service
- For the NiFi service, the ability to work with the "_routing" option using the NiFi Elasticsearch processor has been added
- Switching of the logging level for ADS services is implemented
- For ADS in ADCM, the ability to configure channel protection via the SSL protocol has been implemented
- The authentication protocol Kerberos is implemented for ADS
- Added the ability to use Active Directory as a Kerberos store for ADS
- Implemented assembly of components with a dependency on ZooKeeper 3.5.8:
  - MiNiFi 0.7.0
  - NiFi-Registry 0.7.0
  - NiFi 1.12.0
- Updated package versions:
  - Kafka 2.6.0
  - Zookeeper 3.5.8
  - Nifi 1.12.0
  - Nifi-Registry 0.7.0
  - Schema-Registry 6.0.0
  - Kafka REST Proxy 6.0.0
  - KSQL 6.0.0
  - Kafka Manager 3.0.0.5
- Implemented SASL/PLAIN support for Kafka, KSQL, Schema-Registry, Kafka-Rest, Kafka-Manager services
- Implemented the ability to add/update users for SASL/PLAIN
- Implemented ADS integration with ADPS (Arenadata Platform Security)
- Implemented support for Ranger Kafka Plugin
- Enterprise version released
- Updated package versions:
  - Kafka 2.4.0
  - Zookeeper 3.5.6
  - Nifi 1.10.0
  - Nifi-Registry 0.5.0
  - Schema-Registry 5.4.0
  - Kafka REST Proxy 5.4.0
  - KSQL 5.4.0
  - Kafka Manager 1.3.3.23
- Implementation of the MiNiFi 0.5.0 service
- For the MiNiFi service, the following actions have been implemented:
  - Install
  - Start/Stop/Restart
  - Check
  - Expand
  - Shrink
- Monitoring implemented for MiNiFi service
- ALT Linux operating system support
- Added support for Kafka version 2.4.0 in Kafka-Manager
- Added Analytics Framework support for NiFi service
- Added cluster update operation
- Added operations for adding/removing a host from a running Kafka, Nifi, Zookeeper cluster
- Added the ability to export/import the connection string to the Zookeeper service for sharing one service instance in different clusters
- Added the ability to install offline
- Added Restart and Check operations, the latter verifying that the Nifi service is operating correctly
- Added integration of Nifi and Nifi-Registry services
- Implemented collection, visualization, and automatic sending of Nifi metrics to the Monitoring cluster