Configuration parameters

This topic describes the parameters that can be configured for ADH services via ADCM. To learn about the configuration process, refer to the relevant articles: Online installation and Offline installation.

NOTE
Some of the parameters become visible in the ADCM UI only after the Show advanced flag is set.

Airflow

Airflow environment
Parameter Description Default value

airflow_dir

The Airflow home directory

/srv/airflow/home

db_dir

The location of Metastore DB

/srv/airflow/metastore

airflow.cfg
Parameter Description Default value

db_user

The user to connect to Metadata DB

airflow

db_password

The password to connect to Metadata DB

 — 

db_root_password

The root password to connect to Metadata DB

 — 

db_port

The port to connect to Metadata DB

3307

server_port

The port to run the web server

8080

flower_port

The port that Celery Flower runs on

5555

worker_port

When you start an Airflow Worker, Airflow starts a tiny web server subprocess to serve the Worker's local log files to the main Airflow web server, which then builds pages and sends them to users. This parameter defines the port on which the logs are served. The port must be free and accessible from the main web server so that it can connect to the Workers

8793

redis_port

The port for running Redis

6379

fernet_key

The secret key to save connection passwords in the database

 — 

security

Defines which security module to use. For example, kerberos

 — 

keytab

The path to the keytab file

 — 

reinit_frequency

Sets the ticket renewal frequency

3600

principal

The Kerberos principal

ssl_active

Defines if SSL is active for Airflow

false

web_server_ssl_cert

The path to the SSL certificate

/etc/ssl/certs/host_cert.cert

web_server_ssl_key

The path to the SSL certificate key

/etc/ssl/host_cert.key

Logging level

Specifies the logging level for Airflow activity

INFO

Logging level for Flask-appbuilder UI

Specifies the logging level for Flask-appbuilder UI

WARNING

cfg_properties_template

The Jinja template to initialize environment variables for Airflow
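
For reference, the Kerberos-related values from the table above map to keys in the generated airflow.cfg. A minimal sketch, assuming illustrative placeholder values for the keytab path and principal (they are not ADH defaults):

  [core]
  security = kerberos

  [kerberos]
  keytab = /etc/security/keytabs/airflow.service.keytab
  reinit_frequency = 3600
  principal = airflow/_HOST@EXAMPLE.COM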

External database
Parameter Description Default value

Database type

The external database type. Possible values: PostgreSQL, MySQL/MariaDB

MySQL/MariaDB

Hostname

The external database host

 — 

Custom port

The external database port

 — 

Airflow database name

The external database name

airflow

flink-conf.yaml
Parameter Description Default value

jobmanager.rpc.port

The RPC port through which the JobManager is reachable. In the high availability mode, this value is ignored and the port number to connect to JobManager is generated by ZooKeeper

6123

sql-gateway.endpoint.rest.port

A port to connect to the SQL Gateway service

8083

taskmanager.network.bind-policy

The automatic address binding policy used by the TaskManager

name

parallelism.default

The system-wide default parallelism level for all execution environments

1

taskmanager.numberOfTaskSlots

The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline

1

taskmanager.heap.size

The heap size for the TaskManager JVM

1024m

jobmanager.heap.size

The heap size for the JobManager JVM

1024m

security.kerberos.login.use-ticket-cache

Indicates whether to read from the Kerberos ticket cache

false

security.kerberos.login.keytab

The absolute path to the Kerberos keytab file that stores user credentials

 — 

security.kerberos.login.principal

Flink Kerberos principal

 — 

security.kerberos.login.contexts

A comma-separated list of login contexts to provide the Kerberos credentials to

 — 

security.ssl.rest.enabled

Turns on SSL for external communication via REST endpoints

false

security.ssl.rest.keystore

The Java keystore file with SSL key and certificate to be used by Flink’s external REST endpoints

 — 

security.ssl.rest.truststore

The truststore file containing public CA certificates to verify the peer for Flink’s external REST endpoints

 — 

security.ssl.rest.keystore-password

The secret to decrypt the keystore file for Flink external REST endpoints

 — 

security.ssl.rest.truststore-password

The password to decrypt the truststore for Flink’s external REST endpoints

 — 

security.ssl.rest.key-password

The secret to decrypt the key in the keystore for Flink’s external REST endpoints

 — 

Logging level

Defines the logging level for Flink activity

INFO

high-availability

Defines the High Availability (HA) mode used for cluster execution

 — 

high-availability.zookeeper.quorum

The ZooKeeper quorum to use when running Flink in the HA mode with ZooKeeper

 — 

high-availability.storageDir

A file system path (URI) where Flink persists metadata in the HA mode

 — 

high-availability.zookeeper.path.root

The root path for Flink ZNode in Zookeeper

/flink

high-availability.cluster-id

The ID of the Flink cluster used to separate multiple Flink clusters from each other

 — 

sql-gateway.session.check-interval

The check interval to detect idle sessions. A value <= 0 disables the checks

1 min

sql-gateway.session.idle-timeout

The timeout to close a session if no successful connection was made during this interval. A value <= 0 never closes the sessions

10 min

sql-gateway.session.max-num

The maximum number of sessions to run simultaneously

1000000

sql-gateway.worker.keepalive-time

The time to keep an idle worker thread alive. When the worker thread count exceeds sql-gateway.worker.threads.min, excessive threads are killed after this time interval

5 min

sql-gateway.worker.threads.max

The maximum number of worker threads on the SQL Gateway server

500

sql-gateway.worker.threads.min

The minimum number of worker threads. If the current number of worker threads is less than this value, the worker threads are not deleted automatically

500

zookeeper.sasl.disable

Defines whether SASL authentication is disabled in ZooKeeper

false
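
To illustrate how the HA-related keys above fit together, here is a minimal flink-conf.yaml sketch; the host names, the HDFS path, and the cluster ID are placeholders, not defaults:

  high-availability: zookeeper
  high-availability.zookeeper.quorum: zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
  high-availability.storageDir: hdfs:///flink/ha/
  high-availability.zookeeper.path.root: /flink
  high-availability.cluster-id: /cluster_one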

Other
Parameter Description Default value

Custom flink-conf.yaml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file flink-conf.yaml

 — 

log4j.properties

The contents of the log4j.properties configuration file

log4j-cli.properties

The contents of the log4j-cli.properties configuration file

HBase

hbase-site.xml
Parameter Description Default value

hbase.balancer.period

The period at which the Region balancer runs in Master (in milliseconds)

300000

hbase.client.pause

General client pause value. Used mostly as the time to wait before retrying a failed get, region lookup, and so on. See hbase.client.retries.number for a description of how this pause works with retries

100

hbase.client.max.perregion.tasks

The maximum number of concurrent mutation tasks the Client will maintain to a single Region. That is, if hbase.client.max.perregion.tasks writes are already in progress for this Region, new puts won't be sent to this Region until some of the writes finish

1

hbase.client.max.perserver.tasks

The maximum number of concurrent mutation tasks a single HTable instance will send to a single Region Server

2

hbase.client.max.total.tasks

The maximum number of concurrent mutation tasks a single HTable instance will send to the cluster

100

hbase.client.retries.number

The maximum number of retries. It is used as the maximum for all retryable operations, such as getting a cell value, starting a row update, and so on. The retry interval is a rough function based on hbase.client.pause. See the constant RETRY_BACKOFF for how the backoff ramps up. Change this setting and hbase.client.pause to suit your workload

15

hbase.client.scanner.timeout.period

The Client scanner lease period in milliseconds

60000

hbase.cluster.distributed

The cluster mode. Possible values are: false — for standalone mode and pseudo-distributed setups with managed ZooKeeper; true — for fully-distributed mode with an unmanaged ZooKeeper Quorum. If false, startup runs all HBase and ZooKeeper daemons together in one JVM; if true — one JVM instance per daemon

true

hbase.hregion.majorcompaction

The time interval between Major compactions in milliseconds. Set to 0 to disable time-based automatic Major compactions. User-requested and size-based Major compactions will still run. This value is multiplied by hbase.hregion.majorcompaction.jitter to cause compaction to start at a somewhat-random time during a given time frame

604800000

hbase.hregion.max.filesize

The maximum file size. If the total size of a Region's HFiles grows to exceed this value, the Region is split in two. This option can work in two ways: split when any store size exceeds the threshold, or split when the overall Region size exceeds the threshold. The behavior is selected via hbase.hregion.split.overallfiles

10737418240

hbase.hstore.blockingStoreFiles

If more than this number of StoreFiles exists in any Store (one StoreFile is written per flush of MemStore), updates are blocked for this Region, until a compaction is completed, or until hbase.hstore.blockingWaitTime is exceeded

16

hbase.hstore.blockingWaitTime

The time for which a Region will block updates after reaching the StoreFile limit defined by hbase.hstore.blockingStoreFiles. After this time has elapsed, the Region will stop blocking updates, even if a compaction has not been completed

90000

hbase.hstore.compaction.max

The maximum number of StoreFiles that will be selected for a single Minor compaction, regardless of the number of eligible StoreFiles. Effectively, the value of hbase.hstore.compaction.max controls the time it takes for a single compaction to complete. Setting it larger means that more StoreFiles are included in a compaction. For most cases, the default value is appropriate

10

hbase.hstore.compaction.min

The minimum number of StoreFiles that must be eligible for compaction before compaction can run. The goal of tuning hbase.hstore.compaction.min is to avoid a situation with too many tiny StoreFiles to compact. Setting this value to 2 would cause a Minor compaction each time you have two StoreFiles in a Store, and this is probably not appropriate. If you set this value too high, all the other values will need to be adjusted accordingly. For most cases, the default value is appropriate. In the previous versions of HBase, the parameter hbase.hstore.compaction.min was called hbase.hstore.compactionThreshold

3

hbase.hstore.compaction.min.size

A StoreFile smaller than this size will always be eligible for Minor compaction. StoreFiles of this size or larger are evaluated by hbase.hstore.compaction.ratio to determine whether they are eligible. Because this limit represents the "automatic include" limit for all StoreFiles smaller than this value, it may need to be reduced in write-heavy environments where many files in the 1-2 MB range are being flushed, because every StoreFile will be targeted for compaction and the resulting StoreFiles may still be under the minimum size and require further compaction. If this parameter is lowered, the ratio check is triggered more quickly. This addressed some issues seen in earlier versions of HBase, but changing this parameter is no longer necessary in most situations

134217728

hbase.hstore.compaction.ratio

For Minor compaction, this ratio is used to determine whether a given StoreFile that is larger than hbase.hstore.compaction.min.size is eligible for compaction. Its effect is to limit compaction of large StoreFiles. The value of hbase.hstore.compaction.ratio is expressed as a floating-point decimal

1.2F

hbase.hstore.compaction.ratio.offpeak

The compaction ratio used during off-peak compactions if the off-peak hours are also configured. Expressed as a floating-point decimal. This allows for more aggressive (or less aggressive, if you set it lower than hbase.hstore.compaction.ratio) compaction during a given time period. The value is ignored if off-peak is disabled (default). This works the same as hbase.hstore.compaction.ratio

5.0F

hbase.hstore.compactionThreshold

If more than this number of StoreFiles exists in any Store (one StoreFile is written per flush of MemStore), a compaction is run to rewrite all StoreFiles into a single StoreFile. Larger values delay the compaction, but when compaction does occur, it takes longer to complete

3

hbase.hstore.flusher.count

The number of flush threads. With fewer threads, the MemStore flushes will be queued. With more threads, the flushes will be executed in parallel, increasing the load on HDFS, and potentially causing more compactions

2

hbase.hstore.time.to.purge.deletes

The amount of time to delay purging of delete markers with future timestamps. If unset or set to 0, all the delete markers, including those with future timestamps, are purged during the next Major compaction. Otherwise, a delete marker is kept until the Major compaction that occurs after the marker timestamp plus the value of this setting (in milliseconds)

0

hbase.master.ipc.address

The bind address of the HMaster RPC server

0.0.0.0

hbase.normalizer.period

The period at which the Region normalizer runs on Master (in milliseconds)

300000

hbase.regionserver.compaction.enabled

Enables/disables compactions by setting true/false. You can further switch compactions dynamically with the compaction_switch shell command

true

hbase.regionserver.ipc.address

The bind address of the Region Server RPC server

0.0.0.0

hbase.regionserver.regionSplitLimit

The limit for the number of Regions, after which no more Region splitting should take place. This is not a hard limit for the number of Regions, but acts as a guideline for the Region Server to stop splitting after a certain limit

1000

hbase.rootdir

The directory shared by Region Servers and into which HBase persists. The URL should be fully-qualified to include the filesystem scheme. For example, to specify the HDFS directory /hbase where the HDFS instance NameNode is running at namenode.example.org on port 9000, set this value to: hdfs://namenode.example.org:9000/hbase

 — 

hbase.zookeeper.quorum

A comma-separated list of servers in the ZooKeeper ensemble. For example, host1.mydomain.com,host2.mydomain.com,host3.mydomain.com. By default, this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper ensemble servers. If HBASE_MANAGES_ZK is set in hbase-env.sh, this is the list of servers, which HBase will start/stop ZooKeeper on, as part of cluster start/stop. Client-side, the list of ensemble members is put together with the hbase.zookeeper.property.clientPort config and is passed to the ZooKeeper constructor as the connection string parameter

 — 

zookeeper.session.timeout

The ZooKeeper session timeout in milliseconds. It is used in two different ways. First, this value is processed by the ZooKeeper Client that HBase uses to connect to the ensemble. It is also used by HBase when it starts a ZooKeeper Server (in that case the timeout is passed as the maxSessionTimeout). See more details in the ZooKeeper documentation. For example, if an HBase Region Server connects to a ZooKeeper ensemble that is also managed by HBase, then the session timeout will be the one specified by this configuration. But a Region Server that connects to an ensemble managed with a different configuration will be subjected to the maxSessionTimeout of that ensemble. So, even though HBase might propose using 90 seconds, the ensemble can have a maximum timeout lower than this, and it will take precedence. The current default maxSessionTimeout that ZooKeeper ships with is 40 seconds, which is lower than the HBase default

90000

zookeeper.znode.parent

The root znode for HBase in ZooKeeper. All of the HBase ZooKeeper files configured with a relative path will go under this node. By default, all of the HBase ZooKeeper file paths are configured with a relative path, so they will all go under this directory unless changed

/hbase

hbase.rest.port

The port used by HBase Rest Servers

60080

hbase.zookeeper.property.authProvider.1

Specifies the ZooKeeper authentication method

hbase.security.authentication

Set the value to true to run HBase RPC with strong authentication

false

hbase.security.authentication.ui

Enables Kerberos authentication to HBase web UI with SPNEGO

 — 

hbase.security.authentication.spnego.kerberos.principal

The Kerberos principal for SPNEGO authentication

 — 

hbase.security.authentication.spnego.kerberos.keytab

The path to the Kerberos keytab file with principals to be used for SPNEGO authentication

 — 

hbase.security.authorization

Set the value to true to run HBase RPC with strong authorization

false

hbase.master.kerberos.principal

The Kerberos principal used to run the HMaster process

 — 

hbase.master.keytab.file

Full path to the Kerberos keytab file to use for logging in the configured HMaster server principal

 — 

hbase.regionserver.kerberos.principal

The Kerberos principal name that should be used to run the HRegionServer process

 — 

hbase.regionserver.keytab.file

Full path to the Kerberos keytab file to use for logging in the configured HRegionServer server principal

 — 

hbase.rest.authentication.type

REST Gateway Kerberos authentication type

 — 

hbase.rest.authentication.kerberos.principal

REST Gateway Kerberos principal

 — 

hbase.rest.authentication.kerberos.keytab

REST Gateway Kerberos keytab

 — 

hbase.thrift.keytab.file

Thrift Kerberos keytab

 — 

hbase.rest.keytab.file

HBase REST gateway Kerberos keytab

 — 

hbase.rest.kerberos.principal

HBase REST gateway Kerberos principal

 — 

hbase.thrift.kerberos.principal

Thrift Kerberos principal

 — 

hbase.thrift.security.qop

Defines authentication, integrity, and confidentiality checking. Supported values:

  • auth-conf — authentication, integrity, and confidentiality checking;

  • auth-int — authentication and integrity checking;

  • auth — authentication checking only.

 — 

phoenix.queryserver.keytab.file

The path to the Kerberos keytab file

 — 

phoenix.queryserver.kerberos.principal

The Kerberos principal to use when authenticating. If phoenix.queryserver.kerberos.http.principal is not defined, this principal will also be used both to authenticate SPNEGO connections and to connect to HBase

 — 

phoenix.queryserver.kerberos.keytab

The full path to the Kerberos keytab file to use for logging in the configured HMaster server principal

 — 

phoenix.queryserver.http.keytab.file

The keytab file to use for authenticating SPNEGO connections. This configuration must be specified if phoenix.queryserver.kerberos.http.principal is configured. phoenix.queryserver.keytab.file will be used if this property is undefined

 — 

phoenix.queryserver.http.kerberos.principal

The Kerberos principal to use when authenticating SPNEGO connections. phoenix.queryserver.kerberos.principal will be used if this property is undefined

phoenix.queryserver.kerberos.http.principal

Deprecated, use phoenix.queryserver.http.kerberos.principal instead

 — 

hbase.ssl.enabled

Defines whether SSL is enabled for web UIs

false

hadoop.ssl.enabled

Defines whether SSL is enabled for Hadoop RPC

false

ssl.server.keystore.location

The path to the keystore file

 — 

ssl.server.keystore.password

The password to the keystore

 — 

ssl.server.truststore.location

The path to the truststore to be used

 — 

ssl.server.truststore.password

The password to the truststore

 — 

ssl.server.keystore.keypassword

The password to the key in the keystore

 — 

hbase.rest.ssl.enabled

Defines whether SSL is enabled for HBase REST server

false

hbase.rest.ssl.keystore.store

The path to the keystore used by HBase REST server

 — 

hbase.rest.ssl.keystore.password

The password to the keystore

 — 

hbase.rest.ssl.keystore.keypassword

The password to the key in the keystore

 — 
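
As an illustration of the hbase-site.xml format, the following sketch sets the two core parameters from this table, reusing the example values given in their descriptions (namenode.example.org and host1-3.mydomain.com are placeholders):

  <property>
    <name>hbase.rootdir</name>
    <value>hdfs://namenode.example.org:9000/hbase</value>
  </property>
  <property>
    <name>hbase.zookeeper.quorum</name>
    <value>host1.mydomain.com,host2.mydomain.com,host3.mydomain.com</value>
  </property>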

HBASE heap memory settings
Parameter Description Default value

HBASE Regionserver Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for HBase Region server

-Xms700m -Xmx9G

HBASE Master Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for HBase Master

-Xms700m -Xmx9G

Phoenix Queryserver Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Phoenix Query server

-Xms700m -Xmx8G

HBASE Thrift2 server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for HBase Thrift2 server

-Xms700m -Xmx8G

HBASE Rest server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for HBase Rest server

-Xms200m -Xmx8G

ranger-hbase-audit.xml
Parameter Description Default value

xasecure.audit.destination.solr.batch.filespool.dir

The spool directory path

/srv/ranger/hdfs_plugin/audit_solr_spool

xasecure.audit.destination.solr.urls

Leave this property value empty or set it to NONE when using ZooKeeper to connect to Solr

 — 

xasecure.audit.destination.solr.zookeepers

Specifies the ZooKeeper connection string for the Solr destination

 — 

xasecure.audit.destination.solr.force.use.inmemory.jaas.config

Uses in-memory JAAS configuration file to connect to Solr

 — 

xasecure.audit.is.enabled

Enables Ranger audit

true

xasecure.audit.jaas.Client.loginModuleControlFlag

Specifies whether the success of the module is required, requisite, sufficient, or optional

 — 

xasecure.audit.jaas.Client.loginModuleName

The name of the authenticator class

 — 

xasecure.audit.jaas.Client.option.keyTab

The name of the keytab file to get the principal’s secret key

 — 

xasecure.audit.jaas.Client.option.principal

The name of the principal to be used

 — 

xasecure.audit.jaas.Client.option.serviceName

Represents a user or a service that wants to log in

 — 

xasecure.audit.jaas.Client.option.storeKey

Set this to true if you want the keytab or the principal’s key to be stored in the subject’s private credentials

false

xasecure.audit.jaas.Client.option.useKeyTab

Set this to true if you want the module to get the principal’s key from the keytab

false

ranger-hbase-security.xml
Parameter Description Default value

ranger.plugin.hbase.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.hbase.service.name

The name of the Ranger service containing policies for this instance

 — 

ranger.plugin.hbase.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/hbase/policycache

ranger.plugin.hbase.policy.pollIntervalMs

Defines how often to poll for changes in policies

30000

ranger.plugin.hbase.policy.rest.client.connection.timeoutMs

The HBase Plugin RangerRestClient connection timeout (in milliseconds)

120000

ranger.plugin.hbase.policy.rest.client.read.timeoutMs

The HBase Plugin RangerRestClient read timeout (in milliseconds)

30000

ranger.plugin.hbase.policy.rest.ssl.config.file

The path to the RangerRestClient SSL config file for HBase plugin

/etc/hbase/conf/ranger-hbase-policymgr-ssl.xml

ranger-hbase-policymgr-ssl.xml
Parameter Description Default value

xasecure.policymgr.clientssl.keystore

The path to the keystore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.credential.file

The path to the keystore credentials file

/etc/hbase/conf/ranger-hbase.jceks

xasecure.policymgr.clientssl.truststore.credential.file

The path to the truststore credentials file

/etc/hbase/conf/ranger-hbase.jceks

xasecure.policymgr.clientssl.truststore

The path to the truststore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.password

The password to the keystore file

 — 

xasecure.policymgr.clientssl.truststore.password

The password to the truststore file

 — 

Other
Parameter Description Default value

Custom hbase-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hbase-site.xml

 — 

Custom hbase-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hbase-env.sh

 — 

Ranger plugin enabled

Whether or not Ranger plugin is enabled

false

Custom ranger-hbase-audit.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hbase-audit.xml

 — 

Custom ranger-hbase-security.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hbase-security.xml

 — 

Custom ranger-hbase-policymgr-ssl.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hbase-policymgr-ssl.xml

 — 

Custom log4j.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file log4j.properties

Custom hadoop-metrics2-hbase.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hadoop-metrics2-hbase.properties

HDFS

core-site.xml
Parameter Description Default value

fs.defaultFS

The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The URI scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The URI authority is used to determine the host, port, etc. for a filesystem

 — 

fs.trash.checkpoint.interval

The number of minutes between trash checkpoints. Should be smaller than or equal to fs.trash.interval. Every time the checkpointer runs, it creates a new checkpoint out of the current trash contents and removes checkpoints created more than fs.trash.interval minutes ago

60

fs.trash.interval

The number of minutes after which a checkpoint gets deleted. If set to 0, the trash feature is disabled

1440

hadoop.tmp.dir

The base for other temporary directories

/tmp/hadoop-${user.name}

hadoop.zk.address

A comma-separated list of <Host>:<Port> pairs. Each corresponds to a ZooKeeper server to be used by the Resource Manager for storing Resource Manager state

 — 

io.file.buffer.size

The buffer size for sequence files. The size of this buffer should probably be a multiple of the hardware page size (4096 on Intel x86); it determines how much data is buffered during read and write operations

131072

net.topology.script.file.name

The name of the script that should be invoked to resolve DNS names to NetworkTopology names. Example: the script would take host.foo.bar as an argument and return /rack1 as the output

 — 

ha.zookeeper.quorum

A list of ZooKeeper Server addresses, separated by commas, that are to be used by the ZKFailoverController in automatic failover

 — 

ipc.client.fallback-to-simple-auth-allowed

When a client is configured to attempt a secure connection but attempts to connect to an insecure server, that server may instruct the client to switch to SASL SIMPLE (insecure) authentication. This setting controls whether or not the client will accept this instruction from the server. When set to false (default), the client does not allow the fallback to SIMPLE authentication and will abort the connection

false

hadoop.security.authentication

Defines the authentication type. Possible values: simple — no authentication, kerberos — enables the authentication by Kerberos

simple

hadoop.security.authorization

Enables RPC service-level authorization

false

hadoop.rpc.protection

Specifies RPC protection. Possible values:

  • authentication — authentication only;

  • integrity — performs the integrity check in addition to authentication;

  • privacy — encrypts the data in addition to integrity.

authentication

hadoop.security.auth_to_local

The value is a string containing new line characters. See the Kerberos documentation for more information about the format (an example fragment is shown after this table)

 — 

hadoop.http.authentication.type

Defines authentication used for the HTTP web-consoles. The supported values are: simple, kerberos, [AUTHENTICATION_HANDLER-CLASSNAME]

simple

hadoop.http.authentication.kerberos.principal

Indicates the Kerberos principal to be used for the HTTP endpoint when using kerberos authentication. The principal short name must be HTTP per the Kerberos HTTP SPNEGO specification

HTTP/localhost@$LOCALHOST

hadoop.http.authentication.kerberos.keytab

The location of the keytab file with the credentials for the Kerberos principal used for the HTTP endpoint

/etc/security/keytabs/HTTP.service.keytab

ha.zookeeper.acl

ACLs for all znodes

 — 

hadoop.http.filter.initializers

Add to this property the org.apache.hadoop.security.AuthenticationFilterInitializer initializer class

 — 

hadoop.http.authentication.signature.secret.file

The signature secret file for signing the authentication tokens. If not set, a random secret is generated during the startup. The same secret should be used for all nodes in the cluster: JobTracker, NameNode, DataNode, and TaskTracker. This file should be readable only by the Unix user running the daemons

/etc/security/http_secret

hadoop.http.authentication.cookie.domain

The domain to use for the HTTP cookie that stores the authentication token. In order for authentication to work properly across all nodes in the cluster, the domain must be correctly set. There is no default value; without a domain, the HTTP cookie works only with the hostname that issued it

 — 

hadoop.ssl.require.client.cert

Defines whether client certificates are required

false

hadoop.ssl.hostname.verifier

The host name verifier to provide for HttpsURLConnections. Valid values are: DEFAULT, STRICT, STRICT_IE6, DEFAULT_AND_LOCALHOST, and ALLOW_ALL

DEFAULT

hadoop.ssl.keystores.factory.class

The KeyStoresFactory implementation to use

org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory

hadoop.ssl.server.conf

A resource file from which the SSL server keystore information will be extracted. This file is looked up in the classpath; typically, it should be located in the Hadoop conf/ directory

ssl-server.xml

hadoop.ssl.client.conf

A resource file from which the SSL client keystore information will be extracted. This file is looked up in the classpath; typically, it should be located in the Hadoop conf/ directory

ssl-client.xml

User managed hadoop.security.auth_to_local

Disables automatic generation of hadoop.security.auth_to_local

false
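
As an example of the hadoop.security.auth_to_local format referenced above, a core-site.xml fragment might look like the following sketch. EXAMPLE.COM is a placeholder realm; the rule maps principals such as nn/host@EXAMPLE.COM to the hdfs local user, and DEFAULT keeps the standard translation:

  <property>
    <name>hadoop.security.auth_to_local</name>
    <value>
      RULE:[2:$1@$0](nn@.*EXAMPLE\.COM)s/.*/hdfs/
      DEFAULT
    </value>
  </property>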

hdfs-site.xml
Parameter Description Default value

dfs.client.block.write.replace-datanode-on-failure.enable

If there is a DataNode/network failure in the write pipeline, DFSClient will try to remove the failed DataNode from the pipeline and then continue writing with the remaining DataNodes. As a result, the number of DataNodes in the pipeline is decreased. The feature is to add new DataNodes to the pipeline. This is a site-wide property to enable/disable the feature. When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature. Otherwise, users may experience an unusually high rate of pipeline failures since it is impossible to find new DataNodes for replacement. See also dfs.client.block.write.replace-datanode-on-failure.policy

true

dfs.client.block.write.replace-datanode-on-failure.policy

This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. Possible values:

  • ALWAYS. Always adds a new DataNode, when an existing DataNode is removed.

  • NEVER. Never adds a new DataNode.

  • DEFAULT. Let r be the replication number. Let n be the number of existing DataNodes. Add a new DataNode only, if r is greater than or equal to 3 and either:

    1. floor(r/2) is greater than or equal to n;

    2. r is greater than n and the block is hflushed/appended.

DEFAULT

dfs.client.block.write.replace-datanode-on-failure.best-effort

This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. Best effort means that the client will try to replace a failed DataNode in the write pipeline (provided that the policy is satisfied), but continues the write operation even if the DataNode replacement fails. Suppose the DataNode replacement fails: false — an exception is thrown and the write fails; true — the write is resumed with the remaining DataNodes. Note that setting this property to true allows writing to a pipeline with a smaller number of DataNodes. As a result, it increases the probability of data loss

false

dfs.client.block.write.replace-datanode-on-failure.min-replication

The minimum number of replications needed not to fail the write pipeline if new DataNodes cannot be found to replace failed DataNodes (for example, due to a network failure) in the write pipeline. If the number of the remaining DataNodes in the write pipeline is greater than or equal to this value, writing continues to the remaining nodes. Otherwise, an exception is thrown. If this is set to 0, an exception will be thrown when a replacement cannot be found. See also dfs.client.block.write.replace-datanode-on-failure.policy

0

dfs.balancer.dispatcherThreads

The size of the thread pool for the HDFS balancer block mover — dispatchExecutor

200

dfs.balancer.movedWinWidth

The time window in milliseconds for the HDFS balancer to track blocks and their locations

5400000

dfs.balancer.moverThreads

The thread pool size for executing block moves — moverThreadAllocator

1000

dfs.balancer.max-size-to-move

The maximum number of bytes that can be moved by the balancer in a single thread

10737418240

dfs.balancer.getBlocks.min-block-size

The minimum block threshold size in bytes to ignore when fetching a source block list

10485760

dfs.balancer.getBlocks.size

The total size in bytes of DataNode blocks to get, when fetching a source block list

2147483648

dfs.balancer.block-move.timeout

The maximum amount of time for a block to move (in milliseconds). If set greater than 0, the balancer will stop waiting for a block move completion after this time. In typical clusters, a 3-5 minute timeout is reasonable. If the timeout occurs for a large proportion of block moves, this value needs to be increased. It could also be that too much work is dispatched and many nodes are constantly exceeding the bandwidth limit as a result. In that case, other balancer parameters might need to be adjusted. It is disabled (0) by default

0

dfs.balancer.max-no-move-interval

If this specified amount of time has elapsed and no blocks have been moved out of a source DataNode, one more attempt will be made to move blocks out of this DataNode in the current Balancer iteration

60000

dfs.balancer.max-iteration-time

The maximum amount of time an iteration can be run by the Balancer. After this time the Balancer will stop the iteration, and re-evaluate the work needed to be done to balance the cluster. The default value is 20 minutes

1200000

dfs.blocksize

The default block size for new files (in bytes). You can use the following suffixes to define size units (case insensitive): k (kilo), m (mega), g (giga), t (tera), p (peta), e (exa). For example, 128k, 512m, 1g, etc. You can also specify the block size in bytes (such as 134217728 for 128 MB). A sample hdfs-site.xml fragment is shown after this table

134217728

dfs.client.read.shortcircuit

Turns on short-circuit local reads

true

dfs.datanode.balance.max.concurrent.moves

The maximum number of threads for DataNode balancer pending moves. This value is reconfigurable via the dfsadmin -reconfig command

50

dfs.datanode.data.dir

Determines where on the local filesystem a DFS data node should store its blocks. If multiple directories are specified, then data will be stored in all named directories, typically on different devices. The directories should be tagged with corresponding storage types (SSD/DISK/ARCHIVE/RAM_DISK) for HDFS storage policies. The default storage type will be DISK if the directory does not have a storage type tagged explicitly. Directories that do not exist will be created if the local filesystem permissions allow

/srv/hadoop-hdfs/data:DISK

dfs.disk.balancer.max.disk.throughputInMBperSec

The maximum disk bandwidth, used by the disk balancer during reads from a source disk. The unit is MB/sec

10

dfs.disk.balancer.block.tolerance.percent

The parameter specifies when a good enough value is reached for any copy step (in percent). For example, if set to 10, then getting within 10% of the target value is considered good enough. In other words, if a move operation is 20 GB in size and 18 GB (20 * (1 - 10%)) can be moved, the entire operation is considered successful

10

dfs.disk.balancer.max.disk.errors

During a block move from a source to destination disk, there might be various errors. This parameter defines how many errors to tolerate before declaring a move between 2 disks (or a step) has failed

5

dfs.disk.balancer.plan.valid.interval

The maximum amount of time a disk balancer plan (a set of configurations that define the data volume to be redistributed between two disks) remains valid. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified, then milliseconds are assumed

1d

dfs.disk.balancer.plan.threshold.percent

Defines a data storage threshold in percents at which disks start participating in data redistribution or balancing activities

10

dfs.domain.socket.path

The path to a UNIX domain socket that will be used for communication between the DataNode and local HDFS clients. If the string _PORT is present in this path, it will be replaced by the TCP port of the DataNode. The parameter is optional

/var/lib/hadoop-hdfs/dn_socket

dfs.hosts

Names a file that contains a list of hosts allowed to connect to the NameNode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted

/etc/hadoop/conf/dfs.hosts

dfs.mover.movedWinWidth

The minimum time interval for a block to be moved to another location again (in milliseconds)

5400000

dfs.mover.moverThreads

Sets the balancer mover thread pool size

1000

dfs.mover.retry.max.attempts

The maximum number of retries before the mover considers the move as failed

10

dfs.mover.max-no-move-interval

If this specified amount of time has elapsed and no block has been moved out of a source DataNode, one more attempt will be made to move blocks out of this DataNode in the current mover iteration

60000

dfs.namenode.name.dir

Determines where on the local filesystem the DFS name node should store the name table (fsimage). If multiple directories are specified, then the name table is replicated in all of the directories, for redundancy

/srv/hadoop-hdfs/name

dfs.namenode.checkpoint.dir

Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If multiple directories are specified, then the image is replicated in all of the directories for redundancy

/srv/hadoop-hdfs/checkpoint

dfs.namenode.hosts.provider.classname

The class that provides access to host files. By default, org.apache.hadoop.hdfs.server.blockmanagement.HostFileManager is used, which loads files specified by dfs.hosts and dfs.hosts.exclude. If org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager is used, it loads the JSON file defined in dfs.hosts. To change the class name, a NameNode restart is required. dfsadmin -refreshNodes only refreshes the configuration files used by the class

org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager

dfs.namenode.rpc-bind-host

The actual address the RPC server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.rpc-address. It can also be specified per NameNode or per name service for HA/Federation. This is useful for making the NameNode listen on all interfaces by setting it to 0.0.0.0

0.0.0.0

dfs.permissions.superusergroup

The name of the group of super-users. The value should be a single group name

hadoop

dfs.replication

The default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time

3

dfs.journalnode.http-address

The HTTP address of the JournalNode web UI

0.0.0.0:8480

dfs.journalnode.https-address

The HTTPS address of the JournalNode web UI

0.0.0.0:8481

dfs.journalnode.rpc-address

The RPC address of the JournalNode

0.0.0.0:8485

dfs.datanode.http.address

The address of the DataNode HTTP server

0.0.0.0:9864

dfs.datanode.https.address

The address of the DataNode HTTPS server

0.0.0.0:9865

dfs.datanode.address

The address of the DataNode for data transfer

0.0.0.0:9866

dfs.datanode.ipc.address

The IPC address of the DataNode

0.0.0.0:9867

dfs.namenode.http-address

The address and the base port to access the dfs NameNode web UI

0.0.0.0:9870

dfs.namenode.https-address

The secure HTTPS address of the NameNode

0.0.0.0:9871

dfs.ha.automatic-failover.enabled

Defines whether automatic failover is enabled

true

dfs.ha.fencing.methods

A list of scripts or Java classes that will be used to fence the Active NameNode during a failover

shell(/bin/true)

dfs.journalnode.edits.dir

The directory where to store journal edit files

/srv/hadoop-hdfs/journalnode

dfs.namenode.shared.edits.dir

The directory on shared storage between the multiple NameNodes in an HA cluster. This directory will be written by the active and read by the standby in order to keep the namespaces synchronized. This directory does not need to be listed in dfs.namenode.edits.dir. It should be left empty in a non-HA cluster

 — 

dfs.internal.nameservices

A unique nameservices identifier for a cluster or federation. For a single cluster, specify the name that will be used as an alias. For HDFS federation, specify, separated by commas, all namespaces associated with this cluster. This option allows you to use an alias instead of an IP address or FQDN for some commands, for example: hdfs dfs -ls hdfs://<dfs.internal.nameservices>. The value must be alphanumeric without underscores

 — 

dfs.block.access.token.enable

If set to true, access tokens are used as capabilities for accessing DataNodes. If set to false, no access tokens are checked on accessing DataNodes

false

dfs.namenode.kerberos.principal

The NameNode service principal. This is typically set to nn/_HOST@REALM.TLD. Each NameNode will substitute _HOST with its own fully qualified hostname during the startup. The _HOST placeholder allows using the same configuration setting on both NameNodes in an HA setup

nn/_HOST@REALM

dfs.namenode.keytab.file

The keytab file used by each NameNode daemon to login as its service principal. The principal name is configured with dfs.namenode.kerberos.principal

/etc/security/keytabs/nn.service.keytab

dfs.namenode.kerberos.internal.spnego.principal

HTTP Kerberos principal name for the NameNode

HTTP/_HOST@REALM

dfs.web.authentication.kerberos.principal

Kerberos principal name for the WebHDFS

HTTP/_HOST@REALM

dfs.web.authentication.kerberos.keytab

Kerberos keytab file for WebHDFS

/etc/security/keytabs/HTTP.service.keytab

dfs.journalnode.kerberos.principal

The JournalNode service principal. This is typically set to jn/_HOST@REALM.TLD. Each JournalNode will substitute _HOST with its own fully qualified hostname at startup. The _HOST placeholder allows using the same configuration setting on all JournalNodes

jn/_HOST@REALM

dfs.journalnode.keytab.file

The keytab file used by each JournalNode daemon to login as its service principal. The principal name is configured with dfs.journalnode.kerberos.principal

/etc/security/keytabs/jn.service.keytab

dfs.journalnode.kerberos.internal.spnego.principal

The server principal used by the JournalNode HTTP Server for SPNEGO authentication when Kerberos security is enabled. This is typically set to HTTP/_HOST@REALM.TLD. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is *, the web server will attempt to login with every principal specified in the keytab file dfs.web.authentication.kerberos.keytab. For most deployments this can be set to ${dfs.web.authentication.kerberos.principal}, that is, to use the value of dfs.web.authentication.kerberos.principal

HTTP/_HOST@REALM

dfs.datanode.data.dir.perm

Permissions for the directories on the local filesystem where the DFS DataNode stores its blocks. The permissions can either be octal or symbolic

700

dfs.datanode.kerberos.principal

The DataNode service principal. This is typically set to dn/_HOST@REALM.TLD. Each DataNode will substitute _HOST with its own fully qualified host name at startup. The _HOST placeholder allows using the same configuration setting on all DataNodes

dn/_HOST@REALM.TLD

dfs.datanode.keytab.file

The keytab file used by each DataNode daemon to login as its service principal. The principal name is configured with dfs.datanode.kerberos.principal

/etc/security/keytabs/dn.service.keytab

dfs.http.policy

Defines if HTTPS (SSL) is supported on HDFS. This configures the HTTP endpoint for HDFS daemons. The following values are supported: HTTP_ONLY — the service is provided only via http; HTTPS_ONLY — the service is provided only via https; HTTP_AND_HTTPS — the service is provided both via http and https

HTTP_ONLY

dfs.data.transfer.protection

A comma-separated list of SASL protection values used for secured connections to the DataNode when reading or writing block data. The possible values are:

  • authentication — provides only authentication; no integrity or privacy;

  • integrity — authentication and integrity are enabled;

  • privacy — authentication, integrity and privacy are enabled.

If dfs.encrypt.data.transfer=true, then it supersedes the setting for dfs.data.transfer.protection and enforces that all connections must use a specialized encrypted SASL handshake. This property is ignored for connections to a DataNode listening on a privileged port. In this case, it is assumed that the use of a privileged port establishes sufficient trust

 — 

dfs.encrypt.data.transfer

Defines whether or not actual block data that is read/written from/to HDFS should be encrypted on the wire. This only needs to be set on the NameNodes and DataNodes, clients will deduce this automatically. It is possible to override this setting per connection by specifying custom logic via dfs.trustedchannel.resolver.class

false

dfs.encrypt.data.transfer.algorithm

This value may be set to either 3des or rc4. If nothing is set, then the configured JCE default on the system is used (usually 3DES). It is widely believed that 3DES is more secure, but RC4 is substantially faster. Note that if AES is supported by both the client and server, then this encryption algorithm will only be used to initially transfer keys for AES

3des

dfs.encrypt.data.transfer.cipher.suites

This value can be either undefined or AES/CTR/NoPadding. If defined, then dfs.encrypt.data.transfer uses the specified cipher suite for data encryption. If not defined, then only the algorithm specified in dfs.encrypt.data.transfer.algorithm is used

 — 

dfs.encrypt.data.transfer.cipher.key.bitlength

The key bitlength negotiated by dfsclient and datanode for encryption. This value may be set to either 128, 192, or 256

128

ignore.secure.ports.for.testing

Allows skipping HTTPS requirements in the SASL mode

false

dfs.client.https.need-auth

Whether SSL client certificate authentication is required

false
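
As a simple illustration of the size-suffix form described for dfs.blocksize, an hdfs-site.xml fragment could look like the sketch below; the values are examples, not recommendations:

  <property>
    <name>dfs.blocksize</name>
    <value>128m</value>
  </property>
  <property>
    <name>dfs.replication</name>
    <value>3</value>
  </property>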

httpfs-site.xml
Parameter Description Default value

httpfs.http.administrators

The ACL for the admins. This configuration is used to control who can access the default servlets for the HttpFS server. The value should be a comma-separated list of users and groups. The user list comes first and is separated by a space, followed by the group list, for example: user1,user2 group1,group2. Both users and groups are optional, so you can define only users, only groups, or both. Note that a leading space is always required before the groups list. Using the asterisk grants access to all users and groups

*

hadoop.http.temp.dir

The HttpFS temp directory

${hadoop.tmp.dir}/httpfs

httpfs.ssl.enabled

Defines whether SSL is enabled. The default is false (disabled)

false

httpfs.hadoop.config.dir

The location of the Hadoop configuration directory

/etc/hadoop/conf

httpfs.hadoop.authentication.type

Defines the authentication mechanism used by httpfs for its HTTP clients. Valid values are simple and kerberos. If simple is used, clients must specify the username with the user.name query string parameter. If kerberos is used, HTTP clients must use HTTP SPNEGO or delegation tokens

simple

httpfs.hadoop.authentication.kerberos.keytab

The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by httpfs in the HTTP endpoint. httpfs.authentication.kerberos.keytab is deprecated. Instead, use hadoop.http.authentication.kerberos.keytab

/etc/security/keytabs/httpfs.service.keytab

httpfs.hadoop.authentication.kerberos.principal

The HTTP Kerberos principal used by HttpFS in the HTTP endpoint. The HTTP Kerberos principal MUST start with HTTP/ as per Kerberos HTTP SPNEGO specification. httpfs.authentication.kerberos.principal is deprecated. Instead, use hadoop.http.authentication.kerberos.principal

HTTP/${httpfs.hostname}@${kerberos.realm}

ranger-hdfs-audit.xml
Parameter Description Default value

xasecure.audit.destination.solr.batch.filespool.dir

The spool directory path

/srv/ranger/hdfs_plugin/audit_solr_spool

xasecure.audit.destination.solr.urls

Leave this property value empty or set it to NONE when using ZooKeeper to connect to Solr

 — 

xasecure.audit.destination.solr.zookeepers

Specifies the ZooKeeper connection string for the Solr destination

 — 

xasecure.audit.destination.solr.force.use.inmemory.jaas.config

Uses in-memory JAAS configuration file to connect to Solr

 — 

xasecure.audit.is.enabled

Enables Ranger audit

true

xasecure.audit.jaas.Client.loginModuleControlFlag

Specifies whether the success of the module is required, requisite, sufficient, or optional

 — 

xasecure.audit.jaas.Client.loginModuleName

The name of the authenticator class

 — 

xasecure.audit.jaas.Client.option.keyTab

The name of the keytab file to get the principal’s secret key

 — 

xasecure.audit.jaas.Client.option.principal

The name of the principal to be used

 — 

xasecure.audit.jaas.Client.option.serviceName

Represents a user or a service that wants to log in

 — 

xasecure.audit.jaas.Client.option.storeKey

Set this to true if you want the keytab or the principal’s key to be stored in the subject’s private credentials

false

xasecure.audit.jaas.Client.option.useKeyTab

Set this to true if you want the module to get the principal’s key from the keytab

false

ranger-hdfs-security.xml
Parameter Description Default value

ranger.plugin.hdfs.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.hdfs.service.name

The name of the Ranger service containing policies for this instance

 — 

ranger.plugin.hdfs.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/hdfs/policycache

ranger.plugin.hdfs.policy.pollIntervalMs

Defines how often to poll for changes in policies

30000

ranger.plugin.hdfs.policy.rest.client.connection.timeoutMs

The HDFS Plugin RangerRestClient connection timeout (in milliseconds)

120000

ranger.plugin.hdfs.policy.rest.client.read.timeoutMs

The HDFS Plugin RangerRestClient read timeout (in milliseconds)

30000

ranger.plugin.hdfs.policy.rest.ssl.config.file

The path to the RangerRestClient SSL config file for the HDFS plugin

/etc/hadoop/conf/ranger-hdfs-policymgr-ssl.xml

httpfs-env.sh
Parameter Description Default value

HADOOP_CONF_DIR

Hadoop configuration directory

/etc/hadoop/conf

HADOOP_LOG_DIR

Location of the log directory

${HTTPFS_LOG}

HADOOP_PID_DIR

PID file directory location

${HTTPFS_TEMP}

HTTPFS_SSL_ENABLED

Defines if SSL is enabled for httpfs

false

HTTPFS_SSL_KEYSTORE_FILE

The path to the keystore file

admin

HTTPFS_SSL_KEYSTORE_PASS

The password to access the keystore

admin

Hadoop options
Parameter Description Default value

HDFS_NAMENODE_OPTS

NameNode Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the NameNode

-Xms1G -Xmx8G

HDFS_DATANODE_OPTS

DataNode Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the DataNode

-Xms700m -Xmx8G

HDFS_HTTPFS_OPTS

HttpFS Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the httpfs server

-Xms700m -Xmx8G

HDFS_JOURNALNODE_OPTS

JournalNode Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the JournalNode

-Xms700m -Xmx8G

HDFS_ZKFC_OPTS

ZKFC Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for ZKFC

-Xms500m -Xmx8G

ssl-server.xml
Parameter Description Default value

ssl.server.truststore.location

The truststore to be used by NameNodes and DataNodes

 — 

ssl.server.truststore.password

The password to the truststore

 — 

ssl.server.truststore.type

The truststore file format

jks

ssl.server.truststore.reload.interval

The truststore reload check interval (in milliseconds)

10000

ssl.server.keystore.location

The path to the keystore file used by NameNodes and DataNodes

 — 

ssl.server.keystore.password

The password to the keystore

 — 

ssl.server.keystore.keypassword

The password to the key in the keystore

 — 

ssl.server.keystore.type

The keystore file format

 — 

ssl-client.xml
Parameter Description Default value

ssl.client.truststore.location

The truststore to be used by NameNodes and DataNodes

 — 

ssl.client.truststore.password

The password to the truststore

 — 

ssl.client.truststore.type

The truststore file format

jks

ssl.client.truststore.reload.interval

The truststore reload check interval (in milliseconds)

10000

ssl.client.keystore.location

The path to the keystore file used by NameNodes and DataNodes

 — 

ssl.client.keystore.password

The password to the keystore

 — 

ssl.client.keystore.keypassword

The password to the key in the keystore

 — 

ssl.client.keystore.type

The keystore file format

 — 

Lists of decommissioned and in maintenance hosts
Parameter Description Default value

DECOMMISSIONED

When an administrator decommissions a DataNode, the DataNode will first be transitioned into the DECOMMISSION_INPROGRESS state. After all blocks belonging to that DataNode are fully replicated elsewhere based on each block's replication factor, the DataNode will be transitioned to the DECOMMISSIONED state. After that, the administrator can shut down the node to perform long-term repair and maintenance that could take days or weeks. After the machine has been repaired, it can be recommissioned back to the cluster

 — 

IN_MAINTENANCE

Sometimes administrators only need to take DataNodes down for minutes or hours to perform short-term repair or maintenance. For such scenarios, the HDFS block replication overhead incurred by decommissioning might not be necessary, and a lightweight process is desirable. That is what the maintenance state is used for. When an administrator puts a DataNode in the maintenance state, the DataNode will first be transitioned to the ENTERING_MAINTENANCE state. As long as all blocks belonging to that DataNode are minimally replicated elsewhere, the DataNode will immediately be transitioned to the IN_MAINTENANCE state. After the maintenance has completed, the administrator can take the DataNode out of the maintenance state. In addition, the maintenance state supports a timeout that allows administrators to configure the maximum duration for which a DataNode is allowed to stay in this state. After the timeout, the DataNode will be transitioned out of the maintenance state automatically by HDFS without human intervention

 — 
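
If dfs.namenode.hosts.provider.classname is set to CombinedHostFileManager (see the hdfs-site.xml table above), these administrative states can be declared in the JSON hosts file referenced by dfs.hosts. A minimal sketch with placeholder host names (the exact layout may differ between Hadoop versions):

  [
    { "hostName": "dn1.example.com" },
    { "hostName": "dn2.example.com", "adminState": "DECOMMISSIONED" },
    { "hostName": "dn3.example.com", "adminState": "IN_MAINTENANCE" }
  ]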

Other
Parameter Description Default value

Custom core-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file core-site.xml

 — 

Custom hdfs-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hdfs-site.xml

 — 

Custom httpfs-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file httpfs-site.xml

 — 

Ranger plugin enabled

Whether or not Ranger plugin is enabled

 — 

Custom ranger-hdfs-audit.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hdfs-audit.xml

 — 

Custom ranger-hdfs-security.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hdfs-security.xml

 — 

Custom ranger-hdfs-policymgr-ssl.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hdfs-policymgr-ssl.xml

 — 

Custom httpfs-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file httpfs-env.sh

 — 

Custom ssl-server.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ssl-server.xml

 — 

Custom ssl-client.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ssl-client.xml

 — 

Topology script

The topology script used in HDFS

 — 

Topology data

An optional text file that maps host names to racks for the topology script (see the example after this table). Stored in /etc/hadoop/conf/topology.data

 — 

Custom log4j.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file log4j.properties

Custom httpfs-log4j.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file httpfs-log4j.properties
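
As an illustration of the Topology data format mentioned above, a topology.data file typically maps each host (by name or IP address) to a rack path, one entry per line; the exact format is interpreted by the topology script configured above, and the hosts and racks below are hypothetical:

dn1.example.com /dc1/rack1
dn2.example.com /dc1/rack1
dn3.example.com /dc1/rack2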

Hive

hive-env.sh
Parameter Description Default value

HADOOP_CLASSPATH

A colon-delimited list of directories, files, or wildcard locations that include all necessary classes

/etc/tez/conf/:/usr/lib/tez/:/usr/lib/tez/lib/

HIVE_HOME

The Hive home directory

/usr/lib/hive

METASTORE_PORT

The Hive Metastore port

9083

Hive heap memory settings
Parameter Description Default value

HiveServer2 Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for HiveServer2

-Xms256m -Xmx256m

Hive Metastore Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Hive Metastore

-Xms256m -Xmx256m

hive-site.xml
Parameter Description Default value

hive.cbo.enable

When set to true, enables the cost-based optimizer that uses the Calcite framework

true

hive.compute.query.using.stats

When set to true, Hive will answer a few queries like min, max, and count(1) purely using statistics stored in the Metastore. For basic statistics collection, set the configuration property hive.stats.autogather to true. For more advanced statistics collection, run the ANALYZE TABLE queries

false

hive.execution.engine

Selects the execution engine. Supported values: mr (MapReduce; the built-in Hive default), tez (Tez execution, for Hadoop 2 onwards), spark (Spark execution, available as of Hive 1.1.0)

Tez

hive.log.explain.output

When enabled, logs the EXPLAIN EXTENDED output for the query at the log4j INFO level and in the HiveServer2 web UI (Drilldown → Query Plan). Starting with Hive 3.1.0, this property only logs the output at the log4j INFO level. To show the EXPLAIN EXTENDED output in the web UI (Drilldown → Query Plan) in Hive 3.1.0 and later, use hive.server2.webui.explain.output

true

hive.metastore.event.db.notification.api.auth

Defines whether the Metastore should perform the authorization against database notification related APIs such as get_next_notification. If set to true, then only the superusers in proxy settings have the permission

false

hive.metastore.uris

The Metastore URI used to access metadata in a remote metastore setup. For a remote metastore, specify the Thrift metastore server URI: thrift://<hostname>:<port>, where <hostname> is the host name or IP address of the Thrift metastore server and <port> is the port on which the Thrift server listens

 — 

hive.metastore.warehouse.dir

The absolute HDFS path of the default database for the warehouse that is local to the cluster

/apps/hive/warehouse

hive.server2.enable.doAs

Defines whether HiveServer2 impersonates the connected user

false

hive.stats.fetch.column.stats

Annotation of the operator tree with statistics information requires column statistics. Column statistics are fetched from the Metastore. Fetching column statistics for each needed column can be expensive, when the number of columns is high. This flag can be used to disable fetching of column statistics from the Metastore

 — 

hive.tez.container.size

By default, Tez spawns containers of the size of a mapper. This parameter can be used to override the default value

 — 

hive.support.concurrency

Defines whether Hive should support concurrency or not. A ZooKeeper instance must be up and running for the default Hive Lock Manager to support read/write locks

false

hive.txn.manager

Set this to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager as part of turning on Hive transactions. The default DummyTxnManager replicates pre-Hive-0.13 behavior and provides no transactions

 — 

javax.jdo.option.ConnectionUserName

The metastore database user name

APP

javax.jdo.option.ConnectionPassword

The password for the metastore user name

 — 

javax.jdo.option.ConnectionURL

The JDBC connection URI used to access the data stored in the local Metastore setup. Use the following connection URI: jdbc:<data store type>://<node name>:<port>/<database name> where:

  • <data store type> is the type of the data store;

  • <node name> is the host name or IP address of the data store;

  • <port> is the port on which the data store listens for remote procedure calls (RPC);

  • <database name> is the name of the database.

For example, the following URI specifies a local metastore that uses MySQL as a data store: jdbc:mysql://hostname23:3306/metastore

jdbc:mysql://{{ groups['mysql.master'][0] | d(omit) }}:3306/hive

javax.jdo.option.ConnectionDriverName

The JDBC driver class name used to access Hive Metastore

com.mysql.jdbc.Driver

hive.server2.transport.mode

Sets the transport mode

tcp

hive.server2.thrift.http.port

The port number for Thrift Server2 to listen on

10001

hive.server2.thrift.http.path

The HTTP endpoint of the Thrift Server2 service

cliservice

hive.server2.authentication.kerberos.principal

Hive server Kerberos principal

hive/_HOST@EXAMPLE.COM

hive.server2.authentication.kerberos.keytab

The path to the Kerberos keytab file containing the Hive server service principal

/etc/security/keytabs/hive.service.keytab

hive.server2.authentication.spnego.principal

The SPNEGO Kerberos principal

HTTP/_HOST@EXAMPLE.COM

hive.server2.webui.spnego.principal

The SPNEGO Kerberos principal to access Web UI

 — 

hive.server2.webui.spnego.keytab

The SPNEGO Kerberos keytab file to access Web UI

 — 

hive.server2.webui.use.spnego

Defines whether to use Kerberos SPNEGO for Web UI access

false

hive.server2.authentication.spnego.keytab

The path to the keytab file containing the SPNEGO service principal

/etc/security/keytabs/HTTP.service.keytab

hive.server2.authentication

Sets the authentication mode

NONE

hive.metastore.sasl.enabled

If true, the Metastore Thrift interface will be secured with SASL. Clients must authenticate with Kerberos

false

hive.metastore.kerberos.principal

The service principal for the metastore Thrift server. The _HOST token will be automatically replaced with the appropriate host name

hive/_HOST@EXAMPLE.COM

hive.metastore.kerberos.keytab.file

The path to the Kerberos keytab file containing the metastore Thrift server’s service principal

/etc/security/keytabs/hive.service.keytab

hive.server2.use.SSL

Defines whether to use SSL for HiveServer2

false

hive.server2.keystore.path

The keystore to be used by Hive

 — 

hive.server2.keystore.password

The password to the Hive keystore

 — 

hive.server2.truststore.path

The truststore to be used by Hive

 — 

hive.server2.webui.use.ssl

Defines whether to use SSL for the Hive web UI

false

hive.server2.webui.keystore.path

The path to the keystore file used to access the Hive web UI

 — 

hive.server2.webui.keystore.password

The password to the keystore file used to access the Hive web UI

 — 

hive.server2.support.dynamic.service.discovery

Defines whether to support dynamic service discovery via ZooKeeper

false

hive.zookeeper.quorum

A comma-separated list of ZooKeeper servers (<host>:<port>) running in the cluster

zookeeper:2181

hive.server2.zookeeper.namespace

Specifies the root namespace on ZooKeeper

hiveserver2
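
For illustration only, enabling HiveServer2 dynamic service discovery typically combines the three parameters above in hive-site.xml; the ZooKeeper quorum below is a placeholder:

<property>
  <name>hive.server2.support.dynamic.service.discovery</name>
  <value>true</value>
</property>
<property>
  <name>hive.zookeeper.quorum</name>
  <value>zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181</value>
</property>
<property>
  <name>hive.server2.zookeeper.namespace</name>
  <value>hiveserver2</value>
</property>

With such settings, JDBC clients can typically connect with a URL of the form jdbc:hive2://<zk quorum>/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=hiveserver2.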

ranger-hive-audit.xml
Parameter Description Default value

xasecure.audit.destination.solr.batch.filespool.dir

The spool directory path

/srv/ranger/hdfs_plugin/audit_solr_spool

xasecure.audit.destination.solr.urls

The Solr URL to which audit logs are sent. Leave this property empty or set it to NONE when using ZooKeeper to connect to Solr

 — 

xasecure.audit.destination.solr.zookeepers

Specifies the ZooKeeper connection string for the Solr destination

 — 

xasecure.audit.destination.solr.force.use.inmemory.jaas.config

Uses in-memory JAAS configuration file to connect to Solr

 — 

xasecure.audit.is.enabled

Enables Ranger audit

true

xasecure.audit.jaas.Client.loginModuleControlFlag

Specifies whether the success of the module is required, requisite, sufficient, or optional

 — 

xasecure.audit.jaas.Client.loginModuleName

The name of the authenticator class

 — 

xasecure.audit.jaas.Client.option.keyTab

The name of the keytab file to get the principal’s secret key

 — 

xasecure.audit.jaas.Client.option.principal

The name of the principal to be used

 — 

xasecure.audit.jaas.Client.option.serviceName

Represents a user or a service that wants to log in

 — 

xasecure.audit.jaas.Client.option.storeKey

Set this to true if you want the keytab or the principal’s key to be stored in the subject’s private credentials

false

xasecure.audit.jaas.Client.option.useKeyTab

Set this to true if you want the module to get the principal’s key from the keytab

false

ranger-hive-security.xml
Parameter Description Default value

ranger.plugin.hive.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.hive.service.name

The name of the Ranger service containing policies for this instance

 — 

ranger.plugin.hive.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/hive/policycache

ranger.plugin.hive.policy.pollIntervalMs

Defines how often to poll for changes in policies

30000

ranger.plugin.hive.policy.rest.client.connection.timeoutMs

The Hive Plugin RangerRestClient connection timeout (in milliseconds)

120000

ranger.plugin.hive.policy.rest.client.read.timeoutMs

The Hive Plugin RangerRestClient read timeout (in milliseconds)

30000

xasecure.hive.update.xapolicies.on.grant.revoke

Controls Hive Ranger policy update from SQL Grant/Revoke commands

true

ranger.plugin.hive.policy.rest.ssl.config.file

The path to the RangerRestClient SSL config file for the Hive plugin

/etc/hive/conf/ranger-hive-policymgr-ssl.xml
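A minimal ranger-hive-security.xml sketch using the parameters above; the Ranger Admin URL and service name are placeholders:

<property>
  <name>ranger.plugin.hive.policy.rest.url</name>
  <value>https://ranger.example.com:6182</value>
</property>
<property>
  <name>ranger.plugin.hive.service.name</name>
  <value>adh_hive</value>
</property>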

ranger-hive-policymgr-ssl.xml
Parameter Description Default value

xasecure.policymgr.clientssl.keystore

The path to the keystore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.credential.file

The path to the keystore credentials file

/etc/hive/conf/ranger-hive.jceks

xasecure.policymgr.clientssl.truststore.credential.file

The path to the truststore credentials file

/etc/hive/conf/ranger-hive.jceks

xasecure.policymgr.clientssl.truststore

The path to the truststore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.password

The password to the keystore file

 — 

xasecure.policymgr.clientssl.truststore.password

The password to the truststore file

 — 

tez-site.xml
Parameter Description Default value

tez.am.resource.memory.mb

The amount of memory, in MB, that YARN allocates to the Tez Application Master. The size increases with the size of the DAG

 — 

tez.history.logging.service.class

Enables Tez to use the Timeline Server for History Logging

org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService

tez.lib.uris

HDFS paths containing the Tez JAR files

${fs.defaultFS}/apps/tez/tez-0.9.2.tar.gz

tez.task.resource.memory.mb

The amount of memory used by launched tasks in Tez containers. Usually this value is set in the DAG

 — 

tez.tez-ui.history-url.base

The URL where the Tez UI is hosted

 — 

tez.use.cluster.hadoop-libs

Specifies whether Tez uses the cluster Hadoop libraries

true

nginx.conf
Parameter Description Default value

ssl_certificate

The path to the SSL certificate for NGINX

/etc/ssl/certs/host_cert.cert

ssl_certificate_key

The path to the SSL certificate key for NGINX

/etc/ssl/host_cert.key
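
For reference, these values map to the standard NGINX ssl_* directives. A hedged sketch of the relevant server block (the listen port is a placeholder; ADCM generates the actual configuration):

server {
    listen              443 ssl;
    ssl_certificate     /etc/ssl/certs/host_cert.cert;
    ssl_certificate_key /etc/ssl/host_cert.key;
}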

Other
Parameter Description Default value

ACID Transactions

Defines whether to enable ACID transactions

false

Database type

The type of the external database used for Hive Metastore

mysql

Custom hive-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hive-site.xml

 — 

Custom hive-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hive-env.sh

 — 

Ranger plugin enabled

Whether or not Ranger plugin is enabled

false

Custom ranger-hive-audit.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hive-audit.xml

 — 

Custom ranger-hive-security.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hive-security.xml

 — 

Custom ranger-hive-policymgr-ssl.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hive-policymgr-ssl.xml

 — 

Custom tez-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file tez-site.xml

 — 

Impala

Parameter Description Default value

impala-env.sh

The contents of the impala-env.sh file that contains Impala environment settings

The Impala Daemon component
impalastore.conf
Parameter Description Default value

hostname

The hostname to use for the Impala daemon. If Kerberos is enabled, it is also used as a part of the Kerberos principal. If this option is not set, the system default is used

 — 

beeswax_port

The port on which Impala daemons serve Beeswax client requests

21000

fe_port

The frontend port of the Impala daemon

21000

be_port

Internal use only. Impala daemons use this port for Thrift-based communication with each other

22000

krpc_port

Internal use only. Impala daemons use this port for KRPC-based communication with each other

27000

hs2_port

The port on which Impala daemons serve HiveServer2 client requests

21050

hs2_http_port

The port used by client applications to transmit commands and receive results over HTTP via the HiveServer2 protocol

28000

enable_webserver

Enables or disables the Impala daemon web server. Its Web UI contains information about configuration settings, running and completed queries, and associated resource usage for them. It is primarily used for diagnosing query problems that can be traced to a particular node

True

webserver_require_spnego

Enables the Kerberos authentication for Hadoop HTTP web consoles for all roles of this service using the SPNEGO protocol. Use this option only if Kerberos is enabled for the HDFS service

False

webserver_port

The port where the Impala daemon web server is running

25000

catalog_service_host

The host where the Impala Catalog Service component is running

 — 

catalog_service_port

The port on which the Impala Catalog Service component listens

26000

state_store_host

The host where the Impala Statestore component is running

 — 

state_store_port

The port on which the Impala Statestore component is running

24000

state_store_subscriber_port

The port where StateStoreSubscriberService is running. StateStoreSubscriberService listens on this port for updates from the Statestore daemon

23030

scratch_dirs

The directory where Impala daemons write data to free up memory during large sorts, joins, aggregations, and other operations. The files are removed when the operation finishes. The amount of written data can potentially be large

/srv/impala/

log_dir

The directory where an Impala daemon places its log files

/var/log/impala/impalad/

log_filename

The prefix of the log file name. The full path is <log_dir>/<log_filename>

impalad

max_log_files

The number of log files that are kept for each severity level (INFO, WARNING, ERROR, and FATAL) before older log files are removed. The number should be greater than 1 so that at least the current log file remains open. If set to 0, all log files are retained and log rotation is disabled

10

audit_event_log_dir

The directory in which Impala daemon audit event log files are written if the Impala Audit Event Generation property is enabled

/var/log/impala/impalad/audit

minidump_path

The directory for storing Impala daemon Breakpad dumps

/var/log/impala-minidumps

lineage_event_log_dir

The directory in which the Impala daemon generates its lineage log files if the Impala Lineage Generation property is enabled

/var/log/impala/impalad/lineage

local_library_dir

The local directory into which an Impala daemon copies user-defined function (UDF) libraries from HDFS

/usr/lib/impala/udfs

max_lineage_log_file_size

The maximum size (in entries) of the Impala daemon lineage log file. When the size is exceeded, a new file is created

5000

max_audit_event_log_file_size

The maximum size (in queries) of the Impala Daemon audit event log file. When the size is exceeded, a new file is created

5000

fe_service_threads

The maximum number of concurrent client connections allowed. The parameter determines how many queries can run simultaneously. When more clients try to connect to Impala, the later arriving clients have to wait until previous clients disconnect. Setting the fe_service_threads value too high could negatively impact query latency

64

mem_limit

The memory limit (in bytes) for an Impala daemon, enforced by the daemon itself. This limit does not include memory consumed by the daemon’s embedded JVM. The Impala daemon uses up to this amount of memory for query processing, cached data, network buffers, background operations, and so on. If the limit is exceeded, queries are killed until memory usage falls below the limit

1473249280

idle_query_timeout

The time in seconds after which an idle query (no processing work is done and no updates are received from the client) is cancelled. If set to 0, idle queries are never expired

0

idle_session_timeout

The time in seconds after which Impala closes an idle session and cancels all running queries. If set to 0, idle sessions never expire

0

max_result_cache_size

The maximum number of query results a client can request to be cached on a per-query basis to support restarting fetches. This option guards against unreasonably large result caches. Requests exceeding this maximum are rejected

100000

max_cached_file_handles

The maximum number of cached HDFS file handles. Caching HDFS file handles reduces the number of new file handles opened and thus reduces the load on the HDFS NameNode. Each cached file handle consumes a small amount of memory. If set to 0, file handle caching is disabled

20000

unused_file_handle_timeout_sec

The maximum time in seconds during which an unused HDFS file handle remains in the HDFS file handle cache. When the underlying file for a cached file handle is deleted, the disk space may not be freed until the cached file handle is removed from the cache. This timeout allows the disk space occupied by deleted files to be freed in a predictable period of time. If set to 0, unused cached HDFS file handles are not removed

21600

statestore_subscriber_timeout_seconds

The timeout in seconds for Impala Daemon and Catalog Server connections to Statestore

30

default_query_options

A list of key/value pairs representing additional query options to pass to the Impala Daemon command line, separated by commas

default_file_format=parquet,default_transactional_type=none

load_auth_to_local_rules

If checked (True) and Kerberos is enabled for Impala, Impala uses the auth_to_local option from hadoop.security.auth_to_local rules of the HDFS configuration

True

catalog_topic_mode

The granularity of on-demand metadata fetches between the Impala Daemon coordinator and Impala Catalog Service. See Metadata management

minimal

use_local_catalog

Allows coordinators to cache metadata from Impala Catalog Service. If this is set to True, coordinators pull metadata as needed from catalogd and cache it locally. The cached metadata is automatically removed under memory pressure or after an expiration time. See Metadata management

True

abort_on_failed_audit_event

Specifies whether to shut down Impala if there is a problem with recording an audit event

False

max_minidumps

The maximum number of Breakpad dump files stored by the Impala daemon. A negative value or 0 is interpreted as an unlimited number

9

authorized_proxy_user_config

Specifies the set of authorized proxy users (the users who can impersonate other users during authorization), and users who they are allowed to impersonate. The example of syntax for the option is: authenticated_user1=delegated_user1,delegated_user2;authenticated_user2=*. See Configuring Impala delegation for clients. The list can contain short usernames or * to indicate all users

knox=*;zeppelin=*

queue_wait_timeout_ms

The maximum amount of time (in milliseconds) that a request waits to be admitted before timing out. Must be a positive integer

60000

disk_spill_encryption

Specifies whether to encrypt and verify the integrity of all data spilled to the disk as part of a query

False

abort_on_config_error

Specifies whether to abort Impala startup if there are incorrect configs or Impala is running on unsupported hardware

True

kerberos_reinit_interval

The number of minutes between reestablishing the ticket with the Kerberos server

60

principal

The service Kerberos principal

 — 

keytab_file

The service Kerberos keytab file

 — 

ssl_server_certificate

The path to the TLS/SSL file with the server certificate key used for TLS/SSL. It is used when Impala operates as a TLS/SSL server. The certificate file must be in the PEM format

 — 

ssl_private_key

The path to the TLS/SSL file with the private key used for TLS/SSL. It is used when Impala operates as a TLS/SSL server. The file must be in the PEM format

 — 

ssl_client_ca_certificate

The path to the certificate, in the PEM format, used to confirm the authenticity of SSL/TLS servers that the Impala daemons can connect to. Since the Impala daemons connect to each other, it should also include the CA certificate used to sign all the SSL/TLS certificates. SSL/TLS between Impala daemons cannot be enabled without this parameter

 — 

webserver_certificate_file

The path to the TLS/SSL file with the server certificate key used for TLS/SSL. It is used when the Impala daemon web server operates as a TLS/SSL server. The certificate file must be in the PEM format

 — 

webserver_private_key_file

The path to the TLS/SSL file with the private key used for TLS/SSL. It is used when the Impala daemon web server operates as a TLS/SSL server. The certificate file must be in the PEM format

 — 

ssl_minimum_version

The minimum version of TLS

TLSv1.2

Others
Parameter Description Default value

log4j.properties

Apache Log4j utility settings

log.threshold=INFO
main.logger=FA
impala.root.logger=DEBUG,FA
log4j.rootLogger=DEBUG,FA
log.dir=/var/log/impala/impalad
max.log.file.size=200MB
log4j.appender.FA=org.apache.log4j.FileAppender
log4j.appender.FA.File=/var/log/impalad/impalad.INFO
log4j.appender.FA.layout=org.apache.log4j.PatternLayout
log4j.appender.FA.layout.ConversionPattern=%p%d{MMdd HH:mm:ss.SSS'000'} %t %c] %m%n
log4j.appender.console=org.apache.log4j.ConsoleAppender
log4j.appender.console.target=System.err
log4j.appender.console.layout=org.apache.log4j.PatternLayout
log4j.appender.console.layout.ConversionPattern=%d{yy/MM/dd HH:mm:ss} %p %c{2}: %m%n

Enable custom ulimits

Switch on the corresponding toggle button to specify resource limits (ulimits) for the current process. If you do not set these values, the default system settings are used. Ulimit settings are described in the table below

[Manager]
DefaultLimitCPU=
DefaultLimitFSIZE=
DefaultLimitDATA=
DefaultLimitSTACK=
DefaultLimitCORE=
DefaultLimitRSS=
DefaultLimitNOFILE=
DefaultLimitAS=
DefaultLimitNPROC=
DefaultLimitMEMLOCK=
DefaultLimitLOCKS=
DefaultLimitSIGPENDING=
DefaultLimitMSGQUEUE=
DefaultLimitNICE=
DefaultLimitRTPRIO=
DefaultLimitRTTIME=
Ulimit settings
Parameter Description Corresponding option of the ulimit command in CentOS

DefaultLimitCPU

A limit in seconds on the amount of CPU time that a process can consume

cpu time ( -t)

DefaultLimitFSIZE

The maximum size of files that a process can create, in 512-byte blocks

file size ( -f)

DefaultLimitDATA

The maximum size of a process’s data segment, in kilobytes

data seg size ( -d)

DefaultLimitSTACK

The maximum stack size allocated to a process, in kilobytes

stack size ( -s)

DefaultLimitCORE

The maximum size of a core dump file allowed for a process, in 512-byte blocks

core file size ( -c)

DefaultLimitRSS

The maximum resident set size, in kilobytes

max memory size ( -m)

DefaultLimitNOFILE

The maximum number of open file descriptors allowed for the process

open files ( -n)

DefaultLimitAS

The maximum size of the process virtual memory (address space), in kilobytes

virtual memory ( -v)

DefaultLimitNPROC

The maximum number of processes

max user processes ( -u)

DefaultLimitMEMLOCK

The maximum memory size that can be locked for the process, in kilobytes. Memory locking ensures the memory is always in RAM and a swap file is not used

max locked memory ( -l)

DefaultLimitLOCKS

The maximum number of files locked by a process

file locks ( -x)

DefaultLimitSIGPENDING

The maximum number of signals that are pending for delivery to the calling thread

pending signals ( -i)

DefaultLimitMSGQUEUE

The maximum number of bytes in POSIX message queues. POSIX message queues allow processes to exchange data in the form of messages

POSIX message queues ( -q)

DefaultLimitNICE

The maximum NICE priority level that can be assigned to a process

scheduling priority ( -e)

DefaultLimitRTPRIO

The maximum real-time scheduling priority level

real-time priority ( -r)

DefaultLimitRTTIME

A limit, in microseconds, on the amount of CPU time that a process scheduled under a real-time scheduling policy may consume without making a blocking system call

 — 

The Impala Statestore component
statestore.conf
Parameter Description Default value

hostname

The hostname to use for the Statestore daemon. If Kerberos is enabled, it is also used as a part of the Kerberos principal. If this option is not set, the system default is used

 — 

state_store_host

The host where the Impala Statestore component is running

 — 

state_store_port

The port on which the Impala Statestore component is running

24000

catalog_service_host

The host where the Impala Catalog Service component is running

 — 

catalog_service_port

The port on which the Impala Catalog Service component listens

26000

enable_webserver

Enables or disables the Statestore daemon web server. Its Web UI contains information about memory usage, configuration settings, and ongoing health checks performed by Statestore

True

webserver_require_spnego

Enables the Kerberos authentication for Hadoop HTTP web consoles for all roles of this service using the SPNEGO protocol. Use this option only if Kerberos is enabled for the HDFS service

False

webserver_port

The port on which the Statestore web server is running

25010

log_dir

The directory where the Statestore daemon places its log files

/var/log/impala/statestored/

log_filename

The prefix of the log file name. The full path is <log_dir>/<log_filename>

statestored

max_log_files

The number of log files that are kept for each severity level (INFO, WARNING, ERROR, and FATAL) before older log files are removed. The number should be greater than 1 so that at least the current log file remains open. If set to 0, all log files are retained and log rotation is disabled

10

minidump_path

The directory for storing Statestore daemon Breakpad dumps

/var/log/impala-minidumps

max_minidumps

The maximum number of Breakpad dump files stored by Statestore daemon. A negative value or 0 is interpreted as an unlimited number

9

state_store_num_server_worker_threads

The number of worker threads for the thread manager of the Statestore Thrift server

4

state_store_pending_task_count_max

The maximum number of tasks allowed to be pending by the thread manager of the Statestore Thrift server. The 0 value allows an infinite number of pending tasks

0

kerberos_reinit_interval

The number of minutes between reestablishing the ticket with the Kerberos server

60

principal

The service Kerberos principal

 — 

keytab_file

The service Kerberos keytab file

 — 

ssl_server_certificate

The path to the TLS/SSL file with the server certificate key used for TLS/SSL. It is used when Impala operates as a TLS/SSL server. The certificate file must be in the PEM format

 — 

ssl_private_key

The path to the TLS/SSL file with the private key used for TLS/SSL. It is used when Impala operates as a TLS/SSL server. The file must be in the PEM format

 — 

ssl_client_ca_certificate

The path to the certificate, in the PEM format, used to confirm the authenticity of SSL/TLS servers that the Impala daemons can connect to. Since the Impala daemons connect to each other, it should also include the CA certificate used to sign all the SSL/TLS certificates. SSL/TLS between Impala daemons cannot be enabled without this parameter

 — 

webserver_certificate_file

The path to the TLS/SSL file with the server certificate key used for TLS/SSL. It is used when the Statestore web server operates as a TLS/SSL server. The certificate file must be in the PEM format

 — 

webserver_private_key_file

The path to the TLS/SSL file with the private key used for TLS/SSL. It is used when the Statestore web server operates as a TLS/SSL server. The certificate file must be in the PEM format

 — 

ssl_minimum_version

The minimum version of TLS

TLSv1.2

Others
Parameter Description Default value

Custom statestore.conf

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file statestore.conf

 — 

Enable custom ulimits

Switch on the corresponding toggle button to specify resource limits (ulimits) for the current process. If you do not set these values, the default system settings are used. Ulimit settings are described in the table below

[Manager]
DefaultLimitCPU=
DefaultLimitFSIZE=
DefaultLimitDATA=
DefaultLimitSTACK=
DefaultLimitCORE=
DefaultLimitRSS=
DefaultLimitNOFILE=
DefaultLimitAS=
DefaultLimitNPROC=
DefaultLimitMEMLOCK=
DefaultLimitLOCKS=
DefaultLimitSIGPENDING=
DefaultLimitMSGQUEUE=
DefaultLimitNICE=
DefaultLimitRTPRIO=
DefaultLimitRTTIME=
Ulimit settings
Parameter Description Corresponding option of the ulimit command in CentOS

DefaultLimitCPU

A limit in seconds on the amount of CPU time that a process can consume

cpu time ( -t)

DefaultLimitFSIZE

The maximum size of files that a process can create, in 512-byte blocks

file size ( -f)

DefaultLimitDATA

The maximum size of a process’s data segment, in kilobytes

data seg size ( -d)

DefaultLimitSTACK

The maximum stack size allocated to a process, in kilobytes

stack size ( -s)

DefaultLimitCORE

The maximum size of a core dump file allowed for a process, in 512-byte blocks

core file size ( -c)

DefaultLimitRSS

The maximum resident set size, in kilobytes

max memory size ( -m)

DefaultLimitNOFILE

The maximum number of open file descriptors allowed for the process

open files ( -n)

DefaultLimitAS

The maximum size of the process virtual memory (address space), in kilobytes

virtual memory ( -v)

DefaultLimitNPROC

The maximum number of processes

max user processes ( -u)

DefaultLimitMEMLOCK

The maximum memory size that can be locked for the process, in kilobytes. Memory locking ensures the memory is always in RAM and a swap file is not used

max locked memory ( -l)

DefaultLimitLOCKS

The maximum number of files locked by a process

file locks ( -x)

DefaultLimitSIGPENDING

The maximum number of signals that are pending for delivery to the calling thread

pending signals ( -i)

DefaultLimitMSGQUEUE

The maximum number of bytes in POSIX message queues. POSIX message queues allow processes to exchange data in the form of messages

POSIX message queues ( -q)

DefaultLimitNICE

The maximum NICE priority level that can be assigned to a process

scheduling priority ( -e)

DefaultLimitRTPRIO

The maximum real-time scheduling priority level

real-time priority ( -r)

DefaultLimitRTTIME

A limit, in microseconds, on the amount of CPU time that a process scheduled under a real-time scheduling policy may consume without making a blocking system call

 — 

The Impala Catalog Service component
catalogstore.conf
Parameter Description Default value

hostname

The hostname to use for the Catalog Service daemon. If Kerberos is enabled, it is also used as a part of the Kerberos principal. If this option is not set, the system default is used

 — 

state_store_host

The host where the Impala Statestore component is running

 — 

state_store_port

The port on which the Impala Statestore component is running

24000

catalog_service_host

The host where the Impala Catalog Service component is running

 — 

catalog_service_port

The port on which the Impala Catalog Service component listens

26000

enable_webserver

Enables or disables the Catalog Service web server. Its Web UI includes information about the databases, tables, and other objects managed by Impala, in addition to the resource usage and configuration settings of the Catalog Service

True

webserver_require_spnego

Enables the Kerberos authentication for Hadoop HTTP web consoles for all roles of this service using the SPNEGO protocol. Use this option only if Kerberos is enabled for the HDFS service

False

webserver_port

The port on which the Catalog Service web server is running

25020

log_dir

The directory where the Catalog Service daemon places its log files

/var/log/impala/catalogd/

log_filename

The prefix of the log file name. The full path is <log_dir>/<log_filename>

catalogd

max_log_files

The number of log files that are kept for each severity level (INFO, WARNING, ERROR, and FATAL) before older log files are removed. The number should be greater than 1 so that at least the current log file remains open. If set to 0, all log files are retained and log rotation is disabled

10

minidump_path

The directory for storing the Catalog Service daemon Breakpad dumps

/var/log/impala-minidumps

max_minidumps

The maximum number of Breakpad dump files stored by Catalog Service. A negative value or 0 is interpreted as an unlimited number

9

hms_event_polling_interval_s

When this parameter is set to a positive integer, Catalog Service fetches new notifications from Hive Metastore at the specified interval in seconds. If hms_event_polling_interval_s is set to 0, the automatic metadata invalidation and updates are disabled. See Metadata management

2

load_auth_to_local_rules

If checked (True) and Kerberos is enabled for Impala, Impala uses the auth_to_local option from hadoop.security.auth_to_local rules of the HDFS configuration

True

load_catalog_in_background

If it is set to True, the metadata is loaded in the background, even if that metadata is not required for any query. If False, the metadata is loaded when it is referenced for the first time

False

catalog_topic_mode

The granularity of on-demand metadata fetches between the Impala Daemon coordinator and Impala Catalog Service. See Metadata management

minimal

statestore_subscriber_timeout_seconds

The timeout in seconds for Impala Daemon and Catalog Server connections to Statestore

30

state_store_subscriber_port

The port where StateStoreSubscriberService is running. StateStoreSubscriberService listens on this port for updates from the Statestore daemon

23020

kerberos_reinit_interval

The number of minutes between reestablishing the ticket with the Kerberos server

60

principal

The service Kerberos principal

 — 

keytab_file

The service Kerberos keytab file

 — 

ssl_server_certificate

The path to the TLS/SSL file with the server certificate key used for TLS/SSL. It is used when Impala operates as a TLS/SSL server. The certificate file must be in the PEM format

 — 

ssl_private_key

The path to the TLS/SSL file with the private key used for TLS/SSL. It is used when Impala operates as a TLS/SSL server. The file must be in the PEM format

 — 

ssl_client_ca_certificate

The path to the certificate, in the PEM format, used to confirm the authenticity of SSL/TLS servers that the Impala daemons can connect to. Since the Impala daemons connect to each other, it should also include the CA certificate used to sign all the SSL/TLS certificates. SSL/TLS between Impala daemons cannot be enabled without this parameter

 — 

webserver_certificate_file

The path to the TLS/SSL file with the server certificate key used for TLS/SSL. It is used when the Catalog Service web server operates as a TLS/SSL server. The certificate file must be in the PEM format

 — 

webserver_private_key_file

The path to the TLS/SSL file with the private key used for TLS/SSL. It is used when the Catalog Service web server operates as a TLS/SSL server. The certificate file must be in the PEM format

 — 

ssl_minimum_version

The minimum version of TLS

TLSv1.2

Others
Parameter Description Default value

Custom catalogstore.conf

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file catalogstore.conf

 — 

Enable custom ulimits

Switch on the corresponding toggle button to specify resource limits (ulimits) for the current process. If you do not set these values, the default system settings are used. Ulimit settings are described in the table below

[Manager]
DefaultLimitCPU=
DefaultLimitFSIZE=
DefaultLimitDATA=
DefaultLimitSTACK=
DefaultLimitCORE=
DefaultLimitRSS=
DefaultLimitNOFILE=
DefaultLimitAS=
DefaultLimitNPROC=
DefaultLimitMEMLOCK=
DefaultLimitLOCKS=
DefaultLimitSIGPENDING=
DefaultLimitMSGQUEUE=
DefaultLimitNICE=
DefaultLimitRTPRIO=
DefaultLimitRTTIME=
Ulimit settings
Parameter Description Corresponding option of the ulimit command in CentOS

DefaultLimitCPU

A limit in seconds on the amount of CPU time that a process can consume

cpu time ( -t)

DefaultLimitFSIZE

The maximum size of files that a process can create, in 512-byte blocks

file size ( -f)

DefaultLimitDATA

The maximum size of a process’s data segment, in kilobytes

data seg size ( -d)

DefaultLimitSTACK

The maximum stack size allocated to a process, in kilobytes

stack size ( -s)

DefaultLimitCORE

The maximum size of a core dump file allowed for a process, in 512-byte blocks

core file size ( -c)

DefaultLimitRSS

The maximum resident set size, in kilobytes

max memory size ( -m)

DefaultLimitNOFILE

The maximum number of open file descriptors allowed for the process

open files ( -n)

DefaultLimitAS

The maximum size of the process virtual memory (address space), in kilobytes

virtual memory ( -v)

DefaultLimitNPROC

The maximum number of processes

max user processes ( -u)

DefaultLimitMEMLOCK

The maximum memory size that can be locked for the process, in kilobytes. Memory locking ensures the memory is always in RAM and a swap file is not used

max locked memory ( -l)

DefaultLimitLOCKS

The maximum number of files locked by a process

file locks ( -x)

DefaultLimitSIGPENDING

The maximum number of signals that are pending for delivery to the calling thread

pending signals ( -i)

DefaultLimitMSGQUEUE

The maximum number of bytes in POSIX message queues. POSIX message queues allow processes to exchange data in the form of messages

POSIX message queues ( -q)

DefaultLimitNICE

The maximum NICE priority level that can be assigned to a process

scheduling priority ( -e)

DefaultLimitRTPRIO

The maximum real-time scheduling priority level

real-time priority ( -r)

DefaultLimitRTTIME

A limit, in microseconds, on the amount of CPU time that a process scheduled under a real-time scheduling policy may consume without making a blocking system call

 — 

Kyuubi

The Kyuubi Server component
kyuubi-defaults.conf
Parameter Description Default value

kyuubi.frontend.rest.bind.port

Port on which the REST frontend service runs

10099

kyuubi.frontend.thrift.binary.bind.port

Port on which the Thrift frontend service runs via a binary protocol

10099

kyuubi.frontend.thrift.http.bind.port

Port on which the Thrift frontend service runs via HTTP

10010

kyuubi.frontend.thrift.http.path

The path component of the URL endpoint for the HTTP version of Thrift

cliservice

kyuubi.engine.share.level

An engine share level. Possible values: CONNECTION (one engine per connection), USER (one engine per user), GROUP (one engine per group), SERVER (one engine per server)

USER

kyuubi.engine.type

An engine type supported by Kyuubi. Possible values: SPARK_SQL, FLINK_SQL, TRINO, HIVE_SQL, JDBC

SPARK_SQL

kyuubi.operation.language

Programming language used to interpret inputs. Possible values: SQL, SCALA, PYTHON

SQL

kyuubi.frontend.protocols

A comma-separated list for supported frontend protocols. Possible values: THRIFT_BINARY, THRIFT_HTTP, REST

THRIFT_BINARY

kyuubi.frontend.thrift.binary.ssl.disallowed.protocols

Forbidden SSL versions for Thrift binary frontend

SSLv2,SSLv3,TLSv1.1

kyuubi.frontend.thrift.http.ssl.protocol.blacklist

Forbidden SSL versions for Thrift HTTP frontend

SSLv2,SSLv3,TLSv1.1

kyuubi.ha.addresses

External Kyuubi instance addresses

<hostname_1>:2181, …, <hostname_N>:2181

kyuubi.ha.namespace

The root directory for the service to deploy its instance URI

kyuubi

kyuubi.metadata.store.jdbc.database.type

A database type for the server metadata store. Possible values: SQLITE, MYSQL, POSTGRESQL

POSTGRESQL

kyuubi.metadata.store.jdbc.url

A JDBC URL for the server metadata store

jdbc:postgresql://{{ groups['adpg.adpg'][0] | d(omit) }}:5432/kyuubi

kyuubi.metadata.store.jdbc.driver

A JDBC driver classname for the server metadata store

org.postgresql.Driver

kyuubi.metadata.store.jdbc.user

A username for the server metadata store

kyuubi

kyuubi.metadata.store.jdbc.password

A password for the server metadata store

 — 

kyuubi.frontend.thrift.binary.ssl.enabled

Indicates whether to use the SSL encryption in the Thrift binary mode

false

kyuubi.frontend.thrift.http.use.SSL

Indicates whether to use the SSL encryption in the Thrift HTTP mode

false

kyuubi.frontend.ssl.keystore.type

Type of the SSL certificate keystore

 — 

kyuubi.frontend.ssl.keystore.path

Path to the SSL certificate keystore

 — 

kyuubi.frontend.ssl.keystore.password

Password for the SSL certificate keystore

 — 

kyuubi.frontend.thrift.http.ssl.keystore.path

Path to the SSL certificate keystore

 — 

kyuubi.frontend.thrift.http.ssl.keystore.password

Password for the SSL certificate keystore

 — 

kyuubi.authentication

Authentication type. Possible values: NONE, KERBEROS

NONE

kyuubi.ha.zookeeper.acl.enabled

Indicates whether the ZooKeeper ensemble is kerberized

false

kyuubi.ha.zookeeper.auth.type

ZooKeeper authentication type. Possible values: NONE, KERBEROS

NONE

kyuubi.ha.zookeeper.auth.principal

Kerberos principal name used for ZooKeeper authentication

 — 

kyuubi.ha.zookeeper.auth.keytab

Path to Kyuubi Server’s keytab used for ZooKeeper authentication

 — 

kyuubi.kinit.principal

Name of the Kerberos principal

 — 

kyuubi.kinit.keytab

Path to Kyuubi Server’s keytab

 — 

kyuubi.spnego.principal

Name of the SPNego service principal. Set only if using SPNego in authentication

 — 

kyuubi.spnego.keytab

Path to the SPNego service keytab. Set only if using SPNego in authentication

 — 

kyuubi.engine.hive.java.options

Extra Java options for the Hive query engine

 — 
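
As a sketch only, a kyuubi-defaults.conf fragment combining several of the parameters above; the values and host names are illustrative, not recommendations:

kyuubi.frontend.protocols    THRIFT_BINARY,REST
kyuubi.engine.type           SPARK_SQL
kyuubi.engine.share.level    USER
kyuubi.ha.addresses          zk1.example.com:2181,zk2.example.com:2181
kyuubi.ha.namespace          kyuubi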

kyuubi-env.conf
Parameter Description Default value

KYUUBI_HOME

Kyuubi home directory

/usr/lib/kyuubi

KYUUBI_CONF_DIR

Directory that stores Kyuubi configurations

/etc/kyuubi/conf

KYUUBI_LOG_DIR

Kyuubi server log directory

/var/log/kyuubi

KYUUBI_PID_DIR

Directory that stores the Kyuubi instance .pid-file

/var/run/kyuubi

KYUUBI_ADDITIONAL_CLASSPATH

Path to a directory with additional SSM libraries

/usr/lib/ssm/lib/smart*

HADOOP_HOME

Hadoop home directory

/usr/lib/hadoop

HADOOP_LIB_DIR

Directory that stores Hadoop libraries

${HADOOP_HOME}/lib

KYUUBI_JAVA_OPTS

Java parameters for Kyuubi

-Djava.library.path=${HADOOP_LIB_DIR}/native/ -Djava.io.tmpdir={{ cluster.config.java_tmpdir | d('/tmp') }}

HADOOP_CLASSPATH

A common $HADOOP_CLASSPATH variable value followed by KYUUBI_ADDITIONAL_CLASSPATH

$HADOOP_CLASSPATH:/usr/lib/ssm/lib/smart*

HADOOP_CONF_DIR

Directory that stores Hadoop configurations

/etc/hadoop/conf

SPARK_HOME

Spark home directory

/usr/lib/spark3

SPARK_CONF_DIR

Directory that stores Spark configurations

/etc/spark3/conf

FLINK_HOME

Flink home directory

/usr/lib/flink

FLINK_CONF_DIR

Directory that stores Flink configurations

/etc/flink/conf

FLINK_HADOOP_CLASSPATH

Additional Hadoop .jar files required to use the Kyuubi Flink engine

$(hadoop classpath):/usr/lib/ssm/lib/smart*

HIVE_HOME

Hive home directory

/usr/lib/hive

HIVE_CONF_DIR

Directory that stores Hive configurations

/etc/hive/conf

HIVE_HADOOP_CLASSPATH

Additional Hadoop .jar files required to use the Kyuubi Hive engine

$(hadoop classpath):/etc/tez/conf/:/usr/lib/tez/*:/usr/lib/tez/lib/*:/usr/lib/ssm/lib/smart*

MySQL

root user
Parameter Description Default value

Password

The root password

 — 

Solr

solr-env.sh
Parameter Description Default value

SOLR_HOME

The location for index data and configs

/srv/solr/server

SOLR_AUTH_TYPE

Specifies the authentication type for Solr

 — 

SOLR_AUTHENTICATION_OPTS

Solr authentication options

 — 

GC_TUNE

JVM parameters for Solr

-XX:-UseLargePages

SOLR_SSL_KEY_STORE

The path to the Solr keystore file (.jks)

 — 

SOLR_SSL_KEY_STORE_PASSWORD

The password to the Solr keystore file

 — 

SOLR_SSL_TRUST_STORE

The path to the Solr truststore file (.jks)

 — 

SOLR_SSL_TRUST_STORE_PASSWORD

The password to the Solr truststore file

 — 

SOLR_SSL_NEED_CLIENT_AUTH

Defines whether client authentication is required

false

SOLR_SSL_WANT_CLIENT_AUTH

Allows clients to authenticate but does not require it

false

SOLR_SSL_CLIENT_HOSTNAME_VERIFICATION

Defines whether to enable hostname verification

false

SOLR_HOST

Specifies the host name of the Solr server

 — 
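
A hedged solr-env.sh sketch enabling SSL with the variables above; the store paths and passwords are placeholders:

SOLR_SSL_KEY_STORE=/etc/solr/conf/solr-keystore.jks
SOLR_SSL_KEY_STORE_PASSWORD=changeit
SOLR_SSL_TRUST_STORE=/etc/solr/conf/solr-truststore.jks
SOLR_SSL_TRUST_STORE_PASSWORD=changeit
SOLR_SSL_NEED_CLIENT_AUTH=false
SOLR_SSL_WANT_CLIENT_AUTH=false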

External zookeeper
Parameter Description Default value

ZK_HOST

Comma-separated locations of all servers in the ensemble and the ports on which they communicate. You can put ZooKeeper chroot at the end of your ZK_HOST connection string. For example, host1.mydomain.com:2181,host2.mydomain.com:2181,host3.mydomain.com:2181/solr

 — 

Solr server heap memory settings
Parameter Description Default value

Solr Server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Solr Server

-Xms512m -Xmx512m

ranger-solr-audit.xml
Parameter Description Default value

xasecure.audit.solr.solr_url

A path to a Solr collection to store audit logs

 — 

xasecure.audit.solr.async.max.queue.size

The maximum size of internal queue used for storing audit logs

1

xasecure.audit.solr.async.max.flush.interval.ms

The maximum time interval between flushes to disk (in milliseconds)

100

ranger-solr-security.xml
Parameter Description Default value

ranger.plugin.solr.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.solr.service.name

The name of the Ranger service containing policies for this instance

 — 

ranger.plugin.solr.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/yarn/policycache

ranger.plugin.solr.policy.pollIntervalMs

Defines how often to poll for changes in policies

30000

ranger.plugin.solr.policy.rest.client.connection.timeoutMs

The Solr Plugin RangerRestClient connection timeout (in milliseconds)

120000

ranger.plugin.solr.policy.rest.client.read.timeoutMs

The Solr Plugin RangerRestClient read timeout (in milliseconds)

30000

ranger-solr-policymgr-ssl.xml
Parameter Description Default value

xasecure.policymgr.clientssl.keystore

The path to the keystore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.credential.file

The path to the keystore credentials file

/etc/solr/conf/ranger-solr.jceks

xasecure.policymgr.clientssl.truststore.credential.file

The path to the truststore credentials file

/etc/solr/conf/ranger-solr.jceks

xasecure.policymgr.clientssl.truststore

The path to the truststore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.password

The password to the keystore file

 — 

xasecure.policymgr.clientssl.truststore.password

The password to the truststore file

 — 

Other
Parameter Description Default value

solr.xml

The content of solr.xml

Custom solr-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file solr-env.sh

 — 

Ranger plugin enabled

Enables the Ranger plugin

false

Spark

Common
Parameter Description Default value

Dynamic allocation (spark.dynamicAllocation.enabled)

Defines whether to use dynamic resource allocation, which scales the number of executors registered with this application up and down based on the workload

false

spark-defaults.conf
Parameter Description Default value

spark.yarn.archive

The archive containing needed Spark JARs for distribution to the YARN cache. If set, this configuration replaces spark.yarn.jars and the archive is used in all the application containers. The archive should contain JAR files in its root directory. The archive can also be hosted on HDFS to speed up file distribution

hdfs:///apps/spark/spark-yarn-archive.tgz

spark.master

The cluster manager to connect to

yarn

spark.yarn.historyServer.address

Spark History server address

 — 

spark.dynamicAllocation.enabled

Defines whether to use dynamic resource allocation, which scales the number of executors registered with this application up and down based on the workload

false

spark.shuffle.service.enabled

Enables the external shuffle service. This service preserves the shuffle files written by executors so that executors can be safely removed, or so that shuffle fetches can continue in the event of executor failure. The external shuffle service must be set up in order to enable it

false

spark.eventLog.enabled

Defines whether to log Spark events, useful for reconstructing the Web UI after the application has finished

true

spark.eventLog.dir

The base directory where Spark events are logged, if spark.eventLog.enabled=true. Within this base directory, Spark creates a sub-directory for each application, and logs the events specific to the application in this directory. You may want to set this to a unified location like an HDFS directory so history files can be read by the History Server

hdfs:///var/log/spark/apps

spark.serializer

The class to use for serializing objects that will be sent over the network or need to be cached in serialized form. The default of Java serialization works with any Serializable Java object but is quite slow, so we recommend using org.apache.spark.serializer.KryoSerializer and configuring Kryo serialization when speed is necessary. Can be any subclass of org.apache.spark.Serializer

org.apache.spark.serializer.KryoSerializer

spark.dynamicAllocation.executorIdleTimeout

If dynamic allocation is enabled and an executor has been idle for more than this duration, the executor will be removed. For more details, see Spark documentation

120s

spark.dynamicAllocation.cachedExecutorIdleTimeout

If dynamic allocation is enabled and an executor which has cached data blocks has been idle for more than this duration, the executor will be removed. For more details, see Spark documentation

600s

spark.history.provider

The name of the class that implements the application history backend. Currently there is only one implementation provided with Spark that looks for application logs stored in the file system

org.apache.spark.deploy.history.FsHistoryProvider

spark.history.fs.cleaner.enabled

Specifies whether the History Server should periodically clean up event logs from storage

true

spark.history.store.path

A local directory where application history data is cached. If set, the History Server will store application data on disk instead of keeping it in memory. The data written to disk will be re-used if the History Server restarts

/var/log/spark/history

spark.driver.extraClassPath

Extra classpath entries to prepend to the classpath of the driver

/usr/lib/hive/lib/hive-shims-scheduler.jar:/usr/lib/hadoop-yarn/hadoop-yarn-server-resourcemanager.jar

spark.history.ui.port

The port number of the History Server web UI

18082

spark.history.fs.logDirectory

The log directory of the History Server

hdfs:///var/log/spark/apps

spark.sql.hive.metastore.jars

The location of the JARs that should be used to instantiate HiveMetastoreClient

/usr/lib/hive/lib/*

spark.sql.hive.metastore.version

The Hive Metastore version

3.0.0

spark.driver.extraLibraryPath

The path to extra native libraries for driver

/usr/lib/hadoop/lib/native/

spark.yarn.am.extraLibraryPath

The path to extra native libraries for Application Master

/usr/lib/hadoop/lib/native/

spark.executor.extraLibraryPath

The path to extra native libraries for Executor

/usr/lib/hadoop/lib/native/

spark.yarn.appMasterEnv.HIVE_CONF_DIR

A directory on the Application Master with Hive configs required for running Hive in the cluster mode

/etc/spark/conf

spark.yarn.historyServer.allowTracking

Allows using the Spark History Server as the tracking UI even if the web UI is disabled for a job

True

spark.ssl.enabled

Defines whether to use SSL for Spark

false

spark.ssl.protocol

TLS protocol to be used. The protocol must be supported by JVM

TLSv1.2

spark.ssl.ui.port

The port on which the SSL service listens

4040

spark.ssl.historyServer.port

The port to access History Server web UI

18082

spark.ssl.keyPassword

The password to the private key in the key store

 — 

spark.ssl.keyStore

The path to the keystore file

 — 

spark.ssl.keyStoreType

The type of the keystore

JKS

spark.ssl.trustStorePassword

The password to the truststore used by Spark

 — 

spark.ssl.trustStore

The path to the truststore file

 — 

spark.ssl.trustStoreType

The type of the truststore

JKS

spark.history.kerberos.enabled

Indicates whether the History Server should use Kerberos to login. This is required if the History Server is accessing HDFS files on a secure Hadoop cluster

false

spark.acls.enable

Enables Spark ACL

false

spark.modify.acls

Defines who has access to modify a running Spark application

spark,hdfs

spark.modify.acls.groups

A comma-separated list of user groups that have modify access to the Spark application

spark,hdfs

spark.history.ui.acls.enable

Specifies whether ACLs should be checked to authorize users viewing the applications in the History Server. If enabled, access control checks are performed regardless of what the individual applications had set for spark.ui.acls.enable. If disabled, no access control checks are made for any application UIs available through the History Server

false

spark.history.ui.admin.acls

A comma-separated list of users that have view access to all the Spark applications in History Server

spark,hdfs,dr.who

spark.history.ui.admin.acls.groups

A comma-separated list of groups that have view access to all the Spark applications in History Server

spark,hdfs,dr.who

spark.ui.view.acls

A comma-separated list of users that have view access to the Spark application. By default, only the user that started the Spark job has view access. Using * as a value means that any user can have view access to this Spark job

spark,hdfs,dr.who

spark.ui.view.acls.groups

A comma-separated list of groups that have view access to the Spark web UI to view the Spark Job details. This can be used if you have a set of administrators or developers or users who can monitor the Spark job submitted. Using * in the list means any user in any group can view the Spark job details on the Spark web UI. The user groups are obtained from the instance of the groups mapping provider specified by spark.user.groups.mapping

spark,hdfs,dr.who
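
The keys above use the standard spark-defaults.conf syntax: one property per line, with the key and the value separated by whitespace. A minimal illustrative fragment, assuming a hypothetical keystore path (example values only, not recommended settings):

    spark.history.ui.port          18082
    spark.history.fs.logDirectory  hdfs:///var/log/spark/apps
    spark.ssl.enabled              true
    spark.ssl.keyStore             /etc/ssl/spark/keystore.jks
    spark.ssl.keyStoreType         JKS
    spark.ui.view.acls             spark,hdfs,dr.who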

Spark heap memory settings
Parameter Description Default value

Spark History Server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Spark History Server

1G

Spark Thrift Server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Spark Thrift Server

1G

Livy Server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Livy Server

-Xms300m -Xmx4G

livy.conf
Parameter Description Default value

livy.server.host

The host address to start the Livy server. By default, Livy will bind to all network interfaces

0.0.0.0

livy.server.port

The port to run the Livy server

8998

livy.spark.master

The Spark master to use for Livy sessions

yarn-cluster

livy.impersonation.enabled

Defines if Livy should impersonate users when creating a new session

false

livy.server.csrf-protection.enabled

Defines whether to enable the CSRF protection. If enabled, clients should add the X-Requested-By HTTP header for POST/DELETE/PUT/PATCH HTTP methods

true

livy.repl.enable-hive-context

Defines whether to enable HiveContext in the Livy interpreter. If set to true, hive-site.xml is detected automatically on the user request and added to the Livy server classpath

true

livy.server.recovery.mode

Sets the recovery mode for Livy

recovery

livy.server.recovery.state-store

Defines where Livy should store the state for recovery

filesystem

livy.server.recovery.state-store.url

For the filesystem state store, the path of the state store directory. Do not use a filesystem that does not support atomic rename like S3. For example: file:///tmp/livy or hdfs:///. For ZooKeeper, specify the address to the ZooKeeper servers. For example: host1:port1,host2:port2

/livy-recovery

livy.server.auth.type

Sets the Livy authentication type

 — 

livy.server.access_control.enabled

Defines whether to enable the access control for a Livy server. If set to true, then all the incoming requests will be checked if the requested user has permission

false

livy.server.access_control.users

Users allowed to access Livy. By default, any user is allowed to access Livy. To limit access, list all the permitted users separated by commas

livy,hdfs,spark

livy.superusers

A list of comma-separated users that have the permissions to change other users' submitted sessions, for example, submitting statements, deleting the session, and so on

livy,hdfs,spark

livy.keystore

A path to the keystore file. The path can be absolute or relative to the directory in which the process is started

 — 

livy.keystore.password

The password to access the keystore

 — 

livy.key-password

The password to access the key in the keystore

 — 

livy.server.thrift.ssl.protocol.blacklist

The list of banned TLS protocols

SSLv2,SSLv3,TLSv1,TLSv1.1
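
These properties follow the livy.conf syntax of one key = value pair per line. A short illustrative fragment based on the parameters described above (example values only):

    livy.server.port = 8998
    livy.spark.master = yarn-cluster
    livy.server.recovery.mode = recovery
    livy.server.recovery.state-store = filesystem
    livy.server.recovery.state-store.url = /livy-recovery
    livy.server.access_control.enabled = false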

Other
Parameter Description Default value

Custom spark-defaults.conf

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file spark-defaults.conf

 — 

spark-env.sh

Enter the contents for the spark-env.sh file that is used to initialize environment variables on worker nodes

spark-env.sh

Custom livy.conf

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file livy.conf

 — 

livy-env.sh

Enter the contents for the livy-env.sh file that is used to prepare the environment for Livy startup

livy-env.sh

thriftserver-env.sh

Enter the contents for the thriftserver-env.sh file that is used to prepare the environment for Thrift server startup

thriftserver-env.sh

spark-history-env.sh

Enter the contents for the spark-history-env.sh file that is used to prepare the environment for History Server startup

spark-history-env.sh
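
The *-env.sh files listed above are plain Bash scripts that are sourced before the corresponding daemon starts. An illustrative spark-env.sh fragment (the values are examples, not defaults shipped with the service):

    # Example spark-env.sh contents; adjust the paths to your environment
    export HADOOP_CONF_DIR=/etc/hadoop/conf
    export SPARK_LOG_DIR=/var/log/spark
    export SPARK_DAEMON_MEMORY=1g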

Spark3

Common
Parameter Description Default value

Dynamic allocation (spark.dynamicAllocation.enabled)

Defines whether to use dynamic resource allocation that scales the number of executors registered with this application up and down based on the workload

false

spark-defaults.conf
Parameter Description Default value

spark.yarn.archive

The archive containing all the required Spark JARs for distribution to the YARN cache. If set, this configuration replaces spark.yarn.jars and the archive is used in all the application containers. The archive should contain JAR files in its root directory. The archive can also be hosted on HDFS to speed up file distribution

hdfs:///apps/spark/spark3-yarn-archive.tgz

spark.yarn.historyServer.address

Spark History server address

 — 

spark.master

The cluster manager to connect to

yarn

spark.dynamicAllocation.enabled

Defines whether to use dynamic resource allocation that scales the number of executors registered with this application up and down based on the workload

false

spark.shuffle.service.enabled

Enables the external shuffle service. This service preserves the shuffle files written by executors so that executors can be safely removed, or so that shuffle fetches can continue in the event of executor failure. The external shuffle service must be set up in order to enable it

false

spark.eventLog.enabled

Defines whether to log Spark events, useful for reconstructing the Web UI after the application has finished

true

spark.eventLog.dir

The base directory where Spark events are logged, if spark.eventLog.enabled=true. Within this base directory, Spark creates a sub-directory for each application, and logs the events specific to the application in this directory. You may want to set this to a unified location like an HDFS directory so history files can be read by the History Server

hdfs:///var/log/spark/apps

spark.dynamicAllocation.executorIdleTimeout

If dynamic allocation is enabled and an executor has been idle for more than this duration, the executor will be removed. For more details, see Spark documentation

120s

spark.dynamicAllocation.cachedExecutorIdleTimeout

If dynamic allocation is enabled and an executor which has cached data blocks has been idle for more than this duration, the executor will be removed. For more details, see Spark documentation

600s

spark.history.provider

The name of the class that implements the application history backend. Currently there is only one implementation provided with Spark that looks for application logs stored in the file system

org.apache.spark.deploy.history.FsHistoryProvider

spark.history.fs.cleaner.enabled

Specifies whether the History Server should periodically clean up event logs from storage

true

spark.history.store.path

A local directory where to cache application history data. If set, the History Server will store application data on disk instead of keeping it in memory. The data written to disk will be re-used in case of the History Server restart

/var/log/spark3/history

spark.serializer

The class used for serializing objects that will be sent over the network or need to be cached in the serialized form. By default, works with any Serializable Java object but it may be quite slow, so we recommend using org.apache.spark.serializer.KryoSerializer and configuring Kryo serialization when speed is necessary. Can be any subclass of org.apache.spark.Serializer

org.apache.spark.serializer.KryoSerializer

spark.driver.extraClassPath

Extra classpath entries to prepend to the classpath of the driver

/usr/lib/hive/lib/hive-shims-scheduler.jar:/usr/lib/hadoop-yarn/hadoop-yarn-server-resourcemanager.jar

spark.history.ui.port

The port number of the History Server web UI

18092

spark.ui.port

The port number of the Thrift Server web UI

4140

spark.history.fs.logDirectory

The log directory of the History Server

hdfs:///var/log/spark/apps

spark.sql.hive.metastore.jars

The location of the JARs that should be used to instantiate HiveMetastoreClient

path

spark.sql.hive.metastore.jars.path

A list of comma-separated paths to JARs used to instantiate HiveMetastoreClient

file:///usr/lib/hive/lib/*.jar

spark.sql.hive.metastore.version

The Hive Metastore version

3.1.2

spark.driver.extraLibraryPath

The path to extra native libraries for driver

/usr/lib/hadoop/lib/native/

spark.yarn.am.extraLibraryPath

The path to extra native libraries for Application Master

/usr/lib/hadoop/lib/native/

spark.executor.extraLibraryPath

The path to extra native libraries for Executor

/usr/lib/hadoop/lib/native/

spark.yarn.appMasterEnv.HIVE_CONF_DIR

A directory on the Application Master with Hive configs required for running Hive in the cluster mode

/etc/spark3/conf

spark.yarn.historyServer.allowTracking

Allows using the Spark History Server as the tracking UI even if the web UI is disabled for a job

True

spark.connect.grpc.binding.port

The port number to connect to Spark Connect via gRPC

15002

spark.history.kerberos.enabled

Indicates whether the History Server should use Kerberos to log in. This is required if the History Server is accessing HDFS files on a secure Hadoop cluster

false

spark.acls.enable

Defines whether Spark ACLs should be enabled. If enabled, checks to see if the user has access permissions to view or modify the job. Note this requires the user to be known, so if the user comes across as null no checks are done. Filters can be used within the UI to authenticate and set the user

false

spark.modify.acls

Defines who has access to modify a running Spark application

spark,hdfs

spark.modify.acls.groups

A comma-separated list of user groups that have modify access to the Spark application

spark,hdfs

spark.history.ui.acls.enable

Specifies whether ACLs should be checked to authorize users viewing the applications in the History Server. If enabled, access control checks are performed regardless of what the individual applications had set for spark.ui.acls.enable. If disabled, no access control checks are made for any application UIs available through the History Server

false

spark.history.ui.admin.acls

A comma-separated list of users that have view access to all the Spark applications in History Server

spark,hdfs,dr.who

spark.history.ui.admin.acls.groups

A comma-separated list of groups that have view access to all the Spark applications in History Server

spark,hdfs,dr.who

spark.ui.view.acls

A comma-separated list of users that have view access to the Spark application. By default, only the user that started the Spark job has view access. Using * as a value means that any user can have view access to this Spark job

spark,hdfs,dr.who

spark.ui.view.acls.groups

A comma-separated list of groups that have view access to the Spark web UI to view the Spark Job details. This can be used if you have a set of administrators or developers or users who can monitor the Spark job submitted. Using * in the list means any user in any group can view the Spark job details on the Spark web UI. The user groups are obtained from the instance of the groups mapping provider specified by spark.user.groups.mapping

spark,hdfs,dr.who

spark.ssl.keyPassword

The password to the private key in the keystore

 — 

spark.ssl.keyStore

Path to the keystore file. The path can be absolute or relative to the directory in which the process is started

 — 

spark.ssl.keyStoreType

The type of keystore used

JKS

spark.ssl.trustStorePassword

The password to the truststore used by Spark

 — 

spark.ssl.trustStoreType

The type of the truststore

JKS

spark.ssl.enabled

Defines whether to use SSL for Spark

 — 

spark.ssl.protocol

Defines the TLS protocol to use. The protocol must be supported by JVM

TLSv1.2

spark.ssl.ui.port

The port number used by the Spark web UI when SSL is enabled

4041

spark.ssl.historyServer.port

The port number used by the Spark History Server web UI when SSL is enabled

18092
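
As with Spark, these properties are written to spark-defaults.conf as whitespace-separated key-value pairs. An illustrative fragment that enables dynamic allocation together with the external shuffle service (example values only):

    spark.master                                 yarn
    spark.dynamicAllocation.enabled              true
    spark.shuffle.service.enabled                true
    spark.dynamicAllocation.executorIdleTimeout  120s
    spark.sql.hive.metastore.jars                path
    spark.sql.hive.metastore.jars.path           file:///usr/lib/hive/lib/*.jar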

livy.conf
Parameter Description Default value

livy.server.host

The host address to start the Livy server. By default, Livy will bind to all network interfaces

0.0.0.0

livy.server.port

The port to run the Livy server

8999

livy.spark.master

The Spark master to use for Livy sessions

yarn

livy.impersonation.enabled

Defines if Livy should impersonate users when creating a new session

true

livy.server.csrf-protection.enabled

Defines whether to enable the CSRF protection. If enabled, clients should add the X-Requested-By HTTP header for POST/DELETE/PUT/PATCH HTTP methods

true

livy.repl.enable-hive-context

Defines whether to enable HiveContext in the Livy interpreter. If set to true, hive-site.xml is detected automatically on the user request and added to the Livy server classpath

true

livy.server.recovery.mode

Sets the recovery mode for Livy

recovery

livy.server.recovery.state-store

Defines where Livy should store the state for recovery

filesystem

livy.server.recovery.state-store.url

For the filesystem state store, the path of the state store directory. Do not use a filesystem that does not support atomic rename like S3. For example: file:///tmp/livy or hdfs:///. For ZooKeeper, specify the address to the ZooKeeper servers. For example: host1:port1,host2:port2

/livy-recovery

livy.server.auth.type

Sets the Livy authentication type

 — 

livy.server.access_control.enabled

Defines whether to enable the access control for a Livy server. If set to true, then all the incoming requests will be checked if the requested user has permission

false

livy.server.access_control.users

Users allowed to access Livy. By default, any user is allowed to access Livy. To limit access, list all the permitted users separated by commas

livy,hdfs,spark

livy.superusers

A list of comma-separated users that have the permissions to change other user’s submitted sessions, for example, submitting statements, deleting the session, and so on

livy,hdfs,spark

livy.keystore

A path to the keystore file. The path can be absolute or relative to the directory in which the process is started

 — 

livy.keystore.password

The password to access the keystore

 — 

livy.key-password

The password to access the key in the keystore

 — 

livy.server.thrift.ssl.protocol.blacklist

The list of banned TLS protocols

SSLv2,SSLv3,TLSv1,TLSv1.1

thrift-server.conf
Parameter Description Default value

thrift.server.port

The port number used for communication with Spark3 Thrift Server

10116

Spark heap memory settings
Parameter Description Default value

Spark History Server Heap Memory

Sets the maximum Java heap size for Spark History Server

1G

Other
Parameter Description Default value

Custom spark-defaults.conf

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file spark-defaults.conf

 — 

Custom log4j2.properties

The contents of the log4j2.properties file used for logging the Spark3 activity

log4j2.properties

spark-env.sh

The contents of the spark-env.sh file used to initialize environment variables on worker nodes

spark-env.sh

Custom livy.conf

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file livy.conf

 — 

livy-env.sh

The contents of the livy-env.sh file used to initialize environment variables for the Livy server operation

livy-env.sh

spark-history-env.sh

The contents of the spark-history-env.sh file used to initialize environment variables for the Spark3 History Server operation

spark-history-env.sh

thriftserver-env.sh

The contents of the thriftserver-env.sh file used to initialize environment variables for the Spark3 Thrift Server operation

thriftserver-env.sh

SSM

Credentials Encryption
Parameter Description Default value

Credential provider path

The path to a keystore file used to encrypt credentials

jceks://file/etc/ssm/conf/ssm.jceks

Custom jceks

Set to true to use a custom JCEKS file. Set to false to use the auto-generated JCEKS keystore

false

Password file name

The name of the file that stores a password to access the keystore

ssm_credstore_pass
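
When a custom JCEKS file is used, credentials can be added to it with the standard Hadoop credential CLI. A hedged example (the alias name smart.metastore.password is hypothetical; use the alias expected by your SSM installation):

    hadoop credential create smart.metastore.password \
      -provider jceks://file/etc/ssm/conf/ssm.jceks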

smart-site.xml
Parameter Description Default value

smart.hadoop.conf.path

The path to the Hadoop configuration directory

/etc/hadoop/conf

smart.conf.dir

The path to the SSM configuration directory

/etc/ssm/conf

smart.server.rpc.address

The RPC address of the SSM Server

0.0.0.0:7042

smart.server.http.address

The HTTP address (web UI) of the SSM Server

0.0.0.0:7045

smart.agent.master.address

The active SSM server’s address

<hostname>

smart.agent.address

Defines the address of SSM Agent components on each host

0.0.0.0

smart.agent.port

The port number used by SSM agents to communicate with the SSM Server

7048

smart.agent.master.port

The port number used by the SSM Server to communicate with SSM agents

7051

smart.ignore.dirs

A list of comma-separated HDFS directories to ignore. SSM will ignore all files under the given HDFS directories

 — 

smart.cover.dirs

A list of comma-separated HDFS directories where SSM scans for files. By default, all HDFS files are covered

 — 

smart.work.dir

The HDFS directory used by SSM as a working directory to store temporary files. SSM will ignore HDFS inotify events for all files under the working directory. Only one directory can be set

/system/ssm

smart.client.concurrent.report.enabled

Used to enable/disable concurrent reports for Smart Client. If enabled, Smart Client concurrently attempts to connect to multiple configured Smart Servers to find the active Smart Server, which is an optimization. Only the active Smart Server will respond to establish the connection. If the report has been successfully delivered to the active Smart Server, connection attempts to other Smart Servers are canceled

 — 

smart.server.rpc.handler.count

The number of RPC handlers on the server

80

smart.namespace.fetcher.batch

The batch size of the namespace fetcher. SSM fetches namespaces from the NameNode during the startup. Large namespaces may lead to a long startup time. A larger batch size can improve the fetcher efficiency and reduce the startup time

500

smart.namespace.fetcher.producers.num

The number of producers in the namespace fetcher

3

smart.namespace.fetcher.consumers.num

The number of consumers in the namespace fetcher

6

smart.rule.executors

The maximum number of rules that can be executed in parallel

5

smart.cmdlet.executors

The maximum number of cmdlets that can be executed in parallel

10

smart.dispatch.cmdlets.extra.num

The number of extra cmdlets dispatched by Smart Server

10

smart.cmdlet.dispatchers

The maximum number of cmdlet dispatchers that work in parallel

3

smart.cmdlet.mover.max.concurrent.blocks.per.srv.inst

The maximum number of file mover cmdlets that can be executed in parallel per SSM service. The 0 value removes the limit

0

smart.action.move.throttle.mb

The throughput limit (in MB) for the SSM move operation

0

smart.action.copy.throttle.mb

The throughput limit (in MB) for the SSM copy operation

0

smart.action.ec.throttle.mb

The throughput limit (in MB) for the SSM EC operation

0

smart.action.local.execution.disabled

Defines whether the active Smart Server can also execute actions like an agent. If set to true, the active SSM Server will NOT be able to execute actions. This configuration has no impact on a standby Smart Server

false

smart.cmdlet.max.num.pending

The maximum number of pending cmdlets in an SSM Server

20000

smart.cmdlet.hist.max.num.records

The maximum number of historic cmdlet records kept in an SSM server. SSM deletes the oldest cmdlets when this threshold is exceeded

100000

smart.cmdlet.hist.max.record.lifetime

The maximum lifetime of historic cmdlet records kept in an SSM server. The SSM Server deletes cmdlet records after the specified interval. Valid time units are day, hour, min, sec. The minimum update granularity is 5sec

30day

smart.cmdlet.cache.batch

The maximum batch size of the cmdlet batch insert

600

smart.copy.scheduler.base.sync.batch

The maximum batch size of the Copy Scheduler base sync batch insert

500

smart.file.diff.max.num.records

The maximum number of file diff records with the useless state

10000

smart.status.report.period

The status report period for actions in milliseconds

10

smart.status.report.period.multiplier

The report period multiplied by this value defines the largest report interval

50

smart.status.report.ratio

If the finished actions ratio equals or exceeds this value, a status report will be triggered

0.2

smart.top.hot.files.num

The number of top hot files displayed in web UI

200

smart.cmdlet.dispatcher.log.disp.result

Defines whether to log dispatch results for each cmdlet dispatched

false

smart.cmdlet.dispatcher.log.disp.metrics.interval

The time interval in milliseconds to log statistic metrics of the cmdlet dispatcher. If no cmdlets were dispatched within this interval, no output is generated for this interval. The 0 value disables the logger

5000

smart.compression.codec

The default compression codec for SSM compression (Zlib, Lz4, Bzip2, snappy). You can also specify codecs as action arguments, which overrides this setting

Zlib

smart.compression.max.split

The maximum number of chunks split for compression

1000

smart.compact.batch.size

The maximum number of small files to be compacted by the compact action

200

smart.compact.container.file.threshold.mb

The maximum size of a container file in MB

1024

smart.access.count.day.tables.num

The maximum number of tables that can be created in the Metastore database to store the file access count per day

30

smart.access.count.hour.tables.num

The maximum number of tables that can be created in the Metastore database to store the file access count per hour

48

smart.access.count.minute.tables.num

The maximum number of tables that can be created in the Metastore database to store the file access count per minute

120

smart.access.count.second.tables.num

The maximum number of tables that can be created in the Metastore database to store the file access count per second

30

smart.access.event.fetch.interval.ms

The interval in milliseconds between access event fetches

1000

smart.cached.file.fetch.interval.ms

The interval in milliseconds between fetches of cached files from HDFS

5000

smart.namespace.fetch.interval.ms

The interval in milliseconds between namespace fetches from HDFS

1

smart.mover.scheduler.storage.report.fetch.interval.ms

The interval in milliseconds between fetches of storage reports from HDFS DataNodes in the mover scheduler

120000

smart.metastore.small-file.insert.batch.size

The maximum size of the Metastore insert batch with information about small files

200

smart.agent.master.ask.timeout.ms

The maximum time in milliseconds for a Smart Agent to wait for a response from the Smart Server during the submission action

5000

smart.ignore.path.templates

A list of comma-separated regex templates of HDFS paths to be completely ignored by SSM

 — 

smart.internal.path.templates

A list of comma-separated regex templates of internal files to be completely ignored by SSM

.*/\..*,.*/__.*,.*_COPYING_.*

smart.security.enable

Enables Kerberos authentication for SSM

false

smart.server.keytab.file

The path to the SSM Server’s keytab file

 — 

smart.server.kerberos.principal

The SSM Server’s Kerberos principal

 — 

smart.agent.keytab.file

The path to the SSM Agent’s keytab file

 — 

smart.agent.kerberos.principal

The SSM Agent’s Kerberos principal

 — 
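
The parameters above are stored in smart-site.xml using the usual Hadoop-style XML property syntax. An illustrative fragment with values taken from the defaults listed in this section:

    <configuration>
      <property>
        <name>smart.server.rpc.address</name>
        <value>0.0.0.0:7042</value>
      </property>
      <property>
        <name>smart.work.dir</name>
        <value>/system/ssm</value>
      </property>
    </configuration>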

Druid configuration
Parameter Description Default value

db_url

The URL to the Metastore database

jdbc:postgresql://{{ groups['adpg.adpg'][0] | d(omit) }}:5432/ssm

db_user

The user name to connect to the database

ssm

db_password

The user password to connect to the database

 — 

initialSize

The initial number of connections created when the pool is started

10

minIdle

The minimum number of established connections that should be kept in the pool at all times. The connection pool can shrink below this number if validation queries fail

4

maxActive

The maximum number of active connections that can be allocated from this pool at the same time

50

maxWait

The maximum time in milliseconds the pool will wait (when there are no available connections) for a connection to be returned before throwing an exception

60000

timeBetweenEvictionRunsMillis

The time in milliseconds to sleep between the runs of the idle connection validation/cleaner thread. This value should not be set to less than 1 second. It specifies how often to check for idle and abandoned connections, and how often to validate idle connections

90000

minEvictableIdleTimeMillis

The minimum amount of time an object may remain idle in the pool before it is eligible for eviction

300000

validationQuery

The SQL query used to validate connections from the pool before returning them to the caller

SELECT 1

testWhileIdle

Indicates whether connection objects are validated by the idle object evictor (if any)

true

testOnBorrow

Indicates whether objects are validated before being borrowed from the pool

false

testOnReturn

Indicates whether objects are validated before being returned to the pool

false

poolPreparedStatements

Enables the prepared statement pooling

true

maxPoolPreparedStatementPerConnectionSize

The maximum number of prepared statements that can be pooled per connection

30

removeAbandoned

A flag to remove abandoned connections if they exceed removeAbandonedTimeout

true

removeAbandonedTimeout

The timeout in seconds before an abandoned (in use) connection can be removed

180

logAbandoned

A flag to log stack traces for application code which abandoned a connection. Logging of abandoned connections adds extra overhead for every borrowed connection

true

filters

Sets the filters that are applied to the data source

stat

smart-env.sh
Parameter Description Default value

LD_LIBRARY_PATH

The path to extra native libraries for SSM

/usr/lib/hadoop/lib/native

HADOOP_HOME

The path to the Hadoop home directory

/usr/lib/hadoop
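
smart-env.sh is a plain Bash script sourced before SSM starts; the variables above are exported there. For example, with the default values from this table:

    export LD_LIBRARY_PATH=/usr/lib/hadoop/lib/native
    export HADOOP_HOME=/usr/lib/hadoop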

Other
Parameter Description Default value

Enable SmartFileSystem for Hadoop

When enabled, requests from different clients (Spark, HDFS, Hive, etc.) are taken into account when calculating AccessCount for files. Otherwise, the AccessCount value gets incremented only when a file is accessed from SSM

false

log4j.properties

The contents of the log4j.properties configuration file

 — 

zeppelin-site.xml

The contents of the zeppelin-site.xml configuration file. SSM uses a Zeppelin configuration for web UI

 — 

Sqoop

sqoop-site.xml
Parameter Description Default value

sqoop.metastore.client.autoconnect.url

The connection string to use when connecting to a job-management metastore. If not set, uses ~/.sqoop/

 — 

sqoop.metastore.server.location

The path to the shared metastore database files. If not set, uses ~/.sqoop/

/srv/sqoop/metastore.db

sqoop.metastore.server.port

The port that this metastore should listen on

16100
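
These parameters follow the standard sqoop-site.xml XML property format. An illustrative fragment with the default values listed above:

    <configuration>
      <property>
        <name>sqoop.metastore.server.location</name>
        <value>/srv/sqoop/metastore.db</value>
      </property>
      <property>
        <name>sqoop.metastore.server.port</name>
        <value>16100</value>
      </property>
    </configuration>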

sqoop-metastore-env.sh
Parameter Description Default value

HADOOP_OPTS

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Sqoop

-Xms800M -Xmx10G

Other
Parameter Description Default value

Custom sqoop-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file sqoop-site.xml

 — 

Custom sqoop-metastore-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file sqoop-metastore-env.sh

 — 

YARN

mapred-site.xml
Parameter Description Default value

mapreduce.application.classpath

The CLASSPATH for MapReduce applications. A comma-separated list of CLASSPATH entries. If mapreduce.application.framework is set, then this must specify the appropriate CLASSPATH for that archive, and the name of the archive must be present in the CLASSPATH. If mapreduce.app-submission.cross-platform is false, platform-specific environment variable expansion syntax would be used to construct the default CLASSPATH entries. If mapreduce.app-submission.cross-platform is true, platform-agnostic default CLASSPATH for MapReduce applications would be used:

{{HADOOP_MAPRED_HOME}}/share/hadoop/mapreduce/*, {{HADOOP_MAPRED_HOME}}/share/hadoop/mapreduce/lib/*

The parameter expansion marker will be replaced by the NodeManager on container launch based on the underlying OS

/etc/hadoop/conf/*:/usr/lib/hadoop/*:/usr/lib/hadoop/lib/*:/usr/lib/hadoop-hdfs/*:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-yarn/*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-mapreduce/*:/usr/lib/hadoop-mapreduce/lib/*

mapreduce.cluster.local.dir

The local directory where MapReduce stores intermediate data files. May be a comma-separated list of directories on different devices in order to spread disk I/O. Directories that do not exist are ignored

/srv/hadoop-yarn/mr-local

mapreduce.framework.name

The runtime framework for executing MapReduce jobs. Can be one of local, classic, or yarn

yarn

mapreduce.jobhistory.address

The MapReduce JobHistory Server IPC address (<host>:<port>)

 — 

mapreduce.jobhistory.bind-host

Setting the value to 0.0.0.0 will cause the MapReduce daemons to listen on all addresses and interfaces of the hosts in the cluster

0.0.0.0

mapreduce.jobhistory.webapp.address

The MapReduce JobHistory Server web UI address (<host>:<port>)

 — 

mapreduce.map.env

Environment variables for the map task processes added by a user, specified as a comma-separated list. Example: VAR1=value1,VAR2=value2

HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce

mapreduce.reduce.env

Environment variables for the reduce task processes added by a user, specified as a comma-separated list. Example: VAR1=value1,VAR2=value2

HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce

yarn.app.mapreduce.am.env

Environment variables for the MapReduce App Master processes added by a user. Examples:

  • A=foo. This sets the environment variable A to foo.

  • B=$B:c. This inherits the tasktracker B environment variable.

HADOOP_MAPRED_HOME=/usr/lib/hadoop-mapreduce

yarn.app.mapreduce.am.staging-dir

The staging directory used while submitting jobs

/user

mapreduce.jobhistory.keytab

The location of the Kerberos keytab file for the MapReduce JobHistory Server

/etc/security/keytabs/mapreduce-historyserver.service.keytab

mapreduce.jobhistory.principal

Kerberos principal name for the MapReduce JobHistory Server

mapreduce-historyserver/_HOST@REALM

mapreduce.jobhistory.http.policy

Configures the HTTP endpoint for JobHistoryServer web UI. The following values are supported:

  • HTTP_ONLY — provides service only via HTTP;

  • HTTPS_ONLY — provides service only via HTTPS.

HTTP_ONLY

mapreduce.jobhistory.webapp.https.address

The HTTPS address where MapReduce JobHistory Server WebApp is running

0.0.0.0:19890

mapreduce.shuffle.ssl.enabled

Defines whether to use SSL for the Shuffle HTTP endpoints

false
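
mapred-site.xml uses the same Hadoop XML property syntax. A short illustrative fragment (the host name example.host.adh is a placeholder; 10020 is the usual JobHistory IPC port):

    <configuration>
      <property>
        <name>mapreduce.framework.name</name>
        <value>yarn</value>
      </property>
      <property>
        <name>mapreduce.jobhistory.address</name>
        <value>example.host.adh:10020</value>
      </property>
    </configuration>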

ranger-yarn-audit.xml
Parameter Description Default value

xasecure.audit.destination.solr.batch.filespool.dir

The spool directory path

/srv/ranger/hdfs_plugin/audit_solr_spool

xasecure.audit.destination.solr.urls

Leave this property value empty or set it to NONE when using ZooKeeper to connect to Solr

 — 

xasecure.audit.destination.solr.zookeepers

Specifies the ZooKeeper connection string for the Solr destination

 — 

xasecure.audit.destination.solr.force.use.inmemory.jaas.config

Uses in-memory JAAS configuration file to connect to Solr

 — 

xasecure.audit.is.enabled

Enables Ranger audit

true

xasecure.audit.jaas.Client.loginModuleControlFlag

Specifies whether the success of the module is required, requisite, sufficient, or optional

 — 

xasecure.audit.jaas.Client.loginModuleName

The name of the authenticator class

 — 

xasecure.audit.jaas.Client.option.keyTab

The name of the keytab file to get the principal’s secret key

 — 

xasecure.audit.jaas.Client.option.principal

The name of the principal to be used

 — 

xasecure.audit.jaas.Client.option.serviceName

Represents a user or a service that wants to log in

 — 

xasecure.audit.jaas.Client.option.storeKey

Set this to true if you want the keytab or the principal’s key to be stored in the subject’s private credentials

false

xasecure.audit.jaas.Client.option.useKeyTab

Set this to true if you want the module to get the principal’s key from the keytab

false

ranger-yarn-security.xml
Parameter Description Default value

ranger.plugin.yarn.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.yarn.service.name

The name of the Ranger service containing policies for this instance

 — 

ranger.plugin.yarn.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/yarn/policycache

ranger.plugin.yarn.policy.pollIntervalMs

Defines how often to poll for changes in policies

30000

ranger.plugin.yarn.policy.rest.client.connection.timeoutMs

The YARN Plugin RangerRestClient connection timeout (in milliseconds)

120000

ranger.plugin.yarn.policy.rest.client.read.timeoutMs

The YARN Plugin RangerRestClient read timeout (in milliseconds)

30000

ranger.add-yarn-authorization

Set to true to use only Ranger ACLs (that is, to ignore YARN ACLs)

false

ranger.plugin.yarn.policy.rest.ssl.config.file

The path to the RangerRestClient SSL config file for the YARN plugin

/etc/yarn/conf/ranger-yarn-policymgr-ssl.xml

yarn-site.xml
Parameter Description Default value

yarn.application.classpath

The CLASSPATH for YARN applications. A comma-separated list of CLASSPATH entries. When this value is empty, the following default CLASSPATH for YARN applications would be used.

  • For Linux:

    $HADOOP_CONF_DIR, $HADOOP_COMMON_HOME/share/hadoop/common/*, $HADOOP_COMMON_HOME/share/hadoop/common/lib/*, $HADOOP_HDFS_HOME/share/hadoop/hdfs/*, $HADOOP_HDFS_HOME/share/hadoop/hdfs/lib/*, $HADOOP_YARN_HOME/share/hadoop/yarn/*, $HADOOP_YARN_HOME/share/hadoop/yarn/lib/*
  • For Windows:

    %HADOOP_CONF_DIR%, %HADOOP_COMMON_HOME%/share/hadoop/common/*, %HADOOP_COMMON_HOME%/share/hadoop/common/lib/*, %HADOOP_HDFS_HOME%/share/hadoop/hdfs/*, %HADOOP_HDFS_HOME%/share/hadoop/hdfs/lib/*, %HADOOP_YARN_HOME%/share/hadoop/yarn/*, %HADOOP_YARN_HOME%/share/hadoop/yarn/lib/*

/etc/hadoop/conf/*:/usr/lib/hadoop/*:/usr/lib/hadoop/lib/*:/usr/lib/hadoop-hdfs/*:/usr/lib/hadoop-hdfs/lib/*:/usr/lib/hadoop-yarn/*:/usr/lib/hadoop-yarn/lib/*:/usr/lib/hadoop-mapreduce/*:/usr/lib/hadoop-mapreduce/lib/*

yarn.cluster.max-application-priority

Defines the maximum application priority in a cluster. Leaf queue-level priority: the administrator configures a default priority for each leaf queue. The queue default priority is used for any application submitted without a specified priority. $HADOOP_HOME/etc/hadoop/capacity-scheduler.xml is the configuration file for queue-level priority

0

yarn.log.server.url

The URL of the log aggregation server

 — 

yarn.log-aggregation-enable

Whether to enable log aggregation. Log aggregation collects logs from each container and moves these logs onto a file system, for example HDFS, after the application processing completes. Users can configure the yarn.nodemanager.remote-app-log-dir and yarn.nodemanager.remote-app-log-dir-suffix properties to determine where these logs are moved. Users can access the logs via the Application Timeline Server

true

yarn.log-aggregation.retain-seconds

Defines how long to keep aggregation logs before deleting them. The value of -1 disables the deletion of aggregated logs. Be careful: setting this value too small will spam the NameNode

172800

yarn.nodemanager.local-dirs

The list of directories to store localized files in. An application's localized file directory will be found in: ${yarn.nodemanager.local-dirs}/usercache/${user}/appcache/application_${appid}. Individual containers' work directories, called container_${contid}, will be subdirectories of this

/srv/hadoop-yarn/nm-local

yarn.node-labels.enabled

Enables node labels feature

true

yarn.node-labels.fs-store.root-dir

The URI for NodeLabelManager. The default value is /tmp/hadoop-yarn-${user}/node-labels/ in the local filesystem

hdfs:///system/yarn/node-labels

yarn.timeline-service.bind-host

The actual address the server will bind to. If this optional address is set, the RPC and Webapp servers will bind to this address and the port specified in yarn.timeline-service.address and yarn.timeline-service.webapp.address, respectively. This is most useful for making the service listen to all interfaces by setting it to 0.0.0.0

0.0.0.0

yarn.timeline-service.leveldb-timeline-store.path

The store file name for the LevelDB timeline store

/srv/hadoop-yarn/leveldb-timeline-store

yarn.nodemanager.address

The address of the container manager in the NodeManager

0.0.0.0:8041

yarn.nodemanager.aux-services

A comma-separated list of services, where service name should only contain a-zA-Z0-9_ and cannot start with numbers

mapreduce_shuffle,spark2_shuffle,spark_shuffle

yarn.nodemanager.aux-services.mapreduce_shuffle.class

The auxiliary service class to use

org.apache.hadoop.mapred.ShuffleHandler

yarn.nodemanager.aux-services.spark2_shuffle.class

The class name of YarnShuffleService — an external shuffle service for Spark 2 on YARN

org.apache.spark.network.yarn.YarnShuffleService

yarn.nodemanager.aux-services.spark2_shuffle.classpath

The path to YarnShuffleService — an external shuffle service for Spark 2 on YARN

/usr/lib/spark/yarn/lib/*

yarn.nodemanager.aux-services.spark_shuffle.class

The class name of YarnShuffleService — an external shuffle service for Spark 3 on YARN

org.apache.spark.network.yarn.YarnShuffleService

yarn.nodemanager.aux-services.spark_shuffle.classpath

The path to YarnShuffleService — an external shuffle service for Spark 3 on YARN

/usr/lib/spark3/yarn/lib/*

yarn.nodemanager.recovery.enabled

Enables the NodeManager to recover after starting

true

yarn.nodemanager.recovery.dir

The local filesystem directory, in which the NodeManager will store state, when recovery is enabled

/srv/hadoop-yarn/nm-recovery

yarn.nodemanager.remote-app-log-dir

Defines a directory for logs aggregation

/logs

yarn.nodemanager.resource-plugins

Enables additional discovery/isolation of resources on the NodeManager. By default, this parameter is empty. Acceptable values: yarn.io/gpu, yarn.io/fpga

 — 

yarn.nodemanager.resource-plugins.gpu.path-to-discovery-executables

When yarn.nodemanager.resource.gpu.allowed-gpu-devices=auto, the YARN NodeManager needs to run a GPU discovery binary (currently only nvidia-smi is supported) to get GPU-related information. When the value is empty (default), the YARN NodeManager tries to locate the discovery executable itself. An example of the config value is: /usr/local/bin/nvidia-smi

/usr/bin/nvidia-smi

yarn.nodemanager.resource.detect-hardware-capabilities

Enables auto-detection of node capabilities such as memory and CPU

true

yarn.nodemanager.vmem-check-enabled

Whether virtual memory limits will be enforced for containers

false

yarn.resource-types

The resource types to be used for scheduling. Use resource-types.xml to specify details about the individual resource types

 — 

yarn.resourcemanager.bind-host

The actual address the server will bind to. If this optional address is set, the RPC and Webapp servers will bind to this address and the port specified in yarn.resourcemanager.address and yarn.resourcemanager.webapp.address, respectively. This is most useful for making the Resource Manager listen to all interfaces by setting it to 0.0.0.0

0.0.0.0

yarn.resourcemanager.cluster-id

The name of the cluster. In the High Availability mode, this parameter is used to ensure that Resource Manager participates in leader election for this cluster and ensures that it does not affect other clusters

 — 

yarn.resource-types.memory-mb.increment-allocation

The FairScheduler grants memory in increments of this value. If you submit a task with a resource request that is not a multiple of memory-mb.increment-allocation, the request will be rounded up to the nearest increment

1024

yarn.resource-types.vcores.increment-allocation

The FairScheduler grants vcores in increments of this value. If you submit a task with a resource request that is not a multiple of vcores.increment-allocation, the request will be rounded up to the nearest increment

1

yarn.resourcemanager.ha.enabled

Enables Resource Manager High Availability. When enabled:

  • The Resource Manager starts in the Standby mode by default, and transitions to the Active mode when prompted to.

  • The nodes in the Resource Manager ensemble are listed in yarn.resourcemanager.ha.rm-ids.

  • The id of each Resource Manager either comes from yarn.resourcemanager.ha.id, if yarn.resourcemanager.ha.id is explicitly specified, or can be figured out by matching yarn.resourcemanager.address.{id} with local address.

  • The actual physical addresses come from the configs of the pattern {rpc-config}.{id}.

false

yarn.resourcemanager.ha.rm-ids

The list of Resource Manager nodes in the cluster when the High Availability is enabled. See description of yarn.resourcemanager.ha.enabled for full details on how this is used

 — 

yarn.resourcemanager.hostname

The host name of the Resource Manager

 — 

yarn.resourcemanager.leveldb-state-store.path

The local path where the Resource Manager state will be stored when using org.apache.hadoop.yarn.server.resourcemanager.recovery.LeveldbRMStateStore as the value for yarn.resourcemanager.store.class

/srv/hadoop-yarn/leveldb-state-store

yarn.resourcemanager.monitor.capacity.queue-management.monitoring-interval

The time between invocations of this QueueManagementDynamicEditPolicy policy (in milliseconds)

1500

yarn.resourcemanager.reservation-system.enable

Enables the ReservationSystem in the ResourceManager

false

yarn.resourcemanager.reservation-system.planfollower.time-step

The frequency of the PlanFollower timer (in milliseconds). A large value is expected

1000

Resource scheduler

The type of a pluggable scheduler for Hadoop. Available values: CapacityScheduler and FairScheduler. CapacityScheduler allows multiple tenants to securely share a large cluster such that their applications are allocated resources in a timely manner under constraints of allocated capacities. FairScheduler allows YARN applications to share resources in large clusters fairly

CapacityScheduler

yarn.resourcemanager.scheduler.monitor.enable

Enables a set of periodic monitors (specified in yarn.resourcemanager.scheduler.monitor.policies) that affect the Scheduler

false

yarn.resourcemanager.scheduler.monitor.policies

The list of SchedulingEditPolicy classes that interact with the Scheduler. A particular module may be incompatible with the Scheduler, other policies, or a configuration of either

org.apache.hadoop.yarn.server.resourcemanager.monitor.capacity.ProportionalCapacityPreemptionPolicy

yarn.resourcemanager.monitor.capacity.preemption.observe_only

If set to true, run the policy but do not affect the cluster with preemption and kill events

false

yarn.resourcemanager.monitor.capacity.preemption.monitoring_interval

The time between invocations of this ProportionalCapacityPreemptionPolicy policy (in milliseconds)

3000

yarn.resourcemanager.monitor.capacity.preemption.max_wait_before_kill

The time between requesting a preemption from an application and killing the container (in milliseconds)

15000

yarn.resourcemanager.monitor.capacity.preemption.total_preemption_per_round

The maximum percentage of resources, preempted in a single round. By controlling this value one can throttle the pace, at which containers are reclaimed from the cluster. After computing the total desired preemption, the policy scales it back within this limit

0.1

yarn.resourcemanager.monitor.capacity.preemption.max_ignored_over_capacity

The maximum amount of resources above the target capacity ignored for preemption. This defines a deadzone around the target capacity that helps to prevent thrashing and oscillations around the computed target balance. High values would slow the time to capacity and (absent natural completions) might prevent convergence to guaranteed capacity

0.1

yarn.resourcemanager.monitor.capacity.preemption.natural_termination_factor

Given a computed preemption target, account for containers naturally expiring and preempt only this percentage of the delta. This determines the rate of geometric convergence into the deadzone (MAX_IGNORED_OVER_CAPACITY). For example, a termination factor of 0.5 will reclaim almost 95% of resources within 5 * #WAIT_TIME_BEFORE_KILL, even absent natural termination

0.2

yarn.resourcemanager.nodes.exclude-path

The path to the file with nodes to exclude

/etc/hadoop/conf/exclude-path.xml

yarn.resourcemanager.nodes.include-path

The path to the file with nodes to include

/etc/hadoop/conf/include-path

yarn.resourcemanager.recovery.enabled

Enables Resource Manager to recover state after starting. If set to true, then yarn.resourcemanager.store.class must be specified

true

yarn.resourcemanager.store.class

The class to use as the persistent store. If org.apache.hadoop.yarn.server.resourcemanager.recovery.ZKRMStateStore is used, the store is implicitly fenced, meaning that only a single Resource Manager is able to use the store at any point in time. More details on this implicit fencing, along with setting up appropriate ACLs, are discussed under yarn.resourcemanager.zk-state-store.root-node.acl

 — 

yarn.resourcemanager.system-metrics-publisher.enabled

The setting that controls whether YARN system metrics are published on the Timeline Server or not by Resource Manager

true

yarn.scheduler.fair.user-as-default-queue

Defines whether to use the username associated with the allocation as the default queue name in the event that a queue name is not specified. If this is set to false or unset, all jobs have a shared default queue, named default. Defaults to true. If a queue placement policy is given in the allocations file, this property is ignored

true

yarn.scheduler.fair.preemption

Defines whether to use preemption

false

yarn.scheduler.fair.preemption.cluster-utilization-threshold

The utilization threshold after which the preemption kicks in. The utilization is computed as the maximum ratio of usage to capacity among all resources

0.8f

yarn.scheduler.fair.sizebasedweight

Defines whether to assign shares to individual apps based on their size, rather than providing an equal share to all apps regardless of size. When set to true, apps are weighted by the natural logarithm of one plus the app total requested memory, divided by the natural logarithm of 2

false

yarn.scheduler.fair.assignmultiple

Defines whether to allow multiple container assignments in one heartbeat

false

yarn.scheduler.fair.dynamic.max.assign

If assignmultiple is true, this parameter specifies whether to dynamically determine the amount of resources that can be assigned in one heartbeat. When turned on, about half of the non-allocated resources on the node are allocated to containers in a single heartbeat

true

yarn.scheduler.fair.max.assign

If assignmultiple is true, the maximum number of containers that can be assigned in one heartbeat. Defaults to -1, which sets no limit

-1

yarn.scheduler.fair.locality.threshold.node

For applications that request containers on particular nodes, this parameter defines the number of scheduling opportunities since the last container assignment to wait before accepting a placement on another node. Expressed as a floating number between 0 and 1, which, as a fraction of the cluster size, is the number of scheduling opportunities to pass up. The default value of -1.0 means not to pass up any scheduling opportunities

-1.0

yarn.scheduler.fair.locality.threshold.rack

For applications, that request containers on particular racks, the number of scheduling opportunities since the last container assignment to wait before accepting a placement on another rack. Expressed as a floating point between 0 and 1, which, as a fraction of the cluster size, is the number of scheduling opportunities to pass up. The default value of -1.0 means not to pass up any scheduling opportunities

-1.0

yarn.scheduler.fair.allow-undeclared-pools

If set to true, new queues can be created at application submission time, whether because they are specified as the application queue by the submitter or because they are placed there by the user-as-default-queue property. If set to false, any time an app would be placed in a queue that is not specified in the allocations file, it is placed in the default queue instead. Defaults to true. If a queue placement policy is given in the allocations file, this property is ignored

true

yarn.scheduler.fair.update-interval-ms

The time interval, at which to lock the scheduler and recalculate fair shares, recalculate demand, and check whether anything is due for preemption

500

yarn.scheduler.minimum-allocation-mb

The minimum allocation for every container request at the Resource Manager (in MB). Memory requests lower than this will throw an InvalidResourceRequestException

1024

yarn.scheduler.maximum-allocation-mb

The maximum allocation for every container request at the Resource Manager (in MB). Memory requests higher than this will throw an InvalidResourceRequestException

4096

yarn.scheduler.minimum-allocation-vcores

The minimum allocation for every container request at the Resource Manager, in terms of virtual CPU cores. Requests lower than this will throw an InvalidResourceRequestException

1

yarn.scheduler.maximum-allocation-vcores

The maximum allocation for every container request at the Resource Manager, in terms of virtual CPU cores. Requests higher than this will throw an InvalidResourceRequestException

2

yarn.timeline-service.enabled

On the server side, this parameter indicates whether the Timeline service is enabled. On the client side, it can be used to indicate whether the client wants to use the Timeline service. If this parameter is set on the client side along with security, then the YARN client tries to fetch the delegation tokens for the Timeline Server

true

yarn.timeline-service.hostname

The hostname of the Timeline service Web application

 — 

yarn.timeline-service.http-cross-origin.enabled

Enables cross origin support (CORS) for Timeline Server

true

yarn.webapp.ui2.enable

On the server side, this parameter indicates whether the new YARN UI v2 is enabled

true

yarn.resourcemanager.proxy-user-privileges.enabled

If set to true, ResourceManager will have proxy-user privileges. For example, in a secure cluster, YARN requires the user's HDFS delegation tokens to do localization and log aggregation on behalf of the user. If this is set to true, ResourceManager is able to request new HDFS delegation tokens on behalf of the user. This is needed by long-running services, because the HDFS tokens will eventually expire and YARN requires new valid tokens to do localization and log aggregation. Note that to enable this use case, the corresponding HDFS NameNode must have ResourceManager configured as a proxy user so that ResourceManager can itself ask for new tokens on behalf of the user when the tokens are past their maximum lifetime

false

yarn.resourcemanager.webapp.spnego-principal

The Kerberos principal to be used for SPNEGO filter for the Resource Manager web UI

HTTP/_HOST@REALM

yarn.resourcemanager.webapp.spnego-keytab-file

The Kerberos keytab file to be used for SPNEGO filter for the Resource Manager web UI

/etc/security/keytabs/HTTP.service.keytab

yarn.nodemanager.linux-container-executor.group

The UNIX group that the linux-container-executor should run as

yarn

yarn.resourcemanager.webapp.delegation-token-auth-filter.enabled

A flag to enable override of the default Kerberos authentication filter with the RM authentication filter to allow authentication using delegation tokens (fallback to Kerberos if the tokens are missing). Only applicable when the http authentication type is kerberos

false

yarn.resourcemanager.principal

The Kerberos principal for the Resource Manager

yarn-resourcemanager/_HOST@REALM

yarn.resourcemanager.keytab

The keytab for the Resource Manager

/etc/security/keytabs/yarn-resourcemanager.service.keytab

yarn.resourcemanager.webapp.https.address

The HTTPS address of the Resource Manager web application. If only a host is provided as the value, the webapp will be served on a random port

${yarn.resourcemanager.hostname}:8090

yarn.nodemanager.principal

The Kerberos principal for the NodeManager

yarn-nodemanager/_HOST@REALM

yarn.nodemanager.keytab

Keytab for NodeManager

/etc/security/keytabs/yarn-nodemanager.service.keytab

yarn.nodemanager.webapp.spnego-principal

The Kerberos principal to be used for SPNEGO filter for the NodeManager web interface

HTTP/_HOST@REALM

yarn.nodemanager.webapp.spnego-keytab-file

The Kerberos keytab file to be used for SPNEGO filter for the NodeManager web interface

/etc/security/keytabs/HTTP.service.keytab

yarn.nodemanager.webapp.cross-origin.enabled

A flag to enable cross-origin (CORS) support in the NodeManager. This flag requires the CORS filter initializer to be added to the filter initializers list in core-site.xml

false

yarn.nodemanager.webapp.https.address

The HTTPS address of the NodeManager web application

0.0.0.0:8044

yarn.timeline-service.http-authentication.type

Defines the authentication used for the Timeline Server HTTP endpoint. Supported values are: simple, kerberos, #AUTHENTICATION_HANDLER_CLASSNAME#

simple

yarn.timeline-service.http-authentication.simple.anonymous.allowed

Indicates if anonymous requests are allowed by the Timeline Server when using simple authentication

true

yarn.timeline-service.http-authentication.kerberos.keytab

The Kerberos keytab to be used for the Timeline Server (Collector/Reader) HTTP endpoint

/etc/security/keytabs/HTTP.service.keytab

yarn.timeline-service.http-authentication.kerberos.principal

The Kerberos principal to be used for the Timeline Server (Collector/Reader) HTTP endpoint

HTTP/_HOST@REALM

yarn.timeline-service.principal

The Kerberos principal for the timeline reader. NodeManager principal would be used for timeline collector as it runs as an auxiliary service inside NodeManager

yarn/_HOST@REALM

yarn.timeline-service.keytab

The Kerberos keytab for the timeline reader. NodeManager keytab would be used for timeline collector as it runs as an auxiliary service inside NodeManager

/etc/security/keytabs/yarn.service.keytab

yarn.timeline-service.delegation.key.update-interval

The update interval for delegation keys

86400000

yarn.timeline-service.delegation.token.renew-interval

The time to renew delegation tokens

86400000

yarn.timeline-service.delegation.token.max-lifetime

The maximum token lifetime

86400000

yarn.timeline-service.client.best-effort

Defines whether a failure to obtain a delegation token should be considered an application failure (false), or whether the client should attempt to continue publishing information without it (true)

false

yarn.timeline-service.webapp.https.address

The HTTPS address of the Timeline service web application

${yarn.timeline-service.hostname}:8190

yarn.http.policy

This configures the HTTP endpoint for YARN daemons. The following values are supported:

  • HTTP_ONLY — provides service only via HTTP;

  • HTTPS_ONLY — provides service only via HTTPS.

HTTP_ONLY

yarn.nodemanager.container-executor.class

The name of the container-executor Java class

org.apache.hadoop.yarn.server.nodemanager.LinuxContainerExecutor
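
yarn-site.xml follows the same XML property format. A hedged sketch of a Resource Manager High Availability fragment, using the {rpc-config}.{id} pattern mentioned in the yarn.resourcemanager.ha.enabled description (the host names and the cluster name are placeholders):

    <configuration>
      <property>
        <name>yarn.resourcemanager.ha.enabled</name>
        <value>true</value>
      </property>
      <property>
        <name>yarn.resourcemanager.cluster-id</name>
        <value>adh-cluster</value>
      </property>
      <property>
        <name>yarn.resourcemanager.ha.rm-ids</name>
        <value>rm1,rm2</value>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname.rm1</name>
        <value>rm1.example.com</value>
      </property>
      <property>
        <name>yarn.resourcemanager.hostname.rm2</name>
        <value>rm2.example.com</value>
      </property>
    </configuration>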

container-executor.cfg
CAUTION

In AstraLinux, regular user UIDs can start from 100. For YARN to work correctly on AstraLinux, set the min.user.id parameter value to 100.

Parameter Description Default value

banned.users

A comma-separated list of users who cannot run applications

bin

min.user.id

The minimum user ID allowed to run applications. Prevents other super-users

500
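
container-executor.cfg is a plain key=value file. An illustrative fragment that applies the AstraLinux recommendation from the caution above (example values only):

    yarn.nodemanager.linux-container-executor.group=yarn
    banned.users=bin
    min.user.id=100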

YARN heap memory settings
Parameter Description Default value

ResourceManager Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Resource Manager

-Xms1G -Xmx8G

NodeManager Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for NodeManager

 — 

Timelineserver Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Timeline server

-Xms700m -Xmx8G

History server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for History server

-Xms700m -Xmx8G

Lists of decommissioned hosts
Parameter Description Default value

DECOMMISSIONED

The list of hosts in the DECOMMISSIONED state

 — 

ranger-yarn-policymgr-ssl.xml
Parameter Description Default value

xasecure.policymgr.clientssl.keystore

The path to the keystore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.credential.file

The path to the keystore credentials file

/etc/yarn/conf/ranger-yarn.jceks

xasecure.policymgr.clientssl.truststore.credential.file

The path to the truststore credentials file

/etc/yarn/conf/ranger-yarn.jceks

xasecure.policymgr.clientssl.truststore

The path to the truststore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.password

The password to the keystore file

 — 

xasecure.policymgr.clientssl.truststore.password

The password to the truststore file

 — 
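
A sketch of how these parameters might appear in ranger-yarn-policymgr-ssl.xml; the keystore and truststore paths are placeholders, while the credential-file paths are the defaults above:

  <property>
    <name>xasecure.policymgr.clientssl.keystore</name>
    <value>/etc/yarn/conf/ranger-plugin-keystore.jks</value>
  </property>
  <property>
    <name>xasecure.policymgr.clientssl.keystore.credential.file</name>
    <value>/etc/yarn/conf/ranger-yarn.jceks</value>
  </property>
  <property>
    <name>xasecure.policymgr.clientssl.truststore</name>
    <value>/etc/yarn/conf/ranger-plugin-truststore.jks</value>
  </property>
  <property>
    <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
    <value>/etc/yarn/conf/ranger-yarn.jceks</value>
  </property>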

Other
Parameter Description Default value

GPU on YARN

Defines whether to use GPU on YARN

false

capacity-scheduler.xml

The content of capacity-scheduler.xml, which is used by CapacityScheduler

fair-scheduler.xml

The content of fair-scheduler.xml, which is used by FairScheduler

Custom mapred-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file mapred-site.xml

 — 

Ranger plugin enabled

Defines whether the Ranger plugin is enabled

false

Custom yarn-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file yarn-site.xml

 — 

Custom ranger-yarn-audit.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-yarn-audit.xml

 — 

Custom ranger-yarn-security.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-yarn-security.xml

 — 

Custom ranger-yarn-policymgr-ssl.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-yarn-policymgr-ssl.xml

 — 
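
The Custom *.xml sections accept key/value pairs that are rendered as standard Hadoop properties. As a purely hypothetical example, adding the key yarn.nodemanager.vmem-check-enabled with the value false via Custom yarn-site.xml would result in roughly the following entry in yarn-site.xml:

  <property>
    <name>yarn.nodemanager.vmem-check-enabled</name>
    <value>false</value>
  </property>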

Zeppelin

User-managed interpreters
Parameter Description Default value

Allow user-managed interpreters

Allows using Zeppelin interpreters with the user-managed=true property. If selected, ADCM preserves custom user properties when restarting Zeppelin

True

Custom interpreter.json

Allows providing a custom JSON definition of the interpreters available in the Zeppelin web UI. Defining interpreters in this way overwrites all interpreter settings (both user and system)

interpreters.json

Custom interpreter.sh

Allows providing custom contents for the interpreter.sh script. This script is invoked at Zeppelin startup and prepares the environment for proper Zeppelin operation

interpreters.sh

zeppelin-site.xml
Parameter Description Default value

zeppelin.dep.localrepo

The local repository for the dependency loader

/srv/zeppelin/local-repo

zeppelin.server.port

The server port

8180

zeppelin.server.kerberos.principal

The principal name to load from the keytab

 — 

zeppelin.server.kerberos.keytab

The path to the keytab file

 — 

zeppelin.shell.auth.type

Sets the authentication type. Possible values are SIMPLE and KERBEROS

 — 

zeppelin.shell.principal

The principal name to load from the keytab

 — 

zeppelin.shell.keytab.location

The path to the keytab file

 — 

zeppelin.jdbc.auth.type

Sets the authentication type. Possible values are SIMPLE and KERBEROS

 — 

zeppelin.jdbc.keytab.location

The path to the keytab file

 — 

zeppelin.jdbc.principal

The principal name to load from the keytab

 — 

zeppelin.jdbc.auth.kerberos.proxy.enable

When the KERBEROS authentication type is used, this parameter enables/disables using a proxy with the login user to obtain the connection

true

spark.yarn.keytab

The full path to the file that contains the keytab for the principal. This keytab will be copied to the node running the YARN Application Master via the Secure Distributed Cache, for renewing the login tickets and the delegation tokens periodically

 — 

spark.yarn.principal

The principal used to log in to the KDC while running on secure HDFS

 — 

zeppelin.livy.keytab

The path to the keytab file

 — 

zeppelin.livy.principal

The principal name to load from the keytab

 — 

zeppelin.server.ssl.port

The port number for SSL communication

8180

zeppelin.ssl

Defines whether to use SSL

false

zeppelin.ssl.keystore.path

The path to the keystore used by Zeppelin

 — 

zeppelin.ssl.keystore.password

The password to access the keystore file

 — 

zeppelin.ssl.truststore.path

The path to the truststore used by Zeppelin

 — 

zeppelin.ssl.truststore.password

The password to access the truststore file

 — 
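
As a sketch, enabling SSL with the parameters above could produce zeppelin-site.xml entries similar to the following; the keystore path and password are placeholders, not defaults:

  <property>
    <name>zeppelin.ssl</name>
    <value>true</value>
  </property>
  <property>
    <name>zeppelin.ssl.keystore.path</name>
    <value>/etc/ssl/zeppelin/keystore.jks</value>
  </property>
  <property>
    <name>zeppelin.ssl.keystore.password</name>
    <value>change_me</value>
  </property>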

Zeppelin server heap memory settings
Parameter Description Default value

Zeppelin Server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Zeppelin Server

-Xms700m -Xmx1024m

Shiro Simple username/password auth
Parameter Description Default value

Users/password map

A map of type <username: password,role>. For example, <myUser1: password1,role1>

 — 
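
For illustration, the example map entry <myUser1: password1,role1> corresponds approximately to the following fragment of the [users] section in shiro.ini:

  [users]
  myUser1 = password1, role1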

Shiro LDAP auth
Parameter Description Default value

ldapRealm

Extends the Apache Shiro provider to allow for LDAP searches and to provide group membership to the authorization provider

org.apache.zeppelin.realm.LdapRealm

ldapRealm.contextFactory.authenticationMechanism

Specifies the authentication mechanism used by the LDAP service

simple

ldapRealm.contextFactory.url

The URL of the source LDAP. For example, ldap://ldap.example.com:389

 — 

ldapRealm.userDnTemplate

Optional. This value is used to construct the UserDN for the authentication bind. Specify the UserDN where the first attribute value is {0}, indicating the attribute that matches the user login token. For example, the UserDnTemplate for the Apache DS bundled with Knox is uid={0},ou=people,dc=hadoop,dc=apache,dc=org

 — 

ldapRealm.pagingSize

Allows setting the LDAP paging size

100

ldapRealm.authorizationEnabled

Enables authorization for Shiro ldapRealm

true

ldapRealm.contextFactory.systemAuthenticationMechanism

Defines the authentication mechanism used by the Shiro ldapRealm context factory. Possible values are simple and digest-md5

simple

ldapRealm.userLowerCase

Forces the username returned from LDAP to be lowercased

true

ldapRealm.memberAttributeValueTemplate

The attribute that identifies a user in the group. For example: cn={0},ou=people,dc=hadoop,dc=apache,dc=org

 — 

ldapRealm.searchBase

The starting DN in the LDAP DIT for the search. Only subtrees of the specified subtree are searched. For example: dc=hadoop,dc=apache,dc=org

 — 

ldapRealm.userSearchBase

Search base for user bind DN. Defaults to the value of ldapRealm.searchBase if no value is defined. If ldapRealm.userSearchAttributeName is defined, also define a value for either ldapRealm.searchBase or ldapRealm.userSearchBase

 — 

ldapRealm.groupSearchBase

Search base used to search for groups. Defaults to the value of ldapRealm.searchBase. Only set if ldapRealm.authorizationEnabled=true

 — 

ldapRealm.groupObjectClass

Set the value to the object class that identifies group entries in LDAP

groupofnames

ldapRealm.userSearchAttributeName

Specify the attribute that corresponds to the user login token. This attribute is used with the search results to compute the UserDN for the authentication bind

sAMAccountName

ldapRealm.memberAttribute

Set the value to the attribute that defines group membership. When the value is memberUrl, found groups are treated as dynamic groups

member

ldapRealm.userSearchScope

Allows defining the user search scope. Possible values are subtree, one, base

subtree

ldapRealm.groupSearchScope

Allows defining the group search scope. Possible values are subtree, one, base

subtree

ldapRealm.contextFactory.systemUsername

Set to the LDAP service account that Zeppelin uses for LDAP searches. If required, specify the full account UserDN. For example: uid=guest,ou=people,dc=hadoop,dc=apache,dc=org. This account requires read permission to the search base DN

 — 

ldapRealm.contextFactory.systemPassword

Sets the password for systemUsername. This password is added to the keystore using Hadoop credentials

 — 

ldapRealm.groupSearchEnableMatchingRuleInChain

Enables support for nested groups using the LDAP_MATCHING_RULE_IN_CHAIN operator

true

ldapRealm.rolesByGroup

Optional mapping from physical groups to logical application roles. For example: "LDN_USERS":"user_role", "NYK_USERS":"user_role", "HKG_USERS":"user_role", "GLOBAL_ADMIN":"admin_role"

 — 

ldapRealm.allowedRolesForAuthentication

Optional list of roles that are allowed to authenticate. If not specified, all groups are allowed to authenticate (login). This changes nothing for URL-specific permissions, which continue to work as specified in the [urls] section. For example: "admin_role,user_role"

 — 

ldapRealm.permissionsByRole

Optional. Sets permissions by role. For example: 'user_role = :ToDoItemsJdo::*, :ToDoItem::*; admin_role = *'

 — 

securityManager.realms

Specifies a list of Apache Shiro Realms

$ldapRealm
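
A minimal sketch of the resulting [main] section of shiro.ini, assembled from the illustrative values in the descriptions above (the LDAP URL and DNs are examples, not defaults):

  [main]
  ldapRealm = org.apache.zeppelin.realm.LdapRealm
  ldapRealm.contextFactory.url = ldap://ldap.example.com:389
  ldapRealm.contextFactory.authenticationMechanism = simple
  ldapRealm.userDnTemplate = uid={0},ou=people,dc=hadoop,dc=apache,dc=org
  ldapRealm.searchBase = dc=hadoop,dc=apache,dc=org
  ldapRealm.authorizationEnabled = true
  securityManager.realms = $ldapRealm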

Additional configuration Shiro.ini
Parameter Description Default value

Additional main section in shiro.ini

Allows adding key/value pairs to the main section of the shiro.ini file

 — 

Additional roles section in shiro.ini

Allows adding key/value pairs to the roles section of the shiro.ini file

 — 

Additional urls section in shiro.ini

Allows adding key/value pairs to the urls section of the shiro.ini file

 — 
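
As a hypothetical example, key/value pairs added through these parameters end up in the corresponding sections of shiro.ini; a common pattern looks like this:

  [roles]
  admin_role = *

  [urls]
  /api/version = anon
  /** = authc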

Other
Parameter Description Default value

Custom zeppelin-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file zeppelin-site.xml

 — 

Custom zeppelin-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file zeppelin-env.sh

Custom log4j.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file log4j.properties

ZooKeeper

Main
Parameter Description Default value

connect

The ZooKeeper connection string used by other services or clusters. It is generated automatically

 — 

dataDir

The location where ZooKeeper stores the in-memory database snapshots and, unless specified otherwise, the transaction log of updates to the database

/var/lib/zookeeper
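
The connect string normally takes the form host1:port,host2:port,…, for example (hostnames are placeholders):

  zk-host1.example.com:2181,zk-host2.example.com:2181,zk-host3.example.com:2181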

zoo.cfg
Parameter Description Default value

clientPort

The port to listen on for client connections, that is, the port that clients attempt to connect to

2181

tickTime

The basic time unit used by ZooKeeper (in milliseconds). It is used for heartbeats. The minimum session timeout will be twice the tickTime

2000

initLimit

The timeout (in ticks) that ZooKeeper uses to limit how long the ZooKeeper servers in a quorum have to connect to the leader

5

syncLimit

Defines how far out of date a server can be from the leader (in ticks)

2

maxClientCnxns

Limits the number of concurrent connections that a single host, identified by IP address, may make to a single ZooKeeper server

0

autopurge.snapRetainCount

When enabled, the ZooKeeper auto-purge feature retains the autopurge.snapRetainCount most recent snapshots and the corresponding transaction logs in dataDir and dataLogDir, respectively, and deletes the rest. The minimum value is 3

3

autopurge.purgeInterval

The time interval at which the purge task is triggered (in hours). Set to a positive integer (1 or greater) to enable auto-purging

24

Add key,value

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file zoo.cfg

 — 
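
Putting the defaults above together, a basic zoo.cfg might look as follows (server.N entries listing the quorum members are not shown):

  tickTime=2000
  initLimit=5
  syncLimit=2
  dataDir=/var/lib/zookeeper
  clientPort=2181
  maxClientCnxns=0
  autopurge.snapRetainCount=3
  autopurge.purgeInterval=24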

zookeeper-env.sh
Parameter Description Default value

ZOO_LOG_DIR

The directory to store logs

/var/log/zookeeper

ZOOPIDFILE

The directory to store the ZooKeeper process ID

/var/run/zookeeper/zookeeper_server.pid

SERVER_JVMFLAGS

Used for setting JVM parameters related, for example, to garbage collection

-Xmx1024m

JAVA

A path to Java

$JAVA_HOME/bin/java

ZOO_LOG4J_PROP

Sets the log4j logging level and defines which log appenders to turn on. Enabling the CONSOLE appender directs logs to stdout. Enabling ROLLINGFILE creates the zookeeper.log file, which is then rotated and eventually expired

INFO, CONSOLE, ROLLINGFILE
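
These settings are exported as environment variables from zookeeper-env.sh; a sketch using the defaults above (the exact file layout may differ):

  export ZOO_LOG_DIR=/var/log/zookeeper
  export ZOOPIDFILE=/var/run/zookeeper/zookeeper_server.pid
  export SERVER_JVMFLAGS="-Xmx1024m"
  export JAVA=$JAVA_HOME/bin/java
  export ZOO_LOG4J_PROP="INFO, CONSOLE, ROLLINGFILE"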
