Configuration parameters

This topic describes the parameters that can be configured for ADH services via ADCM. To learn about the configuration process, refer to the relevant articles: Online installation, Offline installation.

NOTE
  • Some of the parameters become visible in the ADCM UI after the Advanced flag has been set.

  • The parameters that are set in the Custom group will overwrite the existing parameters even if they are read-only.

Airflow

Redis configuration
Parameter Description Default value

redis.conf

Redis configuration file

 — 

sentinel.conf

Sentinel configuration file

 — 

redis_port

Redis broker listen port

6379

sentinel_port

Sentinel port

26379
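
The Redis broker and Sentinel listen on the ports above. For reference, a minimal sentinel.conf sketch that matches these defaults might look as follows (the master name and host are hypothetical examples, not ADH defaults):

  # sentinel.conf: hypothetical example, adjust the master name and host to your cluster
  port 26379
  # monitor the Redis broker on its default port 6379 with a quorum of 2 Sentinels
  sentinel monitor airflow-broker redis-host.example.com 6379 2
  sentinel down-after-milliseconds airflow-broker 5000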

airflow.cfg
Parameter Description Default value

db_user

The user to connect to Metadata DB

airflow

db_password

The password to connect to Metadata DB

 — 

db_port

The port to connect to Metadata DB

3307

admin_password

The password for the web server’s admin user

 — 

server_port

The port to run the web server

8080

flower_port

The port that Celery Flower runs on

5555

worker_port

When you start an Airflow Worker, Airflow starts a tiny web server subprocess to serve the Worker's local log files to the Airflow main web server, which then builds pages and sends them to users. This parameter defines the port on which the logs are served. The port must be free and accessible from the main web server so that it can connect to the Workers

8793

fernet_key

The secret key used to encrypt connection passwords stored in the database

 — 

security

Defines which security module to use. For example, kerberos

 — 

keytab

The path to the keytab file

 — 

reinit_frequency

Sets the ticket renewal frequency

3600

principal

The Kerberos principal

ssl_active

Defines if SSL is active for Airflow

false

web_server_ssl_cert

The path to the SSL certificate

/etc/ssl/certs/host_cert.cert

web_server_ssl_key

The path to the SSL certificate key

/etc/ssl/host_cert.key

Logging level

Specifies the logging level for Airflow activity

INFO

Logging level for Flask-appbuilder UI

Specifies the logging level for Flask-appbuilder UI

WARNING

cfg_properties_template

The Jinja template to initialize environment variables for Airflow
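
Most of the parameters in this table map onto the standard sections of airflow.cfg. A minimal sketch of that mapping, assuming stock Airflow 2 key names (the keytab path and principal are hypothetical examples, not ADH defaults):

  [core]
  security = kerberos
  # fernet_key is a generated secret; do not commit it to version control
  fernet_key = <generated secret>

  [webserver]
  web_server_port = 8080
  web_server_ssl_cert = /etc/ssl/certs/host_cert.cert
  web_server_ssl_key = /etc/ssl/host_cert.key

  [celery]
  flower_port = 5555

  [kerberos]
  # hypothetical keytab path and principal
  keytab = /etc/security/keytabs/airflow.keytab
  principal = airflow/host.example.com@EXAMPLE.COM
  reinit_frequency = 3600

  [logging]
  logging_level = INFO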

External database
Parameter Description Default value

Database type

The external database type. Possible values: PostgreSQL, MySQL/MariaDB

MySQL/MariaDB

Hostname

The external database host

 — 

Custom port

The external database port

 — 

Airflow database name

The external database name

airflow

External Broker
Parameter Description Default value

Broker URL

The URL of an external broker

 — 
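
A typical value is a Celery-style Redis URL, for example (the host is a hypothetical example): redis://redis-host.example.com:6379/0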

LDAP Security manager
Parameter Description Default value

AUTH_LDAP_SERVER

The LDAP server URI

 — 

AUTH_LDAP_BIND_USER

The DN of the LDAP proxy (bind) user used to bind to the directory. Example: cn=airflow,ou=users,dc=example,dc=com

 — 

AUTH_LDAP_BIND_PASSWORD

The password of the bind user

 — 

AUTH_LDAP_SEARCH

The LDAP search base under which users are granted access to Airflow. Example: dc=example,dc=com

 — 

AUTH_LDAP_UID_FIELD

The UID (unique identifier) field in LDAP

 — 

AUTH_ROLES_MAPPING

Maps LDAP/Active Directory groups to the internal Airflow roles (see the example after this table)

 — 

AUTH_LDAP_GROUP_FIELD

The LDAP user attribute that contains the user's role (group) DNs

 — 

AUTH_ROLES_SYNC_AT_LOGIN

A flag that indicates if all the user’s roles should be replaced on each login, or only on registration

true

PERMANENT_SESSION_LIFETIME

Sets an inactivity timeout after which users have to re-authenticate (to keep roles in sync)

1800

AUTH_LDAP_USE_TLS

Defines whether TLS is used

false

AUTH_LDAP_ALLOW_SELF_SIGNED

Defines whether self-signed certificates are allowed

true

AUTH_LDAP_TLS_CACERTFILE

The path to the CA certificate file

 — 
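
These settings are consumed by the Flask AppBuilder security manager (webserver_config.py). A minimal sketch of what the role mapping might look like (the group DNs are hypothetical examples):

  # webserver_config.py: hypothetical group DNs, adjust to your directory layout
  AUTH_ROLES_MAPPING = {
      "cn=airflow_admins,ou=groups,dc=example,dc=com": ["Admin"],
      "cn=airflow_users,ou=groups,dc=example,dc=com": ["Viewer"],
  }
  # replace the user's roles on every login so that LDAP group changes are picked up
  AUTH_ROLES_SYNC_AT_LOGIN = True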

Flink

flink-conf.yaml
Parameter Description Default value

jobmanager.rpc.port

The RPC port through which the JobManager is reachable. In the high availability mode, this value is ignored and the port number to connect to JobManager is generated by ZooKeeper

6123

sql-gateway.endpoint.rest.port

A port to connect to the SQL Gateway service

8083

taskmanager.network.bind-policy

The automatic address binding policy used by the TaskManager

name

parallelism.default

The system-wide default parallelism level for all execution environments

1

taskmanager.numberOfTaskSlots

The number of task slots that each TaskManager offers. Each slot runs one parallel pipeline

1

taskmanager.cpu.cores

The number of CPU cores used by TaskManager. By default, the value is set to the number of slots per TaskManager

1

taskmanager.memory.flink.size

The total memory size for the TaskExecutors

 — 

taskmanager.memory.process.size

The total process memory size for the TaskExecutors. This includes all the memory that a TaskExecutor consumes, including the total Flink memory, JVM Metaspace, and JVM Overhead. In containerized setups, this parameter should be equal to the container memory

2048m

jobmanager.memory.flink.size

The total memory size for the JobManager

 — 

jobmanager.memory.process.size

The total process memory size for the JobManager. This includes all the memory that a JobManager JVM consumes, including the total Flink memory, JVM Metaspace, and JVM Overhead. In containerized setups, this parameter should be equal to the container memory

2048m

taskmanager.heap.size

The Java heap size for the TaskManager JVM

1024m

jobmanager.memory.heap.size

The heap size for the JobManager JVM

 — 

flink.yarn.appmaster.vcores

The number of virtual cores (vcores) used by the YARN application master

1

taskmanager.host

The external address of the network interface where the TaskManager runs

 — 

taskmanager.memory.task.heap.size

The size of the JVM heap memory reserved for tasks

256m

taskmanager.memory.task.off-heap.size

The size of the off-heap memory reserved for tasks

256m

taskmanager.memory.managed.size

The size of the managed memory for TaskExecutors. This is the size of off-heap memory managed by the memory manager, reserved for sorting, hash tables, caching of intermediate results, and the RocksDB state backend

256m

taskmanager.memory.framework.heap.size

The size of the JVM heap memory reserved for TaskExecutor framework that will not be allocated to task slots

256m

taskmanager.memory.framework.off-heap.size

The size of the off-heap memory reserved for TaskExecutor framework that will not be allocated to task slots

256m

taskmanager.memory.network.min

The minimum network memory size for TaskExecutors. The network memory is the off-heap memory reserved for ShuffleEnvironment (e.g. network buffers)

256m

taskmanager.memory.network.max

The maximum network memory size for TaskExecutors. The network memory is the off-heap memory reserved for ShuffleEnvironment (e.g. network buffers)

256m

taskmanager.memory.jvm-overhead.max

The maximum JVM overhead size for the TaskExecutors. This is the off-heap memory reserved for JVM overhead, such as thread stack space, compile cache, etc.

256m

taskmanager.memory.jvm-metaspace.size

The JVM metaspace size for the TaskExecutors

256m

yarn.provided.lib.dirs

A semicolon-separated list of directories with provided libraries. Flink uses these libraries to avoid uploading the local Flink JARs, which accelerates the job submission process

hdfs:///apps/flink/

flink.yarn.resourcemanager.scheduler.address

The address of the scheduler interface

 — 

flink.yarn.containers.vcores

Sets the number of vcores for Flink YARN containers

1

flink.yarn.application.classpath

A list of files/directories to be added to the classpath. To add more items to the classpath, click the plus icon

  • /etc/hadoop/conf/*

  • /usr/lib/hadoop/*

  • /usr/lib/hadoop/lib/*

  • /usr/lib/hadoop-hdfs/*

  • /usr/lib/hadoop-hdfs/lib/*

  • /usr/lib/hadoop-yarn/*

  • /usr/lib/hadoop-yarn/lib/*

  • /usr/lib/hadoop-mapreduce/*

  • /usr/lib/hadoop-mapreduce/lib/*

high-availability.cluster-id

The ID of the Flink cluster used to separate multiple Flink clusters from each other

default

high-availability.storageDir

A file system path (URI) where Flink persists metadata in the HA mode

 — 

high-availability

Defines the High Availability (HA) mode used for cluster execution

NONE

high-availability.zookeeper.quorum

The ZooKeeper quorum to use when running Flink in the HA mode with ZooKeeper

 — 

high-availability.zookeeper.path.root

The root path for the Flink znode in ZooKeeper

/flink

sql-gateway.session.check-interval

The check interval to detect idle sessions. A value <= 0 disables the checks

1 min

sql-gateway.session.idle-timeout

The timeout to close a session if no successful connection was made during this interval. A value <= 0 never closes the sessions

10 min

sql-gateway.session.max-num

The maximum number of sessions to run simultaneously

1000000

sql-gateway.worker.keepalive-time

The time to keep an idle worker thread alive. When the worker thread count exceeds sql-gateway.worker.threads.min, excess threads are terminated after this interval

5 min

sql-gateway.worker.threads.max

The maximum number of worker threads on the SQL Gateway server

500

sql-gateway.worker.threads.min

The minimum number of worker threads. If the current number of worker threads is less than this value, the worker threads are not deleted automatically

5

security.kerberos.login.use-ticket-cache

Indicates whether to read from the Kerberos ticket cache

false

security.kerberos.login.keytab

The absolute path to the Kerberos keytab file that stores user credentials

 — 

security.kerberos.login.principal

Flink Kerberos principal

 — 

security.delegation.tokens.hive.renewer

Flink Kerberos principal for Hive

 — 

security.kerberos.login.contexts

A comma-separated list of login contexts to provide the Kerberos credentials to

 — 

security.ssl.internal.enabled

Enables SSL for internal communication channels between Flink components. This includes the communication between TaskManagers, the transport of blobs from the JobManager to TaskManagers, RPC connections, etc.

false

security.ssl.internal.keystore

The path to the keystore file to be used by Flink’s internal endpoints

 — 

security.ssl.internal.truststore

The path to the truststore file used by Flink's internal endpoints

 — 

security.ssl.internal.keystore-password

The password to the keystore file used by Flink's internal endpoints

 — 

security.ssl.internal.truststore-password

The password to the truststore file used by Flink's internal endpoints

 — 

security.ssl.internal.key-password

The password to decrypt the key in the keystore

 — 

security.ssl.rest.enabled

Turns on SSL for external communication via REST endpoints

false

security.ssl.rest.keystore

The Java keystore file with SSL keys and certificates to be used by Flink’s external REST endpoints

 — 

security.ssl.rest.truststore

The truststore file containing public CA certificates to verify the peer for Flink’s external REST endpoints

 — 

security.ssl.rest.keystore-password

The secret to decrypt the keystore file for Flink external REST endpoints

 — 

security.ssl.rest.truststore-password

The password to decrypt the truststore for Flink’s external REST endpoints

 — 

security.ssl.rest.key-password

The secret to decrypt the key in the keystore for Flink’s external REST endpoints

 — 

security.ssl.protocol

The TLS protocol version to be used for SSL. Accepts a single value, not a list

TLSv1.2

zookeeper.sasl.disable

Defines whether SASL authentication in ZooKeeper is disabled

false

Logging level

Defines the logging level for Flink activity

INFO
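
As an illustration of how the high-availability parameters above fit together, a minimal flink-conf.yaml fragment for ZooKeeper-based HA might look as follows (the host names and storage directory are hypothetical examples, not ADH defaults):

  # ZooKeeper-based high availability: hypothetical hosts and paths
  high-availability: zookeeper
  high-availability.zookeeper.quorum: zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181
  high-availability.zookeeper.path.root: /flink
  high-availability.cluster-id: default
  high-availability.storageDir: hdfs:///flink/recovery/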

Other
Parameter Description Default value

Custom flink-conf.yaml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file flink-conf.yaml

 — 

log4j.properties

The contents of the log4j.properties configuration file

log4j-cli.properties

The contents of the log4j-cli.properties configuration file

HBase

hbase-site.xml
Parameter Description Default value

hbase.balancer.period

The time period to run the Region balancer in Master

300000

hbase.client.pause

The general client pause value. Used mostly as the time to wait before retrying a failed get, a region lookup, etc. See hbase.client.retries.number for a description of how this pause works with retries

100

hbase.client.max.perregion.tasks

The maximum number of concurrent mutation tasks the Client will maintain to a single Region. That is, if there are already hbase.client.max.perregion.tasks writes in progress for this Region, new puts won't be sent to this Region until some of the writes finish

1

hbase.client.max.perserver.tasks

The maximum number of concurrent mutation tasks a single HTable instance will send to a single Region Server

2

hbase.client.max.total.tasks

The maximum number of concurrent mutation tasks a single HTable instance will send to the cluster

100

hbase.client.retries.number

The maximum number of retries. It is used as the maximum for all retryable operations, such as getting a cell value, starting a row update, etc. The retry interval is a rough function based on hbase.client.pause. See the constant RETRY_BACKOFF for how the backoff ramps up. Change this setting and hbase.client.pause to suit your workload

15

hbase.client.scanner.timeout.period

The Client scanner lease period in milliseconds

60000

hbase.cluster.distributed

The cluster mode. Possible values are: false — for standalone mode and pseudo-distributed setups with managed ZooKeeper; true — for fully-distributed mode with unmanaged ZooKeeper Quorum. If false, startup runs all HBase and ZooKeeper daemons together in one JVM; if true — one JVM instance per daemon

true

hbase.hregion.majorcompaction

The time interval between Major compactions in milliseconds. Set to 0 to disable time-based automatic Major compactions. User-requested and size-based Major compactions will still run. This value is multiplied by hbase.hregion.majorcompaction.jitter to cause compaction to start at a somewhat-random time during a given time frame

604800000

hbase.hregion.max.filesize

The maximum file size. If the total size of a Region's HFiles has grown to exceed this value, the Region is split in two. This check can work in one of two ways: split when any single store size exceeds the threshold, or split when the overall Region size exceeds the threshold. The behavior is selected by hbase.hregion.split.overallfiles

10737418240

hbase.hstore.blockingStoreFiles

If more than this number of StoreFiles exists in any Store (one StoreFile is written per flush of MemStore), updates are blocked for this Region, until a compaction is completed, or until hbase.hstore.blockingWaitTime is exceeded

16

hbase.hstore.blockingWaitTime

The time for which a Region will block updates after reaching the StoreFile limit, defined by hbase.hstore.blockingStoreFiles. After this time has elapsed, the Region stops blocking updates even if a compaction has not been completed

90000

hbase.hstore.compaction.max

The maximum number of StoreFiles that will be selected for a single Minor compaction, regardless of the number of eligible StoreFiles. Effectively, the value of hbase.hstore.compaction.max controls the time it takes for a single compaction to complete. Setting it larger means that more StoreFiles are included in a compaction. For most cases, the default value is appropriate

10

hbase.hstore.compaction.min

The minimum number of StoreFiles that must be eligible for compaction before compaction can run. The goal of tuning hbase.hstore.compaction.min is to avoid a situation with too many tiny StoreFiles to compact. Setting this value to 2 would cause a Minor compaction each time you have two StoreFiles in a Store, and this is probably not appropriate. If you set this value too high, all the other values will need to be adjusted accordingly. For most cases, the default value is appropriate. In the previous versions of HBase, the parameter hbase.hstore.compaction.min was called hbase.hstore.compactionThreshold

3

hbase.hstore.compaction.min.size

A StoreFile smaller than this size will always be eligible for Minor compaction. StoreFiles of this size or larger are evaluated by hbase.hstore.compaction.ratio to determine if they are eligible. Because this limit represents the "automatic include" limit for all StoreFiles smaller than this value, it may need to be reduced in write-heavy environments where many files in the 1-2 MB range are being flushed: every StoreFile will be targeted for compaction, and the resulting StoreFiles may still be under the minimum size and require further compaction. If this parameter is lowered, the ratio check is triggered more quickly. This addressed some issues seen in earlier versions of HBase, but changing this parameter is no longer necessary in most situations

134217728

hbase.hstore.compaction.ratio

For Minor compaction, this ratio is used to determine whether a given StoreFile that is larger than hbase.hstore.compaction.min.size is eligible for compaction. Its effect is to limit compaction of large StoreFiles. The value of hbase.hstore.compaction.ratio is expressed as a floating-point decimal

1.2F

hbase.hstore.compaction.ratio.offpeak

The compaction ratio used during off-peak compactions if the off-peak hours are also configured. Expressed as a floating-point decimal. This allows for more aggressive (or less aggressive, if you set it lower than hbase.hstore.compaction.ratio) compaction during a given time period. The value is ignored if off-peak is disabled (default). This works the same as hbase.hstore.compaction.ratio

5.0F

hbase.hstore.compactionThreshold

If more than this number of StoreFiles exists in any Store (one StoreFile is written per flush of MemStore), a compaction is run to rewrite all StoreFiles into a single StoreFile. Larger values delay the compaction, but when compaction does occur, it takes longer to complete

3

hbase.hstore.flusher.count

The number of flush threads. With fewer threads, the MemStore flushes will be queued. With more threads, the flushes will be executed in parallel, increasing the load on HDFS, and potentially causing more compactions

2

hbase.hstore.time.to.purge.deletes

The amount of time to delay purging of delete markers with future timestamps. If unset or set to 0, all the delete markers, including those with future timestamps, are purged during the next Major compaction. Otherwise, a delete marker is kept until the Major compaction that occurs after the marker timestamp plus the value of this setting (in milliseconds)

0

hbase.master.ipc.address

The address the HMaster RPC server binds to

0.0.0.0

hbase.normalizer.period

The period at which the Region normalizer runs on Master (in milliseconds)

300000

hbase.regionserver.compaction.enabled

Enables/disables compactions by setting true/false. You can further switch compactions dynamically with the compaction_switch shell command

true

hbase.regionserver.ipc.address

The address the Region Server RPC server binds to

0.0.0.0

hbase.regionserver.regionSplitLimit

The limit for the number of Regions, after which no more Region splitting should take place. This is not a hard limit for the number of Regions, but acts as a guideline for the Region Server to stop splitting after a certain limit

1000

hbase.rootdir

The directory shared by Region Servers and into which HBase persists. The URL should be fully-qualified to include the filesystem scheme. For example, to specify the HDFS directory /hbase where the HDFS instance NameNode is running at namenode.example.org on port 9000, set this value to: hdfs://namenode.example.org:9000/hbase

 — 

hbase.zookeeper.quorum

A comma-separated list of servers in the ZooKeeper ensemble. For example, host1.mydomain.com,host2.mydomain.com,host3.mydomain.com. By default, this is set to localhost for local and pseudo-distributed modes of operation. For a fully-distributed setup, this should be set to a full list of ZooKeeper ensemble servers. If HBASE_MANAGES_ZK is set in hbase-env.sh, this is the list of servers, which HBase will start/stop ZooKeeper on, as part of cluster start/stop. Client-side, the list of ensemble members is put together with the hbase.zookeeper.property.clientPort config and is passed to the ZooKeeper constructor as the connection string parameter

 — 

zookeeper.session.timeout

The ZooKeeper session timeout in milliseconds. It is used in two different ways. First, this value is processed by the ZooKeeper client that HBase uses to connect to the ensemble. It is also used by HBase when it starts a ZooKeeper server (in that case the timeout is passed as the maxSessionTimeout). See the ZooKeeper documentation for more details. For example, if an HBase Region Server connects to a ZooKeeper ensemble that is also managed by HBase, then the session timeout will be the one specified by this configuration. But a Region Server that connects to an ensemble managed with a different configuration will be subjected to the maxSessionTimeout of that ensemble. So, even though HBase might propose using 90 seconds, the ensemble can have a max timeout lower than this, and it will take precedence. The current default maxSessionTimeout that ZooKeeper ships with is 40 seconds, which is lower than the HBase default

90000

zookeeper.znode.parent

The root znode for HBase in ZooKeeper. All of the HBase ZooKeeper files configured with a relative path will go under this node. By default, all of the HBase ZooKeeper file paths are configured with a relative path, so they will all go under this directory unless changed

/hbase

hbase.rest.port

The port used by the HBase REST server

60080

hbase.zookeeper.property.authProvider.1

Specifies the ZooKeeper authentication method

hbase.security.authentication

Set the value to true to run HBase RPC with strong authentication

false

hbase.security.authorization

Set the value to true to run HBase RPC with strong authorization

false

hbase.master.kerberos.principal

The Kerberos principal used to run the HMaster process

 — 

hbase.master.keytab.file

Full path to the Kerberos keytab file to use for logging in the configured HMaster server principal

 — 

hbase.regionserver.kerberos.principal

The Kerberos principal name that should be used to run the HRegionServer process

 — 

hbase.regionserver.keytab.file

Full path to the Kerberos keytab file to use for logging in the configured HRegionServer server principal

 — 

hbase.rest.authentication.type

REST Gateway Kerberos authentication type

 — 

hbase.rest.authentication.kerberos.principal

REST Gateway Kerberos principal

 — 

hbase.rest.authentication.kerberos.keytab

REST Gateway Kerberos keytab

 — 

hbase.rest.support.proxyuser

Enables the proxy user mode for the REST server

false

hbase.thrift.keytab.file

Thrift Kerberos keytab

 — 

hbase.rest.keytab.file

HBase REST gateway Kerberos keytab

 — 

hbase.rest.kerberos.principal

HBase REST gateway Kerberos principal

 — 

hbase.thrift.kerberos.principal

Thrift Kerberos principal

 — 

hbase.thrift.security.qop

Defines authentication, integrity, and confidentiality checking. Supported values:

  • auth-conf — authentication, integrity, and confidentiality checking;

  • auth-int — authentication and integrity checking;

  • auth — authentication checking only.

 — 

phoenix.queryserver.keytab.file

The path to the Kerberos keytab file

 — 

phoenix.queryserver.kerberos.principal

The Kerberos principal to use when authenticating. If phoenix.queryserver.kerberos.http.principal is not defined, this principal will also be used both to authenticate SPNEGO connections and to connect to HBase

 — 

phoenix.queryserver.kerberos.keytab

The full path to the Kerberos keytab file to use for logging in the configured Phoenix Query Server principal

 — 

phoenix.queryserver.http.keytab.file

The keytab file to use for authenticating SPNEGO connections. This configuration must be specified if phoenix.queryserver.kerberos.http.principal is configured. phoenix.queryserver.keytab.file will be used if this property is undefined

 — 

phoenix.queryserver.http.kerberos.principal

The Kerberos principal to use when authenticating SPNEGO connections. phoenix.queryserver.kerberos.principal will be used if this property is undefined

 — 

phoenix.queryserver.kerberos.http.principal

Deprecated, use phoenix.queryserver.http.kerberos.principal instead

 — 

hbase.security.authentication.ui

Enables Kerberos authentication to HBase web UI with SPNEGO

 — 

hbase.security.authentication.spnego.kerberos.principal

The Kerberos principal for SPNEGO authentication

 — 

hbase.security.authentication.spnego.kerberos.keytab

The path to the Kerberos keytab file with principals to be used for SPNEGO authentication

 — 

hbase.ssl.enabled

Defines whether SSL is enabled for web UIs

false

hadoop.ssl.enabled

Defines whether SSL is enabled for Hadoop RPC

false

ssl.server.keystore.location

The path to the keystore file

 — 

ssl.server.keystore.password

The password to the keystore

 — 

ssl.server.truststore.location

The path to the truststore to be used

 — 

ssl.server.truststore.password

The password to the truststore

 — 

ssl.server.keystore.keypassword

The password to the key in the keystore

 — 

hbase.rest.ssl.enabled

Defines whether SSL is enabled for HBase REST server

false

hbase.rest.ssl.keystore.store

The path to the keystore used by HBase REST server

 — 

hbase.rest.ssl.keystore.password

The password to the keystore

 — 

hbase.rest.ssl.keystore.keypassword

The password to the key in the keystore

 — 

hadoop.security.credential.provider.path

Path to the credential provider (jceks) containing the passwords to all services

 — 
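
To illustrate how the security parameters above combine, a minimal hbase-site.xml fragment for Kerberos-secured RPC might look like this (the realm and keytab path are hypothetical examples, not ADH defaults):

  <!-- Kerberos-secured HBase RPC: hypothetical realm and keytab path -->
  <property>
    <name>hbase.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hbase.security.authorization</name>
    <value>true</value>
  </property>
  <property>
    <name>hbase.master.kerberos.principal</name>
    <value>hbase/_HOST@EXAMPLE.COM</value>
  </property>
  <property>
    <name>hbase.master.keytab.file</name>
    <value>/etc/security/keytabs/hbase.service.keytab</value>
  </property>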

Credential encryption
Parameter Description Default value

Encryption enable

Defines whether the credentials are encrypted

false

Credential provider path

Path to the credential provider for creating the .jceks files containing secret keys

jceks://file/etc/hbase/conf/hbase.jceks

Ranger plugin credential provider path

Path to the Ranger plugin credential provider

jceks://file/etc/hbase/conf/ranger-hbase.jceks

Custom jceks

Defines whether custom .jceks files located at the credential provider path are used (true) or auto-generated ones (false)

false

Password file name

Name of the password file in the classpath of the service if the password file is selected in the credstore options

hbase_credstore_pass

hbase-env.sh
Parameter Description Default value

HBase Master Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for HBase Master

-Xms700m -Xmx9G

Phoenix QueryServer Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for Phoenix Query server

-Xms700m -Xmx8G

HBase Thrift2 Server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for HBase Thrift2 server

-Xms700m -Xmx8G

HBase REST Server Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for HBase Rest server

-Xms200m -Xmx8G

HBASE_OPTS

Additional Java runtime options

 — 

HBASE_CLASSPATH

The classpath for HBase. A list of files/directories to be added to the classpath. To add more items to the classpath, click the plus icon

  • /usr/lib/phoenix/phoenix-server-hbase.jar

hbase-regionserver-env.sh
Parameter Description Default value

HBase RegionServer Heap Memory

Sets initial (-Xms) and maximum (-Xmx) Java heap size for HBase Region server

-Xms700m -Xmx9G

ranger-hbase-audit.xml
Parameter Description Default value

xasecure.audit.destination.solr.batch.filespool.dir

The spool directory path

/srv/ranger/hdfs_plugin/audit_solr_spool

xasecure.audit.destination.solr.urls

A URL of the Solr server to store audit events. Leave this property value empty or set it to NONE when using ZooKeeper to connect to Solr

 — 

xasecure.audit.destination.solr.zookeepers

Specifies the ZooKeeper connection string for the Solr destination

 — 

xasecure.audit.destination.solr.force.use.inmemory.jaas.config

Uses in-memory JAAS configuration file to connect to Solr

 — 

xasecure.audit.is.enabled

Enables Ranger audit

true

xasecure.audit.jaas.Client.loginModuleControlFlag

Specifies whether the success of the module is required, requisite, sufficient, or optional

 — 

xasecure.audit.jaas.Client.loginModuleName

The name of the authenticator class

 — 

xasecure.audit.jaas.Client.option.keyTab

The name of the keytab file to get the principal’s secret key

 — 

xasecure.audit.jaas.Client.option.principal

The name of the principal to be used

 — 

xasecure.audit.jaas.Client.option.serviceName

The name of a user or a service that wants to log in

 — 

xasecure.audit.jaas.Client.option.storeKey

Set this to true if you want the keytab or the principal’s key to be stored in the subject’s private credentials

false

xasecure.audit.jaas.Client.option.useKeyTab

Set this to true if you want the module to get the principal’s key from the keytab

false

ranger-hbase-security.xml
Parameter Description Default value

ranger.plugin.hbase.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.hbase.service.name

The name of the Ranger service containing policies for this instance

 — 

ranger.plugin.hbase.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/hbase/policycache

ranger.plugin.hbase.policy.pollIntervalMs

Defines how often to poll for changes in policies

30000

ranger.plugin.hbase.policy.rest.client.connection.timeoutMs

The HBase Plugin RangerRestClient connection timeout (in milliseconds)

120000

ranger.plugin.hbase.policy.rest.client.read.timeoutMs

The HBase Plugin RangerRestClient read timeout (in milliseconds)

30000

ranger.plugin.hbase.policy.rest.ssl.config.file

The path to the RangerRestClient SSL config file for HBase plugin

/etc/hbase/conf/ranger-hbase-policymgr-ssl.xml

ranger-hbase-policymgr-ssl.xml
Parameter Description Default value

xasecure.policymgr.clientssl.keystore

The path to the keystore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.credential.file

The path to the keystore credentials file

/etc/hbase/conf/ranger-hbase.jceks

xasecure.policymgr.clientssl.truststore.credential.file

The path to the truststore credentials file

/etc/hbase/conf/ranger-hbase.jceks

xasecure.policymgr.clientssl.truststore

The path to the truststore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.password

The password to the keystore file

 — 

xasecure.policymgr.clientssl.truststore.password

The password to the truststore file

 — 

Other
Parameter Description Default value

Custom hbase-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hbase-site.xml

 — 

Custom hbase-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hbase-env.sh

 — 

Custom hbase-regionserver-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hbase-regionserver-env.sh

 — 

Ranger plugin enabled

Whether or not Ranger plugin is enabled

false

Custom ranger-hbase-audit.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hbase-audit.xml

 — 

Custom ranger-hbase-security.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hbase-security.xml

 — 

Custom ranger-hbase-policymgr-ssl.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hbase-policymgr-ssl.xml

 — 

Custom log4j.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file log4j.properties

Custom hadoop-metrics2-hbase.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hadoop-metrics2-hbase.properties

HDFS

core-site.xml
Parameter Description Default value

fs.defaultFS

The name of the default file system. A URI whose scheme and authority determine the FileSystem implementation. The URI scheme determines the config property (fs.SCHEME.impl) naming the FileSystem implementation class. The URI authority is used to determine the host, port, etc. for a filesystem

 — 

fs.trash.checkpoint.interval

The number of minutes between trash checkpoints. Should be smaller than or equal to fs.trash.interval. Every time the checkpointer runs, it creates a new checkpoint out of the current trash and removes checkpoints created more than fs.trash.interval minutes ago

60

fs.trash.interval

The number of minutes after which a trash checkpoint gets deleted. If set to 0, the trash feature is disabled

1440

hadoop.tmp.dir

The base for other temporary directories

/tmp/hadoop-${user.name}

hadoop.zk.address

A comma-separated list of <Host>:<Port> pairs. Each pair corresponds to a ZooKeeper server used by the Resource Manager for storing the Resource Manager state

 — 

io.file.buffer.size

The buffer size for sequence files. The size of this buffer should probably be a multiple of the hardware page size (4096 on Intel x86), and it determines how much data is buffered during read and write operations

131072

net.topology.script.file.name

The name of the script that should be invoked to resolve DNS names to NetworkTopology names. For example, the script would take host.foo.bar as an argument and return /rack1 as the output

 — 

ha.zookeeper.quorum

A list of ZooKeeper Server addresses, separated by commas, that are to be used by the ZKFailoverController in automatic failover

 — 

ipc.client.fallback-to-simple-auth-allowed

When a client is configured to attempt a secure connection, but attempts to connect to an insecure server, that server may instruct the client to switch to SASL SIMPLE (insecure) authentication. This setting controls whether or not the client will accept this instruction from the server. When set to false (default), the client does not allow the fallback to SIMPLE authentication and will abort the connection

false

hadoop.security.authentication

Defines the authentication type. Possible values: simple — no authentication, kerberos — enables the authentication by Kerberos

simple

hadoop.security.authorization

Enables RPC service-level authorization

false

hadoop.rpc.protection

Specifies RPC protection. Possible values:

  • authentication — authentication only;

  • integrity — performs the integrity check in addition to authentication;

  • privacy — encrypts the data in addition to integrity.

authentication

hadoop.security.auth_to_local

The value is a string containing new line characters. See Kerberos documentation for more information about the format

 — 

hadoop.http.authentication.type

Defines authentication used for the HTTP web-consoles. The supported values are: simple, kerberos, [AUTHENTICATION_HANDLER-CLASSNAME]

simple

hadoop.http.authentication.kerberos.principal

Indicates the Kerberos principal to be used for the HTTP endpoint when using kerberos authentication. The principal short name must be HTTP as per the Kerberos HTTP SPNEGO specification

HTTP/localhost@$LOCALHOST

hadoop.http.authentication.kerberos.keytab

The location of the keytab file with the credentials for the Kerberos principal used for the HTTP endpoint

/etc/security/keytabs/HTTP.service.keytab

ha.zookeeper.acl

ACLs for all znodes

 — 

hadoop.http.filter.initializers

Add to this property the org.apache.hadoop.security.AuthenticationFilterInitializer initializer class

 — 

hadoop.http.authentication.signature.secret.file

The signature secret file for signing the authentication tokens. If not set, a random secret is generated during startup. The same secret should be used for all nodes in the cluster: JobTracker, NameNode, DataNode, and TaskTracker. This file should be readable only by the Unix user running the daemons

/etc/security/http_secret

hadoop.http.authentication.cookie.domain

The domain to use for the HTTP cookie that stores the authentication token. In order for authentication to work properly across all nodes in the cluster, the domain must be correctly set. There is no default value; without a domain, the HTTP cookie works only with the hostname that issued it

 — 

hadoop.ssl.require.client.cert

Defines whether client certificates are required

false

hadoop.ssl.hostname.verifier

The host name verifier to provide for HttpsURLConnections. Valid values are: DEFAULT, STRICT, STRICT_IE6, DEFAULT_AND_LOCALHOST, and ALLOW_ALL

DEFAULT

hadoop.ssl.keystores.factory.class

The KeyStoresFactory implementation to use

org.apache.hadoop.security.ssl.FileBasedKeyStoresFactory

hadoop.ssl.server.conf

A resource file from which the SSL server keystore information will be extracted. This file is looked up in the classpath; typically, it should be located in the Hadoop conf/ directory

ssl-server.xml

hadoop.ssl.client.conf

A resource file from which the SSL client keystore information will be extracted. This file is looked up in the classpath; typically, it should be located in the Hadoop conf/ directory

ssl-client.xml

User managed hadoop.security.auth_to_local

Disables the automatic generation of hadoop.security.auth_to_local

false
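
For illustration, enabling Kerberos with the parameters above results in a core-site.xml fragment along these lines (a sketch with example values, not ADH defaults):

  <!-- Kerberos authentication and RPC protection: example values -->
  <property>
    <name>hadoop.security.authentication</name>
    <value>kerberos</value>
  </property>
  <property>
    <name>hadoop.security.authorization</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.rpc.protection</name>
    <value>privacy</value>
  </property>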

Credential Encryption
Parameter Description Default value

Encryption enable

Enables or disables the credential encryption feature. When enabled, HDFS stores configuration passwords and credentials required for interacting with other services in the encrypted form

false

Credential provider path

The path to a keystore file with secrets

jceks://file/etc/hadoop/conf/hadoop.jceks

Ranger plugin credential provider path

The path to a Ranger keystore file with secrets

jceks://file/etc/hadoop/conf/ranger-hdfs.jceks

Custom jceks

Set to true to use a custom JCEKS file. Set to false to use the default auto-generated JCEKS file

false

Password file name

The name of the file in the service’s classpath that stores passwords

hadoop_credstore_pass

Enable CORS
Parameter Description Default value

hadoop.http.cross-origin.enabled

Enables cross-origin support for all web services

true

hadoop.http.cross-origin.allowed-origins

A comma-separated list of origins that are allowed. Values prefixed with regex: are interpreted as regular expressions. Values containing wildcards (*) are possible as well; in this case a regular expression is generated, but its use is discouraged and is supported only for backward compatibility

*

hadoop.http.cross-origin.allowed-headers

Comma-separated list of allowed headers

X-Requested-With,Content-Type,Accept,Origin,WWW-Authenticate,Accept-Encoding,Transfer-Encoding

hadoop.http.cross-origin.allowed-methods

Comma-separated list of methods that are allowed

GET,PUT,POST,OPTIONS,HEAD,DELETE

hadoop.http.cross-origin.max-age

The number of seconds a preflight request can be cached

1800

core_site.enable_cors.active

Enables CORS (Cross-Origin Resource Sharing)

true
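
The CORS parameters above end up in core-site.xml. A minimal sketch (the allowed origin is a hypothetical example; the shipped default is *):

  <!-- Cross-origin support for Hadoop web services: example origin -->
  <property>
    <name>hadoop.http.cross-origin.enabled</name>
    <value>true</value>
  </property>
  <property>
    <name>hadoop.http.cross-origin.allowed-origins</name>
    <value>https://adcm.example.com</value>
  </property>
  <property>
    <name>hadoop.http.cross-origin.max-age</name>
    <value>1800</value>
  </property>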

hdfs-site.xml
Parameter Description Default value

dfs.client.block.write.replace-datanode-on-failure.enable

If there is a DataNode/network failure in the write pipeline, DFSClient will try to remove the failed DataNode from the pipeline and then continue writing with the remaining DataNodes. As a result, the number of DataNodes in the pipeline is decreased. The feature is to add new DataNodes to the pipeline. This is a site-wide property to enable/disable the feature. When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature. Otherwise, users may experience an unusually high rate of pipeline failures since it is impossible to find new DataNodes for replacement. See also dfs.client.block.write.replace-datanode-on-failure.policy

true

dfs.client.block.write.replace-datanode-on-failure.policy

This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. Possible values:

  • ALWAYS. Always adds a new DataNode, when an existing DataNode is removed.

  • NEVER. Never adds a new DataNode.

  • DEFAULT. Let r be the replication number. Let n be the number of existing DataNodes. Add a new DataNode only, if r is greater than or equal to 3 and either:

    1. floor(r/2) is greater than or equal to n;

    2. r is greater than n and the block is hflushed/appended.

DEFAULT

dfs.client.block.write.replace-datanode-on-failure.best-effort

This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. Best effort means that the client will try to replace a failed DataNode in the write pipeline (provided that the policy is satisfied); however, it continues the write operation even if the DataNode replacement fails. Suppose the DataNode replacement fails: false — an exception is thrown and the write fails; true — the write is resumed with the remaining DataNodes. Note that setting this property to true allows writing to a pipeline with a smaller number of DataNodes and, as a result, increases the probability of data loss

false

dfs.client.block.write.replace-datanode-on-failure.min-replication

The minimum number of replications needed not to fail the write pipeline if new DataNodes cannot be found to replace failed DataNodes (for example, due to a network failure) in the write pipeline. If the number of the remaining DataNodes in the write pipeline is greater than or equal to this property value, writing continues to the remaining nodes. Otherwise, an exception is thrown. If this is set to 0, an exception will be thrown when a replacement cannot be found. See also dfs.client.block.write.replace-datanode-on-failure.policy

0

dfs.balancer.dispatcherThreads

The size of the thread pool for the HDFS balancer block mover — dispatchExecutor

200

dfs.balancer.movedWinWidth

The time window in milliseconds during which the HDFS balancer tracks blocks and their locations

5400000

dfs.balancer.moverThreads

The thread pool size for executing block moves — moverThreadAllocator

1000

dfs.balancer.max-size-to-move

The maximum number of bytes that can be moved by the balancer in a single thread

10737418240

dfs.balancer.getBlocks.min-block-size

The minimum block size in bytes; blocks smaller than this threshold are ignored when fetching a source block list

10485760

dfs.balancer.getBlocks.size

The total size in bytes of DataNode blocks to get, when fetching a source block list

2147483648

dfs.balancer.block-move.timeout

The maximum amount of time for a block to move (in milliseconds). If set greater than 0, the balancer will stop waiting for a block move completion after this time. In typical clusters, a 3-5 minute timeout is reasonable. If a timeout occurs for a large proportion of block moves, this value needs to be increased. It could also be that too much work is dispatched and many nodes are constantly exceeding the bandwidth limit as a result. In that case, other balancer parameters might need to be adjusted. It is disabled (0) by default

0

dfs.balancer.max-no-move-interval

If this specified amount of time has elapsed and no blocks have been moved out of a source DataNode, one more attempt will be made to move blocks out of this DataNode in the current Balancer iteration

60000

dfs.balancer.max-iteration-time

The maximum amount of time an iteration can be run by the Balancer. After this time the Balancer will stop the iteration, and re-evaluate the work needed to be done to balance the cluster. The default value is 20 minutes

1200000

dfs.blocksize

The default block size for new files (in bytes). You can use the following suffixes to define size units (case insensitive): k (kilo), m (mega), g (giga), t (tera), p (peta), e (exa). For example, 128k, 512m, 1g, etc. You can also specify the block size in bytes (such as 134217728 for 128 MB)

134217728

dfs.client.read.shortcircuit

Turns on short-circuit local reads

true

dfs.datanode.balance.max.concurrent.moves

The maximum number of threads for DataNode balancer pending moves. This value is reconfigurable via the dfsadmin -reconfig command

50

dfs.datanode.data.dir

Determines where on the local filesystem a DFS DataNode should store its blocks. If multiple directories are specified, then data will be stored in all named directories, typically on different devices. The directories should be tagged with corresponding storage types (SSD/DISK/ARCHIVE/RAM_DISK) for HDFS storage policies. The default storage type will be DISK if the directory does not have a storage type tagged explicitly. Directories that do not exist will be created if local filesystem permissions allow

/srv/hadoop-hdfs/data:DISK

dfs.disk.balancer.max.disk.throughputInMBperSec

The maximum disk bandwidth, used by the disk balancer during reads from a source disk. The unit is MB/sec

10

dfs.disk.balancer.block.tolerance.percent

The parameter specifies when a good enough value is reached for any copy step (in percent). For example, if set to 10, then getting within 10% of the target value is considered good enough. In other words, if the move operation is 20 GB in size and 18 GB (20 * (1-10%)) can be moved, the entire operation is considered successful

10

dfs.disk.balancer.max.disk.errors

During a block move from a source to destination disk, there might be various errors. This parameter defines how many errors to tolerate before declaring a move between 2 disks (or a step) has failed

5

dfs.disk.balancer.plan.valid.interval

The maximum amount of time a disk balancer plan (a set of configurations that define the data volume to be redistributed between two disks) remains valid. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified, then milliseconds are assumed

1d

dfs.disk.balancer.plan.threshold.percent

Defines a data storage threshold (in percent) at which disks start participating in data redistribution or balancing activities

10

dfs.domain.socket.path

The path to a UNIX domain socket that will be used for communication between the DataNode and local HDFS clients. If the string _PORT is present in this path, it will be replaced by the TCP port of the DataNode. The parameter is optional

/var/lib/hadoop-hdfs/dn_socket

dfs.hosts

Names a file that contains a list of hosts allowed to connect to the NameNode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted

/etc/hadoop/conf/dfs.hosts

dfs.mover.movedWinWidth

The minimum time interval for a block to be moved to another location again (in milliseconds)

5400000

dfs.mover.moverThreads

Sets the balancer mover thread pool size

1000

dfs.mover.retry.max.attempts

The maximum number of retries before the mover considers the move as failed

10

dfs.mover.max-no-move-interval

If this specified amount of time has elapsed and no block has been moved out of a source DataNode, one more attempt will be made to move blocks out of this DataNode in the current mover iteration

60000

dfs.namenode.name.dir

Determines where on the local filesystem the DFS name node should store the name table (fsimage). If multiple directories are specified, then the name table is replicated in all of the directories, for redundancy

/srv/hadoop-hdfs/name

dfs.namenode.checkpoint.dir

Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If multiple directories are specified, then the image is replicated in all of the directories for redundancy

/srv/hadoop-hdfs/checkpoint

dfs.namenode.hosts.provider.classname

The class that provides access for host files. org.apache.hadoop.hdfs.server.blockmanagement.HostFileManager is used by default that loads files specified by dfs.hosts and dfs.hosts.exclude. If org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager is used, it will load the JSON file defined in dfs.hosts. To change the class name, NameNode restart is required. dfsadmin -refreshNodes only refreshes the configuration files, used by the class

org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager

dfs.namenode.rpc-bind-host

The actual address, the RPC Server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.rpc-address. It can also be specified per NameNode or name service for HA/Federation. This is useful for making the NameNode listen on all interfaces by setting it to 0.0.0.0

0.0.0.0

dfs.permissions.superusergroup

The name of the group of super-users. The value should be a single group name

hadoop

dfs.replication

The default block replication factor. The actual number of replicas can be specified when the file is created. The default is used if replication is not specified at create time

3

dfs.journalnode.http-address

The HTTP address of the JournalNode web UI

0.0.0.0:8480

dfs.journalnode.https-address

The HTTPS address of the JournalNode web UI

0.0.0.0:8481

dfs.journalnode.rpc-address

The RPC address of the JournalNode

0.0.0.0:8485

dfs.datanode.http.address

The address of the DataNode HTTP server

0.0.0.0:9864

dfs.datanode.https.address

The address of the DataNode HTTPS server

0.0.0.0:9865

dfs.datanode.address

The address of the DataNode for data transfer

0.0.0.0:9866

dfs.datanode.ipc.address

The IPC address of the DataNode

0.0.0.0:9867

dfs.namenode.http-address

The address and the base port to access the dfs NameNode web UI

0.0.0.0:9870

dfs.namenode.https-address

The secure HTTPS address of the NameNode

0.0.0.0:9871

dfs.ha.automatic-failover.enabled

Defines whether automatic failover is enabled

true

dfs.ha.fencing.methods

A list of scripts or Java classes that will be used to fence the Active NameNode during a failover

shell(/bin/true)

dfs.journalnode.edits.dir

The directory where to store journal edit files

/srv/hadoop-hdfs/journalnode

dfs.namenode.shared.edits.dir

The directory on shared storage between the multiple NameNodes in an HA cluster. This directory will be written by the active and read by the standby in order to keep the namespaces synchronized. This directory does not need to be listed in dfs.namenode.edits.dir. It should be left empty in a non-HA cluster

 — 

dfs.internal.nameservices

A unique nameservices identifier for a cluster or federation. For a single cluster, specify the name that will be used as an alias. For HDFS federation, specify, separated by commas, all namespaces associated with this cluster. This option allows you to use an alias instead of an IP address or FQDN for some commands, for example: hdfs dfs -ls hdfs://<dfs.internal.nameservices>. The value must be alphanumeric without underscores

 — 

dfs.block.access.token.enable

If set to true, access tokens are used as capabilities for accessing DataNodes. If set to false, no access tokens are checked on accessing DataNodes

false

dfs.namenode.kerberos.principal

The NameNode service principal. This is typically set to nn/_HOST@REALM.TLD. Each NameNode will substitute _HOST with its own fully qualified hostname during the startup. The _HOST placeholder allows using the same configuration setting on both NameNodes in an HA setup

nn/_HOST@REALM

dfs.namenode.keytab.file

The keytab file used by each NameNode daemon to login as its service principal. The principal name is configured with dfs.namenode.kerberos.principal

/etc/security/keytabs/nn.service.keytab

dfs.namenode.kerberos.internal.spnego.principal

HTTP Kerberos principal name for the NameNode

HTTP/_HOST@REALM

dfs.web.authentication.kerberos.principal

Kerberos principal name for the WebHDFS

HTTP/_HOST@REALM

dfs.web.authentication.kerberos.keytab

Kerberos keytab file for WebHDFS

/etc/security/keytabs/HTTP.service.keytab

dfs.journalnode.kerberos.principal

The JournalNode service principal. This is typically set to jn/_HOST@REALM.TLD. Each JournalNode will substitute _HOST with its own fully qualified hostname at startup. The _HOST placeholder allows using the same configuration setting on all JournalNodes

jn/_HOST@REALM

dfs.journalnode.keytab.file

The keytab file used by each JournalNode daemon to login as its service principal. The principal name is configured with dfs.journalnode.kerberos.principal

/etc/security/keytabs/jn.service.keytab

dfs.journalnode.kerberos.internal.spnego.principal

The server principal used by the JournalNode HTTP Server for SPNEGO authentication when Kerberos security is enabled. This is typically set to HTTP/_HOST@REALM.TLD. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is *, the web server will attempt to login with every principal specified in the keytab file dfs.web.authentication.kerberos.keytab. For most deployments this can be set to ${dfs.web.authentication.kerberos.principal} that is use the value of dfs.web.authentication.kerberos.principal

HTTP/_HOST@REALM

dfs.datanode.data.dir.perm

Permissions for the directories on the local filesystem where the DFS DataNode stores its blocks. The permissions can either be octal or symbolic

700

dfs.datanode.kerberos.principal

The DataNode service principal. This is typically set to dn/_HOST@REALM.TLD. Each DataNode will substitute _HOST with its own fully qualified host name at startup. The _HOST placeholder allows using the same configuration setting on all DataNodes

dn/_HOST@REALM.TLD

dfs.datanode.keytab.file

The keytab file used by each DataNode daemon to login as its service principal. The principal name is configured with dfs.datanode.kerberos.principal

/etc/security/keytabs/dn.service.keytab

dfs.http.policy

Defines if HTTPS (SSL) is supported on HDFS. This configures the HTTP endpoint for HDFS daemons. The following values are supported: HTTP_ONLY — the service is provided only via http; HTTPS_ONLY — the service is provided only via https; HTTP_AND_HTTPS — the service is provided both via http and https

HTTP_ONLY

dfs.data.transfer.protection

A comma-separated list of SASL protection values used for secured connections to the DataNode when reading or writing block data. The possible values are:

  • authentication — provides only authentication; no integrity or privacy;

  • integrity — authentication and integrity are enabled;

  • privacy — authentication, integrity and privacy are enabled.

If dfs.encrypt.data.transfer=true, then it supersedes the setting for dfs.data.transfer.protection and enforces that all connections must use a specialized encrypted SASL handshake. This property is ignored for connections to a DataNode listening on a privileged port. In this case, it is assumed that the use of a privileged port establishes sufficient trust

 — 

dfs.encrypt.data.transfer

Defines whether or not actual block data that is read/written from/to HDFS should be encrypted on the wire. This only needs to be set on the NameNodes and DataNodes, clients will deduce this automatically. It is possible to override this setting per connection by specifying custom logic via dfs.trustedchannel.resolver.class

false

dfs.encrypt.data.transfer.algorithm

This value may be set to either 3des or rc4. If nothing is set, then the configured JCE default on the system is used (usually 3DES). It is widely believed that 3DES is more secure, but RC4 is substantially faster. Note that if AES is supported by both the client and server, then this encryption algorithm will only be used to initially transfer keys for AES

3des

dfs.encrypt.data.transfer.cipher.suites

This value can be either undefined or AES/CTR/NoPadding. If defined, then dfs.encrypt.data.transfer uses the specified cipher suite for data encryption. If not defined, then only the algorithm specified in dfs.encrypt.data.transfer.algorithm is used

 — 

dfs.encrypt.data.transfer.cipher.key.bitlength

The key bitlength negotiated by dfsclient and datanode for encryption. This value may be set to either 128, 192, or 256

128

ignore.secure.ports.for.testing

Allows skipping HTTPS requirements in the SASL mode

false

dfs.client.https.need-auth

Whether SSL client certificate authentication is required

false
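
As an example of how the transport security parameters above work together, a minimal hdfs-site.xml fragment that enables encrypted data transfer and HTTPS-only web endpoints might look as follows (a sketch with example values, not ADH defaults):

  <!-- Encrypted block data transfer and HTTPS-only endpoints: example values -->
  <property>
    <name>dfs.http.policy</name>
    <value>HTTPS_ONLY</value>
  </property>
  <property>
    <name>dfs.encrypt.data.transfer</name>
    <value>true</value>
  </property>
  <property>
    <name>dfs.data.transfer.protection</name>
    <value>privacy</value>
  </property>
  <property>
    <name>dfs.encrypt.data.transfer.cipher.suites</name>
    <value>AES/CTR/NoPadding</value>
  </property>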

httpfs-site.xml
Parameter Description Default value

httpfs.http.administrators

The ACL for the admins. This configuration is used to control who can access the default servlets for HttpFS server. The value should be a comma-separated list of users and groups. The user list comes first and is separated by a space, followed by the group list, for example: user1,user2 group1,group2. Both users and groups are optional, so you can define only users, or groups, or both of them. Notice that in all these cases you should always use the leading space in the groups list. Using the asterisk grants access to all users and groups

*

hadoop.http.temp.dir

The HttpFS temp directory

${hadoop.tmp.dir}/httpfs

httpfs.ssl.enabled

Defines whether SSL is enabled for HttpFS

false

httpfs.hadoop.config.dir

The location of the Hadoop configuration directory

/etc/hadoop/conf

httpfs.hadoop.authentication.type

Defines the authentication mechanism used by httpfs for its HTTP clients. Valid values are simple and kerberos. If simple is used, clients must specify the username with the user.name query string parameter. If kerberos is used, HTTP clients must use HTTP SPNEGO or delegation tokens

simple

httpfs.hadoop.authentication.kerberos.keytab

The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by httpfs in the HTTP endpoint. httpfs.authentication.kerberos.keytab is deprecated. Instead, use hadoop.http.authentication.kerberos.keytab

/etc/security/keytabs/httpfs.service.keytab

httpfs.hadoop.authentication.kerberos.principal

The HTTP Kerberos principal used by HttpFS in the HTTP endpoint. The HTTP Kerberos principal MUST start with HTTP/ as per Kerberos HTTP SPNEGO specification. httpfs.authentication.kerberos.principal is deprecated. Instead, use hadoop.http.authentication.kerberos.principal

HTTP/${httpfs.hostname}@${kerberos.realm}
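
To illustrate the httpfs.http.administrators ACL format described above, the following values are all valid (note the leading space when only groups are listed):

user1,user2 group1,group2
user1,user2
 group1,group2
*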

ranger-hdfs-audit.xml
Parameter Description Default value

xasecure.audit.destination.solr.batch.filespool.dir

The spool directory path

/srv/ranger/hdfs_plugin/audit_solr_spool

xasecure.audit.destination.solr.urls

A URL of the Solr server to store audit events. Leave this property value empty or set it to NONE when using ZooKeeper to connect to Solr

 — 

xasecure.audit.destination.solr.zookeepers

Specifies the ZooKeeper connection string for the Solr destination

 — 

xasecure.audit.destination.solr.force.use.inmemory.jaas.config

Uses in-memory JAAS configuration file to connect to Solr

 — 

xasecure.audit.is.enabled

Enables Ranger audit

true

xasecure.audit.jaas.Client.loginModuleControlFlag

Specifies whether the success of the module is required, requisite, sufficient, or optional

 — 

xasecure.audit.jaas.Client.loginModuleName

The name of the authenticator class

 — 

xasecure.audit.jaas.Client.option.keyTab

The name of the keytab file to get the principal’s secret key

 — 

xasecure.audit.jaas.Client.option.principal

The name of the principal to be used

 — 

xasecure.audit.jaas.Client.option.serviceName

The name of a user or a service that wants to log in

 — 

xasecure.audit.jaas.Client.option.storeKey

Set this to true if you want the keytab or the principal’s key to be stored in the subject’s private credentials

false

xasecure.audit.jaas.Client.option.useKeyTab

Set this to true if you want the module to get the principal’s key from the keytab

false

ranger-hdfs-security.xml
Parameter Description Default value

ranger.plugin.hdfs.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.hdfs.service.name

The name of the Ranger service containing policies for this instance

 — 

ranger.plugin.hdfs.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/hdfs/policycache

ranger.plugin.hdfs.policy.pollIntervalMs

Defines how often to poll for changes in policies

30000

ranger.plugin.hdfs.policy.rest.client.connection.timeoutMs

The HDFS Plugin RangerRestClient connection timeout (in milliseconds)

120000

ranger.plugin.hdfs.policy.rest.client.read.timeoutMs

The HDFS Plugin RangerRestClient read timeout (in milliseconds)

30000

ranger.plugin.hdfs.policy.rest.ssl.config.file

The path to the RangerRestClient SSL config file for the HDFS plugin

/etc/hadoop/conf/ranger-hdfs-policymgr-ssl.xml

httpfs-env.sh
Parameter Description Default value

HADOOP_CONF_DIR

Hadoop configuration directory

/etc/hadoop/conf

HADOOP_LOG_DIR

Location of the log directory

${HTTPFS_LOG}

HADOOP_PID_DIR

PID file directory location

${HTTPFS_TEMP}

HTTPFS_SSL_ENABLED

Defines if SSL is enabled for httpfs

false

HTTPFS_SSL_KEYSTORE_FILE

The path to the keystore file

admin

HTTPFS_SSL_KEYSTORE_PASS

The password to access the keystore

admin

Hadoop options
Parameter Description Default value

HDFS_NAMENODE_OPTS

NameNode Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the NameNode

-Xms1G -Xmx8G

HDFS_DATANODE_OPTS

DataNode Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the DataNode

-Xms700m -Xmx8G

HDFS_HTTPFS_OPTS

HttpFS Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the httpfs server

-Xms700m -Xmx8G

HDFS_JOURNALNODE_OPTS

JournalNode Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the JournalNode

-Xms700m -Xmx8G

HDFS_ZKFC_OPTS

ZKFC Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for ZKFC

-Xms500m -Xmx8G
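
These names match the standard Hadoop environment variables that are typically set in hadoop-env.sh. A heap setting of this kind looks as follows (the sizes are placeholders and should match the actual workload):

export HDFS_NAMENODE_OPTS="-Xms2G -Xmx8G"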

ssl-server.xml
Parameter Description Default value

ssl.server.truststore.location

The truststore to be used by NameNodes and DataNodes

 — 

ssl.server.truststore.password

The password to the truststore

 — 

ssl.server.truststore.type

The truststore file format

jks

ssl.server.truststore.reload.interval

The truststore reload check interval (in milliseconds)

10000

ssl.server.keystore.location

The path to the keystore file used by NameNodes and DataNodes

 — 

ssl.server.keystore.password

The password to the keystore

 — 

ssl.server.keystore.keypassword

The password to the key in the keystore

 — 

ssl.server.keystore.type

The keystore file format

 — 
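
A minimal illustrative ssl-server.xml fragment combining the keystore and truststore parameters above (paths and passwords are placeholders):

<property>
  <name>ssl.server.keystore.location</name>
  <value>/etc/ssl/server-keystore.jks</value>
</property>
<property>
  <name>ssl.server.keystore.password</name>
  <value>keystore-password</value>
</property>
<property>
  <name>ssl.server.truststore.location</name>
  <value>/etc/ssl/truststore.jks</value>
</property>
<property>
  <name>ssl.server.truststore.password</name>
  <value>truststore-password</value>
</property>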

ssl-client.xml
Parameter Description Default value

ssl.client.truststore.location

The truststore to be used by NameNodes and DataNodes

 — 

ssl.client.truststore.password

The password to the truststore

 — 

ssl.client.truststore.type

The truststore file format

jks

ssl.client.truststore.reload.interval

The truststore reload check interval (in milliseconds)

10000

ssl.client.keystore.location

The path to the keystore file used by NameNodes and DataNodes

 — 

ssl.client.keystore.password

The password to the keystore

 — 

ssl.client.keystore.keypassword

The password to the key in the keystore

 — 

ssl.client.keystore.type

The keystore file format

 — 

Lists of decommissioned and in maintenance hosts
Parameter Description Default value

DECOMMISSIONED

When an administrator decommissions a DataNode, it is first transitioned into the DECOMMISSION_INPROGRESS state. After all blocks belonging to that DataNode are fully replicated elsewhere based on each block's replication factor, the DataNode is transitioned to the DECOMMISSIONED state. After that, the administrator can shut down the node to perform long-term repair and maintenance that could take days or weeks. Once the machine has been repaired, it can be recommissioned back to the cluster

 — 

IN_MAINTENANCE

Sometimes administrators only need to take DataNodes down for minutes or hours to perform short-term repair or maintenance. In such scenarios, the HDFS block replication overhead incurred by decommissioning might not be necessary, and a lightweight process is desirable. This is what the maintenance state is for. When an administrator puts a DataNode into the maintenance state, it is first transitioned to the ENTERING_MAINTENANCE state. As soon as all blocks belonging to that DataNode are minimally replicated elsewhere, it is transitioned to the IN_MAINTENANCE state. After the maintenance has completed, the administrator can take the DataNode out of the maintenance state. In addition, the maintenance state supports a timeout that allows administrators to configure the maximum duration a DataNode is allowed to stay in this state. After the timeout, the DataNode is transitioned out of the maintenance state automatically by HDFS without human intervention

 — 
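
For background, Apache Hadoop can express these administrative states in the JSON hosts file used by its CombinedHostFileManager; the fragment below only illustrates the two states (host names are placeholders, and the way ADCM stores these lists may differ):

[
  { "hostName": "dn1.example.com", "adminState": "DECOMMISSIONED" },
  { "hostName": "dn2.example.com", "adminState": "IN_MAINTENANCE", "maintenanceExpireTimeInMS": 1735689600000 }
]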

Other
Parameter Description Default value

Additional nameservices

Additional (internal) names for an HDFS cluster that allow querying another HDFS cluster from the current one

 — 

Custom core-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file core-site.xml

 — 

Custom hdfs-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hdfs-site.xml

 — 

Custom httpfs-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file httpfs-site.xml

 — 

Ranger plugin enabled

Whether or not Ranger plugin is enabled

 — 

Custom ranger-hdfs-audit.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hdfs-audit.xml

 — 

Custom ranger-hdfs-security.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hdfs-security.xml

 — 

Custom ranger-hdfs-policymgr-ssl.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hdfs-policymgr-ssl.xml

 — 

Custom httpfs-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file httpfs-env.sh

 — 

Custom ssl-server.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ssl-server.xml

 — 

Custom ssl-client.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ssl-client.xml

 — 

Topology script

The topology script used in HDFS

 — 

Topology data

An optional text file that maps host names to rack numbers for the topology script. Stored at /etc/hadoop/conf/topology.data (see the example at the end of this table)

 — 

Custom log4j.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file log4j.properties

Custom httpfs-log4j.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file httpfs-log4j.properties
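
An illustrative topology.data file of the kind referenced by the Topology data parameter maps a host name or IP address to a rack path, one entry per line (host names and rack paths below are placeholders; the exact format expected depends on the topology script in use):

dn1.example.com /dc1/rack1
dn2.example.com /dc1/rack2
10.0.0.15       /dc1/rack2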

Hive

hive-env.sh
Parameter Description Default value

HADOOP_CLASSPATH

A list of files/directories to be added to the classpath. To add more items to the classpath, click the plus icon

  • /etc/tez/conf/

  • /usr/lib/tez/*

  • /usr/lib/tez/lib/*

HIVE_HOME

The Hive home directory

/usr/lib/hive

METASTORE_PORT

The Hive Metastore port

9083

HADOOP_CLIENT_OPTS

Hadoop client options. For example, JVM startup parameters

$HADOOP_CLIENT_OPTS -Djava.io.tmpdir={{ cluster.config.java_tmpdir | d('/tmp') }}

hive-server2-env.sh
Parameter Description Default value

HADOOP_CLIENT_OPTS

Hadoop client options for HiveServer2

-Xms256m -Xmx256m

HIVE_AUX_JARS_PATH

Allows including custom JAR files in Hive’s classpath. A list of files/directories to be added to the classpath. To add more items to the classpath, click the plus icon

 — 

hive-metastore-env.sh
Parameter Description Default value

HADOOP_CLIENT_OPTS

Hadoop client options for Hive Metastore

-Xms256m -Xmx256m

Credential Encryption
Parameter Description Default value

Encryption enable

Enables or disables the credential encryption feature. When enabled, Hive stores configuration passwords and credentials required for interacting with other services in encrypted form

false

Credential provider path

The path to a keystore file with secrets

jceks://file/etc/hive/conf/hive.jceks

Ranger plugin credential provider path

The path to a Ranger keystore file with secrets

jceks://file/etc/hive/conf/ranger-hive.jceks

Custom jceks

Set to true to use a custom JCEKS file. Set to false to use the default auto-generated JCEKS file

false

Password file name

The name of the file in the service’s classpath that stores passwords

hive_credstore_pass
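
When the custom JCEKS option is enabled, a credential store at the configured provider path can be prepared with the standard Hadoop credential CLI. A sketch, assuming the provider path from the table above and using the metastore password property name as the alias:

hadoop credential create javax.jdo.option.ConnectionPassword -provider jceks://file/etc/hive/conf/hive.jceks
hadoop credential list -provider jceks://file/etc/hive/conf/hive.jceks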

hive-site.xml
Parameter Description Default value

hive.cbo.enable

When set to true, enables the cost-based optimizer that uses the Calcite framework

true

hive.compute.query.using.stats

When set to true, Hive will answer a few queries like min, max, and count(1) purely using statistics stored in the Metastore. For basic statistics collection, set the configuration property hive.stats.autogather to true. For more advanced statistics collection, run the ANALYZE TABLE queries

false

hive.execution.engine

Selects the execution engine. Supported values are: mr (MapReduce), tez (Tez execution, for Hadoop 2 only), or spark (Spark execution, for Hive 1.1.0 onward)

Tez

hive.log.explain.output

When enabled, logs the EXPLAIN EXTENDED output for the query at the log4j INFO level and in the HiveServer2 web UI (Drilldown → Query Plan). Starting with Hive 3.1.0, this configuration property only logs at the log4j INFO level. To show the EXPLAIN EXTENDED output in the web UI (Drilldown → Query Plan) in Hive 3.1.0 and later, use hive.server2.webui.explain.output

true

hive.metastore.event.db.notification.api.auth

Defines whether the Metastore should perform the authorization against database notification related APIs such as get_next_notification. If set to true, then only the superusers in proxy settings have the permission

false

hive.metastore.uris

The Metastore URI used to access metadata in a remote metastore setup. For a remote metastore, you should specify the Thrift metastore server URI: thrift://<hostname>:<port>, where <hostname> is the name or IP address of the Thrift metastore server and <port> is the port on which the Thrift server listens

 — 

hive.metastore.warehouse.dir

The absolute HDFS path of the default database for the warehouse, which is local to the cluster

/apps/hive/warehouse

hive.server2.enable.doAs

Impersonate the connected user

false

hive.stats.fetch.column.stats

Annotation of the operator tree with statistics information requires column statistics. Column statistics are fetched from the Metastore. Fetching column statistics for each needed column can be expensive when the number of columns is high. This flag can be used to disable the fetching of column statistics from the Metastore

false

hive.tez.container.size

By default, Tez will spawn containers of the size of a mapper. This parameter can be used to overwrite the default value

1024

hive.support.concurrency

Defines whether Hive should support concurrency or not. A ZooKeeper instance must be up and running for the default Hive Lock Manager to support read/write locks

false

hive.txn.manager

Set this to org.apache.hadoop.hive.ql.lockmgr.DbTxnManager as part of turning on Hive transactions. The default DummyTxnManager replicates pre-Hive-0.13 behavior and provides no transactions (see the example fragment at the end of this table)

org.apache.hadoop.hive.ql.lockmgr.DummyTxnManager

hive.users.in.admin.role

A comma-separated list of users to be assigned the administrator role when the Metastore starts

 — 

javax.jdo.option.ConnectionUserName

The metastore database user name

APP

javax.jdo.option.ConnectionPassword

The password for the metastore user name

 — 

javax.jdo.option.ConnectionURL

The JDBC connection URI used to access the data stored in the local Metastore setup. Use the following connection URI: jdbc:<datastore type>://<node name>:<port>/<database name> where:

  • <datastore type> is the type of the data store;

  • <node name> is the host name or IP address of the data store;

  • <port> is the port on which the data store listens for remote procedure calls (RPC);

  • <database name> is the name of the database.

For example, the following URI specifies a local metastore that uses MySQL as a data store: jdbc:mysql://hostname23:3306/metastore

jdbc:postgresql://{{ groups['adpg.adpg'][0] | d(omit) }}:5432/hive

javax.jdo.option.ConnectionDriverName

The JDBC driver class name used to access Hive Metastore

org.postgresql.Driver

hive.server2.transport.mode

Sets the transport mode

binary

hive.server2.thrift.port

The port number used for the binary connection with Thrift Server2

10000

hive.server2.thrift.http.port

The port number used for the HTTP connection with Thrift Server2

10001

hive.server2.thrift.http.path

The HTTP endpoint of the Thrift Server2 service

cliservice

hive.metastore.transactional.event.listeners

The listener class that stores events in a database

org.apache.hive.hcatalog.listener.DbNotificationListener

hive.metastore.dml.events

Indicates whether Hive should track DML events

true

hive.server2.authentication.kerberos.principal

HiveServer2 Kerberos principal

 — 

hive.server2.authentication.kerberos.keytab

The path to the Kerberos keytab file containing the HiveServer2 principal

 — 

hive.server2.authentication.spnego.principal

The SPNEGO Kerberos principal

 — 

hive.server2.webui.spnego.principal

The SPNEGO Kerberos principal to access Web UI

 — 

hive.server2.webui.spnego.keytab

The SPNEGO Kerberos keytab file to access Web UI

 — 

hive.server2.webui.use.spnego

Defines whether to use Kerberos SPNEGO for Web UI access

false

hive.server2.authentication.spnego.keytab

The path to the keytab file containing the SPNEGO principal

 — 

hive.server2.authentication

Sets the authentication mode

NONE

hive.metastore.sasl.enabled

If true, the Metastore Thrift interface will be secured with SASL. Clients must authenticate with Kerberos

false

hive.metastore.kerberos.principal

The service principal for the metastore Thrift server. The _HOST token will be automatically replaced with the appropriate host name

 — 

hive.metastore.kerberos.keytab.file

The path to the Kerberos keytab file containing the metastore Thrift server’s service principal

 — 

hive.server2.use.SSL

Defines whether to use SSL for HiveServer2

false

hive.server2.keystore.path

The keystore to be used by HiveServer2

 — 

hive.server2.keystore.password

The password to the HiveServer2 keystore

 — 

hive.server2.truststore.path

The truststore to be used by HiveServer2

 — 

hive.server2.truststore.password

The password to the HiveServer2 truststore

 — 

hive.server2.webui.use.ssl

Defines whether to use SSL for the Hive web UI

false

hive.server2.webui.keystore.path

The path to the keystore file used to access the Hive web UI

 — 

hive.server2.webui.keystore.password

The password to the keystore file used to access the Hive web UI

 — 

hive.ssl.protocol.blacklist

A comma-separated list of TLS versions that cannot be used by Hive

SSLv2Hello,SSLv3,TLSv1,TLSv1.1

metastore.keystore.path

The path to the Hive Metastore keystore

 — 

metastore.keystore.password

The password to the Hive Metastore keystore

 — 

metastore.truststore.path

The path to the Hive Metastore truststore

 — 

metastore.use.SSL

Defines whether to use SSL for interaction with Hive Metastore

false

metastore.ssl.protocol.blacklist

A comma-separated list of TLS versions that cannot be used for communication with Hive Metastore

SSLv2Hello,SSLv2,SSLv3,TLSv1,TLSv1.1

iceberg.engine.hive.enabled

Enables Iceberg tables support

true

hive.security.authorization.sqlstd.confwhitelist.append

A regex to append configuration properties to the white list in addition to hive.security.authorization.sqlstd.confwhitelist

kyuubi\.operation\.handle|kyuubi\.client\.version|kyuubi\.client\.ipAddress|tez\.application\.tags

hive.server2.support.dynamic.service.discovery

Defines whether to support dynamic service discovery via ZooKeeper

true

hive.zookeeper.quorum

A comma-separated list of ZooKeeper servers (<host>:<port>) running in the cluster

 — 

hive.server2.zookeeper.namespace

Specifies the root namespace on ZooKeeper

hiveserver2

hive.cluster.delegation.token.store.class

The name of the class that implements the delegation token store system

org.apache.hadoop.hive.metastore.security.ZooKeeperTokenStore
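
As referenced at the hive.txn.manager row above, turning on Hive transactions involves several of these properties at once. A minimal illustrative hive-site.xml fragment (not a complete ACID setup):

<property>
  <name>hive.support.concurrency</name>
  <value>true</value>
</property>
<property>
  <name>hive.txn.manager</name>
  <value>org.apache.hadoop.hive.ql.lockmgr.DbTxnManager</value>
</property>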

Custom log4j.properties
Parameter Description Default value

HiveServer2 hive-log4j.properties

The Log4j configuration used for logging HiveServer2’s activity

hive-log4j.properties

Hive Metastore hive-log4j2.properties

The Log4j2 configuration used for logging Hive Metastore’s activity

hive-log4j2.properties

Hive Beeline beeline-log4j2.properties

The Log4j2 configuration used for logging Hive Beeline’s activity

beeline-log4j2.properties

ranger-hive-audit.xml
Parameter Description Default value

xasecure.audit.destination.solr.batch.filespool.dir

The spool directory path

/srv/ranger/hdfs_plugin/audit_solr_spool

xasecure.audit.destination.solr.urls

A URL of the Solr server to store audit events. Leave this property value empty or set it to NONE when using ZooKeeper to connect to Solr

 — 

xasecure.audit.destination.solr.zookeepers

Specifies the ZooKeeper connection string for the Solr destination

 — 

xasecure.audit.destination.solr.force.use.inmemory.jaas.config

Uses in-memory JAAS configuration file to connect to Solr

 — 

xasecure.audit.is.enabled

Enables Ranger audit

true

xasecure.audit.jaas.Client.loginModuleControlFlag

Specifies whether the success of the module is required, requisite, sufficient, or optional

 — 

xasecure.audit.jaas.Client.loginModuleName

The name of the authenticator class

 — 

xasecure.audit.jaas.Client.option.keyTab

The name of the keytab file to get the principal’s secret key

 — 

xasecure.audit.jaas.Client.option.principal

The name of the principal to be used

 — 

xasecure.audit.jaas.Client.option.serviceName

The name of a user or a service that wants to log in

 — 

xasecure.audit.jaas.Client.option.storeKey

Set this to true if you want the keytab or the principal’s key to be stored in the subject’s private credentials

false

xasecure.audit.jaas.Client.option.useKeyTab

Set this to true if you want the module to get the principal’s key from the keytab

false

ranger-hive-security.xml
Parameter Description Default value

ranger.plugin.hive.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.hive.service.name

The name of the Ranger service containing policies for this instance

 — 

ranger.plugin.hive.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/hive/policycache

ranger.plugin.hive.policy.pollIntervalMs

Defines how often to poll for changes in policies

30000

ranger.plugin.hive.policy.rest.client.connection.timeoutMs

The Hive Plugin RangerRestClient connection timeout (in milliseconds)

120000

ranger.plugin.hive.policy.rest.client.read.timeoutMs

The Hive Plugin RangerRestClient read timeout (in milliseconds)

30000

xasecure.hive.update.xapolicies.on.grant.revoke

Controls Hive Ranger policy update from SQL Grant/Revoke commands

true

ranger.plugin.hive.policy.rest.ssl.config.file

The path to the RangerRestClient SSL config file for the Hive plugin

/etc/hive/conf/ranger-hive-policymgr-ssl.xml

ranger-hive-policymgr-ssl.xml
Parameter Description Default value

xasecure.policymgr.clientssl.keystore

The path to the keystore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.credential.file

The path to the keystore credentials file

/etc/hive/conf/ranger-hive.jceks

xasecure.policymgr.clientssl.truststore.credential.file

The path to the truststore credentials file

/etc/hive/conf/ranger-hive.jceks

xasecure.policymgr.clientssl.truststore

The path to the truststore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.password

The password to the keystore file

 — 

xasecure.policymgr.clientssl.truststore.password

The password to the truststore file

 — 

tez-site.xml
Parameter Description Default value

tez.am.resource.memory.mb

The amount of memory, in MB, that YARN allocates to the Tez Application Master. The size increases with the size of the DAG

1024

tez.history.logging.service.class

Enables Tez to use the Timeline Server for History Logging

org.apache.tez.dag.history.logging.ats.ATSHistoryLoggingService

tez.lib.uris

HDFS paths containing the Tez JAR files

${fs.defaultFS}/apps/tez/tez-0.10.3.tar.gz

tez.task.resource.memory.mb

The amount of memory used by launched tasks in Tez containers. Usually, this value is set in the DAG

1024

tez.tez-ui.history-url.base

The URL where the Tez UI is hosted

 — 

tez.use.cluster.hadoop-libs

Specifies whether Tez uses the cluster Hadoop libraries

true

nginx.conf
Parameter Description Default value

ssl_certificate

The path to the SSL certificate for Nginx

/etc/ssl/certs/host_cert.cert

ssl_certificate_key

The path to the SSL certificate key for Nginx

/etc/ssl/host_cert.key

ssl_protocols

A list of allowed TLS protocols to set up SSL connection

TLSv1.2

nginx_http_port

Nginx HTTP port

8089

nginx_https_port

Nginx HTTPS port

9999
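
An illustrative nginx server block showing how the parameters above are typically wired together (the server name is a placeholder):

server {
    listen              9999 ssl;
    server_name         hive.example.com;
    ssl_certificate     /etc/ssl/certs/host_cert.cert;
    ssl_certificate_key /etc/ssl/host_cert.key;
    ssl_protocols       TLSv1.2;
}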

Other
Parameter Description Default value

ACID Transactions

Defines whether to enable ACID transactions

false

Database type

The type of the external database used for Hive Metastore

postgres

Custom hive-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hive-site.xml

 — 

Custom hive-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hive-env.sh

 — 

Ranger plugin enabled

Whether or not Ranger plugin is enabled

false

Custom ranger-hive-audit.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hive-audit.xml

 — 

Custom ranger-hive-security.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hive-security.xml

 — 

Custom ranger-hive-policymgr-ssl.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hive-policymgr-ssl.xml

 — 

Custom tez-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file tez-site.xml

 — 

beeline-hs2-connection.xml

An XML template to generate property-value pairs from the hive_beeline_connection_conf object

beeline-hs2-connection.xml

HUE

The HUE Server component
hue.ini syntax

The hue.ini configuration file displayed in ADCM uses a syntax different from the original one. In the original file, the nesting level is determined by the number of square brackets around the section names. Example:

[notebook]
show_notebooks=true
[[interpreters]]
[[[mysql]]]
name = MySQL
interface=sqlalchemy
options='{"url": "mysql://root:secret@database:3306/hue"}'
[[[hive]]]
name=Hive
interface=hiveserver2

In ADCM, the nesting level is determined by separating the section names with periods. The structure from the above example looks as follows:

notebook.show_notebooks: true
notebook.interpreters.mysql.name: MySQL
notebook.interpreters.mysql.interface: sqlalchemy
notebook.interpreters.mysql.options: '{"url": "mysql://root:secret@database:3306/hue"}'
notebook.interpreters.hive.name: Hive
notebook.interpreters.hive.interface: hiveserver2
hue.ini
Parameter Description Default value

desktop.http_host

HUE Server listening IP address

0.0.0.0

desktop.http_port

HUE Server listening port

8000

desktop.use_cherrypy_server

Defines whether CherryPy (true) or Gunicorn (false) is used as the webserver

false

desktop.gunicorn_work_class

Gunicorn work class: gevent, eventlet, gthread, or sync

gthread

desktop.secret_key

Random string used for secure hashing in the session store

jFE93j;2[290-eiw.KEiwN2s3['d;/.q[eIW^y#e=+Iei*@Mn<qW5o

desktop.enable_xff_for_hive_impala

Defines whether the X-Forwarded-For header is used if Hive or Impala require it

false

desktop.enable_x_csrf_token_for_hive_impala

Defines whether the X-CSRF-Token header is used if Hive or Impala require it

false

desktop.app_blacklist

A comma-separated list of apps not to load at server startup

security,pig,sqoop,oozie,hbase,search

desktop.auth.backend

Comma-separated list of authentication backend combinations in order of priority

desktop.auth.backend.AllowFirstUserDjangoBackend

desktop.database.host

Host name or IP address of the HUE Server database

{% raw -%}{{ groups['adpg.adpg'][0] | d(omit) }}{% endraw -%}

desktop.database.port

HUE Server database network port

5432

desktop.database.engine

Engine used by the HUE Server database

postgresql_psycopg2

desktop.database.user

Admin username for the HUE Server database

hue

desktop.database.name

HUE Server database name

hue

desktop.database.password

Password for the desktop.database.user username

 — 

Interpreter Impala
Parameter Description Default value

notebook.interpreters.impala.name

Impala interpreter name

impala

notebook.interpreters.impala.interface

Interface for the Impala interpreter

hiveserver2

impala.server_host

Host of the Impala Server (one of the Impala Daemon components)

 — 

impala.server_port

Port of the Impala Server

21050

impala.impersonation_enabled

Enables the impersonation mechanism during interaction with Impala

true

impala.impala_conf_dir

Path to the Impala configuration directory that contains the impalad_flags file

/etc/hue/conf

impala.ssl.cacerts

Path to the CA certificates

/etc/pki/tls/certs/ca-bundle.crt

impala.ssl.validate

Defines whether HUE should validate certificates received from the server

false

impala.ssl.enabled

Enables SSL communication for this server

false

impala.impala_principal

Kerberos principal name for Impala

 — 

Interpreter HDFS
Parameter Description Default value

hadoop.hdfs_clusters.default.webhdfs_url

WebHDFS or HttpFS endpoint link for accessing HDFS data

 — 

hadoop.hdfs_clusters.default.hadoop_conf_dir

Path to the directory of the Hadoop configuration files

/etc/hadoop/conf

hadoop.hdfs_clusters.default.security_enabled

Defines whether the Hadoop cluster is secured by Kerberos

false

hadoop.hdfs_clusters.default.ssl_cert_ca_verify

Defines whether to verify SSL certificates against the CA

false

Interpreter Hive
Parameter Description Default value

notebook.interpreters.hive.name

Hive interpreter name

hive

notebook.interpreters.hive.interface

Interface for the Hive interpreter

hiveserver2

beeswax.hive_discovery_hs2

Defines whether to use service discovery for HiveServer2

true

beeswax.hive_conf_dir

Path to the Hive configuration directory containing the hive-site.xml file

/etc/hive/conf

beeswax.use_sasl

Defines whether to use the SASL framework to establish connection to host

true

beeswax.hive_discovery_hiveserver2_znode

The ZooKeeper znode of HiveServer2 if Hive uses the ZooKeeper service discovery mode

hive.server2.zookeeper.namespace

libzookeeper.ensemble

List of ZooKeeper ensemble members hosts and ports

host1:2181,host2:2181,host3:2181

libzookeeper.principal_name

Kerberos principal name for ZooKeeper

 — 
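
Putting the Hive interpreter parameters above into the ADCM dotted syntax described earlier, a ZooKeeper-based discovery setup might look like this (host names are placeholders):

beeswax.hive_discovery_hs2: true
beeswax.hive_discovery_hiveserver2_znode: hiveserver2
beeswax.hive_conf_dir: /etc/hive/conf
beeswax.use_sasl: true
libzookeeper.ensemble: zk1.example.com:2181,zk2.example.com:2181,zk3.example.com:2181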

Interpreter YARN
Parameter Description Default value

hadoop.yarn_clusters.default.resourcemanager_host

Network address of the host where the Resource Manager is running

 — 

hadoop.yarn_clusters.default.resourcemanager_port

Port listened by the Resource Manager IPC

 — 

hadoop.yarn_clusters.default.submit_to

Defines whether the jobs are submitted to this cluster

true

hadoop.yarn_clusters.default.logical_name

Resource Manager logical name (required for High Availability mode)

 — 

hadoop.yarn_clusters.default.security_enabled

Defines whether the YARN cluster is secured by Kerberos

false

hadoop.yarn_clusters.default.ssl_cert_ca_verify

Defines whether to verify the SSL certificates from YARN Rest APIs against the CA when using the secure mode (HTTPS)

false

hadoop.yarn_clusters.default.resourcemanager_api_url

URL of the Resource Manager API

 — 

hadoop.yarn_clusters.default.proxy_api_url

URL of the first Resource Manager API

 — 

hadoop.yarn_clusters.default.history_server_api_url

URL of the History Server API

 — 

hadoop.yarn_clusters.default.spark_history_server_url

URL of the Spark History Server

 — 

hadoop.yarn_clusters.default.spark_history_server_security_enabled

Defines whether the Spark History Server is secured by Kerberos

false

hadoop.yarn_clusters.ha.resourcemanager_host

Network address of the host where the Resource Manager is running (High Availability mode)

 — 

hadoop.yarn_clusters.ha.resourcemanager_port

Port listened by the Resource Manager IPC (High Availability mode)

 — 

hadoop.yarn_clusters.ha.logical_name

Resource Manager logical name (required for High Availability mode)

 — 

hadoop.yarn_clusters.ha.security_enabled

Defines whether the YARN cluster is secured by Kerberos (High Availability mode)

false

hadoop.yarn_clusters.ha.submit_to

Defines whether the jobs are submitted to this cluster (High Availability mode)

true

hadoop.yarn_clusters.ha.ssl_cert_ca_verify

Defines whether to verify the SSL certificates from YARN Rest APIs against the CA when using the secure mode (HTTPS) (High Availability mode)

false

hadoop.yarn_clusters.ha.resourcemanager_api_url

URL of the Resource Manager API (High Availability mode)

 — 

hadoop.yarn_clusters.ha.history_server_api_url

URL of the History Server API (High Availability mode)

 — 

hadoop.yarn_clusters.ha.spark_history_server_url

URL of the Spark History Server (High Availability mode)

 — 

hadoop.yarn_clusters.ha.spark_history_server_security_enabled

Defines whether the Spark History Server is secured by Kerberos (High Availability mode)

false

Interpreter Spark3
Parameter Description Default value

notebook.interpreters.sparksql.name

Spark3 interpreter name

Spark3 SQL

notebook.interpreters.hive.interface

Interface for the Hive interpreter

hiveserver2

spark.sql_server_host

Hostname of the SQL server

 — 

spark.sql_server_port

Port of the SQL server

 — 

spark.security_enabled

Defines whether the Spark3 cluster is secured by Kerberos

false

spark.ssl_cert_ca_verify

Defines whether to verify SSL certificates against the CA

false

spark.use_sasl

Defines whether to use the SASL framework to establish connection to host

true

spark.spark_impersonation_enabled

Enables the impersonation mechanism during interaction with Spark3

true

spark.spark_principal

Kerberos principal name for Spark3

 — 

Interpreter Kyuubi
Parameter Description Default value

notebook.dbproxy_extra_classpath

Classpath to be appended to the default DBProxy server classpath

/usr/share/java/kyuubi-hive-jdbc.jar

notebook.interpreters.kyuubi.name

Kyuubi interpreter name

Kyuubi[Spark3]

notebook.interpreters.kyuubi.options

Special parameters for connection to the Kyuubi server

 — 

notebook.interpreters.kyuubi.interface

Interface for the Kyuubi service

jdbc

hue.ini kerberos config
Parameter Description Default value

desktop.kerberos.hue_keytab

Path to HUE Kerberos keytab file

 — 

desktop.kerberos.hue_principal

Kerberos principal name for HUE

 — 

desktop.kerberos.kinit_path

Path to kinit utility

/usr/bin/kinit

desktop.kerberos.reinit_frequency

Time interval in seconds for HUE to renew its keytab

3600

desktop.kerberos.ccache_path

Path to cached Kerberos credentials

/tmp/hue_krb5_ccache

desktop.kerberos.krb5_renewlifetime_enabled

This must be set to false if the renew_lifetime parameter in the krb5.conf file is set to 0m

false

desktop.auth.auth

Authentication type

 — 

Authentication on WEB UIs
Parameter Description Default value

desktop.kerberos.kerberos_auth

Defines whether to use Kerberos authentication for HTTP clients based on the current ticket

false

desktop.kerberos.spnego_principal

Default Kerberos principal name for the HTTP client

 — 

hue.ini SSL config
Parameter Description Default value

desktop.ssl_certificate

Path to the SSL certificate file

/etc/ssl/certs/host_cert.cert

desktop.ssl_private_key

Path to the SSL RSA private key file

/etc/ssl/host_cert.key

desktop.ssl_password

SSL certificate password

 — 

desktop.ssl_no_renegotiation

Disables all renegotiation in TLSv1.2 and earlier

true

desktop.ssl_validate

Defines whether HUE should validate certificates received from the server

false

desktop.ssl_cacerts

Path to the CA certificates file

/etc/pki/tls/certs/ca-bundle.crt

desktop.session.secure

Defines whether the cookie containing the user’s session ID and csrf cookie will use the secure flag

true

desktop.session.http_only

Defines whether the cookie containing the user’s session ID and csrf cookie will use the HTTP only flag

false

LDAP security
Parameter Description Default value

desktop.ldap.ldap_url

URL of the LDAP server

 — 

desktop.ldap.base_dn

The search base for finding users and groups

"DC=mycompany,DC=com"

desktop.ldap.nt_domain

The NT domain used for LDAP authentication

mycompany.com

desktop.ldap.ldap_cert

Certificate files in PEM format for the CA that HUE will trust for authentication over TLS

 — 

desktop.ldap.use_start_tls

Set this to true if you are not using Secure LDAP (LDAPS) but want to establish secure connections using TLS

true

desktop.ldap.bind_dn

Distinguished name of the user to bind as

"CN=ServiceAccount,DC=mycompany,DC=com"

desktop.ldap.bind_password

Password of the bind user

 — 

desktop.ldap.ldap_username_pattern

Pattern for username search. Specify the <username> placeholder for this parameter

"uid=<username>,ou=People,dc=mycompany,dc=com"

desktop.ldap.create_users_on_login

Defines whether to create users in HUE when they try to login with their LDAP credentials

true

desktop.ldap.sync_groups_on_login

Defines whether to synchronize users groups when they login

true

desktop.ldap.login_groups

A comma-separated list of LDAP groups containing users that are allowed to login

 — 

desktop.ldap.ignore_username_case

Defines whether to ignore the case of usernames when searching for existing users

true

desktop.ldap.force_username_lowercase

Defines whether to force lowercase usernames when creating new users from LDAP

true

desktop.ldap.force_username_uppercase

Defines whether to force uppercase usernames when creating new users from LDAP. This parameter cannot be combined with desktop.ldap.force_username_lowercase

false

desktop.ldap.search_bind_authentication

Enables search bind authentication

true

desktop.ldap.subgroups

Specifies the kind of subgrouping to use: nested or subordinate (deprecated)

nested

desktop.ldap.nested_members_search_depth

The number of levels to search for nested members

10

desktop.ldap.follow_referrals

Defines whether to follow referrals

false

desktop.ldap.users.user_filter

Base filter for users search

"objectclass=*"

desktop.ldap.users.user_name_attr

The username attribute in the LDAP schema

sAMAccountName

desktop.ldap.groups.group_filter

Base filter for groups search

"objectclass=*"

desktop.ldap.groups.group_name_attr

The group name attribute in the LDAP schema

cn

desktop.ldap.groups.group_member_attr

The attribute of the group object that identifies the group members

member
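
As a combined illustration of the desktop.ldap parameters above, in the ADCM dotted syntax (the server URL and all distinguished names are placeholders):

desktop.ldap.ldap_url: ldaps://ldap.example.com
desktop.ldap.base_dn: "DC=example,DC=com"
desktop.ldap.bind_dn: "CN=hue-service,OU=Services,DC=example,DC=com"
desktop.ldap.create_users_on_login: true
desktop.ldap.users.user_filter: "objectclass=person"
desktop.ldap.users.user_name_attr: sAMAccountName
desktop.ldap.groups.group_name_attr: cn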

Others
Parameter Description Default value

Enable custom ulimits

Switch on the corresponding toggle button to specify resource limits (ulimits) for the current process. If you do not set these values, the default system settings are used. Ulimit settings are described in the table below

[Manager]
DefaultLimitCPU=
DefaultLimitFSIZE=
DefaultLimitDATA=
DefaultLimitSTACK=
DefaultLimitCORE=
DefaultLimitRSS=
DefaultLimitNOFILE=
DefaultLimitAS=
DefaultLimitNPROC=
DefaultLimitMEMLOCK=
DefaultLimitLOCKS=
DefaultLimitSIGPENDING=
DefaultLimitMSGQUEUE=
DefaultLimitNICE=
DefaultLimitRTPRIO=
DefaultLimitRTTIME=

Custom hue.ini

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hue.ini. List of available parameters can be found in the HUE documentation

 — 

Ulimit settings
Parameter Description Corresponding option of the ulimit command in CentOS

DefaultLimitCPU

A limit in seconds on the amount of CPU time that a process can consume

cpu time ( -t)

DefaultLimitFSIZE

The maximum size of files that a process can create, in 512-byte blocks

file size ( -f)

DefaultLimitDATA

The maximum size of a process’s data segment, in kilobytes

data seg size ( -d)

DefaultLimitSTACK

The maximum stack size allocated to a process, in kilobytes

stack size ( -s)

DefaultLimitCORE

The maximum size of a core dump file allowed for a process, in 512-byte blocks

core file size ( -c)

DefaultLimitRSS

The maximum resident set size, in kilobytes

max memory size ( -m)

DefaultLimitNOFILE

The maximum number of open file descriptors allowed for the process

open files ( -n)

DefaultLimitAS

The maximum size of the process virtual memory (address space), in kilobytes

virtual memory ( -v)

DefaultLimitNPROC

The maximum number of processes

max user processes ( -u)

DefaultLimitMEMLOCK

The maximum memory size that can be locked for the process, in kilobytes. Memory locking ensures the memory is always in RAM and a swap file is not used

max locked memory ( -l)

DefaultLimitLOCKS

The maximum number of files locked by a process

file locks ( -x)

DefaultLimitSIGPENDING

The maximum number of signals that are pending for delivery to the calling thread

pending signals ( -i)

DefaultLimitMSGQUEUE

The maximum number of bytes in POSIX message queues. POSIX message queues allow processes to exchange data in the form of messages

POSIX message queues ( -q)

DefaultLimitNICE

The maximum NICE priority level that can be assigned to a process

scheduling priority ( -e)

DefaultLimitRTPRIO

The maximum real-time scheduling priority level

real-time priority ( -r)

DefaultLimitRTTIME

The maximum amount of CPU time, in microseconds, that a process scheduled under a real-time policy can consume without making a blocking system call

 — 

Impala

Parameter Description Default value

impala-env.sh

The contents of the impala-env.sh file that contains Impala environment settings

Custom impala-env.sh

The contents of the custom impala-env.sh file that contains custom Impala environment settings

Credential encryption
Parameter Description Default value

Encryption enable

Defines whether the credentials are encrypted

false

Credential provider path

Path to the credential provider for creating the .jceks files containing secret keys

jceks://hdfs/apps/impala/security/impala.jceks

Ranger plugin credential provider path

Path to the Ranger plugin credential provider

jceks://file/etc/impala/conf/ranger-impala.jceks

Custom jceks

Defines whether custom .jceks files located at the credential provider path are used (true) or auto-generated ones (false)

false

Password file name

Name of the password file in the classpath of the service if the password file is selected in the credstore options

impala_credstore_pass

ranger-hive-audit.xml
Parameter Description Default value

xasecure.audit.destination.solr.batch.filespool.dir

The spool directory path

/srv/ranger/hdfs_plugin/audit_solr_spool

xasecure.audit.destination.solr.urls

A URL of the Solr server to store audit events. Leave this property value empty or set it to NONE when using ZooKeeper to connect to Solr

 — 

xasecure.audit.destination.solr.zookeepers

Specifies the ZooKeeper connection string for the Solr destination

 — 

xasecure.audit.destination.solr.force.use.inmemory.jaas.config

Uses in-memory JAAS configuration file to connect to Solr

 — 

xasecure.audit.is.enabled

Enables Ranger audit

true

xasecure.audit.jaas.Client.loginModuleControlFlag

Specifies whether the success of the module is required, requisite, sufficient, or optional

 — 

xasecure.audit.jaas.Client.loginModuleName

The name of the authenticator class

 — 

xasecure.audit.jaas.Client.option.keyTab

The name of the keytab file to get the principal’s secret key

 — 

xasecure.audit.jaas.Client.option.principal

The name of the principal to be used

 — 

xasecure.audit.jaas.Client.option.serviceName

The name of a user or a service that wants to log in

 — 

xasecure.audit.jaas.Client.option.storeKey

Set this to true if you want the keytab or the principal’s key to be stored in the subject’s private credentials

false

xasecure.audit.jaas.Client.option.useKeyTab

Set this to true if you want the module to get the principal’s key from the keytab

false

ranger-hive-security.xml
Parameter Description Default value

ranger.plugin.hive.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.hive.service.name

Name of the Ranger service containing policies for this Impala instance

 — 

ranger.plugin.hive.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/impala/policycache

ranger.plugin.hive.policy.pollIntervalMs

How often to poll for changes in policies in milliseconds

30000

ranger.plugin.hive.policy.rest.client.connection.timeoutMs

Impala plugin connection timeout in milliseconds

120000

ranger.plugin.hive.policy.rest.client.read.timeoutMs

Impala plugin read timeout in milliseconds

30000

xasecure.hive.update.xapolicies.on.grant.revoke

Specifies whether the Impala plugin should update the Ranger policies on the updates to permissions done using GRANT/REVOKE

true

ranger.plugin.hive.policy.rest.ssl.config.file

The path to the RangerRestClient SSL config file for HBase plugin

/etc/hbase/conf/ranger-hbase-policymgr-ssl.xml

ranger-hive-policymgr-ssl.xml
Parameter Description Default value

xasecure.policymgr.clientssl.keystore

The path to the keystore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.credential.file

The path to the keystore credentials file

/etc/impala/conf/ranger-impala.jceks

xasecure.policymgr.clientssl.truststore.credential.file

The path to the truststore credentials file

/etc/impala/conf/ranger-impala.jceks

xasecure.policymgr.clientssl.truststore

The path to the truststore file used by Ranger

 — 

xasecure.policymgr.clientssl.keystore.password

The password to the keystore file

 — 

xasecure.policymgr.clientssl.truststore.password

The password to the truststore file

 — 

Enable LDAP
Parameter Description Default value

ldap_uri

URI of the LDAP server. Typically, the URI is prefixed with ldap:// or ldaps:// for SSL-based LDAP transport. The URI can optionally specify the port, for example: ldap://ldap_server.example.com:389

 — 

ldap_domain

Replaces the username with a string <username>@ldap_domain, where <username> is the name of the user trying to authenticate. Mutually exclusive with ldap_baseDN and ldap_bind_pattern

 — 

ldap_bind_dn

Distinguished name of the user to bind to for user/group searches. Required only if the user or group filters are being used and the LDAP server is not configured to allow anonymous searches

 — 

ldap_bind_password

Password of the user to bind to for user/group searches. Required only if the anonymous bind is not activated

 — 

ldap_bind_password_cmd

A Unix command whose output is the password to use with the --ldap_bind_dn option. The output of the command is truncated to 1024 bytes and trimmed of trailing whitespace

cat /etc/impala/conf/pass.pwd

ldap_user_search_basedn

The base DN for the LDAP subtree to search users

 — 

ldap_group_search_basedn

The base DN for the LDAP subtree to search groups

 — 

ldap_baseDN

Search base. Replaces the username with a DN of the form: uid=<userid>,ldap_baseDN, where <userid> is the username of the user trying to authenticate. Mutually exclusive with ldap_domain and ldap_bind_pattern

 — 

ldap_user_filter

A filter for both simple and search bind mechanisms. For a simple bind, it is a comma-separated list of user names. If specified, users must be on this list for authentication to succeed. For a search bind, it is an LDAP filter that will be used during an LDAP search, it can contain the {0} pattern which will be replaced with the user name

 — 

ldap_group_filter

Comma-separated list of groups. If specified, users must belong to one of these groups for authentication to succeed

 — 

ldap_allow_anonymous_binds

When true, LDAP authentication with a blank password (an anonymous bind) is allowed by Impala

false

ldap_search_bind_authentication

Allows switching between the search and simple bind user lookup methods when authenticating

true

ldap_ca_certificate

Specifies the location of the certificate in standard PEM format for SSL. Store this certificate on the local filesystem, in a location that only the impala user and other trusted users can read

 — 

ldap_passwords_in_clear_ok

Enables the web server to start with LDAP authentication even if SSL is not enabled. If set to true, the auth_creds_ok_in_clear parameter in the impalarc file is also set to true. This is a potentially insecure configuration

false

ldap_bind_pattern

A string in which the #UID instance is replaced with the user id. For example, if this parameter is set to user=#UID,OU=foo,CN=bar and the user henry tries to authenticate, the constructed bind name will be user=henry,OU=foo,CN=bar. Mutually exclusive with ldap_domain and ldap_baseDN

 — 

allow_custom_ldap_filters_with_kerberos