HDFS configuration parameters

To configure the service, use the following configuration parameters in ADCM.

NOTE
  • Some of the parameters become visible in the ADCM UI only after the Advanced flag is set.

  • The parameters that are set in the Custom group overwrite the existing parameters, even if those parameters are read-only.

Credential Encryption
Parameter Description Default value

Encryption enable

Enables or disables the credential encryption feature. When enabled, HDFS stores configuration passwords and credentials required for interacting with other services in encrypted form

false

Credential provider path

Path to a keystore file with secrets

jceks://file/etc/hadoop/conf/hadoop.jceks
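
As an illustration (a minimal sketch; the alias name is hypothetical), secrets can be added to and listed from this keystore with the hadoop credential CLI:

  # Add a password alias to the keystore configured above, then list stored aliases
  hadoop credential create ssl.server.keystore.password \
    -provider jceks://file/etc/hadoop/conf/hadoop.jceks
  hadoop credential list -provider jceks://file/etc/hadoop/conf/hadoop.jceks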

Ranger plugin credential provider path

Path to a Ranger keystore file with secrets

jceks://file/etc/hadoop/conf/ranger-hdfs.jceks

Custom jceks

Set to true to use a custom JCEKS file. Set to false to use the default auto-generated JCEKS file

false

Password file name

Name of the file in the service’s classpath that stores passwords

hadoop_credstore_pass

Enable CORS
Parameter Description Default value

hadoop.http.cross-origin.enabled

Enables cross-origin support for all web services

true

hadoop.http.cross-origin.allowed-origins

Comma-separated list of allowed origins. Values prefixed with regex: are interpreted as regular expressions. Values containing wildcards (*) are also possible; in this case a regular expression is generated, but this usage is discouraged and is supported only for backward compatibility

*

hadoop.http.cross-origin.allowed-headers

Comma-separated list of allowed headers

X-Requested-With,Content-Type,Accept,Origin,WWW-Authenticate,Accept-Encoding,Transfer-Encoding

hadoop.http.cross-origin.allowed-methods

Comma-separated list of methods that are allowed

GET,PUT,POST,OPTIONS,HEAD,DELETE

hadoop.http.cross-origin.max-age

Number of seconds a pre-flighted request can be cached

1800

core_site.enable_cors.active

Enables CORS (Cross-Origin Resource Sharing)

true
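
A quick way to verify the effective CORS behavior is to send a request with an Origin header to an HDFS web endpoint and check for the Access-Control-Allow-Origin response header. A minimal sketch (the host name is hypothetical; 9870 is the NameNode web UI port from dfs.namenode.http-address below):

  curl -si -H "Origin: http://example.com" \
    "http://namenode-host.example.com:9870/jmx?qry=Hadoop:*" | head -n 20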

hdfs-site.xml
Parameter Description Default value

dfs.client.block.write.replace-datanode-on-failure.enable

If there is a DataNode/network failure in the write pipeline, DFSClient tries to remove the failed DataNode from the pipeline and then continues writing with the remaining DataNodes. As a result, the number of DataNodes in the pipeline decreases. This feature adds new DataNodes to the pipeline; this is a site-wide property to enable/disable it. When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature. Otherwise, users may experience an unusually high rate of pipeline failures, since it is impossible to find new DataNodes for replacement. See also dfs.client.block.write.replace-datanode-on-failure.policy

true

dfs.client.block.write.replace-datanode-on-failure.policy

This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. Possible values:

  • ALWAYS. Always adds a new DataNode when an existing DataNode is removed.

  • NEVER. Never adds a new DataNode.

  • DEFAULT. Let r be the replication number and n be the number of existing DataNodes. Add a new DataNode only if r is greater than or equal to 3 and either:

    1. floor(r/2) is greater than or equal to n, or

    2. r is greater than n and the block is hflushed/appended.

    For example, if r = 3 and the pipeline has shrunk to n = 1 DataNode, floor(3/2) = 1 >= 1, so a new DataNode is added.

DEFAULT

dfs.client.block.write.replace-datanode-on-failure.best-effort

This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. Best effort means that the client tries to replace a failed DataNode in the write pipeline (provided that the policy is satisfied) but continues the write operation if the DataNode replacement also fails. Suppose the DataNode replacement fails: if this property is false, an exception is thrown and the write fails; if it is true, the write is resumed with the remaining DataNodes. Note that setting this property to true allows writing to a pipeline with a smaller number of DataNodes and, as a result, increases the probability of data loss

false

dfs.client.block.write.replace-datanode-on-failure.min-replication

The minimum number of replications needed to avoid failing the write pipeline if new DataNodes cannot be found to replace failed DataNodes in the write pipeline (for example, due to a network failure). If the number of remaining DataNodes in the write pipeline is greater than or equal to this value, writing continues to the remaining nodes; otherwise, an exception is thrown. If this is set to 0, an exception is thrown whenever a replacement cannot be found. See also dfs.client.block.write.replace-datanode-on-failure.policy

0

dfs.balancer.dispatcherThreads

The size of the thread pool for the HDFS balancer block mover — dispatchExecutor

200

dfs.balancer.movedWinWidth

Time window in milliseconds during which the HDFS balancer tracks blocks and their locations

5400000

dfs.balancer.moverThreads

The thread pool size for executing block moves — moverThreadAllocator

1000

dfs.balancer.max-size-to-move

Maximum number of bytes that can be moved by the balancer in a single thread

10737418240

dfs.balancer.getBlocks.min-block-size

Minimum block size in bytes; blocks smaller than this are ignored when fetching a source block list

10485760

dfs.balancer.getBlocks.size

The total size in bytes of DataNode blocks to get when fetching a source block list

2147483648

dfs.balancer.block-move.timeout

Maximum amount of time for a block move (in milliseconds). If set greater than 0, the balancer will stop waiting for a block move to complete after this time. In typical clusters, a 3-5 minute timeout is reasonable. If timeouts occur for a large proportion of block moves, this value needs to be increased. It could also be that too much work is dispatched and many nodes are constantly exceeding the bandwidth limit as a result. In that case, other balancer parameters might need to be adjusted. It is disabled (0) by default

0

dfs.balancer.max-no-move-interval

If this specified amount of time has elapsed and no blocks have been moved out of a source DataNode, one more attempt will be made to move blocks out of this DataNode in the current Balancer iteration

60000

dfs.balancer.max-iteration-time

Maximum amount of time an iteration can be run by the Balancer. After this time the Balancer will stop the iteration, and re-evaluate the work needed to be done to balance the cluster. The default value is 20 minutes

1200000
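
These settings take effect when the HDFS Balancer runs. A minimal sketch of a Balancer invocation that overrides some of them for a single run via generic -D options (the threshold value is illustrative):

  hdfs balancer -threshold 10 \
    -Ddfs.balancer.moverThreads=1000 \
    -Ddfs.balancer.max-size-to-move=10737418240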

dfs.blocksize

The default block size for new files (in bytes). You can use the following suffixes to define size units (case insensitive): k (kilo), m (mega), g (giga), t (tera), p (peta), e (exa). For example, 128k, 512m, 1g, etc. You can also specify the block size in bytes (such as 134217728 for 128 MB)

134217728
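
The block size can also be overridden per command with a generic -D option when writing a file, for example (the paths are illustrative):

  # Write a file with a 256 MB block size instead of the configured default
  hdfs dfs -D dfs.blocksize=268435456 -put ./bigfile.dat /data/bigfile.dat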

dfs.client.read.shortcircuit

Turns on short-circuit local reads

true

dfs.datanode.balance.max.concurrent.moves

Maximum number of threads for DataNode balancer pending moves. This value is reconfigurable via the dfsadmin -reconfig command

50
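
Since the value is reconfigurable, it can be applied to a running DataNode without a restart, for example (the host name is hypothetical; 9867 is the DataNode IPC port from dfs.datanode.ipc.address below):

  hdfs dfsadmin -reconfig datanode dn01.example.com:9867 start
  hdfs dfsadmin -reconfig datanode dn01.example.com:9867 status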

dfs.datanode.data.dir

Determines where on the local filesystem a DFS DataNode should store its blocks. If multiple directories are specified, data will be stored in all named directories, typically on different devices. The directories should be tagged with the corresponding storage types (SSD/DISK/ARCHIVE/RAM_DISK) for HDFS storage policies. The default storage type is DISK if a directory does not have a storage type tagged explicitly. Directories that do not exist will be created if the local filesystem permissions allow it

/srv/hadoop-hdfs/data:DISK

dfs.disk.balancer.max.disk.throughputInMBperSec

Maximum disk bandwidth used by the disk balancer during reads from a source disk (in MB/sec)

10

dfs.disk.balancer.block.tolerance.percent

Specifies when a good enough value is reached for any copy step (in percent). For example, if set to 10, getting within 10% of the target value is considered good enough. In other words, for a move operation of 20 GB, the entire operation is considered successful if 18 GB (20 * (1 - 10%)) can be moved

10

dfs.disk.balancer.max.disk.errors

During a block move from a source to a destination disk, various errors may occur. This parameter defines how many errors to tolerate before declaring that a move between two disks (or a step) has failed

5

dfs.disk.balancer.plan.valid.interval

Maximum amount of time a disk balancer plan (a set of configurations that define the data volume to be redistributed between two disks) remains valid. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval. If no suffix is specified, then milliseconds are assumed

1d

dfs.disk.balancer.plan.threshold.percent

Defines a data storage threshold, in percent, at which disks start participating in data redistribution or balancing activities

10
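
The disk balancer parameters above apply when a disk balancer plan is created and executed. A minimal sketch (the DataNode host name is hypothetical):

  hdfs diskbalancer -plan dn01.example.com
  # The -plan step prints the path of the generated <hostname>.plan.json file
  hdfs diskbalancer -execute /system/diskbalancer/<timestamp>/dn01.example.com.plan.json
  hdfs diskbalancer -query dn01.example.com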

dfs.domain.socket.path

Path to a UNIX domain socket that will be used for communication between the DataNode and local HDFS clients. If the string _PORT is present in this path, it will be replaced by the TCP port of the DataNode. The parameter is optional

/var/lib/hadoop-hdfs/dn_socket

dfs.hosts

Names a file that contains a list of hosts allowed to connect to the NameNode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted

/etc/hadoop/conf/dfs.hosts

dfs.mover.movedWinWidth

Minimum time interval for a block to be moved to another location again (in milliseconds)

5400000

dfs.mover.moverThreads

Sets the balancer mover thread pool size

1000

dfs.mover.retry.max.attempts

Maximum number of retries before the mover considers the move as failed

10

dfs.mover.max-no-move-interval

If this specified amount of time has elapsed and no block has been moved out of a source DataNode, one more attempt will be made to move blocks out of this DataNode in the current mover iteration

60000
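
The mover parameters above govern the hdfs mover tool, which relocates blocks that violate their storage policy. A minimal sketch (the path and policy are illustrative):

  # Assign a storage policy and let the mover migrate the existing blocks
  hdfs storagepolicies -setStoragePolicy -path /data/archive -policy COLD
  hdfs mover -p /data/archive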

dfs.namenode.name.dir

Determines where on the local filesystem the DFS name node should store the name table (fsimage). If multiple directories are specified, then the name table is replicated in all of the directories, for redundancy

/srv/hadoop-hdfs/name

dfs.namenode.checkpoint.dir

Determines where on the local filesystem the DFS secondary name node should store the temporary images to merge. If multiple directories are specified, then the image is replicated in all of the directories for redundancy

/srv/hadoop-hdfs/checkpoint

dfs.namenode.hosts.provider.classname

The class that provides access to the host files. By default, org.apache.hadoop.hdfs.server.blockmanagement.HostFileManager is used, which loads the files specified by dfs.hosts and dfs.hosts.exclude. If org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager is used, it loads the JSON file defined by dfs.hosts. Changing the class name requires a NameNode restart; dfsadmin -refreshNodes only refreshes the configuration files used by the class

org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager
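
After editing the file referenced by dfs.hosts, the NameNode can be told to re-read it without a restart:

  hdfs dfsadmin -refreshNodes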

dfs.namenode.rpc-bind-host

The actual address the RPC server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.rpc-address. It can also be specified per NameNode or name service for HA/Federation. This is useful for making the NameNode listen on all interfaces by setting it to 0.0.0.0

0.0.0.0

dfs.permissions.superusergroup

Name of the group of super-users. The value should be a single group name

hadoop

dfs.replication

The default block replication factor. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at creation time

3
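
The replication factor of existing files can also be changed per path, for example (the path is hypothetical):

  # Set the replication factor to 2 and wait until re-replication completes
  hdfs dfs -setrep -w 2 /user/example/report.csv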

dfs.journalnode.http-address

The HTTP address of the JournalNode web UI

0.0.0.0:8480

dfs.journalnode.https-address

The HTTPS address of the JournalNode web UI

0.0.0.0:8481

dfs.journalnode.rpc-address

The RPC address of the JournalNode

0.0.0.0:8485

dfs.datanode.http.address

The address of the DataNode HTTP server

0.0.0.0:9864

dfs.datanode.https.address

The address of the DataNode HTTPS server

0.0.0.0:9865

dfs.datanode.address

The address of the DataNode for data transfer

0.0.0.0:9866

dfs.datanode.ipc.address

The IPC address of the DataNode

0.0.0.0:9867

dfs.namenode.http-address

The address and the base port to access the dfs NameNode web UI

0.0.0.0:9870

dfs.namenode.https-address

The secure HTTPS address of the NameNode

0.0.0.0:9871

dfs.ha.automatic-failover.enabled

Defines whether automatic failover is enabled

true

dfs.ha.fencing.methods

A list of scripts or Java classes that will be used to fence the Active NameNode during a failover

shell(/bin/true)

dfs.journalnode.edits.dir

The directory where journal edit files are stored

/srv/hadoop-hdfs/journalnode

dfs.namenode.shared.edits.dir

The directory on shared storage between the multiple NameNodes in an HA cluster. This directory will be written by the active and read by the standby in order to keep the namespaces synchronized. This directory does not need to be listed in dfs.namenode.edits.dir. It should be left empty in a non-HA cluster

 — 

dfs.internal.nameservices

A unique nameservice identifier for a cluster or federation. For a single cluster, specify the name that will be used as an alias. For HDFS federation, specify all nameservices associated with this cluster, separated by commas. This option allows you to use an alias instead of an IP address or FQDN in some commands, for example: hdfs dfs -ls hdfs://<dfs.internal.nameservices>. The value must be alphanumeric without underscores

 — 

dfs.block.access.token.enable

If set to true, access tokens are used as capabilities for accessing DataNodes. If set to false, no access tokens are checked on accessing DataNodes

false

dfs.namenode.kerberos.principal

The NameNode service principal. This is typically set to nn/_HOST@REALM.TLD. Each NameNode will substitute _HOST with its own fully qualified hostname during the startup. The _HOST placeholder allows using the same configuration setting on both NameNodes in an HA setup

nn/_HOST@REALM

dfs.namenode.keytab.file

The keytab file used by each NameNode daemon to login as its service principal. The principal name is configured with dfs.namenode.kerberos.principal

/etc/security/keytabs/nn.service.keytab
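
To check that a keytab contains the expected principal (with _HOST already expanded to the node's FQDN), it can be inspected with klist, for example:

  klist -kt /etc/security/keytabs/nn.service.keytab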

dfs.namenode.kerberos.internal.spnego.principal

HTTP Kerberos principal name for the NameNode

HTTP/_HOST@REALM

dfs.web.authentication.kerberos.principal

Kerberos principal name for the WebHDFS

HTTP/_HOST@REALM

dfs.web.authentication.kerberos.keytab

Kerberos keytab file for WebHDFS

/etc/security/keytabs/HTTP.service.keytab

dfs.journalnode.kerberos.principal

The JournalNode service principal. This is typically set to jn/_HOST@REALM.TLD. Each JournalNode will substitute _HOST with its own fully qualified hostname at startup. The _HOST placeholder allows using the same configuration setting on all JournalNodes

jn/_HOST@REALM

dfs.journalnode.keytab.file

The keytab file used by each JournalNode daemon to login as its service principal. The principal name is configured with dfs.journalnode.kerberos.principal

/etc/security/keytabs/jn.service.keytab

dfs.journalnode.kerberos.internal.spnego.principal

The server principal used by the JournalNode HTTP server for SPNEGO authentication when Kerberos security is enabled. This is typically set to HTTP/_HOST@REALM.TLD. The SPNEGO server principal begins with the prefix HTTP/ by convention. If the value is *, the web server will attempt to login with every principal specified in the keytab file dfs.web.authentication.kerberos.keytab. For most deployments this can be set to ${dfs.web.authentication.kerberos.principal}, that is, to use the value of dfs.web.authentication.kerberos.principal

HTTP/_HOST@REALM

dfs.datanode.data.dir.perm

Permissions for the directories on the local filesystem where the DFS DataNode stores its blocks. The permissions can either be octal or symbolic

700

dfs.datanode.kerberos.principal

The DataNode service principal. This is typically set to dn/_HOST@REALM.TLD. Each DataNode will substitute _HOST with its own fully qualified host name at startup. The _HOST placeholder allows using the same configuration setting on all DataNodes

dn/_HOST@REALM.TLD

dfs.datanode.keytab.file

The keytab file used by each DataNode daemon to login as its service principal. The principal name is configured with dfs.datanode.kerberos.principal

/etc/security/keytabs/dn.service.keytab

dfs.http.policy

Defines if HTTPS (SSL) is supported on HDFS. This configures the HTTP endpoint for HDFS daemons. The following values are supported: HTTP_ONLY — the service is provided only via http; HTTPS_ONLY — the service is provided only via https; HTTP_AND_HTTPS — the service is provided both via http and https

HTTP_ONLY

dfs.data.transfer.protection

A comma-separated list of SASL protection values used for secured connections to the DataNode when reading or writing block data. The possible values are:

  • authentication — provides only authentication; no integrity or privacy;

  • integrity — authentication and integrity are enabled;

  • privacy — authentication, integrity and privacy are enabled.

If dfs.encrypt.data.transfer=true, then it supersedes the setting for dfs.data.transfer.protection and enforces that all connections must use a specialized encrypted SASL handshake. This property is ignored for connections to a DataNode listening on a privileged port. In this case, it is assumed that the use of a privileged port establishes sufficient trust

 — 

dfs.encrypt.data.transfer

Defines whether or not actual block data that is read/written from/to HDFS should be encrypted on the wire. This only needs to be set on the NameNodes and DataNodes, clients will deduce this automatically. It is possible to override this setting per connection by specifying custom logic via dfs.trustedchannel.resolver.class

false

dfs.encrypt.data.transfer.algorithm

This value may be set to either 3des or rc4. If nothing is set, then the configured JCE default on the system is used (usually 3DES). It is widely believed that 3DES is more secure, but RC4 is substantially faster. Note that if AES is supported by both the client and server, then this encryption algorithm will only be used to initially transfer keys for AES

3des

dfs.encrypt.data.transfer.cipher.suites

This value can be either undefined or AES/CTR/NoPadding. If defined, then dfs.encrypt.data.transfer uses the specified cipher suite for data encryption. If not defined, then only the algorithm specified in dfs.encrypt.data.transfer.algorithm is used

 — 

dfs.encrypt.data.transfer.cipher.key.bitlength

The key bit length negotiated between the DFS client and the DataNode for encryption. This value may be set to 128, 192, or 256

128

ignore.secure.ports.for.testing

Allows skipping HTTPS requirements in the SASL mode

false

dfs.client.https.need-auth

Whether SSL client certificate authentication is required

false

httpfs-site.xml
Parameter Description Default value

httpfs.http.administrators

The ACL for the admins. This configuration is used to control who can access the default servlets of the HttpFS server. The value should be a comma-separated list of users and groups. The user list comes first and is separated by a space from the group list, for example: user1,user2 group1,group2. Both users and groups are optional, so you can define only users, only groups, or both. Note that a leading space is required before the group list when only groups are specified. Using the asterisk grants access to all users and groups

*

hadoop.http.temp.dir

The HttpFS temp directory

${hadoop.tmp.dir}/httpfs

httpfs.ssl.enabled

Defines whether SSL is enabled for HttpFS

false

httpfs.hadoop.config.dir

The location of the Hadoop configuration directory

/etc/hadoop/conf

httpfs.hadoop.authentication.type

Defines the authentication mechanism used by httpfs for its HTTP clients. Valid values are simple and kerberos. If simple is used, clients must specify the username with the user.name query string parameter. If kerberos is used, HTTP clients must use HTTP SPNEGO or delegation tokens

simple
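
With simple authentication, an HttpFS request only needs the user.name query string parameter. A minimal sketch (the host name is hypothetical and 14000 is assumed to be the HttpFS port):

  curl -s "http://httpfs-host.example.com:14000/webhdfs/v1/?op=LISTSTATUS&user.name=hdfs"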

httpfs.hadoop.authentication.kerberos.keytab

The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by httpfs in the HTTP endpoint. httpfs.authentication.kerberos.keytab is deprecated. Instead, use hadoop.http.authentication.kerberos.keytab

/etc/security/keytabs/httpfs.service.keytab

httpfs.hadoop.authentication.kerberos.principal

The HTTP Kerberos principal used by HttpFS in the HTTP endpoint. The HTTP Kerberos principal MUST start with HTTP/ as per Kerberos HTTP SPNEGO specification. httpfs.authentication.kerberos.principal is deprecated. Instead, use hadoop.http.authentication.kerberos.principal

HTTP/${httpfs.hostname}@${kerberos.realm}

ranger-hdfs-audit.xml
Parameter Description Default value

xasecure.audit.destination.solr.batch.filespool.dir

Spool directory path

/srv/ranger/hdfs_plugin/audit_solr_spool

xasecure.audit.destination.solr.urls

A URL of the Solr server to store audit events. Leave this property value empty or set it to NONE when using ZooKeeper to connect to Solr

 — 

xasecure.audit.destination.solr.zookeepers

Specifies the ZooKeeper connection string for the Solr destination

 — 

xasecure.audit.destination.solr.force.use.inmemory.jaas.config

Whether to use in-memory JAAS configuration file to connect to Solr

 — 

xasecure.audit.is.enabled

Enables Ranger audit

true

xasecure.audit.jaas.Client.loginModuleControlFlag

Specifies whether the success of the module is required, requisite, sufficient, or optional

 — 

xasecure.audit.jaas.Client.loginModuleName

Name of the authenticator class

 — 

xasecure.audit.jaas.Client.option.keyTab

Name of the keytab file to get the principal’s secret key

 — 

xasecure.audit.jaas.Client.option.principal

Name of the principal to be used

 — 

xasecure.audit.jaas.Client.option.serviceName

Name of a user or a service that wants to log in

 — 

xasecure.audit.jaas.Client.option.storeKey

Set this to true if you want the keytab or the principal’s key to be stored in the subject’s private credentials

false

xasecure.audit.jaas.Client.option.useKeyTab

Set this to true if you want the module to get the principal’s key from the keytab

false

ranger-hdfs-security.xml
Parameter Description Default value

ranger.plugin.hdfs.policy.rest.url

The URL to Ranger Admin

 — 

ranger.plugin.hdfs.service.name

The name of the Ranger service containing policies for this instance

 — 

ranger.plugin.hdfs.policy.cache.dir

The directory where Ranger policies are cached after successful retrieval from the source

/srv/ranger/hdfs/policycache

ranger.plugin.hdfs.policy.pollIntervalMs

Defines how often to poll for changes in policies

30000

ranger.plugin.hdfs.policy.rest.client.connection.timeoutMs

The HDFS Plugin RangerRestClient connection timeout (in milliseconds)

120000

ranger.plugin.hdfs.policy.rest.client.read.timeoutMs

The HDFS Plugin RangerRestClient read timeout (in milliseconds)

30000

ranger.plugin.hdfs.policy.rest.ssl.config.file

Path to the RangerRestClient SSL config file for the HDFS plugin

/etc/hadoop/conf/ranger-hdfs-policymgr-ssl.xml

httpfs-env.sh
Parameter Description Default value

Sources

A list of sources which will be written into httpfs-env.sh

 — 

HADOOP_CONF_DIR

Hadoop configuration directory

/etc/hadoop/conf

HADOOP_LOG_DIR

Location of the log directory

${HTTPFS_LOG}

HADOOP_PID_DIR

PID file directory location

${HTTPFS_TEMP}

HTTPFS_SSL_ENABLED

Defines if SSL is enabled for httpfs

false

HTTPFS_SSL_KEYSTORE_FILE

Path to the keystore file

admin

HTTPFS_SSL_KEYSTORE_PASS

The password to access the keystore

admin

Final HTTPFS_ENV_OPTS

Final value of the HTTPFS_ENV_OPTS parameter in httpfs-env.sh

 — 

hadoop-env.sh
Parameter Description Default value

Sources

A list of sources that will be written into hadoop-env.sh

 — 

HDFS_NAMENODE_OPTS

NameNode Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the NameNode

-Xms1G -Xmx8G
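
For example, a larger NameNode heap can be set by overriding this variable in hadoop-env.sh (the values are illustrative):

  export HDFS_NAMENODE_OPTS="-Xms4G -Xmx16G"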

HDFS_DATANODE_OPTS

DataNode Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the DataNode

-Xms700m -Xmx8G

HDFS_HTTPFS_OPTS

HttpFS Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the httpfs server

-Xms700m -Xmx8G

HDFS_JOURNALNODE_OPTS

JournalNode Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for the JournalNode

-Xms700m -Xmx8G

HDFS_ZKFC_OPTS

ZKFC Heap Memory. Sets initial (-Xms) and maximum (-Xmx) Java heap memory size and environment options for ZKFC

-Xms500m -Xmx8G

Final HADOOP_ENV_OPTS

Final value of the HADOOP_ENV_OPTS parameter in hadoop-env.sh

 — 

ssl-server.xml
Parameter Description Default value

ssl.server.truststore.location

The truststore to be used by NameNodes and DataNodes

 — 

ssl.server.truststore.password

The password to the truststore

 — 

ssl.server.truststore.type

The truststore file format

jks

ssl.server.truststore.reload.interval

The truststore reload check interval (in milliseconds)

10000

ssl.server.keystore.location

Path to the keystore file used by NameNodes and DataNodes

 — 

ssl.server.keystore.password

The password to the keystore

 — 

ssl.server.keystore.keypassword

The password to the key in the keystore

 — 

ssl.server.keystore.type

The keystore file format

 — 

Lists of decommissioned and in maintenance hosts
Parameter Description Default value

DECOMMISSIONED

When an administrator decommissions a DataNode, the DataNode will first be transitioned into the DECOMMISSION_INPROGRESS state. After all blocks belonging to that DataNode are fully replicated elsewhere based on each block's replication factor, the DataNode will be transitioned to the DECOMMISSIONED state. After that, the administrator can shut down the node to perform long-term repair and maintenance that could take days or weeks. After the machine has been repaired, it can be recommissioned back to the cluster

 — 

IN_MAINTENANCE

Sometimes administrators only need to take DataNodes down for minutes or hours to perform short-term repair or maintenance. For such scenarios, the HDFS block replication overhead incurred by decommissioning might not be necessary, and a lightweight process is desirable; this is what the maintenance state is for. When an administrator puts a DataNode into the maintenance state, the DataNode is first transitioned to the ENTERING_MAINTENANCE state. As soon as all blocks belonging to that DataNode are minimally replicated elsewhere, the DataNode is transitioned to the IN_MAINTENANCE state. After the maintenance has completed, the administrator can take the DataNode out of the maintenance state. In addition, the maintenance state supports a timeout that allows administrators to configure the maximum duration for which a DataNode is allowed to stay in the maintenance state. After the timeout, HDFS transitions the DataNode out of the maintenance state automatically, without human intervention

 — 
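
When dfs.namenode.hosts.provider.classname is set to CombinedHostFileManager (see hdfs-site.xml above), these states are declared in the JSON file referenced by dfs.hosts. A minimal sketch of such a file (host names and the timestamp are hypothetical); apply changes with hdfs dfsadmin -refreshNodes:

  [
    {"hostName": "dn01.example.com"},
    {"hostName": "dn02.example.com", "adminState": "DECOMMISSIONED"},
    {"hostName": "dn03.example.com", "adminState": "IN_MAINTENANCE", "maintenanceExpireTimeInMS": 1735689600000}
  ]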

Other
Parameter Description Default value

Additional nameservices

Additional (internal) names for an HDFS cluster that allow querying another HDFS cluster from the current one

 — 

Custom core-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file core-site.xml

 — 

Custom hdfs-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hdfs-site.xml

 — 

Custom httpfs-site.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file httpfs-site.xml

 — 

Ranger plugin enabled

Whether or not Ranger plugin is enabled

 — 

Custom ranger-hdfs-audit.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hdfs-audit.xml

 — 

Custom ranger-hdfs-security.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hdfs-security.xml

 — 

Custom ranger-hdfs-policymgr-ssl.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ranger-hdfs-policymgr-ssl.xml

 — 

Custom httpfs-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file httpfs-env.sh

 — 

Custom hadoop-env.sh

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file hadoop-env.sh

 — 

Custom ssl-server.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ssl-server.xml

 — 

Custom ssl-client.xml

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file ssl-client.xml

 — 

Topology script

The topology script used in HDFS

 — 

Topology data

An optional text file that maps host names to rack numbers for the topology script. Stored at /etc/hadoop/conf/topology.data

 — 
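
A minimal sketch of such a topology script, assuming topology.data contains host-to-rack mappings (one "host rack" pair per line) and unknown hosts fall back to /default-rack:

  #!/bin/bash
  # Print the rack for every host name or IP address passed in by the NameNode
  DATA_FILE=/etc/hadoop/conf/topology.data
  DEFAULT_RACK=/default-rack
  for node in "$@"; do
    rack=$(awk -v host="$node" '$1 == host {print $2}' "$DATA_FILE")
    echo "${rack:-$DEFAULT_RACK}"
  done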

Custom log4j.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file log4j.properties

Custom httpfs-log4j.properties

In this section you can define values for custom parameters that are not displayed in ADCM UI, but are allowed in the configuration file httpfs-log4j.properties

HDFS DataNode component
Monitoring
Parameter Description Default value

Java agent path

Path to the JMX Prometheus Java agent

/usr/lib/adh-utils/jmx/jmx_prometheus_javaagent.jar

Prometheus metrics port

Port on which to display HDFS DataNode metrics in the Prometheus format

9202
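
Once the DataNode is running with the JMX Prometheus Java agent attached, the exported metrics can typically be checked manually (the host name is hypothetical):

  curl -s http://dn01.example.com:9202/metrics | head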

Mapping config path

Path to the metrics mapping configuration file

/etc/hadoop/conf/jmx_hdfs_datanode_metric_config.yml

Mapping config

Metrics mapping configuration file

HDFS JournalNode component
Monitoring
Parameter Description Default value

Java agent path

Path to the JMX Prometheus Java agent

/usr/lib/adh-utils/jmx/jmx_prometheus_javaagent.jar

Prometheus metrics port

Port on which to display HDFS JournalNode metrics in the Prometheus format

9203

Mapping config path

Path to the metrics mapping configuration file

/etc/hadoop/conf/jmx_hdfs_journalnode_metric_config.yml

Mapping config

Metrics mapping configuration file

HDFS NameNode component
Monitoring
Parameter Description Default value

Java agent path

Path to the JMX Prometheus Java agent

/usr/lib/adh-utils/jmx/jmx_prometheus_javaagent.jar

Prometheus metrics port

Port on which to display HDFS NameNode metrics in the Prometheus format

9201

Mapping config path

Path to the metrics mapping configuration file

/etc/hadoop/conf/jmx_hdfs_namenode_metric_config.yml

Mapping config

Metrics mapping configuration file
