HDFS configuration parameters
To configure the service, use the following configuration parameters in ADCM.
| Parameter | Description | Default value |
|---|---|---|
| Encryption enable | Enables or disables the credential encryption feature. When enabled, HDFS stores configuration passwords and credentials required for interacting with other services in encrypted form | false |
| Credential provider path | Path to a keystore file with secrets (see the example after this table) | jceks://file/etc/hadoop/conf/hadoop.jceks |
| Ranger plugin credential provider path | Path to a Ranger keystore file with secrets | jceks://file/etc/hadoop/conf/ranger-hdfs.jceks |
| Custom jceks | Set to true to use custom jceks keystore files instead of the automatically generated ones | false |
| Password file name | Name of the file in the service's classpath that stores passwords | hadoop_credstore_pass |
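
If the keystore is managed manually (for example, with Custom jceks enabled), secrets can be added to it with the standard `hadoop credential` CLI. A minimal sketch, assuming the default provider path from the table above; the alias name is only an illustration:

```bash
# Add an alias to the credential store used by HDFS; the provider URI
# matches the default "Credential provider path" above. The command
# prompts for the secret value.
hadoop credential create ssl.server.keystore.password \
    -provider jceks://file/etc/hadoop/conf/hadoop.jceks

# List the aliases currently present in the keystore
hadoop credential list -provider jceks://file/etc/hadoop/conf/hadoop.jceks
```
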
| Parameter | Description | Default value |
|---|---|---|
| hadoop.http.cross-origin.enabled | Enables cross-origin support for all web services | true |
| hadoop.http.cross-origin.allowed-origins | Comma-separated list of allowed origins. Values prefixed with regex: are interpreted as regular expressions | * |
| hadoop.http.cross-origin.allowed-headers | Comma-separated list of allowed headers | X-Requested-With,Content-Type,Accept,Origin,WWW-Authenticate,Accept-Encoding,Transfer-Encoding |
| hadoop.http.cross-origin.allowed-methods | Comma-separated list of allowed methods | GET,PUT,POST,OPTIONS,HEAD,DELETE |
| hadoop.http.cross-origin.max-age | Number of seconds a pre-flighted request can be cached | 1800 |
| core_site.enable_cors.active | Enables CORS (Cross-Origin Resource Sharing) | true |
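
A quick way to verify the CORS settings is to send a pre-flight request to any Hadoop web endpoint and inspect the Access-Control-Allow-* response headers. A sketch, assuming a NameNode web UI reachable at namenode-host:9870:

```bash
# Pre-flight (OPTIONS) request: the origin and method must be allowed
# by hadoop.http.cross-origin.allowed-origins/-methods
curl -i -X OPTIONS \
    -H "Origin: http://example.com" \
    -H "Access-Control-Request-Method: GET" \
    "http://namenode-host:9870/jmx"
```
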
| Parameter | Description | Default value |
|---|---|---|
| dfs.client.block.write.replace-datanode-on-failure.enable | If there is a DataNode/network failure in the write pipeline, DFSClient tries to remove the failed DataNode from the pipeline and then continues writing with the remaining DataNodes. As a result, the number of DataNodes in the pipeline is decreased. This feature adds new DataNodes to the pipeline. This is a site-wide property to enable/disable the feature. When the cluster size is extremely small, e.g. 3 nodes or less, cluster administrators may want to set the policy to NEVER in the default configuration file or disable this feature | false |
| dfs.client.block.write.replace-datanode-on-failure.policy | This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. Possible values are ALWAYS, NEVER, and DEFAULT | DEFAULT |
| dfs.client.block.write.replace-datanode-on-failure.best-effort | This property is used only if the value of dfs.client.block.write.replace-datanode-on-failure.enable is true. Best effort means that the client tries to replace a failed DataNode in the write pipeline but continues the write operation if the replacement also fails | false |
| dfs.client.block.write.replace-datanode-on-failure.min-replication | Minimum number of replications needed not to fail the write pipeline if new DataNodes cannot be found to replace failed DataNodes (could be due to network failure) in the write pipeline. If the number of the remaining DataNodes in the write pipeline is greater than or equal to this property value, writing continues to the remaining nodes; otherwise, an exception is thrown. If this is set to 0, an exception is thrown when a replacement cannot be found | 0 |
| dfs.balancer.dispatcherThreads | The size of the thread pool for the HDFS balancer block mover (dispatchExecutor) | 200 |
| dfs.balancer.movedWinWidth | Time window in milliseconds for the HDFS balancer to track blocks and their locations | 5400000 |
| dfs.balancer.moverThreads | The thread pool size for executing block moves (moverThreadAllocator) | 1000 |
| dfs.balancer.max-size-to-move | Maximum number of bytes that can be moved by the balancer in a single thread | 10737418240 |
| dfs.balancer.getBlocks.min-block-size | Minimum block size in bytes below which blocks are ignored when fetching a source block list | 10485760 |
| dfs.balancer.getBlocks.size | The total size in bytes of DataNode blocks to get when fetching a source block list | 2147483648 |
| dfs.balancer.block-move.timeout | Maximum amount of time for a block to move (in milliseconds). If set greater than 0, the balancer stops waiting for a block move to complete after this time | 0 |
| dfs.balancer.max-no-move-interval | If this specified amount of time has elapsed and no blocks have been moved out of a source DataNode, one more attempt is made to move blocks out of this DataNode in the current balancer iteration | 60000 |
| dfs.balancer.max-iteration-time | Maximum amount of time an iteration can be run by the balancer. After this time, the balancer stops the iteration and re-evaluates the work needed to balance the cluster. The default value is 20 minutes | 1200000 |
| dfs.blocksize | The default block size for new files (in bytes). You can use the following suffixes to define size units (case insensitive): k, m, g, t, p, e (for example, 128k, 512m, 1g) | 134217728 |
| dfs.client.read.shortcircuit | Turns on short-circuit local reads | true |
| dfs.datanode.balance.max.concurrent.moves | Maximum number of threads for DataNode balancer pending moves. This value is reconfigurable at runtime via the dfsadmin -reconfig command (see the example after this table) | 50 |
| dfs.datanode.data.dir | Determines where on the local filesystem a DFS DataNode should store its blocks. If multiple directories are specified, data is stored in all named directories, typically on different devices. The directories should be tagged with the corresponding storage types ([SSD]/[DISK]/[ARCHIVE]/[RAM_DISK]) for HDFS storage policies; the default storage type is DISK if a directory has no storage type tagged explicitly | /srv/hadoop-hdfs/data:DISK |
| dfs.disk.balancer.max.disk.throughputInMBperSec | Maximum disk bandwidth used by the disk balancer during reads from a source disk. The unit is MB/sec | 10 |
| dfs.disk.balancer.block.tolerance.percent | Specifies, in percent, how close to the planned value a copy step must get to be considered good enough. For example, if set to 10, moving 90% of the data planned for a step is sufficient | 10 |
| dfs.disk.balancer.max.disk.errors | During a block move from a source to destination disk, various errors can occur. This parameter defines how many errors to tolerate before declaring a move between 2 disks (or a step) failed | 5 |
| dfs.disk.balancer.plan.valid.interval | Maximum amount of time a disk balancer plan (a set of configurations that define the data volume to be redistributed between two disks) remains valid. This setting supports multiple time unit suffixes as described in dfs.heartbeat.interval; if no suffix is specified, milliseconds are assumed | 1d |
| dfs.disk.balancer.plan.threshold.percent | Defines a data storage threshold in percent at which disks start participating in data redistribution or balancing activities | 10 |
| dfs.domain.socket.path | Path to a UNIX domain socket that is used for communication between the DataNode and local HDFS clients. If the string _PORT is present in this path, it is replaced by the TCP port of the DataNode | /var/lib/hadoop-hdfs/dn_socket |
| dfs.hosts | Names a file that contains a list of hosts allowed to connect to the NameNode. The full pathname of the file must be specified. If the value is empty, all hosts are permitted | /etc/hadoop/conf/dfs.hosts |
| dfs.mover.movedWinWidth | Minimum time interval for a block to be moved to another location again (in milliseconds) | 5400000 |
| dfs.mover.moverThreads | Sets the mover thread pool size | 1000 |
| dfs.mover.retry.max.attempts | Maximum number of retries before the mover considers the move as failed | 10 |
| dfs.mover.max-no-move-interval | If this specified amount of time has elapsed and no block has been moved out of a source DataNode, one more attempt is made to move blocks out of this DataNode in the current mover iteration | 60000 |
| dfs.namenode.name.dir | Determines where on the local filesystem the DFS NameNode should store the name table (fsimage). If multiple directories are specified, the name table is replicated in all of the directories for redundancy | /srv/hadoop-hdfs/name |
| dfs.namenode.checkpoint.dir | Determines where on the local filesystem the DFS Secondary NameNode should store the temporary images to merge. If multiple directories are specified, the image is replicated in all of the directories for redundancy | /srv/hadoop-hdfs/checkpoint |
| dfs.namenode.hosts.provider.classname | The class that provides access to host files. org.apache.hadoop.hdfs.server.blockmanagement.HostFileManager loads the files specified by dfs.hosts and dfs.hosts.exclude; org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager loads a JSON file defined by dfs.hosts | org.apache.hadoop.hdfs.server.blockmanagement.CombinedHostFileManager |
| dfs.namenode.rpc-bind-host | The actual address the RPC server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.namenode.rpc-address | 0.0.0.0 |
| dfs.permissions.superusergroup | Name of the group of super-users. The value should be a single group name | hadoop |
| dfs.replication | The default block replication. The actual number of replications can be specified when the file is created. The default is used if replication is not specified at create time | 3 |
| dfs.journalnode.http-address | The HTTP address of the JournalNode web UI | 0.0.0.0:8480 |
| dfs.journalnode.https-address | The HTTPS address of the JournalNode web UI | 0.0.0.0:8481 |
| dfs.journalnode.rpc-address | The RPC address of the JournalNode | 0.0.0.0:8485 |
| dfs.datanode.http.address | The address of the DataNode HTTP server | 0.0.0.0:9864 |
| dfs.datanode.https.address | The address of the DataNode HTTPS server | 0.0.0.0:9865 |
| dfs.datanode.address | The address of the DataNode for data transfer | 0.0.0.0:9866 |
| dfs.datanode.ipc.address | The IPC address of the DataNode | 0.0.0.0:9867 |
| dfs.namenode.http-address | The address and the base port to access the DFS NameNode web UI | 0.0.0.0:9870 |
| dfs.namenode.https-address | The secure HTTPS address of the NameNode | 0.0.0.0:9871 |
| dfs.ha.automatic-failover.enabled | Defines whether automatic failover is enabled | true |
| dfs.ha.fencing.methods | A list of scripts or Java classes that will be used to fence the active NameNode during a failover | shell(/bin/true) |
| dfs.journalnode.edits.dir | The directory where journal edit files are stored | /srv/hadoop-hdfs/journalnode |
| dfs.namenode.shared.edits.dir | The directory on shared storage between the multiple NameNodes in an HA cluster. This directory is written by the active NameNode and read by the standby NameNode to keep the namespaces synchronized. This directory does not need to be listed in dfs.namenode.edits.dir | — |
| dfs.internal.nameservices | A unique nameservice identifier for a cluster or federation. For a single cluster, specify the name that will be used as an alias. For HDFS federation, specify all namespaces associated with this cluster, separated by commas. This option allows you to use an alias instead of an IP address or FQDN in some commands | — |
| dfs.block.access.token.enable | If set to true, access tokens are used as capabilities for accessing DataNodes. If set to false, no access tokens are checked on accessing DataNodes | false |
| dfs.namenode.kerberos.principal | The NameNode service principal. This is typically set to nn/_HOST@REALM.TLD. Each NameNode replaces _HOST with its own fully qualified hostname at startup. The _HOST placeholder allows using the same configuration setting on both NameNodes in an HA setup | nn/_HOST@REALM |
| dfs.namenode.keytab.file | The keytab file used by each NameNode daemon to login as its service principal. The principal name is configured with dfs.namenode.kerberos.principal | /etc/security/keytabs/nn.service.keytab |
| dfs.namenode.kerberos.internal.spnego.principal | HTTP Kerberos principal name for the NameNode | HTTP/_HOST@REALM |
| dfs.web.authentication.kerberos.principal | Kerberos principal name for WebHDFS | HTTP/_HOST@REALM |
| dfs.web.authentication.kerberos.keytab | Kerberos keytab file for WebHDFS | /etc/security/keytabs/HTTP.service.keytab |
| dfs.journalnode.kerberos.principal | The JournalNode service principal. This is typically set to jn/_HOST@REALM.TLD. Each JournalNode replaces _HOST with its own fully qualified hostname at startup | jn/_HOST@REALM |
| dfs.journalnode.keytab.file | The keytab file used by each JournalNode daemon to login as its service principal. The principal name is configured with dfs.journalnode.kerberos.principal | /etc/security/keytabs/jn.service.keytab |
| dfs.journalnode.kerberos.internal.spnego.principal | The server principal used by the JournalNode HTTP server for SPNEGO authentication when Kerberos security is enabled. This is typically set to HTTP/_HOST@REALM.TLD | HTTP/_HOST@REALM |
| dfs.datanode.data.dir.perm | Permissions for the directories on the local filesystem where the DFS DataNode stores its blocks. The permissions can either be octal or symbolic | 700 |
| dfs.datanode.kerberos.principal | The DataNode service principal. This is typically set to dn/_HOST@REALM.TLD. Each DataNode replaces _HOST with its own fully qualified hostname at startup | dn/_HOST@REALM.TLD |
| dfs.datanode.keytab.file | The keytab file used by each DataNode daemon to login as its service principal. The principal name is configured with dfs.datanode.kerberos.principal | /etc/security/keytabs/dn.service.keytab |
| dfs.http.policy | Defines whether HTTPS (SSL) is supported on HDFS. This configures the HTTP endpoint for HDFS daemons. The following values are supported: HTTP_ONLY (service is provided only on HTTP), HTTPS_ONLY (service is provided only on HTTPS), and HTTP_AND_HTTPS (service is provided both on HTTP and HTTPS) | HTTP_ONLY |
| dfs.data.transfer.protection | A comma-separated list of SASL protection values used for secured connections to the DataNode when reading or writing block data. The possible values are: authentication (authentication only), integrity (authentication and integrity), and privacy (authentication, integrity, and privacy). If dfs.encrypt.data.transfer is set to true, it supersedes this setting and enforces that all connections must use a specialized encrypted SASL handshake | — |
| dfs.encrypt.data.transfer | Defines whether or not actual block data that is read/written from/to HDFS should be encrypted on the wire. This only needs to be set on the NameNodes and DataNodes; clients deduce this automatically. It is possible to override this setting per connection by specifying custom logic via dfs.trustedchannel.resolver.class | false |
| dfs.encrypt.data.transfer.algorithm | This value may be set to either 3des or rc4. If nothing is set, the configured JCE default on the system is used | 3des |
| dfs.encrypt.data.transfer.cipher.suites | This value can be either undefined or AES/CTR/NoPadding. If defined, dfs.encrypt.data.transfer uses the specified cipher suite for data encryption; if undefined, only the algorithm specified in dfs.encrypt.data.transfer.algorithm is used | — |
| dfs.encrypt.data.transfer.cipher.key.bitlength | The key bitlength negotiated by dfsclient and datanode for encryption. This value may be set to 128, 192, or 256 | 128 |
| ignore.secure.ports.for.testing | Allows skipping HTTPS requirements in the SASL mode | false |
| dfs.client.https.need-auth | Whether SSL client certificate authentication is required | false |
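
The balancer settings above only bound what a balancer run may do; the run itself is started manually. A sketch of a typical invocation, plus the runtime reconfiguration mentioned for dfs.datanode.balance.max.concurrent.moves (hostnames are placeholders):

```bash
# Run the HDFS balancer until every DataNode is within 10 percentage
# points of the average cluster utilization
hdfs balancer -threshold 10

# Apply an edited dfs.datanode.balance.max.concurrent.moves value
# without restarting the DataNode (9867 is its default IPC port)
hdfs dfsadmin -reconfig datanode datanode-host:9867 start
hdfs dfsadmin -reconfig datanode datanode-host:9867 status
```
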
| Parameter | Description | Default value |
|---|---|---|
| Federation nameservice | The name of the federation nameservice | ns-fed |
| Import configuration | Auto-generated configuration of imported clusters | — |
| Federation configuration | Auto-generated federation parameters | — |
| External clusters configuration | This section allows you to manually import an ADH cluster into a federation by specifying the required connection parameters for the imported cluster | — |
| Proxy provider | Class implementing the failover proxy provider used for Router HA | org.apache.hadoop.hdfs.server.namenode.ha.ConfiguredFailoverProxyProvider |
| dfs.federation.router.rpc-address | RPC address that handles client requests to the federation | 0.0.0.0:8888 |
| dfs.federation.router.admin-address | RPC address that handles admin requests | 0.0.0.0:8111 |
| dfs.federation.router.http-address | HTTP address that handles web requests to HDFS Router (web UI, WebHDFS REST API) | 0.0.0.0:50071 |
| dfs.federation.router.https-address | HTTPS address that handles web requests to HDFS Router (web UI, WebHDFS REST API) | 0.0.0.0:50072 |
| dfs.federation.router.store.driver.zk.parent-path | Parent znode path in ZooKeeper used by StateStoreZooKeeperImpl | /hdfs-federation |
| dfs.federation.router.store.serializer | Class used to serialize/deserialize state store records | org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreSerializerPBImpl |
| dfs.federation.router.store.driver.class | Implementation of the federation state store. The default implementation uses ZooKeeper as a state store | org.apache.hadoop.hdfs.server.federation.store.driver.impl.StateStoreZooKeeperImpl |
| dfs.federation.router.file.resolver.client.class | Class responsible for resolving paths to subclusters within a federation (see the mount table example after this table) | org.apache.hadoop.hdfs.server.federation.resolver.MountTableResolver |
| dfs.federation.router.monitor.namenode | Identifiers of the NameNodes to monitor and send heartbeats to | — |
| dfs.nameservice.id | Specifies which nameservice ID the client should use by default when connecting to a federation | <current-hdfs-nameservice-id> |
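
Client paths are mapped to the federated subclusters through the Router mount table, which is managed with the dfsrouteradmin tool. A sketch, assuming the federation contains nameservices ns1 and ns2:

```bash
# Map /data to the /data directory of subcluster ns1
hdfs dfsrouteradmin -add /data ns1 /data

# A multi-destination mount point spanning two subclusters,
# choosing the destination with the most available space
hdfs dfsrouteradmin -add /logs ns1,ns2 /logs -order SPACE

# Inspect the resulting mount table
hdfs dfsrouteradmin -ls
```
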
| Parameter | Description | Default value |
|---|---|---|
| dfs.federation.router.default.nameserviceId | Nameservice ID of the default subcluster to which HDFS Router forwards requests if no specific mount point is set | — |
| dfs.federation.router.default.nameservice.enable | Enables reading and writing files to the default subcluster | true |
| dfs.federation.router.rpc.enable | Allows HDFS Router to handle RPC requests from clients | true |
| dfs.federation.router.rpc-bind-host | Address for the RPC server to bind to. If this optional address is set, it overrides only the hostname portion of dfs.federation.router.rpc-address | — |
| dfs.federation.router.handler.count | Number of threads for HDFS Router to handle RPC requests from clients | 10 |
| dfs.federation.router.handler.queue.size | Size of the queue to handle RPC client requests | 100 |
| dfs.federation.router.reader.count | Number of readers for HDFS Router to handle RPC client requests | 1 |
| dfs.federation.router.reader.queue.size | Size of the queue for readers to handle RPC client requests | 100 |
| dfs.federation.router.connection.creator.queue-size | Size of the asynchronous connection creator queue | 100 |
| dfs.federation.router.connection.pool-size | Size of the pool of connections from HDFS Router to NameNodes | 1 |
| dfs.federation.router.connection.min-active-ratio | Minimum ratio of active connections from HDFS Router to NameNodes | 0.5f |
| dfs.federation.router.connection.clean.ms | Interval in milliseconds to check if the connection pool should remove unused connections | 10000 |
| dfs.federation.router.enable.multiple.socket | Enables/disables the use of multiple sockets for accessing NameNodes | false |
| dfs.federation.router.max.concurrency.per.connection | Maximum number of requests a single connection can handle concurrently | 1 |
| dfs.federation.router.connection.pool.clean.ms | Interval in milliseconds to check if the connection manager should remove unused connection pools | 60000 |
| dfs.federation.router.metrics.enable | Enables/disables generating HDFS Router metrics | true |
| dfs.federation.router.dn-report.time-out | Timeout in milliseconds for obtaining a DataNode report | 1000 |
| dfs.federation.router.dn-report.cache-expire | Expiration time in seconds for a DataNode report | 10s |
| dfs.federation.router.enable.get.dn.usage | If set to true, DataNode usage is reported via the getNodeUsage method in the HDFS Router metrics | true |
| dfs.federation.router.metrics.class | Class to monitor the RPC system in HDFS Router | org.apache.hadoop.hdfs.server.federation.metrics.FederationRPCPerformanceMonitor |
| dfs.federation.router.admin.enable | Allows the RPC admin service in HDFS Router to handle client requests | true |
| dfs.federation.router.admin-bind-host | Address for the RPC admin server to bind to | — |
| dfs.federation.router.admin.handler.count | Number of threads for HDFS Router to handle admin RPC requests | 1 |
| dfs.federation.router.admin.mount.check.enable | If set to true, mount table entries are validated when they are added or updated | false |
| dfs.federation.router.http-bind-host | Address the HTTP server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.federation.router.http-address | — |
| dfs.federation.router.https-bind-host | Address the HTTPS server will bind to. If this optional address is set, it overrides only the hostname portion of dfs.federation.router.https-address | — |
| dfs.federation.router.http.enable | Enables/disables handling client requests to HDFS Router over HTTP | true |
| dfs.federation.router.fs-limits.max-component-length | Maximum number of bytes (in UTF-8 encoding) in each component of a path for HDFS Router. Multiple size unit suffixes are supported (case-insensitive). Acts similarly to dfs.namenode.fs-limits.max-component-length; the value 0 disables the check | 0 |
| dfs.federation.router.namenode.resolver.client.class | Class to resolve NameNode membership in a subcluster | org.apache.hadoop.hdfs.server.federation.resolver.MembershipNamenodeResolver |
| dfs.federation.router.store.enable | Enables HDFS Router access to the state store | true |
| dfs.federation.router.store.connection.test | Specifies how often to check the connection to the state store, in milliseconds | 60000 |
| dfs.federation.router.store.driver.zk.async.max.threads | Maximum number of threads for StateStoreZooKeeperImpl to write records to ZooKeeper asynchronously; -1 means records are written synchronously | -1 |
| dfs.federation.router.heartbeat.enable | Enables HDFS Router heartbeats to the state store | true |
| dfs.federation.router.heartbeat.interval | Interval in milliseconds at which HDFS Router sends heartbeats to the state store | 5000 |
| dfs.federation.router.health.monitor.timeout | Timeout for HDFS Router to obtain the HA service status of a monitored NameNode | 30s |
| dfs.federation.router.namenode.heartbeat.enable | If set to true, HDFS Router collects NameNode heartbeats and sends them to the state store | true |
| dfs.federation.router.namenode.heartbeat.jmx.interval | Interval in milliseconds at which HDFS Router requests JMX reports from a NameNode. If set to 0, a JMX report is requested on every heartbeat | 0 |
| dfs.federation.router.store.router.expiration | Expiration time for a router state record. If no time unit suffix is specified, milliseconds are assumed | 5m |
| dfs.federation.router.store.router.expiration.deletion | Time in milliseconds before an expired router state record is deleted. If an expired record exists longer than the specified time, it is deleted. If set to a negative value, the deletion is disabled | -1 |
| dfs.federation.router.safemode.enable | Enables the HDFS Router safe mode (see the example after this table) | true |
| dfs.federation.router.safemode.extension | Time for HDFS Router to run in safe mode after startup. The parameter supports multiple time unit suffixes. If no suffix is specified, milliseconds are assumed | 30s |
| dfs.federation.router.safemode.expiration | Time during which HDFS Router must be unable to access the state store before it enters safe mode. The parameter supports multiple time unit suffixes. If no suffix is specified, milliseconds are assumed | 3m |
| dfs.federation.router.safemode.checkperiod | Interval between HDFS Router safe mode checks. The parameter supports multiple time unit suffixes. If no suffix is specified, milliseconds are assumed | 5s |
| dfs.federation.router.monitor.namenode.nameservice.resolution-enabled | Used by HDFS Router to resolve NameNodes. Determines whether the given monitored NameNode address is a domain name that needs to be resolved | false |
| dfs.federation.router.monitor.namenode.nameservice.resolver.impl | Nameservice resolver implementation used by HDFS Router. Effective in combination with dfs.federation.router.monitor.namenode.nameservice.resolution-enabled | — |
| dfs.federation.router.monitor.localnamenode.enable | If set to true, HDFS Router monitors the NameNode running on the local machine | false |
| dfs.federation.router.mount-table.max-cache-size | Maximum number of entries in the mount table cache | 10000 |
| dfs.federation.router.mount-table.cache.enable | Enables/disables the mount table cache. Disabling the cache is recommended when a large number of unique paths are queried | true |
| dfs.federation.router.quota.enable | Enables the quota system for HDFS Router. When enabled, setting or clearing a subcluster's quota directly is not recommended, since the Router admin server will override the subcluster's quotas | false |
| dfs.federation.router.quota-cache.update.interval | Interval for updating the quota usage cache in HDFS Router. This property is effective only if dfs.federation.router.quota.enable is set to true | 60s |
| dfs.federation.router.client.thread-size | Maximum thread pool size for the Router client to execute concurrent requests | 32 |
| dfs.federation.router.client.retry.max.attempts | Maximum number of retry attempts for the Router client | 3 |
| dfs.federation.router.client.reject.overload | Setting this to true allows HDFS Router to reject client requests when it is overloaded (i.e. runs out of RPC client threads) | false |
| dfs.federation.router.client.allow-partial-listing | Defines whether HDFS Router can return a partial list of files in a multi-destination mount point when one of the subclusters is unavailable. Setting this to false makes the whole request fail in that case | true |
| dfs.federation.router.client.mount-status.time-out | Timeout for HDFS Router when listing folders containing mount points. During this process, HDFS Router has to check the mount table and then check permissions in the subcluster. If the timeout expires, default values are returned | 1s |
| dfs.federation.router.connect.timeout | Timeout for HDFS Router to connect to a subcluster | 2s |
| dfs.federation.router.keytab.file | The keytab file used by HDFS Router to log in as its service principal. The principal name is configured with dfs.federation.router.kerberos.principal | — |
| dfs.federation.router.kerberos.principal | The HDFS Router service principal. This is typically set to router/_HOST@REALM.TLD. Each Router replaces _HOST with its own fully qualified hostname at startup | — |
| dfs.federation.router.kerberos.principal.hostname | Host name of the HDFS Router containing this configuration file. This value differs for each machine. Defaults to the current host name | — |
| dfs.federation.router.kerberos.internal.spnego.principal | Server principal used by HDFS Router for web UI SPNEGO authentication when Kerberos is enabled. This is typically set to HTTP/_HOST@REALM.TLD | — |
| dfs.federation.router.mount-table.cache.update | Set to true to refresh the mount table cache immediately after mount table entries are added, modified, or deleted | false |
| dfs.federation.router.mount-table.cache.update.timeout | Time to wait until all the admin servers finish their mount table cache update. This setting supports multiple time unit suffixes | 1m |
| dfs.federation.router.mount-table.cache.update.client.max.time | The remote Router mount table cache is updated through client RPC calls, and these connections are cached for performance. This property defines the maximum time a connection can stay cached | 5m |
| dfs.federation.router.secret.manager.class | Class implementing the state store for managing delegation tokens | org.apache.hadoop.hdfs.server.federation.router.security.token.ZKDelegationTokenSecretManagerImpl |
| dfs.federation.router.top.num.token.realowners | Number of top owners of delegation tokens to report in HDFS Router's JMX metrics, ordered by the number of issued tokens | 10 |
| dfs.federation.router.fairness.policy.controller.class | Fairness policy controller class | org.apache.hadoop.hdfs.server.federation.router.fairness.BasicFairnessPolicy |
| dfs.federation.router.fairness.acquire.timeout | Maximum time to wait for a permit | 1s |
| dfs.federation.router.federation.rename.bandwidth | Maximum bandwidth for cross-namespace rename operations | 10 |
| dfs.federation.router.federation.rename.map | Maximum number of concurrent rename maps to use for copy | 10 |
| dfs.federation.router.federation.rename.delay | Delay in milliseconds before retrying a rename job | 1000 |
| dfs.federation.router.federation.rename.diff | Threshold of the diff entries used in the incremental copy stage | 0 |
| dfs.federation.router.federation.rename.option | Action to run when renaming across namespaces. Possible values are NONE and DISTCP | NONE |
| dfs.federation.router.federation.rename.force.close.open.file | Enables force-closing of all open files when there are no diffs left in the incremental copy stage | true |
| dfs.federation.router.federation.rename.trash | Controls the trash behavior when performing a cross-namespace rename. Supported values: trash (move the source to trash), delete (delete the source directly), and skip (skip both trash and deletion) | trash |
| dfs.federation.router.observer.read.default | Enables observer reads (served by standby or observer NameNodes) for all nameservices. This behavior can be inverted for individual nameservices by adding them to dfs.federation.router.observer.read.overrides | false |
| dfs.federation.router.observer.read.overrides | Comma-separated list of nameservices for which to invert the default observer read behavior set by dfs.federation.router.observer.read.default | — |
| dfs.federation.router.observer.federated.state.propagation.maxsize | Maximum size of the federated state to send in an RPC header. Sending the federated state removes the need to call msync on every read, at the expense of a larger header | 5 |
| dfs.federation.router.observer.state.id.refresh.period | Interval at which HDFS Router refreshes the namespace state ID from the active NameNode | 15s |
| zk-dt-secret-manager.zkConnectionString | ZooKeeper connection string for the ZooKeeper-based delegation token secret manager | — |
| zk-dt-secret-manager.zkAuthType | Authentication type for connecting to ZooKeeper | — |
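
The Router safe mode described by the parameters above can also be inspected and toggled manually through dfsrouteradmin. A short sketch:

```bash
# Check whether HDFS Router is currently in safe mode
hdfs dfsrouteradmin -safemode get

# Enter safe mode manually, then leave it
hdfs dfsrouteradmin -safemode enter
hdfs dfsrouteradmin -safemode leave
```
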
| Parameter | Description | Default value |
|---|---|---|
| httpfs.http.administrators | The ACL for the admins. This configuration is used to control who can access the default servlets for HttpFS server. The value should be a comma-separated list of users and groups. The user list comes first and is separated by a space, followed by the group list, for example: user1,user2 group1,group2. Both users and groups are optional | * |
| hadoop.http.temp.dir | The HttpFS temp directory | ${hadoop.tmp.dir}/httpfs |
| httpfs.ssl.enabled | Defines whether SSL is enabled. The default is false, i.e. SSL is disabled | false |
| httpfs.hadoop.config.dir | The location of the Hadoop configuration directory | /etc/hadoop/conf |
| httpfs.hadoop.authentication.type | Defines the authentication mechanism used by HttpFS for its HTTP clients. Valid values are simple and kerberos | simple |
| httpfs.hadoop.authentication.kerberos.keytab | The Kerberos keytab file with the credentials for the HTTP Kerberos principal used by HttpFS in the HTTP endpoint | /etc/security/keytabs/httpfs.service.keytab |
| httpfs.hadoop.authentication.kerberos.principal | The HTTP Kerberos principal used by HttpFS in the HTTP endpoint. The HTTP Kerberos principal MUST start with HTTP/ per the Kerberos HTTP SPNEGO specification | HTTP/${httpfs.hostname}@${kerberos.realm} |
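
HttpFS exposes the WebHDFS REST API, so these settings can be checked with plain HTTP calls. A sketch, assuming simple authentication and HttpFS listening on its default port 14000 at httpfs-host:

```bash
# List the HDFS root directory through HttpFS; with simple auth the
# user name is passed as a query parameter
curl "http://httpfs-host:14000/webhdfs/v1/?op=LISTSTATUS&user.name=hdfs"

# With httpfs.hadoop.authentication.type=kerberos, authenticate via SPNEGO
curl --negotiate -u : "http://httpfs-host:14000/webhdfs/v1/?op=GETHOMEDIRECTORY"
```
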
| Parameter | Description | Default value |
|---|---|---|
| xasecure.audit.destination.solr.batch.filespool.dir | Spool directory path | /srv/ranger/hdfs_plugin/audit_solr_spool |
| xasecure.audit.destination.solr.urls | A URL of the Solr server to store audit events. Leave this property value empty or set it to NONE when using ZooKeeper to connect to Solr | — |
| xasecure.audit.destination.solr.zookeepers | Specifies the ZooKeeper connection string for the Solr destination | — |
| xasecure.audit.destination.solr.force.use.inmemory.jaas.config | Whether to use an in-memory JAAS configuration file to connect to Solr | — |
| xasecure.audit.is.enabled | Enables Ranger audit | true |
| xasecure.audit.jaas.Client.loginModuleControlFlag | Specifies whether the success of the module is required, requisite, sufficient, or optional | — |
| xasecure.audit.jaas.Client.loginModuleName | Name of the authenticator class | — |
| xasecure.audit.jaas.Client.option.keyTab | Name of the keytab file to get the principal's secret key | — |
| xasecure.audit.jaas.Client.option.principal | Name of the principal to be used | — |
| xasecure.audit.jaas.Client.option.serviceName | Name of a user or a service that wants to log in | — |
| xasecure.audit.jaas.Client.option.storeKey | Set this to true if you want the keytab or the principal's key to be stored in the subject's private credentials | false |
| xasecure.audit.jaas.Client.option.useKeyTab | Set this to true if you want the module to get the principal's key from the keytab | false |
| Parameter | Description | Default value |
|---|---|---|
| ranger.plugin.hdfs.policy.rest.url | The URL to Ranger Admin | — |
| ranger.plugin.hdfs.service.name | The name of the Ranger service containing policies for this instance | — |
| ranger.plugin.hdfs.policy.cache.dir | The directory where Ranger policies are cached after successful retrieval from the source | /srv/ranger/hdfs/policycache |
| ranger.plugin.hdfs.policy.pollIntervalMs | Defines how often to poll for changes in policies | 30000 |
| ranger.plugin.hdfs.policy.rest.client.connection.timeoutMs | The HDFS plugin RangerRestClient connection timeout (in milliseconds) | 120000 |
| ranger.plugin.hdfs.policy.rest.client.read.timeoutMs | The HDFS plugin RangerRestClient read timeout (in milliseconds) | 30000 |
| ranger.plugin.hdfs.policy.rest.ssl.config.file | Path to the RangerRestClient SSL config file for the HDFS plugin | /etc/hadoop/conf/ranger-hdfs-policymgr-ssl.xml |
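
When Ranger Admin is unreachable, the plugin enforces the policies cached in ranger.plugin.hdfs.policy.cache.dir. A quick way to confirm that the NameNode plugin keeps syncing is to check the cache file's timestamp and contents; the exact file name depends on ranger.plugin.hdfs.service.name, and jq is assumed to be installed:

```bash
# The cache file is rewritten on every successful policy poll
# (every ranger.plugin.hdfs.policy.pollIntervalMs milliseconds)
ls -l /srv/ranger/hdfs/policycache/

# Print the names of the cached policies
jq '.policies[].name' /srv/ranger/hdfs/policycache/*.json
```
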
| Parameter | Description | Default value |
|---|---|---|
| Sources | A list of sources which will be written into httpfs-env.sh | — |
| HADOOP_CONF_DIR | Hadoop configuration directory | /etc/hadoop/conf |
| HADOOP_LOG_DIR | Path to the directory that contains application logs (.log files) and startup logs (.out files) | ${HTTPFS_LOG} |
| HADOOP_PID_DIR | PID file directory location | ${HTTPFS_TEMP} |
| HTTPFS_SSL_ENABLED | Defines if SSL is enabled for HttpFS | false |
| HTTPFS_SSL_KEYSTORE_FILE | Path to the keystore file | admin |
| HTTPFS_SSL_KEYSTORE_PASS | The password to access the keystore | admin |
| Final HTTPFS_ENV_OPTS | Final value of the environment options written to httpfs-env.sh | — |
| Parameter | Description | Default value |
|---|---|---|
| Sources | A list of sources that will be written into hadoop-env.sh | — |
| HDFS_NAMENODE_OPTS | NameNode heap memory. Sets the initial (-Xms) and maximum (-Xmx) Java heap sizes and environment options for the NameNode | -Xms1G -Xmx8G |
| HDFS_DATANODE_OPTS | DataNode heap memory. Sets the initial (-Xms) and maximum (-Xmx) Java heap sizes and environment options for the DataNode | -Xms700m -Xmx8G |
| HDFS_HTTPFS_OPTS | HttpFS heap memory. Sets the initial (-Xms) and maximum (-Xmx) Java heap sizes and environment options for the HttpFS server | -Xms700m -Xmx8G |
| HDFS_JOURNALNODE_OPTS | JournalNode heap memory. Sets the initial (-Xms) and maximum (-Xmx) Java heap sizes and environment options for the JournalNode | -Xms700m -Xmx8G |
| HDFS_ZKFC_OPTS | ZKFC heap memory. Sets the initial (-Xms) and maximum (-Xmx) Java heap sizes and environment options for ZKFC | -Xms500m -Xmx8G |
| Final HADOOP_ENV_OPTS | Final value of the environment options written to hadoop-env.sh | — |
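
Each *_OPTS value is an ordinary JVM argument string, so additional flags can be appended next to the heap sizes. A sketch of the resulting hadoop-env.sh entries; the heap sizes and the extra diagnostic flag are only an illustration:

```bash
# NameNode: fixed 4 GB heap plus a heap dump on OOM for diagnostics
export HDFS_NAMENODE_OPTS="-Xms4G -Xmx4G -XX:+HeapDumpOnOutOfMemoryError"

# DataNode: keep the default initial heap, raise only the maximum
export HDFS_DATANODE_OPTS="-Xms700m -Xmx4G"
```
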
| Parameter | Description | Default value |
|---|---|---|
| ssl.server.truststore.location | The truststore to be used by NameNodes and DataNodes | — |
| ssl.server.truststore.password | The password to the truststore | — |
| ssl.server.truststore.type | The truststore file format | jks |
| ssl.server.truststore.reload.interval | The truststore reload check interval (in milliseconds) | 10000 |
| ssl.server.keystore.location | Path to the keystore file used by NameNodes and DataNodes | — |
| ssl.server.keystore.password | The password to the keystore | — |
| ssl.server.keystore.keypassword | The password to the key in the keystore | — |
| ssl.server.keystore.type | The keystore file format | — |
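
The keystore and truststore referenced above are regular Java keystore files and can be prepared with keytool. A minimal self-signed sketch (paths, aliases, and passwords are placeholders; production clusters would import CA-signed certificates instead):

```bash
# Generate a key pair for this node in a JKS keystore
keytool -genkeypair -alias "$(hostname -f)" -keyalg RSA -keysize 2048 \
    -dname "CN=$(hostname -f)" \
    -keystore /etc/hadoop/conf/keystore.jks -storepass changeit

# Export the certificate and import it into the shared truststore
keytool -exportcert -alias "$(hostname -f)" \
    -keystore /etc/hadoop/conf/keystore.jks -storepass changeit \
    -file "/tmp/$(hostname -f).crt"
keytool -importcert -noprompt -alias "$(hostname -f)" \
    -file "/tmp/$(hostname -f).crt" \
    -keystore /etc/hadoop/conf/truststore.jks -storepass changeit
```
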
| State | Description | Default value |
|---|---|---|
| DECOMMISSIONED | When an administrator decommissions a DataNode, it is first transitioned into the DECOMMISSION_INPROGRESS state. Once all blocks belonging to that DataNode have been fully replicated elsewhere based on each block's replication factor, the DataNode transitions to the DECOMMISSIONED state, after which it can be safely shut down | — |
| IN_MAINTENANCE | Sometimes administrators only need to take DataNodes down for minutes or hours to perform short-term repair or maintenance. In such scenarios, the HDFS block replication overhead incurred by decommissioning may be unnecessary, and a lightweight process is preferable. That is what the maintenance state is for. When an administrator puts a DataNode in the maintenance state, it is first transitioned to the ENTERING_MAINTENANCE state. Once all blocks belonging to that DataNode are minimally replicated, it transitions to the IN_MAINTENANCE state | — |
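
With CombinedHostFileManager (see dfs.namenode.hosts.provider.classname above), these admin states are assigned in the JSON hosts file referenced by dfs.hosts. A sketch that decommissions one node and puts another into time-limited maintenance; hostnames and the expiration timestamp are placeholders:

```bash
cat > /etc/hadoop/conf/dfs.hosts <<'EOF'
[
  {"hostName": "dn1.example.com", "adminState": "DECOMMISSIONED"},
  {"hostName": "dn2.example.com", "adminState": "IN_MAINTENANCE",
   "maintenanceExpireTimeInMS": 1767225600000},
  {"hostName": "dn3.example.com"}
]
EOF

# Make the NameNode re-read the hosts file
hdfs dfsadmin -refreshNodes
```
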
| Parameter | Description | Default value |
|---|---|---|
| Additional nameservices | Additional (internal) names for an HDFS cluster that allow querying another HDFS cluster from the current one | — |
| Custom core-site.xml | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the core-site.xml configuration file | — |
| Custom hdfs-site.xml | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the hdfs-site.xml configuration file | — |
| Custom httpfs-site.xml | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the httpfs-site.xml configuration file | — |
| Ranger plugin enabled | Defines whether the Ranger plugin is enabled | — |
| Custom ranger-hdfs-audit.xml | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the ranger-hdfs-audit.xml configuration file | — |
| Custom ranger-hdfs-security.xml | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the ranger-hdfs-security.xml configuration file | — |
| Custom ranger-hdfs-policymgr-ssl.xml | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the ranger-hdfs-policymgr-ssl.xml configuration file | — |
| Custom httpfs-env.sh | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the httpfs-env.sh configuration file | — |
| Custom hadoop-env.sh | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the hadoop-env.sh configuration file | — |
| Custom ssl-server.xml | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the ssl-server.xml configuration file | — |
| Custom ssl-client.xml | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the ssl-client.xml configuration file | — |
| Topology script | The topology script used in HDFS (see the example after this table) | — |
| Topology data | An optional text file that maps host names to rack numbers for the topology script. Saved to /etc/hadoop/conf/topology.data | — |
| Custom log4j.properties | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the log4j.properties configuration file | — |
| Custom httpfs-log4j.properties | In this section, you can define values for custom parameters that are not displayed in the ADCM UI but are allowed in the httpfs-log4j.properties configuration file | — |
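
A topology script receives one or more IP addresses or host names as arguments and must print one rack path per argument. A minimal sketch that resolves racks from the two-column topology.data file described above, falling back to /default-rack for unknown hosts:

```bash
#!/usr/bin/env bash
# Map each argument to a rack using /etc/hadoop/conf/topology.data,
# a file of "<host-or-ip> <rack>" pairs, one mapping per line
DATA=/etc/hadoop/conf/topology.data
for host in "$@"; do
    rack=$(awk -v h="$host" '$1 == h {print $2; exit}' "$DATA")
    echo "${rack:-/default-rack}"
done
```
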
| Parameter | Description | Default value |
|---|---|---|
| Java agent path | Path to the JMX Prometheus Java agent | /usr/lib/adh-utils/jmx/jmx_prometheus_javaagent.jar |
| Prometheus metrics port | Port on which to display HDFS DataNode metrics in the Prometheus format | 9202 |
| Mapping config path | Path to the metrics mapping configuration file | /etc/hadoop/conf/jmx_hdfs_datanode_metric_config.yml |
| Mapping config | Metrics mapping configuration file | — |

| Parameter | Description | Default value |
|---|---|---|
| Java agent path | Path to the JMX Prometheus Java agent | /usr/lib/adh-utils/jmx/jmx_prometheus_javaagent.jar |
| Prometheus metrics port | Port on which to display HDFS JournalNode metrics in the Prometheus format | 9203 |
| Mapping config path | Path to the metrics mapping configuration file | /etc/hadoop/conf/jmx_hdfs_journalnode_metric_config.yml |
| Mapping config | Metrics mapping configuration file | — |

| Parameter | Description | Default value |
|---|---|---|
| Java agent path | Path to the JMX Prometheus Java agent | /usr/lib/adh-utils/jmx/jmx_prometheus_javaagent.jar |
| Prometheus metrics port | Port on which to display HDFS NameNode metrics in the Prometheus format | 9201 |
| Mapping config path | Path to the metrics mapping configuration file | /etc/hadoop/conf/jmx_hdfs_namenode_metric_config.yml |
| Mapping config | Metrics mapping configuration file | — |
| Parameter | Description | Default value |
|---|---|---|
| Username | Username for basic authentication | — |
| Password | Password for basic authentication | — |

NOTE: When the Monitoring authentication parameter group is enabled, access to metrics becomes restricted, and Prometheus uses the specified credentials for data collection.
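
To check an exporter manually, query the metrics port of the corresponding component (9201 for the NameNode, 9202 for the DataNode, 9203 for the JournalNode). A sketch, assuming a NameNode at namenode-host; with the Monitoring authentication group enabled, pass the same credentials that Prometheus uses:

```bash
# Without authentication
curl http://namenode-host:9201/metrics

# With Monitoring authentication enabled
curl -u username:password http://namenode-host:9201/metrics
```
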