HDFS service management via ADCM
Overview
The ADCM UI provides actions to manage the HDFS service and its components. For information on how to run service actions, refer to ADH service actions.
The actions available for the HDFS service are listed in the table below.
Action | Description |
---|---|
Check | Runs service-specific tests to check the health of the service and its components |
Start | Starts the service. When you run this action, the option Apply configs from ADCM is available. If it is set to |
Stop | Stops the service |
Restart | Restarts the service. When you run this action, the option Apply configs from ADCM is available. If it is set to |
Remove | Removes the service from the cluster. This action should be used to remove already installed services, whereas the control can be used to remove a non-mapped service (a service whose components have not been distributed among cluster hosts) |
Add Client(s) | Adds HDFS client(s) to the cluster. Running this action opens the component-host mapping interface where you can add new HDFS clients |
Remove Client(s) | Removes HDFS client(s) from cluster hosts. Running this action opens the component-host mapping interface where you can remove HDFS clients from specific hosts |
Start balancer | Starts the HDFS Balancer |
Stop balancer | Stops the HDFS Balancer |
Expand DataNode | Adds DataNodes to the cluster hosts. Running this action opens the component-host mapping interface where you can add new DataNodes |
Remove DataNode | Removes DataNodes from the cluster hosts. Running this action opens the component-host mapping interface where you can remove DataNodes |
Decommiss DataNodes | Replicates the data of the selected DataNodes to other DataNodes, for example, so that a DataNode can be safely removed or taken down for long-term maintenance. For short-term maintenance, use the Maintenance DataNodes action |
Maintenance DataNodes | Takes a DataNode out of service. In maintenance mode, the DataNode does not accept changes and does not replicate or delete blocks |
Recommiss DataNodes | Reinstates the decommissioned DataNode and balances data between nodes |
Exit Maintenance Mode | Returns a DataNode that is under maintenance back to service |
Start disk balancer | Starts the disk balancer |
Stop disk balancer | Stops the disk balancer |
Check disk balancer | Gets the current disk balancer status from the specified DataNodes. To view the report, go to the Jobs page |
Report disk balancer | Reports volume information from the specified DataNodes. To view the report, go to the Jobs page |
Add HttpFS Server(s) | Adds HttpFS Server components to the cluster hosts. Running this action opens the component-host mapping interface where you can add new HttpFS Server(s) |
Remove HttpFS Server(s) | Removes HttpFS Server components from the cluster hosts. Running this action opens the component-host mapping interface where you can remove HttpFS Server(s) |
Add JournalNodes | Adds JournalNodes to the cluster hosts. Running this action opens the component-host mapping interface where you can add new JournalNodes |
Remove JournalNodes | Removes JournalNodes from the cluster hosts. Running this action opens the component-host mapping interface where you can remove JournalNodes |
Start mover | Starts the Mover. When starting the mover, specify the directories whose storage policies must be enforced |
Stop mover | Stops the Mover |
Add NameNode(s) | Adds NameNodes to the cluster hosts. Running this action opens the component-host mapping interface where you can add new NameNode(s) |
Remove NameNode(s) | Removes NameNodes from the cluster hosts. Running this action opens the component-host mapping interface where you can remove NameNode(s) |
Change internal nameservices | Allows you to change the internal nameservices. The value must be alphanumeric without underscores |
Manage Ranger plugin | Enables or disables the Ranger plugin for HDFS |
Balancer
The Balancer helps manage the load on DataNodes in a cluster. You can start the balancer whenever data is unevenly distributed between DataNodes, for example, after a new DataNode has been added. The balancer stops once the DataNodes' load is within the acceptable threshold.
The threshold defines how far the load of an individual DataNode may deviate from the load of the whole cluster, expressed as a percentage of disk space.
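The threshold rule above can be illustrated with a small calculation. This is a minimal sketch, not the balancer's own code: the function name and its arguments are hypothetical, and it assumes that "load" means used disk space divided by capacity.

```python
def is_within_threshold(node_used, node_capacity,
                        cluster_used, cluster_capacity,
                        threshold_pct=10.0):
    """Return True if a DataNode's utilization deviates from the cluster's
    utilization by no more than threshold_pct percentage points.

    Assumption: utilization is used space / capacity, in percent.
    """
    node_util = 100.0 * node_used / node_capacity
    cluster_util = 100.0 * cluster_used / cluster_capacity
    return abs(node_util - cluster_util) <= threshold_pct


# Cluster is 40% full; the default threshold is 10%.
print(is_within_threshold(45, 100, 400, 1000))  # node at 45%: within threshold
print(is_within_threshold(55, 100, 400, 1000))  # node at 55%: needs balancing
```

With a threshold of 10%, a DataNode at 55% utilization in a cluster at 40% is 15 percentage points away from the cluster average, so it would still be a candidate for balancing.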
After you select the Start balancer action, fill in the following fields in the window that appears (or leave empty to use the default values):
- Threshold: a percentage value between 1 and 100. The default value is 10%. Smaller values produce a more evenly balanced cluster, but balancing takes longer. If the value is too small and the DataNodes' load changes concurrently, the cluster may not be able to reach the balanced state.
- Hosts to exclude: FQDNs of the hosts whose DataNodes should be ignored by the balancer.
- Hosts to include: FQDNs of the hosts whose DataNodes should be included in the balancing process. By default, all hosts are included.
- Source hosts: FQDNs of the hosts whose DataNodes should be balanced first. The balancer moves blocks only from the specified DataNodes. By default, all hosts count as source hosts.
- Idle iterations: the number of iterations the balancer can remain idle before it stops. The default value is 5.
You can run the balancer with additional parameters by using the balancer CLI command or by changing its parameters in the hdfs-site.xml configuration file.
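For the CLI route, the fields listed above correspond to options of the `hdfs balancer` command (`-threshold`, `-exclude`, `-include`, `-source`, `-idleiterations`). The helper below is a hedged sketch that only assembles the argument list. The function name and the example host names are hypothetical, and how ADCM itself invokes the balancer is not described here.

```python
def balancer_command(threshold=None, exclude=None, include=None,
                     source=None, idle_iterations=None):
    """Build an `hdfs balancer` argument list from optional settings.

    Options left as None are omitted, so the balancer uses its defaults.
    """
    cmd = ["hdfs", "balancer"]
    if threshold is not None:
        # The range 1..100 follows the Threshold field description above.
        if not 1 <= threshold <= 100:
            raise ValueError("threshold must be between 1 and 100")
        cmd += ["-threshold", str(threshold)]
    if exclude:
        cmd += ["-exclude", ",".join(exclude)]
    if include:
        cmd += ["-include", ",".join(include)]
    if source:
        cmd += ["-source", ",".join(source)]
    if idle_iterations is not None:
        cmd += ["-idleiterations", str(idle_iterations)]
    return cmd


# Hypothetical host names, for illustration only.
print(" ".join(balancer_command(threshold=5,
                                exclude=["dn3.example.com"],
                                idle_iterations=3)))
```

The resulting list can be passed to `subprocess.run` on a host where the HDFS client is installed, typically under a user that has HDFS superuser privileges.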
Disk balancer
The Disk balancer helps distribute the load between the data directories of a single DataNode. You can add data directories in the dfs.datanode.data.dir parameter of the HDFS configuration.
You can run the disk balancer with additional parameters by using the disk balancer CLI command or by changing its parameters in the hdfs-site.xml configuration file.
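On the command line, the disk balancer is driven through `hdfs diskbalancer` subcommands (`-plan`, `-execute`, `-query`, `-cancel`, `-report`). The sketch below assembles those commands for one DataNode. The function name, the host name, and the plan file path are hypothetical, and the pairing with ADCM actions in the comments is an assumption rather than something this page states.

```python
def disk_balancer_commands(datanode, plan_file):
    """Return `hdfs diskbalancer` argument lists for one DataNode.

    plan_file is the path printed by the -plan step; the value used in
    the example below is made up.
    """
    base = ["hdfs", "diskbalancer"]
    return {
        # Assumed to roughly correspond to "Start disk balancer":
        "plan": base + ["-plan", datanode],
        "execute": base + ["-execute", plan_file],
        # Assumed to roughly correspond to "Check disk balancer":
        "query": base + ["-query", datanode],
        # Assumed to roughly correspond to "Stop disk balancer":
        "cancel": base + ["-cancel", plan_file],
        # Assumed to roughly correspond to "Report disk balancer":
        "report": base + ["-report", "-node", datanode],
    }


cmds = disk_balancer_commands("dn1.example.com", "/tmp/dn1.plan.json")
for step in ("plan", "execute", "query"):
    print(" ".join(cmds[step]))
```

A plan must be generated before it can be executed, which is why the sketch keeps `plan` and `execute` as separate steps.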
Mover
The Mover is a data migration tool that checks whether the data in the specified directory complies with the storage policy and, if it does not, moves the replicas to a different storage to fulfill the storage policy requirement.
A changed data storage policy is not applied automatically. Use the mover action to ensure that the new data storage policy is fulfilled.
You can run the mover with additional parameters by using the mover CLI command or by changing its parameters in the hdfs-site.xml configuration file.
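The two-step workflow described above (change the policy, then run the mover) can be sketched with the `hdfs storagepolicies -setStoragePolicy` and `hdfs mover -p` commands. The helper names, the directory, and the choice of the COLD policy are illustrative assumptions.

```python
def set_storage_policy_command(path, policy):
    """Build the command that assigns a storage policy to a path."""
    return ["hdfs", "storagepolicies", "-setStoragePolicy",
            "-path", path, "-policy", policy]


def mover_command(paths):
    """Build the mover command for one or more HDFS directories."""
    if not paths:
        raise ValueError("at least one path is required")
    return ["hdfs", "mover", "-p", *paths]


# Hypothetical directory and policy, for illustration only.
steps = [
    set_storage_policy_command("/data/archive", "COLD"),
    mover_command(["/data/archive"]),
]
for step in steps:
    print(" ".join(step))
```

Running only the first command changes the policy metadata; existing replicas stay where they are until the mover processes the directory.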
Internal nameservice
An internal nameservice is an additional (internal) name for an HDFS cluster that allows you to query another HDFS cluster from the current one, for example, to transfer data between clusters or to create tasks.
You can query any nameservice specified in the dfs.internal.nameservices parameter of the hdfs-site.xml configuration file. This cluster’s DataNodes will report to all the nameservices in this list.
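To see which internal nameservices a configuration defines, the parameter can be read directly from hdfs-site.xml. The following is a minimal sketch under stated assumptions: the XML fragment and the nameservice names are made up, the value is assumed to be a comma-separated list, and the validation mirrors the "alphanumeric without underscores" rule given for the Change internal nameservices action.

```python
import xml.etree.ElementTree as ET

# Made-up configuration fragment, for illustration only.
SAMPLE_HDFS_SITE = """<configuration>
  <property>
    <name>dfs.internal.nameservices</name>
    <value>adhprod, adhdev</value>
  </property>
</configuration>"""


def internal_nameservices(xml_text):
    """Return the nameservices listed in dfs.internal.nameservices."""
    root = ET.fromstring(xml_text)
    for prop in root.iter("property"):
        if prop.findtext("name") == "dfs.internal.nameservices":
            value = prop.findtext("value") or ""
            return [ns.strip() for ns in value.split(",") if ns.strip()]
    return []


def is_valid_nameservice(name):
    """Accept only non-empty ASCII letters and digits (no underscores)."""
    return name.isascii() and name.isalnum()


for ns in internal_nameservices(SAMPLE_HDFS_SITE):
    print(ns, is_valid_nameservice(ns))
```

In practice, the file would be read from the Hadoop configuration directory of a cluster host instead of an inline string.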