DataNode hot swapping
DataNode hot swapping in HDFS is the process of changing a disk on a DataNode without shutting the DataNode down.
The following instructions describe how to configure a new disk and add it to the DataNode using the CLI and ADCM.
To add a new disk to the DataNode:
-
Connect a disk to the desired host. You can check if it’s visible to the system by running the lsblk command on the host. Possible output:
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda    253:0    0  100G  0 disk
├─vda1 253:1    0    1M  0 part
└─vda2 253:2    0  100G  0 part /
vdb    253:16   0   20G  0 disk
-
Create a directory for HDFS:
$ mkdir -p /srv/hadoop-hdfs/data1
-
Create a file system on the disk:
$ mkfs.xfs /dev/vdb
-
Mount the disk:
$ mount /dev/vdb /srv/hadoop-hdfs/data1
-
Add the new file system to /etc/fstab so that it is mounted after a reboot:
$ echo "/dev/vdb /srv/hadoop-hdfs/data1 xfs defaults,noatime 0 0" | sudo tee --append /etc/fstab
-
Mount all file systems listed in /etc/fstab:
$ mount -a
You can check if the file system has been mounted successfully using the lsblk command. Possible output:
NAME   MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
vda    253:0    0  100G  0 disk
├─vda1 253:1    0    1M  0 part
└─vda2 253:2    0  100G  0 part /
vdb    253:16   0   20G  0 disk /srv/hadoop-hdfs/data1
-
Make the hdfs user the owner of the new directory and set its permissions as follows:
$ chown -R hdfs:hadoop /srv/hadoop-hdfs/data1
$ chmod -R 755 /srv/hadoop-hdfs/data1
-
Specify the created directory in the dfs.datanode.data.dir parameter for the selected DataNode. You can do this manually by editing the hdfs-site.xml file on the DataNode host, or by creating a config group in ADCM. For more information on how to change the dfs.datanode.data.dir parameter value, see the Add HDFS data directories article. To learn how to create a config group, refer to the Set up configuration groups article.
CAUTION
Change the dfs.datanode.data.dir property only for the DataNode whose host has the required directory. Changing the parameter for the whole system may result in an error.
-
Start the DataNode reconfiguration by running the command:
$ hdfs dfsadmin -reconfig datanode <HOST>:9867 start
Where <HOST> is the FQDN of the DataNode host. To check the status of the reconfiguration task, run:
$ hdfs dfsadmin -reconfig datanode <HOST>:9867 status
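The dfs.datanode.data.dir value edited in the steps above is a comma-separated list of directories, so adding a disk amounts to appending the new mount point to the existing list. The following is a minimal shell sketch of that edit, not part of Hadoop or ADCM: the append_data_dir helper is hypothetical, and /srv/hadoop-hdfs/data0 is an assumed pre-existing data directory used only for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Hadoop or ADCM): prints the new
# dfs.datanode.data.dir value with one more directory appended.
#   $1 - current comma-separated value
#   $2 - directory to add
append_data_dir() {
    current=$1
    new_dir=$2
    case ",$current," in
        *",$new_dir,"*) printf '%s\n' "$current" ;;          # already listed, keep as is
        ",,")           printf '%s\n' "$new_dir" ;;           # list was empty
        *)              printf '%s\n' "$current,$new_dir" ;;  # append to the list
    esac
}

# /srv/hadoop-hdfs/data0 is an assumed existing data directory.
append_data_dir "/srv/hadoop-hdfs/data0" "/srv/hadoop-hdfs/data1"
```

The printed string is what would go into the dfs.datanode.data.dir property before starting the reconfiguration. The helper leaves the value unchanged if the directory is already listed, so running it twice does not produce duplicates.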
To remove a disk from the DataNode, delete its directory from the dfs.datanode.data.dir property on the DataNode host and run the reconfiguration command again.
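Because dfs.datanode.data.dir holds a comma-separated list of directories, removing a disk means dropping one entry from that list. The sketch below illustrates the edit under stated assumptions: remove_data_dir is a hypothetical helper, not a Hadoop or ADCM tool, the /srv/hadoop-hdfs/data0 directory is assumed for the example, and the value is assumed to contain no spaces around the commas.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with Hadoop or ADCM): prints the
# dfs.datanode.data.dir value with one directory removed.
#   $1 - current comma-separated value
#   $2 - directory to remove
remove_data_dir() (
    # The function body runs in a subshell, so the IFS and noglob
    # changes below do not leak into the calling shell.
    IFS=,
    set -f
    result=""
    for dir in $1; do
        if [ "$dir" != "$2" ]; then
            # Re-join the remaining entries with commas.
            result="${result:+$result,}$dir"
        fi
    done
    printf '%s\n' "$result"
)

# /srv/hadoop-hdfs/data0 is an assumed directory that stays in use.
remove_data_dir "/srv/hadoop-hdfs/data0,/srv/hadoop-hdfs/data1" "/srv/hadoop-hdfs/data1"
```

After the property is updated with the printed value, the same hdfs dfsadmin -reconfig datanode <HOST>:9867 start command applies the change.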