HDFS command cheatsheet
List commands
hdfs dfs -ls / |
Lists all files and directories in the given HDFS destination path |
hdfs dfs -ls -d /tmp |
Lists directories as plain files. In this case, the command lists the details of the /tmp directory itself rather than its contents |
hdfs dfs -ls -h /hdfsDirectory |
Lists an HDFS directory with file sizes in a human-readable format (for example, 64.0m instead of 67108864) |
hdfs dfs -ls -R /hdfsDirectory |
Recursively lists all files and all subdirectories in an HDFS directory |
hdfs dfs -ls /hdfsDirectory/hdfsFile* |
Lists all the files matching the pattern. In this case, lists all the files inside the hdfsDirectory HDFS directory that start with hdfsFile |
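The listing commands above print a header line and several metadata columns per entry. As a minimal sketch, the helper below keeps only the path column. The name hdfs_ls_paths is hypothetical, and the sketch assumes the usual eight-column -ls layout and paths without spaces.

```shell
# Print only the paths found under an HDFS directory or pattern.
# hdfs_ls_paths is a hypothetical helper name, not part of the HDFS CLI.
# Assumes the usual 8-column -ls output with the path in column 8,
# and paths that contain no whitespace.
hdfs_ls_paths() {
  # The "Found N items" header has fewer than 8 fields, so it is skipped.
  hdfs dfs -ls "$1" | awk 'NF >= 8 { print $8 }'
}

# Example call (quote patterns so the local shell does not expand them):
#   hdfs_ls_paths '/hdfsDirectory/hdfsFile*'
```

Quoting the pattern passes it to HDFS unchanged, so the matching happens against HDFS paths rather than local files.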
Read and write commands
hdfs dfs -text /hdfsDirectory/hdfsFile |
Takes a source file and outputs the file in text format to stdout. The allowed formats are ZIP and TextRecordInputStream |
hdfs dfs -cat /hdfsDirectory/hdfsFile |
Displays the content of an HDFS file to stdout |
hdfs dfs -appendToFile localfsFile /tmp/hdfsFile |
Appends the content from a local file to an HDFS file |
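Because -cat writes to stdout, it combines with ordinary shell tools. The sketch below previews the first lines of an HDFS file. The helper name hdfs_head and the default of 10 lines are assumptions made for illustration.

```shell
# Show the first N lines of an HDFS file (N defaults to 10).
# hdfs_head is a hypothetical helper name, not part of the HDFS CLI.
hdfs_head() {
  hdfs dfs -cat "$1" | head -n "${2:-10}"
}

# Example call:
#   hdfs_head /hdfsDirectory/hdfsFile 20
```

Since head stops reading early, the hdfs process may report a closed pipe on stderr for large files; that message is harmless in this context.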
Download and upload commands
hdfs dfs -put localfsFile /hdfsDirectory |
Copies a file from the local file system to an HDFS directory |
hdfs dfs -put -f localfsFile /hdfsDirectory |
Copies a file from the local file system to an HDFS directory, and in case the HDFS file already exists, overwrites the file using the -f flag |
hdfs dfs -put -l localfsFile /hdfsDirectory |
Copies a file from the local file system to an HDFS directory. The -l flag allows a DataNode to persist the file to disk lazily and forces a replication factor of 1 |
hdfs dfs -put -p localfsFile /hdfsDirectory |
Copies a file from the local file system to an HDFS directory. Passing -p preserves access and modification times, ownership, and the mode |
hdfs dfs -get hdfsFile /localfsDirectory |
Copies a file from an HDFS directory to the local file system directory |
hdfs dfs -get -p hdfsFile /localfsDirectory |
Copies a file from an HDFS directory to the local file system directory. Passing -p preserves access and modification times, ownership, and the mode |
hdfs dfs -put /localfsDirectory/*.* /hdfsDirectory |
Copies all files matching the pattern from the local file system directory to an HDFS directory |
hdfs dfs -copyFromLocal localfsFile /hdfsDirectory |
Works similarly to the put command, except that the source is restricted to a local file reference |
hdfs dfs -copyToLocal hdfsFile /localfsDirectory |
Works similarly to the get command, except that the destination is restricted to a local file reference |
hdfs dfs -moveFromLocal localfsFile /hdfsDirectory |
Works similarly to the put command, except that the source is deleted after it’s copied |
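A plain -put fails when the target already exists, and -f overwrites it. Where neither is wanted, one option is to check first with hdfs dfs -test -e, which returns 0 if the path exists. The following is a sketch only; hdfs_put_if_absent is a hypothetical helper, and the second argument is assumed to be the full target file path rather than a directory.

```shell
# Upload a local file only when the target HDFS path does not exist yet.
# hdfs_put_if_absent is a hypothetical helper name.
#   $1 = local source file
#   $2 = full HDFS target path (file path, not the parent directory)
hdfs_put_if_absent() {
  if hdfs dfs -test -e "$2"; then
    echo "skip: $2 already exists"
  else
    hdfs dfs -put "$1" "$2"
  fi
}

# Example call:
#   hdfs_put_if_absent localfsFile /hdfsDirectory/localfsFile
```

The check and the upload are two separate calls, so this does not protect against another client creating the file in between.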
File management commands
hdfs dfs -cp /hdfsDirectory1/hdfsFile1 /hdfsDirectory2 |
Copies a file from an HDFS directory to another HDFS directory |
hdfs dfs -cp -p /hdfsDirectory1/hdfsFile1 /hdfsDirectory2 |
Copies a file from an HDFS directory to another HDFS directory. Passing -p preserves file attributes (by default timestamps, ownership, and permission) |
hdfs dfs -cp -f /hdfsDirectory1/hdfsFile1 /hdfsDirectory2 |
Copies a file from an HDFS directory to another HDFS directory. Passing -f overwrites the destination if it already exists |
hdfs dfs -mv /hdfsDirectory1/hdfsFile1 /hdfsDirectory2 |
Moves a file from an HDFS directory to another HDFS directory |
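hdfs dfs -rm /hdfsDirectory/hdfsFile |
Deletes a file from an HDFS directory |
hdfs dfs -rm -r /hdfsDirectory, hdfs dfs -rm -R /hdfsDirectory, or hdfs dfs -rmr /hdfsDirectory |
Deletes all subdirectories and files recursively from an HDFS directory. The three forms are equivalent; -rmr is deprecated in favor of -rm -r |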
hdfs dfs -rm -skipTrash /hdfsDirectory |
The -skipTrash option bypasses the trash, if enabled, and deletes the specified path immediately |
hdfs dfs -rm -f hdfsFile |
Deletes the file. With -f, no warning is displayed if the file does not exist |
hdfs dfs -rmdir /hdfsDirectory |
Deletes an empty HDFS directory |
hdfs dfs -mkdir /hdfsDirectory |
Creates a directory in the specified HDFS location |
hdfs dfs -mkdir -p /hdfsDirectory1/hdfsDirectory2 |
Creates a directory in the specified HDFS location, including any missing parent directories. This command does not fail even if the directory already exists |
hdfs dfs -touchz /hdfsDirectory/hdfsFile |
Creates a file of zero length at the specified path with the current time as the timestamp of that path |
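Recursive deletes are easy to mistype, so a small guard can help. The sketch below wraps hdfs dfs -rm -r and refuses an empty argument or the root path. The helper name hdfs_rm_tree and the choice of which paths to refuse are assumptions for illustration.

```shell
# Recursively delete an HDFS directory, refusing obviously dangerous targets.
# hdfs_rm_tree is a hypothetical helper name, not part of the HDFS CLI.
hdfs_rm_tree() {
  case "$1" in
    ""|"/")
      echo "refusing to delete '$1'" >&2
      return 1
      ;;
  esac
  hdfs dfs -rm -r "$1"
}

# Example call:
#   hdfs_rm_tree /hdfsDirectory
```

Without -skipTrash, deleted paths go to the trash when it is enabled on the cluster, which leaves a window for recovery.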
Ownership and validation commands
hdfs dfs -checksum /hdfsDirectory/hdfsFile |
Dumps the checksum information for a file to stdout |
hdfs dfs -chmod 755 /hdfsDirectory/hdfsFile |
Changes the permissions of the file |
hdfs dfs -chmod -R 755 /hdfsDirectory |
Changes the permissions of the files recursively |
hdfs dfs -chown owner:group /hdfsDirectory |
Changes the owner and group of the file or directory |
hdfs dfs -chown -R owner:group /hdfsDirectory |
Changes the owner and group of the files recursively |
hdfs dfs -chgrp group /hdfsDirectory |
Changes the group association of the file |
hdfs dfs -chgrp -R group /hdfsDirectory |
Changes the group association of the files recursively |
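Ownership and permissions are often set together when preparing a directory. As a sketch, the helper below applies both recursively and stops if the ownership change fails. The name hdfs_set_access and its argument order are hypothetical.

```shell
# Set owner, group, and mode on an HDFS path recursively.
# hdfs_set_access is a hypothetical helper name.
#   $1 = owner, $2 = group, $3 = mode (for example 755), $4 = HDFS path
hdfs_set_access() {
  # chmod runs only if chown succeeded.
  hdfs dfs -chown -R "$1:$2" "$4" && hdfs dfs -chmod -R "$3" "$4"
}

# Example call:
#   hdfs_set_access owner group 755 /hdfsDirectory
```

Changing the owner normally requires HDFS superuser privileges, so the first command may fail for ordinary users.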
Filesystem commands
hdfs dfs -df / |
Shows the capacity, free, and used space of the filesystem |
hdfs dfs -df -h / |
Shows the capacity, free, and used space of the filesystem. The -h option formats the sizes in a human-readable format |
hdfs dfs -du /hdfsDirectory/hdfsFile |
Shows the occupied space (in bytes) used by the files that match the specified file pattern |
hdfs dfs -du -s /hdfsDirectory/hdfsFile |
Rather than showing the size of each file that matches the pattern, shows the total (summary) size |
hdfs dfs -du -h /hdfsDirectory/hdfsFile |
Shows the occupied space used by the files that match the specified file pattern, with sizes in a human-readable format rather than in bytes |
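For scripting, the summary form of -du is convenient because it prints a single line. The sketch below extracts the size in bytes, assuming the size is the first column of the output. The helper name hdfs_dir_bytes is hypothetical.

```shell
# Print the total size in bytes of an HDFS path.
# hdfs_dir_bytes is a hypothetical helper name.
# Assumes the first column of "hdfs dfs -du -s" output is the size in bytes.
hdfs_dir_bytes() {
  hdfs dfs -du -s "$1" | awk '{ print $1 }'
}

# Example call:
#   hdfs_dir_bytes /hdfsDirectory
```

The -h flag is deliberately omitted so the value stays a plain integer that other tools can compare or sum.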
Administration commands
hdfs balancer -threshold 30 |
Launches the cluster balancing utility. Here the default threshold (10 percent) is overridden with 30 percent |
hadoop version |
Prints the current Hadoop version |
hdfs fsck / |
Checks the health of HDFS |
hdfs dfsadmin -safemode leave |
Turns off the safe mode for a NameNode |
hdfs dfsadmin -refreshNodes |
Re-reads the hosts and exclude files to update the set of DataNodes that are allowed to connect to the NameNode and those that should be decommissioned or recommissioned |
hdfs namenode -format |
Formats the NameNode, which erases the existing filesystem metadata |
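Before leaving safe mode or starting write jobs, a script can query the current state with hdfs dfsadmin -safemode get. The following is a minimal sketch; hdfs_in_safemode is a hypothetical helper, and it assumes the report contains the text "Safe mode is ON" when safe mode is active.

```shell
# Return success (exit status 0) when the NameNode reports safe mode is on.
# hdfs_in_safemode is a hypothetical helper name.
# Assumes the report from "-safemode get" contains "Safe mode is ON".
hdfs_in_safemode() {
  hdfs dfsadmin -safemode get | grep -q "Safe mode is ON"
}

# Example use:
#   if hdfs_in_safemode; then
#     echo "NameNode is in safe mode, skipping writes"
#   fi
```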