HDFS Command Cheatsheet

List commands

hdfs dfs -ls /

Lists the files and directories at the given HDFS path

hdfs dfs -ls -d /tmp

Lists directories as plain files. In this case, this command lists the details of the tmp directory

hdfs dfs -ls -h /hdfsDirectory

Lists an HDFS directory with file sizes shown in a human-readable format

hdfs dfs -ls -R /hdfsDirectory

Recursively lists all files and all subdirectories in an HDFS directory

hdfs dfs -ls /hdfsDirectory/hdfsFile*

Lists all the files matching the pattern. In this case, lists all the files inside the hdfsDirectory HDFS directory that start with hdfsFile
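
If the pattern is typed unquoted, the local shell may expand it against the local file system before hdfs sees it. A minimal sketch of a wrapper that passes the pattern through unchanged; the function name list_matching is hypothetical:

```shell
# Hypothetical helper: list HDFS entries matching a glob pattern.
# Quoting "$1" keeps the local shell from expanding the pattern,
# so the glob is resolved by HDFS rather than against local files.
list_matching() {
  hdfs dfs -ls "$1"
}

# Usage (the single quotes matter):
#   list_matching '/hdfsDirectory/hdfsFile*'
```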

Read and write commands

hdfs dfs -text /hdfsDirectory/hdfsFile

Takes a source file and outputs the file in text format to stdout. The allowed formats are ZIP and TextRecordInputStream

hdfs dfs -cat /hdfsDirectory/hdfsFile

Displays the content of an HDFS file to stdout
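
For large files it is often enough to look at the first few lines. A small sketch that pipes -cat into the standard head utility; the function name preview_hdfs_file and the default of 10 lines are illustrative choices:

```shell
# Hypothetical helper: print the first N lines (default 10) of an HDFS file.
#   $1 = HDFS path, $2 = number of lines (optional)
preview_hdfs_file() {
  hdfs dfs -cat "$1" | head -n "${2:-10}"
}

# Usage:
#   preview_hdfs_file /hdfsDirectory/hdfsFile 20
```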

hdfs dfs -appendToFile localfsFile /tmp/hdfsFile

Appends the content from a local file to an HDFS file

Download and upload commands

hdfs dfs -put localfsFile /hdfsDirectory

Copies a file from the local file system to an HDFS directory

hdfs dfs -put -f localfsFile /hdfsDirectory

Copies a file from the local file system to an HDFS directory. Passing -f overwrites the destination if it already exists

hdfs dfs -put -l localfsFile /hdfsDirectory

Copies a file from the local file system to an HDFS directory. Passing -l allows a DataNode to persist the file to disk lazily and forces a replication factor of 1, which reduces durability

hdfs dfs -put -p localfsFile /hdfsDirectory

Copies a file from the local file system to an HDFS directory. Passing -p preserves access and modification times, ownership, and the mode

hdfs dfs -get hdfsFile /localfsDirectory

Copies a file from an HDFS directory to the local file system directory

hdfs dfs -get -p hdfsFile /localfsDirectory

Copies a file from an HDFS directory to the local file system directory. Passing -p preserves access and modification times, ownership, and the mode

hdfs dfs -put /localfsDirectory/*.* /hdfsDirectory

Copies all files matching the pattern from the local file system directory to an HDFS directory

hdfs dfs -copyFromLocal localfsFile /hdfsDirectory

Works similarly to the put command, except that the source is restricted to a local file reference

hdfs dfs -copyToLocal hdfsFile /localfsDirectory

Works similarly to the get command, except that the destination is restricted to a local file reference

hdfs dfs -moveFromLocal localfsFile /hdfsDirectory

Works similarly to the put command, except that the source is deleted after it’s copied
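
Without -f, -put fails when the destination file already exists. One possible way to make that case explicit in a script is to check first with the FsShell -test -e command (not listed in this cheatsheet), which exits with status 0 when the path exists. A sketch; the function name put_if_absent is hypothetical:

```shell
# Hypothetical helper: upload a local file only if no file with the
# same name already exists in the HDFS destination directory.
#   $1 = local file, $2 = HDFS directory
put_if_absent() {
  target="$2/$(basename "$1")"
  # -test -e returns 0 when the path exists
  if hdfs dfs -test -e "$target"; then
    echo "skip: $target already exists" >&2
    return 1
  fi
  hdfs dfs -put "$1" "$2"
}

# Usage:
#   put_if_absent localfsFile /hdfsDirectory
```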

File management commands

hdfs dfs -cp /hdfsDirectory1/hdfsFile1 /hdfsDirectory2

Copies a file from an HDFS directory to another HDFS directory

hdfs dfs -cp -p /hdfsDirectory1/hdfsFile1 /hdfsDirectory2

Copies a file from an HDFS directory to another HDFS directory. Passing -p preserves access and modification times, ownership, and the mode

hdfs dfs -cp -f /hdfsDirectory1/hdfsFile1 /hdfsDirectory2

Copies a file from an HDFS directory to another HDFS directory. Passing -f overwrites the destination if it already exists

hdfs dfs -mv /hdfsDirectory1/hdfsFile1 /hdfsDirectory2

Moves a file from an HDFS directory to another HDFS directory

hdfs dfs -rm /hdfsDirectory/hdfsFile

Deletes a file from an HDFS directory

hdfs dfs -rm -r /hdfsDirectory

hdfs dfs -rm -R /hdfsDirectory

hdfs dfs -rmr /hdfsDirectory

Deletes all subdirectories and files recursively from an HDFS directory. The three forms above are equivalent; -rmr is deprecated in favor of -rm -r

hdfs dfs -rm -skipTrash /hdfsDirectory/hdfsFile

The -skipTrash option bypasses the trash, if enabled, and deletes the specified file(s) in an HDFS directory immediately

hdfs dfs -rm -f hdfsFile

Deletes a file. With -f, no diagnostic message is displayed and the command does not report an error if the file does not exist
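
Recursive deletes are easy to get wrong in scripts, for example when a variable holding the path is empty. A defensive sketch that refuses an empty path or the root before calling -rm -r; the function name safe_rm_r is hypothetical:

```shell
# Hypothetical helper: recursive delete with a basic sanity check.
# Refuses to run when the path is empty or is the HDFS root.
safe_rm_r() {
  case "${1:-}" in
    ""|"/")
      echo "refusing to delete '${1:-}'" >&2
      return 1
      ;;
  esac
  hdfs dfs -rm -r "$1"
}

# Usage:
#   safe_rm_r /hdfsDirectory
```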

hdfs dfs -rmdir /hdfsDirectory

Deletes an empty HDFS directory

hdfs dfs -mkdir /hdfsDirectory

Creates a directory in the specified HDFS location

hdfs dfs -mkdir -p /hdfsDirectory1/hdfsDirectory2

Creates a directory in the specified HDFS location, including any missing parent directories. This command does not fail even if the directory already exists

hdfs dfs -touchz /hdfsDirectory/hdfsFile

Creates a file of zero length at the specified path with the current time as the timestamp of that path
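
The two commands are sometimes combined to prepare a directory and drop an empty marker file into it. A sketch, assuming the -p flag of -mkdir (creates missing parents and does not fail when the directory exists); the function name and the marker name _READY are made up for illustration:

```shell
# Hypothetical helper: create an HDFS directory (with parents) and
# place a zero-length marker file named _READY inside it.
init_dir_with_marker() {
  hdfs dfs -mkdir -p "$1" && hdfs dfs -touchz "$1/_READY"
}

# Usage:
#   init_dir_with_marker /hdfsDirectory1/hdfsDirectory2
```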

Ownership and validation commands

hdfs dfs -checksum /hdfsDirectory/hdfsFile

Dumps the checksum information for a file to stdout

hdfs dfs -chmod 755 /hdfsDirectory/hdfsFile

Changes the permissions of the file

hdfs dfs -chmod -R 755 /hdfsDirectory

Changes the permissions of the files recursively

hdfs dfs -chown owner:group /hdfsDirectory

Changes the owner and group of the file or directory

hdfs dfs -chown -R owner:group /hdfsDirectory

Changes the owner and group of the files recursively

hdfs dfs -chgrp group /hdfsDirectory

Changes the group association of the file

hdfs dfs -chgrp -R group /hdfsDirectory

Changes the group association of the files recursively
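
Ownership and permissions are often set together when handing a directory over to another user. A minimal sketch that chains the two recursive commands and stops if the first one fails; the function name set_owner_and_mode is hypothetical:

```shell
# Hypothetical helper: recursively set owner:group and mode on a directory.
#   $1 = HDFS directory, $2 = owner:group, $3 = mode (for example 750)
set_owner_and_mode() {
  hdfs dfs -chown -R "$2" "$1" && hdfs dfs -chmod -R "$3" "$1"
}

# Usage:
#   set_owner_and_mode /hdfsDirectory owner:group 750
```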

Filesystem commands

hdfs dfs -df /

Shows the capacity, free, and used space

hdfs dfs -df -h /

Shows the capacity, free, and used space of the filesystem. The -h option formats the sizes in a human-readable format

hdfs dfs -du /hdfsDirectory/hdfsFile

Shows the occupied space (in bytes) used by the files that match the specified file pattern

hdfs dfs -du -s /hdfsDirectory/hdfsFile

Rather than showing the size of each file that matches the pattern, shows the total (summary) size

hdfs dfs -du -h /hdfsDirectory/hdfsFile

Shows the occupied space used by the files that match the specified file pattern, with sizes formatted in a human-readable format instead of bytes
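
In scripts, the summary size is usually needed as a plain number. A sketch that extracts it with awk, assuming the size in bytes is the first column of the -du -s output; the function name dir_size_bytes is hypothetical:

```shell
# Hypothetical helper: print the total size in bytes of an HDFS path.
# Assumes the first column of `hdfs dfs -du -s` is the size in bytes.
dir_size_bytes() {
  hdfs dfs -du -s "$1" | awk '{ print $1 }'
}

# Usage:
#   dir_size_bytes /hdfsDirectory
```

If the argument is a pattern that matches several paths, one number is printed per match.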

Administration commands

hdfs balancer -threshold 30

Launches the cluster balancing utility with a threshold of 30 percent, overriding the default threshold of 10 percent

hadoop version

Prints the current Hadoop version

hdfs fsck /

Checks the health of HDFS

hdfs dfsadmin -safemode leave

Turns off the safe mode for a NameNode
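
Before forcing the NameNode out of safe mode, a script can first query the current state with -safemode get. A sketch, assuming the status line contains the text "Safe mode is ON" when safe mode is active; the function name leave_safemode_if_on is hypothetical:

```shell
# Hypothetical helper: leave safe mode only when it is currently on.
# Assumes `hdfs dfsadmin -safemode get` prints a line containing
# "Safe mode is ON" while safe mode is active.
leave_safemode_if_on() {
  if hdfs dfsadmin -safemode get | grep -q 'Safe mode is ON'; then
    hdfs dfsadmin -safemode leave
  else
    echo "safe mode is already off"
  fi
}
```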

hdfs dfsadmin -refreshNodes

Re-reads the hosts and exclude files to update the set of DataNodes that are allowed to connect to the NameNode and those that should be decommissioned or recommissioned

hdfs namenode -format

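Formats the NameNode. This erases the existing file system metadata, so it is normally run only when setting up a new cluster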
