Hadoop command-line

This article explains the difference between hadoop fs {args}, hadoop dfs {args}, and hdfs dfs {args} commands.

The fs parameter refers to a generic file system interface that can point to any supported file system, such as the local file system, HDFS, etc.
So, fs can be used while working with various file systems, such as Local FS, (S)FTP, S3, and others.

[Figure: Hadoop FS layers]

When you use the dfs parameter, you directly specify interaction with HDFS.
So, if you want to work only with HDFS, use the hdfs dfs command. The older hadoop dfs form is deprecated in favor of hdfs dfs.
But if you need to access or transfer data between different file systems, use the hadoop fs command.
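To make the distinction concrete, here is a short sketch of the two commands side by side. The NameNode host, port, and paths below are placeholders, not values from this article:

```shell
# List a directory on the default file system (HDFS, if fs.defaultFS points there)
$ hadoop fs -ls /user/hdfs

# The same path addressed explicitly through an HDFS URI
# (namenode:8020 is a placeholder for your cluster's NameNode address)
$ hadoop fs -ls hdfs://namenode:8020/user/hdfs

# The same generic interface can reach the local file system
$ hadoop fs -ls file:///tmp

# HDFS-only equivalent of the first command
$ hdfs dfs -ls /user/hdfs
```

Note that for paths on the default file system, hadoop fs and hdfs dfs behave identically; the difference appears only when you address other file systems via URI schemes.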

To interact with the HDFS shell, a user must have sufficient read/write permissions on the HDFS directories involved. These directories can be created by a user belonging to either the hadoop or hdfs group. For demonstration purposes, we use the hdfs user in the example below.

First, open a shell with root privileges:

$ sudo -s

Then switch to the hdfs user:

$ su - hdfs

Now you can execute commands in the HDFS shell.
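As a sketch of a typical first session, the commands below create a home directory for a new user and upload a file. The user name alice and the file paths are hypothetical, chosen only for illustration:

```shell
# Create a home directory for the user and hand over ownership
$ hdfs dfs -mkdir -p /user/alice
$ hdfs dfs -chown alice:alice /user/alice

# Upload a local file to HDFS and verify that it arrived
$ hdfs dfs -put /tmp/data.csv /user/alice/
$ hdfs dfs -ls /user/alice
```

Once ownership is transferred, alice can run these hdfs dfs commands under her own account without switching to the hdfs user.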
