Hadoop command-line
This article explains the difference between the hadoop fs {args}, hadoop dfs {args}, and hdfs dfs {args} commands.
The fs parameter refers to a generic file system shell that can point to any supported file system. So, fs can be used while working with various file systems, such as the local FS, HDFS, (S)FTP, S3, and others.
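For example, the same hadoop fs subcommand can address different file systems through the URI scheme. The host, bucket, and path names below are placeholders:

```
# List a directory on the local file system
$ hadoop fs -ls file:///tmp

# List a directory on HDFS (explicit URI; normally the default fs.defaultFS is used)
$ hadoop fs -ls hdfs://namenode:8020/user

# List a directory in an S3 bucket (requires the S3A connector on the classpath)
$ hadoop fs -ls s3a://example-bucket/data
```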
The dfs parameter, by contrast, specifies interaction with HDFS directly. (Note that hadoop dfs is deprecated: it still works, but prints a deprecation warning and simply forwards to hdfs dfs.) So, if you want to work only with HDFS, use the hdfs dfs command. But if you need to access or transfer data between different file systems, use the hadoop fs command.
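To illustrate the distinction, hdfs dfs operates on HDFS paths only, while hadoop fs can also copy data between file systems via explicit URIs. The file and directory paths below are examples only:

```
# HDFS-only operation
$ hdfs dfs -ls /user/hdfs

# Copy a local file into HDFS (two equivalent forms)
$ hadoop fs -put /tmp/data.csv /user/hdfs/data.csv
$ hadoop fs -copyFromLocal /tmp/data.csv /user/hdfs/data.csv

# Copy between file systems using explicit URI schemes
$ hadoop fs -cp file:///tmp/data.csv hdfs:///user/hdfs/data.csv
```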
To interact with the HDFS shell, a user must have sufficient read/write permissions on the HDFS directories involved. These directories can be created by a user that belongs to either the hadoop or the hdfs group. For demonstration purposes, we use the hdfs user in the example below.
First, gain root privileges:
$ sudo -s
Then switch to the hdfs user:
$ su - hdfs
Now you can execute commands in the HDFS shell.
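A minimal session might look like this; the directory name and the user alice are examples only:

```
# Create a home directory for a new user and hand over ownership
$ hdfs dfs -mkdir -p /user/alice
$ hdfs dfs -chown alice:alice /user/alice

# Upload a file and verify it is there
$ hdfs dfs -put /etc/hosts /user/alice/hosts
$ hdfs dfs -ls /user/alice
```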