Connect to MapReduce via CLI
Overview
MapReduce provides a command-line interface implemented as the bin/mapred script. To interact with MapReduce via CLI, connect via SSH to a cluster host where a MapReduce component is installed and run the desired MapReduce CLI command.
For example, check the version by running:
$ mapred version
Output example:
Hadoop 3.2.4
Source code repository git@ssh.gitlab.adsw.io:arenadata/infrastructure/code/ci/prj_adh.git -r 3cb85f40e394dcfb50fe77310908cce385381ba2
Compiled by jenkins on 2024-03-20T17:53Z
Compiled with protoc 2.5.0
From source with checksum ee031c16fe785bbb35252c749418712
This command was run using /usr/lib/hadoop/hadoop-common-3.2.4.jar
Running mapred without any arguments prints the list of all MapReduce commands.
The commands have the following syntax:
$ mapred [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]
Where:
- SHELL_OPTIONS: Hadoop shell options listed in the Shell Options section of the Hadoop Commands Reference.
- GENERIC_OPTIONS: standard options supported by multiple commands. For more information, see the Generic Options section of the Hadoop Commands Reference.
- COMMAND_OPTIONS: command-specific options described in the reference for each command.
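To make the order of the parts concrete, here is a minimal sketch that places one option of each kind around the job command. The MAPRED_BIN variable and the function name are hypothetical helpers introduced for this illustration, and the property passed to -D is whatever you supply as the first argument; --loglevel is one of the Hadoop shell options and -D is one of the generic options.

```shell
#!/bin/sh
# MAPRED_BIN is a hypothetical override used only in this sketch: set it to
# "echo mapred" to print the assembled command line instead of running it.
MAPRED_BIN="${MAPRED_BIN:-mapred}"

# Lists all jobs, showing where each part of the syntax goes:
#   --loglevel WARN   SHELL_OPTIONS   (before the command name)
#   job               COMMAND
#   -D "$1"           GENERIC_OPTIONS (right after the command name)
#   -list all         COMMAND_OPTIONS
# $1 is a property=value pair supplied by the caller.
list_all_jobs_with_property() {
  $MAPRED_BIN --loglevel WARN job -D "$1" -list all
}
```

With MAPRED_BIN set to "echo mapred", calling list_all_jobs_with_property some.property=value only prints the command line, which is a safe way to check the option order before running it on a cluster.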
Usage examples
Typically, a MapReduce job is launched using the yarn jar or hadoop jar command. The MapReduce CLI is useful for other tasks, for example, getting information about specific jobs and controlling their execution.
To see the list of all jobs, run:
$ mapred job -list all
Output example:
Total jobs:13
JobId JobName State StartTime UserName Queue Priority UsedContainers RsvdContainers UsedMem RsvdMem NeededMem AM info
job_1713528166269_0001 QuasiMonteCarlo SUCCEEDED 1713529720429 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713528166269_0001/
job_1713770455913_0001 QuasiMonteCarlo SUCCEEDED 1713770623388 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713770455913_0001/
job_1713770455913_0002 QuasiMonteCarlo SUCCEEDED 1713771341014 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713770455913_0002/
job_1713964485222_0001 QuasiMonteCarlo PREP 1713965480860 yarn default DEFAULT 1 0 2048M 0M 2048M http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713964485222_0001/
job_1701175565199_0001 QuasiMonteCarlo SUCCEEDED 1701175708248 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1701175565199_0001/
job_1705311887839_0001 QuasiMonteCarlo SUCCEEDED 1705319626157 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1705311887839_0001/
job_1705311887839_0002 QuasiMonteCarlo SUCCEEDED 1705319754517 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1705311887839_0002/
job_1705573015955_0002 QuasiMonteCarlo SUCCEEDED 1705658611315 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1705573015955_0002/
job_1704872668642_0003 QuasiMonteCarlo SUCCEEDED 1704884498654 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0003/
job_1712415098046_0001 QuasiMonteCarlo SUCCEEDED 1712416353017 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1712415098046_0001/
job_1704872668642_0004 QuasiMonteCarlo SUCCEEDED 1704884603925 yarn default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0004/
job_1704872668642_0001 distcp SUCCEEDED 1704884156346 hdfs default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0001/
job_1704872668642_0002 distcp SUCCEEDED 1704884247607 hdfs default DEFAULT N/A N/A N/A N/A N/A http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0002/
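On a busy cluster the full list can be long, so it may help to filter it by state. The following is a minimal sketch, not part of the MapReduce CLI: the jobs_in_state function is a hypothetical helper that reads the listing on standard input and assumes that job names contain no whitespace, so that the state is the third field.

```shell
#!/bin/sh
# Prints the IDs of jobs in the given state ($1) from `mapred job -list all`
# output read on stdin. Only lines whose first field looks like a job ID are
# checked, so the "Total jobs" line and the column header are skipped.
# Assumption: job names contain no whitespace, so the state is field 3.
jobs_in_state() {
  awk -v state="$1" '$1 ~ /^job_[0-9]+_[0-9]+$/ && $3 == state { print $1 }'
}
```

For example, mapred job -list all | jobs_in_state PREP would print only the IDs of jobs that are still being prepared.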
To see the status of a specific job, run:
$ mapred job -status <job-ID>
Where <job-ID> is the ID of the job whose status you want to check.
Output example:
Job: job_1713964485222_0001
Job File: hdfs://adh/user/yarn/.staging/job_1713964485222_0001/job.xml
Job Tracking URL : http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713964485222_0001/
Uber job : false
Number of maps: 16
Number of reduces: 1
map() completion: 0.625
reduce() completion: 0.0
Job state: RUNNING
retired: false
reason for failure:
Counters: 33
File System Counters
FILE: Number of bytes read=0
FILE: Number of bytes written=2306730
FILE: Number of read operations=0
FILE: Number of large read operations=0
FILE: Number of write operations=0
HDFS: Number of bytes read=2536
HDFS: Number of bytes written=0
HDFS: Number of read operations=40
HDFS: Number of large read operations=0
HDFS: Number of write operations=0
Job Counters
Launched map tasks=11
Data-local map tasks=11
Total time spent by all maps in occupied slots (ms)=19885
Total time spent by all map tasks (ms)=19885
Total vcore-milliseconds taken by all map tasks=19885
Total megabyte-milliseconds taken by all map tasks=20362240
Map-Reduce Framework
Map input records=10
Map output records=20
Map output bytes=180
Map output materialized bytes=280
Input split bytes=1356
Combine input records=0
Spilled Records=20
Failed Shuffles=0
Merged Map outputs=0
GC time elapsed (ms)=406
CPU time spent (ms)=3600
Physical memory (bytes) snapshot=3162824704
Virtual memory (bytes) snapshot=28058554368
Total committed heap usage (bytes)=2188902400
Peak Map Physical memory (bytes)=346021888
Peak Map Virtual memory (bytes)=2810654720
File Input Format Counters
Bytes Read=1180
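When only the progress matters, the status output can be reduced to a single line. The sketch below is an illustration rather than a built-in feature: job_progress is a hypothetical helper that reads the status output on standard input and assumes the "Job state:", "map() completion:", and "reduce() completion:" labels appear as in the sample above.

```shell
#!/bin/sh
# Reads `mapred job -status <job-ID>` output on stdin and prints one summary
# line: the job state plus map and reduce progress as percentages.
# Assumption: the labels match the sample output shown above.
job_progress() {
  awk -F': *' '
    /^[ \t]*Job state:/             { state = $2 }
    /^[ \t]*map\(\) completion:/    { map = $2 * 100 }
    /^[ \t]*reduce\(\) completion:/ { red = $2 * 100 }
    END { printf "%s map=%.1f%% reduce=%.1f%%\n", state, map, red }
  '
}
```

For example, mapred job -status <job-ID> | job_progress condenses the report to the state and the two completion values; the completion fractions are multiplied by 100, so 0.625 is shown as 62.5%.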