Connect to MapReduce via CLI
Overview
MapReduce provides a command line interface implemented as a bin/mapred script. To interact with MapReduce via CLI, connect to a cluster host with a MapReduce component via SSH and run the desired MapReduce CLI command.
For example, check the version by running:
$ mapred version
Output example:
Hadoop 3.2.4
Source code repository git@ssh.gitlab.adsw.io:arenadata/infrastructure/code/ci/prj_adh.git -r 3cb85f40e394dcfb50fe77310908cce385381ba2
Compiled by jenkins on 2024-03-20T17:53Z
Compiled with protoc 2.5.0
From source with checksum ee031c16fe785bbb35252c749418712
This command was run using /usr/lib/hadoop/hadoop-common-3.2.4.jar
Running mapred without any arguments prints the list of all MapReduce commands.
The commands have the following syntax:
$ mapred [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]
Where:
- SHELL_OPTIONS: Hadoop shell options listed in the Shell Options section of Hadoop Commands Reference.
- GENERIC_OPTIONS: standard options supported by multiple commands. For more information, see the Generic Options section of Hadoop Commands Reference.
- COMMAND_OPTIONS: command-specific options described in the reference for each command.
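As a rough illustration of where each option group goes, the sketch below assembles a command line in the order given by the syntax. build_mapred_cmd is a hypothetical helper, not part of Hadoop, and the option values in the usage note are arbitrary examples. The helper only prints the command and does not run it.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): prints a mapred command line
# assembled in the documented order:
#   mapred [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]
# Arguments: shell options, command, generic options, command options.
# Pass an empty string to skip an optional group.
build_mapred_cmd() {
  local shell_opts=$1 command=$2 generic_opts=$3 command_opts=$4
  local group line="mapred"
  for group in "$shell_opts" "$command" "$generic_opts" "$command_opts"; do
    # Append only non-empty groups so that no double spaces appear.
    [ -n "$group" ] && line="$line $group"
  done
  printf '%s\n' "$line"
}
```

For example, with the shell option --loglevel, the generic option -D, and the command option -list all:

$ build_mapred_cmd "--loglevel WARN" "job" "-D mapreduce.framework.name=yarn" "-list all"
mapred --loglevel WARN job -D mapreduce.framework.name=yarn -list all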
Usage examples
Typically, a MapReduce job is launched using the yarn jar or hadoop jar command. The MapReduce CLI is useful for other tasks, such as getting information about specific jobs and controlling their execution.
To see the list of all jobs, run:
$ mapred job -list all
Output example:
Total jobs:13
JobId                   JobName          State      StartTime      UserName  Queue    Priority  UsedContainers  RsvdContainers  UsedMem  RsvdMem  NeededMem  AM info
job_1713528166269_0001  QuasiMonteCarlo  SUCCEEDED  1713529720429  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713528166269_0001/
job_1713770455913_0001  QuasiMonteCarlo  SUCCEEDED  1713770623388  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713770455913_0001/
job_1713770455913_0002  QuasiMonteCarlo  SUCCEEDED  1713771341014  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713770455913_0002/
job_1713964485222_0001  QuasiMonteCarlo  PREP       1713965480860  yarn      default  DEFAULT   1               0               2048M    0M       2048M      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713964485222_0001/
job_1701175565199_0001  QuasiMonteCarlo  SUCCEEDED  1701175708248  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1701175565199_0001/
job_1705311887839_0001  QuasiMonteCarlo  SUCCEEDED  1705319626157  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1705311887839_0001/
job_1705311887839_0002  QuasiMonteCarlo  SUCCEEDED  1705319754517  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1705311887839_0002/
job_1705573015955_0002  QuasiMonteCarlo  SUCCEEDED  1705658611315  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1705573015955_0002/
job_1704872668642_0003  QuasiMonteCarlo  SUCCEEDED  1704884498654  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0003/
job_1712415098046_0001  QuasiMonteCarlo  SUCCEEDED  1712416353017  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1712415098046_0001/
job_1704872668642_0004  QuasiMonteCarlo  SUCCEEDED  1704884603925  yarn      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0004/
job_1704872668642_0001  distcp           SUCCEEDED  1704884156346  hdfs      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0001/
job_1704872668642_0002  distcp           SUCCEEDED  1704884247607  hdfs      default  DEFAULT   N/A             N/A             N/A      N/A      N/A        http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0002/
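On a cluster with a long job history, the full list can be hard to scan. The following is a minimal sketch of filtering it by job state with awk. filter_jobs_by_state is a hypothetical helper, not part of Hadoop. It assumes that the columns are separated by whitespace and that job names contain no spaces, as in the output example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): reads the output of
# `mapred job -list all` on stdin and prints the IDs of the jobs
# whose State column equals the given value.
# Assumption: columns are whitespace-separated and job names have no spaces,
# so the state is always the third field.
filter_jobs_by_state() {
  local state=$1
  # Rows that do not start with "job_" (the total count and the header) are skipped.
  awk -v state="$state" '$1 ~ /^job_/ && $3 == state { print $1 }'
}
```

A possible usage, which for the output example would print job_1713964485222_0001:

$ mapred job -list all | filter_jobs_by_state PREP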
To see the status of a specific job, run:
$ mapred job -status <job-ID>
Where <job-ID> is the ID of the job whose status you want to check.
Output example:
Job: job_1713964485222_0001
Job File: hdfs://adh/user/yarn/.staging/job_1713964485222_0001/job.xml
Job Tracking URL : http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713964485222_0001/
Uber job : false
Number of maps: 16
Number of reduces: 1
map() completion: 0.625
reduce() completion: 0.0
Job state: RUNNING
retired: false
reason for failure:
Counters: 33
    File System Counters
        FILE: Number of bytes read=0
        FILE: Number of bytes written=2306730
        FILE: Number of read operations=0
        FILE: Number of large read operations=0
        FILE: Number of write operations=0
        HDFS: Number of bytes read=2536
        HDFS: Number of bytes written=0
        HDFS: Number of read operations=40
        HDFS: Number of large read operations=0
        HDFS: Number of write operations=0
    Job Counters
        Launched map tasks=11
        Data-local map tasks=11
        Total time spent by all maps in occupied slots (ms)=19885
        Total time spent by all map tasks (ms)=19885
        Total vcore-milliseconds taken by all map tasks=19885
        Total megabyte-milliseconds taken by all map tasks=20362240
    Map-Reduce Framework
        Map input records=10
        Map output records=20
        Map output bytes=180
        Map output materialized bytes=280
        Input split bytes=1356
        Combine input records=0
        Spilled Records=20
        Failed Shuffles=0
        Merged Map outputs=0
        GC time elapsed (ms)=406
        CPU time spent (ms)=3600
        Physical memory (bytes) snapshot=3162824704
        Virtual memory (bytes) snapshot=28058554368
        Total committed heap usage (bytes)=2188902400
        Peak Map Physical memory (bytes)=346021888
        Peak Map Virtual memory (bytes)=2810654720
    File Input Format Counters
        Bytes Read=1180
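When only the progress matters, the status report can be reduced to a single line. The sketch below is one possible way to do that with awk. job_progress is a hypothetical helper, not part of Hadoop. It assumes that the "Job state:", "map() completion:", and "reduce() completion:" lines are labeled exactly as in the output example.

```shell
#!/usr/bin/env bash
# Hypothetical helper (not part of Hadoop): reads the output of
# `mapred job -status <job-ID>` on stdin and prints a one-line summary.
# Assumption: the field labels match the output example above.
job_progress() {
  awk -F': *' '
    /Job state:/             { state = $2 }
    /map\(\) completion:/    { map = $2 * 100 }     # fraction to percent
    /reduce\(\) completion:/ { reduce = $2 * 100 }  # fraction to percent
    END {
      sub(/[ \t\r]+$/, "", state)                   # drop trailing whitespace
      printf "state=%s map=%.1f%% reduce=%.1f%%\n", state, map, reduce
    }'
}
```

A possible usage, which for the output example would print state=RUNNING map=62.5% reduce=0.0%:

$ mapred job -status <job-ID> | job_progress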