
Connect to MapReduce via CLI

Overview

MapReduce provides a command-line interface (CLI) implemented as the bin/mapred script. To use it, connect via SSH to a cluster host that has a MapReduce component installed and run the desired mapred command.

For example, check the version by running:

$ mapred version

Output example:

Hadoop 3.2.4
Source code repository git@ssh.gitlab.adsw.io:arenadata/infrastructure/code/ci/prj_adh.git -r 3cb85f40e394dcfb50fe77310908cce385381ba2
Compiled by jenkins on 2024-03-20T17:53Z
Compiled with protoc 2.5.0
From source with checksum ee031c16fe785bbb35252c749418712
This command was run using /usr/lib/hadoop/hadoop-common-3.2.4.jar
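
In scripts it can be handy to extract only the version number from this output. The following is a minimal sketch, assuming the version always appears on a line of the form `Hadoop <version>`, as in the example above. The name `hadoop_version` is a hypothetical helper, not part of the MapReduce CLI.

```shell
# Hypothetical helper: print the version number found in `mapred version`
# output supplied on stdin. Assumes a line of the form "Hadoop <version>".
hadoop_version() {
  awk '$1 == "Hadoop" { print $2; exit }'
}

# Usage on a cluster host (requires the mapred script on PATH):
#   mapred version | hadoop_version
```

The helper reads from stdin rather than calling mapred itself, so it can also be applied to saved output.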

Running mapred without any arguments prints the list of all MapReduce commands.

The commands have the following syntax:

$ mapred [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]

Where:

  • SHELL_OPTIONS — Hadoop shell options listed in the Shell Options section of Hadoop Commands Reference.

  • GENERIC_OPTIONS — standard options supported by multiple commands. For more information, see the Generic Options section of Hadoop Commands Reference.

  • COMMAND_OPTIONS — command-specific options described in references for each command.
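
To illustrate how the option groups fit into this syntax, the sketch below assembles a command line in the documented order and prints it instead of running it. The chosen values are only examples: `--loglevel` is a Hadoop shell option, `-conf` is a generic option, and the configuration file path is an assumption that may differ on your cluster.

```shell
# Print (not run) a mapred command line assembled in the documented order:
#   mapred [SHELL_OPTIONS] COMMAND [GENERIC_OPTIONS] [COMMAND_OPTIONS]
# All values are illustrative; the -conf path is hypothetical.
SHELL_OPTIONS="--loglevel WARN"
COMMAND="job"
GENERIC_OPTIONS="-conf /etc/hadoop/conf/mapred-site.xml"
COMMAND_OPTIONS="-list all"

# Unquoted expansions are intentional so that each group splits into words.
mapred_cmdline() {
  echo mapred $SHELL_OPTIONS $COMMAND $GENERIC_OPTIONS $COMMAND_OPTIONS
}

mapred_cmdline
```

Any of the three option groups can be left empty, in which case it simply drops out of the printed command.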

Usage examples

Typically, a MapReduce job is launched with the yarn jar or hadoop jar command. The MapReduce CLI is useful afterwards, for example, to get information about specific jobs and to control their execution.
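
As one example of controlling execution, a job can be stopped with `mapred job -kill <job-ID>`. The sketch below wraps that command in a hypothetical `kill_job` function that first checks the argument against the usual job ID form, `job_<digits>_<digits>`. The function names and the ID pattern are assumptions made for illustration.

```shell
# Hypothetical helper: succeed only if the argument looks like a job ID.
# The pattern job_<digits>_<digits> is an assumption based on typical IDs.
is_job_id() {
  printf '%s\n' "$1" | grep -Eq '^job_[0-9]+_[0-9]+$'
}

# Hypothetical wrapper: validate the ID, then ask MapReduce to kill the job.
kill_job() {
  if ! is_job_id "$1"; then
    echo "Not a valid job ID: $1" >&2
    return 2
  fi
  mapred job -kill "$1"
}

# Usage on a cluster host:
#   kill_job job_1713964485222_0001
```

Checking the ID first avoids sending a malformed request to the cluster, for instance when an application ID is passed by mistake.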

To see the list of all jobs, run:

$ mapred job -list all

Output example:

Total jobs:13
                  JobId              JobName         State           StartTime      UserName           Queue      Priority       UsedContainers  RsvdContainers      UsedMem         RsvdMem         NeededMem         AM info
 job_1713528166269_0001      QuasiMonteCarlo     SUCCEEDED       1713529720429          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713528166269_0001/
 job_1713770455913_0001      QuasiMonteCarlo     SUCCEEDED       1713770623388          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713770455913_0001/
 job_1713770455913_0002      QuasiMonteCarlo     SUCCEEDED       1713771341014          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713770455913_0002/
 job_1713964485222_0001      QuasiMonteCarlo          PREP       1713965480860          yarn         default       DEFAULT                    1               0        2048M              0M             2048M      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713964485222_0001/
 job_1701175565199_0001      QuasiMonteCarlo     SUCCEEDED       1701175708248          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1701175565199_0001/
 job_1705311887839_0001      QuasiMonteCarlo     SUCCEEDED       1705319626157          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1705311887839_0001/
 job_1705311887839_0002      QuasiMonteCarlo     SUCCEEDED       1705319754517          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1705311887839_0002/
 job_1705573015955_0002      QuasiMonteCarlo     SUCCEEDED       1705658611315          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1705573015955_0002/
 job_1704872668642_0003      QuasiMonteCarlo     SUCCEEDED       1704884498654          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0003/
 job_1712415098046_0001      QuasiMonteCarlo     SUCCEEDED       1712416353017          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1712415098046_0001/
 job_1704872668642_0004      QuasiMonteCarlo     SUCCEEDED       1704884603925          yarn         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0004/
 job_1704872668642_0001               distcp     SUCCEEDED       1704884156346          hdfs         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0001/
 job_1704872668642_0002               distcp     SUCCEEDED       1704884247607          hdfs         default       DEFAULT                  N/A             N/A          N/A             N/A               N/A      http://elenas-adh2.ru-central1.internal:8088/proxy/application_1704872668642_0002/
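
On a busy cluster this list can be long, so it may help to filter it by state. The following is a minimal sketch that prints the IDs of jobs in a given state. It assumes that job names contain no spaces, so that State is always the third whitespace-separated field. `jobs_in_state` is a hypothetical helper name.

```shell
# Hypothetical helper: print the IDs of jobs that are in the given state.
# Reads `mapred job -list all` output on stdin.
# Assumes job names contain no spaces, so State is the third field.
jobs_in_state() {
  awk -v state="$1" '$1 ~ /^job_/ && $3 == state { print $1 }'
}

# Usage on a cluster host:
#   mapred job -list all | jobs_in_state SUCCEEDED
```

The summary and header lines are skipped because their first field does not start with `job_`.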

To see the status of a specific job, run:

$ mapred job -status <job-ID>

Here, <job-ID> is the ID of the job whose status you want to check.

Output example:

Job: job_1713964485222_0001
Job File: hdfs://adh/user/yarn/.staging/job_1713964485222_0001/job.xml
Job Tracking URL : http://elenas-adh2.ru-central1.internal:8088/proxy/application_1713964485222_0001/
Uber job : false
Number of maps: 16
Number of reduces: 1
map() completion: 0.625
reduce() completion: 0.0
Job state: RUNNING
retired: false
reason for failure:
Counters: 33
        File System Counters
                FILE: Number of bytes read=0
                FILE: Number of bytes written=2306730
                FILE: Number of read operations=0
                FILE: Number of large read operations=0
                FILE: Number of write operations=0
                HDFS: Number of bytes read=2536
                HDFS: Number of bytes written=0
                HDFS: Number of read operations=40
                HDFS: Number of large read operations=0
                HDFS: Number of write operations=0
        Job Counters
                Launched map tasks=11
                Data-local map tasks=11
                Total time spent by all maps in occupied slots (ms)=19885
                Total time spent by all map tasks (ms)=19885
                Total vcore-milliseconds taken by all map tasks=19885
                Total megabyte-milliseconds taken by all map tasks=20362240
        Map-Reduce Framework
                Map input records=10
                Map output records=20
                Map output bytes=180
                Map output materialized bytes=280
                Input split bytes=1356
                Combine input records=0
                Spilled Records=20
                Failed Shuffles=0
                Merged Map outputs=0
                GC time elapsed (ms)=406
                CPU time spent (ms)=3600
                Physical memory (bytes) snapshot=3162824704
                Virtual memory (bytes) snapshot=28058554368
                Total committed heap usage (bytes)=2188902400
                Peak Map Physical memory (bytes)=346021888
                Peak Map Virtual memory (bytes)=2810654720
        File Input Format Counters
                Bytes Read=1180
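
For monitoring scripts, individual fields can be pulled out of this status report. The sketch below assumes the `Job state:` and `map() completion:` lines keep the format shown in the example above. Both function names are hypothetical helpers.

```shell
# Hypothetical helpers: read `mapred job -status <job-ID>` output on stdin.

# Print the job state, for example RUNNING or SUCCEEDED.
job_state() {
  awk -F': ' '$1 == "Job state" { print $2; exit }'
}

# Print map() completion as a percentage with one decimal place.
map_percent() {
  awk -F': ' '$1 == "map() completion" { printf "%.1f\n", $2 * 100; exit }'
}

# Usage on a cluster host:
#   mapred job -status job_1713964485222_0001 | job_state
```

Each helper reads the report once from stdin, so to extract several fields from the same report, save the output to a file and pass it to each helper in turn.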