Use MapReduce REST APIs

ADH MapReduce components accept REST API requests sent to the Application Master and the History Server.

Using the MapReduce REST API, you can get information about running and finished applications, jobs, and tasks. The same information is available in the YARN UI.

Certain requests might require permissions. You can configure user permissions for your cluster in the ACL settings.

Application Master REST API

The Application Master accepts requests sent via a proxy. You can use Resource Manager as a proxy or configure a separate host.

The URL for requests can have one of the following formats (the output will be the same):

http://<proxy-address>:<port>/proxy/<app-ID>/ws/v1/mapreduce

http://<proxy-address>:<port>/proxy/<app-ID>/ws/v1/mapreduce/info

Where:

  • <proxy-address> — the IP address of the proxy server;

  • <port> — the port of the proxy server;

  • <app-ID> — the ID of the application.
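The URL format above can also be assembled in a script. The following is a minimal Python sketch; the helper name `build_am_url` and its parameters are illustrative and not part of ADH or Hadoop.

```python
def build_am_url(proxy_address: str, port: int, app_id: str,
                 info_suffix: bool = True) -> str:
    """Return the Application Master REST URL reached through the proxy.

    build_am_url is a hypothetical helper, not a Hadoop or ADH function.
    """
    base = f"http://{proxy_address}:{port}/proxy/{app_id}/ws/v1/mapreduce"
    # Both URL forms return the same output, so the /info suffix is optional.
    return f"{base}/info" if info_suffix else base


print(build_am_url("127.0.0.1", 8088, "application_1713528166269_0001"))
```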

An example request to the Application Master:

$ curl -L http://127.0.0.1:8088/proxy/application_1713528166269_0001/ws/v1/mapreduce/info -H "Accept: application/xml"

This command requests information about the running application with the ID application_1713528166269_0001 in XML format. To get a JSON response instead, use the Accept: application/json header.

The Application Master responds with the following:

<?xml version="1.0" encoding="UTF-8" standalone="yes"?><info><appId>application_1713528166269_0001</appId><name>QuasiMonteCarlo</name><user>yarn</user><startedOn>1713529723410</startedOn><elapsedTime>38797</elapsedTime></info>
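A response like this can be parsed with a standard XML library. The sketch below uses Python's `xml.etree.ElementTree` on the body shown above. The helper name `parse_am_info` is hypothetical, and the code assumes that `startedOn` is a Unix timestamp in milliseconds and that `elapsedTime` is a duration in milliseconds.

```python
import xml.etree.ElementTree as ET
from datetime import datetime, timezone

# Response body copied from the example above.
RESPONSE = (
    '<?xml version="1.0" encoding="UTF-8" standalone="yes"?>'
    "<info><appId>application_1713528166269_0001</appId>"
    "<name>QuasiMonteCarlo</name><user>yarn</user>"
    "<startedOn>1713529723410</startedOn>"
    "<elapsedTime>38797</elapsedTime></info>"
)


def parse_am_info(xml_text: str) -> dict:
    """Extract the fields of an <info> response (hypothetical helper)."""
    root = ET.fromstring(xml_text.encode("utf-8"))
    # Assumption: both time fields are expressed in milliseconds.
    started_ms = int(root.findtext("startedOn"))
    elapsed_ms = int(root.findtext("elapsedTime"))
    return {
        "app_id": root.findtext("appId"),
        "name": root.findtext("name"),
        "user": root.findtext("user"),
        "started_on": datetime.fromtimestamp(started_ms / 1000, tz=timezone.utc),
        "elapsed_seconds": elapsed_ms / 1000,
    }


info = parse_am_info(RESPONSE)
print(info["app_id"], info["started_on"].isoformat(), info["elapsed_seconds"])
```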

The full API specification is available at MapReduce Application Master REST API’s.

History Server REST API

The History Server REST API provides information about the finished applications.

Here’s an example of the command that returns the jobs list:

$ curl -X GET http://<history-server-address>:19888/ws/v1/history/mapreduce/jobs -H "Accept: application/json"

Where <history-server-address> is the IP address or FQDN of the History Server host.

This command requests information about the finished jobs in JSON format. To get an XML response instead, use the Accept: application/xml header.
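The same request can be issued from a script. The following Python sketch builds it with the standard `urllib.request` module. The helper names are hypothetical, and the default port 19888 is an assumption based on the stock Hadoop value of `mapreduce.jobhistory.webapp.address`; pass the port your cluster actually uses.

```python
import json
import urllib.request


def build_jobs_request(host: str, port: int = 19888,
                       accept: str = "application/json") -> urllib.request.Request:
    """Prepare a GET request for the jobs list (hypothetical helper)."""
    url = f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"
    # The Accept header selects the response format (JSON or XML).
    return urllib.request.Request(url, headers={"Accept": accept}, method="GET")


def fetch_jobs(host: str, port: int = 19888) -> dict:
    """Send the request and decode the JSON body.

    Requires a reachable History Server; not called in this sketch.
    """
    with urllib.request.urlopen(build_jobs_request(host, port), timeout=10) as resp:
        return json.load(resp)


request = build_jobs_request("history.example.com")
print(request.get_method(), request.full_url)
```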

You can use additional query parameters to filter the output:

  • user — name of the user that started the application;

  • state — job state;

  • queue — name of the queue;

  • limit — number of application objects to return;

  • startedTimeBegin — jobs that started after the specified time in milliseconds;

  • startedTimeEnd — jobs that started before the specified time in milliseconds;

  • finishedTimeBegin — jobs that finished after the specified time in milliseconds;

  • finishedTimeEnd — jobs that finished before the specified time in milliseconds.
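The filters above are passed as URL query parameters. The sketch below shows one way to build such a URL in Python and to convert a date to milliseconds for the time filters. The names `build_jobs_url` and `to_epoch_ms` are hypothetical helpers, and the code assumes that the time filters expect milliseconds since the Unix epoch.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Filter names taken from the list above.
ALLOWED_FILTERS = {
    "user", "state", "queue", "limit",
    "startedTimeBegin", "startedTimeEnd",
    "finishedTimeBegin", "finishedTimeEnd",
}
EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def to_epoch_ms(moment: datetime) -> int:
    """Convert a timezone-aware datetime to milliseconds since the Unix epoch."""
    return (moment - EPOCH) // timedelta(milliseconds=1)


def build_jobs_url(host: str, port: int, **filters) -> str:
    """Build the jobs-list URL with optional filters (hypothetical helper)."""
    unknown = set(filters) - ALLOWED_FILTERS
    if unknown:
        raise ValueError(f"unsupported filter(s): {sorted(unknown)}")
    base = f"http://{host}:{port}/ws/v1/history/mapreduce/jobs"
    # urlencode escapes the values and joins the pairs with "&".
    return f"{base}?{urlencode(filters)}" if filters else base


print(build_jobs_url("127.0.0.1", 19888, user="yarn", state="SUCCEEDED", limit=10))
```

When the resulting URL is pasted into a shell command, it should be quoted, because an unquoted `&` is interpreted by the shell.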

Here’s an example command that requests the list of jobs that started between 1713770628450 and 1713770680980 (Unix time in milliseconds):

$ curl -X GET "http://127.0.0.1:19888/ws/v1/history/mapreduce/jobs?startedTimeBegin=1713770628450&startedTimeEnd=1713770680980" -H "Accept: application/json"

The server responds with the object:

{"jobs":{"job":[{"submitTime":1713770623388,"startTime":1713770628457,"finishTime":1713770680975,"id":"job_1713770455913_0001","name":"QuasiMonteCarlo","queue":"default","user":"yarn","state":"SUCCEEDED","mapsTotal":16,"mapsCompleted":16,"reducesTotal":1,"reducesCompleted":1},{"submitTime":1713771341013,"startTime":1713771345766,"finishTime":1713771398188,"id":"job_1713770455913_0002","name":"QuasiMonteCarlo","queue":"default","user":"yarn","state":"SUCCEEDED","mapsTotal":16,"mapsCompleted":16,"reducesTotal":1,"reducesCompleted":1}]}}
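A JSON response of this shape can be processed with any JSON parser. The Python sketch below lists each job's ID, state, and run time, using an abridged copy of the response above. The helper name `summarize_jobs` is hypothetical, and the code assumes that `startTime` and `finishTime` are in milliseconds.

```python
import json

# Abridged copy of the response above (only the fields used below).
RESPONSE = """
{"jobs": {"job": [
  {"id": "job_1713770455913_0001", "state": "SUCCEEDED",
   "startTime": 1713770628457, "finishTime": 1713770680975},
  {"id": "job_1713770455913_0002", "state": "SUCCEEDED",
   "startTime": 1713771345766, "finishTime": 1713771398188}
]}}
"""


def summarize_jobs(payload: str) -> list:
    """Return (job ID, state, run time in ms) for every job in the payload."""
    # Defensive: treat a missing or null "jobs" object as an empty list.
    jobs = json.loads(payload).get("jobs") or {}
    return [
        (job["id"], job["state"], job["finishTime"] - job["startTime"])
        for job in jobs.get("job", [])
    ]


for job_id, state, run_ms in summarize_jobs(RESPONSE):
    print(f"{job_id}: {state}, ran for {run_ms / 1000:.1f} s")
```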

The full API specification is available at MapReduce History Server REST API’s.
