archive-logs

On clusters with many aggregated YARN logs, you can combine the logs into Hadoop archives to reduce the number of small files, and hence the load on the NameNode. The archive-logs command provides an easy way to do this.

Aggregated logs stored in Hadoop archives can still be read through the Job History Server and with the yarn logs command.
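As a minimal sketch of reading the logs of one application after archival, the snippet below builds a yarn logs call. The application ID is a made-up placeholder, and the DRY_RUN variable is a hypothetical convenience of this sketch, not part of any Hadoop tool; by default the command is only printed.

```shell
# Hypothetical application ID, used only for illustration.
app_id="application_1700000000000_0001"
logs_cmd="yarn logs -applicationId ${app_id}"

# DRY_RUN is an assumption of this sketch: set DRY_RUN=0 to actually run
# the command on a host where the Hadoop yarn CLI is installed.
if [ "${DRY_RUN:-1}" = "0" ]; then
  $logs_cmd
else
  echo "$logs_cmd"
fi
```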

The tool usage is as follows:

$ mapred archive-logs
Arguments

-force

Forces re-creation of the working directory if an existing one is found. Use this only if you know that another instance is not currently running.

-maxEligibleApps <n>

The maximum number of eligible applications to process. Defaults to -1 (all).

-maxTotalLogsSize <megabytes>

The maximum total size of an application's logs for it to be eligible for archival (in megabytes). Defaults to 1024.

-memory <megabytes>

The amount of memory for each container (in megabytes). Defaults to 1024.

-minNumberLogFiles <n>

The minimum number of log files required to be eligible for archival. Defaults to 20.

-noProxy

When specified, all processing is done under the user running this command (or the YARN user if DefaultContainerExecutor is in use). If not specified, all processing is done under the user who owns each application; the command fails if the user running it is not allowed to impersonate that user.

-help

Prints this help message.
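The arguments above can be combined in a single call. The following is a sketch only: the numeric values are illustrative rather than recommendations, and the DRY_RUN variable is a hypothetical convenience of this sketch, not an option of the tool. By default the assembled command is printed instead of executed.

```shell
# Illustrative values only; tune them for your own cluster.
max_apps=100      # process at most 100 eligible applications in this run
min_files=50      # require at least 50 log files per application
max_size_mb=2048  # upper bound on an application's total log size (MB)
memory_mb=2048    # memory for each container (MB)

cmd="mapred archive-logs -maxEligibleApps ${max_apps} -minNumberLogFiles ${min_files} -maxTotalLogsSize ${max_size_mb} -memory ${memory_mb}"

# DRY_RUN is an assumption of this sketch: set DRY_RUN=0 to actually run
# the command on a host where the Hadoop mapred CLI is installed.
if [ "${DRY_RUN:-1}" = "0" ]; then
  $cmd
else
  echo "$cmd"
fi
```

Options left out of the call, such as -force and -noProxy, keep the default behavior described above.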
