dfsadmin

Runs the HDFS dfsadmin client.

The usage is as follows:

$ hdfs dfsadmin [-report [-live] [-dead] [-decommissioning] [-enteringmaintenance] [-inmaintenance]]
$ hdfs dfsadmin [-safemode enter | leave | get | wait | forceExit]
$ hdfs dfsadmin [-saveNamespace [-beforeShutdown]]
$ hdfs dfsadmin [-rollEdits]
$ hdfs dfsadmin [-restoreFailedStorage true |false |check]
$ hdfs dfsadmin [-refreshNodes]
$ hdfs dfsadmin [-setQuota <quota> <dirname>...<dirname>]
$ hdfs dfsadmin [-clrQuota <dirname>...<dirname>]
$ hdfs dfsadmin [-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>...<dirname>]
$ hdfs dfsadmin [-clrSpaceQuota [-storageType <storagetype>] <dirname>...<dirname>]
$ hdfs dfsadmin [-finalizeUpgrade]
$ hdfs dfsadmin [-rollingUpgrade [<query> |<prepare> |<finalize>]]
$ hdfs dfsadmin [-upgrade [query | finalize]]
$ hdfs dfsadmin [-refreshServiceAcl]
$ hdfs dfsadmin [-refreshUserToGroupsMappings]
$ hdfs dfsadmin [-refreshSuperUserGroupsConfiguration]
$ hdfs dfsadmin [-refreshCallQueue]
$ hdfs dfsadmin [-refresh <host:ipc_port> <key> [arg1..argn]]
$ hdfs dfsadmin [-reconfig <namenode|datanode> <host:ipc_port> <start |status |properties>]
$ hdfs dfsadmin [-printTopology]
$ hdfs dfsadmin [-refreshNamenodes datanodehost:port]
$ hdfs dfsadmin [-getVolumeReport datanodehost:port]
$ hdfs dfsadmin [-deleteBlockPool datanode-host:port blockpoolId [force]]
$ hdfs dfsadmin [-setBalancerBandwidth <bandwidth in bytes per second>]
$ hdfs dfsadmin [-getBalancerBandwidth <datanode_host:ipc_port>]
$ hdfs dfsadmin [-fetchImage <local directory>]
$ hdfs dfsadmin [-allowSnapshot <snapshotDir>]
$ hdfs dfsadmin [-disallowSnapshot <snapshotDir>]
$ hdfs dfsadmin [-shutdownDatanode <datanode_host:ipc_port> [upgrade]]
$ hdfs dfsadmin [-evictWriters <datanode_host:ipc_port>]
$ hdfs dfsadmin [-getDatanodeInfo <datanode_host:ipc_port>]
$ hdfs dfsadmin [-metasave filename]
$ hdfs dfsadmin [-triggerBlockReport [-incremental] <datanode_host:ipc_port> [-namenode <namenode_host:ipc_port>]]
$ hdfs dfsadmin [-listOpenFiles [-blockingDecommission] [-path <path>]]
$ hdfs dfsadmin [-help [cmd]]
Arguments

-report [-live] [-dead] [-decommissioning] [-enteringmaintenance] [-inmaintenance]

Reports basic filesystem information and statistics.
The dfs usage can differ from the du usage, because it measures the raw space used by replication, checksums, snapshots, etc. on all the DataNodes.
Optional flags may be used to filter the list of displayed DataNodes
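
As an illustration of the filter flags, the sketch below wraps -report in a small shell function. The function name report_nodes and the HDFS_BIN variable are hypothetical helpers introduced for this example; only the flags themselves come from the usage above.

```shell
#!/usr/bin/env bash
# HDFS_BIN is an assumption of this sketch: it lets you point at a
# non-default hdfs binary. It falls back to plain "hdfs".
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Hypothetical helper: print the report for DataNodes in one state only.
report_nodes() {
  local state="$1"
  case "$state" in
    live|dead|decommissioning|enteringmaintenance|inmaintenance)
      "$HDFS_BIN" dfsadmin -report "-$state"
      ;;
    *)
      echo "unknown DataNode state: $state" >&2
      return 2
      ;;
  esac
}

# Usage: report_nodes dead
```

Validating the state before calling hdfs keeps a typo from silently producing the full, unfiltered report.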

-safemode enter|leave|get|wait|forceExit

Safe mode maintenance command.

Safe mode is a NameNode state in which it:

  • does not accept changes to a namespace (read-only);

  • does not replicate or delete blocks.

Safe mode is entered automatically at NameNode startup, and the node leaves the safe mode automatically when the configured minimum percentage of blocks satisfies the minimum replication condition.
If the NameNode detects any anomaly, it remains in safe mode until the issue is resolved.
If the anomaly is the consequence of a deliberate action, the administrator can use -safemode forceExit to exit safe mode.
The cases where forceExit may be required are:

  • NameNode metadata is not consistent.
    If the NameNode detects that metadata has been modified out of band and can cause data loss, it enters the forceExit state. At that point a user can either restart the NameNode with correct metadata files or run forceExit (if data loss is acceptable);

  • Rollback causes metadata to be replaced, and in rare cases it can trigger the safe mode forceExit state in the NameNode. In that case you may proceed by issuing -safemode forceExit.

Safe mode can also be entered manually, but then it can only be turned off manually as well.
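
A script that writes to HDFS can check the safe mode state first. The sketch below is one way to do that. It assumes that -safemode get prints a line containing "Safe mode is ON" or "Safe mode is OFF", and that -safemode wait blocks until safe mode is off; verify both on your Hadoop version. The function names and the HDFS_BIN variable are hypothetical.

```shell
#!/usr/bin/env bash
# HDFS_BIN is an assumption of this sketch; it falls back to plain "hdfs".
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Hypothetical helper: print "on", "off", or "unknown".
# Assumes the output contains "Safe mode is ON" / "Safe mode is OFF".
safemode_state() {
  local out
  out="$("$HDFS_BIN" dfsadmin -safemode get)" || return 1
  case "$out" in
    *"Safe mode is ON"*)  echo on ;;
    *"Safe mode is OFF"*) echo off ;;
    *) echo unknown; return 1 ;;
  esac
}

# Hypothetical helper: return at once if safe mode is off,
# otherwise block until the NameNode leaves safe mode.
wait_until_writable() {
  [ "$(safemode_state)" = "off" ] && return 0
  "$HDFS_BIN" dfsadmin -safemode wait
}

# Usage: wait_until_writable && hdfs dfs -put data.csv /landing/
```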

-saveNamespace [-beforeShutdown]

Saves the current namespace into storage directories and resets the edit log.
If the beforeShutdown option is given, the NameNode creates a checkpoint if and only if no checkpoint has been done during a time frame (a configurable number of checkpoint periods).
This is usually used before shutting down the NameNode to prevent potential fsimage/editlog corruption
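
Since -saveNamespace is only accepted while the NameNode is in safe mode, a manual checkpoint is usually a three-step sequence. The sketch below shows one possible ordering. The function name checkpoint_namespace and the HDFS_BIN variable are hypothetical, and leaving safe mode even after a failed save is a design choice of this example, not a requirement.

```shell
#!/usr/bin/env bash
# HDFS_BIN is an assumption of this sketch; it falls back to plain "hdfs".
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Hypothetical helper: enter safe mode, save the namespace, leave safe mode.
checkpoint_namespace() {
  "$HDFS_BIN" dfsadmin -safemode enter || return 1

  "$HDFS_BIN" dfsadmin -saveNamespace
  local rc=$?

  # Leave safe mode even if the save failed, so the cluster
  # does not stay read-only; then report the save result.
  "$HDFS_BIN" dfsadmin -safemode leave || return 1
  return "$rc"
}

# Usage: checkpoint_namespace
```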

-rollEdits

Rolls the edit log on the active NameNode

-restoreFailedStorage true|false|check

Turns automatic attempts to restore failed storage replicas on or off.
If a failed storage becomes available again, the system attempts to restore edits and/or fsimage during the checkpoint.
The check option returns the current setting

-refreshNodes

Re-reads the hosts and exclude files to update the set of DataNodes that are allowed to connect to the NameNode and those that should be decommissioned or recommissioned
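
A common use of -refreshNodes is to start decommissioning a DataNode after adding it to the exclude file. The sketch below assumes it runs on the host where the NameNode's exclude file (the file named by dfs.hosts.exclude) lives, and that the file holds one host per line. The function name, the example path, and the HDFS_BIN variable are hypothetical.

```shell
#!/usr/bin/env bash
# HDFS_BIN is an assumption of this sketch; it falls back to plain "hdfs".
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Hypothetical helper: add a host to the exclude file (once) and
# ask the NameNode to re-read its hosts and exclude files.
decommission_node() {
  local host="$1" exclude_file="$2"

  # Append the host only if no line matches it exactly.
  grep -qxF -- "$host" "$exclude_file" 2>/dev/null ||
    printf '%s\n' "$host" >> "$exclude_file"

  "$HDFS_BIN" dfsadmin -refreshNodes
}

# Usage (the path is an example, not a default):
#   decommission_node dn7.example.com /etc/hadoop/conf/dfs.exclude
```

Progress can then be followed with -report -decommissioning.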

-setQuota <quota> <dirname>…<dirname>

Sets the name quota (the maximum number of file and directory names) to <quota> for each directory

-clrQuota <dirname>…<dirname>

Removes any name quota for each directory.
Best effort for each directory, with faults reported if the directory does not exist or is a file.
It is not a fault if the directory has no quota

-setSpaceQuota <quota> [-storageType <storagetype>] <dirname>…<dirname>

Sets the space quota to <quota> bytes for each directory.
If the -storageType option is specified, sets the quota to <quota> bytes of the given storage type instead

-clrSpaceQuota [-storageType <storagetype>] <dirname>…<dirname>

Removes any space quota for each directory.
Best effort for each directory, with faults reported if the directory does not exist or is a file.
It is not a fault if the directory has no quota for the specified storage type.
When the -storageType option is specified, only the quota for that storage type is cleared
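
The name and space quotas are set by separate commands, so limiting a directory in both dimensions takes two calls. The sketch below shows this under stated assumptions: the function name set_quotas, the example directory, and the HDFS_BIN variable are hypothetical, and the space quota is passed in plain bytes.

```shell
#!/usr/bin/env bash
# HDFS_BIN is an assumption of this sketch; it falls back to plain "hdfs".
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Hypothetical helper: apply a name quota and a space quota to one directory.
#   $1 = directory, $2 = maximum number of names, $3 = space quota in bytes
set_quotas() {
  local dir="$1" names="$2" bytes="$3"
  "$HDFS_BIN" dfsadmin -setQuota "$names" "$dir" &&
    "$HDFS_BIN" dfsadmin -setSpaceQuota "$bytes" "$dir"
}

# Usage: 100000 names and 50 GiB for an example directory.
#   set_quotas /user/alice 100000 "$((50 * 1024 * 1024 * 1024))"
```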

-finalizeUpgrade

Finalizes upgrade of HDFS.
DataNodes delete their previous version working directories, followed by NameNode doing the same.
This completes the upgrade process

-rollingUpgrade [<query>|<prepare>|<finalize>]

Executes a rolling upgrade action:

  • query — queries the current rolling upgrade status;

  • prepare — prepares a new rolling upgrade;

  • finalize — finalizes the current rolling upgrade.

-upgrade query|finalize

With query, reports the current upgrade status.
With finalize, finalizes the upgrade of HDFS (equivalent to -finalizeUpgrade)

-refreshServiceAcl

Reloads the service-level authorization policy file

-refreshUserToGroupsMappings

Refreshes user-to-groups mappings

-refreshSuperUserGroupsConfiguration

Refreshes superuser proxy groups mappings

-refreshCallQueue

Reloads the call queue from config

-refresh <host:ipc_port> <key> [arg1..argn]

Triggers a runtime-refresh of the resource specified by key on <host:ipc_port>.
All other args after are sent to the host

-reconfig <namenode|datanode> <host:ipc_port> <start|status|properties>

Starts reconfiguration or gets the status of an ongoing reconfiguration, or gets a list of reconfigurable properties.
The first parameter specifies the node type (namenode or datanode)

-printTopology

Prints a tree of the racks and their nodes as reported by the NameNode

-refreshNamenodes datanodehost:port

For the given DataNode, reloads the configuration files, stops serving the removed block-pools and starts serving new block-pools

-getVolumeReport datanodehost:port

Gets the volume report for the given DataNode

-deleteBlockPool datanode-host:port blockpoolId [force]

If force is passed, the block pool directory for the given block pool ID on the given DataNode is deleted along with its contents; otherwise the directory is deleted only if it is empty.
The command fails if the DataNode is still serving the block pool. Refer to -refreshNamenodes to shut down a block pool service on a DataNode

-setBalancerBandwidth <bandwidth in bytes per second>

Changes the network bandwidth used by each DataNode during HDFS block balancing.
<bandwidth> is the maximum number of bytes per second that will be used by each DataNode.
This value overrides the dfs.datanode.balance.bandwidthPerSec parameter.
The new value is not persistent on the DataNode
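
Because the argument is in bytes per second, it is easy to be off by a factor of 1024 when thinking in MiB/s. The sketch below does the conversion explicitly. The function names and the HDFS_BIN variable are hypothetical helpers for this example.

```shell
#!/usr/bin/env bash
# HDFS_BIN is an assumption of this sketch; it falls back to plain "hdfs".
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Convert whole MiB/s to bytes per second (1 MiB = 1024 * 1024 bytes).
mib_to_bytes() {
  echo "$(( $1 * 1024 * 1024 ))"
}

# Hypothetical helper: set the balancer bandwidth from a MiB/s value.
set_balancer_mib() {
  "$HDFS_BIN" dfsadmin -setBalancerBandwidth "$(mib_to_bytes "$1")"
}

# Usage: set_balancer_mib 10   # passes 10485760 bytes per second
```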

-getBalancerBandwidth <datanode_host:ipc_port>

Gets the network bandwidth for the given DataNode (in bytes per second).
This is the maximum network bandwidth used by the DataNode during HDFS block balancing

-fetchImage <local directory>

Downloads the most recent fsimage from the NameNode and saves it in the specified local directory

-allowSnapshot <snapshotDir>

Allows snapshots of a directory to be created.
If the operation completes successfully, the directory becomes snapshottable

-disallowSnapshot <snapshotDir>

Disallows creating snapshots of a directory.
All snapshots of the directory must be deleted before disallowing snapshots
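
Enabling snapshots is an administrator step; taking a snapshot is a separate step done with the regular hdfs dfs client. The sketch below chains the two. Note that hdfs dfs -createSnapshot is not part of dfsadmin, and the function name, the example path, the snapshot name, and the HDFS_BIN variable are hypothetical.

```shell
#!/usr/bin/env bash
# HDFS_BIN is an assumption of this sketch; it falls back to plain "hdfs".
HDFS_BIN="${HDFS_BIN:-hdfs}"

# Hypothetical helper: make a directory snapshottable, then snapshot it.
#   $1 = directory, $2 = snapshot name
snapshot_dir() {
  local dir="$1" name="$2"
  "$HDFS_BIN" dfsadmin -allowSnapshot "$dir" &&
    "$HDFS_BIN" dfs -createSnapshot "$dir" "$name"
}

# Usage: snapshot_dir /data/warehouse before-migration
```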

-shutdownDatanode <datanode_host:ipc_port> [upgrade]

Submits a shutdown request for the given DataNode

-evictWriters <datanode_host:ipc_port>

Makes the DataNode evict all clients that are writing a block.
This is useful if decommissioning is hung due to slow writers

-getDatanodeInfo <datanode_host:ipc_port>

Gets the information about the given DataNode

-metasave filename

Saves NameNode’s primary data structures to filename in the directory specified by the hadoop.log.dir property.
The filename will be overwritten if it exists.
The file will contain one line for each of the following:

  • DataNodes heartbeating with the NameNode;

  • Blocks waiting to be replicated;

  • Blocks currently being replicated;

  • Blocks waiting to be deleted.

-triggerBlockReport [-incremental] <datanode_host:ipc_port> [-namenode <namenode_host:ipc_port>]

Triggers a block report for the given DataNode.
If -incremental is specified, it will be an incremental block report; otherwise, it will be a full block report.
If -namenode <namenode_host:ipc_port> is given, the block report is sent only to the specified NameNode

-listOpenFiles [-blockingDecommission] [-path <path>]

Lists all open files currently managed by the NameNode along with client name and client machine accessing them.
The list of open files is filtered by the given type and path. Add the -blockingDecommission option to list only the open files that are blocking DataNode decommissioning

-help [cmd]

Displays help for the given command or all commands if none is specified
