Check files via fsck

In HDFS, the fsck command reports detailed file system information and health status, and provides several actions for handling damaged files.

Using the fsck command, you can:

  • Print out replication information for files: under-replicated blocks, blocks by rack, block size, and so on.

  • Print out system information: health, network topology, and the number of DataNodes, racks, and directories.

  • Print out erasure coding information.

  • Manage corrupted files: delete them or move them to the /lost+found directory.
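
The capabilities listed above map onto fsck options roughly as in the sketch below. The options themselves (-files, -blocks, -locations, -racks, -list-corruptfileblocks, -move, -delete) are standard fsck flags; the fsck wrapper function and the RUN and TARGET variables are assumptions introduced only for this illustration. Because -move and -delete change the namespace, the wrapper prints each command and executes it only when RUN=1.

```shell
#!/usr/bin/env bash
# Hypothetical dry-run wrapper: print each fsck command and execute it
# only when RUN=1. Executing requires an HDFS client and a reachable NameNode.
fsck() {
  echo "hdfs fsck $*"
  if [ "${RUN:-0}" = "1" ]; then
    hdfs fsck "$@"
  fi
}

# Path to check; /tmp matches the example used in this section.
TARGET="${TARGET:-/tmp}"

# Replication details: per-file block list, replica locations, rack placement.
fsck "$TARGET" -files -blocks -locations -racks

# List the files that have corrupt blocks under the path.
fsck "$TARGET" -list-corruptfileblocks

# Move corrupted files to /lost+found (changes the namespace).
fsck "$TARGET" -move

# Delete corrupted files (irreversible).
fsck "$TARGET" -delete
```

Run the script as is to review the commands, then rerun it with RUN=1 once the printed commands look right. Erasure coding totals need no extra option: they appear in the summary of every fsck run.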

The following example runs fsck on the /tmp/ directory and additionally prints out the storage policy compliance:

$ hdfs fsck /tmp/ -storagepolicies

The output should look similar to this:

Connecting to namenode via http://127.0.0.1:9870/fsck?ugi=admin&storagepolicies=1&path=%2Ftmp
FSCK started by admin (auth:SIMPLE) from /127.0.0.1 for path /tmp at Wed Sep 13 12:56:43 UTC 2023

.
/tmp/test/test_file.tgz:  Under replicated BP-2122642454-127.0.0.1-1690205635727:blk_1073741974_1150. Target Replicas is 3 but found 2 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).
.
/tmp/test/test_file.zip:  Under replicated BP-2122642454-127.0.0.1-1690205635727:blk_1073741965_1141. Target Replicas is 3 but found 2 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).

/tmp/test/test_file.zip:  Under replicated BP-2122642454-127.0.0.1-1690205635727:blk_1073741966_1142. Target Replicas is 3 but found 2 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).

/tmp/test/test_file.zip:  Under replicated BP-2122642454-127.0.0.1-1690205635727:blk_1073741967_1143. Target Replicas is 3 but found 2 live replica(s), 0 decommissioned replica(s), 0 decommissioning replica(s).

Status: HEALTHY
 Number of data-nodes:  2
 Number of racks:               1
 Total dirs:                    2
 Total symlinks:                0

Replicated Blocks:
 Total size:    303063571 B
 Total files:   2
 Total blocks (validated):      4 (avg. block size 75765892 B)
 Minimally replicated blocks:   4 (100.0 %)
 Over-replicated blocks:        0 (0.0 %)
 Under-replicated blocks:       4 (100.0 %)
 Mis-replicated blocks:         0 (0.0 %)
 Default replication factor:    3
 Average block replication:     2.0
 Missing blocks:                0
 Corrupt blocks:                0
 Missing replicas:              4 (33.333332 %)

Erasure Coded Block Groups:
 Total size:    0 B
 Total files:   0
 Total block groups (validated):        0
 Minimally erasure-coded block groups:  0
 Over-erasure-coded block groups:       0
 Under-erasure-coded block groups:      0
 Unsatisfactory placement block groups: 0
 Average block group size:      0.0
 Missing block groups:          0
 Corrupt block groups:          0
 Missing internal blocks:       0

Blocks satisfying the specified storage policy:
Storage Policy                  # of blocks       % of blocks
DISK:2(HOT)                        4             100.0000%

All blocks satisfy specified storage policy.
FSCK ended at Wed Sep 13 12:56:43 UTC 2023 in 7 milliseconds


The filesystem under path '/tmp' is HEALTHY
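
For monitoring, it can be handy to pull individual values out of the summary. The following is a minimal sketch, assuming the fsck output was saved to a file beforehand; the fsck_field function name and the temporary file are hypothetical, and the sample lines are copied from the output above.

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the first whitespace-delimited value that
# follows a summary label in saved fsck output (first matching line only).
fsck_field() {
  local label="$1" file="$2"
  grep -m1 -F "$label" "$file" | cut -d: -f2- | awk '{print $1}'
}

# Stand-in for a saved report; these lines are taken from the sample output.
report="$(mktemp)"
cat > "$report" <<'EOF'
Status: HEALTHY
 Under-replicated blocks:       4 (100.0 %)
 Missing blocks:                0
 Corrupt blocks:                0
EOF

status="$(fsck_field 'Status:' "$report")"
under="$(fsck_field 'Under-replicated blocks:' "$report")"
corrupt="$(fsck_field 'Corrupt blocks:' "$report")"

echo "status=$status under_replicated=$under corrupt=$corrupt"
rm -f "$report"
```

For the sample lines, the script prints status=HEALTHY under_replicated=4 corrupt=0. The status is HEALTHY even though every block is under-replicated, because the sample contains no missing or corrupt blocks. The Missing replicas figure follows from the same numbers: 4 blocks with a target of 3 replicas each expect 12 replicas, 8 are live, so 4 (about 33.3 %) are missing.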