Erasure coding

Overview

Erasure coding (EC) is a data protection mechanism for rarely used data, which uses encoding instead of replication for ensuring data resiliency. It allows decreasing resource consumption for warm and cold data while maintaining fault-tolerance.

The replication factor of three enables HDFS to sustain two data failures, but it comes at a price of storage space, costing as much as 200% overhead. Erasure coding, typically, allows the same amount of failures while costing significantly less: for example, a file with the RS-3-2-1024k EC policy would sustain the same two data failures, but incurs only a 50% overhead.

When an erasure coding policy is applied to a file, it gets striped into small blocks called striping cells. Then, HDFS uses an encoding algorithm on a group of cells to create a parity file. If a striping cell has been corrupted or lost, the parity file can be used to restore the missing cell by reconstructing it entirely using the parity file and the remaining cell. This process is called decoding.

Erasure coding is compute intensive and network intensive: data needs to be encoded during write operations and decoded for read operations if data integrity is lost.

NOTE

Since the erasure coding is an alternative to replication, any operations related to replication rate (for example, the -setrep command) won’t have any effect on erasure coded files.

For more information about the replication, see the Replication factor article.

Policies

The erasure coding policy determines how data is encoded and how much data resiliency it provides. The name of the policy looks like RS-10-4-1024k and comprises the following information:

  • The type of codec: either Reed-Solomon (RS) or XOR.

  • The number of data blocks.

  • The number of parity blocks.

  • The size of a stiping cell.

Currently, five built-in policies are supported: RS-3-2-1024k, RS-6-3-1024k, RS-10-4-1024k, RS-LEGACY-6-3-1024k, XOR-2-1-1024k.

You can create a custom policy by creating an XML file user_ec_policies.xml describing the policy settings and using the hdfs ec -addPolicies command. For more information on how to create this file, see the user_ec_policies.xml.template example.

Requirements

Erasure coding places additional hardware requirements for clusters:

  • CPU. Encoding and decoding work consumes additional CPU on both HDFS clients and DataNodes.

  • DataNodes. Erasure coding requires a minimum of as many DataNodes as the sum of data and parity blocks. For example, for the RS-3-2-1024k policy the minimum number of DataNodes would be 5 (3 data blocks + 2 parity blocks). This number of DataNodes is required for the EC policy to work correctly, but if your cluster doesn’t satisfy the requirement, HDFS will still implement the policy.

  • Racks. Erasure coding supports rack awareness. It is beneficial to have at least as many racks as the number of parity blocks. For clusters with fewer racks, HDFS cannot maintain rack fault-tolerance, but will still attempt to spread a striped file across multiple DataNodes. For this reason, it is recommended to setup racks with similar number of DataNodes. For information about rack awareness, refer to Rack awareness.

Manage EC

Erasure coding can only be applied to a directory. Once applied, it affects only the new files added to the directory. This makes it possible for files with different EC policies and replicated files to be in the same directory.

Setting or changing a policy for a file requires rewriting its data. If you want to apply EC to existing files, you can create a new directory, set an erasure coding policy for it, and copy the files there using the distcp command.

You can manage erasure coding using the hdfs ec command.

For example, to set the XOR-2-1-1024k policy for a directory:

  1. Make sure it’s enabled by running the command:

    $ hdfs ec -listPolicies

    Example output:

    ErasureCodingPolicy=[Name=RS-10-4-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=10, numParityUnits=4]], CellSize=1048576, Id=5], State=DISABLED
    ErasureCodingPolicy=[Name=RS-3-2-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=3, numParityUnits=2]], CellSize=1048576, Id=2], State=DISABLED
    ErasureCodingPolicy=[Name=RS-6-3-1024k, Schema=[ECSchema=[Codec=rs, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=1], State=ENABLED
    ErasureCodingPolicy=[Name=RS-LEGACY-6-3-1024k, Schema=[ECSchema=[Codec=rs-legacy, numDataUnits=6, numParityUnits=3]], CellSize=1048576, Id=3], State=DISABLED
    ErasureCodingPolicy=[Name=XOR-2-1-1024k, Schema=[ECSchema=[Codec=xor, numDataUnits=2, numParityUnits=1]], CellSize=1048576, Id=4], State=DISABLED

    Before applying a policy, it must be enabled first. The RS-6-3-1024k policy is enabled by default.

  2. To enable the XOR-2-1-1024k policy, use the command:

    $ hdfs ec -enablePolicy -policy XOR-2-1-1024k
  3. To set the enabled policy for the tmp directory, use the command:

    $ hdfs ec -setPolicy -path /tmp -policy XOR-2-1-1024k
  4. Check the directory policy:

    $ hdfs ec -getPolicy -path /tmp

    The output will look like this:

XOR-2-1-1024k
Found a mistake? Seleсt text and press Ctrl+Enter to report it