Ozone rack awareness
Rack awareness in Ozone is a feature that takes the physical network topology into account when placing data. It is crucial for data locality, fault tolerance, and overall performance, particularly in a geographically distributed cluster. If rack awareness is on, Ozone will place each key replica on a host in a different rack. This ensures that data remains available if a rack becomes unreachable because of a network failure or another issue.
To configure rack awareness for Ozone via ADCM, perform the following steps:
-
Go to the ADCM UI and select your cluster on the Clusters page.
-
Go to the Services tab and select Ozone.
-
Switch on the Show advanced toggle and locate the Topology script and Topology data parameters.
-
Paste your network topology script as the value of the Topology script parameter.
Example topology script
#!/bin/bash
# Adjust/Add the property "net.topology.script.file.name"
# to core-site.xml with the "absolute" path to this
# file. ENSURE the file is "executable".

# Supply appropriate rack prefix
RACK_PREFIX=default

# To test, supply a hostname as script input:
if [ $# -gt 0 ]; then

  CTL_FILE=${CTL_FILE:-"topology.data"}
  HADOOP_CONF=${HADOOP_CONF:-"/etc/hadoop/conf"}

  if [ ! -f ${HADOOP_CONF}/${CTL_FILE} ]; then
    echo -n "/$RACK_PREFIX/rack "
    exit 0
  fi

  while [ $# -gt 0 ] ; do
    nodeArg=$1
    exec< ${HADOOP_CONF}/${CTL_FILE}
    result=""
    while read line ; do
      ar=( $line )
      if [ "${ar[0]}" = "$nodeArg" ] ; then
        result="${ar[1]}"
      fi
    done
    shift
    if [ -z "$result" ] ; then
      echo -n "/$RACK_PREFIX/rack "
    else
      echo -n "/$RACK_PREFIX/rack_$result "
    fi
  done

else
  echo -n "/$RACK_PREFIX/rack "
fi
You can find additional script examples in the Rack Awareness article.
-
As the value of the Topology data parameter, list the host IP addresses and their corresponding rack IDs as shown in the following example:
Example topology data
# This file should be:
# - Placed in the /etc/hadoop/conf directory
# - On the Namenode (and backups IE: HA, Failover, etc)
# - On the Job Tracker OR Resource Manager (and any Failover JT's/RM's)
# This file should be placed in the /etc/hadoop/conf directory.
# Add Hostnames to this file. Format <host ip> <rack_location>
10.92.42.178 01
10.92.43.172 02
10.92.42.229 03
-
Click Save, then Create.
-
In the Actions drop-down menu, select Restart.
-
Make sure the Apply configs from ADCM option is set to true and click Run.
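Before pasting the values into ADCM, it can help to dry-run the host-to-rack lookup locally. The following is a minimal sketch, not part of ADCM or Ozone: the lookup_rack function and the temporary data file are hypothetical helpers that mimic the matching logic of the example topology script (the first column is the host IP, the second column is the rack ID, and the last matching line wins).

```shell
#!/bin/bash
# Hypothetical helper (not shipped with Ozone or ADCM): resolves a host to a
# rack path the same way the example topology script does.
RACK_PREFIX=default

lookup_rack() {
  local data_file=$1 node=$2 result="" host rack _rest
  # Comment lines never match because their first field is "#".
  while read -r host rack _rest; do
    if [ "$host" = "$node" ]; then
      result=$rack
    fi
  done < "$data_file"
  if [ -z "$result" ]; then
    printf '/%s/rack\n' "$RACK_PREFIX"
  else
    printf '/%s/rack_%s\n' "$RACK_PREFIX" "$result"
  fi
}

# Sample data copied from the topology data example.
data_file=$(mktemp)
cat > "$data_file" <<'EOF'
# Format <host ip> <rack_location>
10.92.42.178 01
10.92.43.172 02
10.92.42.229 03
EOF

lookup_rack "$data_file" 10.92.43.172   # host listed in the data file
lookup_rack "$data_file" 10.0.0.1       # unlisted host, falls back to the default rack
```

With the sample data, the first call prints /default/rack_02 and the second prints /default/rack. A host that resolves to the fallback rack usually means a missing or mistyped line in the topology data.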

To check if rack awareness has been successfully configured, you can use the following command:
$ ozone admin datanode list
The output should look like this:
Datanode 10.92.42.178:9866 - Rack: /rack01 - Status: UP
Datanode 10.92.43.172:9866 - Rack: /rack02 - Status: UP
Datanode 10.92.42.229:9866 - Rack: /rack03 - Status: UP
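On a larger cluster, a per-rack count is easier to scan than the full list. The sketch below assumes that every output line has the same shape as the sample above (a "Rack:" token followed by the rack path). The racks_summary function is a hypothetical helper, and the real output format may differ between Ozone versions, so adjust the parsing as needed.

```shell
#!/bin/bash
# Hypothetical helper: counts datanodes per rack, assuming each input line
# contains "Rack: <rack>" as in the sample output.
racks_summary() {
  awk '{
         for (i = 1; i < NF; i++)
           if ($i == "Rack:") count[$(i + 1)]++
       }
       END { for (r in count) print r, count[r] }' | sort
}

# Sample input copied from the expected output. On a live cluster, pipe the
# output of "ozone admin datanode list" into racks_summary instead.
racks_summary <<'EOF'
Datanode 10.92.42.178:9866 - Rack: /rack01 - Status: UP
Datanode 10.92.43.172:9866 - Rack: /rack02 - Status: UP
Datanode 10.92.42.229:9866 - Rack: /rack03 - Status: UP
EOF
```

For the sample input, each of the three racks is reported with a count of 1. If all datanodes end up in a single rack, the topology script or data was likely not applied.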