Kerberos overview

An ADH cluster can be kerberized using one of the following KDC types:

  • MIT KDC, which consists of a principal database and Kerberos key storage.

  • Active Directory, which consists of a principal database and Windows Server key storage.

  • FreeIPA, which is a free, open-source identity management system for Linux/UNIX environments.

The identification and authentication process is practically the same for all of them. The only difference is the KDC type being used.

Common requirements

This section describes the common prerequisites that are necessary to perform the kerberization of an ADH cluster.

To kerberize an ADH cluster, you have to ensure the following:

  • iptables and SELinux are turned off on all machines.

  • The ADH cluster is installed and set up. You can turn on Kerberos only when the cluster is ready for work.

  • Forward and reverse DNS are configured for all cluster nodes.

  • There is an Active Directory or MIT Kerberos administrator account that has full access to create, remove, and manage user accounts.

  • The Active Directory or MIT Kerberos key and user storage is set up and ready.
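
As a rough illustration of the DNS requirement, the sketch below checks that a host name resolves to an address and that the address resolves back to the same name. It assumes that getent is available on the node. The check_dns function and the host names are hypothetical and are not part of ADH.

```shell
# Sketch: verify that forward and reverse DNS lookups agree for one host.
# Assumes `getent` is installed; check_dns is a hypothetical helper name.
check_dns() {
    fqdn="$1"
    # Forward lookup: take the first address returned for the name.
    ip=$(getent hosts "$fqdn" | awk 'NR==1 {print $1}')
    if [ -z "$ip" ]; then
        echo "no forward record for $fqdn"
        return 1
    fi
    # Reverse lookup: take the canonical name returned for that address.
    reverse=$(getent hosts "$ip" | awk 'NR==1 {print $2}')
    if [ "$reverse" = "$fqdn" ]; then
        echo "ok $fqdn <-> $ip"
    else
        echo "mismatch: $fqdn -> $ip -> ${reverse:-none}"
        return 1
    fi
}
```

The function can be called in a loop over the fully qualified names of the cluster nodes, for example check_dns node1.example.com. A non-zero exit status indicates a node whose DNS records should be reviewed before Kerberos is enabled.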

Configuring Hadoop Group Mapping for LDAP/AD

This section describes the prerequisites that are necessary to set up Active Directory for Kerberos.

To enforce authorization at the LDAP/AD group level in Hadoop, you need to configure Hadoop group mapping for LDAP/AD.

TIP
LDAP settings may vary depending on the LDAP implementation used.

There are two ways to configure Hadoop group mapping.

  • Configuring Hadoop group mapping for LDAP/AD using SSSD (recommended).

    For group mapping, we recommend using SSSD or one of the following services that connect Linux to LDAP:

    • Centrify;

    • NSLCD;

    • Winbind;

    • SAMBA.

    Most of the listed services let you look up a user, enumerate groups, and perform other actions on the host. Mapping LDAP groups in Hadoop requires only the ability to look up (validate) a user in LDAP and enumerate the user's groups. Therefore, when evaluating these services, it is important to understand the difference between the NSS module (which resolves users and groups) and the PAM module (which performs user authentication). NSS is required. PAM is not required and can pose a security threat.

  • Configuring Hadoop group mapping via ADCM (core-site.xml section).

    LDAP-based group mapping is configured in the core-site.xml file in the following order:

    1. Add the properties shown in the example below to the core-site.xml file. Specify values for the bind user, the bind password, and other properties specific to your LDAP instance. Make sure that the object class, user, and group filters match the values used in the LDAP instance.

      Configuring authentication settings in core-site.xml
      <property>
      <name>hadoop.security.group.mapping</name>
      <value>org.apache.hadoop.security.LdapGroupsMapping</value>
      </property>
      
      <property>
      <name>hadoop.security.group.mapping.ldap.bind.user</name>
      <value>cn=Manager,dc=hadoop,dc=apache,dc=org</value>
      </property>
      
      <!--
      <property>
      <name>hadoop.security.group.mapping.ldap.bind.password.file</name>
      <value>/etc/hadoop/conf/ldap-conn-pass.txt</value>
      </property>
      -->
      
      <property>
      <name>hadoop.security.group.mapping.ldap.bind.password</name>
      <value>hadoop</value>
      </property>
      
      <property>
      <name>hadoop.security.group.mapping.ldap.url</name>
      <value>ldap://localhost:389/dc=hadoop,dc=apache,dc=org</value>
      </property>
      
      <property>
      <name>hadoop.security.group.mapping.ldap.base</name>
      <value></value>
      </property>
      
      <property>
      <name>hadoop.security.group.mapping.ldap.search.filter.user</name>
      <value>(&amp;(|(objectclass=person)(objectclass=applicationProcess))(cn={0}))</value>
      </property>
      
      <property>
      <name>hadoop.security.group.mapping.ldap.search.filter.group</name>
      <value>(objectclass=groupOfNames)</value>
      </property>
      
      <property>
      <name>hadoop.security.group.mapping.ldap.search.attr.member</name>
      <value>member</value>
      </property>
      
      <property>
      <name>hadoop.security.group.mapping.ldap.search.attr.group.name</name>
      <value>cn</value>
      </property>

    2. Depending on your configuration, you can refresh the user-to-group mappings using the following HDFS and YARN commands:

      $ hdfs dfsadmin -refreshUserToGroupsMappings
      $ yarn rmadmin -refreshUserToGroupsMappings

    3. Check the mapping of LDAP groups by running the hdfs groups command. The command displays the groups from LDAP for the current user. With LDAP group mapping configured, HDFS permissions can use LDAP-defined groups for access control.
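
The checks from the steps above can be sketched as two small shell helpers. The first expands the {0} placeholder of a user filter so that the filter can be tried against the directory before it goes into core-site.xml. The second parses the output of hdfs groups to confirm that a user is mapped to an expected group. The names user_filter, user_in_group, alice, and analysts are hypothetical. The sketch also assumes that user and group names contain no spaces or characters that are special to sed.

```shell
# Expand Hadoop's {0} placeholder so a user filter can be tested by hand.
# Outside XML the filter is written with a plain "&", not "&amp;".
# Assumes the user name has no characters special to sed ("/", "&", "\").
user_filter() {
    printf '%s\n' "$1" | sed "s/{0}/$2/"
}

# Succeed if `hdfs groups` reports that user $1 belongs to group $2.
# `hdfs groups <user>` prints "<user> : <group1> <group2> ...".
# Assumes group names contain no spaces.
user_in_group() {
    groups_line=$(hdfs groups "$1") || return 2
    # Drop everything up to the first ":" and walk the remaining words.
    for g in ${groups_line#*:}; do
        if [ "$g" = "$2" ]; then
            return 0
        fi
    done
    return 1
}
```

A possible use, assuming the OpenLDAP client tools are installed: pass the result of user_filter '(&(|(objectclass=person)(objectclass=applicationProcess))(cn={0}))' alice as the filter argument to ldapsearch -x -H ldap://localhost:389 -D "cn=Manager,dc=hadoop,dc=apache,dc=org" -W -b "dc=hadoop,dc=apache,dc=org" and confirm that exactly one entry is returned. After refreshing the mappings, user_in_group alice analysts should exit with status 0.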
