Ranger architecture
Overview
Apache Ranger is an open-source security framework that provides centralized policy management for Hadoop and other big data ecosystems.
Ranger is a composite service that consists of three components: Ranger Admin, Ranger KMS, and Ranger User synchronizer (UserSync).
Ranger lets you define authorization policies for services, stores them on a policy server, and enforces them through plugins.
Ranger Admin
Ranger Admin is a centralized interface for managing policies, users, and audits. It provides a web UI and a REST API. Ranger Admin can run in high availability mode if at least two instances are deployed. When a user opens the Ranger Admin web UI in a browser, a load balancer automatically routes the request to the least loaded instance. Ranger Admin's metadata can also be stored in an external database (PostgreSQL or MySQL).
Ranger Admin supports additional authentication with LDAP/Active Directory.
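As an illustration of the REST API mentioned above, the following minimal sketch builds (but does not send) a request that lists the policies of one service. It assumes the public v2 policy endpoint and HTTP basic authentication. The host name, port, service name, and credentials are hypothetical placeholders, and the helper name `build_policy_list_request` is introduced here only for illustration.

```python
import urllib.request
from base64 import b64encode
from urllib.parse import quote


def build_policy_list_request(admin_url, service_name, user, password):
    """Build a GET request that lists the policies of one Ranger service.

    Assumes the public v2 endpoint /service/public/v2/api/service/<name>/policy
    and HTTP basic authentication. The request is only built, not sent.
    """
    url = "{}/service/public/v2/api/service/{}/policy".format(
        admin_url.rstrip("/"), quote(service_name, safe="")
    )
    token = b64encode("{}:{}".format(user, password).encode("utf-8")).decode("ascii")
    request = urllib.request.Request(url, method="GET")
    request.add_header("Authorization", "Basic " + token)
    request.add_header("Accept", "application/json")
    return request


if __name__ == "__main__":
    # Hypothetical host, service name, and credentials.
    req = build_policy_list_request(
        "http://ranger-admin.example.com:6080", "hdfs_dev", "admin", "secret"
    )
    print(req.get_method(), req.full_url)
    # To send it: urllib.request.urlopen(req), then parse the JSON response body.
```

In a high availability setup, `admin_url` would normally point to the load balancer rather than to an individual Ranger Admin instance.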
Ranger KMS
Ranger KMS (Key Management Service) is used for cluster encryption. It is based on Hadoop KMS, but it allows you to store keys in a secure database instead of Java keystore files. The KMS server can be managed through Ranger Admin.
Ranger KMS allows you to create, update, or delete keys using Ranger Admin or the REST API. Ranger KMS is fully compatible with the Hadoop KMS REST API. Using Ranger KMS is recommended because it provides secure key storage along with scalability: it supports high availability mode. As with Ranger Admin, Ranger KMS metadata can be stored in an external database (PostgreSQL or MySQL).
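Because Ranger KMS follows the Hadoop KMS REST API, creating a key can be sketched as a POST to the `/kms/v1/keys` resource. The example below only builds the request. The host name, port, and key name are hypothetical, the helper `build_create_key_request` is illustrative, and authentication (for example, Kerberos/SPNEGO) is deliberately left out because it depends on the cluster setup.

```python
import json
import urllib.request


def build_create_key_request(kms_url, key_name, length=128, cipher="AES/CTR/NoPadding"):
    """Build a Hadoop KMS-style 'create key' request (POST /kms/v1/keys).

    The request is only built, not sent. Authentication is omitted and must be
    added according to how the KMS is secured.
    """
    body = {"name": key_name, "cipher": cipher, "length": length}
    request = urllib.request.Request(
        kms_url.rstrip("/") + "/kms/v1/keys",
        data=json.dumps(body).encode("utf-8"),
        method="POST",
    )
    request.add_header("Content-Type", "application/json")
    return request


if __name__ == "__main__":
    # Hypothetical KMS address and key name.
    req = build_create_key_request("http://ranger-kms.example.com:9292", "demo_key")
    print(req.get_method(), req.full_url)
    print(req.data.decode("utf-8"))
```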
Ranger User synchronizer
Ranger User synchronizer (UserSync) is a component that synchronizes users and groups from external systems into Ranger. By default, only Unix users are imported, but you can configure an additional LDAP source.
UserSync uses the incremental synchronization mode by default: in each synchronization period, it updates only new or modified users and user groups. When a user or user group is deleted in the source system, UserSync does not propagate the deletion to Ranger Admin. To improve performance, UserSync does not synchronize empty user groups. The synchronization is one-way: users deleted from Ranger Admin remain in the source system.
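The incremental rules above can be illustrated with a toy model. This is not UserSync's actual implementation: the function `incremental_sync` and the dictionary-based data model are assumptions made purely to show the three rules (update only new or changed groups, skip empty groups, never propagate deletions).

```python
def incremental_sync(source_groups, ranger_groups):
    """Apply one incremental, one-way sync cycle (toy model).

    source_groups: dict mapping group name -> set of user names in the source.
    ranger_groups: dict with the same shape, updated in place.
    Returns the sorted names of groups that were created or updated.
    """
    changed = []
    for group, members in source_groups.items():
        if not members:
            continue  # empty groups are not synchronized
        if ranger_groups.get(group) != members:
            ranger_groups[group] = set(members)  # new or modified group
            changed.append(group)
    # Groups that disappeared from the source are intentionally left untouched.
    return sorted(changed)


if __name__ == "__main__":
    ranger = {"legacy": {"bob"}}
    source = {"dev": {"alice"}, "empty": set()}
    print(incremental_sync(source, ranger))
    print(sorted(ranger))
```

In this model, the `legacy` group stays in `ranger` even though it no longer exists in the source, which mirrors the behavior for deleted groups described above.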
Synchronization runs periodically, and you can set the interval in the Ranger configuration using the ranger.usersync.sleeptimeinmillisbetweensynccycle parameter in the ranger-ugsync-site.xml parameter group. If the value is unset, the default is used: 300000 ms (5 minutes) for Unix and 21600000 ms (6 hours) for LDAP.
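The fallback logic for the synchronization interval can be sketched as follows. The example assumes that ranger-ugsync-site.xml uses the Hadoop-style `<configuration>/<property>` layout and that the caller already knows the sync source (`"unix"` or `"ldap"`). The function name `sync_interval_ms` is hypothetical, and the defaults are the values stated above.

```python
import xml.etree.ElementTree as ET

SYNC_INTERVAL_KEY = "ranger.usersync.sleeptimeinmillisbetweensynccycle"

# Defaults as stated in this document, in milliseconds.
DEFAULT_INTERVAL_MS = {"unix": 300000, "ldap": 21600000}


def sync_interval_ms(site_xml, source):
    """Return the configured sync interval, or the default for the given source.

    site_xml: contents of a Hadoop-style configuration file (assumed layout).
    source: "unix" or "ldap".
    """
    root = ET.fromstring(site_xml)
    for prop in root.iter("property"):
        if (prop.findtext("name") or "").strip() == SYNC_INTERVAL_KEY:
            value = (prop.findtext("value") or "").strip()
            if value:
                return int(value)
    return DEFAULT_INTERVAL_MS[source]


if __name__ == "__main__":
    example = (
        "<configuration><property>"
        "<name>" + SYNC_INTERVAL_KEY + "</name><value>600000</value>"
        "</property></configuration>"
    )
    print(sync_interval_ms(example, "ldap"))
    print(sync_interval_ms("<configuration/>", "ldap"))
```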
Ranger policy server
The Ranger policy server lets you define authorization policies for Hadoop applications.
Ranger plugins
Ranger provides access control plugins that replace the native authorization plugins of Hadoop applications. If a Ranger plugin is enabled for a service, you can set authorization policies for it in Ranger Admin. Ranger plugins periodically pull policies from Ranger Admin and cache them in local storage. You can also set up audit logging for plugins.
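The pull-and-cache behavior of a plugin can be sketched roughly as below. This is a simplified model, not the plugin's real code: the class `PolicyCache`, the `fetch` callable, and the JSON cache file are illustrative assumptions. The sketch also assumes that a plugin falls back to its local cache when Ranger Admin cannot be reached.

```python
import json
import os
import tempfile


class PolicyCache:
    """Keep policies in memory and mirror them to a local cache file."""

    def __init__(self, fetch, cache_path):
        self.fetch = fetch            # callable that returns policies from Ranger Admin
        self.cache_path = cache_path  # local file used as the cache
        self.policies = None

    def refresh(self):
        """Run one refresh cycle; return True if fresh policies were fetched."""
        try:
            self.policies = self.fetch()
        except OSError:
            # Ranger Admin unreachable: keep current policies or load the cached copy.
            if self.policies is None and os.path.exists(self.cache_path):
                with open(self.cache_path, encoding="utf-8") as cache_file:
                    self.policies = json.load(cache_file)
            return False
        with open(self.cache_path, "w", encoding="utf-8") as cache_file:
            json.dump(self.policies, cache_file)
        return True


if __name__ == "__main__":
    path = os.path.join(tempfile.mkdtemp(), "policies.json")
    cache = PolicyCache(lambda: [{"id": 1, "name": "demo"}], path)
    print(cache.refresh(), cache.policies)
```

A real plugin would call something like `refresh()` on a timer; the sketch leaves scheduling out to keep the caching logic visible.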