Use Ranger for Impala on Kubernetes
Prerequisites
-
An ADPS cluster (2.0.0 or later) is installed and running.
-
An ADH cluster (4.2.0 or later) is installed and running.
-
Impala is installed according to the installation instructions.
Step 1. Create a service in Ranger
This guide describes how to create a service via the Ranger REST API. Alternatively, you can create a service in the Ranger web UI.
-
Define a service in a JSON file:
ranger-impala-k8s.json

{
  "isEnabled": true,
  "type": "hive",
  "name": "impala_k8s", (1)
  "displayName": "impala_k8s",
  "description": "Service for Kubernetes Impala",
  "configs": {
    "username": "impala", (2)
    "password": "bigdata", (3)
    "ranger.plugin.audit.filters": "[ {'accessResult': 'DENIED', 'isAudited': true}, {'actions':['METADATA OPERATION'], 'isAudited': false}, {'users':['hive','hue'],'actions':['SHOW_ROLES'],'isAudited':false} ]",
    "jdbc.driverClassName": "org.apache.hive.jdbc.HiveDriver",
    "jdbc.url": "jdbc:impala://10.92.41.149:21050" (4)
  }
}

(1) The name of the Impala service in Ranger. Must be unique.
(2) The username for the service.
(3) The password for the service.
(4) The JDBC string for connecting to Impala, which is exposed by the load balancer.
-
Push the defined service to Ranger:
$ curl -u admin:<admin_pwd> -H "Content-Type: application/json" -X POST -d @ranger-impala-k8s.json http://<ranger-admin>:6080/service/public/v2/api/service
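After the POST request, it can be useful to confirm that the service is actually registered. The following is a minimal sketch, assuming that Ranger Admin listens on port 6080 as in the command above and that the service can be read back by name through the public v2 API. The function names `ranger_service_url` and `check_ranger_service` are hypothetical helpers, not part of Ranger.

```shell
#!/bin/sh
# Hypothetical helper: build the public v2 URL for looking up a service by name.
ranger_service_url() {
  # $1: Ranger Admin host, $2: service name
  printf 'http://%s:6080/service/public/v2/api/service/name/%s\n' "$1" "$2"
}

# Hypothetical helper: fetch the service definition from Ranger Admin.
# The -f flag makes curl exit with a non-zero status on an HTTP error response,
# so the function can be used in a shell condition.
check_ranger_service() {
  # $1: Ranger Admin host, $2: service name, $3: admin password
  curl -sf -u "admin:$3" "$(ranger_service_url "$1" "$2")"
}
```

For example, `check_ranger_service <ranger-admin> impala_k8s <admin_pwd>` should print the stored service definition as JSON if the service from this step was created.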
Step 2. Update Kubernetes secrets
-
Prepare a secret with Hadoop configuration. Your Impala’s Hadoop configuration secret should already contain the core-site.xml, hdfs-site.xml, and hive-site.xml keys. For Ranger integration, add the following keys:
-
ranger-impala-security.xml — Ranger plugin security configuration.
ranger-impala-security.xml

<?xml version="1.0" encoding="UTF-8"?>
<configuration>
  <property>
    <name>ranger.plugin.impala.policy.cache.dir</name>
    <value>/srv/ranger/impala/policycache</value>
  </property>
  <property>
    <name>ranger.plugin.impala.service.name</name>
    <value>impala_k8s</value> (1)
  </property>
  <property>
    <name>ranger.plugin.impala.policy.rest.url</name>
    <value>http://tsn-adps2-1.ru-central1.internal:6080</value> (2)
  </property>
  <property>
    <name>ranger.plugin.impala.policy.source.impl</name>
    <value>org.apache.ranger.admin.client.RangerAdminRESTClient</value>
  </property>
  <property>
    <name>ranger.plugin.impala.username</name>
    <value>impala</value> (3)
  </property>
  <property>
    <name>ranger.plugin.impala.password</name>
    <value>bigdata</value> (4)
  </property>
  <property>
    <name>ranger.plugin.impala.use.rangerGroups</name>
    <value>True</value>
  </property>
  <property>
    <name>ranger.plugin.impala.use.only.rangerGroups</name>
    <value>True</value>
  </property>
</configuration>

(1) A name of the Impala service in Ranger.
(2) URL to Ranger Admin.
(3) A username for the service.
(4) A password for the service.

If SSL is enabled, add the following property:

<property>
  <name>ranger.plugin.impala.policy.rest.ssl.config.file</name>
  <value>/opt/impala/conf/ranger-impala-policymgr-ssl.xml</value> (1)
</property>

(1) Name of the file with Ranger SSL configuration.
-
ranger-impala-audit.xml — Ranger audit configuration.
ranger-impala-audit.xml

<configuration>
  <property>
    <name>xasecure.audit.destination.solr</name>
    <value>true</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.batch.filespool.dir</name>
    <value>/srv/ranger/impala_k8s/audit_solr_spool</value>
  </property>
  <property>
    <name>xasecure.audit.destination.solr.zookeepers</name>
    <value>tsn-adps2-1.ru-central1.internal:2181/Arenadata.Hadoop-3.solr.server</value> (1)
  </property>
  <property>
    <name>xasecure.audit.is.enabled</name>
    <value>True</value>
  </property>
</configuration>

(1) A string for ZooKeeper connection in the <host-1>:<port-1>…<host-N>:<port-N>/Arenadata.Hadoop-<cluster_id>.solr.server format.

If Kerberos is enabled, add the following properties:

<property>
  <name>xasecure.audit.jaas.Client.loginModuleControlFlag</name>
  <value>required</value>
</property>
<property>
  <name>xasecure.audit.jaas.Client.loginModuleName</name>
  <value>com.sun.security.auth.module.Krb5LoginModule</value>
</property>
<property>
  <name>xasecure.audit.jaas.Client.option.keyTab</name>
  <value>/opt/impala/kerberos/keytab</value> (1)
</property>
<property>
  <name>xasecure.audit.jaas.Client.option.principal</name>
  <value>impala/impala-cloud.ru-central1.internal@RU-CENTRAL1.INTERNAL</value> (2)
</property>
<property>
  <name>xasecure.audit.jaas.Client.option.serviceName</name>
  <value>solr</value>
</property>
<property>
  <name>xasecure.audit.jaas.Client.option.storeKey</name>
  <value>False</value>
</property>
<property>
  <name>xasecure.audit.jaas.Client.option.useKeyTab</name>
  <value>True</value>
</property>

(1) Path to a keytab inside a pod.
(2) Principal name for Impala.
-
ranger-impala-policymgr-ssl.xml — Ranger policymgr-ssl configuration (if SSL is enabled).
ranger-impala-policymgr-ssl.xml

<configuration>
  <property>
    <name>xasecure.policymgr.clientssl.truststore</name>
    <value>/etc/ssl/truststore.jks</value> (1)
  </property>
  <property>
    <name>xasecure.policymgr.clientssl.truststore.credential.file</name>
    <value>jceks://file/opt/impala/conf/ranger-impala.jceks</value> (2)
  </property>
</configuration>

(1) The path to a truststore.
(2) The path to a file with credentials for truststore.
-
Update the secret for Hadoop configuration:
$ kubectl delete secret <hadoop-config> -n <impala-cluster-ns>
$ kubectl create secret generic <hadoop-config> -n <impala-cluster-ns> --from-file=<hadoop-conf-folder>

where:
-
<hadoop-config>: the name of the secret for Hadoop configuration.
-
<impala-cluster-ns>: the namespace that the Impala cluster uses.
-
<hadoop-conf-folder>: the folder with the files from which the secret is generated. If you pass file contents as string values inside values.yaml, specify the path to that file.
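Before recreating the secret, it may help to check the configuration folder locally. The sketch below assumes that the folder holds the five files named in this step and that the XML files list each property with `<name>` before `<value>`, as in the examples above. The functions `missing_conf_files` and `hadoop_xml_value` are hypothetical helpers written for this illustration. They use plain text matching rather than a real XML parser.

```shell
#!/bin/sh
# Hypothetical helper: print the required files that are missing from a folder.
# Returns 0 when all files are present and 1 otherwise.
missing_conf_files() {
  # $1: path to the Hadoop configuration folder
  rc=0
  for f in core-site.xml hdfs-site.xml hive-site.xml \
           ranger-impala-security.xml ranger-impala-audit.xml; do
    if [ ! -f "$1/$f" ]; then
      echo "$f"
      rc=1
    fi
  done
  return $rc
}

# Hypothetical helper: print the value of one property from a Hadoop-style
# XML file. It finds the line with the given <name> and prints the first
# <value> that appears on that line or on a later one.
hadoop_xml_value() {
  # $1: property name, $2: path to the XML file
  awk -v key="$1" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && match($0, /<value>[^<]*<\/value>/) {
      # Skip the 7 characters of "<value>" and drop the 8 of "</value>".
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$2"
}
```

For example, `hadoop_xml_value ranger.plugin.impala.service.name <hadoop-conf-folder>/ranger-impala-security.xml` should print the same name that was used for the service in Step 1 (`impala_k8s` in this guide). Remove the callout markers such as (1) from the files before using them.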
Step 3. Update the Impala cluster
-
Update the configuration file for the Impala cluster:
impala-cluster-values.yaml

image:
  registry: "<image_registry>"
  repository: "<image_repository>"
  tag: "<image_tag>"
  pullPolicy: Always
useRanger: true
clusterDomain: cluster.local
configsSecretName: <hadoop-config>
catalog:
coordinator:
executor:
  replicas: 2
statestore:

The crucial parameters should already be defined for an Impala cluster to work. However, you should set useRanger to true.
-
Update the Impala cluster installation:
$ helm upgrade --install impala-cluster oci://"$PRIVATE_REGISTRY"/adc-enterprise/charts/impala-cluster --version <version> -f impala-cluster-values.yaml --namespace <impala-cluster-ns> --create-namespace
-
Delete old pods so that the Impala operator creates new ones from an updated config:
$ kubectl delete pods -n <impala-cluster-ns> -l app.kubernetes.io/instance=impala
-
Check that all the pods are in the Running state:

$ kubectl get pods -n <impala-cluster-ns>

The expected output is:

NAME                           READY   STATUS    RESTARTS   AGE
impala-cluster-catalog-0       1/1     Running   0          65m
impala-cluster-coordinator-0   1/1     Running   0          65m
impala-cluster-executor-0      1/1     Running   0          65m
impala-cluster-executor-1      1/1     Running   0          65m
impala-cluster-statestore-0    1/1     Running   0          65m
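The pod check can also be scripted. The following is a minimal sketch, assuming the default table output of `kubectl get pods` with STATUS in the third column, as shown above. The function name `not_running_pods` is a hypothetical helper.

```shell
#!/bin/sh
# Hypothetical helper: read the `kubectl get pods` table on stdin and print
# the names of pods whose STATUS column is not "Running".
# The header row (first line) is skipped.
not_running_pods() {
  awk 'NR > 1 && $3 != "Running" { print $1 }'
}
```

For example, `kubectl get pods -n <impala-cluster-ns> | not_running_pods` prints nothing when every pod is in the Running state. Note that this checks only the STATUS column, not the READY column.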