Kerberos and SSL for Impala on Kubernetes

This article shows how to enable Kerberos authentication and SSL encryption for Impala running in Kubernetes.

The configuration scenario provided below assumes that you have deployed a clean, non-secured Impala cluster on Kubernetes as described in the Install Impala on Kubernetes article.

Prerequisites

Configuration steps

Step 1. Install Kerberos operator

Install the Kerberos operator and the Kerberos config using Helm as described in the Install Kerberos operator on Kubernetes article. Sample Helm values files for both charts are shown below.

ko-values.yaml
replicas: 1
image:
  registry: hub.arenadata.io (1)
  repository: adc-enterprise/kerberos-operator (2)
  pullPolicy: IfNotPresent
  tag: <tag> (3)

serviceAccount:
  create: true
  automount: true

service:
  type: ClusterIP
  port: 8443

payloadNamespaces:
  names: (4)
    - kerberos-prod
    - kerberos-staging
  allowClusterRole: false (5)
  deleteProtection: true (6)
  avoidCreation: false (7)

terminationGracePeriodSeconds: 10
1 Your image storage URL.
2 Path to the Kerberos operator image repository within your storage.
3 Version of the image that Kubernetes will use.
4 List of namespaces the operator manages.
5 Explicit opt-in for cluster-wide access. When true and payloadNamespaces.names is empty, the chart creates ClusterRole/ClusterRoleBinding for access to all namespaces.
6 Add the helm.sh/resource-policy: keep annotation to payload namespaces to prevent deletion on helm uninstall.
7 Skip creating namespace resources. Use only when namespaces already exist (e.g. created by Kerberos operator).
kc-values.yaml
ldapSecret:
  enabled: true
  provider: freeipa (1)
  address: ldap://tsn-freeipa.ru-central1.internal (2)
  adminUser: uid=admin,cn=users,cn=accounts,dc=ru-central1,dc=internal (3)
  adminPassword: bigdata (4)
  baseDN: cn=services,cn=accounts,dc=ru-central1,dc=internal (5)
  ca: | <pem-certificate> (6)

kdcConfig:
  labelSelector:
    env: prod
  libdefaults:
    debug: 'false'
    default_realm: RU-CENTRAL1.INTERNAL
    dns_lookup_kdc: 'false'
    dns_lookup_realm: 'false'
    udp_preference_limit: '1'
  realm: RU-CENTRAL1.INTERNAL (7)
  domainRealm:
    ru-central1.internal: RU-CENTRAL1.INTERNAL
  realms:
    RU-CENTRAL1.INTERNAL: |
      kdc = tsn-freeipa.ru-central1.internal (8)
      admin_server = tsn-freeipa.ru-central1.internal (9)
1 Type of Kerberos provider. Can be one of the following: ad, samba, freeipa.
2 LDAP connection URL.
3 Administrator username.
4 Administrator password.
5 Search base.
6 CA certificate used to trust the LDAP server’s TLS certificate.
7 Kerberos realm.
8 Host with KDC available.
9 Host with kadmin available.

The commands for installing Helm charts:

$ helm upgrade --install kerberos-operator oci://"$PRIVATE_REGISTRY"/adc-enterprise/charts/kerberos-operator --version <chart_version> -f ko-values.yaml --namespace kerberos-operator --create-namespace
$ helm upgrade --install kerberos-config oci://"$PRIVATE_REGISTRY"/adc-enterprise/charts/kerberos-config --version <chart_version> -f kc-values.yaml --namespace kerberos-operator --create-namespace

To verify the installation, use the commands:

$ kubectl get secrets -n kerberos-operator
$ kubectl get configmaps -n kerberos-operator

Sample output:

$ kubectl get secrets -n kerberos-operator
NAME                                      TYPE                                 DATA   AGE
kerberos-config-ldap-credentials          krb5.arenadata.io/ldap-credentials   5      7s
sh.helm.release.v1.kerberos-config.v1     helm.sh/release.v1                   1      7s
sh.helm.release.v1.kerberos-operator.v1   helm.sh/release.v1                   1      48m
$ kubectl get configmaps -n kerberos-operator
NAME               DATA   AGE
kube-root-ca.crt   1      50m
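
The check above can also be scripted. The following is a minimal sketch, not part of the product: the helper name has_secret_type is hypothetical, and the secret type it looks for (krb5.arenadata.io/ldap-credentials) is taken from the sample output above. The helper reads the kubectl get secrets listing from standard input.

```shell
# Hypothetical helper: succeeds if the `kubectl get secrets` listing on stdin
# contains at least one secret whose TYPE column equals the given type.
has_secret_type() {
  awk -v type="$1" 'NR > 1 && $2 == type { found = 1 } END { exit !found }'
}

# Usage (requires access to the cluster):
#   kubectl get secrets -n kerberos-operator \
#     | has_secret_type krb5.arenadata.io/ldap-credentials \
#     && echo "LDAP credentials secret found"
```

A non-zero exit status means that no secret of that type is listed, which may indicate that the kerberos-config chart was not installed into this namespace.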

Step 2. Create keytab and principals

  1. Create the impala_keytab.yaml manifest file with Impala principals. For example:

    apiVersion: krb5.arenadata.io/v1alpha1
    kind: Keytab
    metadata:
      name: impala-keytab
      namespace: impala
    spec:
      items:
        - realm: RU-CENTRAL1.INTERNAL (1)
          labelSelector:
            env: prod
          principals: (2)
            - HTTP/impala-cloud.ru-central1.internal
            - impala/impala-cloud.ru-central1.internal
            - impala/impala-jdbc.ru-central1.internal
      rotation:
        interval: 720h
        checkInterval: 1h
    1 Kerberos realm.
    2 Kerberos principals for accessing kerberized ADH services.
  2. Apply the configuration:

    $ kubectl apply -f impala_keytab.yaml
  3. Verify the keytab creation using the commands:

    $ kubectl get keytab -n impala
    $ kubectl get secret impala-keytab -n impala

    Sample output:

    NAME            ROTATION            READY             AGE    NEXTROTATION
    impala-keytab   RotationScheduled   SecretGenerated   2d1h   Next rotation required 2026-05-22T14:13:11Z
    NAME            TYPE                       DATA   AGE
    impala-keytab   krb5.arenadata.io/bundle   2      2d1h
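
The keytab status can be checked in a script as well. This is a minimal sketch under two assumptions: the helper name keytab_ready is hypothetical, and the value SecretGenerated in the READY column is treated as the success state only because the sample output above shows it. Other states the operator may report are not covered here.

```shell
# Hypothetical helper: succeeds if the keytab named $1 is listed with
# READY = SecretGenerated in the `kubectl get keytab` output read from stdin.
keytab_ready() {
  awk -v name="$1" '$1 == name && $3 == "SecretGenerated" { ok = 1 } END { exit !ok }'
}

# Usage (requires access to the cluster):
#   kubectl get keytab -n impala | keytab_ready impala-keytab \
#     && echo "keytab secret has been generated"
```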

Step 3. Update Kubernetes secrets

Your Impala installation already uses a Kubernetes secret with ADH configuration files to work with unprotected ADH services. To enable Kerberos/SSL and allow Impala to work with SSL-protected ADH services, adjust the ADH configuration files and re-create the secret.

  1. Adjust the core-site.xml, hdfs-site.xml, and hive-site.xml configuration files. The updated files should include the properties as shown below.

    core-site.xml
    <?xml version="1.0"?>
    <configuration>
            <property>
                    <name>fs.defaultFS</name>
                    <value>hdfs://adh</value>
            </property>
            <property>
                    <name>dfs.nameservices</name>
                    <value>adh</value>
            </property>
            <property>
                    <name>hadoop.security.authentication</name>
                    <value>kerberos</value>
            </property>
            <property>
                    <name>dfs.ha.namenodes.adh</name>
                    <value>nn_ka-adh-1,nn_ka-adh-2</value>
            </property>
            <property>
                    <name>dfs.namenode.rpc-address.adh.nn_ka-adh-1</name>
                    <value>ka-adh-1.ru-central1.internal:8020</value>
            </property>
            <property>
                    <name>dfs.namenode.rpc-address.adh.nn_ka-adh-2</name>
                    <value>ka-adh-2.ru-central1.internal:8020</value>
            </property>
            <property>
                    <name>dfs.namenode.kerberos.principal</name>
                    <value>nn/_HOST@RU-CENTRAL1.INTERNAL</value>
            </property>
    </configuration>
    hdfs-site.xml
    <?xml version="1.0"?>
    <configuration>
            <property>
                    <name>dfs.nameservices</name>
                    <value>adh</value>
            </property>
            <property>
                    <name>dfs.ha.namenodes.adh</name>
                    <value>nn_ka-adh-1,nn_ka-adh-2</value>
            </property>
            <property>
                    <name>dfs.namenode.rpc-address.adh.nn_ka-adh-1</name>
                    <value>ka-adh-1.ru-central1.internal:8020</value>
            </property>
            <property>
                    <name>dfs.namenode.rpc-address.adh.nn_ka-adh-2</name>
                    <value>ka-adh-2.ru-central1.internal:8020</value>
            </property>
            <property>
                    <name>dfs.client.failover.proxy.provider.adh</name>
                    <value>org.apache.hadoop.hdfs.server.namenode.ha.ObserverReadProxyProvider</value>
            </property>
    </configuration>
    hive-site.xml
    <?xml version="1.0"?>
    <configuration>
        <property>
            <name>hive.metastore.uris</name>
            <value>thrift://ka-adh-2.ru-central1.internal:9083</value>
        </property>
        <property>
            <name>metastore.use.SSL</name>
            <value>True</value>
        </property>
        <property>
            <name>metastore.truststore.password</name>
            <value>bigdata</value>
        </property>
        <property>
            <name>metastore.truststore.path</name>
            <value>/etc/ssl/truststore.jks</value>
        </property>
        <property>
            <name>hive.metastore.sasl.enabled</name>
            <value>True</value>
        </property>
        <property>
            <name>hive.metastore.kerberos.principal</name>
            <value>hive/_HOST@RU-CENTRAL1.INTERNAL</value>
        </property>
    </configuration>

    where RU-CENTRAL1.INTERNAL is your Kerberos realm.

  2. If you use Impala with Ranger, update the Ranger configuration according to the corresponding instructions.

  3. Re-create the Kubernetes secret:

    $ kubectl delete secret <hadoop-conf> -n <impala-cluster-ns>
    $ kubectl create secret generic <hadoop-conf> -n <impala-cluster-ns> --from-file=core-site.xml --from-file=hdfs-site.xml --from-file=hive-site.xml

    where:

    • <hadoop-conf> — name of the Kubernetes secret with ADH configs.

    • <impala-cluster-ns> — namespace used by Impala.

  4. Create a JKS truststore. This truststore should include all certificates from your ADH cluster. Create a secret for the truststore:

    $ kubectl create secret generic ca-store --namespace <impala-cluster-ns> --from-file=truststore.jks=/etc/ssl/truststore.jks
  5. Generate a certificate for Impala:

    $ openssl req -x509 -nodes -days 365 -newkey rsa:2048 \
    -keyout impala-jdbc.ru-central1.internal.key \
    -out impala-jdbc.ru-central1.internal.crt \
    -subj "/CN=impala-jdbc.ru-central1.internal"
  6. Create a secret for incoming JDBC connections:

    $ kubectl -n <impala-cluster-ns> create secret generic impala-tls --from-file=cert.crt=impala-jdbc.ru-central1.internal.crt --from-file=crt.key=impala-jdbc.ru-central1.internal.key
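
Before re-creating the secret with the ADH configuration files, it may help to confirm that the edited files carry the expected values. The sketch below is illustrative only: the helper name xml_property is hypothetical, and it assumes that each name and value element sits on its own line, as in the listings above. It is a plain text scan, not a full XML parser.

```shell
# Hypothetical helper: prints the value of a property from a Hadoop-style
# XML configuration file. Arguments: file path, property name.
xml_property() {
  awk -v name="$2" '
    # Remember that the requested <name> element has been seen.
    index($0, "<name>" name "</name>") { found = 1 }
    # Print the first <value> element that follows it and stop.
    found && match($0, /<value>.*<\/value>/) {
      print substr($0, RSTART + 7, RLENGTH - 15)
      exit
    }
  ' "$1"
}

# Usage:
#   [ "$(xml_property core-site.xml hadoop.security.authentication)" = "kerberos" ] \
#     || echo "core-site.xml does not enable Kerberos authentication"
```

An empty result means that the property was not found in the file.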

Step 4. Update Impala cluster configuration

  1. Modify the impala_cluster_values.yaml cluster configuration file by adding Kerberos/SSL parameters to it. The updated file should look as follows:

    image:
      registry: "<registry>"
      repository: "<image>"
      tag: "<tag>"
      pullPolicy: Always
    
    useRanger: false
    clusterDomain: cluster.local
    configsSecretName: "hadoop-conf"
    
    kerberos: (1)
      realm: RU-CENTRAL1.INTERNAL
      hostname: impala-jdbc.ru-central1.internal
      keytab:
        create: false
        secretName: impala-keytab
        labelSelector:
          env: prod
    
    ssl:
      secretName: ca-store (2)
      trustStoreKey: truststore.jks
    
    tls:
      secretName: impala-tls (3)
      certificateKey: cert.crt
      privateKey: crt.key
    #  clientCaCertificate: ca.pem
    
    catalog:
    
    coordinator:
    
    executor:
      replicas: 2
    
    statestore:
    1 Kerberos parameters for accessing the Impala cluster.
    2 Truststore secret name.
    3 Impala certificate secret name.
    NOTE

    When editing configuration files for Impala, ensure that the following parameters are identical:

  2. Update the Impala cluster installation:

    $ helm upgrade --install impala-cluster oci://"$PRIVATE_REGISTRY"/adc-enterprise/charts/impala-cluster --version <version> -f impala_cluster_values.yaml --namespace <impala-cluster-ns> --create-namespace
  3. Delete the old pods so that the Impala operator creates new ones from the updated configuration:

    $ kubectl delete pods -n <impala-cluster-ns> -l app.kubernetes.io/instance=impala
  4. Check that all the pods are in the Running state:

    $ kubectl get pods -n <impala-cluster-ns>

    The expected output is:

    NAME                           READY   STATUS    RESTARTS   AGE
    impala-cluster-catalog-0       1/1     Running   0          65m
    impala-cluster-coordinator-0   1/1     Running   0          65m
    impala-cluster-executor-0      1/1     Running   0          65m
    impala-cluster-executor-1      1/1     Running   0          65m
    impala-cluster-statestore-0    1/1     Running   0          65m
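
The pod check can be automated, for example, in a deployment script. The following is a minimal sketch: the helper name all_pods_running is hypothetical, and it relies only on the column layout of the kubectl get pods output shown above (READY in the second column, STATUS in the third).

```shell
# Hypothetical helper: reads `kubectl get pods` output from stdin and succeeds
# only if every listed pod is Running with all of its containers ready.
all_pods_running() {
  awk '
    NR == 1 { next }                 # skip the header row
    {
      seen = 1
      split($2, ready, "/")          # READY column, for example "1/1"
      if ($3 != "Running" || ready[1] != ready[2]) bad = 1
    }
    END { exit (bad || !seen) }      # fail on a bad pod or an empty listing
  '
}

# Usage (requires access to the cluster):
#   kubectl get pods -n <impala-cluster-ns> | all_pods_running \
#     && echo "all Impala pods are running"
```

As an alternative that needs no parsing, kubectl wait --for=condition=Ready pod --all -n <impala-cluster-ns> --timeout=300s blocks until the pods report the Ready condition or the timeout expires.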

Step 5. Connect to Impala via JDBC

To access Impala from outside the Kubernetes cluster, expose the service using one of the supported publication methods, for example, through a load balancer or an Ingress controller. All settings related to exposing a service, including DNS, annotations, Ingress settings, and load balancing rules, depend on your platform and should be specified according to your Kubernetes environment.

  1. Get the external IP address of your Ingress controller or load balancer, for example, from the output of kubectl get svc -n <impala-cluster-ns>:

    impala-lb                    LoadBalancer   10.96.231.158   10.92.42.144   21050:32154/TCP,26000:30753/TCP,24000:32645/TCP   25h

    Copy the external IP address (10.92.42.144 in this example) for the next steps.

  2. Connect to the Impala cluster over JDBC, for example, using DBeaver. For this, the JDBC connection string looks as follows:

    jdbc:impala://<external-ip>:21050/default;AuthMech=1;KrbServiceName=impala;KrbHostFQDN=impala-jdbc.ru-central1.internal;SSL=1;SSLTrustStore=<path>/truststore.jks;SSLTrustStorePwd=<password>;httpPath=cliservice

    where:

    • <external-ip> — the external IP address of your load balancer or Ingress controller.

    • KrbServiceName=impala — Kerberos service name.

    • KrbHostFQDN=impala-jdbc.ru-central1.internal — Kerberos host name specified in the Impala cluster values file (impala_cluster_values.yaml).

    • SSLTrustStore=<path>/truststore.jks — path to the truststore with certificates used by DBeaver.

    • SSLTrustStorePwd=<password> — password for accessing the truststore.
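
To show how the parameters above map onto the connection string, here is a minimal sketch that assembles it from its variable parts. The helper name impala_jdbc_url is hypothetical. The port, Kerberos service name, and the remaining options are copied from the example string above and may differ in your environment.

```shell
# Hypothetical helper: assembles the JDBC connection string from its parts.
# Arguments: external IP, Kerberos host FQDN, truststore path, truststore password.
impala_jdbc_url() {
  printf 'jdbc:impala://%s:21050/default;AuthMech=1;KrbServiceName=impala;KrbHostFQDN=%s;SSL=1;SSLTrustStore=%s;SSLTrustStorePwd=%s;httpPath=cliservice\n' \
    "$1" "$2" "$3" "$4"
}

# Usage:
#   impala_jdbc_url 10.92.42.144 impala-jdbc.ru-central1.internal \
#     /path/to/truststore.jks "<password>"
```

Note that passing the password as an argument exposes it in the shell history, so for anything beyond a quick test it is safer to read it from a protected file or prompt.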
