High availability in Hive

High availability (HA) mode for Hive is enabled by default.

More specifically, it is the HiveServer2 component that operates in HA mode. HA mode is enabled regardless of how many HiveServer2 components the cluster contains. When a new HiveServer2 component is added to the cluster, it is automatically registered in ZooKeeper, and the JDBC connection string is updated accordingly.
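To illustrate what registration in ZooKeeper looks like, the sketch below parses the name of a HiveServer2 znode. It assumes the upstream Apache Hive naming convention (serverUri=<host>:<port>;version=<version>;sequence=<number>); the host name and version in the example are hypothetical, and your distribution may publish additional data.

```python
def parse_hs2_znode(name: str) -> dict:
    """Parse a HiveServer2 znode name into its fields.

    Assumes the upstream Apache Hive convention:
    serverUri=<host>:<port>;version=<version>;sequence=<number>
    """
    # Split "key=value" pairs separated by semicolons.
    fields = dict(part.split("=", 1) for part in name.split(";") if "=" in part)

    # serverUri holds the host and Thrift port of one HiveServer2 instance.
    uri = fields.get("serverUri", "")
    host, sep, port = uri.rpartition(":")
    if not sep:  # no port present in serverUri
        host, port = uri, ""

    sequence = fields.get("sequence")
    return {
        "host": host,
        "port": int(port) if port.isdigit() else None,
        "version": fields.get("version"),
        "sequence": int(sequence) if sequence and sequence.isdigit() else None,
    }


# Hypothetical znode name, for illustration only.
example = "serverUri=hs2-1.example.com:10000;version=3.1.3;sequence=0000000004"
instance = parse_hs2_znode(example)
```

A JDBC client in HA mode reads such entries from the configured namespace and connects to one of the listed instances, so clients do not need to know HiveServer2 hosts in advance.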

JDBC connection string in HA

In HA mode, the JDBC connection string includes the ZooKeeper ensemble. You can find the up-to-date JDBC connection string on the Hive Info page in ADCM (Clusters → Services → Hive → Info).
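If you need to check which ZooKeeper hosts and namespace a connection string refers to, a small parser can help. The following is a minimal sketch, not part of Hive or ADCM: the function name parse_ha_jdbc_url, the host names, and the namespace value are hypothetical, and 2181 is assumed as the port when none is given.

```python
def parse_ha_jdbc_url(url: str):
    """Return (zk_hosts, options) extracted from an HA-style Hive JDBC URL."""
    prefix = "jdbc:hive2://"
    if not url.startswith(prefix):
        raise ValueError("not a Hive JDBC URL: " + url)

    rest = url[len(prefix):]
    # Everything before the first ';' is the ZooKeeper ensemble
    # (optionally followed by '/' or '/<database>').
    authority, _, params = rest.partition(";")
    authority = authority.split("/", 1)[0]

    zk_hosts = []
    for item in authority.split(","):
        host, _, port = item.partition(":")
        # Assumption: fall back to the usual ZooKeeper client port.
        zk_hosts.append((host, int(port) if port else 2181))

    # The remaining 'key=value' pairs are session parameters.
    options = dict(p.split("=", 1) for p in params.split(";") if "=" in p)
    return zk_hosts, options


# Hypothetical connection string, for illustration only.
sample = (
    "jdbc:hive2://zk1.example.com:2181,zk2.example.com:2181/;"
    "serviceDiscoveryMode=zooKeeper;"
    "zooKeeperNamespace=arenadata/cluster/1/hiveserver2"
)
hosts, opts = parse_ha_jdbc_url(sample)
```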

Below are examples of JDBC connection strings depending on the security mechanism being used.

  • HA without SSL/Kerberos

  • HA with SSL

  • HA with SSL+Kerberos

High availability, insecure connection without SSL/Kerberos:

jdbc:hive2://<cluster_host_0>:2181,<cluster_host_1>:2181,<cluster_host_N>:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=arenadata/cluster/<cluster_id>/<namespace>

High availability, secure connection with SSL:

jdbc:hive2://<cluster_host_0>:2181,<cluster_host_1>:2181,<cluster_host_N>:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=arenadata/cluster/<cluster_id>/<namespace>;ssl=true;sslTrustStore=/tmp/truststore.jks;trustStorePassword=bigdata

High availability, secure connection with SSL+Kerberos:

jdbc:hive2://<cluster_host_0>:2181,<cluster_host_1>:2181,<cluster_host_N>:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=arenadata/cluster/<cluster_id>/<namespace>;ssl=true;sslTrustStore=/tmp/truststore.jks;trustStorePassword=bigdata;principal=hive/_HOST@EXAMPLE.COM

Where EXAMPLE.COM is your Kerberos realm, for example RU-CENTRAL1.INTERNAL.
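The three variants above differ only in the parameters appended after the ZooKeeper ensemble, so they can be assembled programmatically. The sketch below is an illustration under stated assumptions: the function build_ha_jdbc_url and its arguments are hypothetical names, the host names and namespace are placeholders, and the output uses the /; separator shown in the first example.

```python
def build_ha_jdbc_url(zk_hosts, namespace, zk_port=2181,
                      truststore=None, truststore_password=None,
                      principal=None):
    """Assemble an HA-style Hive JDBC URL from its components."""
    # ZooKeeper ensemble: comma-separated host:port pairs.
    ensemble = ",".join(f"{host}:{zk_port}" for host in zk_hosts)
    parts = [
        f"jdbc:hive2://{ensemble}/",
        "serviceDiscoveryMode=zooKeeper",
        f"zooKeeperNamespace={namespace}",
    ]
    # SSL parameters are added only when a truststore is supplied.
    if truststore:
        parts += ["ssl=true", f"sslTrustStore={truststore}"]
        if truststore_password:
            parts.append(f"trustStorePassword={truststore_password}")
    # Kerberos: service principal of HiveServer2.
    if principal:
        parts.append(f"principal={principal}")
    return ";".join(parts)


# Placeholder values, for illustration only.
zk = ["zk1.example.com", "zk2.example.com"]
ns = "arenadata/cluster/1/hiveserver2"

plain_url = build_ha_jdbc_url(zk, ns)
ssl_url = build_ha_jdbc_url(zk, ns, truststore="/tmp/truststore.jks",
                            truststore_password="bigdata")
kerberos_url = build_ha_jdbc_url(zk, ns, truststore="/tmp/truststore.jks",
                                 truststore_password="bigdata",
                                 principal="hive/_HOST@EXAMPLE.COM")
```

In practice, prefer copying the connection string from the Hive Info page in ADCM; a helper like this is mainly useful for templating client configurations.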

NOTE
You can still connect to a HiveServer2 instance directly in the non-HA mode using the Thrift port (10000 by default). In this case, the JDBC string looks like jdbc:hive2://<cluster_host>:10000/.
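As a sketch of the direct, non-HA case, the snippet below builds the connection string for a single HiveServer2 host and the argument list for a Beeline call. The function names and the host name are hypothetical; -u (JDBC URL) and -n (user name) are standard Beeline options. The command is only constructed here, not executed.

```python
def direct_jdbc_url(host: str, port: int = 10000) -> str:
    """JDBC URL for one HiveServer2 instance, bypassing ZooKeeper discovery."""
    return f"jdbc:hive2://{host}:{port}/"


def beeline_command(url: str, user: str = None) -> list:
    """Argument list for running Beeline against the given JDBC URL."""
    cmd = ["beeline", "-u", url]
    if user:
        cmd += ["-n", user]
    return cmd


# Hypothetical host name, for illustration only.
url = direct_jdbc_url("hs2-1.example.com")
cmd = beeline_command(url, user="hive")
```

Keep in mind that a direct connection depends on that particular HiveServer2 instance being up, so it does not benefit from HA.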