Use JDBC to connect to Hive

Konstantin Alpashkin

Contents

JDBC connection string
JDBC data types mapping
Examples
- Java
- Connect using DBeaver

This section describes ways to connect to HiveServer2 using the JDBC interface which is the recommended way for client interaction with HiveServer2. The section contains a sample Java program that opens a connection to HiveServer2 and also shows how to connect to Hive using DBeaver which is a database tool that relies on JDBC.

To connect to Hive that runs in a kerberized cluster, be sure to complete Kerberos prerequisites, like installing Kerberos client, getting a ticket, configuring krb5.conf, etc. For more information, see Kerberos overview.

JDBC connection string

Below are examples of JDBC connection strings depending on the security mechanism used in an ADH cluster. You can find the up-to-date JDBC connection string on the Hive Info page in ADCM (Clusters → Services → Hive → Info).

Without SSL/Kerberos
HTTP mode
SSL enabled
SSL+Kerberos

Connects to Hive in the high availability mode, the connection is insecure:

jdbc:hive2://<cluster_host_0>:2181,<cluster_host_1>:2181,<cluster_host_N>:2181/;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=arenadata/cluster/<cluster_id>/<namespace>

Connects to Hive in the HTTP mode, the connection is insecure:

jdbc:hive2://<cluster_host_0>:2181,<cluster_host_1>:2181,<cluster_host_N>:2181/;transportMode=http;httpPath=<hive_endpoint>

Where <hive_endpoint> is the HTTP endpoint defined as hive.server2.thrift.http.path in hive-site.xml.

Connects to Hive in the high availability mode, the connection is secured with SSL:

jdbc:hive2://<cluster_host_0>:2181,<cluster_host_1>:2181,<cluster_host_N>:2181;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=arenadata/cluster/<cluster_id>/<namespace>;ssl=true;sslTrustStore=/tmp/truststore.jks;trustStorePassword=bigdata

Connects to Hive in the high availability mode, the connection is secured with SSL+Kerberos:

jdbc:hive2://<cluster_host_0>:2181,<cluster_host_1>:2181,<cluster_host_N>:2181;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=arenadata/cluster/<cluster_id>/<namespace>;ssl=true;sslTrustStore=/tmp/truststore.jks;trustStorePassword=bigdata;principal=hive/_HOST@EXAMPLE.COM

Where EXAMPLE.COM is your Kerberos realm, for example RU-CENTRAL1.INTERNAL.

The sample JDBC connection strings above are intended to connect to Hive in the high availability (HA) mode. However, you can still connect to a HiveServer2 instance directly in the non-HA mode using the Thrift server port (10000 by default). In this case, the JDBC string looks like jdbc:hive2://<cluster_host>:10000/.

JDBC data types mapping

The following table describes the mapping between Hive/Java data types.

Hive data type

Java data type

Comment

TINYINT

byte

Signed or unsigned 1-byte integer

SMALLINT

short

Signed 2-byte integer

INT

int

Signed 4-byte integer

BIGINT

long

Signed 8-byte integer

FLOAT

double

Single-precision number

DOUBLE

double

Double-precision number

DECIMAL

java.math.BigDecimal

Fixed-precision decimal value

BOOLEAN

boolean

Single bit (0 or 1)

STRING

String

Character string or variable-length character string

TIMESTAMP

java.sql.Timestamp

Date and time values

BINARY

String

Binary data

ARRAY

String (json-encoded)

Values of the same data type

MAP

String (json-encoded)

Key/value pairs

STRUCT

String (json-encoded)

Structured data

Examples

Java

The following snippet shows how to connect to HiveServer2 using standard JDBC tools. This program uses the org.apache.hive.jdbc.HiveDriver driver class from hive-jdbc-3.1.1-arenadata-standalone.jar. You can pull this JAR from Maven repository. This standalone JAR also contains all the dependencies required to use the driver on a standalone (non-ADH) host.

import java.sql.SQLException;
import java.sql.Connection;
import java.sql.ResultSet;
import java.sql.Statement;
import java.sql.DriverManager;

public class HiveJdbcClient {
  private static String driverName = "org.apache.hive.jdbc.HiveDriver"; (1)

  public static void main(String[] args) throws SQLException {
      try {
      Class.forName(driverName);
    } catch (ClassNotFoundException e) {
      e.printStackTrace();
      System.exit(1);
    }

    Connection con = DriverManager.getConnection("jdbc:hive2://ka-adh-1.ru-central1.internal:2181," +
"ka-adh-3.ru-central1.internal:2181,ka-adh-2.ru-central1.internal:2181/;serviceDiscoveryMode=zooKeeper;" +
"zooKeeperNamespace=arenadata/cluster/2/hiveserver2", "hive", ""); (2)

    // Connection con = DriverManager.getConnection("jdbc:hive2://ka-adh-1.ru-central1.internal:10000/test_db", "hive", ""); (3)

    Statement stmt = con.createStatement();
    String tableName = "testHiveDriverTable";
    stmt.execute("drop table if exists " + tableName);
    stmt.execute("create table " + tableName + " (key int, value string)");
    // show tables
    String sql = "show tables '" + tableName + "'";
    System.out.println("Running query: " + sql);
    ResultSet res = stmt.executeQuery(sql);
    while (res.next()) {
      System.out.println(res.getString(1));
    }

    // other Hive queries
  }
}

1	The JDBC driver class name.
2	The JDBC connection string to connect to Hive in the HA mode.
3	The JDBC connection string to connect to Hive in non-HA mode, using Thrift server port.

Connect using DBeaver

DBeaver is an open-source database tool that connects to Hive using JDBC. To connect to HiveServer2 from DBeaver, all you need is to provide an appropriate JDBC connection string, and Hive will download the default driver org.apache.hive.jdbc.HiveDriver automatically.

Hive connection settings

Found a mistake? Seleсt text and press Ctrl+Enter to report it