Use the Beeline shell with Hive
HiveServer2 supports the Beeline command shell. It is a JDBC client based on the SQLLine CLI. The detailed SQLLine documentation applies to Beeline as well.
For information on replacing the Hive CLI implementation with Beeline, and the reasons to do so, see the Apache Hive documentation.
The Beeline shell works in both embedded and remote modes. In the embedded mode, it runs an embedded Hive (similar to Hive CLI), whereas the remote mode is for connecting to a separate HiveServer2 process over Thrift. Starting with Hive 0.14, when Beeline is used with HiveServer2, it also prints the HiveServer2 log messages for the queries it executes to STDERR. The remote HiveServer2 mode is recommended for production use, as it is more secure and doesn’t require granting users direct HDFS/metastore access.
In the remote mode, HiveServer2 only accepts valid Thrift calls – even in HTTP mode, the message body should contain Thrift payloads.
Examples
Beeline basics
This section demonstrates basic interactions with Beeline.
To open the Beeline shell, run the following:
$ bin/beeline
The sample Beeline output is below.
SLF4J: Class path contains multiple SLF4J bindings.
SLF4J: Found binding in [jar:file:/usr/lib/hive/lib/log4j-slf4j-impl-2.10.0.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: Found binding in [jar:file:/usr/lib/hadoop/lib/slf4j-log4j12-1.7.25.jar!/org/slf4j/impl/StaticLoggerBinder.class]
SLF4J: See http://www.slf4j.org/codes.html#multiple_bindings for an explanation.
SLF4J: Actual binding is of type [org.apache.logging.slf4j.Log4jLoggerFactory]
Beeline version 3.1.1 by Apache Hive
To connect to Hive, execute the following command in the Beeline shell:
!connect jdbc:hive2://localhost:10000 scott tiger
This produces an output like the one below.
Connecting to jdbc:hive2://localhost:10000
Connected to: Apache Hive (version 3.1.1)
Driver: Hive JDBC (version 3.1.1)
Transaction isolation: TRANSACTION_REPEATABLE_READ
NOTE
If Hive runs in the high availability mode, use the special JDBC string format.
To view all Hive tables, run the following in the Beeline shell:
show tables;
The command produces the following output:
+-------------------+
|     tab_name      |
+-------------------+
| primitives        |
| src               |
| src1              |
| src_json          |
| src_sequencefile  |
| src_thrift        |
| srcbucket         |
| srcbucket2        |
| srcpart           |
+-------------------+
9 rows selected (1.079 seconds)
You can also specify the connection parameters in the command line. This means you can find the command with the connection string from your UNIX shell history.
$ beeline -u jdbc:hive2://localhost:10000/default -n scott -w password_file
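For scripted use, the same options can be wrapped in a small shell function. The following is a minimal sketch rather than part of Beeline itself: the function name hive_query and the BEELINE_BIN override are hypothetical, and the -e option (which passes a single statement to execute) does not appear in the command above. The -u, -n, and -w options are the ones already shown.

```shell
# Hypothetical wrapper around the beeline invocation shown above.
# BEELINE_BIN is an assumed override for pointing at a specific Beeline install.
hive_query() {
  local url="$1" user="$2" pwfile="$3" query="$4"
  # -u: JDBC URL, -n: user name, -w: file containing the password,
  # -e: a single statement to execute
  "${BEELINE_BIN:-beeline}" -u "$url" -n "$user" -w "$pwfile" -e "$query"
}
```

For example, hive_query jdbc:hive2://localhost:10000/default scott password_file 'show tables;' would run the statement and return whatever exit status Beeline reports.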
Beeline with NoSASL connection
If you would like to connect using the NOSASL mode, specify the authentication mode explicitly:
!connect jdbc:hive2://<host>:<port>/<db>;auth=noSasl hiveuser pass
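When the same URL is passed on a UNIX command line instead of being typed inside Beeline, the semicolon has to be quoted, otherwise the shell treats it as a command separator. The sketch below only prints the resulting command; the function name nosasl_url and the host, port, and database values are placeholders chosen for illustration.

```shell
# Hypothetical helper: build a HiveServer2 JDBC URL that requests NOSASL mode.
nosasl_url() {
  local host="$1" port="$2" db="$3"
  printf 'jdbc:hive2://%s:%s/%s;auth=noSasl\n' "$host" "$port" "$db"
}

url="$(nosasl_url localhost 10000 default)"
# Print the command with the URL in double quotes: an unquoted ';'
# would end the shell command before the auth parameter.
printf 'beeline -u "%s" -n hiveuser\n' "$url"
```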
See the available Beeline commands at the Beeline command line page.
Beeline output formats
In Beeline, the results can be displayed in different formats.
The format mode can be set with the outputformat option.
Type | Description |
---|---|
table | The result is displayed in a table. A row of the result corresponds to a row in the table and the values in one row are displayed in separate columns in the table. This is the default format mode |
vertical | Each row of the result is displayed in a block of key/value format, where the keys are the names of the columns |
xmlattr | The result is displayed in an XML format where each row is a |
xmlelements | The result is displayed in an XML format where each row is a |
json | (Hive 4.0) The result is displayed in JSON format where each row is a |
jsonfile | (Hive 4.0) The result is displayed in JSON format where each row is a distinct JSON object. This matches the expected format for a table created as JSONFILE format |
sv output formats | The values of a row are separated by different delimiters. There are five separated-value output formats available: csv, tsv, csv2, tsv2, and dsv |
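In scripts, a mistyped format name is easy to miss, so it can help to check the name before passing it to Beeline. The following is a minimal sketch under that assumption: the function name outputformat_opt is hypothetical, and the accepted names are simply the ones listed above.

```shell
# Hypothetical helper: accept only the format names listed above and
# print the corresponding --outputformat option.
outputformat_opt() {
  case "$1" in
    table|vertical|xmlattr|xmlelements|json|jsonfile|csv|tsv|csv2|tsv2|dsv)
      printf '%s\n' "--outputformat=$1" ;;
    *)
      # Report unknown names on stderr and signal failure to the caller.
      echo "unknown output format: $1" >&2
      return 1 ;;
  esac
}

outputformat_opt csv2
```

The printed option can be passed straight to Beeline, for example: beeline -u jdbc:hive2://localhost:10000/default "$(outputformat_opt csv2)" -f query.hql > result.csv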
Cancelling the query
When you press CTRL+C in the Beeline shell while a query is running, Beeline attempts to cancel the query by closing the socket connection to HiveServer2.
This behavior is enabled only when hive.server2.close.session.on.disconnect is set to true.
Starting from Hive 2.2.0, Beeline does not exit the command line shell if a running query is cancelled as a result of CTRL+C.
If you wish to exit the shell, you can press CTRL+C a second time while the query is being cancelled.
However, if there is no query currently running, the first CTRL+C will exit the Beeline shell.
This behavior is similar to how the Hive CLI handles CTRL+C.
TIP
!quit is the recommended command to exit the Beeline shell.
Background query in terminal script
You can run Beeline disconnected from a terminal for batch processing and automation scripts by using commands such as nohup and disown.
Some versions of the Beeline client may require a workaround to allow the nohup command to correctly put the Beeline process in the background without stopping it.
In that case, update the following environment variable:
export HADOOP_CLIENT_OPTS="$HADOOP_CLIENT_OPTS -Djline.terminal=jline.UnsupportedTerminal"
Running with nohup and & will place the process in the background and allow the terminal to disconnect while keeping the Beeline process running.
nohup beeline --silent=true --showHeader=true --outputformat=dsv -f query.hql </dev/null > /tmp/output.log 2> /tmp/error.log &
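The environment variable and the nohup command above can be combined into one reusable launcher. This is a sketch under stated assumptions, not an official script: the function name run_beeline_bg and the BEELINE_BIN override are hypothetical, while the Beeline options and the HADOOP_CLIENT_OPTS setting are the ones shown above.

```shell
# Hypothetical launcher: start a Beeline script in the background and
# print the PID of the background process.
# BEELINE_BIN is an assumed override; it defaults to "beeline" on PATH.
run_beeline_bg() {
  local script="$1" out="$2" err="$3"
  # Workaround from above: keep jline from interacting with the terminal.
  export HADOOP_CLIENT_OPTS="${HADOOP_CLIENT_OPTS:-} -Djline.terminal=jline.UnsupportedTerminal"
  # stdin, stdout, and stderr are all redirected so the process holds
  # no reference to the terminal.
  nohup "${BEELINE_BIN:-beeline}" --silent=true --showHeader=true --outputformat=dsv \
    -f "$script" </dev/null >"$out" 2>"$err" &
  echo "$!"
}
```

A call such as pid=$(run_beeline_bg query.hql /tmp/output.log /tmp/error.log) keeps the PID, so a later kill -0 "$pid" can check whether the job is still running.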