Use filters in HBase

Overview

HBase in ADH offers a Thrift filter language. It lets you filter the results of reading data from HBase with the get or scan operations. Filtering occurs on the server side, so it does not reduce the load on HBase itself. It does, however, reduce the load on the network, because less data is transmitted to the client. Filters can be used both through the Java API and in the HBase shell.

HBase filter syntax

Simple filter

A simple filter is specified by a string of the following kind:

"<FilterName> (<arg1>, <arg2>, ..., <argN>)"

where:

  • <FilterName> is a name of one of the individual filters;

  • <arg1>, <arg2> etc. are the arguments of that filter.

Example of a simple filter usage in HBase shell:

scan 'articles', { FILTER => "ColumnPaginationFilter (2, 1)"}

The full set of arguments forms the filter condition. Depending on the filter, zero or more arguments may be required.

Arguments may have the following types:

Timestamps

Timestamps in HBase shell table scan results are shown in human-readable format. However, when a filter requires a timestamp as an argument value, it should be specified in Unix epoch format. To convert timestamps from one format to another, use online tools like EpochConverter.

NOTE
HBase requires millisecond precision for timestamps. If the converter returns the Unix epoch value in seconds, add the milliseconds manually: append three digits (for example, 000) to the converted value.
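
As an alternative to an online converter, the conversion can be sketched in a few lines of Ruby, which the HBase shell (a JRuby console) should also accept. This is an illustration only: the helper names to_hbase_millis and from_hbase_millis are hypothetical, and the sketch assumes that the timestamps shown by the shell are in UTC, which may not hold on your cluster.

```ruby
# Convert a UTC wall-clock time to the epoch-millisecond value HBase expects.
# Assumption: the time displayed by the shell is in UTC.
def to_hbase_millis(year, month, day, hour, min, sec, millis = 0)
  # The seventh argument of Time.utc is microseconds.
  t = Time.utc(year, month, day, hour, min, sec, millis * 1000)
  t.to_i * 1000 + t.usec / 1000
end

# Convert an epoch-millisecond value back to a shell-like display string.
def from_hbase_millis(ms)
  Time.at(ms / 1000, (ms % 1000) * 1000).utc.strftime('%Y-%m-%dT%H:%M:%S.%L')
end

puts to_hbase_millis(2024, 7, 30, 8, 10, 20)        # 1722327020000
puts to_hbase_millis(2024, 7, 30, 8, 10, 19, 297)   # 1722327019297
puts from_hbase_millis(1722327019297)               # 2024-07-30T08:10:19.297
```

Integer arithmetic is used instead of floating-point multiplication so that the millisecond part is not affected by rounding.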

Comparison operators and comparators

Comparison operators and comparators are used in filter arguments to compose conditions for lexicographic matches and comparisons.

The following comparison operators are used as filter arguments.

Comparison operators

Syntax    Description
<         Less
>         Greater
=         Equal
<=        Less or equal
>=        Greater or equal
!=        Not equal

The following comparators are used as filter arguments.

Comparators

  • BinaryComparator. Lexicographically compares against the specified string. Example: (>, 'binary:arc') matches everything lexicographically greater than arc.

  • BinaryPrefixComparator. Lexicographically compares against the specified string, but only up to the length of this string. Example: (=, 'binaryprefix:bot') matches everything that begins with bot.

  • RegexStringComparator. Matches against the specified regular expression. Only the = and != comparisons can be used with this comparator. Example: (!=, 'regexstring:^be.*ng$') matches any string except those beginning with be and ending with ng.

  • SubStringComparator. Searches for the given substring, case insensitive. Only the = and != comparisons can be used with this comparator. Example: (!=, 'substring:con') matches any string except those containing con.
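
To make the matching rules of the comparators more concrete, here is a rough local model of them in plain Ruby. It is a sketch for reasoning about filter conditions, not HBase code: all function names are hypothetical, only ASCII strings are considered, and the regular expression is assumed to be applied as a partial (find-style) match using Ruby's regex engine rather than Java's.

```ruby
# Translate a comparison result (-1, 0, 1) and an operator into true/false.
def compare(cmp, op)
  case op
  when '<'  then cmp < 0
  when '<=' then cmp <= 0
  when '='  then cmp == 0
  when '!=' then cmp != 0
  when '>=' then cmp >= 0
  when '>'  then cmp > 0
  else raise ArgumentError, "unknown operator: #{op}"
  end
end

# binary: compare the whole value with the target string.
def binary_match?(value, op, target)
  compare(value <=> target, op)
end

# binaryprefix: compare only the first prefix.length characters.
def binary_prefix_match?(value, op, prefix)
  compare(value[0, prefix.length] <=> prefix, op)
end

# regexstring: only = and != make sense here.
def regex_match?(value, op, pattern)
  found = !(value =~ Regexp.new(pattern)).nil?
  op == '=' ? found : !found
end

# substring: case-insensitive containment, only = and != make sense here.
def substring_match?(value, op, sub)
  found = value.downcase.include?(sub.downcase)
  op == '=' ? found : !found
end

p binary_match?('arm', '>', 'arc')                 # true
p binary_prefix_match?('bottle', '=', 'bot')       # true
p binary_prefix_match?('Abbott Delia', '<', 'B')   # true
p regex_match?('being', '!=', '^be.*ng$')          # false
p substring_match?('Second', '!=', 'con')          # false
```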

Filter syntax rules

 
Filter syntax obeys the following rules:

  • Filter expression must be enclosed in double quotes (").

  • Filter name must be a single word. All ASCII characters are suitable except parentheses, single quotes, and whitespaces.

  • Arguments must be separated by commas and altogether enclosed in parentheses.

  • String arguments must be each enclosed in single quotes (').

  • Arguments of other types (boolean, integer, comparison operator, etc.) should not be enclosed in quotes.

  • Filter arguments can contain any ASCII characters. However, possible single quotes in the argument must be preceded with an extra single quote acting as an escape character.
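
The quoting rules above can be sketched as a small Ruby helper that assembles a simple filter string. The names quote_arg and simple_filter are hypothetical and are not part of HBase. The sketch treats every Ruby String as a string argument and writes all other values as-is, so it does not cover comparison operators.

```ruby
# Wrap a string argument in single quotes and double any embedded single quote.
def quote_arg(text)
  "'" + text.gsub("'", "''") + "'"
end

# Build "<FilterName> (<arg1>, ..., <argN>)".
# Strings are quoted; other values (integers, booleans) are written unquoted.
def simple_filter(name, *args)
  rendered = args.map { |a| a.is_a?(String) ? quote_arg(a) : a.to_s }
  "#{name} (#{rendered.join(', ')})"
end

puts simple_filter('ColumnPaginationFilter', 2, 1)   # ColumnPaginationFilter (2, 1)
puts simple_filter('PrefixFilter', "O'Neil")         # PrefixFilter ('O''Neil')
puts simple_filter('KeyOnlyFilter')                  # KeyOnlyFilter ()
```

Since the resulting value is an ordinary string, it can presumably be passed to the FILTER option of a scan or get command in the HBase shell in place of a literal.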

Compound filter

A compound filter consists of individual filters united by logical operators. There are binary and unary logical operators. Binary operators unite the filters to the left and right of them:

  • AND — the key/value is only returned if it satisfies both filters.

  • OR — the key/value is returned if it satisfies either filter.

Unary operators precede the filter and modify its behavior:

  • SKIP — if any of the key/value pairs does not satisfy the filter condition, the entire row is skipped.

  • WHILE — for a particular row, the key/value pairs keep being emitted until one of them fails to satisfy the filter condition.
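
The difference between the two unary operators can be modelled locally in Ruby. This sketch follows the descriptions above and is not how HBase implements them: a row is represented as an array of [column, value] pairs, the filter condition as a block, and the function names are hypothetical.

```ruby
# SKIP: if any key/value pair fails the condition, the whole row is dropped.
def skip_filter(row)
  row.all? { |cell| yield(cell) } ? row : []
end

# WHILE: key/value pairs are emitted until the first one that fails the condition.
def while_filter(row)
  row.take_while { |cell| yield(cell) }
end

row = [['basic:age', '62'], ['location:country', 'USA'], ['location:state', 'TX']]
not_usa = ->(cell) { cell[1] != 'USA' }   # condition: the value is not USA

p skip_filter(row, &not_usa)    # []
p while_filter(row, &not_usa)   # [["basic:age", "62"]]
```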

To control the order in which individual filters are evaluated in a compound filter, use parentheses. Example:

(<Filter1> OR <Filter2>) AND (<Filter3> OR <Filter4>)

First, <Filter1> and <Filter2> are combined by the OR operator. Next, the same is done for <Filter3> and <Filter4>. Finally, the two results are combined by the AND operator.

Logical operators and parentheses have the following order of precedence, highest to lowest:

  1. parentheses;

  2. SKIP and WHILE (share the same precedence);

  3. AND;

  4. OR.

Example of a compound filter usage in HBase shell:

scan 'articles', { FILTER => "ColumnPaginationFilter (2, 1) AND PrefixFilter ('co')"}
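
When compound filters are assembled programmatically, explicit parentheses help keep the evaluation order obvious. Below is a minimal Ruby sketch of such a composer; the helpers all_of, any_of, and group are hypothetical names, and the sketch only builds the expression string without validating it.

```ruby
# Join filter expressions with the AND operator.
def all_of(*filters)
  filters.join(' AND ')
end

# Join filter expressions with the OR operator.
def any_of(*filters)
  filters.join(' OR ')
end

# Wrap an expression in parentheses so that it is evaluated as a unit.
def group(expr)
  "(#{expr})"
end

either_prefix = group(any_of("PrefixFilter ('Roy')", "PrefixFilter ('Wade')"))
puts all_of(either_prefix, 'ColumnPaginationFilter (2, 1)')
# (PrefixFilter ('Roy') OR PrefixFilter ('Wade')) AND ColumnPaginationFilter (2, 1)
```

Without the call to group, AND would bind tighter than OR, and the expression would be read as the first PrefixFilter OR the combination of the second PrefixFilter and ColumnPaginationFilter.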

Individual filters in HBase

The individual filters available in HBase are listed below along with their syntax. Example queries are executed against a test table loaded with values from the test file people.csv. This file is an extended version of the test file used in the Bulk loading via built-in MapReduce jobs article. You can follow the instructions in that article to create the same test table that is used here. If you do, make sure you change the command for the ImportTsv job and use the -Dimporttsv.columns=HBASE_ROW_KEY,basic:age,location:town,location:state,location:country flag instead of -Dimporttsv.columns=HBASE_ROW_KEY,basic:age. Change the file and table names accordingly if necessary.

The test table's row keys are the names of the people. The first column family, basic, has a single qualifier called age. The second column family, location, has three qualifiers called country, state, and town. There are 997 rows in the table.

KeyOnlyFilter

 
Returns the key component of each key/value. Takes no arguments.

Syntax:

"KeyOnlyFilter ()"

Command example:

scan 'people', { FILTER => "KeyOnlyFilter ()" }

Result (the last eight keys):

 ...
 Zimmerman Gene                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=
 Zimmerman Gene                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=
 Zimmerman Gene                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=
 Zimmerman Gene                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=
 Zimmerman Madge                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=
 Zimmerman Madge                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=
 Zimmerman Madge                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=
 Zimmerman Madge                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=
997 row(s)
Took 0.5022 seconds

FirstKeyOnlyFilter

 
Returns the first key/value from each row. Takes no arguments.

Syntax:

"FirstKeyOnlyFilter ()"

Command example:

scan 'people', { FILTER => "FirstKeyOnlyFilter ()" }

Result (the first and last five key/value pairs):

ROW                                              COLUMN+CELL
 Abbott Delia                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=62
 Abbott Howard                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=24
 Abbott Jack                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Adams Clyde                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Aguilar Myrtie                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=23
 ...
 Young Della                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=21
 Young Josephine                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Young Mattie                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=39
 Zimmerman Gene                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=35
 Zimmerman Madge                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=46
997 row(s)
Took 0.1058 seconds

PrefixFilter

 
Returns all key/value pairs from rows whose keys begin with the prefix specified by the argument. Takes one argument: row key prefix.

Syntax:

"PrefixFilter ('Roy')"

Command example:

scan 'people', { FILTER => "PrefixFilter ('Roy')" }

Result:

ROW                                             COLUMN+CELL
 Roy Alfred                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=55
 Roy Alfred                                     column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Roy Alfred                                     column=location:state, timestamp=2024-07-30T08:10:19.297, value=Chihuahua
 Roy Alfred                                     column=location:town, timestamp=2024-07-30T08:10:19.297, value=Juarez
 Roy Belle                                      column=basic:age, timestamp=2024-07-30T08:10:19.297, value=35
 Roy Belle                                      column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Roy Belle                                      column=location:state, timestamp=2024-07-30T08:10:19.297, value=AZ
 Roy Belle                                      column=location:town, timestamp=2024-07-30T08:10:19.297, value=Nogales
 Roy Lora                                       column=basic:age, timestamp=2024-07-30T08:10:19.297, value=52
 Roy Lora                                       column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Roy Lora                                       column=location:state, timestamp=2024-07-30T08:10:19.297, value=Yucatan
 Roy Lora                                       column=location:town, timestamp=2024-07-30T08:10:19.297, value=Uman
 Roy Ronald                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=63
 Roy Ronald                                     column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Roy Ronald                                     column=location:state, timestamp=2024-07-30T08:10:19.297, value=California Baja
 Roy Ronald                                     column=location:town, timestamp=2024-07-30T08:10:19.297, value=Mexicali
4 row(s)
Took 0.0252 seconds

ColumnPrefixFilter

 
Returns all key/value pairs from a column with a qualifier that begins with the prefix specified by the argument. Takes one argument: prefix of the column qualifier.

Syntax:

"ColumnPrefixFilter ('tow')"

Command example:

scan 'people', { FILTER => "ColumnPrefixFilter ('tow')" }

Since all rows of the test table have non-empty values for all columns, this filter returns either zero results if no column qualifier has the specified prefix, or all 997 of them if such column qualifier exists. The command given above results in 997 key/value pairs, the first and last five of which are shown below:

ROW                                             COLUMN+CELL
 Abbott Delia                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Dallas
 Abbott Howard                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Baton Rouge
 Abbott Jack                                    column=location:town, timestamp=2024-07-30T08:10:19.297, value=Juarez
 Adams Clyde                                    column=location:town, timestamp=2024-07-30T08:10:19.297, value=Lafayette
 Aguilar Myrtie                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Uman
 ...
 Young Della                                    column=location:town, timestamp=2024-07-30T08:10:19.297, value=Chetumal
 Young Josephine                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Redding
 Young Mattie                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=San Jose
 Zimmerman Gene                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Tijuana
 Zimmerman Madge                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Nogales
997 row(s)
Took 0.1360 seconds

MultipleColumnPrefixFilter

 
Returns all key/value pairs from columns with qualifiers that begin with any of the prefixes specified by the arguments. Takes one or more arguments: prefixes of the column qualifiers. In the case of a single argument, works the same as the ColumnPrefixFilter filter.
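
The prefix matching described above can be modelled locally in Ruby against the qualifiers of the test table. This is an illustration under simplifying assumptions, not HBase code; the function name matching_qualifiers is hypothetical.

```ruby
# Column qualifiers of the test table.
QUALIFIERS = %w[age country state town].freeze

# Select the qualifiers that begin with at least one of the given prefixes.
def matching_qualifiers(qualifiers, *prefixes)
  qualifiers.select { |q| q.start_with?(*prefixes) }
end

p matching_qualifiers(QUALIFIERS, 'sta', 'tow')   # ["state", "town"]
p matching_qualifiers(QUALIFIERS, 'to', 'tow')    # ["town"]
p matching_qualifiers(QUALIFIERS, 'x')            # []
```

The second call shows that overlapping prefixes select a qualifier only once.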

Syntax:

"MultipleColumnPrefixFilter ('sta', 'tow')"

Command example:

scan 'people', { FILTER => "MultipleColumnPrefixFilter ('sta', 'tow')" }

The same situation as in the previous example (the ColumnPrefixFilter filter) applies here: since all rows of the test table have non-empty values for all columns, this filter returns either zero results if no column qualifier has any of the specified prefixes, or all 997 of them per prefix specified if respective qualifiers exist. If one of the specified prefixes is the prefix of another (e.g. to and tow), the number of results does not multiply. The command given above results in 1994 key/value pairs, the first and last six of which are shown below:

ROW                                             COLUMN+CELL
 Abbott Delia                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Abbott Delia                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Dallas
 Abbott Howard                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Abbott Howard                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Baton Rouge
 Abbott Jack                                    column=location:state, timestamp=2024-07-30T08:10:19.297, value=Chihuahua
 Abbott Jack                                    column=location:town, timestamp=2024-07-30T08:10:19.297, value=Juarez
 ...
 Young Mattie                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=CA
 Young Mattie                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=San Jose
 Zimmerman Gene                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=California Baja
 Zimmerman Gene                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Tijuana
 Zimmerman Madge                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=AZ
 Zimmerman Madge                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Nogales
997 row(s)
Took 0.1360 seconds

ColumnCountGetFilter

 
Returns the first N columns of a row, where N is the argument value. This filter only works correctly with single rows, so it should be used with the get command. Using it with the scan command does not cause an error, but produces an incorrect result. Takes one argument: the maximum number of columns to return.

Syntax:

"ColumnCountGetFilter (3)"

Command example:

get 'people', 'Abbott Delia', { FILTER => "ColumnCountGetFilter (3)" }

Result:

COLUMN                                         CELL
 basic:age                                     timestamp=2024-07-30T08:10:19.297, value=62
 location:country                              timestamp=2024-07-30T08:10:19.297, value=USA
 location:state                                timestamp=2024-07-30T08:10:19.297, value=TX
1 row(s)
Took 0.6243 seconds

PageFilter

 
Returns the first N rows from each region of the table, where N is the argument value. Takes one argument: the page size, that is, the maximum number of rows returned per region.
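
Because the limit applies to each region separately, a scan over a table with several regions can return more rows than the argument value. The following Ruby sketch models this behavior locally. It is not HBase code: the function names are hypothetical, and the region split points F, K, P, and W are an assumption chosen to match the test table used in this article.

```ruby
# Split points of the regions (assumption: the layout of the test table).
SPLITS = %w[F K P W].freeze

# Index of the region holding a row key: the number of split points <= key.
def region_of(key, splits)
  splits.count { |s| key >= s }
end

# Take the first page_size row keys from every region (keys must be sorted).
def page_per_region(keys, splits, page_size)
  keys.group_by { |k| region_of(k, splits) }
      .values
      .flat_map { |region_keys| region_keys.first(page_size) }
end

keys = ['Abbott Delia', 'Abbott Howard', 'Abbott Jack', 'Adams Clyde',
        'Farmer Alan', 'Keller Elmer', 'Keller Flora', 'Wade Janie']

p page_per_region(keys, SPLITS, 2)
# ["Abbott Delia", "Abbott Howard", "Farmer Alan", "Keller Elmer", "Keller Flora", "Wade Janie"]
```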

Syntax:

"PageFilter (2)"

Command example:

scan 'people', { FILTER => "PageFilter (2)" }

Since the test table was created with five regions split at F, K, P, and W, this command will return two rows with four key/value pairs apiece from each region. Result:

ROW                                             COLUMN+CELL
 Abbott Delia                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=62
 Abbott Delia                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Abbott Delia                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Abbott Delia                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Dallas
 Abbott Howard                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=24
 Abbott Howard                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Abbott Howard                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Abbott Howard                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Baton Rouge
 Farmer Alan                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=24
 Farmer Alan                                    column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Farmer Alan                                    column=location:state, timestamp=2024-07-30T08:10:19.297, value=Quintana Roo
 Farmer Alan                                    column=location:town, timestamp=2024-07-30T08:10:19.297, value=Cancun
 Farmer Dean                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=65
 Farmer Dean                                    column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Farmer Dean                                    column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Farmer Dean                                    column=location:town, timestamp=2024-07-30T08:10:19.297, value=Baton Rouge
 Keller Elmer                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Keller Elmer                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Keller Elmer                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=Quintana Roo
 Keller Elmer                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Chetumal
 Keller Flora                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=23
 Keller Flora                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Keller Flora                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Keller Flora                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Houston
 Padilla Ethan                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=61
 Padilla Ethan                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Padilla Ethan                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=NV
 Padilla Ethan                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Novac
 Padilla Scott                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=28
 Padilla Scott                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Padilla Scott                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Padilla Scott                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Beaumont
 Wade Janie                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=25
 Wade Janie                                     column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Wade Janie                                     column=location:state, timestamp=2024-07-30T08:10:19.297, value=California Baja
 Wade Janie                                     column=location:town, timestamp=2024-07-30T08:10:19.297, value=Ensenada
 Wade Jerome                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=27
 Wade Jerome                                    column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Wade Jerome                                    column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Wade Jerome                                    column=location:town, timestamp=2024-07-30T08:10:19.297, value=Houston
10 row(s)
Took 0.0138 seconds

ColumnPaginationFilter

 
Returns the number of columns specified by the first argument (limit) after the number of columns specified by the second argument (offset). Takes two arguments: limit and offset.
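
The roles of limit and offset can be illustrated with a short Ruby model applied to the columns of one row of the test table. This is a sketch, not HBase code; the function name paginate_columns is hypothetical.

```ruby
# Skip the first `offset` columns of a row, then return at most `limit` columns.
def paginate_columns(columns, limit, offset)
  columns.drop(offset).first(limit)
end

columns = %w[basic:age location:country location:state location:town]

p paginate_columns(columns, 2, 1)   # ["location:country", "location:state"]
p paginate_columns(columns, 3, 2)   # ["location:state", "location:town"]
```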

Syntax:

"ColumnPaginationFilter (2, 1)"

Command example:

scan 'people', { FILTER => "ColumnPaginationFilter (2, 1)" }

This command results in 1994 key/value pairs, the first and last six of which are shown below:

ROW                                            COLUMN+CELL
 Abbott Delia                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Abbott Delia                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Abbott Howard                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Abbott Howard                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Abbott Jack                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Abbott Jack                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=Chihuahua
 ...
 Young Mattie                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Young Mattie                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=CA
 Zimmerman Gene                                column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Zimmerman Gene                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=California Baja
 Zimmerman Madge                               column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Zimmerman Madge                               column=location:state, timestamp=2024-07-30T08:10:19.297, value=AZ
997 row(s)
Took 1.0474 seconds

InclusiveStopFilter

 
Scans the table row by row until it finds the row with the specified key, then returns all rows up to and including the one where the key was found. Takes one argument: row key.
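
As a rough local model, the behavior described above amounts to taking sorted row keys up to and including the stop key. The Ruby sketch below is an illustration only; the function name inclusive_stop is hypothetical, and the model assumes that the row keys are already sorted lexicographically.

```ruby
# Return the row keys from the start of the table up to and including stop_key.
def inclusive_stop(sorted_keys, stop_key)
  sorted_keys.take_while { |key| key <= stop_key }
end

keys = ['Alexander Gregory', 'Alexander Leon', 'Allen Austin', 'Farmer Alan']

p inclusive_stop(keys, 'Allen Austin')
# ["Alexander Gregory", "Alexander Leon", "Allen Austin"]
```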

Syntax:

"InclusiveStopFilter ('Allen Austin')"

Command example:

scan 'people', { FILTER => "InclusiveStopFilter ('Allen Austin')" }

Result:

ROW                                            COLUMN+CELL
 Abbott Delia                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=62
 Abbott Delia                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Abbott Delia                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Abbott Delia                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Dallas
 Abbott Howard                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=24
 Abbott Howard                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Abbott Howard                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Abbott Howard                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Baton Rouge
 Abbott Jack                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Abbott Jack                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Abbott Jack                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=Chihuahua
 Abbott Jack                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Juarez
 Adams Clyde                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Adams Clyde                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Adams Clyde                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Adams Clyde                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Lafayette
 Aguilar Myrtie                                column=basic:age, timestamp=2024-07-30T08:10:19.297, value=23
 Aguilar Myrtie                                column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Aguilar Myrtie                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=Yucatan
 Aguilar Myrtie                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Uman
 Aguilar Terry                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=65
 Aguilar Terry                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Aguilar Terry                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=CA
 Aguilar Terry                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Fresno
 Alexander Derrick                             column=basic:age, timestamp=2024-07-30T08:10:19.297, value=46
 Alexander Derrick                             column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alexander Derrick                             column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Alexander Derrick                             column=location:town, timestamp=2024-07-30T08:10:19.297, value=Dallas
 Alexander Gregory                             column=basic:age, timestamp=2024-07-30T08:10:19.297, value=54
 Alexander Gregory                             column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alexander Gregory                             column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Alexander Gregory                             column=location:town, timestamp=2024-07-30T08:10:19.297, value=Lafayette
 Alexander Leon                                column=basic:age, timestamp=2024-07-30T08:10:19.297, value=42
 Alexander Leon                                column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alexander Leon                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=AZ
 Alexander Leon                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Kingman
 Allen Austin                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=34
 Allen Austin                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Allen Austin                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=NV
 Allen Austin                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Primm
10 row(s)
Took 0.7167 seconds

TimestampsFilter

 
Returns all key/value pairs with timestamps matching any of the timestamps specified by the arguments. Takes any number of arguments: timestamps.

Syntax:

"TimestampsFilter (1721203180857, 1721316861863)"

All the key/value pairs in the test table have the same timestamp, so for a more concise and illustrative output you might want to tweak a couple of them using the put command:

put 'people', 'Young Mattie', 'location:town', 'Carlsbad', 1722327020000
put 'people', 'Young Mattie', 'location:state', 'NM', 1722327020000
put 'people', 'Yates Douglas', 'location:state', 'NM', 1722327020000
put 'people', 'Yates Douglas', 'location:town', 'Albuquerque', 1722327020000

Command example:

scan 'people', { FILTER => "TimestampsFilter (1722327020000)" }

Result:

ROW                                            COLUMN+CELL
 Yates Douglas                                 column=location:state, timestamp=2024-07-30T08:10:20, value=NM
 Yates Douglas                                 column=location:town, timestamp=2024-07-30T08:10:20, value=Albuquerque
 Young Mattie                                  column=location:state, timestamp=2024-07-30T08:10:20, value=NM
 Young Mattie                                  column=location:town, timestamp=2024-07-30T08:10:20, value=Carlsbad
2 row(s)
Took 0.1001 seconds

RowFilter

 
Checks all rows and returns all key/value pairs of a row if its row key satisfies the comparison specified by the arguments. This filter does not work with the get command. Takes two arguments: a comparison operator and a comparator.

Syntax:

"RowFilter (>=, 'binaryprefix:B')"

Command example:

scan 'people', { FILTER => "RowFilter (<, 'binaryprefix:B')" }

This filter should return all key/value pairs from the rows whose keys begin with something lexicographically less than B. For the test table, this means all rows for people whose last names begin with A. Result:

ROW                                            COLUMN+CELL
 Abbott Delia                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=62
 Abbott Delia                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Abbott Delia                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Abbott Delia                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Dallas
 Abbott Howard                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=24
 Abbott Howard                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Abbott Howard                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Abbott Howard                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Baton Rouge
 Abbott Jack                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Abbott Jack                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Abbott Jack                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=Chihuahua
 Abbott Jack                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Juarez
 Adams Clyde                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Adams Clyde                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Adams Clyde                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Adams Clyde                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Lafayette
 Aguilar Myrtie                                column=basic:age, timestamp=2024-07-30T08:10:19.297, value=23
 Aguilar Myrtie                                column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Aguilar Myrtie                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=Yucatan
 Aguilar Myrtie                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Uman
 Aguilar Terry                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=65
 Aguilar Terry                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Aguilar Terry                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=CA
 Aguilar Terry                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Fresno
 Alexander Derrick                             column=basic:age, timestamp=2024-07-30T08:10:19.297, value=46
 Alexander Derrick                             column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alexander Derrick                             column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Alexander Derrick                             column=location:town, timestamp=2024-07-30T08:10:19.297, value=Dallas
 Alexander Gregory                             column=basic:age, timestamp=2024-07-30T08:10:19.297, value=54
 Alexander Gregory                             column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alexander Gregory                             column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Alexander Gregory                             column=location:town, timestamp=2024-07-30T08:10:19.297, value=Lafayette
 Alexander Leon                                column=basic:age, timestamp=2024-07-30T08:10:19.297, value=42
 Alexander Leon                                column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alexander Leon                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=AZ
 Alexander Leon                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Kingman
 Allen Austin                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=34
 Allen Austin                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Allen Austin                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=NV
 Allen Austin                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Primm
 Allison Dustin                                column=basic:age, timestamp=2024-07-30T08:10:19.297, value=54
 Allison Dustin                                column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Allison Dustin                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=Yucatan
 Allison Dustin                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Merida
 Alvarado Dominic                              column=basic:age, timestamp=2024-07-30T08:10:19.297, value=63
 Alvarado Dominic                              column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Alvarado Dominic                              column=location:state, timestamp=2024-07-30T08:10:19.297, value=California Baja
 Alvarado Dominic                              column=location:town, timestamp=2024-07-30T08:10:19.297, value=Ensenada
 Alvarado Maria                                column=basic:age, timestamp=2024-07-30T08:10:19.297, value=58
 Alvarado Maria                                column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Alvarado Maria                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=Chihuahua
 Alvarado Maria                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Chihuahua
 Alvarado Melvin                               column=basic:age, timestamp=2024-07-30T08:10:19.297, value=34
 Alvarado Melvin                               column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alvarado Melvin                               column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Alvarado Melvin                               column=location:town, timestamp=2024-07-30T08:10:19.297, value=Austin
 Alvarado Timothy                              column=basic:age, timestamp=2024-07-30T08:10:19.297, value=27
 Alvarado Timothy                              column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alvarado Timothy                              column=location:state, timestamp=2024-07-30T08:10:19.297, value=CA
 Alvarado Timothy                              column=location:town, timestamp=2024-07-30T08:10:19.297, value=San Francisco
 Alvarez Bessie                                column=basic:age, timestamp=2024-07-30T08:10:19.297, value=34
 Alvarez Bessie                                column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alvarez Bessie                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=NV
 Alvarez Bessie                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Las Vegas
 Alvarez Bruce                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=60
 Alvarez Bruce                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alvarez Bruce                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=CA
 Alvarez Bruce                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Modesto
 Alvarez Harvey                                column=basic:age, timestamp=2024-07-30T08:10:19.297, value=57
 Alvarez Harvey                                column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Alvarez Harvey                                column=location:state, timestamp=2024-07-30T08:10:19.297, value=NV
 Alvarez Harvey                                column=location:town, timestamp=2024-07-30T08:10:19.297, value=Fallon
 Alvarez Jacob                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=49
 Alvarez Jacob                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Alvarez Jacob                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=Yucatan
 Alvarez Jacob                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Merida
 Anderson Lester                               column=basic:age, timestamp=2024-07-30T08:10:19.297, value=62
 Anderson Lester                               column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Anderson Lester                               column=location:state, timestamp=2024-07-30T08:10:19.297, value=Sonora
 Anderson Lester                               column=location:town, timestamp=2024-07-30T08:10:19.297, value=Hermosillo
 Anderson Lucile                               column=basic:age, timestamp=2024-07-30T08:10:19.297, value=33
 Anderson Lucile                               column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Anderson Lucile                               column=location:state, timestamp=2024-07-30T08:10:19.297, value=Sonora
 Anderson Lucile                               column=location:town, timestamp=2024-07-30T08:10:19.297, value=Hermosillo
 Anderson Ora                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=26
 Anderson Ora                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Anderson Ora                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=Sonora
 Anderson Ora                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Nogales
 Andrews Caleb                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=22
 Andrews Caleb                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Andrews Caleb                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=Quintana Roo
 Andrews Caleb                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Cancun
 Andrews Lucy                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=33
 Andrews Lucy                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Andrews Lucy                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=AZ
 Andrews Lucy                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Kingman
 Andrews Noah                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=63
 Andrews Noah                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Andrews Noah                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=California Baja
 Andrews Noah                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Ensenada
 Andrews Susan                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=48
 Andrews Susan                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Andrews Susan                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=Yucatan
 Andrews Susan                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Merida
 Armstrong Isabella                            column=basic:age, timestamp=2024-07-30T08:10:19.297, value=49
 Armstrong Isabella                            column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Armstrong Isabella                            column=location:state, timestamp=2024-07-30T08:10:19.297, value=Sonora
 Armstrong Isabella                            column=location:town, timestamp=2024-07-30T08:10:19.297, value=Hermosillo
 Armstrong Marie                               column=basic:age, timestamp=2024-07-30T08:10:19.297, value=25
 Armstrong Marie                               column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Armstrong Marie                               column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Armstrong Marie                               column=location:town, timestamp=2024-07-30T08:10:19.297, value=Santa Fe
 Arnold Bettie                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=18
 Arnold Bettie                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Arnold Bettie                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=NV
 Arnold Bettie                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Primm
 Atkins Daisy                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=51
 Atkins Daisy                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Atkins Daisy                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=Quintana Roo
 Atkins Daisy                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Chetumal
 Atkins Gene                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=37
 Atkins Gene                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Atkins Gene                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Atkins Gene                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Dallas
 Austin Bertie                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=57
 Austin Bertie                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=MEX
 Austin Bertie                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=California Baja
 Austin Bertie                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Tijuana
 Austin Eugene                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=64
 Austin Eugene                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Austin Eugene                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=LA
 Austin Eugene                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Lafayette
 Austin Travis                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=53
 Austin Travis                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Austin Travis                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Austin Travis                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=El Paso
34 row(s)
Took 0.6767 seconds
FamilyFilter

 
Compares the column family name of each key/value pair with the comparator and returns only the pairs for which the comparison holds. Takes two arguments: a comparison operator and a comparator.

Syntax:

"FamilyFilter (<=, 'binaryprefix:c')"

Command example:

scan 'people', { FILTER => "FamilyFilter (<, 'binaryprefix:c')" }

This filter returns all key/value pairs whose column family name starts with a character lexicographically less than c. In the test table, this means the whole basic:age column. Result (the first and last five rows):

ROW                                                      COLUMN+CELL
 Abbott Delia                                            column=basic:age, timestamp=2024-07-30T08:10:19.297, value=62
 Abbott Howard                                           column=basic:age, timestamp=2024-07-30T08:10:19.297, value=24
 Abbott Jack                                             column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Adams Clyde                                             column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Aguilar Myrtie                                          column=basic:age, timestamp=2024-07-30T08:10:19.297, value=23
 ...
 Young Della                                             column=basic:age, timestamp=2024-07-30T08:10:19.297, value=21
 Young Josephine                                         column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Young Mattie                                            column=basic:age, timestamp=2024-07-30T08:10:19.297, value=39
 Zimmerman Gene                                          column=basic:age, timestamp=2024-07-30T08:10:19.297, value=35
 Zimmerman Madge                                         column=basic:age, timestamp=2024-07-30T08:10:19.297, value=46
997 row(s)
Took 0.4836 seconds
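The prefix comparison behind this result can be modelled in a few lines of plain Python. This is only an illustrative sketch, not HBase code: the helper name binaryprefix_less is made up for this example, and it assumes that a binaryprefix comparator looks only at the first len(prefix) bytes of the family name.

```python
def binaryprefix_less(name: bytes, prefix: bytes) -> bool:
    """Model of (<, 'binaryprefix:...'): compare only the leading bytes of the name."""
    return name[:len(prefix)] < prefix


# The two column families of the test table.
families = [b"basic", b"location"]

# "b" sorts before "c", while "l" sorts after it, so only "basic" is kept.
kept = [family for family in families if binaryprefix_less(family, b"c")]
print(kept)  # [b'basic']
```

Because only the first byte is compared here, any family whose name starts with a or b would pass, regardless of the rest of the name.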
QualifierFilter

 
Compares the column qualifier of each key/value pair with the comparator and returns only the pairs for which the comparison holds. Takes two arguments: a comparison operator and a comparator.

Syntax:

"QualifierFilter (=, 'binary:town')"

Command example:

scan 'people', { FILTER => "QualifierFilter (=, 'binary:town')" }

This filter returns all key/value pairs whose column qualifier is exactly town. Result (the first and last five rows):

ROW                                                      COLUMN+CELL
 Abbott Delia                                            column=location:town, timestamp=2024-07-30T08:10:19.297, value=Dallas
 Abbott Howard                                           column=location:town, timestamp=2024-07-30T08:10:19.297, value=Baton Rouge
 Abbott Jack                                             column=location:town, timestamp=2024-07-30T08:10:19.297, value=Juarez
 Adams Clyde                                             column=location:town, timestamp=2024-07-30T08:10:19.297, value=Lafayette
 Aguilar Myrtie                                          column=location:town, timestamp=2024-07-30T08:10:19.297, value=Uman
 ...
 Young Della                                             column=location:town, timestamp=2024-07-30T08:10:19.297, value=Chetumal
 Young Josephine                                         column=location:town, timestamp=2024-07-30T08:10:19.297, value=Redding
 Young Mattie                                            column=location:town, timestamp=2024-07-30T08:10:20, value=Carlsbad
 Zimmerman Gene                                          column=location:town, timestamp=2024-07-30T08:10:19.297, value=Tijuana
 Zimmerman Madge                                         column=location:town, timestamp=2024-07-30T08:10:19.297, value=Nogales
997 row(s)
Took 0.2112 seconds
ValueFilter

 
Returns all key/value pairs whose values satisfy the comparison specified by the arguments. Takes two arguments: a comparison operator and a comparator.

Syntax:

"ValueFilter (=, 'binary:TX')"

Command example:

scan 'people', { FILTER => "ValueFilter (=, 'binary:TX')" }

This filter returns the list of Texans from the test table, i.e. all key/value pairs whose value is exactly TX. Result (the first and last five rows):

ROW                                                      COLUMN+CELL
 Abbott Delia                                            column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Alexander Derrick                                       column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Alvarado Melvin                                         column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Atkins Gene                                             column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Austin Travis                                           column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 ...
 Watkins Julian                                          column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Welch Lela                                              column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Willis Travis                                           column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Wilson Grace                                            column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
 Wilson Nellie                                           column=location:state, timestamp=2024-07-30T08:10:19.297, value=TX
112 row(s)
Took 0.0535 seconds
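To see why only the location:state cells come back, the value comparison can be modelled in plain Python. This is a sketch under stated assumptions, not HBase code: the OPERATORS table and the value_filter helper are hypothetical names, and the sample cells are the ones shown for the row Abbott Delia in the test table.

```python
import operator

# Comparison operators of the filter language mapped to Python functions.
OPERATORS = {
    "<": operator.lt, "<=": operator.le, "=": operator.eq,
    "!=": operator.ne, ">=": operator.ge, ">": operator.gt,
}


def value_filter(cells, op, expected: bytes):
    """Keep the cells whose whole value satisfies `value <op> expected`."""
    compare = OPERATORS[op]
    return [(column, value) for column, value in cells if compare(value, expected)]


# Cells of the row "Abbott Delia" from the test table.
row = [
    ("basic:age", b"62"),
    ("location:country", b"USA"),
    ("location:state", b"TX"),
    ("location:town", b"Dallas"),
]

print(value_filter(row, "=", b"TX"))  # [('location:state', b'TX')]
```

The check is applied to the value of every cell regardless of its column, so a row contributes only the cells that match, not the whole row.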
DependentColumnFilter

 
Searches each row for the column identified by the two mandatory arguments: column family and qualifier. If such a column (the reference column) is found, returns all key/value pairs in that row that have the same timestamp as the reference column. Otherwise, nothing is returned for that row.

The optional third argument, the boolean dropDependentColumn flag, controls whether the reference column itself is returned (false) or omitted (true).

Two more arguments can be specified: a comparison operator and a comparator. In this case, a column is treated as the reference column only if its value also satisfies the comparison specified by these arguments.

Takes two, three, or five arguments: column family (mandatory), column qualifier (mandatory), dropDependentColumn flag (boolean, optional), comparison operator, and comparator (the last two arguments can only be specified jointly).

Syntax:

"DependentColumnFilter ('location', 'town')"
"DependentColumnFilter ('location', 'town', true)"
"DependentColumnFilter ('location', 'town', true, =, 'binary:Carlsbad')"

Command example:

scan 'people', { FILTER => "DependentColumnFilter ('location', 'town', true, =, 'binary:Carlsbad')" }

This filter searches each row for the exact value Carlsbad in the location:town column and, if it is found, returns the other key/value pairs of that row that share its timestamp, omitting the found one. Result:

ROW                                              COLUMN+CELL
 Young Mattie                                    column=location:state, timestamp=2024-07-30T08:10:20, value=NM
1 row(s)
Took 0.0269 seconds

This example relies on the changes made to the test table in the TimestampsFilter example. Without them, the filter returns nothing because the comparison condition is not met.
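The timestamp matching described for DependentColumnFilter can be sketched as a small Python model. It is an illustration only, not the HBase implementation: the Cell type and the dependent_column_filter function are hypothetical, and the sample cells reuse rows, values, and timestamps shown in the scan results of this article.

```python
from typing import NamedTuple


class Cell(NamedTuple):
    row: str
    column: str      # "family:qualifier"
    timestamp: str
    value: str


def dependent_column_filter(cells, column, drop_reference=False, expected=None):
    """Per row, keep the cells that share a timestamp with the reference column."""
    rows = {}
    for cell in cells:
        rows.setdefault(cell.row, []).append(cell)

    result = []
    for row_cells in rows.values():
        # Timestamps of reference cells that also pass the optional value check.
        stamps = {
            c.timestamp for c in row_cells
            if c.column == column and (expected is None or c.value == expected)
        }
        for c in row_cells:
            if c.timestamp not in stamps:
                continue
            if c.column == column and drop_reference:
                continue
            result.append(c)
    return result


cells = [
    Cell("Abbott Delia", "location:state", "2024-07-30T08:10:19.297", "TX"),
    Cell("Abbott Delia", "location:town", "2024-07-30T08:10:19.297", "Dallas"),
    Cell("Young Mattie", "basic:age", "2024-07-30T08:10:19.297", "39"),
    Cell("Young Mattie", "location:state", "2024-07-30T08:10:20", "NM"),
    Cell("Young Mattie", "location:town", "2024-07-30T08:10:20", "Carlsbad"),
]

found = dependent_column_filter(cells, "location:town", drop_reference=True, expected="Carlsbad")
for c in found:
    print(c.row, c.column, c.value)  # Young Mattie location:state NM
```

In this model, basic:age of Young Mattie is dropped because its timestamp differs from that of the reference column, and the reference column itself is dropped because drop_reference is True.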

SingleColumnValueFilter

 
Searches each row for the reference column identified by four mandatory arguments: column family, column qualifier, comparison operator, and comparator. If a column with the specified family and qualifier is found and its value satisfies the comparison, all the key/value pairs in the row are returned. If such a column is found but its value does not satisfy the comparison, no key/value pairs are returned. If no reference column is found, all the key/value pairs in the row are returned by default.

Two more boolean arguments can be specified jointly. If the first (setFilterIfMissing) is set to true (default is false) and no reference column is found, no key/value pairs in the row are returned. If the second (setLatestVersionOnly) is set to false (default is true), all versions of the reference column value are checked against the comparison, not only the latest one.

Takes four or six arguments: column family (mandatory), column qualifier (mandatory), comparison operator (mandatory), comparator (mandatory), and the setFilterIfMissing and setLatestVersionOnly flags (optional, jointly).

Syntax:

"SingleColumnValueFilter ('location', 'town', =, 'binaryprefix:Al')"
"SingleColumnValueFilter ('location', 'town', =, 'binaryprefix:Al', true, false)"

Command example:

scan 'people', { FILTER => "SingleColumnValueFilter ('location', 'town', =, 'binaryprefix:Al', true, false)" }

This filter searches each row for values beginning with Al in the location:town column, checking all versions of those values. All key/value pairs of a row are returned only if such a value is found. Result:

ROW                                              COLUMN+CELL
 Ball Nelle                                      column=basic:age, timestamp=2024-07-30T08:10:19.297, value=43
 Ball Nelle                                      column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Ball Nelle                                      column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Ball Nelle                                      column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Bell Leila                                      column=basic:age, timestamp=2024-07-30T08:10:19.297, value=58
 Bell Leila                                      column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Bell Leila                                      column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Bell Leila                                      column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Cohen John                                      column=basic:age, timestamp=2024-07-30T08:10:19.297, value=28
 Cohen John                                      column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Cohen John                                      column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Cohen John                                      column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Diaz Anne                                       column=basic:age, timestamp=2024-07-30T08:10:19.297, value=57
 Diaz Anne                                       column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Diaz Anne                                       column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Diaz Anne                                       column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Howard Florence                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=37
 Howard Florence                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Howard Florence                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Howard Florence                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Ingram Barbara                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=55
 Ingram Barbara                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Ingram Barbara                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Ingram Barbara                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Jefferson Charlie                               column=basic:age, timestamp=2024-07-30T08:10:19.297, value=46
 Jefferson Charlie                               column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Jefferson Charlie                               column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Jefferson Charlie                               column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Kennedy Todd                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=44
 Kennedy Todd                                    column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Kennedy Todd                                    column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Kennedy Todd                                    column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 McGee Isabelle                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=48
 McGee Isabelle                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 McGee Isabelle                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 McGee Isabelle                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Page Victoria                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=30
 Page Victoria                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Page Victoria                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Page Victoria                                   column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Phelps Lida                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=43
 Phelps Lida                                     column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Phelps Lida                                     column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Phelps Lida                                     column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Phillips Helen                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=61
 Phillips Helen                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Phillips Helen                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Phillips Helen                                  column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Reyes Marc                                      column=basic:age, timestamp=2024-07-30T08:10:19.297, value=25
 Reyes Marc                                      column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Reyes Marc                                      column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Reyes Marc                                      column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Roberts Clayton                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=52
 Roberts Clayton                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Roberts Clayton                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Roberts Clayton                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Ryan Curtis                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=58
 Ryan Curtis                                     column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Ryan Curtis                                     column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Ryan Curtis                                     column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Spencer Lucinda                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=24
 Spencer Lucinda                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Spencer Lucinda                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Spencer Lucinda                                 column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Woods Bessie                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=47
 Woods Bessie                                    column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Woods Bessie                                    column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Woods Bessie                                    column=location:town, timestamp=2024-07-30T08:10:19.297, value=Albuquerque
 Yates Douglas                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=35
 Yates Douglas                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Yates Douglas                                   column=location:state, timestamp=2024-07-30T08:10:20, value=NM
 Yates Douglas                                   column=location:town, timestamp=2024-07-30T08:10:20, value=Albuquerque
18 row(s)
Took 0.0263 seconds
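The row-level decision made by SingleColumnValueFilter, including the effect of the two optional flags, can be modelled roughly as follows. This is a hedged sketch in plain Python, not HBase code: keep_row and its parameter names are hypothetical, each row is represented as a dictionary that maps a column to its value versions (newest first), and the last row, which has no location:town column, is invented purely to show the missing-column case.

```python
def keep_row(row, column, matches, filter_if_missing=False, latest_version_only=True):
    """Return True if all key/value pairs of the row should be returned."""
    versions = row.get(column, [])
    if not versions:
        # No reference column: the row is kept unless filter_if_missing is set.
        return not filter_if_missing
    candidates = versions[:1] if latest_version_only else versions
    return any(matches(value) for value in candidates)


def starts_with_al(value):
    return value.startswith("Al")


rows = {
    "Ball Nelle": {"basic:age": ["43"], "location:town": ["Albuquerque"]},
    "Abbott Delia": {"basic:age": ["62"], "location:town": ["Dallas"]},
    "no-town row (hypothetical)": {"basic:age": ["30"]},
}

default = [name for name, row in rows.items()
           if keep_row(row, "location:town", starts_with_al)]
strict = [name for name, row in rows.items()
          if keep_row(row, "location:town", starts_with_al, filter_if_missing=True)]

print(default)  # ['Ball Nelle', 'no-town row (hypothetical)']
print(strict)   # ['Ball Nelle']
```

With the default flags a row without the reference column slips through, which is why the command example above passes true as the fifth argument.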
SingleColumnValueExcludeFilter

 
Works the same as SingleColumnValueFilter, except that the reference column is never returned in the results.

Syntax:

"SingleColumnValueExcludeFilter ('location', 'town', =, 'binaryprefix:Al')"
"SingleColumnValueExcludeFilter ('location', 'town', =, 'binaryprefix:Al', true, false)"

Command example:

scan 'people', { FILTER => "SingleColumnValueExcludeFilter ('location', 'town', =, 'binaryprefix:Al', true, false)" }

This filter searches each row for values beginning with Al in the location:town column, checking all versions of those values. If such a value is found, the other key/value pairs of that row are returned, but not the found one. Result:

ROW                                              COLUMN+CELL
 Ball Nelle                                      column=basic:age, timestamp=2024-07-30T08:10:19.297, value=43
 Ball Nelle                                      column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Ball Nelle                                      column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Bell Leila                                      column=basic:age, timestamp=2024-07-30T08:10:19.297, value=58
 Bell Leila                                      column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Bell Leila                                      column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Cohen John                                      column=basic:age, timestamp=2024-07-30T08:10:19.297, value=28
 Cohen John                                      column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Cohen John                                      column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Diaz Anne                                       column=basic:age, timestamp=2024-07-30T08:10:19.297, value=57
 Diaz Anne                                       column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Diaz Anne                                       column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Howard Florence                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=37
 Howard Florence                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Howard Florence                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Ingram Barbara                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=55
 Ingram Barbara                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Ingram Barbara                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Jefferson Charlie                               column=basic:age, timestamp=2024-07-30T08:10:19.297, value=46
 Jefferson Charlie                               column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Jefferson Charlie                               column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Kennedy Todd                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=44
 Kennedy Todd                                    column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Kennedy Todd                                    column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 McGee Isabelle                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=48
 McGee Isabelle                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 McGee Isabelle                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Page Victoria                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=30
 Page Victoria                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Page Victoria                                   column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Phelps Lida                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=43
 Phelps Lida                                     column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Phelps Lida                                     column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Phillips Helen                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=61
 Phillips Helen                                  column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Phillips Helen                                  column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Reyes Marc                                      column=basic:age, timestamp=2024-07-30T08:10:19.297, value=25
 Reyes Marc                                      column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Reyes Marc                                      column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Roberts Clayton                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=52
 Roberts Clayton                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Roberts Clayton                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Ryan Curtis                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=58
 Ryan Curtis                                     column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Ryan Curtis                                     column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Spencer Lucinda                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=24
 Spencer Lucinda                                 column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Spencer Lucinda                                 column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Woods Bessie                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=47
 Woods Bessie                                    column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Woods Bessie                                    column=location:state, timestamp=2024-07-30T08:10:19.297, value=NM
 Yates Douglas                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=35
 Yates Douglas                                   column=location:country, timestamp=2024-07-30T08:10:19.297, value=USA
 Yates Douglas                                   column=location:state, timestamp=2024-07-30T08:10:20, value=NM
18 row(s)
Took 0.0646 seconds
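
The row-level decision described above can be modeled in a few lines of plain Java. This is only an illustrative sketch, not the HBase implementation: it keeps a single version per cell, uses strings instead of byte arrays, and the class and method names (ExcludeFilterSketch, apply) are hypothetical. The filterIfMissing parameter is assumed to correspond to the first optional boolean argument of the filter, following the naming used in the HBase API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.Optional;

// Illustrative model only: one version per cell, strings instead of bytes.
class ExcludeFilterSketch {

    // Applies the rule to one row (column name -> value).
    // Returns the cells to emit, or Optional.empty() if the row is dropped.
    static Optional<Map<String, String>> apply(Map<String, String> row, String column,
                                               String prefix, boolean filterIfMissing) {
        String value = row.get(column);
        if (value == null) {
            // Reference column absent: drop the row only when filterIfMissing is true.
            return filterIfMissing ? Optional.empty() : Optional.of(new LinkedHashMap<>(row));
        }
        if (!value.startsWith(prefix)) {
            return Optional.empty(); // Value does not match the prefix: drop the row.
        }
        // Match: emit every cell except the reference column itself.
        Map<String, String> out = new LinkedHashMap<>(row);
        out.remove(column);
        return Optional.of(out);
    }

    public static void main(String[] args) {
        Map<String, String> row = new LinkedHashMap<>();
        row.put("basic:age", "35");
        row.put("location:country", "USA");
        row.put("location:state", "NM");
        row.put("location:town", "Albuquerque");
        // The emitted cells no longer contain location:town.
        System.out.println(apply(row, "location:town", "Al", true));
    }
}
```

With the sample row above, the returned map holds basic:age, location:country, and location:state, which mirrors the three cells per row in the scan output.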
ColumnRangeFilter

 
Returns only key/value pairs whose column qualifiers fall within the range set by the arguments. Either end of the range can be a specific value or empty, in which case there is no bound on that side. Each range end is followed by a boolean argument that defines whether that end is included in the range.

Takes four arguments: left range end, left inclusion flag, right range end, right inclusion flag. If a range end is empty, then its respective inclusion flag value does not matter.

Syntax:

"ColumnRangeFilter ('', true, 'c', false)"

Command example:

scan 'people', { FILTER => "ColumnRangeFilter ('', true, 'c', false)" }

This filter returns only the basic:age column: the next column qualifier in alphabetical order is country, which is lexicographically greater than c and therefore falls outside the specified range. Result (the first and last five rows):

ROW                                              COLUMN+CELL
 Abbott Delia                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=62
 Abbott Howard                                   column=basic:age, timestamp=2024-07-30T08:10:19.297, value=24
 Abbott Jack                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Adams Clyde                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Aguilar Myrtie                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=23
 ...
 Young Della                                     column=basic:age, timestamp=2024-07-30T08:10:19.297, value=21
 Young Josephine                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=29
 Young Mattie                                    column=basic:age, timestamp=2024-07-30T08:10:19.297, value=39
 Zimmerman Gene                                  column=basic:age, timestamp=2024-07-30T08:10:19.297, value=35
 Zimmerman Madge                                 column=basic:age, timestamp=2024-07-30T08:10:19.297, value=46
997 row(s)
Took 0.1788 seconds
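
To see why country is excluded while age is kept, the range check can be modeled in plain Java. This is a minimal sketch of the semantics described above, not the HBase implementation; the class and method names (ColumnRangeSketch, inRange, select) are hypothetical, and an empty string is treated as an unbounded range end.

```java
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Illustrative model only: mimics the range check described above.
class ColumnRangeSketch {

    // Returns true if the qualifier lies within the range under the given inclusion flags.
    // An empty bound means "unbounded" on that side, so its flag is ignored.
    static boolean inRange(String qualifier, String min, boolean minInclusive,
                           String max, boolean maxInclusive) {
        byte[] q = qualifier.getBytes(StandardCharsets.UTF_8);
        if (!min.isEmpty()) {
            // Lexicographic comparison of unsigned bytes.
            int c = Arrays.compareUnsigned(q, min.getBytes(StandardCharsets.UTF_8));
            if (c < 0 || (c == 0 && !minInclusive)) return false;
        }
        if (!max.isEmpty()) {
            int c = Arrays.compareUnsigned(q, max.getBytes(StandardCharsets.UTF_8));
            if (c > 0 || (c == 0 && !maxInclusive)) return false;
        }
        return true;
    }

    // Keeps only the qualifiers that fall within the range.
    static List<String> select(List<String> qualifiers, String min, boolean minInclusive,
                               String max, boolean maxInclusive) {
        List<String> kept = new ArrayList<>();
        for (String q : qualifiers) {
            if (inRange(q, min, minInclusive, max, maxInclusive)) kept.add(q);
        }
        return kept;
    }

    public static void main(String[] args) {
        // Qualifiers of the 'people' table used in the examples.
        List<String> qualifiers = List.of("age", "country", "state", "town");
        System.out.println(select(qualifiers, "", true, "c", false)); // prints [age]
    }
}
```

The qualifier country starts with c but is longer, so it sorts after c and is rejected by the right range end regardless of the inclusion flag.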

Dynamic loading of custom filters

HBase in ADH supports dynamic loading of custom filters. To use this feature, specify the directory that contains the custom filter JAR files:

  1. Go to ADCM UI and select your ADH cluster.

  2. Navigate to Services → HBase → Primary configuration and toggle Show advanced.

  3. Open the Custom hbase-site.xml section and click Add property.

  4. For the field name, enter hbase.dynamic.jars.dir. For the field value, enter a path of your preference. A good example is ${hbase.rootdir}/lib. Click Apply.

  5. Save the configuration by clicking Save → Create, then restart the service by clicking Actions → Reconfig and graceful restart.

Provided there are JAR files with the custom filters in the specified location, you should be able to use them both in HBase shell and via Java applications using the HBase API.
