Nested data structures

ADQM/ClickHouse provides the Nested data type, which can be used to store and process nested data structures and can serve as an alternative to the Tuple and Array(Tuple) types. It allows modeling complex data structures without the need to create/store additional tables and use complex queries.

A nested data structure is a table inside a cell of the main table. Each row of the main table may correspond to any number of rows in the nested data structure.

Overview

To define a Nested column in a table, it is necessary to specify parameters of a nested structure — column names with corresponding data types.

Below is an example of the CREATE TABLE query syntax for creating a table with a nested data structure in a column (for simplicity, the example uses short column names and specific data types):

CREATE TABLE test_table (
    a String,
    b UInt32,
    ...
    n Nested (
        c String,
        d Int64,
        ...
    )
)
ENGINE = MergeTree ORDER BY tuple();

To access columns of nested data structures, use the <nested_structure>.<nested_column> format in queries (where <nested_structure> is the name of the table’s Nested column and <nested_column> is the name of a column within the nested structure).

Data format of a nested structure’s columns is specified by the flatten_nested setting, which can be set to 1 (by default) or 0.

flatten_nested = 1

Columns of a nested structure are of the Array(Type) type — they store arrays of values of the corresponding data types, and all arrays have the same length. In this case, when reading and processing values of nested columns, you can use functions for working with arrays, including the ARRAY JOIN clause.

In the SHOW CREATE TABLE query output, you can see which data types are defined for table columns:

SHOW CREATE TABLE test_table;
   ┌─statement─────────────────────────┐
1. │ CREATE TABLE default.test_table   │
   │ (                                 │
   │    `a` String,                    │
   │    `b` UInt32,                    │
   │    `n.c` Array(String),           │
   │    `n.d` Array(Int64)             │
   │ )                                 │
   │ ENGINE = MergeTree                │
   │ ORDER BY tuple()                  │
   │ SETTINGS index_granularity = 8192 │
   └───────────────────────────────────┘

Insert data

To insert data into a Nested column, pass an array of values separately for each column of the nested structure (the system automatically checks whether all passed arrays of values are of the same length). For example:

INSERT INTO test_table VALUES ('a', 1, ['c1', 'c2', 'c3'], [100, 200, 300]);

When working with a table that has columns containing nested structures, you can also insert JSON data with the same structure. For example, use the JSONEachRow format in queries:

INSERT INTO test_table FORMAT JSONEachRow {"a":"a1", "b":2, "n.c":["c4", "c5"], "n.d":[101, 202]};

To pass data as a hierarchical JSON object, first set input_format_import_nested_json=1:

SET input_format_import_nested_json=1;
INSERT INTO test_table FORMAT JSONEachRow
{
    "a":"a2",
    "b":"3",
    "n": {
        "c": ["c6", "c7"],
        "d": [111, 222]
    }
};

Select data

Querying all data from a table outputs each column of the nested structure separately:

SELECT * FROM test_table;
   ┌─a──┬─b─┬─n.c──────────────┬─n.d───────────┐
1. │ a  │ 1 │ ['c1','c2','c3'] │ [100,200,300] │
2. │ a1 │ 2 │ ['c4','c5']      │ [101,202]     │
3. │ a2 │ 3 │ ['c6','c7']      │ [111,222]     │
   └────┴───┴──────────────────┴───────────────┘

flatten_nested = 0

Data of all columns of a nested structure is stored as a single array of tuples — in fact, a Nested column is of the Array(Tuple) type. In this case, an arbitrary level of nesting is supported.

If you recreate the test_table table after setting flatten_nested = 0, the output of SHOW CREATE TABLE will show the Nested type for the n column, although it is actually the same as Array(Tuple(…​)):

   ┌─statement──────────────────────────┐
1. │ CREATE TABLE default.test_table    │
   │ (                                  │
   │     `a` String,                    │
   │     `b` UInt32,                    │
   │     `n` Nested(c String, d Int64)  │
   │ )                                  │
   │ ENGINE = MergeTree                 │
   │ ORDER BY tuple()                   │
   │ SETTINGS index_granularity = 8192  │
   └────────────────────────────────────┘

Insert data

Accordingly, data should be inserted into a nested structure as an array of tuples. For example:

INSERT INTO test_table VALUES ('a', 1, [('c1', 100), ('c2', 200), ('c3', 300)]);

To insert a hierarchical JSON object into a table with a Nested column when the flatten_nested parameter is set to 0, it is not necessary to activate the input_format_import_nested_json option, and data for nested columns should be passed as an array:

INSERT INTO test_table FORMAT JSONEachRow
{
    "a":"a1",
    "b":"2",
    "n": [
        {"c": "c4", "d": 101},
        {"c": "c5", "d": 202}
        ]
};

Select data

Querying all data from a table outputs a column of the Nested type, where each row is an array of tuples of nested columns' values:

SELECT * FROM test_table;
   ┌─a──┬─b─┬─n──────────────────────────────────┐
1. │ a  │ 1 │ [('c1',100),('c2',200),('c3',300)] │
2. │ a1 │ 2 │ [('c4',101),('c5',202)]            │
   └────┴───┴────────────────────────────────────┘

Data of individual columns can also be accessed via the dot notation:

SELECT n.c, n.d FROM test_table;
   ┌─n.c──────────────┬─n.d───────────┐
1. │ ['c1','c2','c3'] │ [100,200,300] │
2. │ ['c4','c5']      │ [101,202]     │
   └──────────────────┴───────────────┘

Example

Create a table

Create the orders table to store order data in columns of the appropriate types:

  • order_id (UInt64) — order identifier;

  • customer_id (UInt32) — customer identifier;

  • order_date (Date) — order date.

Also, add a column to the table to store information about products each order includes — the items column of the Nested type that defines the following fields of the nested structure:

  • product_id (UInt32) — product identifier;

  • product_name (String) — product name;

  • quantity (UInt16) — number of product units in an order;

  • price_per_unit (Decimal) — cost of a product unit.

The value of the flatten_nested parameter is 1 (default).

CREATE TABLE orders (
    order_id UInt64,
    customer_id UInt32,
    order_date Date,
    items Nested (
        product_id UInt32,
        product_name String,
        quantity UInt16,
        price_per_unit Decimal(10, 2)
    )
) ENGINE = MergeTree()
ORDER BY (order_date, order_id);

Insert data

Insert test data into the table (for example, some orders over two days):

INSERT INTO orders VALUES
(1001, 42, '2025-08-07',
    [101, 102, 103],
    ['Laptop', 'Mouse', 'Keyboard'],
    [1, 2, 1],
    [999.99, 19.99, 49.99]),
(1002, 77, '2025-08-07',
    [104, 105],
    ['Monitor', 'Headphones'],
    [2, 1],
    [199.99, 89.99]),
(1003, 55, '2025-08-07',
    [101, 104, 102],
    ['Laptop', 'Monitor', 'Mouse'],
    [1, 1, 1],
    [999.99, 199.99, 19.99]),
(1004, 24, '2025-08-08',
    [102, 105],
    ['Mouse', 'Headphones'],
    [3, 4],
    [19.99, 89.99]),
(1005, 38, '2025-08-08',
    [101, 103, 104],
    ['Laptop', 'Keyboard', 'Monitor'],
    [1, 1, 1],
    [999.99, 49.99, 199.99]);

Select data

Select all data from the table:

SELECT * FROM orders;
   ┌─order_id─┬─customer_id─┬─order_date─┬─items.product_id─┬─items.product_name──────────────┬─items.quantity─┬─items.price_per_unit──┐
1. │     1001 │          42 │ 2025-08-07 │ [101,102,103]    │ ['Laptop','Mouse','Keyboard']   │ [1,2,1]        │ [999.99,19.99,49.99]  │
2. │     1002 │          77 │ 2025-08-07 │ [104,105]        │ ['Monitor','Headphones']        │ [2,1]          │ [199.99,89.99]        │
3. │     1003 │          55 │ 2025-08-07 │ [101,104,102]    │ ['Laptop','Monitor','Mouse']    │ [1,1,1]        │ [999.99,199.99,19.99] │
4. │     1004 │          24 │ 2025-08-08 │ [102,105]        │ ['Mouse','Headphones']          │ [3,4]          │ [19.99,89.99]         │
5. │     1005 │          38 │ 2025-08-08 │ [101,103,104]    │ ['Laptop','Keyboard','Monitor'] │ [1,1,1]        │ [999.99,49.99,199.99] │
   └──────────┴─────────────┴────────────┴──────────────────┴─────────────────────────────────┴────────────────┴───────────────────────┘

To display data in a usual tabular form, expanding the Nested structure into several rows, use ARRAY JOIN in a query:

SELECT * FROM orders ARRAY JOIN items;
    ┌─order_id─┬─customer_id─┬─order_date─┬─items.product_id─┬─items.product_name─┬─items.quantity─┬─items.price_per_unit─┐
 1. │     1001 │          42 │ 2025-08-07 │              101 │ Laptop             │              1 │               999.99 │
 2. │     1001 │          42 │ 2025-08-07 │              102 │ Mouse              │              2 │                19.99 │
 3. │     1001 │          42 │ 2025-08-07 │              103 │ Keyboard           │              1 │                49.99 │
 4. │     1002 │          77 │ 2025-08-07 │              104 │ Monitor            │              2 │               199.99 │
 5. │     1002 │          77 │ 2025-08-07 │              105 │ Headphones         │              1 │                89.99 │
 6. │     1003 │          55 │ 2025-08-07 │              101 │ Laptop             │              1 │               999.99 │
 7. │     1003 │          55 │ 2025-08-07 │              104 │ Monitor            │              1 │               199.99 │
 8. │     1003 │          55 │ 2025-08-07 │              102 │ Mouse              │              1 │                19.99 │
 9. │     1004 │          24 │ 2025-08-08 │              102 │ Mouse              │              3 │                19.99 │
10. │     1004 │          24 │ 2025-08-08 │              105 │ Headphones         │              4 │                89.99 │
11. │     1005 │          38 │ 2025-08-08 │              101 │ Laptop             │              1 │               999.99 │
12. │     1005 │          38 │ 2025-08-08 │              103 │ Keyboard           │              1 │                49.99 │
13. │     1005 │          38 │ 2025-08-08 │              104 │ Monitor            │              1 │               199.99 │
    └──────────┴─────────────┴────────────┴──────────────────┴────────────────────┴────────────────┴──────────────────────┘

Apply aggregation by columns of a nested structure. For example, run the following query to calculate how many units of each product were sold in a month (quantity across all orders):

SELECT
    formatDateTime(order_date, '%M') as month,
    items.product_name as product,
    SUM(items.quantity) as units_sold
FROM orders
ARRAY JOIN items
GROUP BY
    formatDateTime(order_date, '%M'),
    items.product_name;
   ┌─month──┬─product────┬─units_sold─┐
1. │ August │ Keyboard   │          2 │
2. │ August │ Headphones │          5 │
3. │ August │ Laptop     │          3 │
4. │ August │ Mouse      │          6 │
5. │ August │ Monitor    │          4 │
   └────────┴────────────┴────────────┘

You can also filter data by a field of a nested structure. For example, the following query outputs sales of the Laptop product for each day:

SELECT
    order_date,
    items.product_name AS product ,
    SUM(items.quantity) AS quantity
FROM orders
ARRAY JOIN items
WHERE items.product_name = 'Laptop'
GROUP BY
    order_date,
    items.product_name;
   ┌─order_date─┬─product─┬─quantity─┐
1. │ 2025-08-07 │ Laptop  │        2 │
2. │ 2025-08-08 │ Laptop  │        1 │
   └────────────┴─────────┴──────────┘

To return only one row with the number of units sold for each product per date, you can use a subquery and the groupArray function:

SELECT
    order_date,
    groupArray((product, units_sold))
FROM
(
    SELECT
        order_date,
        product,
        sum(product_sales) AS units_sold
    FROM orders
    ARRAY JOIN
        items.product_name AS product,
        items.quantity AS product_sales
    GROUP BY
        order_date,
        product
)
GROUP BY order_date;
   ┌─order_date─┬─groupArray((product, units_sold))────────────────────────────────────────┐
1. │ 2025-08-07 │ [('Laptop',2),('Mouse',3),('Monitor',3),('Keyboard',1),('Headphones',1)] │
2. │ 2025-08-08 │ [('Keyboard',1),('Mouse',3),('Monitor',1),('Headphones',4),('Laptop',1)] │
   └────────────┴──────────────────────────────────────────────────────────────────────────┘
Found a mistake? Seleсt text and press Ctrl+Enter to report it