Combine queries (UNION, INTERSECT, EXCEPT)

To combine results of two or more queries, you can use the UNION, INTERSECT, and EXCEPT set operators.

Create tables for test examples

 
To reproduce examples from this article and see how various set operators work in ClickHouse, first create the tables:

  • products_2023 — product catalog for 2023:

    CREATE TABLE products_2023
    (
        id UInt32,
        name String,
        category String,
        price Float32
    ) ENGINE = MergeTree()
    ORDER BY id;
    INSERT INTO products_2023 VALUES
    (1, 'Office Chair', 'Furniture', 170),
    (2, 'Backpack', 'Accessories', 60),
    (3, 'Monitor', 'Electronics', 200),
    (4, 'Tablet', 'Electronics', 750),
    (5, 'Water Bottle', 'Accessories', 10),
    (6, 'Office Desk', 'Furniture', 250);
       ┌─id─┬─name─────────┬─category────┬─price─┐
    1. │  1 │ Office Chair │ Furniture   │   170 │
    2. │  2 │ Backpack     │ Accessories │    60 │
    3. │  3 │ Monitor      │ Electronics │   200 │
    4. │  4 │ Tablet       │ Electronics │   750 │
    5. │  5 │ Water Bottle │ Accessories │    10 │
    6. │  6 │ Office Desk  │ Furniture   │   250 │
       └────┴──────────────┴─────────────┴───────┘
  • products_2024 — product catalog for 2024:

    CREATE TABLE products_2024
    (
        id UInt32,
        name String,
        category String,
        price Float32
    ) ENGINE = MergeTree()
    ORDER BY id;
    INSERT INTO products_2024 VALUES
    (1, 'Office Chair', 'Furniture', 175),
    (2, 'Backpack', 'Accessories', 60),
    (3, 'Monitor', 'Electronics', 220),
    (6, 'Office Desk', 'Furniture', 250),
    (7, 'Laptop', 'Electronics', 1000),
    (8, 'Smartphone', 'Electronics', 700),
    (9, 'Cap', 'Accessories', 6),
    (10, 'Coffee Table', 'Furniture', 180);
       ┌─id─┬─name─────────┬─category────┬─price─┐
    1. │  1 │ Office Chair │ Furniture   │   175 │
    2. │  2 │ Backpack     │ Accessories │    60 │
    3. │  3 │ Monitor      │ Electronics │   220 │
    4. │  6 │ Office Desk  │ Furniture   │   250 │
    5. │  7 │ Laptop       │ Electronics │  1000 │
    6. │  8 │ Smartphone   │ Electronics │   700 │
    7. │  9 │ Cap          │ Accessories │     6 │
    8. │ 10 │ Coffee Table │ Furniture   │   180 │
       └────┴──────────────┴─────────────┴───────┘
  • sales_2024_may — sales data for May 2024:

    CREATE TABLE sales_2024_may (
        sale_id UInt32,
        product_name String,
        sale_date Date DEFAULT toDate(now())
    )
    ENGINE = MergeTree()
    ORDER BY sale_id;
    INSERT INTO sales_2024_may VALUES
    (1, 'Smartphone', '2024-05-06'),
    (2, 'Smartphone', '2024-05-07'),
    (3, 'Monitor', '2024-05-07'),
    (4, 'Office Desk', '2024-05-08'),
    (5, 'Monitor', '2024-05-09'),
    (6, 'Office Chair', '2024-05-09'),
    (7, 'Backpack', '2024-05-10'),
    (8, 'Backpack', '2024-05-11'),
    (9, 'Cap', '2024-05-11'),
    (10, 'Cap', '2024-05-12');
        ┌─sale_id─┬─product_name─┬──sale_date─┐
     1. │       1 │ Smartphone   │ 2024-05-06 │
     2. │       2 │ Smartphone   │ 2024-05-07 │
     3. │       3 │ Monitor      │ 2024-05-07 │
     4. │       4 │ Office Desk  │ 2024-05-08 │
     5. │       5 │ Monitor      │ 2024-05-09 │
     6. │       6 │ Office Chair │ 2024-05-09 │
     7. │       7 │ Backpack     │ 2024-05-10 │
     8. │       8 │ Backpack     │ 2024-05-11 │
     9. │       9 │ Cap          │ 2024-05-11 │
    10. │      10 │ Cap          │ 2024-05-12 │
        └─────────┴──────────────┴────────────┘

UNION

The UNION clause appends the result of the first query to the result of the second query. It can be used in two modes:

  • UNION ALL — the result of the union contains all rows from the results of both queries, including duplicates;

  • UNION DISTINCT — duplicate rows are excluded from the result of the union.

If the ALL or DISTINCT keyword is not explicitly specified, the behavior of UNION is determined by the union_default_mode setting — it can be set to ALL, DISTINCT, or an empty string (in this case, using UNION throws an exception).

The basic syntax is:

SELECT <column1> [, <column2> ]
FROM <table1>
[WHERE <condition>]

UNION [ALL | DISTINCT]

SELECT <column1> [, <column2> ]
FROM <table2>
[WHERE <condition>]

The resulting columns are matched by their index (order within SELECT). If column names do not match, names for the final result are taken from the first query.

Type casting is performed for unions. For example, if two queries being combined have the same field with compatible non-Nullable and Nullable types, the result of the union will have a Nullable type field.

Queries that are parts of UNION can be enclosed in parentheses. ORDER BY and LIMIT are applied to individual queries, not to the final result. If you need to apply the conversion to the final result, place all queries with UNION in a subquery in the FROM clause.

Examples

UNION ALL

The following query combines product catalogs for 2023 and 2024:

SELECT id, name FROM products_2023
UNION ALL
SELECT id, name FROM products_2024;
   ┌─id─┬─name─────────┐
1. │  1 │ Office Chair │
2. │  2 │ Backpack     │
3. │  3 │ Monitor      │
4. │  4 │ Tablet       │
5. │  5 │ Water Bottle │
6. │  6 │ Office Desk  │
   └────┴──────────────┘
    ┌─id─┬─name─────────┐
 7. │  1 │ Office Chair │
 8. │  2 │ Backpack     │
 9. │  3 │ Monitor      │
10. │  6 │ Office Desk  │
11. │  7 │ Laptop       │
12. │  8 │ Smartphone   │
13. │  9 │ Cap          │
14. │ 10 │ Coffee Table │
    └────┴──────────────┘

UNION DISTINCT

This query combines product catalogs for 2023 and 2024, eliminating duplicates:

SELECT id, name FROM products_2023
UNION DISTINCT
SELECT id, name FROM products_2024;
   ┌─id─┬─name─────────┐
1. │  1 │ Office Chair │
2. │  2 │ Backpack     │
3. │  3 │ Monitor      │
4. │  4 │ Tablet       │
5. │  5 │ Water Bottle │
6. │  6 │ Office Desk  │
   └────┴──────────────┘
    ┌─id─┬─name─────────┐
 7. │  7 │ Laptop       │
 8. │  8 │ Smartphone   │
 9. │  9 │ Cap          │
10. │ 10 │ Coffee Table │
    └────┴──────────────┘

INTERSECT

The INTERSECT clause returns all rows that appear in both the result sets of the first and second queries. The queries should match in the number of columns, their order, and data types. The INTERSECT result may contain duplicate rows. To eliminate duplicates, use INTERSECT DISTINCT.

If multiple INTERSECT statements are used, and no parentheses are specified, intersections are evaluated from left to right. INTERSECT has a higher execution priority than UNION and EXCEPT.

The basic syntax is:

SELECT <column1> [, <column2> ]
FROM <table1>
[WHERE <condition>]

INTERSECT [DISTINCT]

SELECT <column1> [, <column2> ]
FROM <table2>
[WHERE <condition>]

Examples

The following query returns the names of products sold in May 2024 that belong to the Electronics category, according to the products_2024 table:

SELECT product_name FROM sales_2024_may
INTERSECT
SELECT name FROM products_2024 WHERE category='Electronics';

In the result, you can see that there were two sales of the Smartphone and Monitor products:

   ┌─product_name─┐
1. │ Smartphone   │
2. │ Smartphone   │
3. │ Monitor      │
4. │ Monitor      │
   └──────────────┘

The DISTINCT keyword excludes duplicate rows from the result of the previous query:

SELECT product_name FROM sales_2024_may
INTERSECT DISTINCT
SELECT name FROM products_2024 WHERE category='Electronics';
   ┌─product_name─┐
1. │ Smartphone   │
2. │ Monitor      │
   └──────────────┘

EXCEPT

The EXCEPT clause returns the result of the first query, excluding rows that are also included in the result of the second query. The queries should match in the number of columns, their order, and data types. The EXCEPT result may contain duplicate rows. To exclude duplicates, use EXCEPT DISTINCT.

If multiple EXCEPT statements are used with no parentheses specified, they are evaluated from left to right. EXCEPT has the same execution priority as UNION — lower than INTERSECT.

The basic syntax is:

SELECT <column1> [, <column2> ]
FROM <table1>
[WHERE <condition>]

EXCEPT [DISTINCT]

SELECT <column1> [, <column2> ]
FROM <table2>
[WHERE <condition>]

Examples

EXCEPT

The following query returns the names of products sold in May 2024 that are not in the Electronics category:

SELECT product_name FROM sales_2024_may
EXCEPT
SELECT name FROM products_2024 WHERE category='Electronics';
   ┌─product_name─┐
1. │ Office Desk  │
2. │ Office Chair │
3. │ Backpack     │
4. │ Backpack     │
5. │ Cap          │
6. │ Cap          │
   └──────────────┘

The DISTINCT keyword removes duplicates from the result of the previous query:

SELECT product_name FROM sales_2024_may
EXCEPT DISTINCT
SELECT name FROM products_2024 WHERE category='Electronics';
   ┌─product_name─┐
1. │ Office Desk  │
2. │ Office Chair │
3. │ Backpack     │
4. │ Cap          │
   └──────────────┘
Found a mistake? Seleсt text and press Ctrl+Enter to report it