Common table expressions

Common table expression (CTE) is a named dataset that is defined within the scope of an SQL query, exists only during this query execution (CTE is not materialized and is not permanently stored anywhere in a database), and can be accessed multiple times within the main query.

CTE allows you to simplify and make a complex query more readable (you can divide the code into simple parts, reduce its size, avoid duplication), as well as to optimize the execution of a query that involves multiple access to the same datasets (the necessary data is temporarily stored in the cache so that it does not need to be searched for each time).

CTE syntax

To define CTE, use the WITH clause before the main (parent) query:

WITH <cte_name> AS (<cte_subquery>)
<primary_query>;

where:

  • <cte_name> — identifier by which CTE can be referenced in other parts of the main query;

  • <cte_subquery> — SQL query whose results populate CTE;

  • <primary_query> — main SQL query that uses CTE.

CTE syntax for using results of a scalar subquery is slightly different:

WITH (<cte_scalar_subquery>) AS <cte_name>
<primary_query>;

You can define multiple CTEs within a single WITH clause, separating them by commas:

WITH
    <cte1>,
    <cte2>,
    <cte3>
<primary_query>;

Each CTE can reference any previous one within the WITH clause.

Examples

This section provides examples of how CTE can be used in queries to analyze data about bonuses given to employees of a chain of stores in two cities in September. The data is stored in the employee_bonus_sept_24 table:

CREATE TABLE employee_bonus_sept_24 (
    empl_id UInt16,
    full_name String,
    position Enum('store manager' = 1, 'cashier' = 2, 'sales assistant' = 3),
    store_id UInt8,
    city Enum('New York' = 1, 'Chicago' = 2),
    bonus UInt32)
ENGINE = MergeTree
ORDER BY empl_id;
INSERT INTO employee_bonus_sept_24 VALUES
    (1, 'James Wilson', 1, 101, 1, 1000),
    (2, 'Mary Burton', 2, 101, 1, 450),
    (3, 'Patricia Robinson', 2, 101, 1, 500),
    (4, 'Robert Gray', 1, 102, 1, 1200),
    (5, 'Elizabeth Tucker', 2, 102, 1, 470),
    (6, 'Joseph Lewis', 3, 102, 1, 700),
    (7, 'William Ferguson', 1, 201, 2, 970),
    (8, 'Linda Black', 2, 201, 2, 350),
    (9, 'David Green', 2, 201, 2, 370),
    (10, 'Daniel Gray', 3, 201, 2, 770),
    (11, 'Mark Armstrong', 1, 202, 2, 1200),
    (12, 'Donald Carter', 2, 202, 2, 420),
    (13, 'Elizabeth Collins', 2, 202, 2, 400),
    (14, 'Paul Brown', 3, 202, 2, 660),
    (15, 'Andrew Clarke', 3, 202, 2, 680);
    ┌─empl_id─┬─full_name─────────┬─position────────┬─store_id─┬─city─────┬─bonus─┐
 1. │       1 │ James Wilson      │ store manager   │      101 │ New York │  1000 │
 2. │       2 │ Mary Burton       │ cashier         │      101 │ New York │   450 │
 3. │       3 │ Patricia Robinson │ cashier         │      101 │ New York │   500 │
 4. │       4 │ Robert Gray       │ store manager   │      102 │ New York │  1200 │
 5. │       5 │ Elizabeth Tucker  │ cashier         │      102 │ New York │   470 │
 6. │       6 │ Joseph Lewis      │ sales assistant │      102 │ New York │   700 │
 7. │       7 │ William Ferguson  │ store manager   │      201 │ Chicago  │   970 │
 8. │       8 │ Linda Black       │ cashier         │      201 │ Chicago  │   350 │
 9. │       9 │ David Green       │ cashier         │      201 │ Chicago  │   370 │
10. │      10 │ Daniel Gray       │ sales assistant │      201 │ Chicago  │   770 │
11. │      11 │ Mark Armstrong    │ store manager   │      202 │ Chicago  │  1200 │
12. │      12 │ Donald Carter     │ cashier         │      202 │ Chicago  │   420 │
13. │      13 │ Elizabeth Collins │ cashier         │      202 │ Chicago  │   400 │
14. │      14 │ Paul Brown        │ sales assistant │      202 │ Chicago  │   660 │
15. │      15 │ Andrew Clarke     │ sales assistant │      202 │ Chicago  │   680 │
    └─────────┴───────────────────┴─────────────────┴──────────┴──────────┴───────┘

Query with CTE

In the query below:

  • avg_bonus is CTE that calculates the average bonus for each position across all stores in a given month;

  • only those employees who received a bonus higher than the average are displayed.

WITH avg_bonus AS
    (
        SELECT position, round(avg(bonus)) AS position_avg_bonus
        FROM employee_bonus_sept_24
        GROUP BY position
    )
SELECT
    b.full_name AS employee,
    b.position,
    b.store_id AS store,
    b.bonus,
    ab.position_avg_bonus
FROM employee_bonus_sept_24 AS b
INNER JOIN avg_bonus AS ab ON b.position = ab.position
WHERE b.bonus > ab.position_avg_bonus
ORDER BY b.position, b.bonus DESC;
   ┌─employee──────────┬─position────────┬─store─┬─bonus─┬─position_avg_bonus─┐
1. │ Robert Gray       │ store manager   │   102 │  1200 │               1092 │
2. │ Mark Armstrong    │ store manager   │   202 │  1200 │               1092 │
3. │ Patricia Robinson │ cashier         │   101 │   500 │                423 │
4. │ Elizabeth Tucker  │ cashier         │   102 │   470 │                423 │
5. │ Mary Burton       │ cashier         │   101 │   450 │                423 │
6. │ Daniel Gray       │ sales assistant │   201 │   770 │                702 │
   └───────────────────┴─────────────────┴───────┴───────┴────────────────────┘

Window functions in CTE

If you need to filter query data by values of window functions, calculate these values within CTE (or a subquery).

For example, the following query:

  • in the bonus_rank_by_city CTE, ranks employee bonuses using the dense_rank() window function with windows defined by cities;

  • in the WHERE clause, filters rows by the calculated rank to return the 3 highest bonuses for employees in each city.

WITH bonus_rank_by_city AS
    (
        SELECT *, dense_rank() OVER (PARTITION BY city ORDER BY bonus DESC) AS rnk
        FROM employee_bonus_sept_24
        ORDER BY city, rnk
    )
SELECT *
FROM bonus_rank_by_city
WHERE rnk BETWEEN 1 AND 3;
   ┌─empl_id─┬─full_name────────┬─position────────┬─store_id─┬─city─────┬─bonus─┬─rnk─┐
1. │       4 │ Robert Gray      │ store manager   │      102 │ New York │  1200 │   1 │
2. │       1 │ James Wilson     │ store manager   │      101 │ New York │  1000 │   2 │
3. │       6 │ Joseph Lewis     │ sales assistant │      102 │ New York │   700 │   3 │
4. │      11 │ Mark Armstrong   │ store manager   │      202 │ Chicago  │  1200 │   1 │
5. │       7 │ William Ferguson │ store manager   │      201 │ Chicago  │   970 │   2 │
6. │      10 │ Daniel Gray      │ sales assistant │      201 │ Chicago  │   770 │   3 │
   └─────────┴──────────────────┴─────────────────┴──────────┴──────────┴───────┴─────┘

Multiple CTEs in a query

The following query uses two CTEs:

  • avg_bonus_position — average monthly bonus for a position;

  • avg_bonus_city — average monthly bonus for a city.

WITH
    avg_bonus_position AS
    (
        SELECT position, round(avg(bonus)) AS position_avg_bonus
        FROM employee_bonus_sept_24
        GROUP BY position
    ),
    avg_bonus_city AS
    (
        SELECT city, round(avg(bonus)) AS city_avg_bonus
        FROM employee_bonus_sept_24
        GROUP BY city
    )
SELECT
    b.full_name AS employee,
    b.position AS position,
    b.store_id AS store,
    b.city AS city,
    b.bonus,
    abp.position_avg_bonus,
    abc.city_avg_bonus
FROM employee_bonus_sept_24 AS b
INNER JOIN avg_bonus_position AS abp ON b.position = abp.position
INNER JOIN avg_bonus_city AS abc ON b.city = abc.city
ORDER BY b.position, b.bonus DESC;
    ┌─employee──────────┬─position────────┬─store─┬─city─────┬─bonus─┬─position_avg_bonus─┬─city_avg_bonus─┐
 1. │ Robert Gray       │ store manager   │   102 │ New York │  1200 │               1092 │            720 │
 2. │ Mark Armstrong    │ store manager   │   202 │ Chicago  │  1200 │               1092 │            647 │
 3. │ James Wilson      │ store manager   │   101 │ New York │  1000 │               1092 │            720 │
 4. │ William Ferguson  │ store manager   │   201 │ Chicago  │   970 │               1092 │            647 │
 5. │ Patricia Robinson │ cashier         │   101 │ New York │   500 │                423 │            720 │
 6. │ Elizabeth Tucker  │ cashier         │   102 │ New York │   470 │                423 │            720 │
 7. │ Mary Burton       │ cashier         │   101 │ New York │   450 │                423 │            720 │
 8. │ Donald Carter     │ cashier         │   202 │ Chicago  │   420 │                423 │            647 │
 9. │ Elizabeth Collins │ cashier         │   202 │ Chicago  │   400 │                423 │            647 │
10. │ David Green       │ cashier         │   201 │ Chicago  │   370 │                423 │            647 │
11. │ Linda Black       │ cashier         │   201 │ Chicago  │   350 │                423 │            647 │
12. │ Daniel Gray       │ sales assistant │   201 │ Chicago  │   770 │                702 │            647 │
13. │ Joseph Lewis      │ sales assistant │   102 │ New York │   700 │                702 │            720 │
14. │ Andrew Clarke     │ sales assistant │   202 │ Chicago  │   680 │                702 │            647 │
15. │ Paul Brown        │ sales assistant │   202 │ Chicago  │   660 │                702 │            647 │
    └───────────────────┴─────────────────┴───────┴──────────┴───────┴────────────────────┴────────────────┘

Nested CTEs

The following query allows you to compare the amount of funds allocated for employee bonuses in different stores. It shows what percentage the amount of bonuses paid to employees in each store takes up from the total amount of expenses on bonuses for employees of the entire chain. To do this, the query uses nested CTEs:

  • bonus_all_stores — total amount of expenses on bonuses for employees of all stores;

  • store_bonus_totals_and_rates — the amount of bonuses paid to employees in each store and its share relative to the total employee bonus expense in the month, calculated using the output of the bonus_all_stores CTE.

WITH
    (SELECT sum(bonus) AS all_stores_bonus FROM employee_bonus_sept_24) AS bonus_all_stores,
    store_bonus_totals_and_rates AS
    (
        SELECT
            store_id,
            count() AS employee_count,
            sum(bonus) AS store_bonus_total,
            store_bonus_total / bonus_all_stores AS store_bonus_rate
        FROM employee_bonus_sept_24
        GROUP BY store_id
    )
SELECT
    store_id,
    employee_count,
    store_bonus_total AS total_bonus,
    round(store_bonus_rate * 100, 2) AS bonus_percentage
FROM store_bonus_totals_and_rates;
   ┌─store_id─┬─employee_count─┬─total_bonus─┬─bonus_percentage─┐
1. │      101 │              3 │        1950 │            19.23 │
2. │      102 │              3 │        2370 │            23.37 │
3. │      201 │              4 │        2460 │            24.26 │
4. │      202 │              5 │        3360 │            33.14 │
   └──────────┴────────────────┴─────────────┴──────────────────┘
Found a mistake? Seleсt text and press Ctrl+Enter to report it