Конференция Arenadata
Новое время — новый Greenplum
Мы приглашаем вас принять участие в конференции, посвященной будущему Open-Source Greenplum 19 сентября в 18:00:00 UTC +3. Встреча будет проходить в гибридном формате — и офлайн, и онлайн. Онлайн-трансляция будет доступна для всех желающих.
Внезапное закрытие Greenplum его владельцем — компанией Broadcom - стало неприятным сюрпризом для всех, кто использует или планирует начать использовать решения на базе этой технологии. Многие ожидают выхода стабильной версии Greenplum 7 и надеются на её дальнейшее активное развитие.
Arenadata не могла допустить, чтобы разрабатываемый годами Open-Source проект Greenplum прекратил своё существование, поэтому 19 сентября мы представим наш ответ на данное решение Broadcom, а участники сообщества получат исчерпывающие разъяснения на все вопросы о дальнейшей судьбе этой технологии.

На конференции вас ждёт обсуждение следующих тем:

  • План возрождения Greenplum;
  • Дорожная карта;
  • Экспертное обсуждение и консультации.
Осталось до события

Window functions

Window functions perform a calculation across a set of rows. They do not return a single output row as aggregate functions, window functions include some additional information in the selection result.

The simplified window function syntax can be written as follows:

<function_name> OVER (<window>)

Where:

  • function_name is a window function name;

  • window is an expression that describes the set of rows to process and the processing order.

For information on the extended window function syntax, refer to Window Function Calls.

If you need to pass all result rows to a window function, use empty parentheses (). For example, the following query adds row numbers to the result:

SELECT id, title, evaluation,
       row_number() OVER () AS numbers
FROM books;

The result:

 id |                title                | evaluation | numbers
----+-------------------------------------+------------+---------
  1 | Mrs. Dalloway                       |       7.82 |       1
  2 | To the Lighthouse                   |       8.45 |       2
  3 | To Kill a Mockingbird               |       7.48 |       3
  4 | The Great Gatsby                    |       9.23 |       4
  5 | The Lord of the Rings               |       9.49 |       5
  6 | 1984                                |       8.17 |       6
  7 | The Hobbit, or There and Back Again |       9.32 |       7
  8 | War and Peace                       |       9.69 |       8
  9 | Hyperion                            |       9.46 |       9
 10 | The Time Machine                    |       8.12 |      10

You can add the ORDER BY clause to the window function to change the processing order:

SELECT id, title, genre, evaluation,
       row_number() OVER (ORDER BY evaluation DESC) AS rating
FROM books
ORDER BY id;

The result:

 id |                title                |  genre  | evaluation | rating
----+-------------------------------------+---------+------------+--------
  1 | Mrs. Dalloway                       | novel   |       7.82 |      9
  2 | To the Lighthouse                   | novel   |       8.45 |      6
  3 | To Kill a Mockingbird               | novel   |       7.48 |     10
  4 | The Great Gatsby                    | novel   |       9.23 |      5
  5 | The Lord of the Rings               | fantasy |       9.49 |      2
  6 | 1984                                | sci-fi  |       8.17 |      7
  7 | The Hobbit, or There and Back Again | fantasy |       9.32 |      4
  8 | War and Peace                       | novel   |       9.69 |      1
  9 | Hyperion                            | sci-fi  |       9.46 |      3
 10 | The Time Machine                    | sci-fi  |       8.12 |      8

In the example above, a query-level sort (ORDER BY id) is added to the query. PostgreSQL first evaluates the window function results and then sorts its results with the query results according to ORDER BY id. ORDER BY at the window level and ORDER BY at the query level are applied correctly and do not interfere with each other.

You can use the PARTITION BY clause to divide rows into groups or partitions that share the same values of the PARTITION BY expression. The window function calculates a row value across the rows of the same partition. The following example calculates a book rating across books of the same genre:

SELECT id, title, genre, evaluation,
       row_number() OVER (PARTITION BY genre ORDER BY evaluation DESC) AS rating
FROM books
ORDER BY genre;

The result:

 id |                title                |  genre  | evaluation | rating
----+-------------------------------------+---------+------------+--------
  5 | The Lord of the Rings               | fantasy |       9.49 |      1
  7 | The Hobbit, or There and Back Again | fantasy |       9.32 |      2
  8 | War and Peace                       | novel   |       9.69 |      1
  4 | The Great Gatsby                    | novel   |       9.23 |      2
  2 | To the Lighthouse                   | novel   |       8.45 |      3
  1 | Mrs. Dalloway                       | novel   |       7.82 |      4
  3 | To Kill a Mockingbird               | novel   |       7.48 |      5
  9 | Hyperion                            | sci-fi  |       9.46 |      1
  6 | 1984                                | sci-fi  |       8.17 |      2
 10 | The Time Machine                    | sci-fi  |       8.12 |      3

If you omit PARTITION BY, a window function is evaluated for all rows.

The list of predefined window functions is available at the following link: General-purpose window functions.

You can also use aggregate functions sum, count and others as window functions. If ORDER BY is not specified in the window clause of an aggregate function, the function is calculated for the entire partition and its result is written to all partition rows.

For example, we have the table orders:

 id | customer_id |  total
----+-------------+---------
  1 |          17 | 3500.00
  2 |          22 | 1000.00
  4 |          46 | 3300.00
 11 |          38 | 2000.00
 12 |          17 | 1750.00
  3 |          17 | 5000.00
  5 |          17 | 4700.00
  6 |          17 | 2000.00
  7 |          46 | 1900.00
  8 |          22 | 2100.00
  9 |          22 | 1400.00
 10 |          13 | 1400.00
 13 |          46 | 4600.00
 14 |          13 | 2250.00

The query below calculates the sum of orders for each customer:

SELECT id, customer_id,
    sum(total) OVER (PARTITION BY customer_id) as sum
FROM orders;

The result:

 id | customer_id |   sum
----+-------------+----------
 10 |          13 |  3650.00
 14 |          13 |  3650.00
  6 |          17 | 16950.00
 12 |          17 | 16950.00
  3 |          17 | 16950.00
  5 |          17 | 16950.00
  1 |          17 | 16950.00
  2 |          22 |  4500.00
  9 |          22 |  4500.00
  8 |          22 |  4500.00
 11 |          38 |  2000.00
  7 |          46 |  9800.00
  4 |          46 |  9800.00
 13 |          46 |  9800.00

If ORDER BY is specified, the function is calculated for rows from the beginning of the partition to the current row. If the next row contains the same value of the field specified in the ORDER BY clause, it is included in the calculation. This set of rows is called window frame. Some window functions operate with window frame rows, rather than with the entire partition. For example, the sum function works with window frames. Add ORDER BY to the example above:

SELECT id, customer_id,
    sum(total) OVER (PARTITION BY customer_id ORDER BY id) as sum
FROM orders;

The result:

 id | customer_id |   sum
----+-------------+----------
 10 |          13 |  1400.00
 14 |          13 |  3650.00
  1 |          17 |  3500.00
  3 |          17 |  8500.00
  5 |          17 | 13200.00
  6 |          17 | 15200.00
 12 |          17 | 16950.00
  2 |          22 |  1000.00
  8 |          22 |  3100.00
  9 |          22 |  4500.00
 11 |          38 |  2000.00
  4 |          46 |  3300.00
  7 |          46 |  5200.00
 13 |          46 |  9800.00

The result contains the running total for each partition.

You can use window functions in the SELECT and ORDER BY clause. They cannot be included in the GROUP BY, HAVING, WHERE, and other clauses, because window functions are calculated after the processing of these clauses. Also, window functions are executed after non-window aggregate functions. You can include an aggregate function call in the arguments of a window function.

To filter or group rows after a window function calculation, use a subquery. The following query utilizes subquery to apply a filter and display rows where sum less than 5000:

SELECT id, customer_id, sum
 FROM
    (SELECT id, customer_id, sum(total) OVER (PARTITION BY customer_id ORDER BY id) as sum
        FROM orders
    )AS orders_with_sum
 WHERE sum < 5000;

The result:

 id | customer_id |   sum
----+-------------+---------
 10 |          13 | 1400.00
 14 |          13 | 3650.00
  1 |          17 | 3500.00
  2 |          22 | 1000.00
  8 |          22 | 3100.00
  9 |          22 | 4500.00
 11 |          38 | 2000.00
  4 |          46 | 3300.00

You can use several window functions in a single query:

SELECT id, customer_id,
    sum(total) OVER (PARTITION BY customer_id ORDER BY id) as sum,
    round(avg(total) OVER(PARTITION BY customer_id),2) as avg
FROM orders;

The result:

 id | customer_id |   sum    |   avg
----+-------------+----------+---------
 10 |          13 |  1400.00 | 1825.00
 14 |          13 |  3650.00 | 1825.00
  1 |          17 |  3500.00 | 3390.00
  3 |          17 |  8500.00 | 3390.00
  5 |          17 | 13200.00 | 3390.00
  6 |          17 | 15200.00 | 3390.00
 12 |          17 | 16950.00 | 3390.00
  2 |          22 |  1000.00 | 1500.00
  8 |          22 |  3100.00 | 1500.00
  9 |          22 |  4500.00 | 1500.00
 11 |          38 |  2000.00 | 2000.00
  4 |          46 |  3300.00 | 3266.67
  7 |          46 |  5200.00 | 3266.67
 13 |          46 |  9800.00 | 3266.67

If multiple window functions have the same OVER clause, you can name it and place separately with the WINDOW keyword. Rewrite the query above as follows:

SELECT id, customer_id,
    sum(total) OVER (w ORDER BY id) as sum,
    round(avg(total) OVER(w),2) as avg
FROM orders
WINDOW w AS (PARTITION BY customer_id);

The result:

 id | customer_id |   sum    |   avg
----+-------------+----------+---------
 10 |          13 |  1400.00 | 1825.00
 14 |          13 |  3650.00 | 1825.00
  1 |          17 |  3500.00 | 3390.00
  3 |          17 |  8500.00 | 3390.00
  5 |          17 | 13200.00 | 3390.00
  6 |          17 | 15200.00 | 3390.00
 12 |          17 | 16950.00 | 3390.00
  2 |          22 |  1000.00 | 1500.00
  8 |          22 |  3100.00 | 1500.00
  9 |          22 |  4500.00 | 1500.00
 11 |          38 |  2000.00 | 2000.00
  4 |          46 |  3300.00 | 3266.67
  7 |          46 |  5200.00 | 3266.67
 13 |          46 |  9800.00 | 3266.67
Found a mistake? Seleсt text and press Ctrl+Enter to report it