Queries with Window Functions
Window expressions allow application developers to more easily compose complex online analytical processing (OLAP) queries using standard SQL commands. For example, with window expressions, users can calculate moving averages or sums over various intervals, reset aggregations and ranks as selected column values change, and express complex ratios in simple terms.
A window expression represents the application of a <window function>
to a <window frame>
, which is defined with an OVER()
clause. This is comparable to the type of calculation that can be done with an aggregate function and a GROUP BY
clause. Unlike aggregate functions, which return a single result value for each group of rows, window functions return a result value for every row, but that value is calculated with respect to the set of rows in the window frame to which the row belongs. The OVER()
clause allows dividing the rows into partitions and then further restricting the window frame by specifying which rows preceding or following the current row within its partition to include in the calculation.
Apache Cloudberry does not support specifying a window function as an argument to another window function.
The syntax of a window expression is:
<window_function> ( [<expression> [, ...]] ) [ FILTER ( WHERE <filter_clause> ) ] OVER ( <window_specification> )
Where <window_function>
might be a user-defined window function, and <expression>
is any value expression that does not contain a window expression, and <window_specification>
is:
[<window_name>]
[PARTITION BY <expression> [, ...]]
[[ORDER BY <expression> [ASC | DESC | USING operator] [NULLS {FIRST | LAST}] [, ...]
[ <frame_clause> ]
The optional <frame_clause>
can be one of the following:
{ RANGE | ROWS | GROUPS } <frame_start> [ <frame_exclusion> ]
{ RANGE | ROWS | GROUPS } BETWEEN <frame_start> AND <frame_end> [ <frame_exclusion> ]
Where <frame_start>
and <frame_end>
can be one of the following:
UNBOUNDED PRECEDING
<offset> PRECEDING
CURRENT ROW
<offset> FOLLOWING
UNBOUNDED FOLLOWING
and <frame_exclusion>
can be one of the following:
EXCLUDE CURRENT ROW
EXCLUDE GROUP
EXCLUDE TIES
EXCLUDE NO OTHERS
A window expression can appear only in the select list of a SELECT
command. For example:
SELECT count(*) OVER(PARTITION BY customer_id), * FROM sales;
If FILTER
is specified, then only the input rows for which the <filter_clause>
evaluates to true are fed to the window function; other rows are discarded. In a window expression, a FILTER
clause can be used only with a <window_function>
that is an aggregate function.
In a window expression, the expression must contain an OVER
clause. The OVER
clause specifies the window frame—the rows to be processed by the window function. This syntactically distinguishes the function from a regular or aggregate function.
In a window aggregate function that is used in a window expression, Apache Cloudberry does not support a DISTINCT
clause with multiple input expressions.
A window specification has the following characteristics:
- The
PARTITION BY
clause defines the window partitions to which the window function is applied. If omitted, the entire result set is treated as one partition. - The
ORDER BY
clause defines the expression(s) for sorting rows within a window partition. TheORDER BY
clause of a window specification is separate and distinct from theORDER BY
clause of a regular query expression. TheORDER BY
clause is required for the window functions that calculate rankings, as it identifies the measure(s) for the ranking values. For OLAP aggregations, theORDER BY
clause is required to use window frames (theROWS
,RANGE
orGROUPS
clause).
Columns of data types without a coherent ordering, such as time
, are not good candidates for use in the ORDER BY
clause of a window specification. Time
, with or without a specified time zone, lacks a coherent ordering because addition and subtraction do not have the expected effects. For example, the following is not generally true: x::time < x::time + '2 hour'::interval
-
The
<frame_clause>
specifies the set of rows constituting the<window frame>
, which is a subset of the current partition, for those window functions that act on the frame instead of the whole partition. The set of rows in the frame can vary depending on which row is the current row. The frame can be specified inRANGE
,ROWS
orGROUPS
mode; in each case, it runs from the<frame_start>
to the<frame_end>
. If<frame_end>
is omitted, the end defaults toCURRENT ROW
. -
A
<frame_start>
ofUNBOUNDED PRECEDING
means that the frame starts with the first row of the partition, and similarly a<frame_end>
ofUNBOUNDED FOLLOWING
means that the frame ends with the last row of the partition. -
In
RANGE
orGROUPS
mode, a<frame_start>
ofCURRENT ROW
means the frame starts with the current row's first peer row (a row that the window'sORDER BY
clause sorts as equivalent to the current row), while a<frame_end>
ofCURRENT ROW
means the frame ends with the current row's last peer row. InROWS
mode,CURRENT ROW
simply means the current row. -
In the
<offset> PRECEDING
and<offset> FOLLOWING
frame options, the<offset>
must be an expression not containing any variables, aggregate functions, or window functions. The meaning of the<offset>
depends on the frame mode:-
In
ROWS
mode, the<offset>
must yield a non-null, non-negative integer, and the option means that the frame starts or ends the specified number of rows before or after the current row. -
In
GROUPS
mode, the<offset>
again must yield a non-null, non-negative integer, and the option means that the frame starts or ends the specified number of peer groups before or after the current row's peer group, where a peer group is a set of rows that are equivalent in theORDER BY
ordering. (There must be anORDER BY
clause in the window definition to useGROUPS
mode). -
In
RANGE
mode, these options require that theORDER BY
clause specifies exactly one column. The<offset>
specifies the maximum difference between the value of that column in the current row and its value in preceding or following rows of the frame. The data type of the<offset>
expression varies depending on the data type of the ordering column. For numeric ordering columns it is typically of the same type as the ordering column, but for datetime ordering columns it is aninterval
. For example, if the ordering column is of typedate
ortimestamp
, one could writeRANGE BETWEEN '1 day' PRECEDING AND '10 days' FOLLOWING
. The<offset>
is still required to be non-null and non-negative, though the meaning of “non-negative” depends on its data type.
In any case, the distance to the end of the frame is limited by the distance to the end of the partition, so that for rows near the partition ends the frame might contain fewer rows than elsewhere.
-
-
Notice that in both
ROWS
andGROUPS
mode,0 PRECEDING
and0 FOLLOWING
are equivalent toCURRENT ROW
. This normally holds inRANGE
mode as well, for an appropriate data-type-specific meaning of “zero”. -
The
<frame_exclusion>
option allows rows around the current row to be excluded from the frame, even if they would be included according to the frame start and frame end options.EXCLUDE CURRENT ROW
excludes the current row from the frame.EXCLUDE GROUP
excludes the current row and its ordering peers from the frame.EXCLUDE TIES
excludes any peers of the current row from the frame, but not the current row itself.EXCLUDE NO OTHERS
simply specifies explicitly the default behavior of not excluding the current row or its peers. -
The default framing option is
RANGE UNBOUNDED PRECEDING
, which is the same asRANGE BETWEEN UNBOUNDED PRECEDING AND CURRENT ROW
. WithORDER BY
, this sets the frame to be all rows from the partition start up through the current row's lastORDER BY
peer. WithoutORDER BY
, this means all rows of the partition are included in the window frame, because all rows become peers of the current row. -
Restrictions are that
<frame_start>
cannot beUNBOUNDED FOLLOWING
,<frame_end>
cannot beUNBOUNDED PRECEDING
, and the<frame_end>
choice cannot appear earlier in the above list than the<frame_start>
choice. for exampleRANGE BETWEEN CURRENT ROW AND value PRECEDING
is not allowed.
Window examples
The following examples demonstrate using window functions with partitions and window frames.
Example 1 – Aggregate window function over a partition
The PARTITION BY
list in the OVER
clause divides the rows into groups, or partitions, that have the same values as the specified expressions.
This example compares employees' salaries with the average salaries for their departments:
SELECT depname, empno, salary, avg(salary) OVER(PARTITION BY depname)
FROM empsalary;
depname | empno | salary | avg
-----------+-------+--------+-----------------------
develop | 9 | 4500 | 5020.0000000000000000
develop | 10 | 5200 | 5020.0000000000000000
develop | 11 | 5200 | 5020.0000000000000000
develop | 7 | 4200 | 5020.0000000000000000
develop | 8 | 6000 | 5020.0000000000000000
personnel | 5 | 3500 | 3700.0000000000000000
personnel | 2 | 3900 | 3700.0000000000000000
sales | 1 | 5000 | 4866.6666666666666667
sales | 3 | 4800 | 4866.6666666666666667
sales | 4 | 4800 | 4866.6666666666666667
(10 rows)
The first three output columns come from the table empsalary
, and there is one output row for each row in the table. The fourth column is the average calculated on all rows that have the same depname
value as the current row. Rows that share the same depname
value constitute a partition, and there are three partitions in this example. The avg
function is the same as the regular avg
aggregate function, but the OVER
clause causes it to be applied as a window function.
You can also put the window specification in a WINDOW
clause and reference it in the select list. This example is equivalent to the previous query:
SELECT depname, empno, salary, avg(salary) OVER(mywindow)
FROM empsalary
WINDOW mywindow AS (PARTITION BY depname);
Defining a named window is useful when the select list has multiple window functions using the same window specification.
Example 2 – Ranking window function with an ORDER BY clause
An ORDER BY
clause within the OVER
clause controls the order in which rows are processed by window functions. The ORDER BY
list for the window function does not have to match the output order of the query. This example uses the rank()
window function to rank employees' salaries within their departments:
SELECT depname, empno, salary,
rank() OVER (PARTITION BY depname ORDER BY salary DESC)
FROM empsalary;
depname | empno | salary | rank
-----------+-------+--------+------
develop | 8 | 6000 | 1
develop | 11 | 5200 | 2
develop | 10 | 5200 | 2
develop | 9 | 4500 | 4
develop | 7 | 4200 | 5
personnel | 2 | 3900 | 1
personnel | 5 | 3500 | 2
sales | 1 | 5000 | 1
sales | 4 | 4800 | 2
sales | 3 | 4800 | 2
(10 rows)
Example 3 – Aggregate function over a row window frame
A RANGE
, ROWS
or GROUPS
clause defines the window frame—a set of rows within a partition—that the window function includes in the calculation. ROWS
specifies a physical set of rows to process, for example all rows from the beginning of the partition to the current row.
This example calculates a running total of employee's salaries by department using the sum()
function to total rows from the start of the partition to the current row:
SELECT depname, empno, salary,
sum(salary) OVER (PARTITION BY depname ORDER BY salary
ROWS between UNBOUNDED PRECEDING AND CURRENT ROW)
FROM empsalary ORDER BY depname, sum;
depname | empno | salary | sum
-----------+-------+--------+-------
develop | 7 | 4200 | 4200
develop | 9 | 4500 | 8700
develop | 11 | 5200 | 13900
develop | 10 | 5200 | 19100
develop | 8 | 6000 | 25100
personnel | 5 | 3500 | 3500
personnel | 2 | 3900 | 7400
sales | 4 | 4800 | 4800
sales | 3 | 4800 | 9600
sales | 1 | 5000 | 14600
(10 rows)
Example 4 – Aggregate function for a range or groups window frame
RANGE
and GROUPS
modes specify logical values based on values of the ORDER BY
expression in the OVER
clause. This example demonstrates the difference between ROWS
and RANGE
or GROUPS
. The frame contains all rows with salary values less than or equal to the current row. Unlike the previous example, for employees with the same salary, the sum is the same and includes the salaries of all of those employees.
SELECT depname, empno, salary,
sum(salary) OVER (PARTITION BY depname ORDER BY salary
RANGE between UNBOUNDED PRECEDING AND CURRENT ROW)
FROM empsalary ORDER BY depname, sum;
depname | empno | salary | sum
-----------+-------+--------+-------
develop | 7 | 4200 | 4200
develop | 9 | 4500 | 8700
develop | 11 | 5200 | 19100
develop | 10 | 5200 | 19100
develop | 8 | 6000 | 25100
personnel | 5 | 3500 | 3500
personnel | 2 | 3900 | 7400
sales | 4 | 4800 | 9600
sales | 3 | 4800 | 9600
sales | 1 | 5000 | 14600
(10 rows)
In this example, because the <frame_start>
and <frame_end>
are not using an <offset>
, RANGE
and GROUPS
mode provide identical results:
SELECT depname, empno, salary,
sum(salary) OVER (PARTITION BY depname ORDER BY salary
GROUPS between UNBOUNDED PRECEDING AND CURRENT ROW)
FROM empsalary ORDER BY depname, sum;
depname | empno | salary | sum
-----------+-------+--------+-------
develop | 7 | 4200 | 4200
develop | 9 | 4500 | 8700
develop | 11 | 5200 | 19100
develop | 10 | 5200 | 19100
develop | 8 | 6000 | 25100
personnel | 5 | 3500 | 3500
personnel | 2 | 3900 | 7400
sales | 4 | 4800 | 9600
sales | 3 | 4800 | 9600
sales | 1 | 5000 | 14600
(10 rows)
Example 5 – Aggregate function for a range and a groups window frame
This example demonstrates the difference between RANGE
and GROUPS
. The table sales
lists the sales a company's two shops during a period of four days.
When using GROUPS
mode, the value of <offset>
indicates the number of peer groups before the current row's peer group. The different peer groups are set by the ORDER BY
clause, in this case the date. because the query does not specify a value for <frame_end>
, the end defaults to CURRENT ROW
.
SELECT date, shop, total, sum(total) OVER (PARTITION BY shop ORDER BY date asc GROUPS 2 PRECEDING)
FROM sales ORDER BY shop, date;
date | shop | total | sum
------------+--------+---------+----------
2022-01-07 | Shop 1 | 3000.00 | 3000.00
2022-01-08 | Shop 1 | 1000.00 | 4000.00
2022-01-09 | Shop 1 | 5000.00 | 11000.00
2022-01-09 | Shop 1 | 2000.00 | 11000.00
2022-01-07 | Shop 2 | 4000.00 | 10000.00
2022-01-07 | Shop 2 | 6000.00 | 10000.00
2022-01-09 | Shop 2 | 7000.00 | 21000.00
2022-01-09 | Shop 2 | 4000.00 | 21000.00
2022-01-10 | Shop 2 | 2000.00 | 23000.00
(9 rows)
In order to get the equivalent result when using the RANGE
mode, the <frame_clause>
must specify an <offset>
using the same data type as the ordering column, in this case date
. Using the numberic value 2
will return an error.
SELECT date, shop, total, sum(total) OVER (PARTITION BY shop ORDER BY date asc RANGE '2 days' PRECEDING)
FROM sales ORDER BY shop, date;
date | shop | total | sum
------------+--------+---------+----------
2022-01-07 | Shop 1 | 3000.00 | 3000.00
2022-01-08 | Shop 1 | 1000.00 | 4000.00
2022-01-09 | Shop 1 | 5000.00 | 11000.00
2022-01-09 | Shop 1 | 2000.00 | 11000.00
2022-01-07 | Shop 2 | 4000.00 | 10000.00
2022-01-07 | Shop 2 | 6000.00 | 10000.00
2022-01-09 | Shop 2 | 7000.00 | 21000.00
2022-01-09 | Shop 2 | 4000.00 | 21000.00
2022-01-10 | Shop 2 | 2000.00 | 13000.00
(9 rows)
Note that the above outputs differ in the last row because RANGE
mode uses the entries from the previous two days and there is no entry for 2022-01-08
for Shop 2, but GROUPS
mode uses the previous two peer groups, which are 2022-01-07
and 2022-01-09
.