Skip to main content
Version: 2.x

Queries with Aggregate Expressions

An aggregate expression applies an aggregate function across the rows that a query selects. An aggregate function performs a calculation on a set of values and returns a single value, such as the sum or average of the set of values. The syntax of an aggregate expression is one of the following:

  • aggregate_name(expression [ , ... ] ) [ FILTER ( WHERE filter_clause ) ] — operates across all input rows for which the expected result value is non-null. ALL is the default.
  • aggregate_name(ALL expression [ , ... ] ) [ FILTER ( WHERE filter_clause ) ] — operates identically to the first form because ALL is the default.
  • aggregate_name(DISTINCT expression [ , ... ] ) [ FILTER ( WHERE filter_clause ) ] — operates across all distinct non-null values of input rows.
  • aggregate_name(*) [ FILTER ( WHERE filter_clause ) ] — operates on all rows with values both null and non-null. Generally, this form is most useful for the count(*) aggregate function.

Where aggregate_name is a previously defined aggregate (possibly schema-qualified) and expression is any value expression that does not contain an aggregate expression.

For example, count(*) yields the total number of input rows, count(f1) yields the number of input rows in which f1 is non-null, andcount(distinct f1) yields the number of distinct non-null values of f1.

If FILTER is specified, then only the input rows for which the filter_clause evaluates to true are fed to the aggregate function; other rows are discarded. For example:

SELECT
count(*) AS unfiltered,
count(*) FILTER (WHERE i < 5) AS filtered
FROM generate_series(1,10) AS s(i);
unfiltered | filtered
------------+----------
10 | 4
(1 row)

You can also add custom aggregate functions.

Apache Cloudberry provides the MEDIAN aggregate function, which returns the fiftieth percentile of the PERCENTILE_CONT result and special aggregate expressions for inverse distribution functions as follows:

PERCENTILE_CONT(<percentage>) WITHIN GROUP (ORDER BY <expression>)
PERCENTILE_DISC(<percentage>) WITHIN GROUP (ORDER BY <expression>)

Currently you can use only these two expressions with the keyword WITHIN GROUP.

Limitations of aggregate expressions

The following are current limitations of the aggregate expressions:

  • Apache Cloudberry does not support the following keywords: ALL, DISTINCT, and OVER.
  • An aggregate expression can appear only in the result list or HAVING clause of a SELECT command. It is forbidden in other clauses, such as WHERE, because those clauses are logically evaluated before the results of aggregates form. This restriction applies to the query level to which the aggregate belongs.
  • When an aggregate expression appears in a subquery, the aggregate is normally evaluated over the rows of the subquery. If the aggregate's arguments (and filter_clause if any) contain only outer-level variables, the aggregate belongs to the nearest such outer level and evaluates over the rows of that query. The aggregate expression as a whole is then an outer reference for the subquery in which it appears, and the aggregate expression acts as a constant over any one evaluation of that subquery. The restriction about appearing only in the result list or HAVING clause applies with respect to the query level at which the aggregate appears.
  • Apache Cloudberry does not support specifying an aggregate function as an argument to another aggregate function.
  • Apache Cloudberry does not support specifying a window function as an argument to an aggregate function.