CREATE FUNCTION
Defines a new function.
Synopsis
CREATE [OR REPLACE] FUNCTION <name>
( [ [<argmode>] [<argname>] <argtype> [ { DEFAULT | = } <default_expr> ] [, ...] ] )
[ RETURNS <rettype>
| RETURNS TABLE ( <column_name> <column_type> [, ...] ) ]
{ LANGUAGE <lang_name>
| TRANSFORM { FOR TYPE <type_name> } [, ... ]
| WINDOW
| { IMMUTABLE | STABLE | VOLATILE }
| [ NOT ] LEAKPROOF
| { CALLED ON NULL INPUT | RETURNS NULL ON NULL INPUT | STRICT }
| { [ EXTERNAL ] SECURITY INVOKER | [ EXTERNAL ] SECURITY DEFINER }
| EXECUTE ON { ANY | COORDINATOR | ALL SEGMENTS | INITPLAN }
| PARALLEL { UNSAFE | RESTRICTED | SAFE }
| COST <execution_cost>
| ROWS <result_rows>
| SUPPORT <support_function>
| SET <configuration_parameter> { TO <value> | = <value> | FROM CURRENT }
| AS '<definition>'
| AS '<obj_file>', '<link_symbol>'
[ WITH ({ DESCRIBE = <describe_function> } [, ...] ) ]
} ...
Description
CREATE FUNCTION
defines a new function. CREATE OR REPLACE FUNCTION
either creates a new function, or replaces an existing definition. To define a function, the user must have the USAGE
privilege on the language.
If a schema name is included, then the function is created in the specified schema. Otherwise it is created in the current schema. The name of the new function must not match any existing function with the same input argument types in the same schema. However, functions of different argument types may share a name (overloading).
To update the current definition of an existing function, use CREATE OR REPLACE FUNCTION
. It is not possible to change the name or argument types of a function this way (this would actually create a new, distinct function). Also, CREATE OR REPLACE FUNCTION
will not let you change the return type of an existing function. To do that, you must drop and recreate the function. (When using OUT
parameters, that means you cannot change the types of any OUT
parameters except by dropping the function.)
When CREATE OR REPLACE FUNCTION
is used to replace an existing function, the ownership and permissions of the function do not change. All other function properties are assigned the values specified or implied in the command. You must own the function to replace it (this includes being a member of the owning role).
If you drop and then recreate a function, the new function is not the same entity as the old; you will have to drop existing objects (rules, views, triggers, and so on) that refer to the old function. Use CREATE OR REPLACE FUNCTION
to change a function definition without breaking objects that refer to the function. Also, ALTER FUNCTION can be used to change most of the auxiliary properties of an existing function.
The user that creates the function becomes the owner of the function.
To be able to create a function, you must have USAGE
privilege on the argument types and the return type.
For more information about creating functions, refer to the User Defined Functions section of the PostgreSQL documentation.
Limited Use of VOLATILE and STABLE Functions
To prevent data from becoming out-of-sync across the segments in Apache Cloudberry, any function classified as STABLE
or VOLATILE
cannot be run at the segment level if it contains SQL or modifies the database in any way. For example, functions such as random()
or timeofday()
are not allowed to run on distributed data in Apache Cloudberry because they could potentially cause inconsistent data between the segment instances.
To ensure data consistency, VOLATILE
and STABLE
functions can safely be used in statements that are evaluated on and run from the coordinator. For example, the following statements are always run on the coordinator (statements without a FROM
clause):
SELECT setval('myseq', 201);
SELECT foo();
In cases where a statement has a FROM
clause containing a distributed table and the function used in the FROM
clause simply returns a set of rows, execution may be allowed on the segments:
SELECT * FROM foo();
One exception to this rule are functions that return a table reference (rangeFuncs
) or functions that use the refCursor
data type. Note that you cannot return a refcursor
from any kind of function in Apache Cloudberry.
Function Volatility and EXECUTE ON Attributes
Volatility attributes (IMMUTABLE
, STABLE
, VOLATILE
) and EXECUTE ON
attributes specify two different aspects of function execution. In general, volatility indicates when the function is run, and EXECUTE ON
indicates where it is run.
For example, a function defined with the IMMUTABLE
attribute can be run at query planning time, while a function with the VOLATILE
attribute must be run for every row in the query. A function with the EXECUTE ON COORDINATOR
attribute is run only on the coordinator segment and a function with the EXECUTE ON ALL SEGMENTS
attribute is run on all primary segment instances (not the coordinator).
Functions And Replicated Tables
A user-defined function that runs only SELECT
commands on replicated tables can run on segments. Replicated tables, created with the DISTRIBUTED REPLICATED
clause, store all of their rows on every segment. It is safe for a function to read them on the segments, but updates to replicated tables must run on the coordinator instance.
Parameters
name
The name (optionally schema-qualified) of the function to create.
argmode
The mode of an argument: either IN
, OUT
, INOUT
, or VARIADIC
. If omitted, the default is IN
. Only OUT
arguments can follow an argument declared as VARIADIC
. Also, OUT
and INOUT
arguments cannot be used together with the RETURNS TABLE
notation.
argname
The name of an argument. Some languages (currently only SQL and PL/pgSQL) let you use the name in the function body. For other languages the name of an input argument is just extra documentation, so far as the function itself is concerned; but you can use input argument names when calling a function to improve readability. In any case, the name of an output argument is significant, since it defines the column name in the result row type. (If you omit the name for an output argument, the system will choose a default column name.)
argtype
The data type(s) of the function's arguments (optionally schema-qualified), if any. The argument types may be base, composite, or domain types, or may reference the type of a table column.
Depending on the implementation language it may also be allowed to specify pseudotypes such as cstring
. Pseudotypes indicate that the actual argument type is either incompletely specified, or outside the set of ordinary SQL data types.
The type of a column is referenced by writing table_name.column_name%TYPE
. Using this feature can sometimes help make a function independent of changes to the definition of a table.
default_expr
An expression to be used as the default value if the parameter is not specified. The expression must be coercible to the argument type of the parameter. Only input (including INOUT
) parameters can have a default value. Each input parameter in the argument list that follows a parameter with a default value must have a default value as well.
rettype
The return data type (optionally schema-qualified). The return type can be a base, composite, or domain type, or may reference the type of a table column. Depending on the implementation language it may also be allowed to specify pseudotypes such as cstring
. If the function is not supposed to return a value, specify void
as the return type.
When there are OUT
or INOUT
parameters, the RETURNS
clause may be omitted. If present, it must agree with the result type implied by the output parameters: RECORD
if there are multiple output parameters, or the same type as the single output parameter.
The SETOF
modifier indicates that the function will return a set of items, rather than a single item.
The type of a column is referenced by writing table_name.column_name%TYPE
.
column_name
The name of an output column in the RETURNS TABLE
syntax. This is effectively another way of declaring a named OUT
parameter, except that RETURNS TABLE
also implies RETURNS SETOF
.
column_type
The data type of an output column in the RETURNS TABLE
syntax.
lang_name
The name of the language that the function is implemented in. May be SQL
, C
, internal
, or the name of a user-defined procedural language, e.g. plpgsql
. Enclosing the name in single quotes is deprecated and requires matching case.
TRANSFORM { FOR TYPE type_name } [, ... ] }
Lists which transforms a call to the function should apply. Transforms convert between SQL types and language-specific data types. Procedural language implementations usually have hardcoded knowledge of the built-in types, so those don't need to be listed here. If a procedural language implementation does not know how to handle a type and no transform is supplied, it will fall back to a default behavior for converting data types, but this depends on the implementation.
WINDOW
WINDOW
indicates that the function is a window function rather than a plain function. This is currently only useful for functions written in C. The WINDOW
attribute cannot be changed when replacing an existing function definition.
IMMUTABLE
STABLE
VOLATILE
These attributes inform the query optimizer about the behavior of the function. At most one choice may be specified. If none of these appear, VOLATILE
is the default assumption. Since Apache Cloudberry currently has limited use of VOLATILE
functions, if a function is truly IMMUTABLE
, you must declare it as so to be able to use it without restrictions.
IMMUTABLE
indicates that the function cannot modify the database and always returns the same result when given the same argument values. It does not do database lookups or otherwise use information not directly present in its argument list. If this option is given, any call of the function with all-constant arguments can be immediately replaced with the function value.
STABLE
indicates that the function cannot modify the database, and that within a single table scan it will consistently return the same result for the same argument values, but that its result could change across SQL statements. This is the appropriate selection for functions whose results depend on database lookups, parameter variables (such as the current time zone), and so on. Also note that the current_timestamp()
family of functions qualify as stable, since their values do not change within a transaction.
VOLATILE
indicates that the function value can change even within a single table scan, so no optimizations can be made. Relatively few database functions are volatile in this sense; some examples are random()
, timeofday()
. But note that any function that has side-effects must be classified volatile, even if its result is quite predictable, to prevent calls from being optimized away; an example is setval()
.
LEAKPROOF
LEAKPROOF
indicates that the function has no side effects. It reveals no information about its arguments other than by its return value. For example, a function that throws an error message for some argument values but not others, or that includes the argument values in any error message, is not leakproof. This affects how the system executes queries against views created with the security_barrier
option or tables with row level security enabled. The system will enforce conditions from security policies and security barrier views before any user-supplied conditions from the query itself that contain non-leakproof functions, in order to prevent the inadvertent exposure of data. Functions and operators marked as leakproof are assumed to be trustworthy, and may be executed before conditions from security policies and security barrier views. In addition, functions which do not take arguments or which are not passed any arguments from the security barrier view or table do not have to be marked as leakproof to be executed before security conditions. See CREATE VIEW. This option can only be set by the superuser.
CALLED ON NULL INPUT
RETURNS NULL ON NULL INPUT
STRICT
CALLED ON NULL INPUT
(the default) indicates that the function will be called normally when some of its arguments are null. It is then the function author's responsibility to check for null values if necessary and respond appropriately.
RETURNS NULL ON NULL INPUT
or STRICT
indicates that the function always returns null whenever any of its arguments are null. If this parameter is specified, the function is not run when there are null arguments; instead a null result is assumed automatically.
[EXTERNAL] SECURITY INVOKER
[EXTERNAL] SECURITY DEFINER
SECURITY INVOKER
(the default) indicates that the function is to be run with the privileges of the user that calls it.
SECURITY DEFINER
specifies that the function is to be run with the privileges of the user that created it.
The key word EXTERNAL
is allowed for SQL conformance, but it is optional since, unlike in SQL, this feature applies to all functions not just external ones.
EXECUTE ON ANY
EXECUTE ON COORDINATOR
EXECUTE ON ALL SEGMENTS
EXECUTE ON INITPLAN
The EXECUTE ON
attributes specify where (coordinator or segment instance) a function runs when it is invoked during the query execution process.
EXECUTE ON ANY
(the default) indicates that the function can be run on the coordinator, or any segment instance, and it returns the same result regardless of where it is run. Apache Cloudberry determines where the function runs.
EXECUTE ON COORDINATOR
indicates that the function must run only on the coordinator instance.
EXECUTE ON ALL SEGMENTS
indicates that the function must run on all primary segment instances, but not the coordinator, for each invocation. The overall result of the function is the UNION ALL
of the results from all segment instances.
EXECUTE ON INITPLAN
indicates that the function contains an SQL command that dispatches queries to the segment instances and requires special processing on the coordinator instance by Apache Cloudberry when possible.
Note
EXECUTE ON INITPLAN
is only supported in functions that are used in theFROM
clause of aCREATE TABLE AS
orINSERT
command such as theget_data()
function in these commands.
CREATE TABLE t AS SELECT * FROM get_data();
INSERT INTO t1 SELECT * FROM get_data();