CREATE_AGGREGATE(7)

1CREATE AGGREGATE(7)      PostgreSQL 10.7 Documentation     CREATE AGGREGATE(7)
2
3
4

NAME

6       CREATE_AGGREGATE - define a new aggregate function
7

SYNOPSIS

9       CREATE AGGREGATE name ( [ argmode ] [ argname ] arg_data_type [ , ... ] ) (
10           SFUNC = sfunc,
11           STYPE = state_data_type
12           [ , SSPACE = state_data_size ]
13           [ , FINALFUNC = ffunc ]
14           [ , FINALFUNC_EXTRA ]
15           [ , COMBINEFUNC = combinefunc ]
16           [ , SERIALFUNC = serialfunc ]
17           [ , DESERIALFUNC = deserialfunc ]
18           [ , INITCOND = initial_condition ]
19           [ , MSFUNC = msfunc ]
20           [ , MINVFUNC = minvfunc ]
21           [ , MSTYPE = mstate_data_type ]
22           [ , MSSPACE = mstate_data_size ]
23           [ , MFINALFUNC = mffunc ]
24           [ , MFINALFUNC_EXTRA ]
25           [ , MINITCOND = minitial_condition ]
26           [ , SORTOP = sort_operator ]
27           [ , PARALLEL = { SAFE | RESTRICTED | UNSAFE } ]
28       )
29
30       CREATE AGGREGATE name ( [ [ argmode ] [ argname ] arg_data_type [ , ... ] ]
31                               ORDER BY [ argmode ] [ argname ] arg_data_type [ , ... ] ) (
32           SFUNC = sfunc,
33           STYPE = state_data_type
34           [ , SSPACE = state_data_size ]
35           [ , FINALFUNC = ffunc ]
36           [ , FINALFUNC_EXTRA ]
37           [ , INITCOND = initial_condition ]
38           [ , PARALLEL = { SAFE | RESTRICTED | UNSAFE } ]
39           [ , HYPOTHETICAL ]
40       )
41
42       or the old syntax
43
44       CREATE AGGREGATE name (
45           BASETYPE = base_type,
46           SFUNC = sfunc,
47           STYPE = state_data_type
48           [ , SSPACE = state_data_size ]
49           [ , FINALFUNC = ffunc ]
50           [ , FINALFUNC_EXTRA ]
51           [ , COMBINEFUNC = combinefunc ]
52           [ , SERIALFUNC = serialfunc ]
53           [ , DESERIALFUNC = deserialfunc ]
54           [ , INITCOND = initial_condition ]
55           [ , MSFUNC = msfunc ]
56           [ , MINVFUNC = minvfunc ]
57           [ , MSTYPE = mstate_data_type ]
58           [ , MSSPACE = mstate_data_size ]
59           [ , MFINALFUNC = mffunc ]
60           [ , MFINALFUNC_EXTRA ]
61           [ , MINITCOND = minitial_condition ]
62           [ , SORTOP = sort_operator ]
63       )
64

DESCRIPTION

66       CREATE AGGREGATE defines a new aggregate function. Some basic and
67       commonly-used aggregate functions are included with the distribution;
68       they are documented in Section 9.20. If one defines new types or needs
69       an aggregate function not already provided, then CREATE AGGREGATE can
70       be used to provide the desired features.
71
72       If a schema name is given (for example, CREATE AGGREGATE myschema.myagg
73       ...) then the aggregate function is created in the specified schema.
74       Otherwise it is created in the current schema.
75
76       An aggregate function is identified by its name and input data type(s).
77       Two aggregates in the same schema can have the same name if they
78       operate on different input types. The name and input data type(s) of an
79       aggregate must also be distinct from the name and input data type(s) of
80       every ordinary function in the same schema. This behavior is identical
81       to overloading of ordinary function names (see CREATE FUNCTION
82       (CREATE_FUNCTION(7))).
83
84       A simple aggregate function is made from one or two ordinary functions:
85       a state transition function sfunc, and an optional final calculation
86       function ffunc. These are used as follows:
87
88           sfunc( internal-state, next-data-values ) ---> next-internal-state
89           ffunc( internal-state ) ---> aggregate-value
90
91       PostgreSQL creates a temporary variable of data type stype to hold the
92       current internal state of the aggregate. At each input row, the
93       aggregate argument value(s) are calculated and the state transition
94       function is invoked with the current state value and the new argument
95       value(s) to calculate a new internal state value. After all the rows
96       have been processed, the final function is invoked once to calculate
97       the aggregate's return value. If there is no final function then the
98       ending state value is returned as-is.
99
100       An aggregate function can provide an initial condition, that is, an
101       initial value for the internal state value. This is specified and
102       stored in the database as a value of type text, but it must be a valid
103       external representation of a constant of the state value data type. If
104       it is not supplied then the state value starts out null.
105
106       If the state transition function is declared “strict”, then it cannot
107       be called with null inputs. With such a transition function, aggregate
108       execution behaves as follows. Rows with any null input values are
109       ignored (the function is not called and the previous state value is
110       retained). If the initial state value is null, then at the first row
111       with all-nonnull input values, the first argument value replaces the
112       state value, and the transition function is invoked at each subsequent
113       row with all-nonnull input values. This is handy for implementing
114       aggregates like max. Note that this behavior is only available when
115       state_data_type is the same as the first arg_data_type. When these
116       types are different, you must supply a nonnull initial condition or use
117       a nonstrict transition function.
118
119       If the state transition function is not strict, then it will be called
120       unconditionally at each input row, and must deal with null inputs and
121       null state values for itself. This allows the aggregate author to have
122       full control over the aggregate's handling of null values.
123
124       If the final function is declared “strict”, then it will not be called
125       when the ending state value is null; instead a null result will be
126       returned automatically. (Of course this is just the normal behavior of
127       strict functions.) In any case the final function has the option of
128       returning a null value. For example, the final function for avg returns
129       null when it sees there were zero input rows.
130
131       Sometimes it is useful to declare the final function as taking not just
132       the state value, but extra parameters corresponding to the aggregate's
133       input values. The main reason for doing this is if the final function
134       is polymorphic and the state value's data type would be inadequate to
135       pin down the result type. These extra parameters are always passed as
136       NULL (and so the final function must not be strict when the
137       FINALFUNC_EXTRA option is used), but nonetheless they are valid
138       parameters. The final function could for example make use of
139       get_fn_expr_argtype to identify the actual argument type in the current
140       call.
141
142       An aggregate can optionally support moving-aggregate mode, as described
143       in Section 37.10.1. This requires specifying the MSFUNC, MINVFUNC, and
144       MSTYPE parameters, and optionally the MSPACE, MFINALFUNC,
145       MFINALFUNC_EXTRA, and MINITCOND parameters. Except for MINVFUNC, these
146       parameters work like the corresponding simple-aggregate parameters
147       without M; they define a separate implementation of the aggregate that
148       includes an inverse transition function.
149
150       The syntax with ORDER BY in the parameter list creates a special type
151       of aggregate called an ordered-set aggregate; or if HYPOTHETICAL is
152       specified, then a hypothetical-set aggregate is created. These
153       aggregates operate over groups of sorted values in order-dependent
154       ways, so that specification of an input sort order is an essential part
155       of a call. Also, they can have direct arguments, which are arguments
156       that are evaluated only once per aggregation rather than once per input
157       row. Hypothetical-set aggregates are a subclass of ordered-set
158       aggregates in which some of the direct arguments are required to match,
159       in number and data types, the aggregated argument columns. This allows
160       the values of those direct arguments to be added to the collection of
161       aggregate-input rows as an additional “hypothetical” row.
162
163       An aggregate can optionally support partial aggregation, as described
164       in Section 37.10.4. This requires specifying the COMBINEFUNC parameter.
165       If the state_data_type is internal, it's usually also appropriate to
166       provide the SERIALFUNC and DESERIALFUNC parameters so that parallel
167       aggregation is possible. Note that the aggregate must also be marked
168       PARALLEL SAFE to enable parallel aggregation.
169
170       Aggregates that behave like MIN or MAX can sometimes be optimized by
171       looking into an index instead of scanning every input row. If this
172       aggregate can be so optimized, indicate it by specifying a sort
173       operator. The basic requirement is that the aggregate must yield the
174       first element in the sort ordering induced by the operator; in other
175       words:
176
177           SELECT agg(col) FROM tab;
178
179       must be equivalent to:
180
181           SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
182
183       Further assumptions are that the aggregate ignores null inputs, and
184       that it delivers a null result if and only if there were no non-null
185       inputs. Ordinarily, a data type's < operator is the proper sort
186       operator for MIN, and > is the proper sort operator for MAX. Note that
187       the optimization will never actually take effect unless the specified
188       operator is the “less than” or “greater than” strategy member of a
189       B-tree index operator class.
190
191       To be able to create an aggregate function, you must have USAGE
192       privilege on the argument types, the state type(s), and the return
193       type, as well as EXECUTE privilege on the supporting functions.
194

PARAMETERS

196       name
197           The name (optionally schema-qualified) of the aggregate function to
198           create.
199
200       argmode
201           The mode of an argument: IN or VARIADIC. (Aggregate functions do
202           not support OUT arguments.) If omitted, the default is IN. Only the
203           last argument can be marked VARIADIC.
204
205       argname
206           The name of an argument. This is currently only useful for
207           documentation purposes. If omitted, the argument has no name.
208
209       arg_data_type
210           An input data type on which this aggregate function operates. To
211           create a zero-argument aggregate function, write * in place of the
212           list of argument specifications. (An example of such an aggregate
213           is count(*).)
214
215       base_type
216           In the old syntax for CREATE AGGREGATE, the input data type is
217           specified by a basetype parameter rather than being written next to
218           the aggregate name. Note that this syntax allows only one input
219           parameter. To define a zero-argument aggregate function with this
220           syntax, specify the basetype as "ANY" (not *). Ordered-set
221           aggregates cannot be defined with the old syntax.
222
223       sfunc
224           The name of the state transition function to be called for each
225           input row. For a normal N-argument aggregate function, the sfunc
226           must take N+1 arguments, the first being of type state_data_type
227           and the rest matching the declared input data type(s) of the
228           aggregate. The function must return a value of type
229           state_data_type. This function takes the current state value and
230           the current input data value(s), and returns the next state value.
231
232           For ordered-set (including hypothetical-set) aggregates, the state
233           transition function receives only the current state value and the
234           aggregated arguments, not the direct arguments. Otherwise it is the
235           same.
236
237       state_data_type
238           The data type for the aggregate's state value.
239
240       state_data_size
241           The approximate average size (in bytes) of the aggregate's state
242           value. If this parameter is omitted or is zero, a default estimate
243           is used based on the state_data_type. The planner uses this value
244           to estimate the memory required for a grouped aggregate query. The
245           planner will consider using hash aggregation for such a query only
246           if the hash table is estimated to fit in work_mem; therefore, large
247           values of this parameter discourage use of hash aggregation.
248
249       ffunc
250           The name of the final function called to compute the aggregate's
251           result after all input rows have been traversed. For a normal
252           aggregate, this function must take a single argument of type
253           state_data_type. The return data type of the aggregate is defined
254           as the return type of this function. If ffunc is not specified,
255           then the ending state value is used as the aggregate's result, and
256           the return type is state_data_type.
257
258           For ordered-set (including hypothetical-set) aggregates, the final
259           function receives not only the final state value, but also the
260           values of all the direct arguments.
261
262           If FINALFUNC_EXTRA is specified, then in addition to the final
263           state value and any direct arguments, the final function receives
264           extra NULL values corresponding to the aggregate's regular
265           (aggregated) arguments. This is mainly useful to allow correct
266           resolution of the aggregate result type when a polymorphic
267           aggregate is being defined.
268
269       combinefunc
270           The combinefunc function may optionally be specified to allow the
271           aggregate function to support partial aggregation. If provided, the
272           combinefunc must combine two state_data_type values, each
273           containing the result of aggregation over some subset of the input
274           values, to produce a new state_data_type that represents the result
275           of aggregating over both sets of inputs. This function can be
276           thought of as an sfunc, where instead of acting upon an individual
277           input row and adding it to the running aggregate state, it adds
278           another aggregate state to the running state.
279
280           The combinefunc must be declared as taking two arguments of the
281           state_data_type and returning a value of the state_data_type.
282           Optionally this function may be “strict”. In this case the function
283           will not be called when either of the input states are null; the
284           other state will be taken as the correct result.
285
286           For aggregate functions whose state_data_type is internal, the
287           combinefunc must not be strict. In this case the combinefunc must
288           ensure that null states are handled correctly and that the state
289           being returned is properly stored in the aggregate memory context.
290
291       serialfunc
292           An aggregate function whose state_data_type is internal can
293           participate in parallel aggregation only if it has a serialfunc
294           function, which must serialize the aggregate state into a bytea
295           value for transmission to another process. This function must take
296           a single argument of type internal and return type bytea. A
297           corresponding deserialfunc is also required.
298
299       deserialfunc
300           Deserialize a previously serialized aggregate state back into
301           state_data_type. This function must take two arguments of types
302           bytea and internal, and produce a result of type internal. (Note:
303           the second, internal argument is unused, but is required for type
304           safety reasons.)
305
306       initial_condition
307           The initial setting for the state value. This must be a string
308           constant in the form accepted for the data type state_data_type. If
309           not specified, the state value starts out null.
310
311       msfunc
312           The name of the forward state transition function to be called for
313           each input row in moving-aggregate mode. This is exactly like the
314           regular transition function, except that its first argument and
315           result are of type mstate_data_type, which might be different from
316           state_data_type.
317
318       minvfunc
319           The name of the inverse state transition function to be used in
320           moving-aggregate mode. This function has the same argument and
321           result types as msfunc, but it is used to remove a value from the
322           current aggregate state, rather than add a value to it. The inverse
323           transition function must have the same strictness attribute as the
324           forward state transition function.
325
326       mstate_data_type
327           The data type for the aggregate's state value, when using
328           moving-aggregate mode.
329
330       mstate_data_size
331           The approximate average size (in bytes) of the aggregate's state
332           value, when using moving-aggregate mode. This works the same as
333           state_data_size.
334
335       mffunc
336           The name of the final function called to compute the aggregate's
337           result after all input rows have been traversed, when using
338           moving-aggregate mode. This works the same as ffunc, except that
339           its first argument's type is mstate_data_type and extra dummy
340           arguments are specified by writing MFINALFUNC_EXTRA. The aggregate
341           result type determined by mffunc or mstate_data_type must match
342           that determined by the aggregate's regular implementation.
343
344       minitial_condition
345           The initial setting for the state value, when using
346           moving-aggregate mode. This works the same as initial_condition.
347
348       sort_operator
349           The associated sort operator for a MIN- or MAX-like aggregate. This
350           is just an operator name (possibly schema-qualified). The operator
351           is assumed to have the same input data types as the aggregate
352           (which must be a single-argument normal aggregate).
353
354       PARALLEL
355           The meanings of PARALLEL SAFE, PARALLEL RESTRICTED, and PARALLEL
356           UNSAFE are the same as for CREATE FUNCTION (CREATE_FUNCTION(7)). An
357           aggregate will not be considered for parallelization if it is
358           marked PARALLEL UNSAFE (which is the default!) or PARALLEL
359           RESTRICTED. Note that the parallel-safety markings of the
360           aggregate's support functions are not consulted by the planner,
361           only the marking of the aggregate itself.
362
363       HYPOTHETICAL
364           For ordered-set aggregates only, this flag specifies that the
365           aggregate arguments are to be processed according to the
366           requirements for hypothetical-set aggregates: that is, the last few
367           direct arguments must match the data types of the aggregated
368           (WITHIN GROUP) arguments. The HYPOTHETICAL flag has no effect on
369           run-time behavior, only on parse-time resolution of the data types
370           and collations of the aggregate's arguments.
371
372       The parameters of CREATE AGGREGATE can be written in any order, not
373       just the order illustrated above.
374

NOTES

376       In parameters that specify support function names, you can write a
377       schema name if needed, for example SFUNC = public.sum. Do not write
378       argument types there, however — the argument types of the support
379       functions are determined from other parameters.
380
381       If an aggregate supports moving-aggregate mode, it will improve
382       calculation efficiency when the aggregate is used as a window function
383       for a window with moving frame start (that is, a frame start mode other
384       than UNBOUNDED PRECEDING). Conceptually, the forward transition
385       function adds input values to the aggregate's state when they enter the
386       window frame from the bottom, and the inverse transition function
387       removes them again when they leave the frame at the top. So, when
388       values are removed, they are always removed in the same order they were
389       added. Whenever the inverse transition function is invoked, it will
390       thus receive the earliest added but not yet removed argument value(s).
391       The inverse transition function can assume that at least one row will
392       remain in the current state after it removes the oldest row. (When this
393       would not be the case, the window function mechanism simply starts a
394       fresh aggregation, rather than using the inverse transition function.)
395
396       The forward transition function for moving-aggregate mode is not
397       allowed to return NULL as the new state value. If the inverse
398       transition function returns NULL, this is taken as an indication that
399       the inverse function cannot reverse the state calculation for this
400       particular input, and so the aggregate calculation will be redone from
401       scratch for the current frame starting position. This convention allows
402       moving-aggregate mode to be used in situations where there are some
403       infrequent cases that are impractical to reverse out of the running
404       state value.
405
406       If no moving-aggregate implementation is supplied, the aggregate can
407       still be used with moving frames, but PostgreSQL will recompute the
408       whole aggregation whenever the start of the frame moves. Note that
409       whether or not the aggregate supports moving-aggregate mode, PostgreSQL
410       can handle a moving frame end without recalculation; this is done by
411       continuing to add new values to the aggregate's state. It is assumed
412       that the final function does not damage the aggregate's state value, so
413       that the aggregation can be continued even after an aggregate result
414       value has been obtained for one set of frame boundaries.
415
416       The syntax for ordered-set aggregates allows VARIADIC to be specified
417       for both the last direct parameter and the last aggregated (WITHIN
418       GROUP) parameter. However, the current implementation restricts use of
419       VARIADIC in two ways. First, ordered-set aggregates can only use
420       VARIADIC "any", not other variadic array types. Second, if the last
421       direct parameter is VARIADIC "any", then there can be only one
422       aggregated parameter and it must also be VARIADIC "any". (In the
423       representation used in the system catalogs, these two parameters are
424       merged into a single VARIADIC "any" item, since pg_proc cannot
425       represent functions with more than one VARIADIC parameter.) If the
426       aggregate is a hypothetical-set aggregate, the direct arguments that
427       match the VARIADIC "any" parameter are the hypothetical ones; any
428       preceding parameters represent additional direct arguments that are not
429       constrained to match the aggregated arguments.
430
431       Currently, ordered-set aggregates do not need to support
432       moving-aggregate mode, since they cannot be used as window functions.
433
434       Partial (including parallel) aggregation is currently not supported for
435       ordered-set aggregates. Also, it will never be used for aggregate calls
436       that include DISTINCT or ORDER BY clauses, since those semantics cannot
437       be supported during partial aggregation.
438

EXAMPLES

440       See Section 37.10.
441

COMPATIBILITY

443       CREATE AGGREGATE is a PostgreSQL language extension. The SQL standard
444       does not provide for user-defined aggregate functions.
445