1CREATE AGGREGATE(7) PostgreSQL 10.7 Documentation CREATE AGGREGATE(7)
2
3
4
6 CREATE_AGGREGATE - define a new aggregate function
7
9 CREATE AGGREGATE name ( [ argmode ] [ argname ] arg_data_type [ , ... ] ) (
10 SFUNC = sfunc,
11 STYPE = state_data_type
12 [ , SSPACE = state_data_size ]
13 [ , FINALFUNC = ffunc ]
14 [ , FINALFUNC_EXTRA ]
15 [ , COMBINEFUNC = combinefunc ]
16 [ , SERIALFUNC = serialfunc ]
17 [ , DESERIALFUNC = deserialfunc ]
18 [ , INITCOND = initial_condition ]
19 [ , MSFUNC = msfunc ]
20 [ , MINVFUNC = minvfunc ]
21 [ , MSTYPE = mstate_data_type ]
22 [ , MSSPACE = mstate_data_size ]
23 [ , MFINALFUNC = mffunc ]
24 [ , MFINALFUNC_EXTRA ]
25 [ , MINITCOND = minitial_condition ]
26 [ , SORTOP = sort_operator ]
27 [ , PARALLEL = { SAFE | RESTRICTED | UNSAFE } ]
28 )
29
30 CREATE AGGREGATE name ( [ [ argmode ] [ argname ] arg_data_type [ , ... ] ]
31 ORDER BY [ argmode ] [ argname ] arg_data_type [ , ... ] ) (
32 SFUNC = sfunc,
33 STYPE = state_data_type
34 [ , SSPACE = state_data_size ]
35 [ , FINALFUNC = ffunc ]
36 [ , FINALFUNC_EXTRA ]
37 [ , INITCOND = initial_condition ]
38 [ , PARALLEL = { SAFE | RESTRICTED | UNSAFE } ]
39 [ , HYPOTHETICAL ]
40 )
41
42 or the old syntax
43
44 CREATE AGGREGATE name (
45 BASETYPE = base_type,
46 SFUNC = sfunc,
47 STYPE = state_data_type
48 [ , SSPACE = state_data_size ]
49 [ , FINALFUNC = ffunc ]
50 [ , FINALFUNC_EXTRA ]
51 [ , COMBINEFUNC = combinefunc ]
52 [ , SERIALFUNC = serialfunc ]
53 [ , DESERIALFUNC = deserialfunc ]
54 [ , INITCOND = initial_condition ]
55 [ , MSFUNC = msfunc ]
56 [ , MINVFUNC = minvfunc ]
57 [ , MSTYPE = mstate_data_type ]
58 [ , MSSPACE = mstate_data_size ]
59 [ , MFINALFUNC = mffunc ]
60 [ , MFINALFUNC_EXTRA ]
61 [ , MINITCOND = minitial_condition ]
62 [ , SORTOP = sort_operator ]
63 )
64
66 CREATE AGGREGATE defines a new aggregate function. Some basic and
67 commonly-used aggregate functions are included with the distribution;
68 they are documented in Section 9.20. If one defines new types or needs
69 an aggregate function not already provided, then CREATE AGGREGATE can
70 be used to provide the desired features.
71
72 If a schema name is given (for example, CREATE AGGREGATE myschema.myagg
73 ...) then the aggregate function is created in the specified schema.
74 Otherwise it is created in the current schema.
75
76 An aggregate function is identified by its name and input data type(s).
77 Two aggregates in the same schema can have the same name if they
78 operate on different input types. The name and input data type(s) of an
79 aggregate must also be distinct from the name and input data type(s) of
80 every ordinary function in the same schema. This behavior is identical
81 to overloading of ordinary function names (see CREATE FUNCTION
82 (CREATE_FUNCTION(7))).
83
84 A simple aggregate function is made from one or two ordinary functions:
85 a state transition function sfunc, and an optional final calculation
86 function ffunc. These are used as follows:
87
88 sfunc( internal-state, next-data-values ) ---> next-internal-state
89 ffunc( internal-state ) ---> aggregate-value
90
91 PostgreSQL creates a temporary variable of data type stype to hold the
92 current internal state of the aggregate. At each input row, the
93 aggregate argument value(s) are calculated and the state transition
94 function is invoked with the current state value and the new argument
95 value(s) to calculate a new internal state value. After all the rows
96 have been processed, the final function is invoked once to calculate
97 the aggregate's return value. If there is no final function then the
98 ending state value is returned as-is.
99
100 An aggregate function can provide an initial condition, that is, an
101 initial value for the internal state value. This is specified and
102 stored in the database as a value of type text, but it must be a valid
103 external representation of a constant of the state value data type. If
104 it is not supplied then the state value starts out null.
105
106 If the state transition function is declared “strict”, then it cannot
107 be called with null inputs. With such a transition function, aggregate
108 execution behaves as follows. Rows with any null input values are
109 ignored (the function is not called and the previous state value is
110 retained). If the initial state value is null, then at the first row
111 with all-nonnull input values, the first argument value replaces the
112 state value, and the transition function is invoked at each subsequent
113 row with all-nonnull input values. This is handy for implementing
114 aggregates like max. Note that this behavior is only available when
115 state_data_type is the same as the first arg_data_type. When these
116 types are different, you must supply a nonnull initial condition or use
117 a nonstrict transition function.
118
119 If the state transition function is not strict, then it will be called
120 unconditionally at each input row, and must deal with null inputs and
121 null state values for itself. This allows the aggregate author to have
122 full control over the aggregate's handling of null values.
123
124 If the final function is declared “strict”, then it will not be called
125 when the ending state value is null; instead a null result will be
126 returned automatically. (Of course this is just the normal behavior of
127 strict functions.) In any case the final function has the option of
128 returning a null value. For example, the final function for avg returns
129 null when it sees there were zero input rows.
130
131 Sometimes it is useful to declare the final function as taking not just
132 the state value, but extra parameters corresponding to the aggregate's
133 input values. The main reason for doing this is if the final function
134 is polymorphic and the state value's data type would be inadequate to
135 pin down the result type. These extra parameters are always passed as
136 NULL (and so the final function must not be strict when the
137 FINALFUNC_EXTRA option is used), but nonetheless they are valid
138 parameters. The final function could for example make use of
139 get_fn_expr_argtype to identify the actual argument type in the current
140 call.
141
142 An aggregate can optionally support moving-aggregate mode, as described
143 in Section 37.10.1. This requires specifying the MSFUNC, MINVFUNC, and
144 MSTYPE parameters, and optionally the MSPACE, MFINALFUNC,
145 MFINALFUNC_EXTRA, and MINITCOND parameters. Except for MINVFUNC, these
146 parameters work like the corresponding simple-aggregate parameters
147 without M; they define a separate implementation of the aggregate that
148 includes an inverse transition function.
149
150 The syntax with ORDER BY in the parameter list creates a special type
151 of aggregate called an ordered-set aggregate; or if HYPOTHETICAL is
152 specified, then a hypothetical-set aggregate is created. These
153 aggregates operate over groups of sorted values in order-dependent
154 ways, so that specification of an input sort order is an essential part
155 of a call. Also, they can have direct arguments, which are arguments
156 that are evaluated only once per aggregation rather than once per input
157 row. Hypothetical-set aggregates are a subclass of ordered-set
158 aggregates in which some of the direct arguments are required to match,
159 in number and data types, the aggregated argument columns. This allows
160 the values of those direct arguments to be added to the collection of
161 aggregate-input rows as an additional “hypothetical” row.
162
163 An aggregate can optionally support partial aggregation, as described
164 in Section 37.10.4. This requires specifying the COMBINEFUNC parameter.
165 If the state_data_type is internal, it's usually also appropriate to
166 provide the SERIALFUNC and DESERIALFUNC parameters so that parallel
167 aggregation is possible. Note that the aggregate must also be marked
168 PARALLEL SAFE to enable parallel aggregation.
169
170 Aggregates that behave like MIN or MAX can sometimes be optimized by
171 looking into an index instead of scanning every input row. If this
172 aggregate can be so optimized, indicate it by specifying a sort
173 operator. The basic requirement is that the aggregate must yield the
174 first element in the sort ordering induced by the operator; in other
175 words:
176
177 SELECT agg(col) FROM tab;
178
179 must be equivalent to:
180
181 SELECT col FROM tab ORDER BY col USING sortop LIMIT 1;
182
183 Further assumptions are that the aggregate ignores null inputs, and
184 that it delivers a null result if and only if there were no non-null
185 inputs. Ordinarily, a data type's < operator is the proper sort
186 operator for MIN, and > is the proper sort operator for MAX. Note that
187 the optimization will never actually take effect unless the specified
188 operator is the “less than” or “greater than” strategy member of a
189 B-tree index operator class.
190
191 To be able to create an aggregate function, you must have USAGE
192 privilege on the argument types, the state type(s), and the return
193 type, as well as EXECUTE privilege on the supporting functions.
194
196 name
197 The name (optionally schema-qualified) of the aggregate function to
198 create.
199
200 argmode
201 The mode of an argument: IN or VARIADIC. (Aggregate functions do
202 not support OUT arguments.) If omitted, the default is IN. Only the
203 last argument can be marked VARIADIC.
204
205 argname
206 The name of an argument. This is currently only useful for
207 documentation purposes. If omitted, the argument has no name.
208
209 arg_data_type
210 An input data type on which this aggregate function operates. To
211 create a zero-argument aggregate function, write * in place of the
212 list of argument specifications. (An example of such an aggregate
213 is count(*).)
214
215 base_type
216 In the old syntax for CREATE AGGREGATE, the input data type is
217 specified by a basetype parameter rather than being written next to
218 the aggregate name. Note that this syntax allows only one input
219 parameter. To define a zero-argument aggregate function with this
220 syntax, specify the basetype as "ANY" (not *). Ordered-set
221 aggregates cannot be defined with the old syntax.
222
223 sfunc
224 The name of the state transition function to be called for each
225 input row. For a normal N-argument aggregate function, the sfunc
226 must take N+1 arguments, the first being of type state_data_type
227 and the rest matching the declared input data type(s) of the
228 aggregate. The function must return a value of type
229 state_data_type. This function takes the current state value and
230 the current input data value(s), and returns the next state value.
231
232 For ordered-set (including hypothetical-set) aggregates, the state
233 transition function receives only the current state value and the
234 aggregated arguments, not the direct arguments. Otherwise it is the
235 same.
236
237 state_data_type
238 The data type for the aggregate's state value.
239
240 state_data_size
241 The approximate average size (in bytes) of the aggregate's state
242 value. If this parameter is omitted or is zero, a default estimate
243 is used based on the state_data_type. The planner uses this value
244 to estimate the memory required for a grouped aggregate query. The
245 planner will consider using hash aggregation for such a query only
246 if the hash table is estimated to fit in work_mem; therefore, large
247 values of this parameter discourage use of hash aggregation.
248
249 ffunc
250 The name of the final function called to compute the aggregate's
251 result after all input rows have been traversed. For a normal
252 aggregate, this function must take a single argument of type
253 state_data_type. The return data type of the aggregate is defined
254 as the return type of this function. If ffunc is not specified,
255 then the ending state value is used as the aggregate's result, and
256 the return type is state_data_type.
257
258 For ordered-set (including hypothetical-set) aggregates, the final
259 function receives not only the final state value, but also the
260 values of all the direct arguments.
261
262 If FINALFUNC_EXTRA is specified, then in addition to the final
263 state value and any direct arguments, the final function receives
264 extra NULL values corresponding to the aggregate's regular
265 (aggregated) arguments. This is mainly useful to allow correct
266 resolution of the aggregate result type when a polymorphic
267 aggregate is being defined.
268
269 combinefunc
270 The combinefunc function may optionally be specified to allow the
271 aggregate function to support partial aggregation. If provided, the
272 combinefunc must combine two state_data_type values, each
273 containing the result of aggregation over some subset of the input
274 values, to produce a new state_data_type that represents the result
275 of aggregating over both sets of inputs. This function can be
276 thought of as an sfunc, where instead of acting upon an individual
277 input row and adding it to the running aggregate state, it adds
278 another aggregate state to the running state.
279
280 The combinefunc must be declared as taking two arguments of the
281 state_data_type and returning a value of the state_data_type.
282 Optionally this function may be “strict”. In this case the function
283 will not be called when either of the input states are null; the
284 other state will be taken as the correct result.
285
286 For aggregate functions whose state_data_type is internal, the
287 combinefunc must not be strict. In this case the combinefunc must
288 ensure that null states are handled correctly and that the state
289 being returned is properly stored in the aggregate memory context.
290
291 serialfunc
292 An aggregate function whose state_data_type is internal can
293 participate in parallel aggregation only if it has a serialfunc
294 function, which must serialize the aggregate state into a bytea
295 value for transmission to another process. This function must take
296 a single argument of type internal and return type bytea. A
297 corresponding deserialfunc is also required.
298
299 deserialfunc
300 Deserialize a previously serialized aggregate state back into
301 state_data_type. This function must take two arguments of types
302 bytea and internal, and produce a result of type internal. (Note:
303 the second, internal argument is unused, but is required for type
304 safety reasons.)
305
306 initial_condition
307 The initial setting for the state value. This must be a string
308 constant in the form accepted for the data type state_data_type. If
309 not specified, the state value starts out null.
310
311 msfunc
312 The name of the forward state transition function to be called for
313 each input row in moving-aggregate mode. This is exactly like the
314 regular transition function, except that its first argument and
315 result are of type mstate_data_type, which might be different from
316 state_data_type.
317
318 minvfunc
319 The name of the inverse state transition function to be used in
320 moving-aggregate mode. This function has the same argument and
321 result types as msfunc, but it is used to remove a value from the
322 current aggregate state, rather than add a value to it. The inverse
323 transition function must have the same strictness attribute as the
324 forward state transition function.
325
326 mstate_data_type
327 The data type for the aggregate's state value, when using
328 moving-aggregate mode.
329
330 mstate_data_size
331 The approximate average size (in bytes) of the aggregate's state
332 value, when using moving-aggregate mode. This works the same as
333 state_data_size.
334
335 mffunc
336 The name of the final function called to compute the aggregate's
337 result after all input rows have been traversed, when using
338 moving-aggregate mode. This works the same as ffunc, except that
339 its first argument's type is mstate_data_type and extra dummy
340 arguments are specified by writing MFINALFUNC_EXTRA. The aggregate
341 result type determined by mffunc or mstate_data_type must match
342 that determined by the aggregate's regular implementation.
343
344 minitial_condition
345 The initial setting for the state value, when using
346 moving-aggregate mode. This works the same as initial_condition.
347
348 sort_operator
349 The associated sort operator for a MIN- or MAX-like aggregate. This
350 is just an operator name (possibly schema-qualified). The operator
351 is assumed to have the same input data types as the aggregate
352 (which must be a single-argument normal aggregate).
353
354 PARALLEL
355 The meanings of PARALLEL SAFE, PARALLEL RESTRICTED, and PARALLEL
356 UNSAFE are the same as for CREATE FUNCTION (CREATE_FUNCTION(7)). An
357 aggregate will not be considered for parallelization if it is
358 marked PARALLEL UNSAFE (which is the default!) or PARALLEL
359 RESTRICTED. Note that the parallel-safety markings of the
360 aggregate's support functions are not consulted by the planner,
361 only the marking of the aggregate itself.
362
363 HYPOTHETICAL
364 For ordered-set aggregates only, this flag specifies that the
365 aggregate arguments are to be processed according to the
366 requirements for hypothetical-set aggregates: that is, the last few
367 direct arguments must match the data types of the aggregated
368 (WITHIN GROUP) arguments. The HYPOTHETICAL flag has no effect on
369 run-time behavior, only on parse-time resolution of the data types
370 and collations of the aggregate's arguments.
371
372 The parameters of CREATE AGGREGATE can be written in any order, not
373 just the order illustrated above.
374
376 In parameters that specify support function names, you can write a
377 schema name if needed, for example SFUNC = public.sum. Do not write
378 argument types there, however — the argument types of the support
379 functions are determined from other parameters.
380
381 If an aggregate supports moving-aggregate mode, it will improve
382 calculation efficiency when the aggregate is used as a window function
383 for a window with moving frame start (that is, a frame start mode other
384 than UNBOUNDED PRECEDING). Conceptually, the forward transition
385 function adds input values to the aggregate's state when they enter the
386 window frame from the bottom, and the inverse transition function
387 removes them again when they leave the frame at the top. So, when
388 values are removed, they are always removed in the same order they were
389 added. Whenever the inverse transition function is invoked, it will
390 thus receive the earliest added but not yet removed argument value(s).
391 The inverse transition function can assume that at least one row will
392 remain in the current state after it removes the oldest row. (When this
393 would not be the case, the window function mechanism simply starts a
394 fresh aggregation, rather than using the inverse transition function.)
395
396 The forward transition function for moving-aggregate mode is not
397 allowed to return NULL as the new state value. If the inverse
398 transition function returns NULL, this is taken as an indication that
399 the inverse function cannot reverse the state calculation for this
400 particular input, and so the aggregate calculation will be redone from
401 scratch for the current frame starting position. This convention allows
402 moving-aggregate mode to be used in situations where there are some
403 infrequent cases that are impractical to reverse out of the running
404 state value.
405
406 If no moving-aggregate implementation is supplied, the aggregate can
407 still be used with moving frames, but PostgreSQL will recompute the
408 whole aggregation whenever the start of the frame moves. Note that
409 whether or not the aggregate supports moving-aggregate mode, PostgreSQL
410 can handle a moving frame end without recalculation; this is done by
411 continuing to add new values to the aggregate's state. It is assumed
412 that the final function does not damage the aggregate's state value, so
413 that the aggregation can be continued even after an aggregate result
414 value has been obtained for one set of frame boundaries.
415
416 The syntax for ordered-set aggregates allows VARIADIC to be specified
417 for both the last direct parameter and the last aggregated (WITHIN
418 GROUP) parameter. However, the current implementation restricts use of
419 VARIADIC in two ways. First, ordered-set aggregates can only use
420 VARIADIC "any", not other variadic array types. Second, if the last
421 direct parameter is VARIADIC "any", then there can be only one
422 aggregated parameter and it must also be VARIADIC "any". (In the
423 representation used in the system catalogs, these two parameters are
424 merged into a single VARIADIC "any" item, since pg_proc cannot
425 represent functions with more than one VARIADIC parameter.) If the
426 aggregate is a hypothetical-set aggregate, the direct arguments that
427 match the VARIADIC "any" parameter are the hypothetical ones; any
428 preceding parameters represent additional direct arguments that are not
429 constrained to match the aggregated arguments.
430
431 Currently, ordered-set aggregates do not need to support
432 moving-aggregate mode, since they cannot be used as window functions.
433
434 Partial (including parallel) aggregation is currently not supported for
435 ordered-set aggregates. Also, it will never be used for aggregate calls
436 that include DISTINCT or ORDER BY clauses, since those semantics cannot
437 be supported during partial aggregation.
438
440 See Section 37.10.
441
443 CREATE AGGREGATE is a PostgreSQL language extension. The SQL standard
444 does not provide for user-defined aggregate functions.
445
447 ALTER AGGREGATE (ALTER_AGGREGATE(7)), DROP AGGREGATE
448 (DROP_AGGREGATE(7))
449
450
451
452PostgreSQL 10.7 2019 CREATE AGGREGATE(7)