1CHECK_POSTGRES(1) User Contributed Perl Documentation CHECK_POSTGRES(1)
2
3
4
6 check_postgres.pl - a Postgres monitoring script for Nagios, MRTG,
7 Cacti, and others
8
9 This documents describes check_postgres.pl version 2.22.0
10
12 ## Create all symlinks
13 check_postgres.pl --symlinks
14
15 ## Check connection to Postgres database 'pluto':
16 check_postgres.pl --action=connection --db=pluto
17
18 ## Same things, but using the symlink
19 check_postgres_connection --db=pluto
20
21 ## Warn if > 100 locks, critical if > 200, or > 20 exclusive
22 check_postgres_locks --warning=100 --critical="total=200:exclusive=20"
23
24 ## Show the current number of idle connections on port 6543:
25 check_postgres_txn_idle --port=6543 --output=simple
26
27 ## There are many other actions and options, please keep reading.
28
29 The latest news and documentation can always be found at:
30 http://bucardo.org/check_postgres/
31
33 check_postgres.pl is a Perl script that runs many different tests
34 against one or more Postgres databases. It uses the psql program to
35 gather the information, and outputs the results in one of three
36 formats: Nagios, MRTG, or simple.
37
38 Output Modes
39 The output can be changed by use of the "--output" option. The default
40 output is nagios, although this can be changed at the top of the script
41 if you wish. The current option choices are nagios, mrtg, and simple.
42 To avoid having to enter the output argument each time, the type of
43 output is automatically set if no --output argument is given, and if
44 the current directory has one of the output options in its name. For
45 example, creating a directory named mrtg and populating it with
46 symlinks via the --symlinks argument would ensure that any actions run
47 from that directory will always default to an output of "mrtg" As a
48 shortcut for --output=simple, you can enter --simple, which also
49 overrides the directory naming trick.
50
51 Nagios output
52
53 The default output format is for Nagios, which is a single line of
54 information, along with four specific exit codes:
55
56 0 (OK)
57 1 (WARNING)
58 2 (CRITICAL)
59 3 (UNKNOWN)
60
61 The output line is one of the words above, a colon, and then a short
62 description of what was measured. Additional statistics information, as
63 well as the total time the command took, can be output as well: see the
64 documentation on the arguments --showperf, --perflimit, and --showtime.
65
66 MRTG output
67
68 The MRTG output is four lines, with the first line always giving a
69 single number of importance. When possible, this number represents an
70 actual value such as a number of bytes, but it may also be a 1 or a 0
71 for actions that only return "true" or "false", such as
72 check_postgres_version. The second line is an additional stat and is
73 only used for some actions. The third line indicates an "uptime" and is
74 not used. The fourth line is a description and usually indicates the
75 name of the database the stat from the first line was pulled from, but
76 may be different depending on the action.
77
78 Some actions accept an optional --mrtg argument to further control the
79 output.
80
81 See the documentation on each action for details on the exact MRTG
82 output for each one.
83
84 Simple output
85
86 The simple output is simply a truncated version of the MRTG one, and
87 simply returns the first number and nothing else. This is very useful
88 when you just want to check the state of something, regardless of any
89 threshold. You can transform the numeric output by appending KB, MB,
90 GB, TB, or EB to the output argument, for example:
91
92 --output=simple,MB
93
94 Cacti output
95
96 The Cacti output consists of one or more items on the same line, with a
97 simple name, a colon, and then a number. At the moment, the only action
98 with explicit Cacti output is 'dbstats', and using the --output option
99 is not needed in this case, as Cacti is the only output for this
100 action. For many other actions, using --simple is enough to make Cacti
101 happy.
102
104 All actions accept a common set of database options.
105
106 -H NAME or --host=NAME
107 Connect to the host indicated by NAME. Can be a comma-separated
108 list of names. Multiple host arguments are allowed. If no host is
109 given, defaults to the "PGHOST" environment variable or no host at
110 all (which indicates using a local Unix socket). You may also use
111 "--dbhost".
112
113 -p PORT or --port=PORT
114 Connects using the specified PORT number. Can be a comma-separated
115 list of port numbers, and multiple port arguments are allowed. If
116 no port number is given, defaults to the "PGPORT" environment
117 variable. If that is not set, it defaults to 5432. You may also use
118 "--dbport"
119
120 -db NAME or --dbname=NAME
121 Specifies which database to connect to. Can be a comma-separated
122 list of names, and multiple dbname arguments are allowed. If no
123 dbname option is provided, defaults to the "PGDATABASE" environment
124 variable. If that is not set, it defaults to 'postgres' if psql is
125 version 8 or greater, and 'template1' otherwise.
126
127 -u USERNAME or --dbuser=USERNAME
128 The name of the database user to connect as. Can be a comma-
129 separated list of usernames, and multiple dbuser arguments are
130 allowed. If this is not provided, it defaults to the "PGUSER"
131 environment variable, otherwise it defaults to 'postgres'.
132
133 --dbpass=PASSWORD
134 Provides the password to connect to the database with. Use of this
135 option is highly discouraged. Instead, one should use a .pgpass or
136 pg_service.conf file.
137
138 --dbservice=NAME
139 The name of a service inside of the pg_service.conf file. Before
140 version 9.0 of Postgres, this is a global file, usually found in
141 /etc/pg_service.conf. If you are using version 9.0 or higher of
142 Postgres, you can use the file ".pg_service.conf" in the home
143 directory of the user running the script, e.g. nagios.
144
145 This file contains a simple list of connection options. You can
146 also pass additional information when using this option such as
147 --dbservice="maindatabase sslmode=require"
148
149 The documentation for this file can be found at
150 http://www.postgresql.org/docs/current/static/libpq-pgservice.html
151
152 The database connection options can be grouped: --host=a,b --host=c
153 --port=1234 --port=3344 would connect to a-1234, b-1234, and c-3344.
154 Note that once set, an option carries over until it is changed again.
155
156 Examples:
157
158 --host=a,b --port=5433 --db=c
159 Connects twice to port 5433, using database c, to hosts a and b: a-5433-c b-5433-c
160
161 --host=a,b --port=5433 --db=c,d
162 Connects four times: a-5433-c a-5433-d b-5433-c b-5433-d
163
164 --host=a,b --host=foo --port=1234 --port=5433 --db=e,f
165 Connects six times: a-1234-e a-1234-f b-1234-e b-1234-f foo-5433-e foo-5433-f
166
167 --host=a,b --host=x --port=5432,5433 --dbuser=alice --dbuser=bob -db=baz
168 Connects three times: a-5432-alice-baz b-5433-alice-baz x-5433-bob-baz
169
170 --dbservice="foo" --port=5433
171 Connects using the named service 'foo' in the pg_service.conf file, but overrides the port
172
174 Other options include:
175
176 --action=NAME
177 States what action we are running. Required unless using a
178 symlinked file, in which case the name of the file is used to
179 figure out the action.
180
181 --warning=VAL or -w VAL
182 Sets the threshold at which a warning alert is fired. The valid
183 options for this option depends on the action used.
184
185 --critical=VAL or -c VAL
186 Sets the threshold at which a critical alert is fired. The valid
187 options for this option depends on the action used.
188
189 -t VAL or --timeout=VAL
190 Sets the timeout in seconds after which the script will abort
191 whatever it is doing and return an UNKNOWN status. The timeout is
192 per Postgres cluster, not for the entire script. The default value
193 is 10; the units are always in seconds.
194
195 --assume-standby-mode
196 If specified, first the check if server in standby mode will be
197 performed (--datadir is required), if so, all checks that require
198 SQL queries will be ignored and "Server in standby mode" with OK
199 status will be returned instead.
200
201 Example:
202
203 postgres@db$./check_postgres.pl --action=version --warning=8.1 --datadir /var/lib/postgresql/8.3/main/ --assume-standby-mode
204 POSTGRES_VERSION OK: Server in standby mode | time=0.00
205
206 --assume-prod
207 If specified, check if server in production mode is performed
208 (--datadir is required). The option is only relevant for
209 ("symlink: check_postgres_checkpoint").
210
211 Example:
212
213 postgres@db$./check_postgres.pl --action=checkpoint --datadir /var/lib/postgresql/8.3/main/ --assume-prod
214 POSTGRES_CHECKPOINT OK: Last checkpoint was 72 seconds ago | age=72;;300 mode=MASTER
215
216 -h or --help
217 Displays a help screen with a summary of all actions and options.
218
219 --man
220 Displays the entire manual.
221
222 -V or --version
223 Shows the current version.
224
225 -v or --verbose
226 Set the verbosity level. Can call more than once to boost the
227 level. Setting it to three or higher (in other words, issuing "-v
228 -v -v") turns on debugging information for this program which is
229 sent to stderr.
230
231 --showperf=VAL
232 Determines if we output additional performance data in standard
233 Nagios format (at end of string, after a pipe symbol, using
234 name=value). VAL should be 0 or 1. The default is 1. Only takes
235 effect if using Nagios output mode.
236
237 --perflimit=i
238 Sets a limit as to how many items of interest are reported back
239 when using the showperf option. This only has an effect for actions
240 that return a large number of items, such as table_size. The
241 default is 0, or no limit. Be careful when using this with the
242 --include or --exclude options, as those restrictions are done
243 after the query has been run, and thus your limit may not include
244 the items you want. Only takes effect if using Nagios output mode.
245
246 --showtime=VAL
247 Determines if the time taken to run each query is shown in the
248 output. VAL should be 0 or 1. The default is 1. No effect unless
249 showperf is on. Only takes effect if using Nagios output mode.
250
251 --test
252 Enables test mode. See the "TEST MODE" section below.
253
254 --PGBINDIR=PATH
255 Tells the script where to find the psql binaries. Useful if you
256 have more than one version of the PostgreSQL executables on your
257 system, or if there are not in your path. Note that this option is
258 in all uppercase. By default, this option is not allowed. To enable
259 it, you must change the $NO_PSQL_OPTION near the top of the script
260 to 0. Avoid using this option if you can, and instead use
261 environment variable c<PGBINDIR> or hard-coded $PGBINDIR variable,
262 also near the top of the script, to set the path to the PostgreSQL
263 to use.
264
265 --PSQL=PATH
266 (deprecated, this option may be removed in a future release!)
267 Tells the script where to find the psql program. Useful if you have
268 more than one version of the psql executable on your system, or if
269 there is no psql program in your path. Note that this option is in
270 all uppercase. By default, this option is not allowed. To enable
271 it, you must change the $NO_PSQL_OPTION near the top of the script
272 to 0. Avoid using this option if you can, and instead hard-code
273 your psql location into the $PSQL variable, also near the top of
274 the script.
275
276 --symlinks
277 Creates symlinks to the main program for each action.
278
279 --output=VAL
280 Determines the format of the output, for use in various programs.
281 The default is 'nagios'. Available options are 'nagios', 'mrtg',
282 'simple' and 'cacti'.
283
284 --mrtg=VAL
285 Used only for the MRTG or simple output, for a few specific
286 actions.
287
288 --debugoutput=VAL
289 Outputs the exact string returned by psql, for use in debugging.
290 The value is one or more letters, which determine if the output is
291 displayed or not, where 'a' = all, 'c' = critical, 'w' = warning,
292 'o' = ok, and 'u' = unknown. Letters can be combined.
293
294 --get_method=VAL
295 Allows specification of the method used to fetch information for
296 the "new_version_cp", "new_version_pg", "new_version_bc",
297 "new_version_box", and "new_version_tnm" checks. The following
298 programs are tried, in order, to grab the information from the web:
299 GET, wget, fetch, curl, lynx, links. To force the use of just one
300 (and thus remove the overhead of trying all the others until one of
301 those works), enter one of the names as the argument to get_method.
302 For example, a BSD box might enter the following line in their
303 ".check_postgresrc" file:
304
305 get_method=fetch
306
307 --language=VAL
308 Set the language to use for all output messages. Normally, this is
309 detected by examining the environment variables LC_ALL,
310 LC_MESSAGES, and LANG, but setting this option will override any
311 such detection.
312
314 The script runs one or more actions. This can either be done with the
315 --action flag, or by using a symlink to the main file that contains the
316 name of the action inside of it. For example, to run the action
317 "timesync", you may either issue:
318
319 check_postgres.pl --action=timesync
320
321 or use a program named:
322
323 check_postgres_timesync
324
325 All the symlinks are created for you in the current directory if use
326 the option --symlinks
327
328 perl check_postgres.pl --symlinks
329
330 If the file name already exists, it will not be overwritten. If the
331 file exists and is a symlink, you can force it to overwrite by using
332 "--action=build_symlinks_force"
333
334 Most actions take a --warning and a --critical option, indicating at
335 what point we change from OK to WARNING, and what point we go to
336 CRITICAL. Note that because criticals are always checked first, setting
337 the warning equal to the critical is an effective way to turn warnings
338 off and always give a critical.
339
340 The current supported actions are:
341
342 archive_ready
343 ("symlink: check_postgres_archive_ready") Checks how many WAL files
344 with extension .ready exist in the pg_xlog/archive_status directory,
345 which is found off of your data_directory. This action must be run as a
346 superuser, in order to access the contents of the
347 pg_xlog/archive_status directory. The minimum version to use this
348 action is Postgres 8.1. The --warning and --critical options are simply
349 the number of .ready files in the pg_xlog/archive_status directory.
350 Usually, these values should be low, turning on the archive mechanism,
351 we usually want it to archive WAL files as fast as possible.
352
353 If the archive command fail, number of WAL in your pg_xlog directory
354 will grow until exhausting all the disk space and force PostgreSQL to
355 stop immediately.
356
357 Example 1: Check that the number of ready WAL files is 10 or less on
358 host "pluto"
359
360 check_postgres_archive_ready --host=pluto --critical=10
361
362 For MRTG output, reports the number of ready WAL files on line 1.
363
364 autovac_freeze
365 ("symlink: check_postgres_autovac_freeze") Checks how close each
366 database is to the Postgres autovacuum_freeze_max_age setting. This
367 action will only work for databases version 8.2 or higher. The
368 --warning and --critical options should be expressed as percentages.
369 The 'age' of the transactions in each database is compared to the
370 autovacuum_freeze_max_age setting (200 million by default) to generate
371 a rounded percentage. The default values are 90% for the warning and
372 95% for the critical. Databases can be filtered by use of the --include
373 and --exclude options. See the "BASIC FILTERING" section for more
374 details.
375
376 Example 1: Give a warning when any databases on port 5432 are above 97%
377
378 check_postgres_autovac_freeze --port=5432 --warning="97%"
379
380 For MRTG output, the highest overall percentage is reported on the
381 first line, and the highest age is reported on the second line. All
382 databases which have the percentage from the first line are reported on
383 the fourth line, separated by a pipe symbol.
384
385 backends
386 ("symlink: check_postgres_backends") Checks the current number of
387 connections for one or more databases, and optionally compares it to
388 the maximum allowed, which is determined by the Postgres configuration
389 variable max_connections. The --warning and --critical options can take
390 one of three forms. First, a simple number can be given, which
391 represents the number of connections at which the alert will be given.
392 This choice does not use the max_connections setting. Second, the
393 percentage of available connections can be given. Third, a negative
394 number can be given which represents the number of connections left
395 until max_connections is reached. The default values for --warning and
396 --critical are '90%' and '95%'. You can also filter the databases by
397 use of the --include and --exclude options. See the "BASIC FILTERING"
398 section for more details.
399
400 To view only non-idle processes, you can use the --noidle argument.
401 Note that the user you are connecting as must be a superuser for this
402 to work properly.
403
404 Example 1: Give a warning when the number of connections on host quirm
405 reaches 120, and a critical if it reaches 150.
406
407 check_postgres_backends --host=quirm --warning=120 --critical=150
408
409 Example 2: Give a critical when we reach 75% of our max_connections
410 setting on hosts lancre or lancre2.
411
412 check_postgres_backends --warning='75%' --critical='75%' --host=lancre,lancre2
413
414 Example 3: Give a warning when there are only 10 more connection slots
415 left on host plasmid, and a critical when we have only 5 left.
416
417 check_postgres_backends --warning=-10 --critical=-5 --host=plasmid
418
419 Example 4: Check all databases except those with "test" in their name,
420 but allow ones that are named "pg_greatest". Connect as port 5432 on
421 the first two hosts, and as port 5433 on the third one. We want to
422 always throw a critical when we reach 30 or more connections.
423
424 check_postgres_backends --dbhost=hong,kong --dbhost=fooey --dbport=5432 --dbport=5433 --warning=30 --critical=30 --exclude="~test" --include="pg_greatest,~prod"
425
426 For MRTG output, the number of connections is reported on the first
427 line, and the fourth line gives the name of the database, plus the
428 current maximum_connections. If more than one database has been
429 queried, the one with the highest number of connections is output.
430
431 bloat
432 ("symlink: check_postgres_bloat") Checks the amount of bloat in tables
433 and indexes. (Bloat is generally the amount of dead unused space taken
434 up in a table or index. This space is usually reclaimed by use of the
435 VACUUM command.) This action requires that stats collection be enabled
436 on the target databases, and requires that ANALYZE is run frequently.
437 The --include and --exclude options can be used to filter out which
438 tables to look at. See the "BASIC FILTERING" section for more details.
439
440 The --warning and --critical options can be specified as sizes,
441 percents, or both. Valid size units are bytes, kilobytes, megabytes,
442 gigabytes, terabytes, exabytes, petabytes, and zettabytes. You can
443 abbreviate all of those with the first letter. Items without units are
444 assumed to be 'bytes'. The default values are '1 GB' and '5 GB'. The
445 value represents the number of "wasted bytes", or the difference
446 between what is actually used by the table and index, and what we
447 compute that it should be.
448
449 Note that this action has two hard-coded values to avoid false alarms
450 on smaller relations. Tables must have at least 10 pages, and indexes
451 at least 15, before they can be considered by this test. If you really
452 want to adjust these values, you can look for the variables $MINPAGES
453 and $MINIPAGES at the top of the "check_bloat" subroutine. These values
454 are ignored if either --exclude or --include is used.
455
456 Only the top 10 most bloated relations are shown. You can change this
457 number by using the --perflimit option to set your own limit.
458
459 The schema named 'information_schema' is excluded from this test, as
460 the only tables it contains are small and do not change.
461
462 Please note that the values computed by this action are not precise,
463 and should be used as a guideline only. Great effort was made to
464 estimate the correct size of a table, but in the end it is only an
465 estimate. The correct index size is even more of a guess than the
466 correct table size, but both should give a rough idea of how bloated
467 things are.
468
469 Example 1: Warn if any table on port 5432 is over 100 MB bloated, and
470 critical if over 200 MB
471
472 check_postgres_bloat --port=5432 --warning='100 M' --critical='200 M'
473
474 Example 2: Give a critical if table 'orders' on host 'sami' has more
475 than 10 megs of bloat
476
477 check_postgres_bloat --host=sami --include=orders --critical='10 MB'
478
479 Example 3: Give a critical if table 'q4' on database 'sales' is over
480 50% bloated
481
482 check_postgres_bloat --db=sales --include=q4 --critical='50%'
483
484 Example 4: Give a critical any table is over 20% bloated and has over
485 150 MB of bloat:
486
487 check_postgres_bloat --port=5432 --critical='20% and 150 M'
488
489 Example 5: Give a critical any table is over 40% bloated or has over
490 500 MB of bloat:
491
492 check_postgres_bloat --port=5432 --warning='500 M or 40%'
493
494 For MRTG output, the first line gives the highest number of wasted
495 bytes for the tables, and the second line gives the highest number of
496 wasted bytes for the indexes. The fourth line gives the database name,
497 table name, and index name information. If you want to output the bloat
498 ratio instead (how many times larger the relation is compared to how
499 large it should be), just pass in "--mrtg=ratio".
500
501 checkpoint
502 ("symlink: check_postgres_checkpoint") Determines how long since the
503 last checkpoint has been run. This must run on the same server as the
504 database that is being checked (e.g. the -h flag will not work). This
505 check is meant to run on a "warm standby" server that is actively
506 processing shipped WAL files, and is meant to check that your warm
507 standby is truly 'warm'. The data directory must be set, either by the
508 environment variable "PGDATA", or passing the "--datadir" argument. It
509 returns the number of seconds since the last checkpoint was run, as
510 determined by parsing the call to "pg_controldata". Because of this,
511 the pg_controldata executable must be available in the current path.
512 Alternatively, you can specify "PGBINDIR" as the directory that it
513 lives in. It is also possible to use the special options --assume-prod
514 or --assume-standby-mode, if the mode found is not the one expected, a
515 CRITICAL is emitted.
516
517 At least one warning or critical argument must be set.
518
519 This action requires the Date::Parse module.
520
521 For MRTG or simple output, returns the number of seconds.
522
523 cluster_id
524 ("symlink: check_postgres_cluster-id") Checks that the Database System
525 Identifier provided by pg_controldata is the same as last time you
526 checked. This must run on the same server as the database that is being
527 checked (e.g. the -h flag will not work). Either the --warning or the
528 --critical option should be given, but not both. The value of each one
529 is the cluster identifier, an integer value. You can run with the
530 special "--critical=0" option to find out an existing cluster
531 identifier.
532
533 Example 1: Find the initial identifier
534
535 check_postgres_cluster_id --critical=0 --datadir=/var//lib/postgresql/9.0/main
536
537 Example 2: Make sure the cluster is the same and warn if not, using the
538 result from above.
539
540 check_postgres_cluster_id --critical=5633695740047915135
541
542 For MRTG output, returns a 1 or 0 indicating success of failure of the
543 identifier to match. A identifier must be provided as the "--mrtg"
544 argument. The fourth line always gives the current identifier.
545
546 commitratio
547 ("symlink: check_postgres_commitratio") Checks the commit ratio of all
548 databases and complains when they are too low. There is no need to run
549 this command more than once per database cluster. Databases can be
550 filtered with the --include and --exclude options. See the "BASIC
551 FILTERING" section for more details. They can also be filtered by the
552 owner of the database with the --includeuser and --excludeuser options.
553 See the "USER NAME FILTERING" section for more details.
554
555 The warning and critical options should be specified as percentages.
556 There are not defaults for this action: the warning and critical must
557 be specified. The warning value cannot be greater than the critical
558 value. The output returns all databases sorted by commitratio, smallest
559 first.
560
561 Example: Warn if any database on host flagg is less than 90% in
562 commitratio, and critical if less then 80%.
563
564 check_postgres_database_commitratio --host=flagg --warning='90%' --critical='80%'
565
566 For MRTG output, returns the percentage of the database with the
567 smallest commitratio on the first line, and the name of the database on
568 the fourth line.
569
570 connection
571 ("symlink: check_postgres_connection") Simply connects, issues a
572 'SELECT version()', and leaves. Takes no --warning or --critical
573 options.
574
575 For MRTG output, simply outputs a 1 (good connection) or a 0 (bad
576 connection) on the first line.
577
578 custom_query
579 ("symlink: check_postgres_custom_query") Runs a custom query of your
580 choosing, and parses the results. The query itself is passed in
581 through the "query" argument, and should be kept as simple as possible.
582 If at all possible, wrap it in a view or a function to keep things
583 easier to manage. The query should return one or two columns. It is
584 required that one of the columns be named "result" and is the item that
585 will be checked against your warning and critical values. The second
586 column is for the performance data and any name can be used: this will
587 be the 'value' inside the performance data section.
588
589 At least one warning or critical argument must be specified. What these
590 are set to depends on the type of query you are running. There are four
591 types of custom_queries that can be run, specified by the "valtype"
592 argument. If none is specified, this action defaults to 'integer'. The
593 four types are:
594
595 integer: Does a simple integer comparison. The first column should be a
596 simple integer, and the warning and critical values should be the same.
597
598 string: The warning and critical are strings, and are triggered only if
599 the value in the first column matches it exactly. This is case-
600 sensitive.
601
602 time: The warning and the critical are times, and can have units of
603 seconds, minutes, hours, or days. Each may be written singular or
604 abbreviated to just the first letter. If no units are given, seconds
605 are assumed. The first column should be an integer representing the
606 number of seconds to check.
607
608 size: The warning and the critical are sizes, and can have units of
609 bytes, kilobytes, megabytes, gigabytes, terabytes, or exabytes. Each
610 may be abbreviated to the first letter. If no units are given, bytes
611 are assumed. The first column should be an integer representing the
612 number of bytes to check.
613
614 Normally, an alert is triggered if the values returned are greater than
615 or equal to the critical or warning value. However, an option of
616 --reverse will trigger the alert if the returned value is lower than or
617 equal to the critical or warning value.
618
619 Example 1: Warn if any relation over 100 pages is named "rad", put the
620 number of pages inside the performance data section.
621
622 check_postgres_custom_query --valtype=string -w "rad" --query=
623 "SELECT relname AS result, relpages AS pages FROM pg_class WHERE relpages > 100"
624
625 Example 2: Give a critical if the "foobar" function returns a number
626 over 5MB:
627
628 check_postgres_custom_query --critical='5MB'--valtype=size --query="SELECT foobar() AS result"
629
630 Example 2: Warn if the function "snazzo" returns less than 42:
631
632 check_postgres_custom_query --critical=42 --query="SELECT snazzo() AS result" --reverse
633
634 If you come up with a useful custom_query, consider sending in a patch
635 to this program to make it into a standard action that other people can
636 use.
637
638 This action does not support MRTG or simple output yet.
639
640 database_size
641 ("symlink: check_postgres_database_size") Checks the size of all
642 databases and complains when they are too big. There is no need to run
643 this command more than once per database cluster. Databases can be
644 filtered with the --include and --exclude options. See the "BASIC
645 FILTERING" section for more details. They can also be filtered by the
646 owner of the database with the --includeuser and --excludeuser options.
647 See the "USER NAME FILTERING" section for more details.
648
649 The warning and critical options can be specified as bytes, kilobytes,
650 megabytes, gigabytes, terabytes, or exabytes. Each may be abbreviated
651 to the first letter as well. If no unit is given, the units are
652 assumed to be bytes. There are not defaults for this action: the
653 warning and critical must be specified. The warning value cannot be
654 greater than the critical value. The output returns all databases
655 sorted by size largest first, showing both raw bytes and a "pretty"
656 version of the size.
657
658 Example 1: Warn if any database on host flagg is over 1 TB in size, and
659 critical if over 1.1 TB.
660
661 check_postgres_database_size --host=flagg --warning='1 TB' --critical='1.1 t'
662
663 Example 2: Give a critical if the database template1 on port 5432 is
664 over 10 MB.
665
666 check_postgres_database_size --port=5432 --include=template1 --warning='10MB' --critical='10MB'
667
668 Example 3: Give a warning if any database on host 'tardis' owned by the
669 user 'tom' is over 5 GB
670
671 check_postgres_database_size --host=tardis --includeuser=tom --warning='5 GB' --critical='10 GB'
672
673 For MRTG output, returns the size in bytes of the largest database on
674 the first line, and the name of the database on the fourth line.
675
676 dbstats
677 ("symlink: check_postgres_dbstats") Reports information from the
678 pg_stat_database view, and outputs it in a Cacti-friendly manner. No
679 other output is supported, as the output is informational and does not
680 lend itself to alerts, such as used with Nagios. If no options are
681 given, all databases are returned, one per line. You can include a
682 specific database by use of the "--include" option, or you can use the
683 "--dbname" option.
684
685 Eleven items are returned on each line, in the format name:value,
686 separated by a single space. The items are:
687
688 backends
689 The number of currently running backends for this database.
690
691 commits
692 The total number of commits for this database since it was created
693 or reset.
694
695 rollbacks
696 The total number of rollbacks for this database since it was
697 created or reset.
698
699 read
700 The total number of disk blocks read.
701
702 hit The total number of buffer hits.
703
704 ret The total number of rows returned.
705
706 fetch
707 The total number of rows fetched.
708
709 ins The total number of rows inserted.
710
711 upd The total number of rows updated.
712
713 del The total number of rows deleted.
714
715 dbname
716 The name of the database.
717
718 Note that ret, fetch, ins, upd, and del items will always be 0 if
719 Postgres is version 8.2 or lower, as those stats were not available in
720 those versions.
721
722 If the dbname argument is given, seven additional items are returned:
723
724 idxscan
725 Total number of user index scans.
726
727 idxtupread
728 Total number of user index entries returned.
729
730 idxtupfetch
731 Total number of rows fetched by simple user index scans.
732
733 idxblksread
734 Total number of disk blocks read for all user indexes.
735
736 idxblkshit
737 Total number of buffer hits for all user indexes.
738
739 seqscan
740 Total number of sequential scans against all user tables.
741
742 seqtupread
743 Total number of tuples returned from all user tables.
744
745 Example 1: Grab the stats for a database named "products" on host
746 "willow":
747
748 check_postgres_dbstats --dbhost willow --dbname products
749
750 The output returned will be like this (all on one line, not wrapped):
751
752 backends:82 commits:58374408 rollbacks:1651 read:268435543 hit:2920381758 idxscan:310931294 idxtupread:2777040927
753 idxtupfetch:1840241349 idxblksread:62860110 idxblkshit:1107812216 seqscan:5085305 seqtupread:5370500520
754 ret:0 fetch:0 ins:0 upd:0 del:0 dbname:willow
755
756 disabled_triggers
757 ("symlink: check_postgres_disabled_triggers") Checks on the number of
758 disabled triggers inside the database. The --warning and --critical
759 options are the number of such triggers found, and both default to "1",
760 as in normal usage having disabled triggers is a dangerous event. If
761 the database being checked is 8.3 or higher, the check is for the
762 number of triggers that are in a 'disabled' status (as opposed to being
763 'always' or 'replica'). The output will show the name of the table and
764 the name of the trigger for each disabled trigger.
765
766 Example 1: Make sure that there are no disabled triggers
767
768 check_postgres_disabled_triggers
769
770 For MRTG output, returns the number of disabled triggers on the first
771 line.
772
773 disk_space
774 ("symlink: check_postgres_disk_space") Checks on the available physical
775 disk space used by Postgres. This action requires that you have the
776 executable "/bin/df" available to report on disk sizes, and it also
777 needs to be run as a superuser, so it can examine the data_directory
778 setting inside of Postgres. The --warning and --critical options are
779 given in either sizes or percentages or both. If using sizes, the
780 standard unit types are allowed: bytes, kilobytes, gigabytes,
781 megabytes, gigabytes, terabytes, or exabytes. Each may be abbreviated
782 to the first letter only; no units at all indicates 'bytes'. The
783 default values are '90%' and '95%'.
784
785 This command checks the following things to determine all of the
786 different physical disks being used by Postgres.
787
788 data_directory - The disk that the main data directory is on.
789
790 log directory - The disk that the log files are on.
791
792 WAL file directory - The disk that the write-ahead logs are on (e.g.
793 symlinked pg_xlog)
794
795 tablespaces - Each tablespace that is on a separate disk.
796
797 The output shows the total size used and available on each disk, as
798 well as the percentage, ordered by highest to lowest percentage used.
799 Each item above maps to a file system: these can be included or
800 excluded. See the "BASIC FILTERING" section for more details.
801
802 Example 1: Make sure that no file system is over 90% for the database
803 on port 5432.
804
805 check_postgres_disk_space --port=5432 --warning='90%' --critical='90%'
806
807 Example 2: Check that all file systems starting with /dev/sda are
808 smaller than 10 GB and 11 GB (warning and critical)
809
810 check_postgres_disk_space --port=5432 --warning='10 GB' --critical='11 GB' --include="~^/dev/sda"
811
812 Example 4: Make sure that no file system is both over 50% and has over
813 15 GB
814
815 check_postgres_disk_space --critical='50% and 15 GB'
816
817 Example 5: Issue a warning if any file system is either over 70% full
818 or has more than 1T
819
820 check_postgres_disk_space --warning='1T or 75'
821
822 For MRTG output, returns the size in bytes of the file system on the
823 first line, and the name of the file system on the fourth line.
824
825 fsm_pages
826 ("symlink: check_postgres_fsm_pages") Checks how close a cluster is to
827 the Postgres max_fsm_pages setting. This action will only work for
828 databases of 8.2 or higher, and it requires the contrib module
829 pg_freespacemap be installed. The --warning and --critical options
830 should be expressed as percentages. The number of used pages in the
831 free-space-map is determined by looking in the
832 pg_freespacemap_relations view, and running a formula based on the
833 formula used for outputting free-space-map pageslots in the vacuum
834 verbose command. The default values are 85% for the warning and 95% for
835 the critical.
836
837 Example 1: Give a warning when our cluster has used up 76% of the free-
838 space pageslots, with pg_freespacemap installed in database robert
839
840 check_postgres_fsm_pages --dbname=robert --warning="76%"
841
842 While you need to pass in the name of the database where
843 pg_freespacemap is installed, you only need to run this check once per
844 cluster. Also, checking this information does require obtaining special
845 locks on the free-space-map, so it is recommend you do not run this
846 check with short intervals.
847
848 For MRTG output, returns the percent of free-space-map on the first
849 line, and the number of pages currently used on the second line.
850
851 fsm_relations
852 ("symlink: check_postgres_fsm_relations") Checks how close a cluster is
853 to the Postgres max_fsm_relations setting. This action will only work
854 for databases of 8.2 or higher, and it requires the contrib module
855 pg_freespacemap be installed. The --warning and --critical options
856 should be expressed as percentages. The number of used relations in the
857 free-space-map is determined by looking in the
858 pg_freespacemap_relations view. The default values are 85% for the
859 warning and 95% for the critical.
860
861 Example 1: Give a warning when our cluster has used up 80% of the free-
862 space relations, with pg_freespacemap installed in database dylan
863
864 check_postgres_fsm_relations --dbname=dylan --warning="75%"
865
866 While you need to pass in the name of the database where
867 pg_freespacemap is installed, you only need to run this check once per
868 cluster. Also, checking this information does require obtaining special
869 locks on the free-space-map, so it is recommend you do not run this
870 check with short intervals.
871
872 For MRTG output, returns the percent of free-space-map on the first
873 line, the number of relations currently used on the second line.
874
875 hitratio
876 ("symlink: check_postgres_hitratio") Checks the hit ratio of all
877 databases and complains when they are too low. There is no need to run
878 this command more than once per database cluster. Databases can be
879 filtered with the --include and --exclude options. See the "BASIC
880 FILTERING" section for more details. They can also be filtered by the
881 owner of the database with the --includeuser and --excludeuser options.
882 See the "USER NAME FILTERING" section for more details.
883
884 The warning and critical options should be specified as percentages.
885 There are not defaults for this action: the warning and critical must
886 be specified. The warning value cannot be greater than the critical
887 value. The output returns all databases sorted by hitratio, smallest
888 first.
889
890 Example: Warn if any database on host flagg is less than 90% in
891 hitratio, and critical if less then 80%.
892
893 check_postgres_hitratio --host=flagg --warning='90%' --critical='80%'
894
895 For MRTG output, returns the percentage of the database with the
896 smallest hitratio on the first line, and the name of the database on
897 the fourth line.
898
899 hot_standby_delay
900 ("symlink: check_hot_standby_delay") Checks the streaming replication
901 lag by computing the delta between the current xlog position of a
902 master server and the replay location of a slave connected to it. The
903 slave server must be in hot_standby (e.g. read only) mode, therefore
904 the minimum version to use this action is Postgres 9.0. The --warning
905 and --critical options are the delta between the xlog locations. Since
906 these values are byte offsets in the WAL they should match the expected
907 transaction volume of your application to prevent false positives or
908 negatives.
909
910 The first "--dbname", "--host", and "--port", etc. options are
911 considered the master; the second belongs to the slave.
912
913 Byte values should be based on the volume of transactions needed to
914 have the streaming replication disconnect from the master because of
915 too much lag, determined by the Postgres configuration variable
916 wal_keep_segments. For units of time, valid units are 'seconds',
917 'minutes', 'hours', or 'days'. Each may be written singular or
918 abbreviated to just the first letter. When specifying both, in the form
919 'bytes and time', both conditions must be true for the threshold to be
920 met.
921
922 You must provide information on how to reach the databases by providing
923 a comma separated list to the --dbhost and --dbport parameters, such as
924 "--dbport=5432,5543". If not given, the action fails.
925
926 Example 1: Warn a database with a local replica on port 5433 is behind
927 on any xlog replay at all
928
929 check_hot_standby_delay --dbport=5432,5433 --warning='1'
930
931 Example 2: Give a critical if the last transaction replica1 receives is
932 more than 10 minutes ago
933
934 check_hot_standby_delay --dbhost=master,replica1 --critical='10 min'
935
936 Example 3: Allow replica1 to be 1 WAL segment behind, if the master is
937 momentarily seeing more activity than the streaming replication
938 connection can handle, or 10 minutes behind, if the master is seeing
939 very little activity and not processing any transactions, but not both,
940 which would indicate a lasting problem with the replication connection.
941
942 check_hot_standby_delay --dbhost=master,replica1 --warning='1048576 and 2 min' --critical='16777216 and 10 min'
943
944 index_size
945 table_size
946 relation_size
947 (symlinks: "check_postgres_index_size", "check_postgres_table_size",
948 and "check_postgres_relation_size") The actions table_size and
949 index_size are simply variations of the relation_size action, which
950 checks for a relation that has grown too big. Relations (in other
951 words, tables and indexes) can be filtered with the --include and
952 --exclude options. See the "BASIC FILTERING" section for more details.
953 Relations can also be filtered by the user that owns them, by using the
954 --includeuser and --excludeuser options. See the "USER NAME FILTERING"
955 section for more details.
956
957 The values for the --warning and --critical options are file sizes, and
958 may have units of bytes, kilobytes, megabytes, gigabytes, terabytes, or
959 exabytes. Each can be abbreviated to the first letter. If no units are
960 given, bytes are assumed. There are no default values: both the warning
961 and the critical option must be given. The return text shows the size
962 of the largest relation found.
963
964 If the --showperf option is enabled, all of the relations with their
965 sizes will be given. To prevent this, it is recommended that you set
966 the --perflimit option, which will cause the query to do a "ORDER BY
967 size DESC LIMIT (perflimit)".
968
969 Example 1: Give a critical if any table is larger than 600MB on host
970 burrick.
971
972 check_postgres_table_size --critical='600 MB' --warning='600 MB' --host=burrick
973
974 Example 2: Warn if the table products is over 4 GB in size, and give a
975 critical at 4.5 GB.
976
977 check_postgres_table_size --host=burrick --warning='4 GB' --critical='4.5 GB' --include=products
978
979 Example 3: Warn if any index not owned by postgres goes over 500 MB.
980
981 check_postgres_index_size --port=5432 --excludeuser=postgres -w 500MB -c 600MB
982
983 For MRTG output, returns the size in bytes of the largest relation, and
984 the name of the database and relation as the fourth line.
985
986 last_analyze
987 last_vacuum
988 last_autoanalyze
989 last_autovacuum
990 (symlinks: "check_postgres_last_analyze", "check_postgres_last_vacuum",
991 "check_postgres_last_autoanalyze", and
992 "check_postgres_last_autovacuum") Checks how long it has been since
993 vacuum (or analyze) was last run on each table in one or more
994 databases. Use of these actions requires that the target database is
995 version 8.3 or greater, or that the version is 8.2 and the
996 configuration variable stats_row_level has been enabled. Tables can be
997 filtered with the --include and --exclude options. See the "BASIC
998 FILTERING" section for more details. Tables can also be filtered by
999 their owner by use of the --includeuser and --excludeuser options. See
1000 the "USER NAME FILTERING" section for more details.
1001
1002 The units for --warning and --critical are specified as times. Valid
1003 units are seconds, minutes, hours, and days; all can be abbreviated to
1004 the first letter. If no units are given, 'seconds' are assumed. The
1005 default values are '1 day' and '2 days'. Please note that there are
1006 cases in which this field does not get automatically populated. If
1007 certain tables are giving you problems, make sure that they have dead
1008 rows to vacuum, or just exclude them from the test.
1009
1010 The schema named 'information_schema' is excluded from this test, as
1011 the only tables it contains are small and do not change.
1012
1013 Note that the non-'auto' versions will also check on the auto versions
1014 as well. In other words, using last_vacuum will report on the last
1015 vacuum, whether it was a normal vacuum, or one run by the autovacuum
1016 daemon.
1017
1018 Example 1: Warn if any table has not been vacuumed in 3 days, and give
1019 a critical at a week, for host wormwood
1020
1021 check_postgres_last_vacuum --host=wormwood --warning='3d' --critical='7d'
1022
1023 Example 2: Same as above, but skip tables belonging to the users 'eve'
1024 or 'mallory'
1025
1026 check_postgres_last_vacuum --host=wormwood --warning='3d' --critical='7d' --excludeusers=eve,mallory
1027
1028 For MRTG output, returns (on the first line) the LEAST amount of time
1029 in seconds since a table was last vacuumed or analyzed. The fourth line
1030 returns the name of the database and name of the table.
1031
1032 listener
1033 ("symlink: check_postgres_listener") Confirm that someone is listening
1034 for one or more specific strings (using the LISTEN/NOTIFY system), by
1035 looking at the pg_listener table. Only one of warning or critical is
1036 needed. The format is a simple string representing the LISTEN target,
1037 or a tilde character followed by a string for a regular expression
1038 check. Note that this check will not work on versions of Postgres 9.0
1039 or higher.
1040
1041 Example 1: Give a warning if nobody is listening for the string
1042 bucardo_mcp_ping on ports 5555 and 5556
1043
1044 check_postgres_listener --port=5555,5556 --warning=bucardo_mcp_ping
1045
1046 Example 2: Give a critical if there are no active LISTEN requests
1047 matching 'grimm' on database oskar
1048
1049 check_postgres_listener --db oskar --critical=~grimm
1050
1051 For MRTG output, returns a 1 or a 0 on the first, indicating success or
1052 failure. The name of the notice must be provided via the --mrtg option.
1053
1054 locks
1055 ("symlink: check_postgres_locks") Check the total number of locks on
1056 one or more databases. There is no need to run this more than once per
1057 database cluster. Databases can be filtered with the --include and
1058 --exclude options. See the "BASIC FILTERING" section for more details.
1059
1060 The --warning and --critical options can be specified as simple
1061 numbers, which represent the total number of locks, or they can be
1062 broken down by type of lock. Valid lock names are 'total', 'waiting',
1063 or the name of a lock type used by Postgres. These names are case-
1064 insensitive and do not need the "lock" part on the end, so exclusive
1065 will match 'ExclusiveLock'. The format is name=number, with different
1066 items separated by colons or semicolons (or any other symbol).
1067
1068 Example 1: Warn if the number of locks is 100 or more, and critical if
1069 200 or more, on host garrett
1070
1071 check_postgres_locks --host=garrett --warning=100 --critical=200
1072
1073 Example 2: On the host artemus, warn if 200 or more locks exist, and
1074 give a critical if over 250 total locks exist, or if over 20 exclusive
1075 locks exist, or if over 5 connections are waiting for a lock.
1076
1077 check_postgres_locks --host=artemus --warning=200 --critical="total=250:waiting=5:exclusive=20"
1078
1079 For MRTG output, returns the number of locks on the first line, and the
1080 name of the database on the fourth line.
1081
1082 logfile
1083 ("symlink: check_postgres_logfile") Ensures that the logfile is in the
1084 expected location and is being logged to. This action issues a command
1085 that throws an error on each database it is checking, and ensures that
1086 the message shows up in the logs. It scans the various log_* settings
1087 inside of Postgres to figure out where the logs should be. If you are
1088 using syslog, it does a rough (but not foolproof) scan of
1089 /etc/syslog.conf. Alternatively, you can provide the name of the
1090 logfile with the --logfile option. This is especially useful if the
1091 logs have a custom rotation scheme driven be an external program. The
1092 --logfile option supports the following escape characters: "%Y %m %d
1093 %H", which represent the current year, month, date, and hour
1094 respectively. An error is always reported as critical unless the
1095 warning option has been passed in as a non-zero value. Other than that
1096 specific usage, the "--warning" and "--critical" options should not be
1097 used.
1098
1099 Example 1: On port 5432, ensure the logfile is being written to the
1100 file /home/greg/pg8.2.log
1101
1102 check_postgres_logfile --port=5432 --logfile=/home/greg/pg8.2.log
1103
1104 Example 2: Same as above, but raise a warning, not a critical
1105
1106 check_postgres_logfile --port=5432 --logfile=/home/greg/pg8.2.log -w 1
1107
1108 For MRTG output, returns a 1 or 0 on the first line, indicating success
1109 or failure. In case of a failure, the fourth line will provide more
1110 detail on the failure encountered.
1111
1112 new_version_bc
1113 ("symlink: check_postgres_new_version_bc") Checks if a newer version of
1114 the Bucardo program is available. The current version is obtained by
1115 running "bucardo_ctl --version". If a major upgrade is available, a
1116 warning is returned. If a revision upgrade is available, a critical is
1117 returned. (Bucardo is a master to slave, and master to master
1118 replication system for Postgres: see http://bucardo.org for more
1119 information). See also the information on the "--get_method" option.
1120
1121 new_version_box
1122 ("symlink: check_postgres_new_version_box") Checks if a newer version
1123 of the boxinfo program is available. The current version is obtained by
1124 running "boxinfo.pl --version". If a major upgrade is available, a
1125 warning is returned. If a revision upgrade is available, a critical is
1126 returned. (boxinfo is a program for grabbing important information from
1127 a server and putting it into a HTML format: see
1128 http://bucardo.org/wiki/boxinfo for more information). See also the
1129 information on the "--get_method" option.
1130
1131 new_version_cp
1132 ("symlink: check_postgres_new_version_cp") Checks if a newer version of
1133 this program (check_postgres.pl) is available, by grabbing the version
1134 from a small text file on the main page of the home page for the
1135 project. Returns a warning if the returned version does not match the
1136 one you are running. Recommended interval to check is once a day. See
1137 also the information on the "--get_method" option.
1138
1139 new_version_pg
1140 ("symlink: check_postgres_new_version_pg") Checks if a newer revision
1141 of Postgres exists for each database connected to. Note that this only
1142 checks for revision, e.g. going from 8.3.6 to 8.3.7. Revisions are
1143 always 100% binary compatible and involve no dump and restore to
1144 upgrade. Revisions are made to address bugs, so upgrading as soon as
1145 possible is always recommended. Returns a warning if you do not have
1146 the latest revision. It is recommended this check is run at least once
1147 a day. See also the information on the "--get_method" option.
1148
1149 new_version_tnm
1150 ("symlink: check_postgres_new_version_tnm") Checks if a newer version
1151 of the tail_n_mail program is available. The current version is
1152 obtained by running "tail_n_mail --version". If a major upgrade is
1153 available, a warning is returned. If a revision upgrade is available, a
1154 critical is returned. (tail_n_mail is a log monitoring tool that can
1155 send mail when interesting events appear in your Postgres logs. See:
1156 http://bucardo.org/wiki/Tail_n_mail for more information). See also
1157 the information on the "--get_method" option.
1158
1159 pgb_pool_cl_active
1160 pgb_pool_cl_waiting
1161 pgb_pool_sv_active
1162 pgb_pool_sv_idle
1163 pgb_pool_sv_used
1164 pgb_pool_sv_tested
1165 pgb_pool_sv_login
1166 pgb_pool_maxwait
1167 (symlinks: "check_postgres_pgb_pool_cl_active",
1168 "check_postgres_pgb_pool_cl_waiting",
1169 "check_postgres_pgb_pool_sv_active", "check_postgres_pgb_pool_sv_idle",
1170 "check_postgres_pgb_pool_sv_used", "check_postgres_pgb_pool_sv_tested",
1171 "check_postgres_pgb_pool_sv_login", and
1172 "check_postgres_pgb_pool_maxwait")
1173
1174 Examines pgbouncer's pool statistics. Each pool has a set of "client"
1175 connections, referring to connections from external clients, and
1176 "server" connections, referring to connections to PostgreSQL itself.
1177 The related check_postgres actions are prefixed by "cl_" and "sv_",
1178 respectively. Active client connections are those connections currently
1179 linked with an active server connection. Client connections may also be
1180 "waiting", meaning they have not yet been allocated a server
1181 connection. Server connections are "active" (linked to a client),
1182 "idle" (standing by for a client connection to link with), "used" (just
1183 unlinked from a client, and not yet returned to the idle pool),
1184 "tested" (currently being tested) and "login" (in the process of
1185 logging in). The maxwait value shows how long in seconds the oldest
1186 waiting client connection has been waiting.
1187
1188 pgbouncer_backends
1189 ("symlink: check_postgres_pgbouncer_backends") Checks the current
1190 number of connections for one or more databases through pgbouncer, and
1191 optionally compares it to the maximum allowed, which is determined by
1192 the pgbouncer configuration variable max_client_conn. The --warning and
1193 --critical options can take one of three forms. First, a simple number
1194 can be given, which represents the number of connections at which the
1195 alert will be given. This choice does not use the max_connections
1196 setting. Second, the percentage of available connections can be given.
1197 Third, a negative number can be given which represents the number of
1198 connections left until max_connections is reached. The default values
1199 for --warning and --critical are '90%' and '95%'. You can also filter
1200 the databases by use of the --include and --exclude options. See the
1201 "BASIC FILTERING" section for more details.
1202
1203 To view only non-idle processes, you can use the --noidle argument.
1204 Note that the user you are connecting as must be a superuser for this
1205 to work properly.
1206
1207 Example 1: Give a warning when the number of connections on host quirm
1208 reaches 120, and a critical if it reaches 150.
1209
1210 check_postgres_pgbouncer_backends --host=quirm --warning=120 --critical=150 -p 6432 -u pgbouncer
1211
1212 Example 2: Give a critical when we reach 75% of our max_connections
1213 setting on hosts lancre or lancre2.
1214
1215 check_postgres_pgbouncer_backends --warning='75%' --critical='75%' --host=lancre,lancre2 -p 6432 -u pgbouncer
1216
1217 Example 3: Give a warning when there are only 10 more connection slots
1218 left on host plasmid, and a critical when we have only 5 left.
1219
1220 check_postgres_pgbouncer_backends --warning=-10 --critical=-5 --host=plasmid -p 6432 -u pgbouncer
1221
1222 For MRTG output, the number of connections is reported on the first
1223 line, and the fourth line gives the name of the database, plus the
1224 current max_client_conn. If more than one database has been queried,
1225 the one with the highest number of connections is output.
1226
1227 pgbouncer_checksum
1228 ("symlink: check_postgres_pgbouncer_checksum") Checks that all the
1229 pgBouncer settings are the same as last time you checked. This is done
1230 by generating a checksum of a sorted list of setting names and their
1231 values. Note that you shouldn't specify the database name, it will
1232 automatically default to pgbouncer. Either the --warning or the
1233 --critical option should be given, but not both. The value of each one
1234 is the checksum, a 32-character hexadecimal value. You can run with the
1235 special "--critical=0" option to find out an existing checksum.
1236
1237 This action requires the Digest::MD5 module.
1238
1239 Example 1: Find the initial checksum for pgbouncer configuration on
1240 port 6432 using the default user (usually postgres)
1241
1242 check_postgres_pgbouncer_checksum --port=6432 --critical=0
1243
1244 Example 2: Make sure no settings have changed and warn if so, using the
1245 checksum from above.
1246
1247 check_postgres_pgbouncer_checksum --port=6432 --warning=cd2f3b5e129dc2b4f5c0f6d8d2e64231
1248
1249 For MRTG output, returns a 1 or 0 indicating success of failure of the
1250 checksum to match. A checksum must be provided as the "--mrtg"
1251 argument. The fourth line always gives the current checksum.
1252
1253 pgagent_jobs
1254 ("symlink: check_postgres_pgagent_jobs") Checks that all the pgAgent
1255 jobs that have executed in the preceding interval of time have
1256 succeeded. This is done by checking for any steps that have a non-zero
1257 result.
1258
1259 Either "--warning" or "--critical", or both, may be specified as times,
1260 and jobs will be checked for failures withing the specified periods of
1261 time before the current time. Valid units are seconds, minutes, hours,
1262 and days; all can be abbreviated to the first letter. If no units are
1263 given, 'seconds' are assumed.
1264
1265 Example 1: Give a critical when any jobs executed in the last day have
1266 failed.
1267
1268 check_postgres_pgagent_jobs --critical=1d
1269
1270 Example 2: Give a warning when any jobs executed in the last week have
1271 failed.
1272
1273 check_postgres_pgagent_jobs --warning=7d
1274
1275 Example 3: Give a critical for jobs that have failed in the last 2
1276 hours and a warning for jobs that have failed in the last 4 hours:
1277
1278 check_postgres_pgagent_jobs --critical=2h --warning=4h
1279
1280 prepared_txns
1281 ("symlink: check_postgres_prepared_txns") Check on the age of any
1282 existing prepared transactions. Note that most people will NOT use
1283 prepared transactions, as they are part of two-part commit and
1284 complicated to maintain. They should also not be confused with prepared
1285 STATEMENTS, which is what most people think of when they hear prepare.
1286 The default value for a warning is 1 second, to detect any use of
1287 prepared transactions, which is probably a mistake on most systems.
1288 Warning and critical are the number of seconds a prepared transaction
1289 has been open before an alert is given.
1290
1291 Example 1: Give a warning on detecting any prepared transactions:
1292
1293 check_postgres_prepared_txns -w 0
1294
1295 Example 2: Give a critical if any prepared transaction has been open
1296 longer than 10 seconds, but allow up to 360 seconds for the database
1297 'shrike':
1298
1299 check_postgres_prepared_txns --critical=10 --exclude=shrike
1300 check_postgres_prepared_txns --critical=360 --include=shrike
1301
1302 For MRTG output, returns the number of seconds the oldest transaction
1303 has been open as the first line, and which database is came from as the
1304 final line.
1305
1306 query_runtime
1307 ("symlink: check_postgres_query_runtime") Checks how long a specific
1308 query takes to run, by executing a "EXPLAIN ANALYZE" against it. The
1309 --warning and --critical options are the maximum amount of time the
1310 query should take. Valid units are seconds, minutes, and hours; any can
1311 be abbreviated to the first letter. If no units are given, 'seconds'
1312 are assumed. Both the warning and the critical option must be given.
1313 The name of the view or function to be run must be passed in to the
1314 --queryname option. It must consist of a single word (or schema.word),
1315 with optional parens at the end.
1316
1317 Example 1: Give a critical if the function named "speedtest" fails to
1318 run in 10 seconds or less.
1319
1320 check_postgres_query_runtime --queryname='speedtest()' --critical=10 --warning=10
1321
1322 For MRTG output, reports the time in seconds for the query to complete
1323 on the first line. The fourth line lists the database.
1324
1325 query_time
1326 ("symlink: check_postgres_query_time") Checks the length of running
1327 queries on one or more databases. There is no need to run this more
1328 than once on the same database cluster. Note that this already excludes
1329 queries that are "idle in transaction". Databases can be filtered by
1330 using the --include and --exclude options. See the "BASIC FILTERING"
1331 section for more details. You can also filter on the user running the
1332 query with the --includeuser and --excludeuser options. See the "USER
1333 NAME FILTERING" section for more details.
1334
1335 The values for the --warning and --critical options are amounts of
1336 time, and default to '2 minutes' and '5 minutes' respectively. Valid
1337 units are 'seconds', 'minutes', 'hours', or 'days'. Each may be written
1338 singular or abbreviated to just the first letter. If no units are
1339 given, the unit is assumed to be seconds.
1340
1341 This action requires Postgres 8.1 or better.
1342
1343 Example 1: Give a warning if any query has been running longer than 3
1344 minutes, and a critical if longer than 5 minutes.
1345
1346 check_postgres_query_time --port=5432 --warning='3 minutes' --critical='5 minutes'
1347
1348 Example 2: Using default values (2 and 5 minutes), check all databases
1349 except those starting with 'template'.
1350
1351 check_postgres_query_time --port=5432 --exclude=~^template
1352
1353 Example 3: Warn if user 'don' has a query running over 20 seconds
1354
1355 check_postgres_query_time --port=5432 --includeuser=don --warning=20s
1356
1357 For MRTG output, returns the length in seconds of the longest running
1358 query on the first line. The fourth line gives the name of the
1359 database.
1360
1361 replicate_row
1362 ("symlink: check_postgres_replicate_row") Checks that master-slave
1363 replication is working to one or more slaves.
1364
1365 The first "--dbname", "--host", and "--port", etc. options are
1366 considered the master; subsequent uses are the slaves. The values or
1367 the --warning and --critical options are units of time, and at least
1368 one must be provided (no defaults). Valid units are 'seconds',
1369 'minutes', 'hours', or 'days'. Each may be written singular or
1370 abbreviated to just the first letter. If no units are given, the units
1371 are assumed to be seconds.
1372
1373 This check updates a single row on the master, and then measures how
1374 long it takes to be applied to the slaves. To do this, you need to pick
1375 a table that is being replicated, then find a row that can be changed,
1376 and is not going to be changed by any other process. A specific column
1377 of this row will be changed from one value to another. All of this is
1378 fed to the "repinfo" option, and should contain the following options,
1379 separated by commas: table name, primary key, key id, column, first
1380 value, second value.
1381
1382 Example 1: Slony is replicating a table named 'orders' from host
1383 'alpha' to host 'beta', in the database 'sales'. The primary key of the
1384 table is named id, and we are going to test the row with an id of 3
1385 (which is historical and never changed). There is a column named
1386 'salesrep' that we are going to toggle from a value of 'slon' to 'nols'
1387 to check on the replication. We want to throw a warning if the
1388 replication does not happen within 10 seconds.
1389
1390 check_postgres_replicate_row --host=alpha --dbname=sales --host=beta
1391 --dbname=sales --warning=10 --repinfo=orders,id,3,salesrep,slon,nols
1392
1393 Example 2: Bucardo is replicating a table named 'receipt' from host
1394 'green' to hosts 'red', 'blue', and 'yellow'. The database for both
1395 sides is 'public'. The slave databases are running on port 5455. The
1396 primary key is named 'receipt_id', the row we want to use has a value
1397 of 9, and the column we want to change for the test is called 'zone'.
1398 We'll toggle between 'north' and 'south' for the value of this column,
1399 and throw a critical if the change is not on all three slaves within 5
1400 seconds.
1401
1402 check_postgres_replicate_row --host=green --port=5455 --host=red,blue,yellow
1403 --critical=5 --repinfo=receipt,receipt_id,9,zone,north,south
1404
1405 For MRTG output, returns on the first line the time in seconds the
1406 replication takes to finish. The maximum time is set to 4 minutes 30
1407 seconds: if no replication has taken place in that long a time, an
1408 error is thrown.
1409
1410 same_schema
1411 ("symlink: check_postgres_same_schema") Verifies that two or more
1412 databases are identical as far as their schema (but not the data
1413 within). This is particularly handy for making sure your slaves have
1414 not been modified or corrupted in any way when using master to slave
1415 replication. Unlike most other actions, this has no warning or critical
1416 criteria - the databases are either in sync, or are not. If they are
1417 different, a detailed list of the differences is presented.
1418
1419 You may want to exclude or filter out certain differences. The way to
1420 do this is to add strings to the "--filter" option. To exclude a type
1421 of object, use "noname", where 'name' is the type of object, for
1422 example, "noschema". To exclude objects of a certain type by a regular
1423 expression against their name, use "noname=regex". See the examples
1424 below for a better understanding.
1425
1426 The types of objects that can be filtered include:
1427
1428 user
1429 schema
1430 table
1431 view
1432 index
1433 sequence
1434 constraint
1435 trigger
1436 function
1437
1438 The filter option "noposition" prevents verification of the position
1439 of columns within a table.
1440
1441 The filter option "nofuncbody" prevents comparison of the bodies of all
1442 functions.
1443
1444 The filter option "noperm" prevents comparison of object permissions.
1445
1446 To provide the second database, just append the differences to the
1447 first one by a call to the appropriate connection argument. For
1448 example, to compare databases on hosts alpha and bravo, use
1449 "--dbhost=alpha,bravo". Also see the examples below.
1450
1451 If only a single host is given, it is assumed we are doing a "time-
1452 based" report. The first time this is run a snapshot of all the items
1453 in the database is saved to a local file. When you run it again, that
1454 snapshot is read in and becomes "database #2" and is compared to the
1455 current database.
1456
1457 To replace the old stored file with the new version, use the --replace
1458 argument.
1459
1460 To enable snapshots at various points in time, you can use the
1461 "--suffix" argument to make the filenames unique to each run. See the
1462 examples below.
1463
1464 Example 1: Verify that two databases on hosts star and line are the
1465 same:
1466
1467 check_postgres_same_schema --dbhost=star,line
1468
1469 Example 2: Same as before, but exclude any triggers with "slony" in
1470 their name
1471
1472 check_postgres_same_schema --dbhost=star,line --filter="notrigger=slony"
1473
1474 Example 3: Same as before, but also exclude all indexes
1475
1476 check_postgres_same_schema --dbhost=star,line --filter="notrigger=slony noindexes"
1477
1478 Example 4: Check differences for the database "battlestar" on different
1479 ports
1480
1481 check_postgres_same_schema --dbname=battlestar --dbport=5432,5544
1482
1483 Example 5: Create a daily and weekly snapshot file
1484
1485 check_postgres_same_schema --dbname=cylon --suffix=daily
1486 check_postgres_same_schema --dbname=cylon --suffix=weekly
1487
1488 Example 6: Run a historical comparison, then replace the file
1489
1490 check_postgres_same_schema --dbname=cylon --suffix=daily --replace
1491
1492 sequence
1493 ("symlink: check_postgres_sequence") Checks how much room is left on
1494 all sequences in the database. This is measured as the percent of
1495 total possible values that have been used for each sequence. The
1496 --warning and --critical options should be expressed as percentages.
1497 The default values are 85% for the warning and 95% for the critical.
1498 You may use --include and --exclude to control which sequences are to
1499 be checked. Note that this check does account for unusual minvalue and
1500 increment by values, but does not care if the sequence is set to cycle
1501 or not.
1502
1503 The output for Nagios gives the name of the sequence, the percentage
1504 used, and the number of 'calls' left, indicating how many more times
1505 nextval can be called on that sequence before running into the maximum
1506 value.
1507
1508 The output for MRTG returns the highest percentage across all sequences
1509 on the first line, and the name of each sequence with that percentage
1510 on the fourth line, separated by a "|" (pipe) if there are more than
1511 one sequence at that percentage.
1512
1513 Example 1: Give a warning if any sequences are approaching 95% full.
1514
1515 check_postgres_sequence --dbport=5432 --warning=95%
1516
1517 Example 2: Check that the sequence named "orders_id_seq" is not more
1518 than half full.
1519
1520 check_postgres_sequence --dbport=5432 --critical=50% --include=orders_id_seq
1521
1522 settings_checksum
1523 ("symlink: check_postgres_settings_checksum") Checks that all the
1524 Postgres settings are the same as last time you checked. This is done
1525 by generating a checksum of a sorted list of setting names and their
1526 values. Note that different users in the same database may have
1527 different checksums, due to ALTER USER usage, and due to the fact that
1528 superusers see more settings than ordinary users. Either the --warning
1529 or the --critical option should be given, but not both. The value of
1530 each one is the checksum, a 32-character hexadecimal value. You can run
1531 with the special "--critical=0" option to find out an existing
1532 checksum.
1533
1534 This action requires the Digest::MD5 module.
1535
1536 Example 1: Find the initial checksum for the database on port 5555
1537 using the default user (usually postgres)
1538
1539 check_postgres_settings_checksum --port=5555 --critical=0
1540
1541 Example 2: Make sure no settings have changed and warn if so, using the
1542 checksum from above.
1543
1544 check_postgres_settings_checksum --port=5555 --warning=cd2f3b5e129dc2b4f5c0f6d8d2e64231
1545
1546 For MRTG output, returns a 1 or 0 indicating success of failure of the
1547 checksum to match. A checksum must be provided as the "--mrtg"
1548 argument. The fourth line always gives the current checksum.
1549
1550 slony_status
1551 ("symlink: check_postgres_slony_status") Checks in the status of a
1552 Slony cluster by looking at the results of Slony's sl_status view. This
1553 is returned as the number of seconds of "lag time". The --warning and
1554 --critical options should be expressed as times. The default values are
1555 60 seconds for the warning and 300 seconds for the critical.
1556
1557 The optional argument --schema indicated the schema that Slony is
1558 installed under. If it is not given, the schema will be determined
1559 automatically each time this check is run.
1560
1561 Example 1: Give a warning if any Slony is lagged by more than 20
1562 seconds
1563
1564 check_postgres_slony_status --warning 20
1565
1566 Example 2: Give a critical if Slony, installed under the schema
1567 "_slony", is over 10 minutes lagged
1568
1569 check_postgres_slony_status --schema=_slony --critical=600
1570
1571 timesync
1572 ("symlink: check_postgres_timesync") Compares the local system time
1573 with the time reported by one or more databases. The --warning and
1574 --critical options represent the number of seconds between the two
1575 systems before an alert is given. If neither is specified, the default
1576 values are used, which are '2' and '5'. The warning value cannot be
1577 greater than the critical value. Due to the non-exact nature of this
1578 test, values of '0' or '1' are not recommended.
1579
1580 The string returned shows the time difference as well as the time on
1581 each side written out.
1582
1583 Example 1: Check that databases on hosts ankh, morpork, and klatch are
1584 no more than 3 seconds off from the local time:
1585
1586 check_postgres_timesync --host=ankh,morpork,klatch --critical=3
1587
1588 For MRTG output, returns one the first line the number of seconds
1589 difference between the local time and the database time. The fourth
1590 line returns the name of the database.
1591
1592 txn_idle
1593 ("symlink: check_postgres_txn_idle") Checks the number and duration of
1594 "idle in transaction" queries on one or more databases. There is no
1595 need to run this more than once on the same database cluster. Databases
1596 can be filtered by using the --include and --exclude options. See the
1597 "BASIC FILTERING" section below for more details.
1598
1599 The --warning and --critical options are given as units of time, signed
1600 integers, or integers for units of time, and both must be provided
1601 (there are no defaults). Valid units are 'seconds', 'minutes', 'hours',
1602 or 'days'. Each may be written singular or abbreviated to just the
1603 first letter. If no units are given and the numbers are unsigned, the
1604 units are assumed to be seconds.
1605
1606 This action requires Postgres 8.3 or better.
1607
1608 Example 1: Give a warning if any connection has been idle in
1609 transaction for more than 15 seconds:
1610
1611 check_postgres_txn_idle --port=5432 --warning='15 seconds'
1612
1613 Example 2: Give a warning if there are 50 or more transactions
1614
1615 check_postgres_txn_idle --port=5432 --warning='+50'
1616
1617 Example 3: Give a critical if 5 or more connections have been idle in
1618 transaction for more than 10 seconds:
1619
1620 check_postgres_txn_idle --port=5432 --critical='5 for 10 seconds'
1621
1622 For MRTG output, returns the time in seconds the longest idle
1623 transaction has been running. The fourth line returns the name of the
1624 database and other information about the longest transaction.
1625
1626 txn_time
1627 ("symlink: check_postgres_txn_time") Checks the length of open
1628 transactions on one or more databases. There is no need to run this
1629 command more than once per database cluster. Databases can be filtered
1630 by use of the --include and --exclude options. See the "BASIC
1631 FILTERING" section for more details. The owner of the transaction can
1632 also be filtered, by use of the --includeuser and --excludeuser
1633 options. See the "USER NAME FILTERING" section for more details.
1634
1635 The values or the --warning and --critical options are units of time,
1636 and must be provided (no default). Valid units are 'seconds',
1637 'minutes', 'hours', or 'days'. Each may be written singular or
1638 abbreviated to just the first letter. If no units are given, the units
1639 are assumed to be seconds.
1640
1641 This action requires Postgres 8.3 or better.
1642
1643 Example 1: Give a critical if any transaction has been open for more
1644 than 10 minutes:
1645
1646 check_postgres_txn_time --port=5432 --critical='10 minutes'
1647
1648 Example 1: Warn if user 'warehouse' has a transaction open over 30
1649 seconds
1650
1651 check_postgres_txn_time --port-5432 --warning=30s --includeuser=warehouse
1652
1653 For MRTG output, returns the maximum time in seconds a transaction has
1654 been open on the first line. The fourth line gives the name of the
1655 database.
1656
1657 txn_wraparound
1658 ("symlink: check_postgres_txn_wraparound") Checks how close to
1659 transaction wraparound one or more databases are getting. The
1660 --warning and --critical options indicate the number of transactions
1661 done, and must be a positive integer. If either option is not given,
1662 the default values of 1.3 and 1.4 billion are used. There is no need to
1663 run this command more than once per database cluster. For a more
1664 detailed discussion of what this number represents and what to do about
1665 it, please visit the page
1666 <http://www.postgresql.org/docs/current/static/routine-vacuuming.html#VACUUM-FOR-WRAPAROUND>
1667
1668 The warning and critical values can have underscores in the number for
1669 legibility, as Perl does.
1670
1671 Example 1: Check the default values for the localhost database
1672
1673 check_postgres_txn_wraparound --host=localhost
1674
1675 Example 2: Check port 6000 and give a critical when 1.7 billion
1676 transactions are hit:
1677
1678 check_postgres_txn_wraparound --port=6000 --critical=1_700_000_000
1679
1680 For MRTG output, returns the highest number of transactions for all
1681 databases on line one, while line 4 indicates which database it is.
1682
1683 version
1684 ("symlink: check_postgres_version") Checks that the required version of
1685 Postgres is running. The --warning and --critical options (only one is
1686 required) must be of the format X.Y or X.Y.Z where X is the major
1687 version number, Y is the minor version number, and Z is the revision.
1688
1689 Example 1: Give a warning if the database on port 5678 is not version
1690 8.4.10:
1691
1692 check_postgres_version --port=5678 -w=8.4.10
1693
1694 Example 2: Give a warning if any databases on hosts valley,grain, or
1695 sunshine is not 8.3:
1696
1697 check_postgres_version -H valley,grain,sunshine --critical=8.3
1698
1699 For MRTG output, reports a 1 or a 0 indicating success or failure on
1700 the first line. The fourth line indicates the current version. The
1701 version must be provided via the "--mrtg" option.
1702
1703 wal_files
1704 ("symlink: check_postgres_wal_files") Checks how many WAL files exist
1705 in the pg_xlog directory, which is found off of your data_directory,
1706 sometimes as a symlink to another physical disk for performance
1707 reasons. This action must be run as a superuser, in order to access the
1708 contents of the pg_xlog directory. The minimum version to use this
1709 action is Postgres 8.1. The --warning and --critical options are simply
1710 the number of files in the pg_xlog directory. What number to set this
1711 to will vary, but a general guideline is to put a number slightly
1712 higher than what is normally there, to catch problems early.
1713
1714 Normally, WAL files are closed and then re-used, but a long-running
1715 open transaction, or a faulty archive_command script, may cause
1716 Postgres to create too many files. Ultimately, this will cause the disk
1717 they are on to run out of space, at which point Postgres will shut
1718 down.
1719
1720 Example 1: Check that the number of WAL files is 20 or less on host
1721 "pluto"
1722
1723 check_postgres_wal_files --host=pluto --critical=20
1724
1725 For MRTG output, reports the number of WAL files on line 1.
1726
1727 rebuild_symlinks
1728 rebuild_symlinks_force
1729 This action requires no other arguments, and does not connect to any
1730 databases, but simply creates symlinks in the current directory for
1731 each action, in the form check_postgres_<action_name>. If the file
1732 already exists, it will not be overwritten. If the action is
1733 rebuild_symlinks_force, then symlinks will be overwritten. The option
1734 --symlinks is a shorter way of saying --action=rebuild_symlinks
1735
1737 The options --include and --exclude can be combined to limit which
1738 things are checked, depending on the action. The name of the database
1739 can be filtered when using the following actions: backends,
1740 database_size, locks, query_time, txn_idle, and txn_time. The name of
1741 a relation can be filtered when using the following actions: bloat,
1742 index_size, table_size, relation_size, last_vacuum, last_autovacuum,
1743 last_analyze, and last_autoanalyze. The name of a setting can be
1744 filtered when using the settings_checksum action. The name of a file
1745 system can be filtered when using the disk_space action.
1746
1747 If only an include option is given, then ONLY those entries that match
1748 will be checked. However, if given both exclude and include, the
1749 exclusion is done first, and the inclusion after, to reinstate things
1750 that may have been excluded. Both --include and --exclude can be given
1751 multiple times, and/or as comma-separated lists. A leading tilde will
1752 match the following word as a regular expression.
1753
1754 To match a schema, end the search term with a single period. Leading
1755 tildes can be used for schemas as well.
1756
1757 Be careful when using filtering: an inclusion rule on the backends, for
1758 example, may report no problems not only because the matching database
1759 had no backends, but because you misspelled the name of the database!
1760
1761 Examples:
1762
1763 Only checks items named pg_class:
1764
1765 --include=pg_class
1766
1767 Only checks items containing the letters 'pg_':
1768
1769 --include=~pg_
1770
1771 Only check items beginning with 'pg_':
1772
1773 --include=~^pg_
1774
1775 Exclude the item named 'test':
1776
1777 --exclude=test
1778
1779 Exclude all items containing the letters 'test:
1780
1781 --exclude=~test
1782
1783 Exclude all items in the schema 'pg_catalog':
1784
1785 --exclude='pg_catalog.'
1786
1787 Exclude all items containing the letters 'ace', but allow the item
1788 'faceoff':
1789
1790 --exclude=~ace --include=faceoff
1791
1792 Exclude all items which start with the letters 'pg_', which contain the
1793 letters 'slon', or which are named 'sql_settings' or 'green'.
1794 Specifically check items with the letters 'prod' in their names, and
1795 always check the item named 'pg_relname':
1796
1797 --exclude=~^pg_,~slon,sql_settings --exclude=green --include=~prod,pg_relname
1798
1800 The options --includeuser and --excludeuser can be used on some actions
1801 to only examine database objects owned by (or not owned by) one or more
1802 users. An --includeuser option always trumps an --excludeuser option.
1803 You can give each option more than once for multiple users, or you can
1804 give a comma-separated list. The actions that currently use these
1805 options are:
1806
1807 database_size
1808 last_analyze
1809 last_autoanalyze
1810 last_vacuum
1811 last_autovacuum
1812 query_time
1813 relation_size
1814 txn_time
1815
1816 Examples:
1817
1818 Only check items owned by the user named greg:
1819
1820 --includeuser=greg
1821
1822 Only check items owned by either watson or crick:
1823
1824 --includeuser=watson,crick
1825
1826 Only check items owned by crick,franklin, watson, or wilkins:
1827
1828 --includeuser=watson --includeuser=franklin --includeuser=crick,wilkins
1829
1830 Check all items except for those belonging to the user scott:
1831
1832 --excludeuser=scott
1833
1835 To help in setting things up, this program can be run in a "test mode"
1836 by specifying the --test option. This will perform some basic tests to
1837 make sure that the databases can be contacted, and that certain per-
1838 action prerequisites are met, such as whether the user is a superuser,
1839 if the version of Postgres is new enough, and if stats_row_level is
1840 enabled.
1841
1843 In addition to command-line configurations, you can put any options
1844 inside of a file. The file .check_postgresrc in the current directory
1845 will be used if found. If not found, then the file ~/.check_postgresrc
1846 will be used. Finally, the file /etc/check_postgresrc will be used if
1847 available. The format of the file is option = value, one per line. Any
1848 line starting with a '#' will be skipped. Any values loaded from a
1849 check_postgresrc file will be overwritten by command-line options. All
1850 check_postgresrc files can be ignored by supplying a
1851 "--no-checkpostgresrc" argument.
1852
1854 The environment variable $ENV{HOME} is used to look for a
1855 .check_postgresrc file. The environment variable $ENV{PGBINDIR} is
1856 used to look for PostgreSQL binaries.
1857
1859 Since this program uses the psql program, make sure it is accessible to
1860 the user running the script. If run as a cronjob, this often means
1861 modifying the PATH environment variable.
1862
1863 If you are using Nagios in embedded Perl mode, use the "--action"
1864 argument instead of symlinks, so that the plugin only gets compiled one
1865 time.
1866
1868 Access to a working version of psql, and the following very standard
1869 Perl modules:
1870
1871 Cwd
1872 Getopt::Long
1873 File::Basename
1874 File::Temp
1875 Time::HiRes (if $opt{showtime} is set to true, which is the default)
1876
1877 The "settings_checksum" action requires the Digest::MD5 module.
1878
1879 The "checkpoint" action requires the Date::Parse module.
1880
1881 Some actions require access to external programs. If psql is not
1882 explicitly specified, the command "which" is used to find it. The
1883 program "/bin/df" is needed by the "disk_space" action.
1884
1886 Development happens using the git system. You can clone the latest
1887 version by doing:
1888
1889 git clone git://bucardo.org/check_postgres.git
1890
1892 Three mailing lists are available. For discussions about the program,
1893 bug reports, feature requests, and commit notices, send email to
1894 check_postgres@bucardo.org
1895
1896 https://mail.endcrypt.com/mailman/listinfo/check_postgres
1897
1898 A low-volume list for announcement of new versions and important
1899 notices is the 'check_postgres-announce' list:
1900
1901 https://mail.endcrypt.com/mailman/listinfo/check_postgres-announce
1902
1903 Source code changes (via git-commit) are sent to the
1904 'check_postgres-commit' list:
1905
1906 https://mail.endcrypt.com/mailman/listinfo/check_postgres-commit
1907
1909 Items not specifically attributed are by GSM (Greg Sabino Mullane).
1910
1911 Version 2.22.0 June 30, 2015
1912 Add xact timestamp support to hot_standby_delay.
1913 Allow the hot_standby_delay check to accept xlog byte position or
1914 timestamp lag intervals as thresholds, or even both at the same time.
1915 (Josh Williams)
1916
1917 Query all sequences per DB in parallel for action=sequence.
1918 (Christoph Berg)
1919
1920 Fix bloat check to use correct SQL depending on the server version.
1921 (Adrian Vondendriesch)
1922
1923 Show actual long-running query in query_time output
1924 (Peter Eisentraut)
1925
1926 Add explicit ORDER BY to the slony_status check to get the most lagged server.
1927 (Jeff Frost)
1928
1929 Improved multi-slave support in replicate_row.
1930 (Andrew Yochum)
1931
1932 Change the way tables are quoted in replicate_row.
1933 (Glyn Astill)
1934
1935 Don't swallow space before the -c flag when reporting errors
1936 (Jeff Janes)
1937
1938 Fix and extend hot_standby_delay documentation
1939 (Michael Renner)
1940
1941 Declare POD encoding to be utf8.
1942 (Christoph Berg)
1943
1944 Version 2.21.0 September 24, 2013
1945 Fix issue with SQL steps in check_pgagent_jobs for sql steps which perform deletes
1946 (Rob Emery via github pull)
1947
1948 Install man page in section 1.
1949 (Peter Eisentraut, bug 53, github issue 26)
1950
1951 Order lock types in check_locks output to make the ordering predictable;
1952 setting SKIP_NETWORK_TESTS will skip the new_version tests; other minor test
1953 suite fixes.
1954 (Christoph Berg)
1955
1956 Fix same_schema check on 9.3 by ignoring relminmxid differences in pg_class
1957 (Christoph Berg)
1958
1959 Version 2.20.1 June 24, 2013
1960 Make connection check failures return CRITICAL not UNKNOWN
1961 (Dominic Hargreaves)
1962
1963 Fix --reverse option when using string comparisons in custom queries
1964 (Nathaniel Waisbrot)
1965
1966 Compute correct 'totalwastedbytes' in the bloat query
1967 (Michael Renner)
1968
1969 Do not use pg_stats "inherited" column in bloat query, if the
1970 database is 8.4 or older. (Greg Sabino Mullane, per bug 121)
1971
1972 Remove host reordering in hot_standby_delay check
1973 (Josh Williams, with help from Jacobo Blasco)
1974
1975 Better output for the "simple" flag
1976 (Greg Sabino Mullane)
1977
1978 Force same_schema to ignore the 'relallvisible' column
1979 (Greg Sabino Mullane)
1980
1981 Version 2.20.0 March 13, 2013
1982 Add check for pgagent jobs (David E. Wheeler)
1983
1984 Force STDOUT to use utf8 for proper output
1985 (Greg Sabino Mullane; reported by Emmanuel Lesouef)
1986
1987 Fixes for Postgres 9.2: new pg_stat_activity view,
1988 and use pg_tablespace_location, (Josh Williams)
1989
1990 Allow for spaces in item lists when doing same_schema.
1991
1992 Allow txn_idle to work again for < 8.3 servers by switching to query_time.
1993
1994 Fix the check_bloat SQL to take inherited tables into account,
1995 and assume 2k for non-analyzed columns. (Geert Pante)
1996
1997 Cache sequence information to speed up same_schema runs.
1998
1999 Fix --excludeuser in check_txn_idle (Mika Eloranta)
2000
2001 Fix user clause handling in check_txn_idle (Michael van Bracht)
2002
2003 Adjust docs to show colon as a better separator inside args for locks
2004 (Charles Sprickman)
2005
2006 Fix undefined $SQL2 error in check_txn_idle [github issue 16] (Patric Bechtel)
2007
2008 Prevent "uninitialized value" warnings when showing the port (Henrik Ahlgren)
2009
2010 Do not assume everyone has a HOME [github issue 23]
2011
2012 Version 2.19.0 January 17, 2012
2013 Add the --assume-prod option (Cédric Villemain)
2014
2015 Add the cluster_id check (Cédric Villemain)
2016
2017 Improve settings_checksum and checkpoint tests (Cédric Villemain)
2018
2019 Do not do an inner join to pg_user when checking database size
2020 (Greg Sabino Mullane; reported by Emmanuel Lesouef)
2021
2022 Use the full path when getting sequence information for same_schema.
2023 (Greg Sabino Mullane; reported by Cindy Wise)
2024
2025 Fix the formula for calculating xlog positions (Euler Taveira de Oliveira)
2026
2027 Better ordering of output for bloat check - make indexes as important
2028 as tables (Greg Sabino Mullane; reported by Jens Wilke)
2029
2030 Show the dbservice if it was used at top of same_schema output
2031 (Mike Blackwell)
2032
2033 Better installation paths (Greg Sabino Mullane, per bug 53)
2034
2035 Version 2.18.0 October 2, 2011
2036 Redo the same_schema action. Use new --filter argument for all filtering.
2037 Allow comparisons between any number of databases.
2038 Remove the dbname2, dbport2, etc. arguments.
2039 Allow comparison of the same db over time.
2040
2041 Swap db1 and db2 if the slave is 1 for the hot standby check (David E. Wheeler)
2042
2043 Allow multiple --schema arguments for the slony_status action (GSM and Jehan-Guillaume de Rorthais)
2044
2045 Fix ORDER BY in the last vacuum/analyze action (Nicolas Thauvin)
2046
2047 Fix check_hot_standby_delay perfdata output (Nicolas Thauvin)
2048
2049 Look in the correct place for the .ready files with the archive_ready action (Nicolas Thauvin)
2050
2051 New action: commitratio (Guillaume Lelarge)
2052
2053 New action: hitratio (Guillaume Lelarge)
2054
2055 Make sure --action overrides the symlink naming trick.
2056
2057 Set defaults for archive_ready and wal_files (Thomas Guettler, GSM)
2058
2059 Better output for wal_files and archive_ready (GSM)
2060
2061 Fix warning when client_port set to empty string (bug #79)
2062
2063 Account for "empty row" in -x output (i.e. source of functions).
2064
2065 Fix some incorrectly named data fields (Andy Lester)
2066
2067 Expand the number of pgbouncer actions (Ruslan Kabalin)
2068
2069 Give detailed information and refactor txn_idle, txn_time, and query_time
2070 (Per request from bug #61)
2071
2072 Set maxalign to 8 in the bloat check if box identified as '64-bit'
2073 (Michel Sijmons, bug #66)
2074
2075 Support non-standard version strings in the bloat check.
2076 (Michel Sijmons and Gurjeet Singh, bug #66)
2077
2078 Do not show excluded databases in some output (Ruslan Kabalin)
2079
2080 Allow "and", "or" inside arguments (David E. Wheeler)
2081
2082 Add the "new_version_box" action.
2083
2084 Fix psql version regex (Peter Eisentraut, bug #69)
2085
2086 Add the --assume-standby-mode option (Ruslan Kabalin)
2087
2088 Note that txn_idle and query_time require 8.3 (Thomas Guettler)
2089
2090 Standardize and clean up all perfdata output (bug #52)
2091
2092 Exclude "idle in transaction" from the query_time check (bug #43)
2093
2094 Fix the perflimit for the bloat action (bug #50)
2095
2096 Clean up the custom_query action a bit.
2097
2098 Fix space in perfdata for hot_standby_delay action (Nicolas Thauvin)
2099
2100 Handle undef percents in check_fsm_relations (Andy Lester)
2101
2102 Fix typo in dbstats action (Stas Vitkovsky)
2103
2104 Fix MRTG for last vacuum and last_analyze actions.
2105
2106 Version 2.17.0 no public release
2107 Version 2.16.0 January 20, 2011
2108 Add new action 'hot_standby_delay' (Nicolas Thauvin)
2109 Add cache-busting for the version-grabbing utilities.
2110 Fix problem with going to next method for new_version_pg
2111 (Greg Sabino Mullane, reported by Hywel Mallett in bug #65)
2112 Allow /usr/local/etc as an alternative location for the
2113 check_postgresrc file (Hywel Mallett)
2114 Do not use tgisconstraint in same_schema if Postgres >= 9
2115 (Guillaume Lelarge)
2116
2117 Version 2.15.4 January 3, 2011
2118 Fix warning when using symlinks
2119 (Greg Sabino Mullane, reported by Peter Eisentraut in bug #63)
2120
2121 Version 2.15.3 December 30, 2010
2122 Show OK for no matching txn_idle entries.
2123
2124 Version 2.15.2 December 28, 2010
2125 Better formatting of sizes in the bloat action output.
2126
2127 Remove duplicate perfs in bloat action output.
2128
2129 Version 2.15.1 December 27, 2010
2130 Fix problem when examining items in pg_settings (Greg Sabino Mullane)
2131
2132 For connection test, return critical, not unknown, on FATAL errors
2133 (Greg Sabino Mullane, reported by Peter Eisentraut in bug #62)
2134
2135 Version 2.15.0 November 8, 2010
2136 Add --quiet argument to suppress output on OK Nagios results
2137 Add index comparison for same_schema (Norman Yamada and Greg Sabino Mullane)
2138 Use $ENV{PGSERVICE} instead of "service=" to prevent problems (Guillaume Lelarge)
2139 Add --man option to show the entire manual. (Andy Lester)
2140 Redo the internal run_command() sub to use -x and hashes instead of regexes.
2141 Fix error in custom logic (Andreas Mager)
2142 Add the "pgbouncer_checksum" action (Guillaume Lelarge)
2143 Fix regex to work on WIN32 for check_fsm_relations and check_fsm_pages (Luke Koops)
2144 Don't apply a LIMIT when using --exclude on the bloat action (Marti Raudsepp)
2145 Change the output of query_time to show pid,user,port, and address (Giles Westwood)
2146 Fix to show database properly when using slony_status (Guillaume Lelarge)
2147 Allow warning items for same_schema to be comma-separated (Guillaume Lelarge)
2148 Constraint definitions across Postgres versions match better in same_schema.
2149 Work against "EnterpriseDB" databases (Sivakumar Krishnamurthy and Greg Sabino Mullane)
2150 Separate perfdata with spaces (Jehan-Guillaume (ioguix) de Rorthais)
2151 Add new action "archive_ready" (Jehan-Guillaume (ioguix) de Rorthais)
2152
2153 Version 2.14.3 (March 1, 2010)
2154 Allow slony_status action to handle more than one slave.
2155 Use commas to separate function args in same_schema output (Robert Treat)
2156
2157 Version 2.14.2 (February 18, 2010)
2158 Change autovac_freeze default warn/critical back to 90%/95% (Robert Treat)
2159 Put all items one-per-line for relation size actions if --verbose=1
2160
2161 Version 2.14.1 (February 17, 2010)
2162 Don't use $^T in logfile check, as script may be long-running
2163 Change the error string for the logfile action for easier exclusion
2164 by programs like tail_n_mail
2165
2166 Version 2.14.0 (February 11, 2010)
2167 Added the 'slony_status' action.
2168 Changed the logfile sleep from 0.5 to 1, as 0.5 gets rounded to 0 on some boxes!
2169
2170 Version 2.13.2 (February 4, 2010)
2171 Allow timeout option to be used for logtime 'sleep' time.
2172
2173 Version 2.13.2 (February 4, 2010)
2174 Show offending database for query_time action.
2175 Apply perflimit to main output for sequence action.
2176 Add 'noowner' option to same_schema action.
2177 Raise sleep timeout for logfile check to 15 seconds.
2178
2179 Version 2.13.1 (February 2, 2010)
2180 Fix bug preventing column constraint differences from 2 > 1 for same_schema from being shown.
2181 Allow aliases 'dbname1', 'dbhost1', 'dbport1',etc.
2182 Added "nolanguage" as a filter for the same_schema option.
2183 Don't track "generic" table constraints (e.. $1, $2) using same_schema
2184
2185 Version 2.13.0 (January 29, 2010)
2186 Allow "nofunctions" as a filter for the same_schema option.
2187 Added "noperm" as a filter for the same_schema option.
2188 Ignore dropped columns when considered positions for same_schema (Guillaume Lelarge)
2189
2190 Version 2.12.1 (December 3, 2009)
2191 Change autovac_freeze default warn/critical from 90%/95% to 105%/120% (Marti Raudsepp)
2192
2193 Version 2.12.0 (December 3, 2009)
2194 Allow the temporary directory to be specified via the "tempdir" argument,
2195 for systems that need it (e.g. /tmp is not owned by root).
2196 Fix so old versions of Postgres (< 8.0) use the correct default database (Giles Westwood)
2197 For "same_schema" trigger mismatches, show the attached table.
2198 Add the new_version_bc check for Bucardo version checking.
2199 Add database name to perf output for last_vacuum|analyze (Guillaume Lelarge)
2200 Fix for bloat action against old versions of Postgres without the 'block_size' param.
2201
2202 Version 2.11.1 (August 27, 2009)
2203 Proper Nagios output for last_vacuum|analyze actions. (Cédric Villemain)
2204 Proper Nagios output for locks action. (Cédric Villemain)
2205 Proper Nagios output for txn_wraparound action. (Cédric Villemain)
2206 Fix for constraints with embedded newlines for same_schema.
2207 Allow --exclude for all items when using same_schema.
2208
2209 Version 2.11.0 (August 23, 2009)
2210 Add Nagios perf output to the wal_files check (Cédric Villemain)
2211 Add support for .check_postgresrc, per request from Albe Laurenz.
2212 Allow list of web fetch methods to be changed with the --get_method option.
2213 Add support for the --language argument, which overrides any ENV.
2214 Add the --no-check_postgresrc flag.
2215 Ensure check_postgresrc options are completely overridden by command-line options.
2216 Fix incorrect warning > critical logic in replicate_rows (Glyn Astill)
2217
2218 Version 2.10.0 (August 3, 2009)
2219 For same_schema, compare view definitions, and compare languages.
2220 Make script into a global executable via the Makefile.PL file.
2221 Better output when comparing two databases.
2222 Proper Nagios output syntax for autovac_freeze and backends checks (Cédric Villemain)
2223
2224 Version 2.9.5 (July 24, 2009)
2225 Don't use a LIMIT in check_bloat if --include is used. Per complaint from Jeff Frost.
2226
2227 Version 2.9.4 (July 21, 2009)
2228 More French translations (Guillaume Lelarge)
2229
2230 Version 2.9.3 (July 14, 2009)
2231 Quote dbname in perf output for the backends check. (Davide Abrigo)
2232 Add 'fetch' as an alternative method for new_version checks, as this
2233 comes by default with FreeBSD. (Hywel Mallett)
2234
2235 Version 2.9.2 (July 12, 2009)
2236 Allow dots and dashes in database name for the backends check (Davide Abrigo)
2237 Check and display the database for each match in the bloat check (Cédric Villemain)
2238 Handle 'too many connections' FATAL error in the backends check with a critical,
2239 rather than a generic error (Greg, idea by Jürgen Schulz-Brüssel)
2240 Do not allow perflimit to interfere with exclusion rules in the vacuum and
2241 analyze tests. (Greg, bug reported by Jeff Frost)
2242
2243 Version 2.9.1 (June 12, 2009)
2244 Fix for multiple databases with the check_bloat action (Mark Kirkwood)
2245 Fixes and improvements to the same_schema action (Jeff Boes)
2246 Write tests for same_schema, other minor test fixes (Jeff Boes)
2247
2248 Version 2.9.0 (May 28, 2009)
2249 Added the same_schema action (Greg)
2250
2251 Version 2.8.1 (May 15, 2009)
2252 Added timeout via statement_timeout in addition to perl alarm (Greg)
2253
2254 Version 2.8.0 (May 4, 2009)
2255 Added internationalization support (Greg)
2256 Added the 'disabled_triggers' check (Greg)
2257 Added the 'prepared_txns' check (Greg)
2258 Added the 'new_version_cp' and 'new_version_pg' checks (Greg)
2259 French translations (Guillaume Lelarge)
2260 Make the backends search return ok if no matches due to inclusion rules,
2261 per report by Guillaume Lelarge (Greg)
2262 Added comprehensive unit tests (Greg, Jeff Boes, Selena Deckelmann)
2263 Make fsm_pages and fsm_relations handle 8.4 servers smoothly. (Greg)
2264 Fix missing 'upd' field in show_dbstats (Andras Fabian)
2265 Allow ENV{PGCONTROLDATA} and ENV{PGBINDIR}. (Greg)
2266 Add various Perl module infrastructure (e.g. Makefile.PL) (Greg)
2267 Fix incorrect regex in txn_wraparound (Greg)
2268 For txn_wraparound: consistent ordering and fix duplicates in perf output (Andras Fabian)
2269 Add in missing exabyte regex check (Selena Deckelmann)
2270 Set stats to zero if we bail early due to USERWHERECLAUSE (Andras Fabian)
2271 Add additional items to dbstats output (Andras Fabian)
2272 Remove --schema option from the fsm_ checks. (Greg Mullane and Robert Treat)
2273 Handle case when ENV{PGUSER} is set. (Andy Lester)
2274 Many various fixes. (Jeff Boes)
2275 Fix --dbservice: check version and use ENV{PGSERVICE} for old versions (Cédric Villemain)
2276
2277 Version 2.7.3 (February 10, 2009)
2278 Make the sequence action check if sequence being used for a int4 column and
2279 react appropriately. (Michael Glaesemann)
2280
2281 Version 2.7.2 (February 9, 2009)
2282 Fix to prevent multiple groupings if db arguments given.
2283
2284 Version 2.7.1 (February 6, 2009)
2285 Allow the -p argument for port to work again.
2286
2287 Version 2.7.0 (February 4, 2009)
2288 Do not require a connection argument, but use defaults and ENV variables when
2289 possible: PGHOST, PGPORT, PGUSER, PGDATABASE.
2290
2291 Version 2.6.1 (February 4, 2009)
2292 Only require Date::Parse to be loaded if using the checkpoint action.
2293
2294 Version 2.6.0 (January 26, 2009)
2295 Add the 'checkpoint' action.
2296
2297 Version 2.5.4 (January 7, 2009)
2298 Better checking of $opt{dbservice} structure (Cédric Villemain)
2299 Fix time display in timesync action output (Selena Deckelmann)
2300 Fix documentation typos (Josh Tolley)
2301
2302 Version 2.5.3 (December 17, 2008)
2303 Minor fix to regex in verify_version (Lee Jensen)
2304
2305 Version 2.5.2 (December 16, 2008)
2306 Minor documentation tweak.
2307
2308 Version 2.5.1 (December 11, 2008)
2309 Add support for --noidle flag to prevent backends action from counting idle processes.
2310 Patch by Selena Deckelmann.
2311
2312 Fix small undefined warning when not using --dbservice.
2313
2314 Version 2.5.0 (December 4, 2008)
2315 Add support for the pg_Service.conf file with the --dbservice option.
2316
2317 Version 2.4.3 (November 7, 2008)
2318 Fix options for replicate_row action, per report from Jason Gordon.
2319
2320 Version 2.4.2 (November 6, 2008)
2321 Wrap File::Temp::cleanup() calls in eval, in case File::Temp is an older version.
2322 Patch by Chris Butler.
2323
2324 Version 2.4.1 (November 5, 2008)
2325 Cast numbers to numeric to support sequences ranges > bigint in check_sequence action.
2326 Thanks to Scott Marlowe for reporting this.
2327
2328 Version 2.4.0 (October 26, 2008)
2329 Add Cacti support with the dbstats action.
2330 Pretty up the time output for last vacuum and analyze actions.
2331 Show the percentage of backends on the check_backends action.
2332
2333 Version 2.3.10 (October 23, 2008)
2334 Fix minor warning in action check_bloat with multiple databases.
2335 Allow warning to be greater than critical when using the --reverse option.
2336 Support the --perflimit option for the check_sequence action.
2337
2338 Version 2.3.9 (October 23, 2008)
2339 Minor tweak to way we store the default port.
2340
2341 Version 2.3.8 (October 21, 2008)
2342 Allow the default port to be changed easily.
2343 Allow transform of simple output by MB, GB, etc.
2344
2345 Version 2.3.7 (October 14, 2008)
2346 Allow multiple databases in 'sequence' action. Reported by Christoph Zwerschke.
2347
2348 Version 2.3.6 (October 13, 2008)
2349 Add missing $schema to check_fsm_pages. (Robert Treat)
2350
2351 Version 2.3.5 (October 9, 2008)
2352 Change option 'checktype' to 'valtype' to prevent collisions with -c[ritical]
2353 Better handling of errors.
2354
2355 Version 2.3.4 (October 9, 2008)
2356 Do explicit cleanups of the temp directory, per problems reported by sb@nnx.com.
2357
2358 Version 2.3.3 (October 8, 2008)
2359 Account for cases where some rounding queries give -0 instead of 0.
2360 Thanks to Glyn Astill for helping to track this down.
2361
2362 Version 2.3.2 (October 8, 2008)
2363 Always quote identifiers in check_replicate_row action.
2364
2365 Version 2.3.1 (October 7, 2008)
2366 Give a better error if one of the databases cannot be reached.
2367
2368 Version 2.3.0 (October 4, 2008)
2369 Add the "sequence" action, thanks to Gavin M. Roy for the idea.
2370 Fix minor problem with autovac_freeze action when using MRTG output.
2371 Allow output argument to be case-insensitive.
2372 Documentation fixes.
2373
2374 Version 2.2.4 (October 3, 2008)
2375 Fix some minor typos
2376
2377 Version 2.2.3 (October 1, 2008)
2378 Expand range of allowed names for --repinfo argument (Glyn Astill)
2379 Documentation tweaks.
2380
2381 Version 2.2.2 (September 30, 2008)
2382 Fixes for minor output and scoping problems.
2383
2384 Version 2.2.1 (September 28, 2008)
2385 Add MRTG output to fsm_pages and fsm_relations.
2386 Force error messages to one-line for proper Nagios output.
2387 Check for invalid prereqs on failed command. From conversations with Euler Taveira de Oliveira.
2388 Tweak the fsm_pages formula a little.
2389
2390 Version 2.2.0 (September 25, 2008)
2391 Add fsm_pages and fsm_relations actions. (Robert Treat)
2392
2393 Version 2.1.4 (September 22, 2008)
2394 Fix for race condition in txn_time action.
2395 Add --debugoutput option.
2396
2397 Version 2.1.3 (September 22, 2008)
2398 Allow alternate arguments "dbhost" for "host" and "dbport" for "port".
2399 Output a zero as default value for second line of MRTG output.
2400
2401 Version 2.1.2 (July 28, 2008)
2402 Fix sorting error in the "disk_space" action for non-Nagios output.
2403 Allow --simple as a shortcut for --output=simple.
2404
2405 Version 2.1.1 (July 22, 2008)
2406 Don't check databases with datallowconn false for the "autovac_freeze" action.
2407
2408 Version 2.1.0 (July 18, 2008)
2409 Add the "autovac_freeze" action, thanks to Robert Treat for the idea and design.
2410 Put an ORDER BY on the "txn_wraparound" action.
2411
2412 Version 2.0.1 (July 16, 2008)
2413 Optimizations to speed up the "bloat" action quite a bit.
2414 Fix "version" action to not always output in mrtg mode.
2415
2416 Version 2.0.0 (July 15, 2008)
2417 Add support for MRTG and "simple" output options.
2418 Many small improvements to nearly all actions.
2419
2420 Version 1.9.1 (June 24, 2008)
2421 Fix an error in the bloat SQL in 1.9.0
2422 Allow percentage arguments to be over 99%
2423 Allow percentages in the bloat --warning and --critical (thanks to Robert Treat for the idea)
2424
2425 Version 1.9.0 (June 22, 2008)
2426 Don't include information_schema in certain checks. (Jeff Frost)
2427 Allow --include and --exclude to use schemas by using a trailing period.
2428
2429 Version 1.8.5 (June 22, 2008)
2430 Output schema name before table name where appropriate.
2431 Thanks to Jeff Frost.
2432
2433 Version 1.8.4 (June 19, 2008)
2434 Better detection of problems in --replicate_row.
2435
2436 Version 1.8.3 (June 18, 2008)
2437 Fix 'backends' action: there may be no rows in pg_stat_activity, so run a second
2438 query if needed to find the max_connections setting.
2439 Thanks to Jeff Frost for the bug report.
2440
2441 Version 1.8.2 (June 10, 2008)
2442 Changes to allow working under Nagios' embedded Perl mode. (Ioannis Tambouras)
2443
2444 Version 1.8.1 (June 9, 2008)
2445 Allow 'bloat' action to work on Postgres version 8.0.
2446 Allow for different commands to be run for each action depending on the server version.
2447 Give better warnings when running actions not available on older Postgres servers.
2448
2449 Version 1.8.0 (June 3, 2008)
2450 Add the --reverse option to the custom_query action.
2451
2452 Version 1.7.1 (June 2, 2008)
2453 Fix 'query_time' action: account for race condition in which zero rows appear in pg_stat_activity.
2454 Thanks to Dustin Black for the bug report.
2455
2456 Version 1.7.0 (May 11, 2008)
2457 Add --replicate_row action
2458
2459 Version 1.6.1 (May 11, 2008)
2460 Add --symlinks option as a shortcut to --action=rebuild_symlinks
2461
2462 Version 1.6.0 (May 11, 2008)
2463 Add the custom_query action.
2464
2465 Version 1.5.2 (May 2, 2008)
2466 Fix problem with too eager creation of custom pgpass file.
2467
2468 Version 1.5.1 (April 17, 2008)
2469 Add example Nagios configuration settings (Brian A. Seklecki)
2470
2471 Version 1.5.0 (April 16, 2008)
2472 Add the --includeuser and --excludeuser options. Documentation cleanup.
2473
2474 Version 1.4.3 (April 16, 2008)
2475 Add in the 'output' concept for future support of non-Nagios programs.
2476
2477 Version 1.4.2 (April 8, 2008)
2478 Fix bug preventing --dbpass argument from working (Robert Treat).
2479
2480 Version 1.4.1 (April 4, 2008)
2481 Minor documentation fixes.
2482
2483 Version 1.4.0 (April 2, 2008)
2484 Have 'wal_files' action use pg_ls_dir (idea by Robert Treat).
2485 For last_vacuum and last_analyze, respect autovacuum effects, add separate
2486 autovacuum checks (ideas by Robert Treat).
2487
2488 Version 1.3.1 (April 2, 2008)
2489 Have txn_idle use query_start, not xact_start.
2490
2491 Version 1.3.0 (March 23, 2008)
2492 Add in txn_idle and txn_time actions.
2493
2494 Version 1.2.0 (February 21, 2008)
2495 Add the 'wal_files' action, which counts the number of WAL files
2496 in your pg_xlog directory.
2497 Fix some typos in the docs.
2498 Explicitly allow -v as an argument.
2499 Allow for a null syslog_facility in the 'logfile' action.
2500
2501 Version 1.1.2 (February 5, 2008)
2502 Fix error preventing --action=rebuild_symlinks from working.
2503
2504 Version 1.1.1 (February 3, 2008)
2505 Switch vacuum and analyze date output to use 'DD', not 'D'. (Glyn Astill)
2506
2507 Version 1.1.0 (December 16, 2008)
2508 Fixes, enhancements, and performance tracking.
2509 Add performance data tracking via --showperf and --perflimit
2510 Lots of refactoring and cleanup of how actions handle arguments.
2511 Do basic checks to figure out syslog file for 'logfile' action.
2512 Allow for exact matching of beta versions with 'version' action.
2513 Redo the default arguments to only populate when neither 'warning' nor 'critical' is provided.
2514 Allow just warning OR critical to be given for the 'timesync' action.
2515 Remove 'redirect_stderr' requirement from 'logfile' due to 8.3 changes.
2516 Actions 'last_vacuum' and 'last_analyze' are 8.2 only (Robert Treat)
2517
2518 Version 1.0.16 (December 7, 2007)
2519 First public release, December 2007
2520
2522 The index bloat size optimization is rough.
2523
2524 Some actions may not work on older versions of Postgres (before 8.0).
2525
2526 Please report any problems to check_postgres@bucardo.org
2527
2529 Greg Sabino Mullane <greg@endpoint.com>
2530
2532 Some example Nagios configuration settings using this script:
2533
2534 define command {
2535 command_name check_postgres_size
2536 command_line $USER2$/check_postgres.pl -H $HOSTADDRESS$ -u pgsql -db postgres --action database_size -w $ARG1$ -c $ARG2$
2537 }
2538
2539 define command {
2540 command_name check_postgres_locks
2541 command_line $USER2$/check_postgres.pl -H $HOSTADDRESS$ -u pgsql -db postgres --action locks -w $ARG1$ -c $ARG2$
2542 }
2543
2544
2545 define service {
2546 use generic-other
2547 host_name dbhost.gtld
2548 service_description dbhost PostgreSQL Service Database Usage Size
2549 check_command check_postgres_size!256000000!512000000
2550 }
2551
2552 define service {
2553 use generic-other
2554 host_name dbhost.gtld
2555 service_description dbhost PostgreSQL Service Database Locks
2556 check_command check_postgres_locks!2!3
2557 }
2558
2560 Copyright (c) 2007-2015 Greg Sabino Mullane <greg@endpoint.com>.
2561
2562 Redistribution and use in source and binary forms, with or without
2563 modification, are permitted provided that the following conditions are
2564 met:
2565
2566 1. Redistributions of source code must retain the above copyright notice,
2567 this list of conditions and the following disclaimer.
2568 2. Redistributions in binary form must reproduce the above copyright notice,
2569 this list of conditions and the following disclaimer in the documentation
2570 and/or other materials provided with the distribution.
2571
2572 THIS SOFTWARE IS PROVIDED BY THE AUTHOR "AS IS" AND ANY EXPRESS OR
2573 IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED
2574 WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
2575 DISCLAIMED. IN NO EVENT SHALL THE AUTHOR BE LIABLE FOR ANY DIRECT,
2576 INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES
2577 (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
2578 SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
2579 HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT,
2580 STRICT LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING
2581 IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE
2582 POSSIBILITY OF SUCH DAMAGE.
2583
2584
2585
2586perl v5.26.1 2015-06-30 CHECK_POSTGRES(1)