MONIT(1) User Commands MONIT(1)

6 Monit - utility for monitoring services on a Unix system
7
9 monit [options] <arguments>
10
Monit is a utility for managing and monitoring processes, programs,
files, directories and filesystems on a Unix system. Monit conducts
automatic maintenance and repair and can execute meaningful causal
actions in error situations. For example, Monit can start a process if
it is not running, restart a process if it does not respond and stop a
process if it uses too many resources. You can use Monit to monitor
files, directories and filesystems for changes, such as timestamp
changes, checksum changes or size changes.
20
Monit is controlled via an easy-to-configure control file based on a
free-format, token-oriented syntax. Monit logs to syslog or to its own
log file and notifies you about error conditions via customisable alert
messages. Monit can perform various TCP/IP network checks and protocol
checks, and can utilise SSL for such checks. Monit provides an HTTP(S)
interface and you may use a browser to access the Monit program.
27
You can use Monit to monitor daemon processes or similar programs
running on localhost. Monit is particularly useful for monitoring
daemon processes, such as those started at system boot time, for
instance sendmail, sshd, apache and mysql. In contrast to many other
monitoring systems, Monit can act if an error situation should occur.
For example, if sendmail is not running, Monit can start sendmail again
automatically, or if apache is using too many resources (e.g. if a DoS
attack is in progress), Monit can stop or restart apache and send you
an alert message. Monit can also monitor process characteristics, such
as how much memory or how many CPU cycles a process is using.
39
You can also use Monit to monitor files, directories and filesystems on
localhost. Monit can monitor these items for changes, such as timestamp
changes, checksum changes or size changes. This is also useful for
security reasons: you can monitor the MD5 or SHA1 checksum of files
that should not change and get an alert or perform an action if they
do change.
46
Monit can monitor network connections to various servers, either on
localhost or on remote hosts. TCP, UDP and Unix Domain Sockets are
supported. Network tests can be performed at the protocol level; Monit
has built-in tests for the main Internet protocols, such as HTTP and
SMTP. Even if a protocol is not supported you can still test the server
because you can configure Monit to send any data and test the response
from the server.
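
As a minimal sketch of such a generic test, the entry below sends a
string to a line-based service and checks the reply. The service name,
host name, port and both strings are assumptions chosen purely for
illustration; substitute whatever your own server expects.

```
# Hypothetical service: name, address, port and strings are placeholders.
check host myservice with address service.example.com
    if failed
        port 7000
        send "PING\r\n"
        expect "PONG"
    then alert
```

The expect string is matched against the server's response, so it can
be as loose or as strict as the protocol requires.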
54
55 Monit can be used to test programs or scripts at certain times, much
56 like cron, but in addition, you can test the exit value of a program
57 and perform an action or send an alert if the exit value indicates an
58 error. This means that you can use Monit to perform any type of check
59 you can write a script for.
60
Finally, Monit can be used to monitor general system resources on
localhost such as overall CPU usage, memory and system load.
63
65 The behaviour of Monit is controlled by command-line options and a run
66 control file, monitrc, the syntax of which we describe in a later
67 section. Command-line options override .monitrc declarations.
68
69 The default location for monitrc is ~/.monitrc. If this file does not
70 exist, Monit will try /etc/monitrc and a few other places. See FILES
71 for details. You can also specify the control file directly by using
72 the -c command-line switch to monit. For instance,
73
74 $ monit -c /var/monit/monitrc
75
76 Before Monit is started the first time, you can test the control file
77 for syntax errors:
78
$ monit -t
Control file syntax OK
81
If there is an error, Monit will print an error message to the
console, including the line number in the control file where the
error was found.
85
86 Once you have a working Monit control file, simply start Monit from the
87 console, like so:
88
89 $ monit
90
91 You can change some configuration directives via command-line switches,
92 but for simplicity it is recommended that you put these in the control
93 file.
94
Monit will detach from the terminal and run as a background process,
i.e. as a daemon process. As a daemon, Monit runs in cycles: it
monitors services, goes to sleep for a configured period, then wakes up
and starts monitoring again, in an endless loop.
99
100 Options
101 The following options are recognized by Monit. However, it is
102 recommended that you set options (when applicable) directly in the
103 .monitrc control file.
104
105 -c file
106 Use this control file
107
108 -d n
109 Run Monit as a daemon once per n seconds. Or use "set
110 daemon" in monitrc.
111
112 -g name
113 Set group name for start, stop, restart, monitor, unmonitor,
114 status and summary action.
115
116 -l file
117 Print log information to this file. Or use "set log"
118 in monitrc.
119
120 -p pidfile
121 Use this lock file in daemon mode. Or use "set pidfile"
122 in monitrc.
123
124 -s statefile
125 Write state information to this file. Or use "set
126 statefile" in monitrc.
127
128 -B
129 Batch command line mode (no tabular output and no colors). Or
130 use "set terminal batch" in monitrc.
131
132 -I
133 Do not run in background mode (needed to run from init). Or use
134 "set init" in monitrc.
135
136 -i
137 Print Monit's unique ID
138
139 -r
140 Reset Monit's unique ID. Use with caution
141
142 -t
143 Run syntax check for the control file
144
145 -v
146 Verbose mode, work noisy (diagnostic output)
147
148 -vv
149 Very verbose mode, same as -v plus log stack-trace on error
150
151 -H [filename]
152 Print MD5 and SHA1 hashes of the file or of stdin if the
153 filename is omitted; Monit will exit afterwards
154
155 -V
156 Print version number and patch level
157
158 -h
159 Print a help text
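
Several of the switches above name a control-file statement that does
the same job. The fragment below is a sketch of those equivalents; the
interval and all paths are illustrative assumptions, not Monit
defaults.

```
# Control-file equivalents of some command-line switches.
# The interval and paths are examples only.
set daemon 60                         # like -d 60
set log /var/log/monit.log            # like -l /var/log/monit.log
set pidfile /var/run/monit.pid        # like -p /var/run/monit.pid
set statefile /var/lib/monit/state    # like -s /var/lib/monit/state
```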
160
161 Arguments
Once you have Monit running as a daemon process, you can call Monit
with one of the following arguments. Monit will then connect to the
Monit daemon (on TCP port 127.0.0.1:2812 by default) and ask the Monit
daemon to perform the requested action. In other words, calling monit
without arguments starts the Monit daemon, and calling monit with
arguments lets you communicate with the running Monit daemon process.
168
169 start all
170 Start all services listed in the control file and enable monitoring
171 for them. If the group option is set (-g), only start and enable
172 monitoring of services in the named group ("all" is not required in
173 this case).
174
175 start <name>
176 Start the named service and enable monitoring for it. The name is a
177 service entry name from the monitrc file.
178
179 stop all
180 Stop all services listed in the control file and disable their
181 monitoring. If the group option is set, only stop and disable
182 monitoring of the services in the named group ("all" is not
183 required in this case).
184
185 stop <name>
186 Stop the named service and disable its monitoring. The name is a
187 service entry name from the monitrc file.
188
189 restart all
190 Stop and start all services. If the group option is set, only
191 restart the services in the named group ("all" is not required in
192 this case).
193
194 restart <name>
195 Restart the named service. The name is a service entry name from
196 the monitrc file.
197
198 monitor all
199 Enable monitoring of all services listed in the control file. If
200 the group option is set, only start monitoring of services in the
201 named group ("all" is not required in this case).
202
203 monitor <name>
204 Enable monitoring of the named service. The name is a service entry
205 name from the monitrc file. Monit will also enable monitoring of
206 all services this service depends on.
207
208 unmonitor all
209 Disable monitoring of all services listed in the control file. If
210 the group option is set, only disable monitoring of services in the
211 named group ("all" is not required in this case).
212
213 unmonitor <name>
Disable monitoring of the named service. The name is a service
entry name from the monitrc file. Monit will also disable
monitoring of all services that depend on this service.
217
218 status [name]
219 Print service status information.
220
221 summary [name]
222 Print a short status summary.
223
224 report [up | down | initialising | unmonitored | total]
Report services state. The output can easily be parsed by scripts.
Without options, prints a short overview of the state of all
services managed by Monit. The option up prints the number of
services in that state, down likewise, and so on.
229
230 reload
Reinitialise a running Monit daemon. The daemon will reread its
configuration and close and reopen its log files.
233
234 quit
235 Kill the Monit daemon process
236
237 validate
238 Check all services listed in the control file. This action is also
239 the default behaviour when Monit runs in daemon mode.
240
241 procmatch <regex>
Allows for easy testing of a pattern for the process match check.
The command takes a regular expression as an argument and displays
all running processes matching the pattern.
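
The group option and the dependency behaviour of monitor and unmonitor
both rely on declarations inside service entries. The following sketch
shows how two entries might be tied together; the service names, group
name and paths are assumptions for illustration.

```
# Hypothetical entries: names, group and paths are placeholders.
check file apache_bin with path /usr/sbin/apache2
    group www

check process apache with pidfile /var/run/apache2.pid
    start program = "/etc/init.d/apache2 start"
    stop program  = "/etc/init.d/apache2 stop"
    depends on apache_bin
    group www
```

With entries like these, "monit -g www restart" would act on both
services, and "monit unmonitor apache_bin" would also disable
monitoring of apache, because apache depends on apache_bin.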
245
247 Monit is configured and controlled via a control file called monitrc.
248 The default location for this file is ~/.monitrc. If this file does not
249 exist, Monit will try /etc/monitrc, then @sysconfdir@/monitrc and
250 finally ./monitrc. If you build Monit from source, the value of
251 @sysconfdir@ can be given at configure time as ./configure
252 --sysconfdir. For instance, using ./configure --sysconfdir
253 /var/monit/etc will make Monit search for monitrc in /var/monit/etc
254
To protect the security of your control file and passwords, the control
file must have permissions no more than 0700 (u=rwx,g=,o=);
Monit will complain and exit otherwise.
258
When there is a conflict between the command-line arguments and the
arguments in this file, the command-line arguments take precedence.
261
Monit uses its own Domain Specific Language (DSL). The control file
consists of a series of service entries and global option statements
in a free-format, token-oriented syntax.

Comments begin with a '#' and extend through the end of the line.
268
269 You can use noise keywords like 'if', 'and', 'with(in)', 'has',
270 'us(ing|e)', 'on(ly)', 'then', 'for', 'of' anywhere in an entry to make
271 it resemble English. They're ignored, but can make entries much easier
272 to read at a glance. Keywords are case insensitive.
273
274 There are three kinds of tokens: grammar, numbers (i.e. decimal digit
275 sequences) and strings. Strings can be either quoted or unquoted. A
276 quoted string is bounded by double quotes and may contain whitespace
277 (and quoted digits are treated as a string). An unquoted string is any
278 whitespace-delimited token, containing characters and/or numbers.
279
280 On a semantic level, the control file consists of three types of
281 entries:
282
283 1. Global set-statements
284 A global set-statement starts with the keyword "set" and the item
285 to configure.
286
287 2. Global include-statement
The include statement consists of the keyword "include" and a glob
string. This statement is used to include configuration directives
from separate files.
291
292 3. One or more service entry statements.
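
To make the three entry types concrete, here is a small skeleton
control file containing one of each. The values, the service name and
the paths are illustrative assumptions.

```
# Global set-statement
set daemon 120

# Service entry statement (paths are placeholders)
check process sshd with pidfile /var/run/sshd.pid
    start program = "/etc/init.d/ssh start"
    stop program  = "/etc/init.d/ssh stop"

# Global include-statement
include /etc/monit.d/*.cfg
```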
293
294 Service checks
Each service entry consists of the keyword "check", followed by the
service type. Each entry requires a unique descriptive name, which may
be freely chosen. This name is used by Monit to refer to the service
internally and in all interactions with the user.
299
300 Currently, nine types of check statements are supported:
301
302 Process
303
304 CHECK PROCESS <unique name> <PIDFILE <path> | MATCHING <regex>>
305
306 <path> is the absolute path to the program's pid-file. A pid-file is a
307 file, containing a Process's unique ID. If the pid-file does not exist
308 or does not contain the PID number of a running process, Monit will
309 call the entry's start method if defined.
310
<regex> is an alternative to using PID files and uses process name
pattern matching to find the process to monitor. The top-most matching
parent with the highest uptime is selected, so this form of check is
most useful if the process name is unique. A pid-file should be used
where possible as it defines the expected PID exactly. You can test
whether a process matches a pattern from the command-line using "monit
procmatch "regex-pattern"". This will list all running processes
matching the regex-pattern.
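
The two forms might look as follows in practice. This is a sketch
only: the pid-file and init script paths vary between distributions,
and "myworker" is a hypothetical process name.

```
# PID-file form; paths are typical but distribution-dependent.
check process sshd with pidfile /var/run/sshd.pid
    start program = "/etc/init.d/ssh start"
    stop program  = "/etc/init.d/ssh stop"
    if failed port 22 protocol ssh then restart

# Pattern form; "myworker" is a hypothetical process name.
check process myworker matching "myworker"
    if cpu > 80% for 5 cycles then alert
```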
319
320 File
321
322 CHECK FILE <unique name> PATH <path>
323
<path> is the absolute path to the file. If the file does not exist,
Monit will call the entry's start method if defined. If <path> does not
point to a regular file type (for instance a directory), Monit will
disable monitoring of this entry. If Monit runs in passive mode or the
start method is not defined, Monit will just send an alert on error.
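
A file check could, for instance, watch a configuration file that is
not expected to change. The entry name, path and owner below are
assumptions for illustration.

```
# Alert if the file's content or owner changes unexpectedly.
check file sshd_config with path /etc/ssh/sshd_config
    if changed checksum then alert
    if failed uid root then alert
```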
329
330 Fifo
331
332 CHECK FIFO <unique name> PATH <path>
333
<path> is the absolute path to the fifo. If the fifo does not exist,
Monit will call the entry's start method if defined. If <path> does not
point to a fifo type (for instance a directory), Monit will disable
monitoring of this entry. If Monit runs in passive mode or the start
method is not defined, Monit will just send an alert on error.
339
340 Filesystem
341
342 CHECK FILESYSTEM <unique name> PATH <string>
343
344 <path> is the path to the device/disk, mount point or NFS/CIFS/FUSE
345 connection string. If the filesystem becomes unavailable, Monit will
346 call the service's start method if defined. If Monit runs in passive
347 mode or the start method is not defined, Monit will just send an alert
348 on error.
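
A minimal sketch of a filesystem check, assuming the root filesystem
is the one of interest and that 90% is an acceptable threshold:

```
# Entry name and thresholds are illustrative.
check filesystem rootfs with path /
    if space usage > 90% then alert
    if inode usage > 90% then alert
```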
349
350 Directory
351
352 CHECK DIRECTORY <unique name> PATH <path>
353
<path> is the absolute path to the directory. If the directory does not
exist, Monit will call the entry's start method if defined. If <path>
does not point to a directory, Monit will disable monitoring of this
entry. If Monit runs in passive mode or the start method is not
defined, Monit will just send an alert on error.
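
For example, a directory check might verify ownership and mode. The
directory, entry name and expected values here are hypothetical.

```
# Hypothetical application data directory.
check directory appdata with path /srv/app/data
    if failed permission 750 then alert
    if failed uid root then alert
```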
359
360 Remote host
361
362 CHECK HOST <unique name> ADDRESS <host>
363
The host address can be specified as a hostname string or as an
IP-address string in dotted decimal format, such as "tildeslash.com" or
"64.87.72.95".
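
A remote host entry is normally combined with one or more connection
tests. In this sketch the entry name and host name are placeholders.

```
# Placeholder host; replace with the server you want to watch.
check host web with address www.example.com
    if failed ping then alert
    if failed port 443 protocol https then alert
```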
367
368 System
369
370 CHECK SYSTEM <unique name>
371
372 The unique name is usually the local host name, but any descriptive
373 name can be used. If you use the variable $HOST as the name, it will
374 expand to the hostname. This check allows one to monitor general system
375 resources such as CPU usage, total memory usage or load average. The
376 unique name is used as the system hostname in mail alerts and as the
377 initial name of the host entry in M/Monit.
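
A system check using the $HOST variable might be written like this;
the thresholds are assumptions and should be tuned to the machine.

```
# Thresholds are examples, not recommendations.
check system $HOST
    if loadavg (5min) > 4 then alert
    if memory usage > 80% for 3 cycles then alert
    if cpu usage > 95% for 10 cycles then alert
```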
378
379 Program
380
381 CHECK PROGRAM <unique name> PATH <executable file> [TIMEOUT <number> SECONDS]
382
<path> is the absolute path to the executable program or script. The
status test allows one to check the program's exit status. If the
program does not finish executing within <number> seconds, Monit will
terminate it. The default program timeout is 300 seconds (5 minutes).
The output of the program is recorded and made available in the User
Interface and in alerts, by default up to 512 bytes. You can change the
output limit using the set limits statement.
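
As a sketch, a program check with a shorter timeout and an exit status
test could look as follows. The script path and entry name are
hypothetical.

```
# Hypothetical script; a non-zero exit status raises an alert.
check program backup_check with path /usr/local/bin/check_backup.sh
        with timeout 60 seconds
    if status != 0 then alert
```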
390
391 Network
392
393 CHECK NETWORK <unique name> <ADDRESS <ipaddress> | INTERFACE <name>>
394
<ipaddress> is the IPv4 or IPv6 address of the monitored network
interface. It is also possible to use the interface name, such as
"eth0" on Linux.
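
A network check by interface name might be sketched like this. The
entry name "uplink" and the saturation threshold are assumptions.

```
# Interface name as used on Linux; adjust for your system.
check network uplink with interface eth0
    if failed link then alert
    if saturation > 90% then alert
```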
398
400 Monit will log status and error messages to a file or via syslog. Use
401 the set log statement in the monitrc control file.
402
To set up Monit to log to its own file, use e.g. set log
/var/log/monit.log. Note that the older set logfile statement is
deprecated, but can still be used.
406
407 If syslog is given as a value for the "-l" command-line switch or the
408 keyword set log syslog is found in the control file, Monit will use the
409 syslog system daemon to log messages with a priority assigned to each
410 message based on the context.
411
To turn off logging, simply do not set the log in the control file (and
of course, do not use the -l switch).

The format of the log file is:
416
417 [date] priority : message
418
419 for example:
420
421 [CET Jan 5 18:49:29] info : 'localhost' Monit started
422
424 Monit uses ANSI escape sequences to colorise important parts of the
425 command-line output, if the terminal supports colors, and UTF-8 box
426 characters for tabular output.
427
428 If you want to process the monit CLI output in a script, you can use
429 either the -B option or use the following statement in the monit
430 configuration file to disable tabular output and colors completely:
431
432 SET TERMINAL BATCH
433
435 Use
436
437 SET DAEMON <seconds>
438 [[WITH] START DELAY <seconds>]
439
440 to specify Monit's poll cycle length and run Monit in daemon mode. You
441 must specify a numeric argument which is a polling interval in seconds.
442
In daemon mode, Monit detaches from the console, puts itself in the
background and runs continuously: it monitors each specified service,
goes to sleep for the given poll interval, then wakes up and starts
monitoring again, in an endless cycle.
447
448 Alternatively, you can use the "-d" command line switch to set the poll
449 interval, but it is strongly recommended to set the poll interval in
450 your ~/.monitrc file, by using set daemon.
451
452 Monit will then always start in daemon mode. If you do not use this
453 statement and do not start monit with the -d option, Monit will just
454 run through the service checks once and then exit. This might be useful
455 in some situations, but Monit is primarily designed to run as a daemon
456 process.
457
458 Calling "monit" with a Monit daemon running in the background sends a
459 wake-up signal to the daemon, forcing it to check services immediately.
460 Calling "monit" with the quit argument will kill a running Monit daemon
461 process instead of waking it up.
462
463 The start delay option can be used to wait (once) before Monit starts
464 checking services after system reboot. Monit will by default start
465 checking services immediately at startup.
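
For instance, the following statement would poll every 30 seconds and
wait two minutes before the first check. Both numbers are illustrative
assumptions.

```
# Poll every 30 seconds; delay the first check by 120 seconds.
set daemon 30
    with start delay 120
```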
466
468 The "set init" statement prevents Monit from transforming itself into a
469 daemon process. Instead Monit will run as a foreground process. (You
470 should still use "set daemon" to specify the poll cycle).
471
472 This is required to run Monit from init. Using init to start Monit is
473 probably the best way to run Monit if you want to be certain that you
474 always have a running Monit daemon on your system. Another option is to
475 run Monit from crontab. In any case, you should make sure that the
476 control file does not have any syntax errors before you start Monit
477 from init or crontab (use "monit -t" to check).
478
To set up Monit to run from init, you can either use the "set init"
statement in Monit's control file or use the "-I" option from the
command line. Here is what you must add to "/etc/inittab":
482
483 # Run Monit in standard run-levels
484 mo:2345:respawn:/usr/local/bin/monit -Ic /etc/monitrc
485
486 After you have modified init's configuration file, you can run the
487 following command to re-examine /etc/inittab and start Monit:
488
489 telinit q
490
491 For systems without telinit:
492
493 kill -1 1
494
If Monit is used to monitor services that are also started at boot time
(e.g. services started via SYSV init rc scripts or via inittab) then,
in some cases, a race condition could occur. That is, if a service is
slow to start, Monit can assume that the service is not running and
possibly try to start it and raise an alert, while in fact the service
is already about to start or already in its startup sequence. Please
see the FAQ for a solution to this problem. The short version is to
start Monit on a higher run-level after system processes.
503
505 The Monit control file, "monitrc", can include additional configuration
506 files. This feature helps one to organise configuration into separate
507 files instead of having everything in one file, if you like this kind
508 of thing. Include statements can be placed at virtually any place in
509 "monitrc" though the convention is at the bottom. The syntax is the
510 following:
511
512 INCLUDE <globstring>
513
The globstring is any kind of string as defined in glob(7). Thus, you
can refer to a single file or you can load several files at once. If
you want to use whitespace in your string, the globstring needs to be
enclosed in single quotes (') or double quotes ("). If the globstring
matches a directory instead of a file, it is silently ignored.
519
520 Any include statements in an included file are parsed as in the main
521 control file.
522
If the globstring matches several results, the files are included in
no particular order. If you need to rely on a certain order, you should
avoid wild-card globbing and instead specify the full path of each
included file.
527
528 An example,
529
530 include /etc/monit.d/*.cfg
531
This will load any file matching the globstring. That is, all files in
/etc/monit.d that end with the suffix .cfg.
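
When the order of inclusion matters, or a path contains whitespace,
the statements could be written as in this sketch. All file and
directory names are hypothetical.

```
# Explicit paths instead of a wild-card, to control the order.
include /etc/monit.d/10-base.cfg
include /etc/monit.d/20-web.cfg

# A globstring containing whitespace must be quoted.
include "/etc/monit.d/site config/*.cfg"
```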
534
536 Common SSL/TLS options can be set using the following statement and
537 will apply to all SSL connections made through Monit:
538
539 SET <SSL | TLS> [OPTIONS] {
540 VERSION: <AUTO | SSLV2 | SSLV3 | TLSV1 | TLSV11 | TLSV12 | TLSV13>
541 VERIFY: <ENABLE | DISABLE>
542 SELFSIGNED: <ALLOW | REJECT>
543 CIPHERS: <string>
544 PEMFILE: <path>
545 CLIENTPEMFILE: <path>
546 CACERTIFICATEFILE: <path>
547 CACERTIFICATEPATH: <path>
548 }
549
VERSION set the specific SSL/TLS version to use. By default Monit uses
AUTO. In AUTO mode, only TLS is used; SSLv2 and SSLv3 are considered
obsolete. If you have to use SSLv2 or SSLv3, you must explicitly set
the version.

VERIFY enable SSL server certificate verification. This will verify and
report an error if the server certificate is not trusted, not valid or
has expired. By default certificate verification is disabled, though we
recommend enabling it, otherwise there is no guarantee that Monit
speaks with the server you think it speaks with.

SELFSIGNED self-signed certificates are rejected by default. Use this
option to allow self-signed certificates. Warning: not recommended in
production for security reasons, as in that case the client cannot
verify that it talks to the correct server, and attacks such as
man-in-the-middle or DNS hijacking are possible.

CIPHERS override default SSL/TLS ciphers.

PEMFILE set the path to the SSL server certificate "database-file" in
PEM format. This option has effect only for the Monit HTTP interface.

CLIENTPEMFILE set the path to the PEM encoded SSL client certificates
database file. If set, client certificate authentication is enabled.

CACERTIFICATEFILE set the path to the PEM encoded file containing
Certificate Authority (CA) certificates. Monit uses OpenSSL's default
CA certificates if this option is not used (openssl version -d can be
used to locate the default CA certificates). Many distributions come
with SSL and CA certificates already set up and using this option is
normally not necessary.

CACERTIFICATEPATH set the path to the directory containing Certificate
Authority (CA) certificates. Monit uses OpenSSL's default CA
certificates if this option is not used. Many distributions come with
SSL and CA certificates already set up and using this option is
normally not necessary.

The SSL options statement will globally apply to all SSL/TLS
connections made through Monit. SSL options can also be set in a local
check, in mailserver settings or in the mmonit statement, and will then
override or extend the global settings.
592
593 To set global SSL options, put this statement near the top of your
594 .monitrc file:
595
596 set ssl options {...}
597
598 Here is an example of setting both global and local SSL options:
599
600 # Enable certificate verification for all SSL connections
601 # Self-signed certificates are not allowed by default
602 set ssl options {
603 verify: enable
604 }
605
606 # Verify certificate (via global setting)
607 # Allow self-signed certificate for this check
608 check host example with address example.com
609 if failed
610 port 443
611 protocol https
612 with ssl options {selfsigned: allow}
613 then alert
614
615 # Do not verify example2.com's certificate (override global setting)
616 check host example2 with address example2.com
617 if failed
618 port 443
619 protocol https
620 with ssl options {verify: disable}
621 then alert
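
As a further sketch, a global statement can pin the protocol version
and point Monit at a specific CA bundle. The chosen version and the
bundle path are assumptions; the path shown is merely a common
location on some Linux distributions.

```
# Illustrative values; adjust the version and path to your system.
set ssl options {
    version: tlsv12
    verify: enable
    cacertificatefile: /etc/ssl/certs/ca-certificates.crt
}
```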
622
624 To enable FIPS mode (provided your OpenSSL library supports it), add
625 this statement to Monit control file:
626
627 SET FIPS
628
630 If specified in the control file, Monit will start with HTTP support.
631 You can then use Monit CLI to start and stop services, disable or
632 enable service monitoring as well as view the status of each service.
633
634 If HTTP support is enabled over TCP rather than over a Unix Socket, you
635 can also view Monit's informative dashboard in your web browser.
636
Note that if HTTP support is disabled, the Monit CLI interface will
have reduced functionality, as most CLI commands (such as "monit
status") need to communicate with the Monit background process via the
HTTP interface. We strongly recommend having HTTP support enabled. If
security is a concern, bind the HTTP interface to localhost only or
use a Unix Socket so Monit is not accessible from the outside.
643
644 UNIX SOCKET
645 Syntax for Unix Socket:
646
647 SET HTTPD UNIXSOCKET <path>
648 [UID <uid | username>]
649 [GID <gid | groupname>]
650 [PERMISSION <octal number>]
651 ALLOW <user:password>+
652
653 Example:
654
655 set httpd unixsocket /var/run/monit.sock
656 allow username:password
657
658 UNIXSOCKET set the path to the Unix Socket Monit should bind to and
659 listen on.
660
661 UID Socket owner (optional, defaults to the user who executes Monit)
662
663 GID Socket group (optional, defaults to primary group of the user who
664 executes Monit)
665
666 PERMISSION Socket permissions - absolute octal mode (optional, process
667 UMASK is applied by default)
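
A sketch using the optional socket settings described above. The
owner, group, mode and credentials are illustrative assumptions.

```
# Restrict the socket to its owner and group.
set httpd unixsocket /var/run/monit.sock
    uid root
    gid root
    permission 0660
    allow admin:secret
```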
668
669 TCP PORT
670 Syntax for TCP port:
671
672 SET HTTPD PORT <number>
673 [ADDRESS <hostname | IP-address>]
674 [[with] SSL {pemfile: <path>}]
675 ALLOW <user:password | IP-address | IP-range>+
676
PORT set the port Monit should bind to and listen on. Monit is usually
set up on port 2812. Example:
679
680 set httpd port 2812
681 allow username:password
682
You can now use <http://localhost:2812/> to access Monit's web
interface from a browser, after you have entered username and password
as credentials. You might need to use double quotes around the password
if it contains special characters, such as "p@ssw:r#".
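
With such a password, the allow line might be written as in this
sketch; the username is a placeholder.

```
# The password is quoted because it contains special characters.
set httpd port 2812
    allow admin:"p@ssw:r#"
```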
687
688 ADDRESS make Monit listen on a specific interface only. For example if
689 you don't want to expose Monit's web interface to the network, bind it
690 to localhost only. Monit will accept connections on any addresses if
691 the ADDRESS option is not used:
692
693 set httpd
694 port 2812
695 use address 127.0.0.1
696 allow username:password
697
698 Monit HTTP over TCP supports both IP version 4 and 6. Support is
699 transparent and does not require any special configuration. If the bind
700 address is not specified as in this example:
701
702 set httpd
703 port 2812
704 allow ...
705
706 Monit will bind to and listen on port 2812 on all interfaces, both IPv4
707 and IPv6 if available. To force Monit HTTP to only listen on and accept
708 connections over IP version 6, specify an IPv6 address:
709
710 set httpd
711 port 2812
712 use address "fe80::222:19ff:fe53:6c59"
713 allow ...
714
715 Likewise, to force Monit HTTP to only listen on and accept connections
716 over IP version 4, specify an IPv4 address:
717
718 set httpd
719 port 2812
720 use address 62.109.39.247
721 allow ...
722
723 SSL settings
724
725 SSL enable SSL/TLS for Monit's web interface. See options for full
726 list of SSL options.
727
728 PEMFILE set the path to the PEM encoded file, which contains the
729 server's private key and certificate. This file should be stored in a
730 safe place on the filesystem and should have strict permissions, no
731 more than 0700.
732
733 Example:
734
735 set httpd
736 port 2812
737 with ssl {
738 pemfile: /etc/ssl/certs/monit.pem
739 }
740 allow myuser:mypassword
741
742 You can now use <https://localhost:2812/> to access the Monit web
743 server over a TLS encrypted connection.
744
745 Self-signed server certificates note: The Monit CLI works on a client-
746 server basis and uses the Monit HTTP GUI to collect status from the
747 Monit daemon and pass commands like start/stop to it. As self-signed
748 certificates are rejected by default for security reasons, the CLI
749 won't work unless you explicitly allow it by using the SELFSIGNED:
750 ALLOW option:
751
752 set httpd
753 port 2812
754 with ssl {
755 pemfile: /etc/ssl/certs/monit.pem
756 selfsigned: allow
757 }
758 allow myuser:mypassword
759
CLIENTPEMFILE enables client certificate based authentication and
sets the path to a PEM encoded database file that contains a list of
allowed client certificates. A connecting client has to provide a
certificate known to Monit (listed in clientpemfile), otherwise it is
rejected. This file must also include all necessary CA certificates. By
default self-signed client certificates are rejected for security
reasons. If you want to allow self-signed client certificates
(recommended only for testing), you have to allow them explicitly using
the SELFSIGNED: ALLOW option (see the example above). See your
browser's documentation for how to import a client certificate into it.
770
771 Example:
772
773 set httpd
774 port 2812
775 with SSL {
776 pemfile: /etc/ssl/certs/monit.pem
777 clientpemfile: /etc/ssl/certs/monit-client.pem
778 }
779
780 Monit version signature
781 SIGNATURE can be used to hide Monit version from the HTTP response
782 header and error pages. For example:
783
784 set httpd
785 port 2812
786 signature disable
787 allow myuser:mypassword
788
789 Authentication
790 Access to the Monit web interface is controlled primarily via the ALLOW
791 option which is used to specify authentication and authorise only
792 specific clients to connect.
793
794 If the Monit command line interface is being used, at least one
795 cleartext password is necessary (see below), otherwise the Monit
796 command line interface will not be able to connect to the Monit web
797 interface.
798
799 Clients that try to connect to Monit, but submit a wrong username
800 and/or password are logged with their IP-address.
801
802 Client certificates
803
This authentication method is a strong authentication mechanism and
employs HTTPS client certificates to verify the authenticity of a
connecting client. Clients must possess a Public Key Certificate known
by Monit. The client must connect to Monit over SSL and Monit will ask
the client to send its certificate. Upon receiving the certificate
Monit compares the certificate to certificates located in the
CLIENTPEMFILE file. Access is granted if the client certificate is in
this file. See SSL settings for details.
812
813 Basic Authentication
814
815 Monit supports Basic Authentication as described in RFC 2617.
816
817 In short, a server challenges a client (e.g. a browser) to send
818 authentication information (username and password) and if accepted, the
819 server will allow the client access to the requested document.
820
821 The biggest weakness of Basic Authentication is that the username and
822 password are sent in clear-text over the network (i.e. merely base64
823 encoded, not encrypted). It is therefore recommended that you do not use
824 this authentication method unless you run Monit with SSL support. With
825 SSL, it is safe to use Basic Authentication since all HTTP data,
826 including Basic Authentication headers, will be encrypted.
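As a minimal sketch of this recommendation, the fragment below enables SSL on the HTTP interface and then adds a username and password entry, so that the Basic Authentication headers travel over an encrypted connection. The pemfile path is reused from the SSL example in this manual. The user name and password are placeholders.

```
set httpd
    port 2812
    with ssl {
        # Server certificate; path taken from the SSL example above
        pemfile: /etc/ssl/certs/monit.pem
    }
    # Placeholder credentials; quoted because of the non-alphanumeric character
    allow admin:"s3cret-pass"
```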
827
828 Cleartext user and password
829
830 Monit will use Basic Authentication if an allow statement contains a
831 username and a password separated with a single ':' character.
832
833 Note: Special characters can be used, but for non-alphanumerics the
834 password has to be quoted.
835
836 Syntax:
837
838 ALLOW <username>:<password>
839
840 Host and network allow list
841
842 Monit maintains an access-control list of hosts and networks allowed to
843 connect. You can add as many hosts as you want, but each host must be
844 given as a valid domain name or IP address.
845
846 Monit will query a name server to check any hosts trying to connect. If
847 a host (client) is trying to connect, but cannot be found in the access
848 list or cannot be resolved, Monit will shut down the connection to the
849 client promptly.
850
851 Control file example:
852
853 set httpd port 2812
854 allow localhost
855 allow my.other.work.machine.com
856 allow 10.1.1.1
857 allow 192.168.1.0/255.255.255.0
858 allow 10.0.0.0/8
859
860 Clients that are not in the allow list and try to connect to Monit
861 are denied access and logged with their IP address.
862
863 PAM
864
865 PAM is supported on platforms which provide PAM (such as Linux, macOS,
866 FreeBSD, NetBSD).
867
868 Syntax:
869
870 ALLOW @<group>
871
872 where "group" is the group name allowed to access Monit's web
873 interface. Monit uses a PAM service called monit for PAM
874 authentication. See the PAM manual page for detailed instructions on
875 how to set up the PAM service and PAM authentication plugins.
876
877 Sample PAM service for Monit on macOS (store as "/etc/pam.d/monit"
878 file):
879
880 # monit: auth account password session
881 auth sufficient pam_securityserver.so
882 auth sufficient pam_unix.so
883 auth required pam_deny.so
884 account required pam_permit.so
885
886 A "monitrc" config which only allows group "admin" authenticated via
887 PAM to access the web interface:
888
889 set httpd
890 port 2812
891 allow @admin
892
893 htpasswd file
894
895 Alternatively, you can store credentials in an "htpasswd" formatted file
896 (one user:passwd entry per line), like so: allow [cleartext|crypt|md5]
897 /path [users]. The default is cleartext passwords. If passwords are
898 digested, it is necessary to specify the cryptographic method. If you do
899 not want all users in the password file to have access to Monit, you
900 can specify only those users that should have access in the allow
901 statement. Otherwise all users are added.
902
903 Example 1:
904
905 set httpd port 2812
906 allow md5 /etc/httpd/htpasswd john paul ringo george
907
908 If you use this method together with a host list, then only clients
909 from the listed hosts will be allowed to connect to the Monit HTTP
910 server and each client will be asked to provide a username and a
911 password.
912
913 Example 2:
914
915 set httpd port 2812
916 allow localhost
917 allow 10.1.1.1
918 allow hauk:"passw@rd"
919
920 If you only want to use Basic Authentication, then just provide allow
921 entries with username and password or password files as in example 1
922 above.
923
924 Read-only users
925
926 Finally it is possible to define some users as read-only. A read-only
927 user can read the Monit web pages but will not get access to push-
928 buttons and cannot change a service from the web interface.
929
930 set httpd port 2812
931 allow admin:password
932 allow hauk:password read-only
933 allow @admins
934 allow @users read-only
935
936 A user is set to read-only by using the read-only keyword after
937 username:password. In the above example the user hauk is defined as a
938 read-only user, while the admin user has all access rights.
939
941 Monit will raise an alert in the following situations:
942
943 o A service does not exist (e.g. process is not running)
944 o Cannot read service data (e.g. cannot get filesystem usage)
945 o Execution of a service related script failed (e.g. start failed)
946 o Invalid service type (e.g. if path points to directory instead of file)
947 o Custom test script returned error
948 o Ping test failed
949 o TCP/UDP connection and/or port test failed
950 o Resource usage test failed (e.g. cpu usage too high)
951 o Checksum mismatch or change (e.g. file changed)
952 o File size test failed (e.g. file too large)
953 o Timestamp test failed (e.g. file is older than expected)
954 o Permission test failed (e.g. file mode doesn't match)
955 o A UID test failed (e.g. file owned by different user)
956 o A GID test failed (e.g. file owned by different group)
957 o A process' PID changed out of Monit's control
958 o A process' PPID changed out of Monit's control
959 o Too many service recovery attempts failed
960 o A file content test found a match
961 o Filesystem flags changed
962 o A service action was performed by administrator
963 o A network link failed
964 o A network link capacity changed
965 o A network link saturation failed
966 o A network link upload/download rate failed
967 o Monit was started, stopped or reloaded
968
969 To get an alert via e-mail, set the alert target using the global "set
970 alert" statement (for all services) or the "alert" statement in the
971 context of a service entry (for a single service).
972
973 Setting an alert recipient
974 If an event occurs, Monit will send an alert. There are two kinds of
975 alert statement: global and local.
976
977 Global syntax:
978
979 SET ALERT mail-address [[NOT] {event, ...}] [REMINDER cycles]
980
981 Example:
982
983 set alert foo@bar
984
985 will send a default email to the address foo@bar whenever any event
986 occurs on any service.
987
988 If you want to send alert messages to several email addresses, add a
989 separate "set alert" statement for each address.
990
991 It is also possible to use the local alert statement in the context of
992 a service check to enable alert for the given service only:
993
994 ALERT mail-address [[NOT] {event, ...}] [REMINDER cycles]
995
996 Local alert example:
997
998 check host myhost with address 1.2.3.4
999 if failed port 3306 protocol mysql then alert
1000 if failed port 80 protocol http then alert
1001 alert foo@baz # Local service alert
1002
1003 You can combine global and local alert statements. If there is a
1004 conflict, the local alert has precedence and overrides the global
1005 statement.
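The precedence rule can be sketched as follows, assuming it behaves as described above. The address and the sshd service are placeholders. The filter syntax is the one described under "Setting an event filter".

```
# Global recipient: receives every event from every service
set alert ops@example.org

check process sshd with pidfile /var/run/sshd.pid
    # Local statement for the same address: for sshd only, this filter
    # should take precedence over the unfiltered global statement
    alert ops@example.org only on { nonexist, timeout }
```

With this configuration, ops@example.org would still receive all events for other services, but only nonexist and timeout events for sshd.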
1006
1007 Setting an event filter
1008
1009 If you only want an alert message sent for certain events, list them in
1010 an "{event, ...}" block, e.g.:
1011
1012 set alert foo@bar only on { timeout, nonexist }
1013
1014 The event list can also be negated to send alerts for all events except
1015 those which are listed, by prepending the list with the word "not". For
1016 example, to receive all alerts except notification about Monit program
1017 start and stop:
1018
1019 set alert foo@bar but not on { instance }
1020
1021 Here is a list of all possible event types emitted by Monit. Values
1022 from the first column can be used in the event filter list mentioned
1023 above:
1024
1025 Event: | Failure state: | Success state:
1026 ---------------------------------------------------------------------
1027 action | "Action failed" | "Action done"
1028 checksum | "Checksum failed" | "Checksum succeeded"
1029 bytein | "Download bytes exceeded" | "Download bytes ok"
1030 byteout | "Upload bytes exceeded" | "Upload bytes ok"
1031 connection | "Connection failed" | "Connection succeeded"
1032 content | "Content failed" | "Content succeeded"
1033 data | "Data access error" | "Data access succeeded"
1034 exec | "Execution failed" | "Execution succeeded"
1035 fsflags | "Filesystem flags failed" | "Filesystem flags succeeded"
1036 gid | "GID failed" | "GID succeeded"
1037 icmp | "Ping failed" | "Ping succeeded"
1038 instance | "Monit instance changed" | "Monit instance changed not"
1039 invalid | "Invalid type" | "Type succeeded"
1040 link | "Link down" | "Link up"
1041 nonexist | "Does not exist" | "Exists"
1042 packetin | "Download packets exceeded" | "Download packets ok"
1043 packetout | "Upload packets exceeded" | "Upload packets ok"
1044 permission | "Permission failed" | "Permission succeeded"
1045 pid | "PID failed" | "PID succeeded"
1046 ppid | "PPID failed" | "PPID succeeded"
1047 resource | "Resource limit matched" | "Resource limit succeeded"
1048 saturation | "Saturation exceeded" | "Saturation ok"
1049 size | "Size failed" | "Size succeeded"
1050 speed | "Speed failed" | "Speed ok"
1051 status | "Status failed" | "Status succeeded"
1052 timeout | "Timeout" | "Timeout recovery"
1053 timestamp | "Timestamp failed" | "Timestamp succeeded"
1054 uid | "UID failed" | "UID succeeded"
1055 uptime | "Uptime failed" | "Uptime succeeded"
1056
1057 Each alert recipient can have its own filter, for example:
1058
1059 set alert foo@bar { nonexist, timeout, resource, icmp, connection }
1060 set alert security@bar on { checksum, permission, uid, gid }
1061 set alert admin@bar
1062
1063 Setting an error reminder
1064
1065 Monit by default sends just one notification if a service failed and
1066 another when/if it recovers. If you want to be notified that the
1067 service is still in a failed state, you can use the reminder option in
1068 the alert statement:
1069
1070 SET ALERT mail-address [WITH] REMINDER [ON] number [CYCLES]
1071
1072 For example if you want to be notified each tenth cycle if a service
1073 remains in a failed state, you can use:
1074
1075 alert foo@bar with reminder on 10 cycles
1076
1077 Likewise if you want to be notified on each failed cycle, you can use:
1078
1079 alert foo@bar with reminder on 1 cycle
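Going by the global syntax given earlier (event list first, then reminder), a filter and a reminder can presumably be combined in one statement. In this sketch the address and the cycle count are arbitrary placeholders.

```
# Alert only on these two events, and repeat the alert every
# 30 cycles for as long as the service stays in the failed state
set alert oncall@example.org on { nonexist, connection } with reminder on 30 cycles
```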
1080
1081 Disabling alerts for some service
1082 To suppress alerts for some user and service, add the "noalert"
1083 statement in the context of a service check.
1084
1085 NOALERT mail-address
1086
1087 Example (send all alerts to foo@bar except for service p3):
1088
1089 set alert foo@bar
1090
1091 check process p1 with pidfile /var/run/p1.pid
1092
1093 check process p2 with pidfile /var/run/p2.pid
1094
1095 check process p3 with pidfile /var/run/p3.pid
1096 noalert foo@bar
1097
1098 Message format
1099 The alert message format can be modified by using the "set mail-format"
1100 statement:
1101
1102 SET MAIL-FORMAT {mail-format}
1103
1104 Example:
1105
1106 set mail-format {
1107 from: Monit Support <monit@foo.bar>
1108 reply-to: support@domain.com
1109 subject: $SERVICE $EVENT at $DATE
1110 message: Monit $ACTION $SERVICE at $DATE on $HOST: $DESCRIPTION.
1111 Yours sincerely,
1112 monit
1113 }
1114
1115 The from: option is the sender's email address for Monit alerts. A
1116 sender's name is optional, but if used, requires that the subsequent
1117 email-address is enclosed in angle brackets as in the example above.
1118
1119 The reply-to: option can be used to set the reply-to mail header,
1120 optionally with a name.
1121
1122 The subject: option sets the message subject and must fit on a single
1123 line.
1124
1125 The message: option sets the mail body. This option should always be
1126 the last in a mail-format statement. The mail body can be as long as
1127 needed, but must not contain the block-closing '}' character.
1128
1129 You need not use all options, only the option which you want to
1130 override. For example to globally change the sender address only:
1131
1132 set mail-format { from: bofh@foo.bar }
1133
1134 The subject and body may contain $NAME variables, which are expanded by
1135 Monit. Here is a list of variables that can be used when composing an
1136 alert message.
1137
1138 · $EVENT
1139
1140 A string describing the event that occurred.
1141
1142 · $SERVICE
1143
1144 The service name
1145
1146 · $DATE
1147
1148 The current time and date (RFC 822 date style).
1149
1150 · $HOST
1151
1152 The name of the host Monit is running on
1153
1154 · $ACTION
1155
1156 The name of the action which was done by Monit.
1157
1158 · $DESCRIPTION
1159
1160 The description of the error condition
1161
1162 Setting a mail server for alert delivery
1163 The mail server Monit should use to send alert messages is defined with
1164 a "set mailserver" statement:
1165
1166 SET MAILSERVER
1167 <hostname|ip-address>
1168 [PORT number]
1169 [USERNAME string] [PASSWORD string]
1170 [using SSL [with options {...}]
1171 [CERTIFICATE CHECKSUM [MD5|SHA1] <hash>],
1172 ...
1173 [with TIMEOUT X SECONDS]
1174 [using HOSTNAME hostname]
1175
1176 Multiple mail servers can be set by using a comma separated list. If
1177 Monit cannot connect to the first server, it will try the next in the
1178 list and so on.
1179
1180 The port statement allows one to override the default SMTP port (465
1181 for SSL, or 25 for TLS and non-secure connections).
1182
1183 Monit supports AUTH PLAIN and AUTH LOGIN for SMTP authentication. You
1184 can set a username and a password using the USERNAME and PASSWORD
1185 options.
1186
1187 You can set SSL/TLS options for the connection and also check a SSL
1188 certificate checksum.
1189
1190 The default connection timeout is 5 seconds. You can raise this limit
1191 using the TIMEOUT option.
1192
1193 Example (setting two mail servers for failover):
1194
1195 set mailserver smtp.gmail.com, smtp.other.host
1196
1197 By default, Monit uses the local host name in SMTP HELO/EHLO and in the
1198 Message-ID header. You can override this using the HOSTNAME option.
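A fuller sketch that combines the options from the syntax above might look like this. The host names, credentials and timeout are hypothetical, and the options are written in the same order as in the syntax block.

```
set mailserver mail.example.org port 465       # primary server, SSL port
        username "monit" password "secret"     # placeholder credentials
        using ssl,
    backup.example.org                         # tried if the first server fails
    with timeout 15 seconds                    # raise the 5 second default
    using hostname monit.example.org           # name used in HELO/EHLO and Message-ID
```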
1199
1200 Event queue
1201 If no mail server is available, Monit can queue events in the local
1202 file-system for retry until the mail server recovers.
1203
1204 If Monit is used with M/Monit, the event queue provides a safe event
1205 store for M/Monit in the case of temporary problems.
1206
1207 The event queue is persistent across Monit restarts and provided that
1208 the back-end filesystem is persistent, across system restart as well.
1209
1210 By default, the queue is disabled and if the alert handler fails, Monit
1211 will simply drop the alert message.
1212
1213 To enable the event queue, add the following statement:
1214
1215 SET EVENTQUEUE BASEDIR <path> [SLOTS <number>]
1216
1217 The <path> is the path to the directory where events will be stored.
1218
1219 Optionally if you want to limit the queue size, use the slots option to
1220 only store up to number event messages.
1221
1222 Example:
1223
1224 set eventqueue basedir /var/monit slots 5000
1225
1226 If you are running more than one Monit instance on the same machine,
1227 you must use separate event queue directories.
1228
1230 Each service can have associated start, stop and restart methods which
1231 Monit can use to execute actions on the service.
1232
1233 Syntax:
1234
1235 <START | STOP | RESTART> [PROGRAM] = "program"
1236 [[AS] UID <number | string>]
1237 [[AS] GID <number | string>]
1238 [[WITH] TIMEOUT <number> SECOND(S)]
1239
1240 If the "program" is a shell script it must begin with "#!" and the
1241 remainder of the first line must specify an interpreter for the
1242 program, e.g. "#!/bin/sh".
1243
1244 The "program" must also be executable (for example mode 0755).
1245
1246 It's possible to write scripts directly into the program this way:
1247
1248 stop = "/bin/sh -c 'kill -s SIGTERM `cat /var/run/process.pid`'"
1249
1250 By default the program is executed as the user under which Monit is
1251 running. If Monit is running as root, you may optionally specify the
1252 UID and GID the executed program should switch to.
1253
1254 Example:
1255
1256 check process mmonit with pidfile /usr/local/mmonit/mmonit/logs/mmonit.pid
1257 start program = "/usr/local/mmonit/bin/mmonit" as uid "mmonit" and gid "mmonit"
1258 stop program = "/usr/local/mmonit/bin/mmonit stop" as uid "mmonit" and gid "mmonit"
1259
1260 In the case of a process check, Monit will wait up to 30 seconds for
1261 the start/stop action to finish before giving up and reporting an error.
1262 You can override this timeout per service using the TIMEOUT option or
1263 globally using the SET LIMITS statement.
1264
1265 Example:
1266
1267 check process foobar with pidfile /var/run/foobar.pid
1268 start program = "/etc/init.d/foobar start" with timeout 60 seconds
1269 stop program = "/etc/init.d/foobar stop"
1270
1272 Services are checked regularly in an interval defined by the "set
1273 daemon n" statement. Checks are performed in the same order as they are
1274 written in the ".monitrc" file, except if dependencies are set up
1275 between services, where prerequisite services are tested first.
1276
1277 It is possible to modify a service check schedule by using the "every"
1278 statement.
1279
1280 There are three variants:
1281
1282 1. A poll cycle multiple
1283 EVERY [number] CYCLES
1284
1285 2. Cron-style
1286 EVERY [cron]
1287
1288 3. Negative Cron-style (do-not-check)
1289 NOT EVERY [cron]
1290
1291 A cron-style string consists of 5 fields separated by white-space.
1292 All fields are required:
1293
1294 Name: | Allowed values: | Special characters:
1295 ---------------------------------------------------------------
1296 Minutes | 0-59 | * - ,
1297 Hours | 0-23 | * - ,
1298 Day of month | 1-31 | * - ,
1299 Month | 1-12 (1=jan, 12=dec) | * - ,
1300 Day of week | 0-6 (0=sunday, 6=saturday) | * - ,
1301
1302 The special characters:
1303
1304 Character: | Description:
1305 ---------------------------------------------------------------
1306 * (asterisk) | The asterisk indicates that the expression will
1307 | match for all values of the field; e.g., using
1308 | an asterisk in the 4th field (month) would
1309 | indicate every month.
1310 - (hyphen) | Hyphens are used to define ranges. For example,
1311 | 8-9 in the hour field indicates between 8AM and
1312 | 9AM. Note that the range runs from the start time
1313 | up to and including the end time, that is, from
1314 | 8AM until 10AM unless minutes are set. As another
1315 | example, 1-5 in the weekday field specifies Monday
1316 | to Friday (including Friday).
1317 , (comma) | Commas are used to specify a sequence. For example,
1318 | 17,18 in the day field indicates the 17th and 18th
1319 | day of the month. A sequence can also include
1320 | ranges. For example, using 1-5,0 in the weekday
1321 | field indicates Monday to Friday and Sunday.
1322
1323 Example 1: Check once per two cycles
1324
1325 check process nginx with pidfile /var/run/nginx.pid
1326 every 2 cycles
1327
1328 Example 2: Check every workday between 8AM and 7PM
1329
1330 check program checkOracleDatabase
1331 with path /var/monit/programs/checkoracle.pl
1332 every "* 8-19 * * 1-5"
1333
1334 Example 3: Do not run the check in the backup window on Sunday between
1335 0AM and 3AM, otherwise run the check with the regular poll cycle
1336 frequency.
1337
1338 check process mysqld with pidfile /var/run/mysqld.pid
1339 not every "* 0-3 * * 0"
1340
1341 Limitations:
1342
1343 The current scheduler is poll cycle based. If a service check is
1344 scheduled with the every cron statement, Monit will check if the
1345 current time matches the cron-string pattern. If it does, the check
1346 is performed, otherwise it is skipped. The cron specification does not
1347 guarantee when exactly the test will run; this depends on the default
1348 poll time and the length of the check cycle. In other words, we cannot
1349 guarantee that Monit will run at a specific time. Therefore we strongly
1350 recommend using an asterisk in the minute field or at minimum a range,
1351 e.g. 0-15. Never use a specific minute as Monit may not run on that
1352 minute.
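A sketch of the recommended form follows. It uses a minute range rather than a single minute, so that at least one poll cycle should fall inside the window. The program name and script path are hypothetical.

```
check program nightly_report with path "/usr/local/bin/nightly-report.sh"
    # Minute range 0-15 instead of a fixed minute such as "0 2 * * *"
    every "0-15 2 * * *"
    if status != 0 then alert
```

Since the check is performed on every poll cycle whose time matches the pattern, the script may presumably run more than once between 02:00 and 02:15, so it should be safe to run repeatedly.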
1353
1354 We will address this limitation in a future release and convert the
1355 scheduler from serial polling into a parallel non-blocking scheduler
1356 where checks are guaranteed to run on time and with seconds resolution.
1357
1359 Service entries in the control file, monitrc, can be grouped together
1360 by the group statement. The syntax is simply (keyword in capital):
1361
1362 GROUP groupname
1363
1364 With this statement it is possible to group similar service entries
1365 together and manage them as a whole. Monit provides functions to start,
1366 stop, restart, monitor and unmonitor a group of services, like so:
1367
1368 To start a group of services from the console:
1369
1370 monit -g <groupname> start
1371
1372 To stop a group of services:
1373
1374 monit -g <groupname> stop
1375
1376 To restart a group of services:
1377
1378 monit -g <groupname> restart
1379
1380 A service can be added to multiple groups by using more than one group
1381 statement:
1382
1383 group www
1384 group filesystem
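In context, a group statement sits inside a service entry like any other statement. The sketch below uses hypothetical services, paths and group names. It places two services in group www and one of them also in group app.

```
check process nginx with pidfile /var/run/nginx.pid
    start program = "/etc/init.d/nginx start"
    stop program  = "/etc/init.d/nginx stop"
    group www

check process appserver with pidfile /var/run/appserver.pid
    start program = "/etc/init.d/appserver start"
    stop program  = "/etc/init.d/appserver stop"
    group www
    group app
```

Running "monit -g www restart" would then act on both services, while "monit -g app restart" would act on appserver only.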
1385
1387 Monit supports two monitoring modes: active and passive.
1388
1389 Syntax:
1390
1391 MODE <ACTIVE | PASSIVE>
1392
1393 In active mode, Monit will pro-actively monitor a service and in case
1394 of problems raise alerts and restart the service. Active is the default
1395 mode.
1396
1397 The passive mode is similar to the active mode, except that if the
1398 service fails, Monit will not try to fix the problem by restarting the
1399 service and will only raise alerts.
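For illustration, a service that Monit should watch without restarting could be declared as follows. The service name and pidfile are placeholders.

```
check process legacyd with pidfile /var/run/legacyd.pid
    # Report problems only; do not attempt to restart the service
    mode passive
```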
1400
1402 Monit supports three reboot modes: start, nostart and laststate.
1403
1404 Syntax:
1405
1406 ONREBOOT <START | NOSTART | LASTSTATE>
1407
1408 In start mode, Monit will always start the service automatically on
1409 reboot, even if it was stopped before restart. This is the default mode
1410 and used if onreboot is not specified.
1411
1412 In nostart mode, the service is never started automatically after
1413 reboot. This mode is intended for high-availability solutions with
1414 active/passive clusters. For example, a service group HA, consisting of
1415 e.g. a mobile IP alias and an application server, is started on host
1416 H1, host H2 is backup and heartbeat is in place between both hosts.
1417 The service group HA must be started on one node only. If H1 dies, H2
1418 takes over the HA group. If H1 reboots, it is important that it does not
1419 try to start the HA group as well, even though the group was active on
1420 H1 before it crashed, because HA is now running on H2.
1421
1422 In laststate mode, a service's monitoring state is persistent across
1423 reboot. For instance, if a service was started before reboot, it will
1424 be started after reboot. If it was stopped before reboot, it will not
1425 be started after and so on.
1426
1427 The default ONREBOOT START mode can be overridden globally:
1428
1429 SET ONREBOOT <START | NOSTART | LASTSTATE>
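The high-availability scenario above could be sketched as follows on each cluster node. The service name, paths and group name are hypothetical.

```
check process appserver with pidfile /var/run/appserver.pid
    start program = "/etc/init.d/appserver start"
    stop program  = "/etc/init.d/appserver stop"
    group ha
    # Never start automatically after reboot; the cluster software
    # decides which node runs the HA group
    onreboot nostart
```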
1430
1432 Monit provides a restart limit mechanism for situations where a service
1433 simply refuses to start or respond over a longer period.
1434
1435 The restart limit mechanism is based on number of service restarts and
1436 number of poll-cycles. For example, if a service had x restarts within
1437 y poll-cycles (where x <= y) then Monit will perform an action (for
1438 example unmonitor the service). If a timeout occurs, Monit will send an
1439 alert message if you have registered interest in this event.
1440
1441 The syntax for the timeout statement is as follows (keywords are in
1442 capital):
1443
1444 IF <number> RESTART <number> CYCLE(S) THEN <action>
1445
1446 The action value is either one of the common actions or TIMEOUT (for
1447 backward compatibility, equivalent to the UNMONITOR action).
1448
1449 Here is an example where Monit will unmonitor the service if it was
1450 restarted 2 times within 3 cycles:
1451
1452 if 2 restarts within 3 cycles then unmonitor
1453
1454 To have Monit check the service again after monitoring was disabled,
1455 run "monit monitor servicename" from the command line.
1456
1457 Example for setting custom exec on timeout:
1458
1459 if 5 restarts within 5 cycles then exec "/foo/bar"
1460
1461 Example for stopping the service:
1462
1463 if 7 restarts within 10 cycles then stop
1464
1466 If specified in the control file, Monit can do dependency checking
1467 before start, stop, monitoring or unmonitoring of services. The
1468 dependency statement may be used within any service entries in the
1469 Monit control file.
1470
1471 The syntax for the depend statement is simply:
1472
1473 DEPENDS on service[, service [,...]]
1474
1475 Where service is a check service entry name used in your ".monitrc"
1476 file, for instance apache or datafs.
1477
1478 You may add more than one service name of any type or use more than one
1479 depend statement in an entry.
1480
1481 Services specified in a depend statement will be checked during
1482 stop/start/monitor/unmonitor operations.
1483
1484 If a service is stopped or unmonitored, Monit will also stop/unmonitor
1485 any services that depend on it.
1486
1487 If a service is started, all services which it depends on will be
1488 started first. If the start of a prerequisite service fails, the
1489 dependent service will NOT be started, but Monit will remember that it
1490 should start and will retry on the next cycle.
1491
1492 If a service is restarted, it will first stop any active services that
1493 depend on it and after it is started, start all depending services that
1494 were active before the restart again.
1495
1496 Here is an example where we set up an apache service entry to depend on
1497 the underlying apache binary. If the binary should change an alert is
1498 sent and apache is not monitored anymore. The rationale is security and
1499 that Monit should not execute a possibly cracked apache binary.
1500
1501 (1) check process apache with pidfile "/var/run/httpd.pid"
1502 (2) depends on httpd
1503 (3) ...
1504 (4)
1505 (5) check file httpd with path /usr/bin/httpd
1506 (6) if failed checksum then stop
1507
1508 The first entry is the process entry for apache. The second line sets
1509 up a dependency between this entry and the service entry named httpd in
1510 line 5. A dependency tree works as follows: if an action is conducted
1511 in a lower branch it will propagate upward in the tree and for every
1512 dependent entry execute the same action. In this case, if the checksum
1513 should fail in line 6 then a stop action is executed and apache binary
1514 is not checked anymore. But since the apache process entry depends on
1515 the httpd entry this entry will also execute the stop action. In short,
1516 if the checksum test for the httpd binary file should fail, both the
1517 check file httpd and the check process apache entry are stopped.
1518
1519 A dependency tree is a general construct and can be used between all
1520 types of service entries and span many levels and propagate any
1521 supported action (except the exec action which will not propagate
1522 upward in a dependency tree for obvious reasons).
1523
1524 Here is another example. Consider the following common server
1525 setup:
1526
1527 WEB-SERVER -> APPLICATION-SERVER -> DATABASE -> FILESYSTEM
1528 (a) (b) (c) (d)
1529
1530 You can set dependencies so that the web-server depends on the
1531 application server to run before the web-server starts and the
1532 application server depends on the database server and the database
1533 depends on the filesystem to be mounted before it starts. See also the
1534 example section below for examples using the depend statement.
1535
1536 Here we describe how Monit will function with the above dependencies:
1537
1538 If no services are running
1539 Monit will start the servers in the following order: d, c, b, a
1540
1541 If all servers are running
1542 When you run 'monit stop all' this is the stop order: a, b, c, d.
1543 If you run 'monit stop d' then a, b and c are also stopped because
1544 they depend on d and finally d is stopped.
1545
1546 If a does not run
1547 Monit will start a
1548
1549 If b does not run
1550 Monit will first stop a then start b and finally start a if b is up
1551 again.
1552
1553 If c does not run
1554 Monit will first stop a and b then start c and finally start b then
1555 a.
1556
1557 If d does not run
1558 Monit will first stop a, b and c then start d and finally start c,
1559 b then a.
1560
1561 If the control file contains a depend loop
1562 A depend loop is, for example, a->b and b->a, or a->b->c->a.
1563
1564 When Monit starts it will check for such loops and complain and
1565 exit if a loop was found. It will also exit with a complaint if a
1566 depend statement was used that does not point to a service in the
1567 control file.
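The four-level chain discussed above can be written as the following sketch. All service names, device names, pidfiles and start/stop commands are hypothetical. Only the depends statements carry the structure.

```
# (d) filesystem
check filesystem datafs with path /dev/sdb1
    start program = "/bin/mount /data"
    stop program  = "/bin/umount /data"

# (c) database, needs the filesystem
check process database with pidfile /var/run/database.pid
    start program = "/etc/init.d/database start"
    stop program  = "/etc/init.d/database stop"
    depends on datafs

# (b) application server, needs the database
check process appserver with pidfile /var/run/appserver.pid
    start program = "/etc/init.d/appserver start"
    stop program  = "/etc/init.d/appserver stop"
    depends on database

# (a) web server, needs the application server
check process webserver with pidfile /var/run/webserver.pid
    start program = "/etc/init.d/webserver start"
    stop program  = "/etc/init.d/webserver stop"
    depends on appserver
```

With these entries, the start order described above would be datafs, database, appserver, webserver, and the stop order the reverse.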
1568
1570 LIMITS
1571 You can configure and set various limits to tweak buffer sizes and
1572 timeouts used by Monit. In most situations the default values are fine.
1573 If needed, below are the limits you can currently modify in Monit.
1574
1575 Syntax:
1576
1577 SET LIMITS {
1578 PROGRAMOUTPUT: <number> <unit>,
1579 SENDEXPECTBUFFER: <number> <unit>,
1580 FILECONTENTBUFFER: <number> <unit>,
1581 HTTPCONTENTBUFFER: <number> <unit>,
1582 NETWORKTIMEOUT: <number> <timeunit>
1583 PROGRAMTIMEOUT: <number> <timeunit>
1584 STOPTIMEOUT: <number> <timeunit>
1585 STARTTIMEOUT: <number> <timeunit>
1586 RESTARTTIMEOUT: <number> <timeunit>
1587 }
1588
1589 Where:
1590 unit is "B" (byte), "kB" (kilobyte) or "MB" (megabyte)
1591 timeunit is "MS" (millisecond) or "S" (second)
1592
1593 Options legend:
1594
1595 ----------------------------------------------------------------------------------
1596 | Option | Description | Default |
1597 ----------------------------------------------------------------------------------
1598 | programOutput | limit for check program output (truncated after) | 512 B |
1599 | sendExpectBuffer | limit for send/expect protocol test | 256 B |
1600 | fileContentBuffer | limit for file content test (line) | 512 B |
1601 | httpContentBuffer | limit for HTTP content test (response body) | 1 MB |
1602 | networkTimeout | timeout for network I/O | 5 s |
1603 | programTimeout | timeout for check program | 300 s |
1604 | stopTimeout | timeout for service stop | 30 s |
1605 | startTimeout | timeout for service start | 30 s |
1606 | restartTimeout | timeout for service restart | 30 s |
1607 ----------------------------------------------------------------------------------
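Since only the limits you want to change need to be listed, a typical override might look like this sketch. The values are arbitrary. The spelled-out unit "seconds" follows the TIMEOUT example earlier in this manual and is an assumption here, as the legend above lists "S".

```
set limits {
    programOutput:     2 kB,        # keep more output from check program
    httpContentBuffer: 2 MB,        # allow larger HTTP response bodies
    networkTimeout:    10 seconds   # more lenient network I/O timeout
}
```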
1608
1609 GENERAL SYNTAX
1610 Monit offers several if-tests you can use in a 'check' statement to
1611 test various aspects of a service.
1612
1613 You can test both for a predefined value or for a range and take
1614 actions if the value changes.
1615
1616 General syntax for testing a specific value or range:
1617
1618 IF <test> THEN <action> [ELSE IF SUCCEEDED THEN <action>]
1619
1620 The action is evaluated each time the <TEST> condition is true. Success
1621 action is optional and executed only when the state changes from
1622 failure to success. If success action is not set, Monit will send a
1623 recovery alert by default.
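As an illustration of the general form, the sketch below alerts on failure and runs a script on recovery. The host name and script path are hypothetical.

```
check host www with address www.example.org
    if failed port 80 protocol http
        then alert
        else if succeeded then exec "/usr/local/bin/site-recovered.sh"
```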
1624
1625 General syntax for a value change test:
1626
1627 IF CHANGED <test> THEN <action>
1628
1629 The action is executed each time the value changes. Monit will remember
1630 the new value and will trigger an event if the value changes again.
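A brief sketch of change tests on a configuration file follows. The service name and path are placeholders.

```
check file sshd_config with path /etc/ssh/sshd_config
    # Alert whenever the content or the modification time changes;
    # the new value becomes the reference for the next comparison
    if changed checksum then alert
    if changed timestamp then alert
```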
1631
1632 ACTION
1633 In each test you must select the action to be executed from this list:
1634
1635 · ALERT sends the user an alert event on each state change.
1636
1637 · RESTART restarts the service and sends an alert. Restart is
1638 performed by calling the service's registered restart method or by
1639 first calling the stop method followed by the start method if
1640 restart is not set.
1641
1642 · START starts the service by calling the service's registered start
1643 method and sends an alert.
1644
1645 · STOP stops the service by calling the service's registered stop
1646 method and sends an alert. If Monit stops a service it will not be
1647 checked by Monit anymore nor restarted again later. To reactivate
1648 monitoring of the service again you must explicitly enable
1649 monitoring from the web interface or from the console.
1650
1651 · EXEC can be used to execute an arbitrary program and send an alert.
1652 If you choose this action you must state the program to be executed
1653 and if the program requires arguments you must enclose the program
1654 and its arguments in a quoted string. You may optionally specify
1655 the uid and gid the executed program should switch to upon start.
1656 The program is executed only once if the test fails. You can enable
1657 execute repetition if the error persists for a given number of
1658 cycles. For instance:
1659
1660 if failed <test> then exec "/usr/local/bin/sms.sh"
1661 as uid "nobody" and gid "nobody"
1662 repeat every 5 cycles
1663
1664 Remember, if Monit is run by root, then all programs executed by
1665 Monit will be started with superuser privileges unless the uid and
1666 gid extension is used.
1667
1668 · UNMONITOR will disable monitoring of the service and send an alert.
1669 The service will not be checked by Monit anymore nor restarted
1670 again later. To reactivate monitoring of the service you must
1671 explicitly enable monitoring from the web interface or from the
1672 console.
1673
1674 FAULT TOLERANCE
1675 By default an action is executed if it matches and the corresponding
1676 service is set in an error state. However, you can require a test to
1677 fail more than once before the error event is triggered and the service
1678 state is changed to failed. This is useful to avoid getting alerts on
1679 spurious errors, which can happen, especially with network tests.
1680
1681 Syntax:
1682
1683 FOR <X> CYCLES ...
1684
1685 or:
1686
1687 <X> [TIMES WITHIN] <Y> CYCLES ...
1688
1689 The condition can be used both for failure and success action.
1690
1691 The first, simpler and recommended format requires "X" consecutive
1692 events before switching the state:
1693
1694 if failed
1695 port 80
1696 for 3 cycles
1697 then alert
1698
1699 The second format is more advanced and allows one to tolerate
1700 intermittent issues, but still catch excessive problems, where the
1701 service is flapping between error and success states frequently.
1702
1703 For example if every second cycle fails (1-0-1-0-1-0-...), then "for 2
1704 cycles" condition will never match, despite the service having
1705 problems. The following statement will catch such a state:
1706
1707 if failed
1708 port 80
1709 for 3 times within 5 cycles
1710 then alert
1711
1712 Example which sets multiple error levels and actions:
1713
1714 check filesystem rootfs with path /dev/hda1
1715 if space usage > 80% for 5 times within 15 cycles then alert
1716 if space usage > 90% for 5 cycles then exec '/try/to/free/the/space'
1717
1718 Note: the maximum value for cycles is 64.
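
Since the condition can be attached to both the failure and the success action, a sketch such as the following may help. It assumes the cycle condition is written directly after "succeeded", mirroring its position after the failure test; the service name and pidfile are illustrative.

```
# Illustrative only: verify the placement of the success condition
# against your Monit version before relying on it.
check process sshd with pidfile /var/run/sshd.pid
    # Enter the failed state only after 3 failures within 5 cycles.
    if failed port 22 for 3 times within 5 cycles then alert
    # Leave the failed state only after 2 consecutive successful cycles.
    else if succeeded for 2 cycles then alert
```

Requiring more than one successful cycle before recovery can reduce alert noise for a service that flaps between states.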
1719
1720 EXISTENCE TESTS
1721 This test allows one to trigger an action based on the existence of the
1722 monitored object. It is supported for process, file, directory, filesystem and
1723 fifo services.
1724
1725 If no existence test is defined, the implicit non-existence test with
1726 restart action is activated, so for example if the process stops, Monit
1727 will restart it.
1728
1729 There are two types of existence tests:
1730
1731 NON-EXIST
1732
1733 This test will trigger an action if the object does not exist. It can
1734 be used for example to make sure apache is running, data filesystem is
1735 mounted, etc.
1736
1737 IF [DOES] NOT EXIST THEN <action>
1738
1739 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1740 "UNMONITOR".
1741
1742 Example: Exec a script if a filesystem does NOT exist:
1743
1744 check filesystem disk1 with path /dev/sda1
1745 if does not exist then exec "/sbin/mount..."
1746
1747 EXIST
1748
1749 This test is the inverse of the non-existence test: it will trigger an
1750 action if the object DOES exist. It can be used for example to kill a
1751 process which shouldn't be running.
1752
1753 IF [DOES] EXIST THEN <action>
1754
1755 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1756 "UNMONITOR".
1757
1758 Example: kill a process that should not run:
1759
1760 check process vmware matching "vmware"
1761 if exist then exec "/usr/bin/pkill -9 vmware"
1762
1763 Example: Alert if a file exists which shouldn't:
1764
1765 check file x with path /some/path/x
1766 if exist then alert
1767
1768 RESOURCE TESTS
1769 Monit can examine how many resources a service is using. This test can
1770 only be used within a system or process service entry in the Monit
1771 control file.
1772
1773 Depending on system or process characteristics, services can be stopped
1774 or restarted and alerts can be generated. Thus it is possible to
1775 utilise systems which are idle and to spare systems under high load.
1776
1777 Syntax:
1778
1779 IF <resource> <operator> <value> THEN <action>
1780
1781 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
1782 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1783 "notequal" in human readable form (if not specified, default is EQUAL).
1784
1785 value is either an integer or a real number.
1786
1787 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1788 "UNMONITOR".
1789
1790 resource set depends on the service type:
1791
1792 System resource tests
1793
1794 LOADAVG([1min|5min|15min]) [PER CORE] refers to the system's load
1795 average. The load average is the number of processes in the system run
1796 queue per CPU core, averaged over the specified time period. Example:
1797
1798 if loadavg (1min) per core > 2 for 15 cycles then alert
1799 if loadavg (5min) per core > 1.5 for 10 cycles then alert
1800 if loadavg (15min) per core > 1 for 8 cycles then alert
1801
1802 If you omit the per core option, the test will check the total load
1803 average regardless of the CPU core count.
1804
1805 CPU([user|system|wait]) is the percentage of time the system spends in
1806 user space, kernel space or waiting for I/O. The modifier is optional; if
1807 not used, the total system cpu usage is tested. Example:
1808
1809 if cpu usage > 95% for 10 cycles then alert
1810
1811 MEMORY is the system memory usage [%] or absolute value [B, kB, MB,
1812 GB]. Example:
1813
1814 if memory usage > 75% for 5 cycles then alert
1815
1816 SWAP is the swap usage of the system [%] or absolute [B, kB, MB, GB].
1817 Example:
1818
1819 if swap usage > 20% for 10 cycles then alert
1820
1821 Process resource tests
1822
1823 CPU is the CPU usage of the process itself [%]. Monit calculates the
1824 CPU usage based on number of threads vs. available CPU cores. If the
1825 process has one thread, 100% CPU usage equals 100% utilization
1826 of one CPU core. If it has 2 threads, 100% CPU usage is reported when
1827 it uses 2 CPU cores at 100%, etc. If the process has more threads than
1828 the machine's available CPU cores, then 100% CPU usage corresponds
1829 to utilization of all available CPU cores. Example:
1830
1831 if cpu > 10% for 5 cycles then restart
1832
1833 TOTAL CPU is the total CPU usage of the process and its children
1834 [%]. You will typically want to use TOTAL CPU for services like
1835 Apache web server where one master process forks child processes as
1836 workers. Example:
1837
1838 if total cpu > 50% for 10 cycles then restart
1839
1840 THREADS is the number of threads the process has. Example:
1841
1842 if threads > 3 then alert
1843
1844 CHILDREN is the number of child processes of the process. Example:
1845
1846 if children > 10 then alert
1847
1848 MEMORY is the memory usage of the process itself, [%] or absolute value
1849 [B, kB, MB, GB]. Example:
1850
1851 if memory usage > 8 MB then alert
1852
1853 TOTAL MEMORY is the memory usage of the process and its child processes
1854 in either percent or as an amount [B, kB, MB, GB]. Example:
1855
1856 if total memory usage > 1% for 10 cycles then alert
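
The process resource tests above can be combined in a single service entry. The following is a sketch only: the service name, pidfile, init script paths and thresholds are illustrative assumptions, not recommended values.

```
# Illustrative only: names, paths and limits are assumptions.
check process myapp with pidfile /var/run/myapp.pid
    start program = "/etc/init.d/myapp start"
    stop program = "/etc/init.d/myapp stop"
    # Warn early on sustained CPU usage of the process itself.
    if cpu > 60% for 2 cycles then alert
    # Restart if the process and its children keep too much memory.
    if total memory usage > 200 MB for 5 cycles then restart
    # Restart if the number of child processes grows unexpectedly.
    if children > 250 then restart
```

Using a milder action (alert) at a lower threshold and a stronger one (restart) at a higher threshold gives you a warning before Monit intervenes.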
1857
1858 PROCESS DISK I/O TEST
1859 Monit can test a process' filesystem read and write activity. This test
1860 can only be used in the context of a process service type. Monit will
1861 normally need to run as the root user to access these metrics.
1862
1863 The OS usually supports the per-process I/O metrics by bytes or by
1864 operations.
1865
1866 Per-process I/O activity statistics by platform:
1867
1868 -----------------------------------
1869 | Platform | Operation | Byte |
1870 -----------------------------------
1871 | AIX | x | |
1872 | DragonFlyBSD | x | |
1873 | FreeBSD | x | |
1874 | Linux | | x |
1875 | MacOS | | x |
1876 | NetBSD | x | |
1877 | OpenBSD | x | |
1878 | Solaris | x | |
1879 -----------------------------------
1880
1881 Read: bytes per second
1882
1883 Syntax:
1884
1885 IF DISK READ [RATE] <operator> <number> <unit>/S THEN action
1886
1887 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
1888 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1889 "notequal" in human readable form (if not specified, default is EQUAL).
1890
1891 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
1892 "kilobyte", "megabyte", "gigabyte", "percent".
1893
1894 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1895 "UNMONITOR".
1896
1897 Example:
1898
1899 check process p...
1900 if disk read > 1 MB/s then alert
1901
1902 Read: operations per second
1903
1904 Syntax:
1905
1906 IF DISK READ <operator> <number> operations/S THEN action
1907
1908 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
1909 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1910 "notequal" in human readable form (if not specified, default is EQUAL).
1911
1912 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1913 "UNMONITOR".
1914
1915 Example:
1916
1917 check process p...
1918 if disk read rate > 500 operations/s then alert
1919
1920 Write: bytes per second
1921
1922 Syntax:
1923
1924 IF DISK WRITE <operator> <number> <unit>/S THEN action
1925
1926 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
1927 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1928 "notequal" in human readable form (if not specified, default is EQUAL).
1929
1930 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
1931 "kilobyte", "megabyte", "gigabyte", "percent".
1932
1933 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1934 "UNMONITOR".
1935
1936 Example:
1937
1938 check process p...
1939 if disk write rate > 1 MB/s then alert
1940
1941 Write: operations per second
1942
1943 Syntax:
1944
1945 IF DISK WRITE <operator> <number> operations/S THEN action
1946
1947 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
1948 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1949 "notequal" in human readable form (if not specified, default is EQUAL).
1950
1951 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1952 "UNMONITOR".
1953
1954 Example:
1955
1956 check process p...
1957 if disk write rate > 500 operations/s then alert
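
As a sketch, read and write tests can be used together in one process entry. On platforms that only report bytes (Linux and MacOS in the table above), use the byte-based form. The service name, pidfile and thresholds below are illustrative assumptions.

```
# Illustrative only: names, paths and limits are assumptions.
check process myapp with pidfile /var/run/myapp.pid
    # Byte-based tests, written as in the examples above.
    if disk read > 10 MB/s for 5 cycles then alert
    if disk write rate > 10 MB/s for 5 cycles then alert
```

Adding a cycle condition avoids alerts on short I/O bursts.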
1958
1959 FILE CHECKSUM TEST
1960 The checksum statement may only be used in a file service entry and can
1961 be used to check the file's MD5 or SHA1 checksum.
1962
1963 Check specific checksum:
1964
1965 IF FAILED [MD5|SHA1] CHECKSUM [EXPECT checksum] THEN action
1966
1967 Check any file changes:
1968
1969 IF CHANGED [MD5|SHA1] CHECKSUM THEN action
1970
1971 The choice of MD5 or SHA1 is optional. MD5 features a 128-bit checksum
1972 (32-character hex-encoded string) and SHA1 a 160-bit checksum (40-character
1973 hex-encoded string). If this option is omitted, Monit will try to guess
1974 the method from the EXPECT string or use MD5 as the default checksum.
1975
1976 "expect" is optional and if used, specifies the md5 or sha1 string
1977 Monit should expect when testing a file's checksum. Monit will then not
1978 compute an initial checksum for the file, but instead use the string
1979 you submit. For example:
1980
1981 if failed
1982 checksum expect 8f7f419955cefa0b33a2ba316cba3659
1983 then alert
1984
1985 You can, for example, use the GNU utility md5sum(1) or sha1sum(1) to
1986 create a checksum string for a file and use this string in the expect-
1987 statement.
1988
1989 Reloading a server if its configuration file was changed:
1990
1991 check file apache_conf with path /etc/apache/httpd.conf
1992 if changed checksum then exec "/usr/bin/apachectl graceful"
1993
1994 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1995 "UNMONITOR".
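
A sketch of an explicit SHA1 test follows. The 40-character string shown is the SHA1 checksum of empty input, used here only as a recognisable placeholder; the service name and path are hypothetical. In practice you would substitute the output of sha1sum(1) for your own file.

```
# Illustrative only: file name and path are assumptions.
# da39a3ee... is the SHA1 of empty input, shown as a placeholder.
check file emptyflag with path /var/run/emptyflag
    if failed
        sha1 checksum expect da39a3ee5e6b4b0d3255bfef95601890afd80709
    then alert
```

Stating SHA1 explicitly is optional, because Monit can guess the method from the length of the expect string.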
1996
1997 TIMESTAMP TEST
1998 The timestamp statement may only be used in a file, fifo or directory
1999 service entry.
2000
2001 Relative timestamp syntax:
2002
2003 IF <ACCESS TIME | ATIME | MODIFICATION TIME | MTIME | CHANGE TIME | CTIME | TIME[STAMP]> <operator> <value> [unit] THEN <action>
2004
2005 Timestamp change syntax:
2006
2007 IF CHANGED <ACCESS TIME | ATIME | MODIFICATION TIME | MTIME | CHANGE TIME | CTIME | TIME[STAMP]> THEN action
2008
2009 There are four timestamp test types:
2010
2011 ACCESS (ATIME)
2012 Test the timestamp which is updated whenever the object is
2013 accessed, for example the file is read. Filesystems usually
2014 allow one to disable atime updates using mount options, so
2015 this test will work only if the filesystem performs atime
2016 updates.
2017
2018 CHANGE (CTIME)
2019 Test the timestamp which is updated whenever the object
2020 metadata such as owner, group, permissions or hard link
2021 count are changed.
2022
2023 MODIFICATION (MTIME)
2024 Test the timestamp which is updated whenever the object
2025 content is modified. The file modification timestamp is
2026 updated whenever the file is truncated or written to. The
2027 directory modification timestamp is updated whenever some
2028 files/subdirectories were added to the directory or removed
2029 from that directory.
2030
2031 DEFAULT (LATEST OF CHANGE AND MODIFICATION TIMES)
2032 If no specific timestamp type is set, the latest of change
2033 and modification timestamps is checked. This test allows
2034 for simple testing of any object modification (data and
2035 metadata).
2036
2037 operator is a choice of "<", ">", "!=", "==" in C notation, "GT", "LT",
2038 "EQ", "NE" in shell sh notation and "NEWER", "OLDER", "GREATER", "LESS",
2039 "EQUAL", "NOTEQUAL" in human readable form (if not specified, default
2040 is EQUAL).
2041
2042 value is a time watermark.
2043
2044 unit is either "SECOND(S)", "MINUTE(S)", "HOUR(S)" or "DAY(S)".
2045
2046 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2047 "UNMONITOR".
2048
2049 For example to reload apache if the configuration file changed:
2050
2051 check file apache_conf with path /etc/apache/httpd.conf
2052 if changed timestamp then exec "/usr/bin/apachectl graceful"
2053
2054 For example to test directory for file addition or removal:
2055
2056 check directory bar path /foo/bar
2057 if changed timestamp then alert
2058
2059 Example for sending alert if a log file is not updated for more than 1
2060 hour:
2061
2062 if timestamp is older than 1 hour then alert
2063
2064 FILE SIZE TEST
2065 The size statement may only be used in a check file service entry. If
2066 specified in the control file, Monit will compute a size for a file.
2067
2068 Testing specific size or range:
2069
2070 IF SIZE [[operator] value [unit]] THEN action
2071
2072 Testing size changes:
2073
2074 IF CHANGED SIZE THEN action
2075
2076 operator is a choice of "<", ">", "!=", "==" in C notation, "GT", "LT",
2077 "EQ", "NE" in shell sh notation and "GREATER", "LESS", "EQUAL",
2078 "NOTEQUAL" in human readable form (if not specified, default is EQUAL).
2079
2080 value is a size watermark.
2081
2082 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
2083 "kilobyte", "megabyte", "gigabyte". If it is not specified, "byte" unit
2084 is assumed by default.
2085
2086 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2087 "UNMONITOR".
2088
2089 For example to send an alert if the file is too large:
2090
2091 check file mydb with path /data/mydatabase.db
2092 if size > 1 GB then alert
2093
2094 FILE CONTENT TEST
2095 The content statement can be used to incrementally test the content of
2096 a text file by using regular expressions.
2097
2098 Syntax:
2099
2100 IF CONTENT <operator> <regex|path> THEN action
2101
2102 operator is either a "=" for match or "!=" for no-match.
2103
2104 regex is a string containing the extended regular expression. See also
2105 regex(7).
2106
2107 path is an absolute path to a file containing extended regular
2108 expression on every line. See also regex(7).
2109
2110 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2111 "UNMONITOR".
2112
2113 On startup the read position is set to the end of the file and Monit
2114 continues to scan to the end of the file on each cycle.
2115
2116 If the file size decreases or the inode changes, the read position is
2117 set to the start of the file.
2118
2119 Only lines ending with a newline character are inspected.
2120
2121 By default only the first 511 characters of a line are inspected. You
2122 can increase the limit using the set limits statement.
2123
2124 IGNORE CONTENT <operator> <regex|path>
2125
2126 Lines matching an IGNORE are not inspected during later evaluations.
2127 IGNORE CONTENT always has precedence over IF CONTENT.
2128
2129 All IGNORE CONTENT statements are evaluated first, in the order of
2130 their appearance. Thereafter, all the IF CONTENT statements are
2131 evaluated.
2132
2133 For example:
2134
2135 check file syslog with path /var/log/syslog
2136 ignore content = "monit"
2137 if content = "^mrcoffee" then alert
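
To illustrate the precedence rule, the sketch below ignores lines that mention "monit" and alerts on lines that match an extended regular expression with two alternatives. The patterns are illustrative assumptions.

```
# Illustrative only: the patterns are assumptions.
check file syslog with path /var/log/syslog
    # Evaluated first: a line containing "monit" is skipped entirely,
    # even if it also contains "error" or "fatal".
    ignore content = "monit"
    # Evaluated afterwards on the remaining new lines.
    if content = "error|fatal" then alert
```

With this entry, a new line such as "monit: error reading file" would not raise an alert, because the IGNORE CONTENT statement matches it first.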
2138
2139 FILESYSTEM MOUNT FLAGS TEST
2140 Monit can test the filesystem mount flags for changes. This test is
2141 implicit and Monit will send an alert in case of failure by default.
2142
2143 This test is useful for detecting changes of filesystem flags, such as
2144 when the filesystem becomes read-only (on disk error) or mount flags are
2145 changed (such as nosuid).
2146
2147 The syntax for the fsflags statement is:
2148
2149 IF CHANGED FSFLAGS THEN action
2150
2151 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2152 "UNMONITOR".
2153
2154 Example:
2155
2156 check filesystem rootfs with path /
2157 if changed fsflags then exec "/my/script"
2158
2159 SPACE USAGE TEST
2160 Monit can test a filesystem or a disk for space usage. This test may
2161 only be used in the context of a filesystem service type.
2162
2163 Filesystems usually have some space reserved for the root user (ca.
2164 1-5%), so non-superusers cannot write to a nearly full filesystem. If
2165 you set a limit for the filesystem which is used by non-root users you
2166 might want to consider these reserved blocks when setting the limit.
2167 You can use Monit itself to view the reserved blocks percentage by
2168 using the CLI status command or the HTTP interface for the given
2169 filesystem.
2170
2171 Syntax:
2172
2173 IF SPACE operator value unit THEN action
2174
2175 or:
2176
2177 IF SPACE FREE operator value unit THEN action
2178
2179 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2180 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2181 "notequal" in human readable form (if not specified, default is EQUAL).
2182
2183 unit is a choice of "B","KB","MB","GB", "%" or long alternatives
2184 "byte", "kilobyte", "megabyte", "gigabyte", "percent".
2185
2186 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2187 "UNMONITOR".
2188
2189 Example:
2190
2191 check filesystem rootfs with path /
2192 if space usage > 90% then alert
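
The SPACE FREE form tests the remaining space instead of the used space. A sketch with an illustrative absolute threshold:

```
# Illustrative only: the threshold is an assumption.
check filesystem rootfs with path /
    # Alert when less than 1 GB remains, regardless of filesystem size.
    if space free < 1 GB then alert
```

An absolute free-space limit can be more meaningful than a percentage on very large filesystems, where a few percent may still be many gigabytes.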
2193
2194 INODE USAGE TEST
2195 Monit can test filesystem inode usage. This test may only be used in
2196 the context of a filesystem service type.
2197
2198 Syntax:
2199
2200 IF INODE(S) operator value [unit] THEN action
2201
2202 or:
2203
2204 IF INODE(S) FREE operator value [unit] THEN action
2205
2206 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2207 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2208 "notequal" in human readable form (if not specified, default is EQUAL).
2209
2210 unit is optional. If not specified, the value is an absolute count of
2211 inodes. You can use the "%" character or the longer alternative
2212 "percent" as a unit.
2213
2214 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2215 "UNMONITOR".
2216
2217 Example:
2218
2219 check filesystem rootfs with path /
2220 if inode usage > 90% then alert
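
The INODE(S) FREE form works the same way for inodes. In this sketch the value has no unit, so it is an absolute inode count; the threshold is an illustrative assumption.

```
# Illustrative only: the threshold is an assumption.
check filesystem rootfs with path /
    # No unit given, so 10000 is an absolute number of free inodes.
    if inodes free < 10000 then alert
```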
2221
2222 DISK I/O TEST
2223 Monit can test a filesystem's read and write activity. This test may only
2224 be used in the context of a filesystem service type.
2225
2226 The available I/O metrics depend on the platform and filesystem. Some
2227 platforms allow us to get I/O activity for a specific partition, others
2228 just for the whole disk. Some allow us to get metrics for network
2229 filesystems, others just for block devices.
2230
2231 Platforms I/O metrics granularity and filesystem support in Monit:
2232
2233 ---------------------------------------------------------------------------------------
2234 | Platform | Granularity | Supported filesystems | TBD |
2235 ---------------------------------------------------------------------------------------
2236 | AIX | per-disk | Disk io monitoring currently not supported | JFSx |
2237 | DragonFlyBSD | per-disk | UFS | HAMMER |
2238 | FreeBSD | per-disk | UFS | ZFS |
2239 | Linux | per-filesystem | EXTx, XFS, BTRFS, ZFS, NFS, CIFS | |
2240 | MacOS | per-disk | HFS | |
2241 | NetBSD | per-disk | FFS | NFS |
2242 | OpenBSD | per-disk | FFS | |
2243 | Solaris | per-filesystem | ZFS, UFS, NFS | |
2244 ---------------------------------------------------------------------------------------
2245
2246 Read: bytes per second
2247
2248 Syntax:
2249
2250 IF READ [RATE] <operator> <number> <unit>/S THEN action
2251
2252 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2253 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2254 "notequal" in human readable form (if not specified, default is EQUAL).
2255
2256 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
2257 "kilobyte", "megabyte", "gigabyte", "percent".
2258
2259 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2260 "UNMONITOR".
2261
2262 Example:
2263
2264 check filesystem disk1...
2265 if read rate > 1 MB/s then alert
2266
2267 Read: operations per second
2268
2269 Syntax:
2270
2271 IF READ [RATE] <operator> <number> operations/S THEN action
2272
2273 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2274 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2275 "notequal" in human readable form (if not specified, default is EQUAL).
2276
2277 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2278 "UNMONITOR".
2279
2280 Example:
2281
2282 check filesystem disk1...
2283 if read rate > 500 operations/s then alert
2284
2285 Write: bytes per second
2286
2287 Syntax:
2288
2289 IF WRITE [RATE] <operator> <number> <unit>/S THEN action
2290
2291 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2292 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2293 "notequal" in human readable form (if not specified, default is EQUAL).
2294
2295 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
2296 "kilobyte", "megabyte", "gigabyte", "percent".
2297
2298 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2299 "UNMONITOR".
2300
2301 Example:
2302
2303 check filesystem disk1...
2304 if write rate > 1 MB/s then alert
2305
2306 Write: operations per second
2307
2308 Syntax:
2309
2310 IF WRITE [RATE] <operator> <number> operations/S THEN action
2311
2312 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2313 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2314 "notequal" in human readable form (if not specified, default is EQUAL).
2315
2316 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2317 "UNMONITOR".
2318
2319 Example:
2320
2321 check filesystem disk1...
2322 if write rate > 500 operations/s then alert
2323
2324 Service time per operation
2325
2326 Service Time is the time taken to complete a read or a write operation.
2327 This is a fairly important metric. If it grows, it means that the disk
2328 is not able to handle the operations fast enough. Growth charts are
2329 available in M/Monit.
2330
2331 Syntax:
2332
2333 IF SERVICE TIME <operator> <number> <unit> THEN action
2334
2335 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2336 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2337 "notequal" in human readable form (if not specified, default is EQUAL).
2338
2339 unit is "MS" (millisecond) or "S" (second)
2340
2341 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2342 "UNMONITOR".
2343
2344 Example:
2345
2346 if service time > 10 milliseconds
2347 for 3 times within 5 cycles
2348 then alert
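
The filesystem I/O tests can be combined in one entry. The following sketch reuses the disk1 service from the existence test example; the thresholds are illustrative assumptions and depend on your hardware.

```
# Illustrative only: device path and limits are assumptions.
check filesystem disk1 with path /dev/sda1
    # Throughput in bytes per second.
    if read rate > 50 MB/s for 5 cycles then alert
    # Throughput in operations per second.
    if write rate > 500 operations/s for 5 cycles then alert
    # Latency: tolerate occasional slow cycles, catch repeated ones.
    if service time > 10 milliseconds
        for 3 times within 5 cycles
    then alert
```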
2349
2350 PERMISSION TEST
2351 Monit can test the permissions of file objects. This test may only be
2352 used in the context of a file, fifo, directory or filesystem service
2353 types.
2354
2355 Syntax for testing specific permissions:
2356
2357 IF FAILED PERM(ISSION) octalnumber THEN action
2358
2359 Syntax for testing any permission change:
2360
2361 IF CHANGED PERM(ISSION) THEN action
2362
2363 octalnumber defines permissions for a file, a directory or a filesystem
2364 as four octal digits (0-7). Valid range is 0000 - 7777 (you can omit
2365 the leading zeros, Monit will add the zeros to the left. For example,
2366 "640" is a valid value and matches "0640").
2367
2368 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2369 "UNMONITOR".
2370
2371 Example:
2372
2373 check file shadow with path /etc/shadow
2374 if failed permission 0640 then alert
2375
2376 UID TEST
2377 Monit can monitor the owner user id (uid) of a file, fifo, directory or
2378 owner and effective user of a process.
2379
2380 Syntax:
2381
2382 IF FAILED [E]UID <value> THEN action
2383
2384 value defines a user id either in numeric or in string form.
2385
2386 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2387 "UNMONITOR".
2388
2389 Example:
2390
2391 check file shadow with path /etc/shadow
2392 if failed uid "root" then alert
2393
2394 GID TEST
2395 Monit can monitor the owner group id (gid) of a file, fifo, directory
2396 or process.
2397
2398 Syntax:
2399
2400 IF FAILED GID <value> THEN action
2401
2402 value defines a group id either in numeric or in string form.
2403
2404 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2405 "UNMONITOR".
2406
2407 Example:
2408
2409 check file shadow with path /etc/shadow
2410 if failed gid "shadow" then alert
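
The permission, uid and gid tests are often used together to guard a sensitive file. A sketch combining the three examples above with a change test:

```
# Combines the examples above; values are those used in this manual.
check file shadow with path /etc/shadow
    if failed permission 0640 then alert
    if failed uid "root" then alert
    if failed gid "shadow" then alert
    # Also report any later permission change.
    if changed permission then alert
```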
2411
2412 PID TEST
2413 Monit can test the process' PID. This test is implicit and Monit will
2414 send an alert in case the PID changed outside of Monit's control.
2415
2416 Syntax:
2417
2418 IF CHANGED PID THEN action
2419
2420 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2421 "UNMONITOR".
2422
2423 This test is useful to detect possible process restarts which have
2424 occurred in the timeframe between two Monit testing cycles.
2425
2426 For example if someone changes the sshd configuration and restarts sshd
2427 outside of Monit's control you will be notified that the process was
2428 replaced by a new instance:
2429
2430 check process sshd with pidfile /var/run/sshd.pid
2431 if changed pid then alert
2432
2433 PPID TEST
2434 Monit can test the process' parent PID (PPID) for changes. This test is
2435 implicit and Monit will send an alert in the case that the PPID changed
2436 outside of Monit's control.
2437
2438 The syntax for the ppid statement is:
2439
2440 IF CHANGED PPID THEN action
2441
2442 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2443 "UNMONITOR".
2444
2445 Example:
2446
2447 check process myproc with pidfile /var/run/myproc.pid
2448 if changed ppid then exec "/my/script"
2449
2450 UPTIME TEST
2451 The uptime statement may only be used in a process and system service
2452 type context.
2453
2454 Syntax:
2455
2456 IF UPTIME [[operator] value [unit]] THEN action
2457
2458 operator is a choice of "<", ">", "!=", "==" in C notation, "GT", "LT",
2459 "EQ", "NE" in shell sh notation and "GREATER", "LESS", "EQUAL",
2460 "NOTEQUAL" in human readable form (if not specified, default is EQUAL).
2461
2462 value is an uptime watermark.
2463
2464 unit is either "SECOND", "MINUTE", "HOUR" or "DAY" (it is also possible
2465 to use "SECONDS", "MINUTES", "HOURS", or "DAYS").
2466
2467 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2468 "UNMONITOR".
2469
2470 Example of restarting the process every three days:
2471
2472 check process myapp with pidfile /var/run/myapp.pid
2473 start program = "/etc/init.d/myapp start"
2474 stop program = "/etc/init.d/myapp stop"
2475 if uptime > 3 days then restart
2476
2477 SECURITY ATTRIBUTE TEST
2478 The security attribute statement may only be used in a process context.
2479
2480 Syntax:
2481
2482 IF FAILED SECURITY ATTRIBUTE <string> THEN <action>
2483
2484 string is the expected security attribute value.
2485
2486 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2487 "UNMONITOR".
2488
2489 Example for SELinux:
2490
2491 check process ntpd matching "ntpd"
2492 if failed security attribute "system_u:system_r:ntpd_t:s0" then alert
2493
2494 Example for AppArmor:
2495
2496 check process ntpd matching "ntpd"
2497 if failed security attribute "/usr/sbin/ntpd (enforce)" then alert
2498
2499 PROGRAM STATUS TEST
2500 You can check the exit status of a program or a script. This test may
2501 only be used within a check program service entry in the Monit control
2502 file.
2503
2504 Syntax for testing specific exit value:
2505
2506 IF STATUS operator value THEN action
2507
2508 Syntax for testing any exit value change:
2509
2510 IF CHANGED STATUS THEN action
2511
2512 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2513 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2514 "notequal" in human readable form (if not specified, default is EQUAL).
2515
2516 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2517 "UNMONITOR".
2518
2519 Example:
2520
2521 check program myscript with path /usr/local/bin/myscript.sh
2522 if status != 0 then alert
2523
2524 Sample script for the above example (/usr/local/bin/myscript.sh):
2525
2526 #!/bin/sh
2527 echo test
2528 exit $?
2529
2530 You can also send parameters with the program:
2531
2532 check program list-files with path "/bin/ls -lrt /tmp/"
2533 if status != 0 then alert
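
The change form of the status test has no example above, so here is a minimal sketch. It reuses the myscript service from the first example; the cycle count is an illustrative assumption.

```
# Illustrative only: the cycle count is an assumption.
check program myscript with path /usr/local/bin/myscript.sh
    # Alert only if the script fails in 2 consecutive cycles.
    if status != 0 for 2 cycles then alert
    # Alert whenever the exit value differs from the previous run.
    if changed status then alert
```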
2534
2535 Arguments to the program or script are a sequence of whitespace
2536 separated strings. In the above example the strings '-lrt' and '/tmp/'
2537 are arguments to the program '/bin/ls'. If arguments are used, it is
2538 recommended to use quotes " to enclose the string, otherwise, if no
2539 arguments are used, quotes are not needed.
2540
2541 Notes: If the program is a script, the interpreter is required in the
2542 first line. The program or script must also be executable.
2543
2544 If Monit is run as the super user, you can optionally run the program
2545 as a different user and/or group. In this example we run the ls program
2546 as user www and as group staff:
2547
2548 check program ls with path "/bin/ls /tmp" as uid "www"
2549 and gid "staff"
2550 if status != 0 then alert
2551
2552 Monit will execute the program periodically and if the exit status of
2553 the program does not match the expected result, Monit can perform an
2554 action. In the example above, Monit will raise an alert if the exit
2555 value is different from 0. By convention, 0 means the program exited
2556 normally.
2557
Program checks are asynchronous: Monit does not wait for the program
to exit. Instead, it starts the program in the background and
immediately continues checking the next service entry in monitrc. At
the next cycle, Monit checks whether the program has finished and, if
so, collects the program's exit status. If the status indicates a
failure, Monit raises an alert message containing the program's error
(stderr) output, if any. If the program has not exited after the first
cycle, Monit waits another cycle, and so on. If the program is still
running after 5 minutes, Monit kills it and generates a program
timeout event. The default timeout can be overridden (see the WITH
TIMEOUT example below).
2569
The asynchronous nature of the program check allows for non-blocking
behaviour in the current Monit design, but it comes with a side-effect:
when the program has finished executing and is waiting for Monit to
collect the result, it becomes a so-called "zombie" process. A zombie
process does not consume any system resources (only its PID remains in
use) and it stays under Monit's control: it is removed from the system
as soon as Monit collects the exit status. This means that every
"check program" will be associated with either a running process or a
temporary zombie. This unwanted zombie side-effect will be removed in a
later release of Monit.
2580
2581 Multiple status tests can be used, for example:
2582
2583 check program hwtest with path /usr/local/bin/hwtest.sh
2584 with timeout 500 seconds
2585 if status = 1 then alert
2586 if status = 3 for 5 cycles then exec "/usr/local/bin/emergency.sh"
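
The IF CHANGED STATUS form from the syntax above can be sketched in the
same way. This is a minimal sketch, assuming a hypothetical script
/usr/local/bin/raidstatus.sh whose exit value encodes the current state
of whatever it inspects:

```
# Hypothetical script path; any executable program or script works.
check program raidstatus with path /usr/local/bin/raidstatus.sh
    if changed status then alert
```

Here Monit raises an alert whenever the collected exit value differs
from the previous one, regardless of the actual values.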
2587
2588 NETWORK INTERFACE TESTS
2589 Monit can check network interfaces for:
2590
2591 Status
2592 Capacity
2593 Saturation
2594 Upload and download [bytes]
2595 Upload and download [packets]
2596
2597 Link status
2598
2599 You can check the network link state. This test may only be used within
2600 a check network service entry in the Monit control file.
2601
2602 Syntax:
2603
2604 IF FAILED LINK THEN action
2605
2606 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2607 "UNMONITOR".
2608
2609 The test will fail if the link/interface is down or link errors were
2610 detected.
2611
2612 Example:
2613
2614 check network eth0 with interface eth0
2615 if failed link then alert
2616
If a link fails, restarting the interface might help. You can add a
start and stop program so that Monit restarts the interface
automatically (substitute the relevant network commands for your
system):
2620
2621 check network eth0 with interface eth0
2622 start program = '/sbin/ipup eth0'
2623 stop program = '/sbin/ipdown eth0'
2624 if failed link then restart
2625
2626 Link capacity
2627
2628 You can check the network link mode capacity for changes. This test may
2629 only be used within a check network service entry in the Monit control
2630 file.
2631
2632 Syntax:
2633
2634 IF CHANGED LINK [CAPACITY] THEN action
2635
2636 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2637 "UNMONITOR".
2638
2639 The test will match if the link mode has changed (e.g. maximum speed
2640 dropped) or if the duplex mode has changed.
2641
NOTE: not all interface types allow for capacity monitoring. Pseudo
interfaces, such as the loopback device or VMware interfaces, do not
have a speed attribute.
2645
2646 Example:
2647
2648 check network eth0 with interface eth0
2649 if changed link capacity then alert
2650
2651 Link saturation
2652
2653 You can check the network link saturation. Monit then computes the link
2654 utilisation based on the current transfer rate vs. link capacity. This
2655 test may only be used within a check network service entry in the Monit
2656 control file.
2657
2658 Syntax:
2659
2660 IF SATURATION operator value% THEN action
2661
2662 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2663 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2664 "notequal" in human readable form (if not specified, default is EQUAL).
2665
2666 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2667 "UNMONITOR".
2668
NOTE: this test depends on the availability of the speed attribute and
not all interface types have this attribute. See the note in the Link
capacity test description.
2672
2673 Example:
2674
2675 check network eth0 with interface eth0
2676 if saturation > 90% then alert
2677
2678 Link upload and download [bytes]
2679
2680 You can check a network link upload and download bandwidth usage,
2681 current transfer speed and total data transferred in the last 24 hours.
2682 This test may only be used within a check network service entry in the
2683 Monit control file.
2684
2685 Upload speed test syntax (per second):
2686
2687 IF UPLOAD operator value unit/S THEN action
2688
2689 Download speed test syntax (per second):
2690
2691 IF DOWNLOAD operator value unit/S THEN action
2692
2693 Total upload data test syntax:
2694
2695 IF TOTAL UPLOADED operator value unit IN LAST number time-unit THEN action
2696
2697 Total download data test syntax:
2698
2699 IF TOTAL DOWNLOADED operator value unit IN LAST number time-unit THEN action
2700
2701 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2702 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2703 "notequal" in human readable form (if not specified, default is EQUAL).
2704
2705 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
2706 "kilobyte", "megabyte", "gigabyte".
2707
time-unit is a choice of "MINUTE(S)", "HOUR(S)", "DAY". NOTE: Monit
maintains a rolling count of total uploaded and downloaded bytes for
the last 24 hours only. The value of time-unit can therefore not
specify a range wider than one day.
2712
2713 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2714 "UNMONITOR".
2715
2716 Examples:
2717
2718 check network eth0 with interface eth0
2719 if upload > 500 kB/s then alert
2720 if total downloaded > 1 GB in last 2 hours then alert
2721 if total downloaded > 10 GB in last day then alert
2722
2723 Link upload and download [packets]
2724
You can check the network link upload and download packet count: the
current transfer rate and the total packets transferred in the last 24
hours. This test may only be used within a check network service entry
in the Monit control file.
2729
2730 Current upload bandwidth rate test syntax:
2731
2732 IF UPLOAD operator value PACKETS/S THEN action
2733
2734 Current download bandwidth rate test syntax:
2735
2736 IF DOWNLOAD operator value PACKETS/S THEN action
2737
2738 Total upload test syntax:
2739
2740 IF TOTAL UPLOADED operator value PACKETS IN LAST number time-unit THEN action
2741
2742 Total download test syntax:
2743
2744 IF TOTAL DOWNLOADED operator value PACKETS IN LAST number time-unit THEN action
2745
2746 operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
2747 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2748 "notequal" in human readable form (if not specified, default is EQUAL).
2749
time-unit is a choice of "MINUTE(S)", "HOUR(S)", "DAY". NOTE: Monit
keeps total upload/download statistics only for the last 24 hours. The
time-unit value can therefore not span more than one day.
2753
2754 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2755 "UNMONITOR".
2756
2757 Examples:
2758
2759 check network eth0 with interface eth0
2760 if upload > 1000 packets/s then alert
2761 if total uploaded > 900000 packets in last hour then alert
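
The individual interface tests above can be combined in a single check
network entry. This is a sketch that reuses the eth0 interface from the
examples. The thresholds are illustrative assumptions, not recommended
values:

```
check network eth0 with interface eth0
    if failed link then alert
    if changed link capacity then alert
    if saturation > 90% then alert
    if download > 10 MB/s then alert
    if total uploaded > 1 GB in last hour then alert
```

The capacity and saturation tests rely on the interface's speed
attribute, so they are best left out for pseudo interfaces that lack
it.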
2762
2763 NETWORK PING TEST
Monit can perform a network ping test by sending ICMP echo request
datagrams to a host and waiting for the reply. This test can only be
used within a check host statement. Monit must also run as the root
user to perform the ping test, because the test uses raw sockets,
which usually only the super user is allowed to open.
2770
2771 Syntax:
2772
2773 IF FAILED PING[4|6]
2774 [COUNT number]
2775 [SIZE number]
2776 [TIMEOUT number SECONDS]
2777 [ADDRESS string]
2778 THEN action
2779
If a DNS host name was used in the check host statement and the host
name resolves to several addresses (either IPv4 or IPv6), Monit will
ping the first available address and continue with the next address
until one connection succeeds or until there are no more addresses left
to try. You can force Monit to only ping IPv4 or IPv6 addresses by
using the PING4 or the PING6 keyword instead of PING.
2786
The COUNT parameter specifies the maximum number of consecutive ping
requests sent to the host in one cycle. The default value is 3.

The SIZE parameter specifies the ping request payload size. Default is
64 bytes, minimum is 8 bytes, maximum 1492 bytes.

If no reply arrives within TIMEOUT seconds, Monit reports an error. If
at least one reply was received, the ping test is considered a success.

The ADDRESS parameter specifies the source IP address.
2797
Monit will, by default, send up to three ping request packets in one
cycle to prevent false alarms (i.e. up to 66% packet loss is
tolerated). You can set the COUNT option to a value between 1 and 20 to
send more or fewer packets. If you require 100% ping success, set the
count to 1 (i.e. just one request will be sent, and if the packet is
lost an error will be reported).
2804
2805 Note that many ISPs have started to filter out ping or ICMP packets
2806 now, in which case there will be no reply from the host.
2807
2808 If a ping test is used in a check host entry, this test is run first
2809 and if the test should fail, we assume that the connection to the host
2810 is down and Monit will not continue with any subsequent port tests.
2811
2812 Example:
2813
2814 check host mmonit.com with address mmonit.com
2815 if failed ping then alert # IPv4 or IPv6
2816
2817 check host mmonit.com with address 62.109.39.247
2818 if failed ping then alert # Address is IPv4 so IPv4 is preferred
2819
or test that the system is explicitly accessible via IPv4 and IPv6:
2821
2822 check host mmonit.com with address mmonit.com
2823 if failed ping4 then alert # IPv4 only
2824 if failed ping6 then alert # IPv6 only
2825
or with more parameters: send five 128-byte pings to mmonit.com and
wait up to 10 seconds for a reply:
2828
2829 check host mmonit.com with address mmonit.com
2830 if failed ping count 5 size 128 with timeout 10 seconds then alert
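
As described above, COUNT 1 requires a reply to the single request
sent, and a failed ping test stops Monit from running the port tests of
the same check host entry. This is a sketch combining both. The
timeout value is an illustrative assumption:

```
check host mmonit.com with address mmonit.com
    # One request only: a single lost packet is reported as an error
    if failed ping count 1 with timeout 5 seconds then alert
    # Only attempted when the ping test above succeeded
    if failed port 80 protocol http then alert
```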
2831
2832 CONNECTION TESTS
2833 Monit can perform connection testing via network ports or via Unix
2834 sockets. A connection test may only be used within a process or host
2835 service type context.
2836
2837 If a service listens on one or more sockets, Monit can connect to the
2838 port (using TCP or UDP) and verify that the service will accept a
2839 connection and that it is possible to write and read from the socket.
2840 If a connection is not accepted or if there is a problem with socket
2841 I/O, Monit will execute a specified action.
2842
2843 TCP/UDP port test syntax:
2844
2845 IF FAILED
2846 [HOST string]
2847 <PORT number>
2848 [ADDRESS string]
2849 [IPV4 | IPV6]
2850 [TYPE <TCP|UDP>]
[<SSL|TLS> [with options {...}]]
2852 [CERTIFICATE CHECKSUM [MD5|SHA1] string]
2853 [CERTIFICATE VALID for number DAYS]
2854 [PROTOCOL protocol | <SEND|EXPECT> "string",...]
2855 [TIMEOUT number SECONDS]
2856 [RETRY number]
2857 THEN action
2858
2859 Unix socket test syntax:
2860
2861 IF FAILED
2862 <UNIXSOCKET path>
2863 [TYPE <TCP|UDP>]
2864 [PROTOCOL protocol | <SEND|EXPECT> "string",...]
2865 [TIMEOUT number SECONDS]
2866 [RETRY number]
2867 THEN action
2868
2869 Examples:
2870
2871 if failed port 80 then alert
2872
2873 if failed port 53 type udp protocol dns then alert
2874
2875 if failed unixsocket /var/run/sophie then alert
2876
2877 Options:
2878
2879 HOST hostname. Optionally specify the host to connect to. If the host
2880 is not given then localhost is assumed if this test is used inside a
2881 process entry. If this test is used inside a remote host entry then the
2882 entry's remote host is assumed.
2883
PORT number. The port number to connect to.
2885
2886 UNIXSOCKET path. Specifies the path to a Unix socket (local machine
2887 only).
2888
2889 ADDRESS string. The source IP address to use.
2890
IPV4 | IPV6. Optionally specify the IP version Monit should use when
trying to connect to the port. If not used, Monit will try to connect
to the first available address (IPv4 or IPv6). If multiple addresses
are available and connection to one address failed, Monit will try the
next address and so on until a connection succeeds or until there are
no more addresses left to try.

TYPE [TCP | UDP]. Optionally specify the socket type Monit should use
when trying to connect to the port. The socket type is either TCP, a
regular stream-based socket, or UDP, a datagram socket. The default
socket type is TCP.
2902
2903 [SSL | TLS] [with options {...}]. Set SSL/TLS options and override
2904 global/default SSL options. You can set the SSL/TLS version to use,
2905 whether to verify certificates, trust self-signed certificates or set
2906 the SSL client certificates database-file for client certificate
2907 authentication.
2908
2909 CERTIFICATE CHECKSUM [MD5|SHA1] hash. Verify the SSL server certificate
2910 by checking its checksum. You can use either MD5 or SHA1 checksum (if
2911 you don't specify the type, Monit will determine the digest based on
2912 the hash length). You can use the openssl command line tool to get the
2913 checksum value for your certificate, which you can then use in Monit's
2914 control file:
2915
2916 openssl x509 -fingerprint -sha1 -in server.crt | head -1 | cut -f2 -d'='
2917
2918 Example:
2919
2920 if failed
2921 port 443
2922 protocol https
2923 and certificate checksum = "1ED948A6F4258ACAB964227EF4EB19FCC453B0F8"
2924 then alert
2925
CERTIFICATE VALID for number DAYS. Send an alert if the certificate
will expire within the given number of days. This test is useful for
getting a notification when it is time to renew your SSL certificate.
2929
2930 Example:
2931
2932 if failed
2933 port 443
2934 protocol https
2935 and certificate valid > 30 days
2936 then alert
2937
2938 PROTOCOL protocol. Optionally specify the protocol Monit should speak
2939 when a connection is established. At the moment Monit knows how to
2940 speak:
2941 APACHE-STATUS
2942 DNS
2943 DWP
2944 FAIL2BAN
2945 FTP
2946 GPS
2947 HTTP
2948 HTTPS
2949 IMAP
2950 IMAPS
2951 CLAMAV
2952 LDAP2
2953 LDAP3
2954 LMTP
2955 MEMCACHE
2956 MONGODB
2957 MQTT
2958 MYSQL
2959 NNTP
2960 NTP3
2961 PGSQL
2962 POP
2963 POPS
2964 POSTFIX-POLICY
2965 RADIUS
2966 RDATE
2967 REDIS
2968 RSYNC
2969 SIEVE
2970 SIP
2971 SMTP
2972 SMTPS
2973 SPAMASSASSIN
2974 SSH
2975 TNS
2976 WEBSOCKET
2977
2978 If the target server's protocol is not found in this list, simply do
2979 not specify the protocol and Monit will use a default connection test.
2980
2981 TIMEOUT number SECONDS. Optionally specifies the connect and read
2982 timeout for the connection. If Monit cannot connect to the server
2983 within this time it will assume that the connection failed and execute
2984 the specified action. The default connect timeout is 5 seconds.
2985
RETRY number. Optionally specifies the number of consecutive retries
within the same testing cycle in case the connection failed. The
default is to fail on the first error.
2989
2990 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2991 "UNMONITOR".
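
Several of the options above can be combined in one test. This is a
sketch, assuming a hypothetical host entry named myserver at
192.168.1.10. The timeout and retry values are illustrative
assumptions:

```
check host myserver with address 192.168.1.10
    # IPv4 only, TCP (the default type), up to 3 retries per cycle
    if failed port 443 ipv4 type tcp protocol https
       with timeout 15 seconds and retry 3
    then alert
```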
2992
2993 Specific protocol test options
2994
2995 GENERIC (SEND/EXPECT)
2996
2997 If Monit does not support the protocol spoken by the server, you can
2998 write your own protocol-test using send and expect strings. The SEND
2999 statement sends a string to the server port and the EXPECT statement
3000 compares a string read from the server with the string given in the
3001 expect statement.
3002
3003 Syntax:
3004
3005 [<SEND|EXPECT> "string"]+
3006
Monit will send a string as it is, and you must remember to include CR
and LF in the string sent to the server if the protocol expects such
characters to terminate a string (most text-based protocols used over
the Internet do).
3011
3012 Monit will by default read up to 255 bytes from the server and use this
3013 string when comparing the EXPECT string. You can override the default
3014 value using the set limits statement.
3015
3016 You can use non-printable characters in a SEND string if needed. Use
3017 the hex notation, \0xHEXHEX to send any char in the range \0x00-\0xFF,
3018 that is, 0-255 in decimal. For example, to test a Quake 3 server:
3019
3020 send "\0xFF\0xFF\0xFF\0xFFgetstatus"
3021 expect "sv_floodProtect|sv_maxPing"
3022
3023 If your system supports POSIX regular expressions, you can use regular
3024 expressions in the EXPECT string, see regex(7) to learn more about the
3025 types of regular expressions you can use in an expect string.
3026
Since both regex and string comparison operate on a zero-terminated
string, you cannot test for '\0' in an EXPECT buffer, because this
character marks the end of the buffer. However, we escape '\0' in the
expect buffer as "\0", which you can test for. That is, '\' followed by
the ASCII character '0'. For instance, here is how to test for an
expect string that starts with zero followed by any number of
characters:
3033
3034 expect "^[\\]0.*"
3035
3036 Here is a simple SMTP protocol example:
3037
3038 if failed
3039 port 25 and
3040 expect "^220.*"
3041 send "HELO localhost.localdomain\r\n"
3042 expect "^250.*"
3043 send "QUIT\r\n"
3044 then alert
3045
3046 SEND/EXPECT can be used with any socket type, such as TCP sockets, UNIX
3047 sockets and UDP sockets.
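
Here is a further sketch of a SEND/EXPECT test. It assumes a server on
port 11211 that speaks the memcached text protocol, where the command
"version" is answered with a line starting with "VERSION". Monit also
has a built-in MEMCACHE protocol test, so this only illustrates the
generic mechanism:

```
if failed
    port 11211
    send "version\r\n"
    expect "^VERSION .*"
then alert
```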
3048
3049 HTTP
3050
3051 Syntax:
3052
3053 PROTO(COL) HTTP
3054 [USERNAME "string"]
3055 [PASSWORD "string"]
3056 [REQUEST "string"]
3057 [METHOD <GET|HEAD>]
3058 [STATUS operator number]
3059 [CHECKSUM checksum]
3060 [HTTP HEADERS list of headers]
3061 [CONTENT < "=" | "!=" > STRING]
3062
3063 USERNAME is an optional username for Basic authentication
3064
3065 PASSWORD is an optional password for Basic authentication
3066
The REQUEST option sets a URL string specifying a document on the HTTP
server. If the request statement isn't specified, the default "/" page
will be requested.
3070
3071 For example:
3072
3073 if failed
3074 port 80
3075 protocol http
3076 request "/data/show?a=b&c=d"
3077 then restart
3078
METHOD sets the HTTP request method. If not specified, Monit prefers
the HTTP GET request method, which is more common than the HEAD method.
You may want to set the method explicitly to HEAD to save network
bandwidth.
3083
3084 STATUS option can be used to explicitly test the HTTP status code
3085 returned by the HTTP server. If not used, the HTTP protocol test will
3086 fail if the status code returned is greater than or equal to 400. You
3087 can override this behaviour by using the status qualifier.
3088
3089 For example to test that a page does not exist (the HTTP server should
3090 return 404 in this case):
3091
3092 if failed
3093 port 80
3094 protocol http
3095 request "/non/existent.php"
3096 status = 404
3097 then alert
3098
CHECKSUM You can test the checksum of documents returned by an HTTP
server. Either an MD5 or a SHA1 hash can be used. Monit will not test
the checksum for a document if the server does not set the HTTP
Content-Length header. An HTTP server should set this header when it
serves a static document (i.e. a file). There is no limit on the
document size, but keep in mind that Monit needs time to download the
document over the network to compute the checksum.
3106
3107 Example:
3108
3109 if failed
3110 port 80
3111 protocol http
3112 request "/page.html"
3113 checksum 8f7f419955cefa0b33a2ba316cba3659
3114 then alert
3115
3116 HTTP HEADERS can be used to send a list of HTTP headers when using the
3117 HTTP protocol test. For instance, the host header. If the host header
3118 is not set, Monit will use the hostname or IP-address of the host as
3119 specified in the check host statement. Specifying a host header is
3120 useful if you want to connect to and test a name-based virtual host.
3121 The syntax for setting HTTP headers is
3122
3123 http headers [name:value, name:value,..]
3124
3125 where each name:value pair is separated with ','. If you need to use
3126 ':' in the value string, for instance to set port number for a host
3127 header, you must enclose the value in quotes. For example,
3128
3129 http headers [Host: "mmonit.com:443"]
3130
3131 In a check host context, using this statement might look like
3132
3133 check host mmonit.com with address mmonit.com
3134 if failed
3135 port 80 protocol http
3136 with http headers [Host: mmonit.com, Cache-Control: no-cache,
3137 Cookie: csrftoken=nj1bI3CnMCaiNv4beqo8ZaCfAQQvpgLH]
3138 and request /monit/ with content = "Monit [0-9.]+"
3139 then alert
3140
3141 Setting HTTP headers is associated with the HTTP protocol test and must
3142 come before request as in the example above.
3143
3144 The CONTENT option sets the pattern which is expected in the data
3145 returned by the server. If the pattern doesn't match, the test fails.
3146 In the example above, if the server does not return a page with the
3147 name Monit followed by a version number the test will fail.
3148
By default, at most 1 MB of content is inspected. You can increase
this limit using the set limits statement.
3151
3152 For example:
3153
3154 if failed
3155 port 80
3156 protocol http
3157 content = "foobar [0-9.]+"
3158 then alert
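
The USERNAME, PASSWORD, METHOD and STATUS options can be combined in
one statement. This is a sketch, assuming a hypothetical host
intranet.example.com with a page /health protected by Basic
authentication. The credentials are placeholders:

```
check host intranet with address intranet.example.com
    if failed
        port 80
        protocol http
        username "monit"
        password "secret"
        request "/health"
        method head
        status = 200
    then alert
```

A HEAD response carries no body, so a CONTENT or CHECKSUM test would
have nothing to inspect in the same statement.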
3159
3160 APACHE-STATUS
3161
3162 The APACHE-STATUS test allows one to check server performance by
3163 examination of the status page generated by Apache's mod_status, which
3164 is expected to be at its default address of
3165 http://www.example.com/server-status.
3166
3167 Syntax:
3168
3169 PROTOCOL APACHE-STATUS [PATH <path>] [USERNAME <string>] [PASSWORD <string>] [<property> <operator> <number>]+
3170
3171 PATH is an optional path to apache status ("/server-status" by default)
3172
3173 USERNAME is an optional username for Basic authentication
3174
3175 PASSWORD is an optional password for Basic authentication
3176
property is an acronym for a child status:
3178
3179 (1) logging (loglimit)
3180 (2) closing connections (closelimit)
3181 (3) performing DNS lookups (dnslimit)
3182 (4) in keepalive with a client (keepalivelimit)
3183 (5) replying to a client (replylimit)
3184 (6) receiving a request (requestlimit)
3185 (7) initialising (startlimit)
3186 (8) waiting for incoming connections (waitlimit)
3187 (9) gracefully closing down (gracefullimit)
3188 (10) performing cleanup procedures (cleanuplimit)
3189
3190 operator is one of "<", "=", ">".
3191
number is a percentage limit.
3193
3194 Each of these limits can be compared against a value relative to the
3195 total number of active Apache child processes.
3196
3197 You can combine all of these tests into one expression or you can
3198 choose to test a certain limit only. If you combine the limits you must
3199 connect them together using the OR keyword.
3200
3201 Example:
3202
3203 if failed port 80 protocol apache-status
3204 loglimit > 10% or
3205 dnslimit > 50% or
3206 waitlimit < 20%
3207 then alert
3208
3209 MQTT
3210
3211 Syntax:
3212
3213 PROTOCOL MQTT [USERNAME string PASSWORD string]
3214
3215 USERNAME MQTT username
3216
3217 PASSWORD MQTT password
3218
3219 Username and password (credentials) are optional.
3220
3221 Example:
3222
3223 check process mosquitto with pidfile /var/run/mosquitto.pid
3224 start program = "/sbin/start mosquitto"
3225 stop program = "/sbin/stop mosquitto"
3226 if failed port 1883 protocol mqtt then alert
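
With credentials, the same test might look like this. The username and
password are placeholders:

```
check process mosquitto with pidfile /var/run/mosquitto.pid
    start program = "/sbin/start mosquitto"
    stop program = "/sbin/stop mosquitto"
    if failed
        port 1883
        protocol mqtt username "monit" password "secret"
    then alert
```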
3227
3228 MYSQL
3229
3230 Syntax:
3231
3232 PROTOCOL MYSQL [USERNAME string PASSWORD string]
3233
3234 USERNAME MySQL username (maximum 16 characters).
3235
3236 PASSWORD MySQL password (special characters can be used, but for non-
3237 alphanumerics the password has to be quoted).
3238
3239 Username and password (credentials) are optional and if not set, Monit
3240 will perform the test using anonymous login. This can cause an
3241 authentication error to be logged in your MySQL log, depending on your
3242 MySQL configuration.
3243
If credentials are set, Monit will log in and perform a MySQL ping
test. Monit does not require any database privileges; it just needs
the database user. You might want to create a standalone user for Monit
to use when testing, for example:
3248
3249 CREATE USER 'monit'@'host_from_which_monit_performs_testing' IDENTIFIED BY 'mysecretpassword';
3250 FLUSH PRIVILEGES;
3251
3252 Example:
3253
3254 check process mysql with pidfile /var/run/mysqld/mysqld.pid
3255 start program = "/sbin/start mysql"
3256 stop program = "/sbin/stop mysql"
3257 if failed
3258 port 3306
3259 protocol mysql username "foo" password "bar"
3260 then alert
3261
or, connecting via a Unix socket (with different start/stop commands):
3263
3264 check process mysql with pidfile /var/run/mysqld/mysqld.pid
3265 start program = "/usr/local/mysql/support-files/mysql.server start"
3266 stop program = "/usr/local/mysql/support-files/mysql.server stop"
3267 if failed
3268 unixsocket /tmp/mysql.sock
3269 protocol mysql username "foo" password "bar"
3270 then alert
3271
3272 RADIUS
3273
3274 Syntax:
3275
3276 PROTOCOL RADIUS [SECRET string]
3277
3278 SECRET you may specify an alternative secret, default is "testing123".
3279
3280 For example:
3281
3282 check process radiusd with pidfile /var/run/radiusd.pid
3283 start program = "/etc/init.d/freeradius start"
3284 stop program = "/etc/init.d/freeradius stop"
3285 if failed
3286 host 127.0.0.1 port 1812 type udp protocol radius
3287 secret pingpong
3288 then alert
3289
3290 SIP
3291
3292 The SIP protocol is used by communication platform servers such as
3293 Asterisk and FreeSWITCH.
3294
3295 Syntax:
3296
3297 PROTOCOL SIP [TARGET valid@uri] [MAXFORWARD n]
3298
3299 TARGET you may specify an alternative recipient for the message, by
3300 adding a valid sip uri after this keyword.
3301
MAXFORWARD Limit the number of proxies or gateways that can forward the
request to the next server. Its value is an integer in the range
0-255, set by default to 70. If max-forward = 0, the next server may
respond 200 OK (test succeeded) or send a 483 Too Many Hops (test
failed).
3307
3308 For example:
3309
3310 check host openser_all with address 127.0.0.1
3311 if failed
3312 port 5060 type udp protocol sip
3313 with target "localhost:5060" and maxforward 6
3314 then alert
3315
3316 SMTP
3317
3318 Syntax:
3319
3320 PROTOCOL SMTP[S] [USERNAME string PASSWORD string]
3321
3322 USERNAME SMTP username.
3323
3324 PASSWORD SMTP password (special characters can be used, but for non-
3325 alphanumerics the password has to be quoted).
3326
Credentials are optional. When set, Monit performs authentication
during the test, so you can verify that authentication also works. If
authentication is used, we recommend smtps so that the communication is
encrypted. If no credentials are set, Monit will just perform a basic
protocol test.
3332
3333 Example:
3334
3335 check process postfix with pidfile /var/spool/postfix/pid/master.pid
3336 start program = "/etc/init.d/postfix start"
3337 stop program = "/etc/init.d/postfix stop"
3338 if failed
3339 port 25
3340 protocol smtp
3341 then alert
3342
3343 Example using authentication and STARTTLS/SMTPS:
3344
3345 check process postfix with pidfile /var/spool/postfix/pid/master.pid
3346 start program = "/etc/init.d/postfix start"
3347 stop program = "/etc/init.d/postfix stop"
3348 if failed
3349 port 25
3350 protocol smtps
3351 username "foo"
3352 password "bar"
3353 then alert
3354
3355 WEBSOCKET
3356
3357 Syntax:
3358
3359 PROTOCOL WEBSOCKET
3360 [REQUEST string]
3361 [HOST string]
3362 [ORIGIN string]
3363 [VERSION number]
3364
3365 HOST you may specify an alternative Host header
3366
3367 REQUEST you may specify an alternative request, default is "/"
3368
3369 ORIGIN you may specify an alternative origin, default is
3370 "https://mmonit.com"
3371
3372 VERSION you may specify an alternative version, default is "0"
3373
3374 For example:
3375
3376 check host websocket.org with address "echo.websocket.org"
3377 if failed
3378 port 80 protocol websocket
3379 host "echo.websocket.org"
3380 request "/"
3381 origin 'http://websocket.com'
3382 version 13
3383 then alert
3384
3386 M/Monit <https://mmonit.com> expands on Monit's capabilities and
3387 provides monitoring and management of all your Monit enabled hosts.
3388
M/Monit uses Monit as an agent. At regular intervals, Monit sends a
status message to M/Monit with a snapshot of the host it is running on.

M/Monit presents the collected data in charts and event logs and gives
you the option to view key performance data of all your hosts in a
modern, clean and well-designed user interface which also works on
mobile devices.
3396
3397 From M/Monit, you can also start, stop and restart services on your
3398 hosts running Monit.
3399
3400 To send data to M/Monit, add the following statement to your Monit
3401 control file:
3402
3403 SET MMONIT <url>
3404 [TIMEOUT <number> SECONDS]
3405 [REGISTER WITHOUT CREDENTIALS]
3406
3407 Example:
3408
3409 set mmonit https://monit:monit@192.168.1.10:8443/collector
3410
3411 Monit will register itself in M/Monit and will start sending status and
3412 event messages to M/Monit. We recommend using https as in the example
3413 above to ensure that the communication between Monit and M/Monit is
3414 secure.
3415
3416 The password should be URL encoded if it contains URL-significant
3417 characters like ":", "?", "@".
3418
The default timeout is 5 seconds. You can customise the timeout using
the TIMEOUT option.
3421
3422 When Monit registers itself in M/Monit it sends credentials that can be
3423 used to perform service actions from M/Monit. You can disable sending
3424 credentials by using REGISTER WITHOUT CREDENTIALS and instead manually
3425 add credentials in M/Monit.
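
This is a sketch combining the options above, reusing the collector
address from the example. The password is a placeholder chosen to show
URL encoding, and the timeout value is an illustrative assumption:

```
# Placeholder password "s3cret@1": the "@" is URL-encoded as "%40"
set mmonit https://monit:s3cret%401@192.168.1.10:8443/collector
    with timeout 30 seconds
    and register without credentials
```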
3426
3428 The simplest form is just the check statement. In this example we check
3429 to see if our web server is running and raise an alert if not:
3430
3431 check process nginx with pidfile /var/run/nginx.pid
3432
3433 To have Monit start the server if it's not running, add a start
3434 statement:
3435
3436 check process nginx with pidfile /var/run/nginx.pid
3437 start program = "/etc/init.d/nginx start"
3438
Here's a more advanced example for monitoring an apache web-server
listening on the default port numbers for HTTP and HTTPS. In this
example Monit will restart apache if it's not accepting connections at
the port numbers. To restart, Monit first executes the stop-program,
waits (up to 30s) for the process to stop, then executes the
start-program and waits (30s) for it to start. The length of the start
or stop wait can be overridden using the 'timeout' option. If Monit was
unable to stop or start the service, a failed alert message will be
sent if you have requested alert messages to be sent.
3448
3449 check process apache with pidfile /var/run/httpd.pid
3450 start program = "/etc/init.d/httpd start" with timeout 60 seconds
3451 stop program = "/etc/init.d/httpd stop"
3452 if failed port 80 for 2 cycles then restart
3453 if failed port 443 for 2 cycles then restart
3454
This example demonstrates how you can run a program as a specified user
(uid) and with a specified group (gid). Many daemon programs can do the
uid and gid switch by themselves, but for those programs that do not
(e.g. Java programs), Monit's ability to start a program as a certain
user can be very useful. In this example we start the Tomcat Java
Servlet Engine as the standard nobody user and group. Please note that
Monit can only switch uid and gid for the program if the super-user is
running Monit; otherwise Monit will simply ignore the request to change
uid and gid.

    check process tomcat with pidfile /var/run/tomcat.pid
        start program = "/etc/init.d/tomcat start"
            as uid "nobody" and gid "nobody"
        stop program = "/etc/init.d/tomcat stop"
            # You can also use id numbers instead and write:
            as uid 99 and with gid 99
        if failed port 8080 then alert
3472
In this example we use udp for connection testing to check if the
name-server is running:

    check process named with pidfile /var/run/named.pid
        start program = "/etc/init.d/named start"
        stop program = "/etc/init.d/named stop"
        if failed port 53 use type udp protocol dns then restart
3480
The following example illustrates how to check if the service 'sophie'
is answering connections on its Unix domain socket:

    check process sophie with pidfile /var/run/sophie.pid
        start program = "/etc/init.d/sophie start"
        stop program = "/etc/init.d/sophie stop"
        if failed unix /var/run/sophie then restart
3488
In this example we check an apache web-server running on localhost
which answers for several IP-based virtual hosts or vhosts, hence the
host statement before port:

    check process apache with pidfile /var/run/httpd.pid
        start "/etc/init.d/httpd start"
        stop "/etc/init.d/httpd stop"
        if failed host www.sol.no port 80 then alert
        if failed host shop.sol.no port 443 then alert
        if failed host chat.sol.no port 80 then alert
3499
To make sure that Monit is communicating with an HTTP server, a protocol
test can be added:

    check process apache with pidfile /var/run/httpd.pid
        start "/etc/init.d/httpd start"
        stop "/etc/init.d/httpd stop"
        if failed
            host www.sol.no port 80 protocol http
        then alert
3509
This example demonstrates a different way to check a web-server using
the send/expect mechanism:

    check process apache with pidfile /var/run/httpd.pid
        start "/etc/init.d/httpd start"
        stop "/etc/init.d/httpd stop"
        if failed
            host www.sol.no port 80 and
            send "GET / HTTP/1.1\r\nHost: www.sol.no\r\n\r\n"
            expect "HTTP/[0-9\.]{3} 200.*"
        then alert
3521
Here we ping a remote host to check if it is up and if not, send an
alert:

    check host www.tildeslash.com with address www.tildeslash.com
        if failed ping then alert
3527
In the following example we ask Monit to compute and verify the
checksum for the underlying apache binary used by the start and stop
programs. If the checksum test should fail, monitoring will be disabled
to prevent possibly restarting a compromised binary:

    check process apache with pidfile /var/run/httpd.pid
        start program = "/etc/init.d/httpd start"
        stop program = "/etc/init.d/httpd stop"
        if failed host www.tildeslash.com port 80 then restart
        depends on apache_bin

    check file apache_bin with path /usr/local/apache/bin/httpd
        if failed checksum then unmonitor
3541
In this example we ask Monit to test a document's checksum on a remote
server. If the checksum has changed, we send an alert:

    check host mmonit.com with address mmonit.com
        if failed
            port 80 protocol http and
            request "/monit/dist/monit-5.7.tar.gz"
            with checksum f9d26b8393736b5dfad837bb13780786
        then alert
3551
Here are a couple of tests for some popular communication servers,
using the SIP protocol. First we test a FreeSWITCH server and then an
Asterisk server:

    check process freeswitch
        with pidfile /usr/local/freeswitch/log/freeswitch.pid
        start program = "/usr/local/freeswitch/bin/freeswitch -nc -hp"
        stop program = "/usr/local/freeswitch/bin/freeswitch -stop"
        if total memory > 1000.0 MB for 5 cycles then alert
        if total memory > 1500.0 MB for 5 cycles then alert
        if total memory > 2000.0 MB for 5 cycles then restart
        if cpu > 60% for 5 cycles then alert
        if failed
            port 5060 type udp protocol SIP
            target me@foo.bar and maxforward 10
        then restart

    check process asterisk
        with pidfile /var/run/asterisk/asterisk.pid
        start program = "/usr/sbin/asterisk"
        stop program = "/usr/sbin/asterisk -r -x 'shutdown now'"
        if total memory > 1000.0 MB for 5 cycles then alert
        if total memory > 1500.0 MB for 5 cycles then alert
        if total memory > 2000.0 MB for 5 cycles then restart
        if cpu > 60% for 5 cycles then alert
        if failed
            port 5060 type udp protocol SIP
            and target me@foo.bar maxforward 10
        then restart
3581
Some servers are slow starters, such as Java-based application servers.
If we want to keep the poll cycle short (i.e. < 60 seconds) but allow
some services to take their time to start, the every statement is
handy:

    check process dynamo with pidfile /etc/dynamo.pid every 2 cycles
        start program = "/etc/init.d/dynamo start"
        stop program = "/etc/init.d/dynamo stop"
        if failed port 8840 then alert
3591
Here is an example where we group together two database entries so you
can manage them together, e.g. 'monit -g database start all'. The mode
statement is also illustrated in the first entry and has the effect
that Monit will not try to (re)start this service if it is not running:

    check process sybase with pidfile /var/run/sybase.pid
        start = "/etc/init.d/sybase start"
        stop = "/etc/init.d/sybase stop"
        mode passive
        group database

    check process oracle with pidfile /var/run/oracle.pid
        start program = "/etc/init.d/oracle start"
        stop program = "/etc/init.d/oracle stop"
        if failed
            port 9001 protocol tns
        then restart
        group database
3610
This resource check example will send an alert if the CPU usage of the
Apache HTTP daemon exceeds 40% for two cycles, or if the total CPU
usage of the daemon and its child processes exceeds 60% for two cycles.
Apache is restarted if the total CPU usage is over 80% for five cycles,
and stopped if its memory usage is over 100 MB for five cycles:

    check process apache with pidfile /var/run/httpd.pid
        start program = "/etc/init.d/httpd start"
        stop program = "/etc/init.d/httpd stop"
        if cpu > 40% for 2 cycles then alert
        if total cpu > 60% for 2 cycles then alert
        if total cpu > 80% for 5 cycles then restart
        if mem > 100 MB for 5 cycles then stop
3623
This example demonstrates the timestamp statement with exec and how you
may restart apache if its configuration file has changed:

    check file httpd.conf with path /etc/httpd/httpd.conf
        if changed timestamp
        then exec "/etc/init.d/httpd graceful"
3630
In this example we demonstrate usage of the extended alert statement
and a file check dependency:

    check process apache with pidfile /var/run/httpd.pid
        start = "/etc/init.d/httpd start"
        stop = "/etc/init.d/httpd stop"
        alert admin@bar on {nonexist, timeout}
            with mail-format {
                from: bofh@$HOST
                subject: apache $EVENT - $ACTION
                message: This event occurred on $HOST at $DATE.
                Your faithful employee,
                monit
            }
        if failed host www.tildeslash.com port 80 then restart
        depend httpd_bin
        group apache

    check file httpd_bin with path /usr/local/apache/bin/httpd
        alert security@bar on {checksum, timestamp,
                               permission, uid, gid}
            with mail-format {subject: Alaaarrm! on $HOST}
        if failed checksum
            and expect 8f7f419955cefa0b33a2ba316cba3659
        then unmonitor
        if failed permission 755 then unmonitor
        if failed uid "root" then unmonitor
        if failed gid "root" then unmonitor
        if changed timestamp then alert
        group apache
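The expected value used in a checksum test has to be computed before it
is written into the control file. The following shell sketch shows one
way to do that. It assumes that the 32-digit value in the example is an
MD5 digest and that md5sum(1) is installed; file_md5 is a hypothetical
helper name and is not part of Monit.

```shell
# file_md5: print only the hex MD5 digest of the file given as $1.
# Hypothetical helper for illustration; it is not shipped with Monit.
file_md5() {
    # Reading from stdin keeps the file name out of md5sum's output,
    # so the first space-separated field is always the digest.
    md5sum < "$1" | cut -d ' ' -f 1
}

# Example call (path taken from the httpd_bin entry):
#   file_md5 /usr/local/apache/bin/httpd
```

The printed digest is the value to place after "expect" in the checksum
test.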
3661
In this example, we demonstrate usage of the depend statement. In this
case, we want to start oracle and apache. However, we've set up apache
to use oracle as a back end, and if oracle is restarted, apache must be
restarted as well.

    check process apache with pidfile /var/run/httpd.pid
        start = "/etc/init.d/httpd start"
        stop = "/etc/init.d/httpd stop"
        depends on oracle

    check process oracle with pidfile /var/run/oracle.pid
        start = "/etc/init.d/oracle start"
        stop = "/etc/init.d/oracle stop"
        if failed port 9001 for 5 cycles then restart
3676
Next, we have two services, oracle-import and oracle-export, that need
to be restarted if oracle is restarted, but are independent of each
other.

    check process oracle with pidfile /var/run/oracle.pid
        start = "/etc/init.d/oracle start"
        stop = "/etc/init.d/oracle stop"
        if failed port 9001 for 3 cycles then restart

    check process oracle-import
        with pidfile /var/run/oracle-import.pid
        start = "/etc/init.d/oracle-import start"
        stop = "/etc/init.d/oracle-import stop"
        depends on oracle

    check process oracle-export
        with pidfile /var/run/oracle-export.pid
        start = "/etc/init.d/oracle-export start"
        stop = "/etc/init.d/oracle-export stop"
        depends on oracle
3696
~/.monitrc
    Default run control file.

/etc/monitrc
    If the control file is not found in the default location and
    /etc contains a monitrc file, this file will be used instead.

./monitrc
    If the control file is not found in either of the previous two
    locations, and the current working directory contains a monitrc
    file, this file is used instead.

~/.monit.pid
    Lock file to help prevent concurrent runs (non-root mode).

/run/monit.pid
    Lock file to help prevent concurrent runs (root mode, Linux
    systems, if the /run directory is available).

/var/run/monit.pid
    Lock file to help prevent concurrent runs (root mode, Linux
    systems).

/etc/monit.pid
    Lock file to help prevent concurrent runs (root mode, systems
    without /var/run).

~/.monit.state
    Monit saves its state to this file and utilises information
    found in this file to recover from a crash. This is a binary
    file and its content is only of interest to Monit. You may set
    the location of this file in the Monit control file or by using
    the -s switch when Monit is started.

~/.monit.id
    Monit saves its unique id to this file.
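The search order of the three control file locations listed above can be
sketched in shell. This only mirrors the order as documented here and is
not Monit's own implementation; the function name find_monitrc is a
hypothetical name chosen for the example.

```shell
# find_monitrc: print the first control file found, checking the
# documented locations in order. Hypothetical helper, not part of Monit.
find_monitrc() {
    for f in "$HOME/.monitrc" /etc/monitrc ./monitrc; do
        if [ -f "$f" ]; then
            printf '%s\n' "$f"
            return 0
        fi
    done
    return 1    # none of the three locations holds a control file
}

find_monitrc || echo "no control file found" >&2
```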
3738
No environment variables are used by Monit. However, when Monit
executes a start/stop/restart program or an exec action, it will set
several environment variables which can be utilised by the executable
to get information about the event that triggered the action.

The following environment variable is set for every program executed by
Monit, including check program:

MONIT_SERVICE
    The name of the service (from monitrc) for which the program is
    executed.

The following environment variables are only available in the service
start/stop/restart program and exec action context:

MONIT_EVENT
    The event that occurred on the service.

MONIT_DESCRIPTION
    A description of the error condition.

MONIT_DATE
    The time and date (RFC 822 style) the event occurred.

MONIT_HOST
    The host the event occurred on.

The following environment variables are only available in the check
process start/stop/restart program and exec action context:

MONIT_PROCESS_PID
    The process pid. This may be 0 if the process was (re)started.

MONIT_PROCESS_MEMORY
    Process memory. This may be 0 if the process was (re)started.

MONIT_PROCESS_CHILDREN
    Process children. This may be 0 if the process was (re)started.

MONIT_PROCESS_CPU_PERCENT
    Process cpu%. This may be 0 if the process was (re)started.

The following environment variable is only available in the check
program start/stop/restart program and exec action context:

MONIT_PROGRAM_STATUS
    The program status (exit value).
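As a sketch of how these variables might be used, the following
hypothetical handler script could be run from an exec action to produce
a one-line summary of the event. The function name format_event, the
message layout and the fallback strings are assumptions made for
illustration only.

```shell
#!/bin/sh
# format_event: build a one-line summary from the MONIT_* variables
# that Monit sets for start/stop/restart programs and exec actions.
# Hypothetical handler; the fallbacks apply when a variable is unset.
format_event() {
    printf '%s: %s on %s at %s: %s\n' \
        "${MONIT_SERVICE:-unknown service}" \
        "${MONIT_EVENT:-unknown event}" \
        "${MONIT_HOST:-unknown host}" \
        "${MONIT_DATE:-unknown date}" \
        "${MONIT_DESCRIPTION:-no description}"
}

format_event
```

Such a script could be referenced from a service entry with an exec
action, for example: then exec "/usr/local/bin/monit-event.sh" (the
path is hypothetical).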
3787
If a Monit daemon is running, SIGUSR1 wakes it up from its sleep phase
and forces a poll of all services. SIGTERM and SIGINT will gracefully
terminate a Monit daemon. The SIGTERM signal is sent to a Monit daemon
if Monit is started with the quit action argument.

Sending a SIGHUP signal to a running Monit daemon will force the daemon
to reinitialise itself; specifically, it will reread its configuration
and close and reopen its log files.

Running Monit in the foreground while a background Monit daemon is
running will wake up the daemon.
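The signals described above can be sent with kill(1) using the pid
stored in the lock file. The following is a minimal sketch; the helper
name signal_monit is hypothetical, and the default path assumes a daemon
running as root on a Linux system (use ~/.monit.pid in non-root mode).

```shell
# signal_monit: send signal $1 to the daemon whose pid is stored in the
# pid file $2 (default: /var/run/monit.pid).
# Hypothetical helper for illustration; it is not shipped with Monit.
signal_monit() {
    sig=$1
    pidfile=${2:-/var/run/monit.pid}
    if [ ! -r "$pidfile" ]; then
        echo "cannot read pid file: $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    kill -s "$sig" "$pid"
}

# Example calls:
#   signal_monit USR1    # wake up the daemon and poll all services
#   signal_monit HUP     # reread configuration, reopen log files
#   signal_monit TERM    # terminate the daemon gracefully
```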
3800
This is a very silent program. Use the -v switch if you want to see
what Monit is doing, and tail -f the log file. Optionally, for testing
purposes, you can start Monit with the -Iv switch. Monit will then
print debug information to the console. To stop Monit in this mode,
simply press CTRL+C (i.e. SIGINT) in the same console.

The syntax (and parser) of the control file was inspired by Eric S.
Raymond et al.'s excellent fetchmail program. Some portions of this man
page were also inspired by the same authors.
3811
Copyright (C) 2001-2019 by Tildeslash Ltd. All Rights Reserved. This
product is distributed in the hope that it will be useful, but WITHOUT
any warranty; without even the implied warranty of MERCHANTABILITY or
FITNESS for a particular purpose.
3817
GNU text utilities; md5sum(1); sha1sum(1); openssl(1); glob(7);
regex(7); https://mmonit.com

5.26.0                       www.mmonit.com                       MONIT(1)