MONIT(1) User Commands MONIT(1)

6 Monit - utility for monitoring services on a Unix system
7
9 monit [options] <arguments>
10
Monit is a utility for managing and monitoring processes, programs,
files, directories and filesystems on a Unix system. Monit conducts
automatic maintenance and repair and can execute meaningful causal
actions in error situations. For example, Monit can start a process if
it is not running, restart a process if it does not respond, and stop a
process if it uses too many resources. You can use Monit to monitor
files, directories and filesystems for changes, such as timestamp
changes, checksum changes or size changes.
20
21 Monit is controlled via an easy to configure control file based on a
22 free-format, token-oriented syntax. Monit logs to syslog or to its own
23 log file and notifies you about error conditions via customisable alert
24 messages. Monit can perform various TCP/IP network checks, protocol
25 checks and can utilise SSL for such checks. Monit provides a HTTP(S)
26 interface and you may use a browser to access the Monit program.
27
29 You can use Monit to monitor daemon processes or similar programs
30 running on localhost. Monit is particularly useful for monitoring
daemon processes, such as those started at system boot time, for
instance sendmail, sshd, apache and mysql. In contrast to many other
monitoring systems, Monit can act if an error situation should occur.
For example, if sendmail is not running, Monit can start sendmail again
automatically, or if apache is using too many resources (e.g. if a DoS
attack is in progress), Monit can stop or restart apache and send you
an alert message. Monit can also monitor process characteristics, such
as how much memory or how many CPU cycles a process is using.
39
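As a minimal sketch of the behaviour described above, a control file
entry for a daemon might look like the following. The service name,
pid-file path, init script paths and thresholds are illustrative
assumptions, not values taken from this manual:

    # Hypothetical entry; adjust paths and limits for your system
    check process sshd with pidfile /var/run/sshd.pid
        start program = "/etc/init.d/sshd start"
        stop program = "/etc/init.d/sshd stop"
        # Restart if the daemon stops answering on its port
        if failed port 22 protocol ssh then restart
        # Stop the process if it keeps using too much CPU
        if cpu > 80% for 5 cycles then stop
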
You can also use Monit to monitor files, directories and filesystems on
localhost. Monit can monitor these items for changes, such as timestamp
changes, checksum changes or size changes. This is also useful for
security reasons: you can monitor the MD5 or SHA1 checksum of files
that should not change and get an alert or perform an action if they do
change.
46
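A file check along these lines could raise an alert when a file that
should stay unchanged is modified. This is only a sketch; the service
name and path are assumptions:

    # Hypothetical entry: alert when the file content or timestamp changes
    check file sshd_config with path /etc/ssh/sshd_config
        if changed checksum then alert
        if changed timestamp then alert
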
Monit can monitor network connections to various servers, either on
localhost or on remote hosts. TCP, UDP and Unix Domain Sockets are
supported. Network tests can be performed at the protocol level; Monit
has built-in tests for the main Internet protocols, such as HTTP and
SMTP. Even if a protocol is not supported, you can still test the
server, because you can configure Monit to send any data and test the
response from the server.
54
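For a protocol without a built-in test, a send/expect exchange can be
sketched as follows. The service name, host address, port and strings
are placeholders chosen for illustration:

    # Hypothetical entry: send a string and test the reply
    check host echo_server with address 192.0.2.10
        if failed
            port 7 type tcp
            send "hello\n"
            expect "hello"
        then alert
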
55 Monit can be used to test programs or scripts at certain times, much
56 like cron, but in addition, you can test the exit value of a program
57 and perform an action or send an alert if the exit value indicates an
58 error. This means that you can use Monit to perform any type of check
59 you can write a script for.
60
61 Finally, Monit can be used to monitor general system resources on
62 localhost such as overall CPU usage, Memory and System Load.
63
65 The behaviour of Monit is controlled by command-line options and a run
66 control file, monitrc, the syntax of which we describe in a later
67 section. Command-line options override .monitrc declarations.
68
69 The default location for monitrc is ~/.monitrc. If this file does not
70 exist, Monit will try /etc/monitrc and a few other places. See FILES
71 for details. You can also specify the control file directly by using
72 the -c command-line switch to monit. For instance,
73
74 $ monit -c /var/monit/monitrc
75
76 Before Monit is started the first time, you can test the control file
77 for syntax errors:
78
$ monit -t
Control file syntax OK
81
If there is an error, Monit will print an error message to the console,
including the line number in the control file where the error was
found.
85
86 Once you have a working Monit control file, simply start Monit from the
87 console, like so:
88
89 $ monit
90
91 You can change some configuration directives via command-line switches,
92 but for simplicity it is recommended that you put these in the control
93 file.
94
Monit will detach from the terminal and run as a background process,
i.e. as a daemon process. As a daemon, Monit runs in cycles: it
monitors services, then goes to sleep for a configured period, then
wakes up and starts monitoring again in an endless loop.
99
100 Options
101 The following options are recognized by Monit. However, it is
102 recommended that you set options (when applicable) directly in the
103 .monitrc control file.
104
105 -c file
106 Use this control file
107
108 -d n
109 Run Monit as a daemon once per n seconds. Or use "set
110 daemon" in monitrc.
111
112 -g name
113 Set group name for start, stop, restart, monitor, unmonitor,
114 status and summary action.
115
116 -l file
117 Print log information to this file. Or use "set log"
118 in monitrc.
119
120 -p pidfile
121 Use this lock file in daemon mode. Or use "set pidfile"
122 in monitrc.
123
124 -s statefile
125 Write state information to this file. Or use "set
126 statefile" in monitrc.
127
128 -B
129 Batch command line mode (no tabular output and no colors). Or
130 use "set terminal batch" in monitrc.
131
132 -I
133 Do not run in background mode (needed to run from init). Or use
134 "set init" in monitrc.
135
136 -i
137 Print Monit's unique ID
138
139 -r
140 Reset Monit's unique ID. Use with caution
141
142 -t
143 Run syntax check for the control file
144
145 -v
146 Verbose mode, work noisy (diagnostic output)
147
148 -vv
149 Very verbose mode, same as -v plus log stack-trace on error
150
151 -H [filename]
152 Print MD5 and SHA1 hashes of the file or of stdin if the
153 filename is omitted; Monit will exit afterwards
154
155 -V
156 Print version number and patch level
157
158 -h
159 Print a help text
160
161 Arguments
162 Once you have Monit running as a daemon process, you can call Monit
163 with one of the following arguments. Monit will then connect to the
164 Monit daemon (on TCP port 127.0.0.1:2812 by default) and ask the Monit
daemon to perform the requested action. In other words, calling monit
166 without arguments starts the Monit daemon, and calling monit with
167 arguments enables you to communicate with the Monit daemon process.
168
169 start all
170 Start all services listed in the control file and enable monitoring
171 for them. If the group option is set (-g), only start and enable
172 monitoring of services in the named group ("all" is not required in
173 this case).
174
175 start <name>
176 Start the named service and enable monitoring for it. The name is a
177 service entry name from the monitrc file.
178
179 stop all
180 Stop all services listed in the control file and disable their
181 monitoring. If the group option is set, only stop and disable
182 monitoring of the services in the named group ("all" is not
183 required in this case).
184
185 stop <name>
186 Stop the named service and disable its monitoring. The name is a
187 service entry name from the monitrc file.
188
189 restart all
190 Stop and start all services. If the group option is set, only
191 restart the services in the named group ("all" is not required in
192 this case).
193
194 restart <name>
195 Restart the named service. The name is a service entry name from
196 the monitrc file.
197
198 monitor all
199 Enable monitoring of all services listed in the control file. If
200 the group option is set, only start monitoring of services in the
201 named group ("all" is not required in this case).
202
203 monitor <name>
204 Enable monitoring of the named service. The name is a service entry
205 name from the monitrc file. Monit will also enable monitoring of
206 all services this service depends on.
207
208 unmonitor all
209 Disable monitoring of all services listed in the control file. If
210 the group option is set, only disable monitoring of services in the
211 named group ("all" is not required in this case).
212
213 unmonitor <name>
Disable monitoring of the named service. The name is a service
entry name from the monitrc file. Monit will also disable
monitoring of all services that depend on this service.
217
218 status [name]
219 Print service status information.
220
221 summary [name]
222 Print a short status summary.
223
224 report [up | down | initialising | unmonitored | total]
Report services state. The output can easily be parsed by scripts.
Without options, prints a short overview of the state of all
services managed by Monit. With the option up, prints the number of
services in that state; the other options work likewise.
229
230 reload
Reinitialise a running Monit daemon. The daemon will reread its
configuration, and close and reopen log files.
233
234 quit
235 Kill the Monit daemon process
236
237 validate
238 Check all services listed in the control file. This action is also
239 the default behaviour when Monit runs in daemon mode.
240
241 procmatch <regex>
Allows for easy testing of a pattern for a process match check. The
command takes a regular expression as an argument and displays all
running processes matching the pattern.
245
247 Monit is configured and controlled via a control file called monitrc.
248 The default location for this file is ~/.monitrc. If this file does not
249 exist, Monit will try /etc/monitrc, then @sysconfdir@/monitrc and
250 finally ./monitrc. If you build Monit from source, the value of
251 @sysconfdir@ can be given at configure time as ./configure
252 --sysconfdir. For instance, using ./configure --sysconfdir
253 /var/monit/etc will make Monit search for monitrc in /var/monit/etc
254
To protect the security of your control file and passwords, the control
file must have permissions no more permissive than 0700 (u=rwx,g=,o=);
Monit will complain and exit otherwise.
258
When there is a conflict between the command-line arguments and the
arguments in this file, the command-line arguments take precedence.
261
Monit uses its own Domain Specific Language (DSL). The control file
consists of a series of service entries and global option statements.
264
265 Comments begin with a '#' and extend through the end of the line.
266 Otherwise the file consists of a series of service entries or global
267 option statements in a free-format, token-oriented syntax.
268
269 You can use noise keywords like 'if', 'and', 'with(in)', 'has',
270 'us(ing|e)', 'on(ly)', 'then', 'for', 'of' anywhere in an entry to make
271 it resemble English. They're ignored, but can make entries much easier
272 to read at a glance. Keywords are case insensitive.
273
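For illustration, the two statements below are intended to be
equivalent, because "with" is a noise keyword. They are alternative
spellings of one entry, not two entries for the same file, since
service names must be unique. The service name and path are
assumptions:

    # With a noise keyword
    check process sshd with pidfile /var/run/sshd.pid

    # Same statement, noise keyword omitted
    check process sshd pidfile /var/run/sshd.pid
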
274 There are three kinds of tokens: grammar, numbers (i.e. decimal digit
275 sequences) and strings. Strings can be either quoted or unquoted. A
276 quoted string is bounded by double quotes and may contain whitespace
277 (and quoted digits are treated as a string). An unquoted string is any
278 whitespace-delimited token, containing characters and/or numbers.
279
280 On a semantic level, the control file consists of three types of
281 entries:
282
283 1. Global set-statements
284 A global set-statement starts with the keyword "set" and the item
285 to configure.
286
287 2. Global include-statement
288 The include statement consists of the keyword "include" and a glob
289 string. This statement is used to include configure directives from
290 separate files.
291
292 3. One or more service entry statements.
293
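Putting the three entry types together, a very small control file might
be laid out as sketched here. The poll interval, all paths and the
service name are assumptions made for the example:

    # Global set-statements
    set daemon 30
    set log /var/log/monit.log

    # A service entry statement (hypothetical service)
    check process sshd with pidfile /var/run/sshd.pid

    # Global include-statement
    include /etc/monit.d/*.cfg
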
294 Service checks
Each service entry consists of the keyword "check", followed by the
296 service type. Each entry requires a unique descriptive name, which may
297 be freely chosen. This name is used by Monit to refer to the service
298 internally and in all interactions with the user.
299
300 Currently, nine types of check statements are supported:
301
302 Process
303
304 CHECK PROCESS <unique name> <PIDFILE <path> | MATCHING <regex>>
305
<path> is the absolute path to the program's pid-file. A pid-file is a
file containing a process's unique ID. If the pid-file does not exist
or does not contain the PID of a running process, Monit will call the
entry's start method if defined.

<regex> is an alternative to using pid-files and uses process name
pattern matching to find the process to monitor. The top-most matching
parent with the highest uptime is selected, so this form of check is
most useful if the process name is unique. A pid-file should be used
where possible, as it defines the expected PID exactly. You can test
whether a process matches a pattern from the command line using "monit
procmatch "regex-pattern"". This lists all running processes that match
the regex-pattern.
319
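When no pid-file is available, the pattern form could be used as in
this sketch. The service name, the pattern and the program paths are
hypothetical:

    # Hypothetical entry: locate the process by pattern, not pid-file
    check process myapp matching "/usr/local/bin/myapp"
        start program = "/usr/local/bin/myapp --daemon"
        stop program = "/usr/bin/pkill -f /usr/local/bin/myapp"
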
320 File
321
322 CHECK FILE <unique name> PATH <path>
323
<path> is the absolute path to the file. If the file does not exist,
Monit will call the entry's start method if defined. If <path> does not
point to a regular file (for instance, if it is a directory), Monit
will disable monitoring of this entry. If Monit runs in passive mode or
the start method is not defined, Monit will just send an alert on
error.
329
330 Fifo
331
332 CHECK FIFO <unique name> PATH <path>
333
<path> is the absolute path to the fifo. If the fifo does not exist,
Monit will call the entry's start method if defined. If <path> does not
point to a fifo (for instance, if it is a directory), Monit will
disable monitoring of this entry. If Monit runs in passive mode or the
start method is not defined, Monit will just send an alert on error.
339
340 Filesystem
341
CHECK FILESYSTEM <unique name> PATH <path>
343
344 <path> is the path to the device/disk, mount point or NFS/CIFS/FUSE
345 connection string. If the filesystem becomes unavailable, Monit will
346 call the service's start method if defined. If Monit runs in passive
347 mode or the start method is not defined, Monit will just send an alert
348 on error.
349
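A filesystem entry might be sketched like this, here using a mount
point as the path. The service name and the thresholds are assumptions:

    # Hypothetical entry: alert when the root filesystem fills up
    check filesystem rootfs with path /
        if space usage > 90% then alert
        if inode usage > 90% then alert
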
350 Directory
351
352 CHECK DIRECTORY <unique name> PATH <path>
353
<path> is the absolute path to the directory. If the directory does not
exist, Monit will call the entry's start method if defined. If <path>
does not point to a directory, Monit will disable monitoring of this
entry. If Monit runs in passive mode or the start method is not
defined, Monit will just send an alert on error.
359
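As a sketch, a directory entry could test ownership and permissions.
The service name, path, mode and owner below are assumptions:

    # Hypothetical entry: alert if mode or owner is not as expected
    check directory app_spool with path /var/spool/myapp
        if failed permission 0750 then alert
        if failed uid root then alert
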
360 Remote host
361
362 CHECK HOST <unique name> ADDRESS <host>
363
The host address can be specified as a hostname string or as an IP
address string in dotted decimal format, such as "tildeslash.com" or
"64.87.72.95".
367
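A remote host entry might look like the following sketch. The service
name, the address example.com and the tested port are placeholders:

    # Hypothetical entry: test reachability and an HTTP service
    check host web with address example.com
        if failed ping then alert
        if failed port 80 protocol http then alert
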
368 System
369
370 CHECK SYSTEM <unique name>
371
372 The unique name is usually the local host name, but any descriptive
373 name can be used. If you use the variable $HOST as the name, it will
374 expand to the hostname. This check allows one to monitor general system
375 resources such as CPU usage, total memory usage or load average. The
376 unique name is used as the system hostname in mail alerts and as the
377 initial name of the host entry in M/Monit.
378
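A system entry using $HOST as the name could be sketched as below. The
thresholds are arbitrary values chosen for illustration:

    # Hypothetical thresholds; tune them for the monitored machine
    check system $HOST
        if loadavg (5min) > 4 then alert
        if memory usage > 80% then alert
        if cpu usage > 90% for 5 cycles then alert
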
379 Program
380
381 CHECK PROGRAM <unique name> PATH <executable file> [TIMEOUT <number> SECONDS]
382
<path> is the absolute path to the executable program or script. The
status test allows one to check the program's exit status. If the
program does not finish executing within <number> seconds, Monit will
terminate it. The default program timeout is 300 seconds (5 minutes).
The output of the program is recorded and made available in the User
Interface and in alerts, by default up to 512 bytes. You can change the
output limit using the set limits statement.
390
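A program entry with an explicit timeout and an exit status test might
be written as in this sketch. The service name and script path are
hypothetical, and the script is assumed to exit non-zero on failure:

    # Hypothetical entry: terminate the script after 60 seconds
    check program backup_check
            with path "/usr/local/bin/backup_check.sh"
            with timeout 60 seconds
        if status != 0 then alert
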
391 Network
392
393 CHECK NETWORK <unique name> <ADDRESS <ipaddress> | INTERFACE <name>>
394
395 <ipaddress> is the IPv4 or IPv6 address of the monitored network
396 interface. It is also possible to use interface name, such as "eth0" on
397 Linux.
398
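A network entry that refers to the interface by name could be sketched
as follows. The service name, the interface name and the threshold are
assumptions, and the exact test keywords available may depend on your
Monit version:

    # Hypothetical entry: watch link state and utilisation of eth0
    check network uplink with interface eth0
        if failed link then alert
        if saturation > 90% then alert
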
400 Monit will log status and error messages to a file or via syslog. Use
401 the set log statement in the monitrc control file.
402
To set up Monit to log to its own file, use e.g. set log
/var/log/monit.log. Note that the older set logfile statement is
deprecated, but can still be used.
406
407 If syslog is given as a value for the "-l" command-line switch or the
408 keyword set log syslog is found in the control file, Monit will use the
409 syslog system daemon to log messages with a priority assigned to each
410 message based on the context.
411
To turn off logging, simply do not set the log in the control file (and
of course, do not use the -l switch).
414
415 The format for log file is:
416
417 [date] priority : message
418
419 for example:
420
421 [CET Jan 5 18:49:29] info : 'localhost' Monit started
422
424 Monit uses ANSI escape sequences to colorise important parts of the
425 command-line output, if the terminal supports colors, and UTF-8 box
426 characters for tabular output.
427
428 If you want to process the monit CLI output in a script, you can use
429 either the -B option or use the following statement in the monit
430 configuration file to disable tabular output and colors completely:
431
432 SET TERMINAL BATCH
433
435 Use
436
437 SET DAEMON <seconds>
438 [[WITH] START DELAY <seconds>]
439
440 to specify Monit's poll cycle length and run Monit in daemon mode. You
441 must specify a numeric argument which is a polling interval in seconds.
442
In daemon mode, Monit detaches from the console, puts itself in the
background and runs continuously. It monitors each specified service,
then goes to sleep for the given poll interval, wakes up and starts
monitoring again in an endless cycle.
447
448 Alternatively, you can use the "-d" command line switch to set the poll
449 interval, but it is strongly recommended to set the poll interval in
450 your ~/.monitrc file, by using set daemon.
451
452 Monit will then always start in daemon mode. If you do not use this
453 statement and do not start monit with the -d option, Monit will just
454 run through the service checks once and then exit. This might be useful
455 in some situations, but Monit is primarily designed to run as a daemon
456 process.
457
458 Calling "monit" with a Monit daemon running in the background sends a
459 wake-up signal to the daemon, forcing it to check services immediately.
460 Calling "monit" with the quit argument will kill a running Monit daemon
461 process instead of waking it up.
462
463 The start delay option can be used to wait (once) before Monit starts
464 checking services. This can be useful for example when the system
465 boots. Monit will by default start checking services immediately at
466 startup.
467
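For example, the following sketch sets a poll cycle of 30 seconds and
delays the first check by 120 seconds after startup. Both values are
arbitrary and only illustrate the syntax shown above:

    # Poll every 30 seconds; wait 120 seconds once before the first cycle
    set daemon 30
        with start delay 120
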
469 The "set init" statement prevents Monit from transforming itself into a
470 daemon process. Instead Monit will run as a foreground process. (You
471 should still use "set daemon" to specify the poll cycle).
472
473 This is required to run Monit from init. Using init to start Monit is
474 probably the best way to run Monit if you want to be certain that you
475 always have a running Monit daemon on your system. Another option is to
476 run Monit from crontab. In any case, you should make sure that the
477 control file does not have any syntax errors before you start Monit
478 from init or crontab (use "monit -t" to check).
479
To set up Monit to run from init, you can either use the "set init"
statement in Monit's control file or use the "-I" option from the
command line. Here is what you must add to "/etc/inittab":
483
484 # Run Monit in standard run-levels
485 mo:2345:respawn:/usr/local/bin/monit -Ic /etc/monitrc
486
487 After you have modified init's configuration file, you can run the
488 following command to re-examine /etc/inittab and start Monit:
489
490 telinit q
491
492 For systems without telinit:
493
494 kill -1 1
495
If Monit is used to monitor services that are also started at boot time
(e.g. services started via SYSV init rc scripts or via inittab) then,
in some cases, a race condition could occur. That is, if a service is
slow to start, Monit can assume that the service is not running and
possibly try to start it and raise an alert while, in fact, the service
is already about to start or already in its startup sequence. Please
see the FAQ for a solution to this problem. The short version is to
start Monit at a higher run-level, after system processes.
504
506 The Monit control file, "monitrc", can include additional configuration
507 files. This feature helps one to organise configuration into separate
508 files instead of having everything in one file, if you like this kind
509 of thing. Include statements can be placed at virtually any place in
510 "monitrc" though the convention is at the bottom. The syntax is the
511 following:
512
513 INCLUDE <globstring>
514
515 The globstring is any kind of string as defined in glob(7). Thus, you
516 can refer to a single file or you can load several files at once. If
517 you want to use whitespace in your string the globstring needs to be
518 embedded into quotes (') or double quotes ("). If the globstring
519 matches a directory instead of a file, it is silently ignored.
520
521 Any include statements in an included file are parsed as in the main
522 control file.
523
If the globstring matches several results, the files are included in no
particular order. If you need to rely on a certain order, you should
avoid wild-card globbing and instead specify the full path of each
included file.
528
529 An example,
530
531 include /etc/monit.d/*.cfg
532
This will load any file matching the globstring, that is, all files in
/etc/monit.d that end with the suffix .cfg.
535
537 Common SSL/TLS options can be set using the following statement and
538 will apply to all SSL connections made through Monit:
539
540 SET <SSL | TLS> [OPTIONS] {
541 VERSION: <AUTO | SSLV2 | SSLV3 | TLSV1 | TLSV11 | TLSV12 | TLSV13>
542 VERIFY: <ENABLE | DISABLE>
543 SELFSIGNED: <ALLOW | REJECT>
544 CIPHERS: <string>
545 PEMFILE: <path>
546 CLIENTPEMFILE: <path>
547 CACERTIFICATEFILE: <path>
548 CACERTIFICATEPATH: <path>
549 }
550
VERSION set the specific SSL/TLS version to use. By default Monit uses
AUTO. In AUTO mode, only TLS is used; SSLv2 and SSLv3 are considered
obsolete. If you have to use SSLv2 or SSLv3, you must explicitly set
the version.
555
556 VERIFY enable SSL server certificate verification. This will verify and
557 report an error if the server certificate is not trusted, not valid or
558 has expired. By default certificate verification is disabled, though we
559 recommend enabling it, otherwise there is no guarantee that Monit
560 speaks with the server you think it speaks with.
561
SELFSIGNED self-signed certificates are rejected by default. Use this
option to allow self-signed certificates. Warning: not recommended in
production for security reasons, as in such a case the client cannot
verify that it talks to the correct server, and attacks such as
man-in-the-middle or DNS hijacking are possible.
567
568 CIPHERS override default SSL/TLS ciphers.
569
PEMFILE set the path to the SSL server certificate "database-file" in
PEM format. This option has effect only for the Monit HTTP interface.
572
573 CLIENTPEMFILE set the path to the PEM encoded SSL client certificates
574 database file. If set, a client certificate authentication is enabled.
575
CACERTIFICATEFILE set the path to the PEM encoded file containing
Certificate Authority (CA) certificates. Monit uses OpenSSL's default
CA certificates if this option is not used (openssl version -d can be
used to get the default CA certificates). Many distributions come with
SSL and CA certificates already set up, so using this option is
normally not necessary.
582
CACERTIFICATEPATH set the path to the directory containing Certificate
Authority (CA) certificates. Monit uses OpenSSL's default CA
certificates if this option is not used. Many distributions come with
SSL and CA certificates already set up, so using this option is
normally not necessary.
588
The SSL options statement will globally apply to all SSL/TLS
connections made through Monit. SSL options can also be set in a local
check, in mailserver settings or in the mmonit statement, and will then
override or extend the global settings.
593
594 To set global SSL options, put this statement near the top of your
595 .monitrc file:
596
597 set ssl options {...}
598
599 Here is an example of setting both global and local SSL options:
600
601 # Enable certificate verification for all SSL connections
602 # Self-signed certificates are not allowed by default
603 set ssl options {
604 verify: enable
605 }
606
607 # Verify certificate (via global setting)
608 # Allow self-signed certificate for this check
609 check host example with address example.com
610 if failed
611 port 443
612 protocol https
613 with ssl options {selfsigned: allow}
614 then alert
615
616 # Do not verify example2.com's certificate (override global setting)
617 check host example2 with address example2.com
618 if failed
619 port 443
620 protocol https
621 with ssl options {verify: disable}
622 then alert
623
625 To enable FIPS mode (provided your OpenSSL library supports it), add
626 this statement to Monit control file:
627
628 SET FIPS
629
631 If specified in the control file, Monit will start with HTTP support.
632 You can then use Monit CLI to start and stop services, disable or
633 enable service monitoring as well as view the status of each service.
634
635 If HTTP support is enabled over TCP rather than over a Unix Socket, you
636 can also view Monit's informative dashboard in your web browser.
637
Note that if HTTP support is disabled, the Monit CLI will have reduced
functionality, as most CLI commands (such as "monit status") need to
communicate with the Monit background process via the HTTP interface.
We strongly recommend having HTTP support enabled. If security is a
concern, bind the HTTP interface to localhost only or use a Unix Socket
so Monit is not accessible from the outside.
644
645 UNIX SOCKET
646 Syntax for Unix Socket:
647
648 SET HTTPD UNIXSOCKET <path>
649 [UID <uid | username>]
650 [GID <gid | groupname>]
651 [PERMISSION <octal number>]
652 ALLOW <user:password>+
653
654 Example:
655
656 set httpd unixsocket /var/run/monit.sock
657 allow username:password
658
659 UNIXSOCKET set the path to the Unix Socket Monit should bind to and
660 listen on.
661
662 UID Socket owner (optional, defaults to the user who executes Monit)
663
664 GID Socket group (optional, defaults to primary group of the user who
665 executes Monit)
666
667 PERMISSION Socket permissions - absolute octal mode (optional, process
668 UMASK is applied by default)
669
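A Unix Socket declaration that also sets the optional owner, group and
mode might be sketched as follows. The user name, group name and mode
are assumptions and must exist or make sense on your system:

    # Hypothetical values for uid, gid and permission
    set httpd unixsocket /var/run/monit.sock
        uid root
        gid root
        permission 0660
        allow username:password
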
670 TCP PORT
671 Syntax for TCP port:
672
673 SET HTTPD PORT <number>
674 [ADDRESS <hostname | IP-address>]
675 [[with] SSL {pemfile: <path>}]
676 ALLOW <user:password | IP-address | IP-range>+
677
PORT set the port Monit should bind to and listen on. Monit is usually
set up on port 2812. Example:
680
681 set httpd port 2812
682 allow username:password
683
You can now use <http://localhost:2812/> to access Monit's web
interface from a browser, after you have entered username and password
as credentials. You might need to use double quotes around the password
if it contains special characters, such as "p@ssw:r#".
688
689 ADDRESS make Monit listen on a specific interface only. For example if
690 you don't want to expose Monit's web interface to the network, bind it
691 to localhost only. Monit will accept connections on any addresses if
692 the ADDRESS option is not used:
693
694 set httpd
695 port 2812
696 use address 127.0.0.1
697 allow username:password
698
699 Monit HTTP over TCP supports both IP version 4 and 6. Support is
700 transparent and does not require any special configuration. If the bind
701 address is not specified as in this example:
702
703 set httpd
704 port 2812
705 allow ...
706
707 Monit will bind to and listen on port 2812 on all interfaces, both IPv4
708 and IPv6 if available. To force Monit HTTP to only listen on and accept
709 connections over IP version 6, specify an IPv6 address:
710
711 set httpd
712 port 2812
713 use address "fe80::222:19ff:fe53:6c59"
714 allow ...
715
716 Likewise, to force Monit HTTP to only listen on and accept connections
717 over IP version 4, specify an IPv4 address:
718
719 set httpd
720 port 2812
721 use address 62.109.39.247
722 allow ...
723
724 SSL settings
725
726 SSL enable SSL/TLS for Monit's web interface. See options for full
727 list of SSL options.
728
729 PEMFILE set the path to the PEM encoded file, which contains the
730 server's private key and certificate. This file should be stored in a
731 safe place on the filesystem and should have strict permissions, no
732 more than 0700.
733
734 Example:
735
736 set httpd
737 port 2812
738 with ssl {
739 pemfile: /etc/ssl/certs/monit.pem
740 }
741 allow myuser:mypassword
742
743 You can now use <https://localhost:2812/> to access the Monit web
744 server over a TLS encrypted connection.
745
746 Self-signed server certificates note: The Monit CLI works on a client-
747 server basis and uses the Monit HTTP GUI to collect status from the
748 Monit daemon and pass commands like start/stop to it. As self-signed
749 certificates are rejected by default for security reasons, the CLI
750 won't work unless you explicitly allow it by using the SELFSIGNED:
751 ALLOW option:
752
753 set httpd
754 port 2812
755 with ssl {
756 pemfile: /etc/ssl/certs/monit.pem
757 selfsigned: allow
758 }
759 allow myuser:mypassword
760
761 CLIENTPEMFILE enables a client certificate based authentication and
762 sets the path to a PEM encoded database file, that contains a list of
763 allowed client certificates. A connecting client has to provide a
764 certificate known to Monit (listed in clientpemfile), otherwise it is
rejected. This file must also include all necessary CA certificates. By
default, self-signed client certificates are rejected for security
reasons. If you want to allow self-signed client certificates
(recommended only for testing), you have to allow them explicitly using
the SELFSIGNED: ALLOW option (see the example above). See your
browser's documentation for how to import a client certificate.
771
772 Example:
773
774 set httpd
775 port 2812
776 with SSL {
777 pemfile: /etc/ssl/certs/monit.pem
778 clientpemfile: /etc/ssl/certs/monit-client.pem
779 }
780
781 Monit version signature
782 SIGNATURE can be used to hide Monit version from the HTTP response
783 header and error pages. For example:
784
785 set httpd
786 port 2812
787 signature disable
788 allow myuser:mypassword
789
790 Authentication
791 Access to the Monit web interface is controlled primarily via the ALLOW
792 option which is used to specify authentication and authorise only
793 specific clients to connect.
794
795 If the Monit command line interface is being used, at least one
796 cleartext password is necessary (see below), otherwise the Monit
797 command line interface will not be able to connect to the Monit web
798 interface.
799
800 Clients that try to connect to Monit, but submit a wrong username
801 and/or password are logged with their IP-address.
802
803 Client certificates
804
This authentication method is a strong authentication mechanism and
employs HTTPS client certificates to verify the authenticity of a
connecting client. Clients must possess a Public Key Certificate known
to Monit. The client must connect to Monit over SSL and Monit will ask
809 the client to send its certificate. Upon receiving the certificate
810 Monit compares the certificate to certificates located in the
811 CLIENTPEMFILE file. Access is granted if the client certificate is in
812 this file. See SSL settings for details.
813
814 Basic Authentication
815
816 Monit supports Basic Authentication as described in RFC 2617.
817
818 In short, a server challenges a client (e.g. a browser) to send
819 authentication information (username and password) and if accepted, the
820 server will allow the client access to the requested document.
821
822 The biggest weakness of Basic Authentication is that the username and
823 password are sent unencrypted over the network (merely base64 encoded).
824 It is therefore recommended that you do not use this authentication
825 method unless you run Monit with SSL support. With SSL, it is safe to
826 use Basic Authentication since all HTTP data, including Basic
827 Authentication headers, will be encrypted.
828
829 Cleartext user and password
830
831 Monit will use Basic Authentication if an allow statement contains a
832 username and a password separated with a single ':' character.
833
834 Note: Special characters can be used, but for non-alphanumerics the
835 password has to be quoted.
836
837 Syntax:
838
839 ALLOW <username>:<password>
840
841 Host and network allow list
842
843 Monit maintains an access-control list of hosts and networks allowed to
844 connect. You can add as many hosts as you want, but each host must be
845 specified by a valid domain name or IP address.
846
847 Monit will query a name server to check any hosts trying to connect. If
848 a host (client) is trying to connect, but cannot be found in the access
849 list or cannot be resolved, Monit will shut down the connection to the
850 client promptly.
851
852 Control file example:
853
854 set httpd port 2812
855 allow localhost
856 allow my.other.work.machine.com
857 allow 10.1.1.1
858 allow 192.168.1.0/255.255.255.0
859 allow 10.0.0.0/8
860
861 Clients that are not in the allow list and try to connect to Monit
862 are denied access and logged with their IP address.
863
864 PAM
865
866 PAM is supported on platforms which provide PAM (such as Linux, Mac OS
867 X, FreeBSD, NetBSD).
868
869 Syntax:
870
871 ALLOW @<group>
872
873 where "group" is the group name allowed to access Monit's web
874 interface. Monit uses a PAM service called monit for PAM
875 authentication, see the PAM manual page for detailed instructions on
876 how to set the PAM service and PAM authentication plugins.
877
878 Sample PAM service for Monit on Mac OS X (store as "/etc/pam.d/monit"
879 file):
880
881 # monit: auth account password session
882 auth sufficient pam_securityserver.so
883 auth sufficient pam_unix.so
884 auth required pam_deny.so
885 account required pam_permit.so
886
887 A "monitrc" config which only allows group "admin" authenticated via
888 PAM to access the web interface:
889
890 set httpd
891 port 2812
892 allow @admin
893
894 htpasswd file
895
896 Alternatively, you can store credentials in an "htpasswd" formatted file
897 (one user:passwd entry per line), like so: allow [cleartext|crypt|md5]
898 /path [users]. The default is cleartext passwords. If the passwords are
899 digested, you must specify the cryptographic method. If you do
900 not want all users in the password file to have access to Monit, you
901 can specify only those users that should have access in the allow
902 statement. Otherwise all users are added.
903
904 Example 1:
905
906 set httpd port 2812
907 allow md5 /etc/httpd/htpasswd john paul ringo george
908
909 If you use this method together with a host list, then only clients
910 from the listed hosts will be allowed to connect to the Monit HTTP
911 server and each client will be asked to provide a username and a
912 password.
913
914 Example 2:
915
916 set httpd port 2812
917 allow localhost
918 allow 10.1.1.1
919 allow hauk:"passw@rd"
920
921 If you only want to use Basic Authentication, then just provide allow
922 entries with username and password or password files as in example 1
923 above.
924
925 Read-only users
926
927 Finally it is possible to define some users as read-only. A read-only
928 user can read the Monit web pages but will not get access to push-
929 buttons and cannot change a service from the web interface.
930
931 set httpd port 2812
932 allow admin:password
933 allow hauk:password read-only
934 allow @admins
935 allow @users read-only
936
937 A user is set to read-only by using the read-only keyword after
938 username:password. In the above example the user hauk is defined as a
939 read-only user, while the admin user has all access rights.
940
942 Monit will raise an alert in the following situations:
943
944 o A service does not exist (e.g. process is not running)
945 o Cannot read service data (e.g. cannot get filesystem usage)
946 o Execution of a service related script failed (e.g. start failed)
947 o Invalid service type (e.g. if path points to directory instead of file)
948 o Custom test script returned error
949 o Ping test failed
950 o TCP/UDP connection and/or port test failed
951 o Resource usage test failed (e.g. cpu usage too high)
952 o Checksum mismatch or change (e.g. file changed)
953 o File size test failed (e.g. file too large)
954 o Timestamp test failed (e.g. file is older than expected)
955 o Permission test failed (e.g. file mode doesn't match)
956 o A UID test failed (e.g. file owned by different user)
957 o A GID test failed (e.g. file owned by different group)
958 o A process' PID changed out of Monit's control
959 o A process' PPID changed out of Monit's control
960 o Too many service recovery attempts failed
961 o A file content test found a match
962 o Filesystem flags changed
963 o A service action was performed by administrator
964 o A network link failed
965 o A network link capacity changed
966 o A network link saturation failed
967 o A network link upload/download rate failed
968 o Monit was started, stopped or reloaded
969
970 To get an alert via e-mail, set the alert target using the global "set
971 alert" statement (for all services) or the "alert" statement in the
972 context of a service entry (for a single service).
973
974 Setting an alert recipient
975 If an event occurs, Monit will send an alert. There are two kinds of
976 alert statement: global and local.
977
978 Global syntax:
979
980 SET ALERT mail-address [[NOT] {event, ...}] [REMINDER cycles]
981
982 Example:
983
984 set alert foo@bar
985
986 will send a default email to the address foo@bar whenever any event
987 occurs on any service.
988
989 If you want to send alert messages to more email addresses, add a "set
990 alert 'email'" statement for each address.
991
992 It is also possible to use the local alert statement in the context of
993 a service check to enable alerts for the given service only:
994
995 ALERT mail-address [[NOT] {event, ...}] [REMINDER cycles]
996
997 Local alert example:
998
999 check host myhost with address 1.2.3.4
1000 if failed port 3306 protocol mysql then alert
1001 if failed port 80 protocol http then alert
1002 alert foo@baz # Local service alert
1003
1004 You can combine global and local alert statements. If there is a
1005 conflict, the local alert has precedence and overrides the global
1006 statement.
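
As an illustration of combining the two forms, the sketch below sends
events from all services to a global recipient and additionally
notifies a second address for one service only. The addresses, the
service name and the pidfile path are hypothetical:

    # global: receives alerts for every service
    set alert ops@example.com

    check process sshd with pidfile /var/run/sshd.pid
        # local: receives alerts for this service only
        alert oncall@example.com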
1007
1008 Setting an event filter
1009
1010 If you only want an alert message sent for certain events, list them in
1011 an "{event, ...}" block, e.g.:
1012
1013 set alert foo@bar only on { timeout, nonexist }
1014
1015 The event list can also be negated to send alerts for all events except
1016 those which are listed, by prepending the list with the word "not". For
1017 example, to receive all alerts except notification about Monit program
1018 start and stop:
1019
1020 set alert foo@bar but not on { instance }
1021
1022 Here is a list of all possible event types emitted by Monit. Values
1023 from the first column can be used in the event filter list mentioned
1024 above:
1025
1026 Event: | Failure state: | Success state:
1027 ---------------------------------------------------------------------
1028 action | "Action failed" | "Action done"
1029 checksum | "Checksum failed" | "Checksum succeeded"
1030 bytein | "Download bytes exceeded" | "Download bytes ok"
1031 byteout | "Upload bytes exceeded" | "Upload bytes ok"
1032 connection | "Connection failed" | "Connection succeeded"
1033 content | "Content failed" | "Content succeeded"
1034 data | "Data access error" | "Data access succeeded"
1035 exec | "Execution failed" | "Execution succeeded"
1036 fsflags | "Filesystem flags failed" | "Filesystem flags succeeded"
1037 gid | "GID failed" | "GID succeeded"
1038 icmp | "Ping failed" | "Ping succeeded"
1039 instance | "Monit instance changed" | "Monit instance changed not"
1040 invalid | "Invalid type" | "Type succeeded"
1041 link | "Link down" | "Link up"
1042 nonexist | "Does not exist" | "Exists"
1043 packetin | "Download packets exceeded" | "Download packets ok"
1044 packetout | "Upload packets exceeded" | "Upload packets ok"
1045 permission | "Permission failed" | "Permission succeeded"
1046 pid | "PID failed" | "PID succeeded"
1047 ppid | "PPID failed" | "PPID succeeded"
1048 resource | "Resource limit matched" | "Resource limit succeeded"
1049 saturation | "Saturation exceeded" | "Saturation ok"
1050 size | "Size failed" | "Size succeeded"
1051 speed | "Speed failed" | "Speed ok"
1052 status | "Status failed" | "Status succeeded"
1053 timeout | "Timeout" | "Timeout recovery"
1054 timestamp | "Timestamp failed" | "Timestamp succeeded"
1055 uid | "UID failed" | "UID succeeded"
1056 uptime | "Uptime failed" | "Uptime succeeded"
1057
1058 Each alert recipient can have its own filter, for example:
1059
1060 set alert foo@bar { nonexist, timeout, resource, icmp, connection }
1061 set alert security@bar on { checksum, permission, uid, gid }
1062 set alert admin@bar
1063
1064 Setting an error reminder
1065
1066 Monit by default sends just one notification if a service failed and
1067 another when/if it recovers. If you want to be notified that the
1068 service is still in a failed state, you can use the reminder option in
1069 the alert statement:
1070
1071 SET ALERT mail-address [WITH] REMINDER [ON] number [CYCLES]
1072
1073 For example if you want to be notified each tenth cycle if a service
1074 remains in a failed state, you can use:
1075
1076 alert foo@bar with reminder on 10 cycles
1077
1078 Likewise if you want to be notified on each failed cycle, you can use:
1079
1080 alert foo@bar with reminder on 1 cycle
1081
1082 Disabling alerts for some service
1083 To suppress alerts to a particular recipient for a particular service,
1084 add the "noalert" statement in the context of that service check.
1085
1086 NOALERT mail-address
1087
1088 Example (send all alerts to foo@bar except for service p3):
1089
1090 set alert foo@bar
1091
1092 check process p1 with pidfile /var/run/p1.pid
1093
1094 check process p2 with pidfile /var/run/p2.pid
1095
1096 check process p3 with pidfile /var/run/p3.pid
1097 noalert foo@bar
1098
1099 Message format
1100 The alert message format can be modified by using the "set mail-format"
1101 statement:
1102
1103 SET MAIL-FORMAT {mail-format}
1104
1105 Example:
1106
1107 set mail-format {
1108 from: Monit Support <monit@foo.bar>
1109 reply-to: support@domain.com
1110 subject: $SERVICE $EVENT at $DATE
1111 message: Monit $ACTION $SERVICE at $DATE on $HOST: $DESCRIPTION.
1112 Yours sincerely,
1113 monit
1114 }
1115
1116 The from: option is the sender's email address for Monit alerts. A
1117 sender's name is optional, but if used, requires that the subsequent
1118 email-address is enclosed in angle brackets as in the example above.
1119
1120 The reply-to: option can be used to set the reply-to mail header,
1121 optionally with a name.
1122
1123 The subject: option sets the message subject and must be on only one
1124 line.
1125
1126 The message: option sets the mail body. This option should always be
1127 the last in a mail-format statement. The mail body can be as long as
1128 needed, but must not contain the block-closing '}' character.
1129
1130 You need not use all options, only those which you want to
1131 override. For example, to globally change the sender address only:
1132
1133 set mail-format { from: bofh@foo.bar }
1134
1135 The subject and body may contain $NAME variables, which are expanded by
1136 Monit. Here is a list of variables that can be used when composing an
1137 alert message.
1138
1139 · $EVENT
1140
1141 A string describing the event that occurred.
1142
1143 · $SERVICE
1144
1145 The service name
1146
1147 · $DATE
1148
1149 The current time and date (RFC 822 date style).
1150
1151 · $HOST
1152
1153 The name of the host Monit is running on
1154
1155 · $ACTION
1156
1157 The name of the action which was done by Monit.
1158
1159 · $DESCRIPTION
1160
1161 The description of the error condition
1162
1163 Setting a mail server for alert delivery
1164 The mail server Monit should use to send alert messages is defined with
1165 a "set mailserver" statement:
1166
1167 SET MAILSERVER
1168 <hostname|ip-address>
1169 [PORT number]
1170 [USERNAME string] [PASSWORD string]
1171 [using SSL [with options {...}]]
1172 [CERTIFICATE CHECKSUM [MD5|SHA1] <hash>],
1173 ...
1174 [with TIMEOUT X SECONDS]
1175 [using HOSTNAME hostname]
1176
1177 Multiple mail servers can be set by using a comma separated list. If
1178 Monit cannot connect to the first server, it will try the next in the
1179 list and so on.
1180
1181 The port statement allows one to override the default SMTP port (465
1182 for SSL, or 25 for TLS and non-secure connections).
1183
1184 Monit supports AUTH PLAIN and AUTH LOGIN for SMTP authentication. You
1185 can set a username and a password using the USERNAME and PASSWORD
1186 options.
1187
1188 You can set SSL/TLS options for the connection and also check an SSL
1189 certificate checksum.
1190
1191 The default connection timeout is 5 seconds. You can raise this limit
1192 using the TIMEOUT option.
1193
1194 Example (setting two mail servers for failover):
1195
1196 set mailserver smtp.gmail.com, smtp.other.host
1197
1198 By default, Monit uses the local host name in SMTP HELO/EHLO and in the
1199 Message-ID header. You can override this using the HOSTNAME option.
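
Putting the options from the syntax summary together, a fuller
statement might look like the following sketch. The host names,
credentials and timeout value are placeholders, and the option order
simply follows the syntax summary above:

    # hypothetical server and credentials, for illustration only
    set mailserver smtp.example.com port 465
        username "monit" password "secret"
        using ssl
        with timeout 15 seconds
        using hostname monitor.example.com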
1200
1201 Event queue
1202 If no mail server is available, Monit can queue events in the local
1203 file-system for retry until the mail server recovers.
1204
1205 If Monit is used with M/Monit, the event queue provides a safe event
1206 store for M/Monit in the case of temporary problems.
1207
1208 The event queue is persistent across Monit restarts and, provided that
1209 the back-end filesystem is persistent, across system restarts as well.
1210
1211 By default, the queue is disabled and if the alert handler fails, Monit
1212 will simply drop the alert message.
1213
1214 To enable the event queue, add the following statement:
1215
1216 SET EVENTQUEUE BASEDIR <path> [SLOTS <number>]
1217
1218 The <path> is the path to the directory where events will be stored.
1219
1220 Optionally, if you want to limit the queue size, use the slots option to
1221 store at most <number> event messages.
1222
1223 Example:
1224
1225 set eventqueue basedir /var/monit slots 5000
1226
1227 If you are running more than one Monit instance on the same machine,
1228 you must use separate event queue directories.
1229
1231 Each service can have associated start, stop and restart methods which
1232 Monit can use to execute action on the service.
1233
1234 Syntax:
1235
1236 <START | STOP | RESTART> [PROGRAM] = "program"
1237 [[AS] UID <number | string>]
1238 [[AS] GID <number | string>]
1239 [[WITH] TIMEOUT <number> SECOND(S)]
1240
1241 If the "program" is a shell script it must begin with "#!" and the
1242 remainder of the first line must specify an interpreter for the
1243 program, e.g. "#!/bin/sh".
1244
1245 The "program" must also be executable (for example mode 0755).
1246
1247 It's possible to write scripts directly into the program this way:
1248
1249 stop = "/bin/sh -c 'kill -s TERM `cat /var/run/process.pid`'"
1250
1251 By default the program is executed as the user under which Monit is
1252 running. If Monit is running as root, you may optionally specify the
1253 UID and GID the executed program should switch to.
1254
1255 Example:
1256
1257 check process mmonit with pidfile /usr/local/mmonit/mmonit/logs/mmonit.pid
1258 start program = "/usr/local/mmonit/bin/mmonit" as uid "mmonit" and gid "mmonit"
1259 stop program = "/usr/local/mmonit/bin/mmonit stop" as uid "mmonit" and gid "mmonit"
1260
1261 In the case of a process check, Monit will wait up to 30 seconds for
1262 the start/stop action to finish before giving up and reporting an error.
1263 You can override this timeout using the TIMEOUT option or globally
1264 using the "set limits" statement.
1265
1266 Example:
1267
1268 check process foobar with pidfile /var/run/foobar.pid
1269 start program = "/etc/init.d/foobar start" with timeout 60 seconds
1270 stop program = "/etc/init.d/foobar stop"
1271
1273 Services are checked regularly in an interval defined by the "set
1274 daemon n" statement. Checks are performed in the same order as they are
1275 written in the ".monitrc" file, except if dependencies are set up
1276 between services, where pre-requisite services are tested first.
1277
1278 It is possible to modify a service check schedule by using the "every"
1279 statement.
1280
1281 There are three variants:
1282
1283 1. A poll cycle multiple
1284 EVERY [number] CYCLES
1285
1286 2. Cron-style
1287 EVERY [cron]
1288
1289 3. Negative Cron-style (do-not-check)
1290 NOT EVERY [cron]
1291
1292 A cron-style string consists of 5 fields separated by white-space.
1293 All fields are required:
1294
1295 Name: | Allowed values: | Special characters:
1296 ---------------------------------------------------------------
1297 Minutes | 0-59 | * - ,
1298 Hours | 0-23 | * - ,
1299 Day of month | 1-31 | * - ,
1300 Month | 1-12 (1=jan, 12=dec) | * - ,
1301 Day of week | 0-6 (0=sunday, 6=saturday) | * - ,
1302
1303 The special characters:
1304
1305 Character: | Description:
1306 ---------------------------------------------------------------
1307 * (asterisk) | The asterisk indicates that the expression will
1308 | match for all values of the field; e.g., using
1309 | an asterisk in the 4th field (month) would
1310 | indicate every month.
1311 - (hyphen) | Hyphens are used to define ranges. For example,
1312 | 8-9 in the hour field indicates between 8AM and
1313 | 9AM. Note that a range runs from the start time up
1314 | to and including the end time, that is, from 8AM
1315 | until 10AM unless minutes are set. As another example,
1316 | 1-5 in the weekday field specifies monday to
1317 | friday (including friday).
1318 , (comma) | Commas are used to specify a sequence. For example,
1319 | 17,18 in the day field indicates the 17th and 18th
1320 | day of the month. A sequence can also include
1321 | ranges. For example, 1-5,0 in the weekday
1322 | field indicates monday to friday and sunday.
1323
1324 Example 1: Check once per two cycles
1325
1326 check process nginx with pidfile /var/run/nginx.pid
1327 every 2 cycles
1328
1329 Example 2: Check every workday between 8AM and 7PM
1330
1331 check program checkOracleDatabase
1332 with path /var/monit/programs/checkoracle.pl
1333 every "* 8-19 * * 1-5"
1334
1335 Example 3: Do not run the check in the backup window on Sunday between
1336 0AM and 3AM, otherwise run the check with the regular poll cycle
1337 frequency.
1338
1339 check process mysqld with pidfile /var/run/mysqld.pid
1340 not every "* 0-3 * * 0"
1341
1342 Limitations:
1343
1344 The current scheduler is poll cycle based. If a service check is
1345 scheduled with the every cron statement, Monit will check if the
1346 current time matches the cron-string pattern. If it does, the check
1347 is performed; otherwise it is skipped. The cron specification does not
1348 guarantee exactly when the test will run; this depends on the default
1349 poll time and the length of the check cycle. In other words, we cannot
1350 guarantee that Monit will run at a specific time. Therefore we strongly
1351 recommend using an asterisk in the minute field or at minimum a range,
1352 e.g. 0-15. Never use a specific minute as Monit may not run in that
1353 minute.
1354
1355 We will address this limitation in a future release and convert the
1356 scheduler from serial polling into a parallel non-blocking scheduler
1357 where checks are guaranteed to run on time and with seconds resolution.
1358
1360 Service entries in the control file, monitrc, can be grouped together
1361 by the group statement. The syntax is simply (keyword in capitals):
1362
1363 GROUP groupname
1364
1365 With this statement it is possible to group similar service entries
1366 together and manage them as a whole. Monit provides functions to start,
1367 stop, restart, monitor and unmonitor a group of services, like so:
1368
1369 To start a group of services from the console:
1370
1371 monit -g <groupname> start
1372
1373 To stop a group of services:
1374
1375 monit -g <groupname> stop
1376
1377 To restart a group of services:
1378
1379 monit -g <groupname> restart
1380
1381 A service can be added to multiple groups by using more than one group
1382 statement:
1383
1384 group www
1385 group filesystem
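
For context, a group statement sits inside a service entry like any
other statement. A minimal sketch, using a hypothetical service name,
pidfile path and group names:

    # hypothetical service, member of two groups
    check process nginx with pidfile /var/run/nginx.pid
        group www
        group frontend

With such an entry, "monit -g www restart" would act on this service
together with every other member of the www group.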
1386
1388 Monit supports two monitoring modes: active and passive.
1389
1390 Syntax:
1391
1392 MODE <ACTIVE | PASSIVE>
1393
1394 In active mode, Monit will pro-actively monitor a service and in case
1395 of problems raise alerts and restart the service. Active is the default
1396 mode.
1397
1398 The passive mode is similar to the active mode, except that if the
1399 service fails, Monit will not try to fix the problem by restarting the
1400 service and will only raise alerts.
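
A sketch of a service entry in passive mode follows. The service name
and paths are hypothetical; with this setting Monit would raise alerts
for the service but leave any restart to the administrator:

    # hypothetical service: alert only, never restart
    check process legacyd with pidfile /var/run/legacyd.pid
        start program = "/etc/init.d/legacyd start"
        stop program = "/etc/init.d/legacyd stop"
        mode passive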
1401
1403 Monit supports three reboot modes: start, nostart and laststate.
1404
1405 Syntax:
1406
1407 ONREBOOT <START | NOSTART | LASTSTATE>
1408
1409 In start mode, Monit will always start the service automatically on
1410 reboot, even if it was stopped before restart. This is the default mode
1411 and used if onreboot is not specified.
1412
1413 In nostart mode, the service is never started automatically after
1414 reboot. This mode is intended for high-availability solutions with
1415 active/passive clusters. For example, a service group HA, consisting of
1416 e.g. a mobile IP alias and an application server, is started on host
1417 H1, host H2 is the backup and a heartbeat is in place between both hosts.
1418 The service group HA must be started on one node only. If H1 dies, H2
1419 takes over the HA group. If H1 reboots, it is important that it does not
1420 try to start the HA group as well, even though the group was active on
1421 H1 before it crashed, because HA is now running on H2.
1422
1423 In laststate mode, a service's monitoring state is persistent across
1424 reboot. For instance, if a service was started before reboot, it will
1425 be started after reboot. If it was stopped before reboot, it will not
1426 be started after and so on.
1427
1428 The default ONREBOOT START mode can be overridden globally:
1429
1430 SET ONREBOOT <START | NOSTART | LASTSTATE>
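
As a sketch of how the two forms might be combined, the following sets
a global default and uses a different mode for one clustered service.
The service name and paths are hypothetical, and it is assumed here
that a per-service ONREBOOT statement takes precedence over the global
default:

    # global default for all services
    set onreboot laststate

    # hypothetical member of the HA group from the scenario above
    check process appserver with pidfile /var/run/appserver.pid
        start program = "/etc/init.d/appserver start"
        stop program = "/etc/init.d/appserver stop"
        group HA
        onreboot nostart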
1431
1433 Monit provides a restart limit mechanism for situations where a service
1434 simply refuses to start or respond over a longer period.
1435
1436 The restart limit mechanism is based on number of service restarts and
1437 number of poll-cycles. For example, if a service had x restarts within
1438 y poll-cycles (where x <= y) then Monit will perform an action (for
1439 example unmonitor the service). If a timeout occurs, Monit will send an
1440 alert message if you have registered interest in this event.
1441
1442 The syntax for the timeout statement is as follows (keywords are in
1443 capital):
1444
1445 IF <number> RESTART <number> CYCLE(S) THEN <action>
1446
1447 The action value is either one of the common actions or TIMEOUT (kept
1448 for backward compatibility; equivalent to the UNMONITOR action).
1449
1450 Here is an example where Monit will unmonitor the service if it was
1451 restarted 2 times within 3 cycles:
1452
1453 if 2 restarts within 3 cycles then unmonitor
1454
1455 To have Monit check the service again after monitoring was disabled,
1456 run "monit monitor servicename" from the command line.
1457
1458 Example for setting custom exec on timeout:
1459
1460 if 5 restarts within 5 cycles then exec "/foo/bar"
1461
1462 Example for stopping the service:
1463
1464 if 7 restarts within 10 cycles then stop
1465
1467 If specified in the control file, Monit can do dependency checking
1468 before start, stop, monitoring or unmonitoring of services. The
1469 dependency statement may be used within any service entry in the
1470 Monit control file.
1471
1472 The syntax for the depend statement is simply:
1473
1474 DEPENDS on service[, service [,...]]
1475
1476 Where service is a check service entry name used in your ".monitrc"
1477 file, for instance apache or datafs.
1478
1479 You may add more than one service name of any type or use more than one
1480 depend statement in an entry.
1481
1482 Services specified in a depend statement will be checked during
1483 stop/start/monitor/unmonitor operations.
1484
1485 If a service is stopped or unmonitored, Monit will also stop/unmonitor
1486 any services that depend on it.
1487
1488 If the service is started, all services which this service depends on
1489 will be started before starting this service. If the start of a
1490 prerequisite service fails, the dependent service will NOT be started,
1491 but Monit will remember that it should start and will retry next cycle.
1492
1493 If a service is restarted, it will first stop any active services that
1494 depend on it and after it is started, start all depending services that
1495 were active before the restart again.
1496
1497 Here is an example where we set up an apache service entry to depend on
1498 the underlying apache binary. If the binary should change an alert is
1499 sent and apache is not monitored anymore. The rationale is security and
1500 that Monit should not execute a possibly cracked apache binary.
1501
1502 (1) check process apache with pidfile "/var/run/httpd.pid"
1503 (2) depends on httpd
1504 (3) ...
1505 (4)
1506 (5) check file httpd with path /usr/bin/httpd
1507 (6) if failed checksum then stop
1508
1509 The first entry is the process entry for apache. The second line sets
1510 up a dependency between this entry and the service entry named httpd in
1511 line 5. A dependency tree works as follows: if an action is conducted
1512 in a lower branch it will propagate upward in the tree and for every
1513 dependent entry execute the same action. In this case, if the checksum
1514 should fail in line 6 then a stop action is executed and the apache binary
1515 is not checked anymore. But since the apache process entry depends on
1516 the httpd entry, this entry will also execute the stop action. In short,
1517 if the checksum test for the httpd binary file should fail, both the
1518 check file httpd and the check process apache entry are stopped.
1519
1520 A dependency tree is a general construct and can be used between all
1521 types of service entries and span many levels and propagate any
1522 supported action (except the exec action which will not propagate
1523 upward in a dependency tree for obvious reasons).
1524
1525 Here is another example. Consider the following common server
1526 setup:
1527
1528 WEB-SERVER -> APPLICATION-SERVER -> DATABASE -> FILESYSTEM
1529 (a) (b) (c) (d)
1530
1531 You can set dependencies so that the web-server depends on the
1532 application server to run before the web-server starts and the
1533 application server depends on the database server and the database
1534 depends on the filesystem to be mounted before it starts. See also the
1535 example section below for examples using the depend statement.
1536
1537 Here we describe how Monit will function with the above dependencies:
1538
1539 If no services are running
1540 Monit will start the servers in the following order: d, c, b, a
1541
1542 If all servers are running
1543 When you run 'monit stop all' this is the stop order: a, b, c, d.
1544 If you run 'monit stop d' then a, b and c are also stopped because
1545 they depend on d and finally d is stopped.
1546
1547 If a does not run
1548 Monit will start a
1549
1550 If b does not run
1551 Monit will first stop a then start b and finally start a if b is up
1552 again.
1553
1554 If c does not run
1555 Monit will first stop a and b then start c and finally start b then
1556 a.
1557
1558 If d does not run
1559 Monit will first stop a, b and c then start d and finally start c,
1560 b then a.
1561
1562 If the control file contains a depend loop.
1563 A depend loop is, for example, a->b and b->a or a->b->c->a.
1564
1565 When Monit starts it will check for such loops and complain and
1566 exit if a loop was found. It will also exit with a complaint if a
1567 depend statement was used that does not point to a service in the
1568 control file.
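
The WEB-SERVER -> APPLICATION-SERVER -> DATABASE -> FILESYSTEM chain
described above could be declared roughly as follows. This is only a
sketch: the service names, device, pidfile paths and start/stop
commands are hypothetical and need to be adapted to the actual system:

    # (d) hypothetical filesystem holding the database files
    check filesystem datafs with path /dev/sda1
        start program = "/bin/mount /data"
        stop program = "/bin/umount /data"

    # (c) database, needs the filesystem
    check process database with pidfile /var/run/mysqld.pid
        start program = "/etc/init.d/mysql start"
        stop program = "/etc/init.d/mysql stop"
        depends on datafs

    # (b) application server, needs the database
    check process appserver with pidfile /var/run/appserver.pid
        start program = "/etc/init.d/appserver start"
        stop program = "/etc/init.d/appserver stop"
        depends on database

    # (a) web server, needs the application server
    check process webserver with pidfile /var/run/httpd.pid
        start program = "/etc/init.d/httpd start"
        stop program = "/etc/init.d/httpd stop"
        depends on appserver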
1569
1571 LIMITS
1572
1573 You can configure and set various limits to tweak buffer sizes and
1574 timeouts used by Monit. In most situations the default values are fine.
1575 If needed, below are the limits you can currently modify in Monit.
1576
1577 Syntax:
1578
1579 SET LIMITS {
1580 PROGRAMOUTPUT: <number> <unit>,
1581 SENDEXPECTBUFFER: <number> <unit>,
1582 FILECONTENTBUFFER: <number> <unit>,
1583 HTTPCONTENTBUFFER: <number> <unit>,
1584 NETWORKTIMEOUT: <number> <timeunit>
1585 PROGRAMTIMEOUT: <number> <timeunit>
1586 STOPTIMEOUT: <number> <timeunit>
1587 STARTTIMEOUT: <number> <timeunit>
1588 RESTARTTIMEOUT: <number> <timeunit>
1589 }
1590
1591 Where:
1592 unit is "B" (byte), "kB" (kilobyte) or "MB" (megabyte)
1593 timeunit is "MS" (millisecond) or "S" (second)
1594
1595 Options legend:
1596
1597 ----------------------------------------------------------------------------------
1598 | Option | Description | Default |
1599 ----------------------------------------------------------------------------------
1600 | programOutput | limit for check program output (truncated after) | 512 B |
1601 | sendExpectBuffer | limit for send/expect protocol test | 256 B |
1602 | fileContentBuffer | limit for file content test (line) | 512 B |
1603 | httpContentBuffer | limit for HTTP content test (response body) | 1 MB |
1604 | networkTimeout | timeout for network I/O | 5 s |
1605 | programTimeout | timeout for check program | 300 s |
1606 | stopTimeout | timeout for service stop | 30 s |
1607 | startTimeout | timeout for service start | 30 s |
1608 | restartTimeout | timeout for service restart | 30 s |
1609 ----------------------------------------------------------------------------------
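
For illustration, a statement that raises a few of these limits might
look like the sketch below. The values are arbitrary examples rather
than recommendations, the separators follow the syntax summary above,
and options that are left out are assumed to keep their defaults:

    # example values only
    set limits {
        programOutput:     2 kB,
        httpContentBuffer: 2 MB,
        networkTimeout:    10 s
        startTimeout:      60 s
    }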
1610
1611 GENERAL SYNTAX
1612
1613 Monit offers several if-tests you can use in a 'check' statement to
1614 test various aspects of a service.
1615
1616 You can test either for a predefined value or for a range and take
1617 actions if the value changes.
1618
1619 General syntax for testing a specific value or range:
1620
1621 IF <test> THEN <action> [ELSE IF SUCCEEDED THEN <action>]
1622
1623 The action is evaluated each time the <test> condition is true. The
1624 success action is optional and executed only when the state changes from
1625 failure to success. If the success action is not set, Monit will send a
1626 recovery alert by default.
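
As a sketch of this form, the following test sends an alert while an
HTTP check fails and runs a program once the check succeeds again. The
host name and the script path are hypothetical:

    # hypothetical host and recovery script
    check host www.example.com with address www.example.com
        if failed port 80 protocol http
            then alert
            else if succeeded then exec "/usr/local/bin/notify-recovery.sh"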
1627
1628 General syntax for a value change test:
1629
1630 IF CHANGED <test> THEN <action>
1631
1632 The action is executed each time the value changes. Monit will remember
1633 the new value and will trigger an event if the value changes again.
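
A minimal sketch of a value change test, using a checksum test on a
hypothetical file entry:

    # alert whenever the file's checksum differs from the last seen value
    check file sshd_config with path /etc/ssh/sshd_config
        if changed checksum then alert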
1634
1635 ACTION
1636
1637 In each test you must select the action to be executed from this list:
1638
1639 · ALERT sends the user an alert event on each state change.
1640
1641 · RESTART restarts the service and sends an alert. Restart is
1642 performed by calling the service's registered restart method or by
1643 first calling the stop method followed by the start method if
1644 restart is not set.
1645
1646 · START starts the service by calling the service's registered start
1647 method and sends an alert.
1648
1649 · STOP stops the service by calling the service's registered stop
1650 method and sends an alert. If Monit stops a service, it will no longer
1651 be checked by Monit or restarted later. To reactivate monitoring of
1652 the service you must explicitly enable monitoring from the web
1653 interface or from the console.
1654
1655 · EXEC can be used to execute an arbitrary program and send an alert.
1656 If you choose this action you must state the program to be executed
1657 and if the program requires arguments you must enclose the program
1658 and its arguments in a quoted string. You may optionally specify
1659 the uid and gid the executed program should switch to upon start.
1660 The program is executed only once if the test fails. You can enable
1661 execute repetition if the error persists for a given number of
1662 cycles. For instance:
1663
1664 if failed <test> then exec "/usr/local/bin/sms.sh"
1665 as uid "nobody" and gid "nobody"
1666 repeat every 5 cycles
1667
1668 Remember, if Monit is run by root, then all programs executed by
1669 Monit will be started with superuser privileges unless the uid and
1670 gid extension is used.
1671
1672 · UNMONITOR will disable monitoring of the service and send an alert.
1673 The service will not be checked by Monit anymore nor restarted
1674 again later. To reactivate monitoring of the service you must
1675 explicitly enable monitoring from the web interface or from the
1676 console.
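
A minimal sketch combining several actions in one service entry, with
escalating thresholds. The service name, paths and limits are
illustrative assumptions only:

    check process myapp with pidfile /var/run/myapp.pid
        start program = "/etc/init.d/myapp start"
        stop program = "/etc/init.d/myapp stop"
        # warn first, then restart if the load persists
        if cpu > 60% for 2 cycles then alert
        if cpu > 90% for 5 cycles then restart
        # stop the service (and its monitoring) on runaway memory use
        if total memory usage > 500 MB for 5 cycles then stop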
1677
1678 FAULT TOLERANCE
1679
1680 By default an action is executed if it matches and the corresponding
1681 service is set in an error state. However, you can require a test to
1682 fail more than once before the error event is triggered and the service
1683 state is changed to failed. This is useful to avoid getting alerts on
1684 spurious errors, which can happen, especially with network tests.
1685
1686 Syntax:
1687
1688 FOR <X> CYCLES ...
1689
1690 or:
1691
1692 <X> [TIMES WITHIN] <Y> CYCLES ...
1693
1694 The condition can be used for both the failure and the success action.
1695
1696 The first, simpler and recommended format requires "X" consecutive
1697 events before switching the state:
1698
1699 if failed
1700 port 80
1701 for 3 cycles
1702 then alert
1703
1704 The second format is more advanced and allows one to tolerate
1705 intermittent issues, but still catch excessive problems, where the
1706 service is flapping between error and success states frequently.
1707
1708 For example, if every second cycle fails (1-0-1-0-1-0-...), then a "for
1709 2 cycles" condition will never match, despite the service having
1710 problems. The following statement will catch such a state:
1711
1712 if failed
1713 port 80
1714 for 3 times within 5 cycles
1715 then alert
1716
1717 Example which sets multiple error levels and actions:
1718
1719 check filesystem rootfs with path /dev/hda1
1720 if space usage > 80% for 5 times within 15 cycles then alert
1721 if space usage > 90% for 5 cycles then exec '/try/to/free/the/space'
1722
1723 Note: the maximum value for cycles is 64.
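
Since the condition may also be attached to the success action, a test
can require several good cycles before recovery is reported. A sketch,
assuming a web server on port 80 as in the examples above; the
placement of the cycle count after SUCCEEDED is an assumption based on
the general syntax:

    if failed
        port 80
        for 3 times within 5 cycles
    then alert
    else if succeeded for 2 cycles then alert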
1724
1725 EXISTENCE TESTS
1726 This test allows one to trigger an action based on the existence of the
1727 monitored object. It is supported for process, file, directory,
1728 filesystem and fifo services.
1729
1730 If no existence test is defined, the implicit non-existence test with
1731 restart action is activated, so for example if the process stops, Monit
1732 will restart it.
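
Spelled out explicitly, the implicit default for a process is roughly
equivalent to the following sketch (the service name, pidfile and init
script paths are hypothetical):

    check process sshd with pidfile /var/run/sshd.pid
        start program = "/etc/init.d/sshd start"
        stop program = "/etc/init.d/sshd stop"
        if does not exist then restart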
1733
1734 There are two types of existence tests:
1735
1736 NON-EXIST
1737
1738 This test will trigger an action if the object does not exist. It can
1739 be used for example to make sure apache is running, data filesystem is
1740 mounted, etc.
1741
1742 IF [DOES] NOT EXIST THEN <action>
1743
1744 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1745 "UNMONITOR".
1746
1747 Example: Exec a script if a filesystem does NOT exist:
1748
1749 check filesystem disk1 with path /dev/sda1
1750 if does not exist then exec "/sbin/mount..."
1751
1752 EXIST
1753
1754 This test is the inverse of the non-existence test: it will trigger an
1755 action if the object DOES exist. It can be used for example to kill a
1756 process which shouldn't be running.
1757
1758 IF [DOES] EXIST THEN <action>
1759
1760 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1761 "UNMONITOR".
1762
1763 Example: kill a process that should not run:
1764
1765 check process vmware matching "vmware"
1766 if exist then exec "/usr/bin/pkill -9 vmware"
1767
1768 Example: alert if a file exists which shouldn't:
1769
1770 check file x with path /some/path/x
1771 if exist then alert
1772
1773 RESOURCE TESTS
1774 Monit can examine how many resources a service is using. This test can
1775 only be used within a system or process service entry in the Monit
1776 control file.
1777
1778 Depending on system or process characteristics, services can be stopped
1779 or restarted and alerts can be generated. Thus it is possible to
1780 utilise systems which are idle and to spare systems under high load.
1781
1782 Syntax:
1783
1784 IF <resource> <operator> <value> THEN <action>
1785
1786 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
1787 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1788 "notequal" in human readable form (if not specified, default is EQUAL).
1789
1790 value is either an integer or a real number.
1791
1792 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1793 "UNMONITOR".
1794
1795 resource set depends on the service type:
1796
1797 System resource tests
1798
1799 LOADAVG([1min|5min|15min]) refers to the system's load average. The
1800 load average is the number of processes in the system run queue,
1801 averaged over the specified time period. Example:
1802
1803 if loadavg (1min) > 90 for 15 cycles then alert
1804 if loadavg (5min) > 80 for 10 cycles then alert
1805 if loadavg (15min) > 70 for 8 cycles then alert
1806
1807 CPU([user|system|wait]) is the percentage of time the system spends in
1808 user space, in kernel space or waiting for I/O. The modifier is
1809 optional; if not used, the total system cpu usage is tested. Example:
1810
1811 if cpu usage > 95% for 10 cycles then alert
1812
1813 MEMORY is the system memory usage [%] or absolute value [B, kB, MB,
1814 GB]. Example:
1815
1816 if memory usage > 75% for 5 cycles then alert
1817
1818 SWAP is the swap usage of the system [%] or absolute [B, kB, MB, GB].
1819 Example:
1820
1821 if swap usage > 20% for 10 cycles then alert
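
The system tests above can be combined in a single system entry. The
sketch below also uses each of the three operator notations once; the
service name and thresholds are illustrative assumptions:

    check system myhost
        # C notation
        if loadavg (5min) > 4 for 10 cycles then alert
        # shell notation
        if cpu usage gt 80% for 10 cycles then alert
        # human readable notation
        if memory usage greater 75% for 5 cycles then alert
        # absolute value instead of percent
        if swap usage > 1 GB for 10 cycles then alert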
1822
1823 Process resource tests
1824
1825 CPU is the CPU usage of the process itself [%]. Monit calculates the
1826 CPU usage based on the number of threads vs. available CPU cores. If the
1827 process has one thread, 100% CPU usage equals 100% utilization of one
1828 CPU core. If it has 2 threads, 100% CPU usage is reported when it uses
1829 2 CPU cores at 100%, etc. If the process has more threads than the
1830 machine has available CPU cores, then 100% CPU usage corresponds to
1831 utilization of all available CPU cores. Example:
1832
1833 if cpu > 10% for 5 cycles then restart
1834
1835 TOTAL CPU is the total CPU usage of the process and its children [%].
1836 You will typically want to use TOTAL CPU for services like the Apache
1837 web server, where one master process forks child processes as
1838 workers. Example:
1839
1840 if total cpu > 50% for 10 cycles then restart
1841
1842 THREADS is the number of threads of the process. Example:
1843
1844 if threads > 3 then alert
1845
1846 CHILDREN is the number of child processes of the process. Example:
1847
1848 if children > 10 then alert
1849
1850 MEMORY is the memory usage of the process itself, [%] or absolute value
1851 [B, kB, MB, GB]. Example:
1852
1853 if memory usage > 8 MB then alert
1854
1855 TOTAL MEMORY is the memory usage of the process and its child processes
1856 in either percent or as an amount [B, kB, MB, GB]. Example:
1857
1858 if total memory usage > 1% for 10 cycles then alert
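
To illustrate the difference between the per-process and TOTAL
variants, a sketch for a forking server follows. The pidfile and init
script paths are assumptions and the limits are arbitrary:

    check process apache with pidfile /var/run/httpd.pid
        start program = "/etc/init.d/httpd start"
        stop program = "/etc/init.d/httpd stop"
        # master process only
        if cpu > 40% for 5 cycles then alert
        # master process plus all forked workers
        if total cpu > 80% for 5 cycles then restart
        if total memory usage > 2 GB for 5 cycles then restart
        if children > 250 then alert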
1859
1860 PROCESS DISK I/O TEST
1861 Monit can test a process's filesystem read and write activity. This test
1862 can only be used in the context of a process service type. Monit will
1863 normally need to run as the root user to access these metrics.
1864
1865 The OS usually supports the per-process I/O metrics by bytes or by
1866 operations.
1867
1868 Per-process I/O activity statistics by platform:
1869
1870 -----------------------------------
1871 | Platform | Operation | Byte |
1872 -----------------------------------
1873 | AIX | x | |
1874 | DragonFlyBSD | x | |
1875 | FreeBSD | x | |
1876 | Linux | | x |
1877 | MacOS | | x |
1878 | NetBSD | x | |
1879 | OpenBSD | x | |
1880 | Solaris | x | |
1881 -----------------------------------
1882
1883 Read: bytes per second
1884
1885 Syntax:
1886
1887 IF DISK READ [RATE] <operator> <number> <unit>/S THEN action
1888
1889 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
1890 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1891 "notequal" in human readable form (if not specified, default is EQUAL).
1892
1893 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
1894 "kilobyte", "megabyte", "gigabyte", "percent".
1895
1896 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1897 "UNMONITOR".
1898
1899 Example:
1900
1901 check process p...
1902 if disk read > 1 MB/s then alert
1903
1904 Read: operations per second
1905
1906 Syntax:
1907
1908 IF DISK READ [RATE] <operator> <number> operations/S THEN action
1909
1910 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
1911 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1912 "notequal" in human readable form (if not specified, default is EQUAL).
1913
1914 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1915 "UNMONITOR".
1916
1917 Example:
1918
1919 check process p...
1920 if disk read rate > 500 operations/s then alert
1921
1922 Write: bytes per second
1923
1924 Syntax:
1925
1926 IF DISK WRITE [RATE] <operator> <number> <unit>/S THEN action
1927
1928 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
1929 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1930 "notequal" in human readable form (if not specified, default is EQUAL).
1931
1932 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
1933 "kilobyte", "megabyte", "gigabyte", "percent".
1934
1935 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1936 "UNMONITOR".
1937
1938 Example:
1939
1940 check process p...
1941 if disk write rate > 1 MB/s then alert
1942
1943 Write: operations per second
1944
1945 Syntax:
1946
1947 IF DISK WRITE [RATE] <operator> <number> operations/S THEN action
1948
1949 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
1950 "eq", "ne" in shell sh notation and "greater", "less", "equal",
1951 "notequal" in human readable form (if not specified, default is EQUAL).
1952
1953 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1954 "UNMONITOR".
1955
1956 Example:
1957
1958 check process p...
1959 if disk write rate > 500 operations/s then alert
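
A sketch that watches both directions for one process, using the
byte-based form (see the platform table above for which metric your OS
provides). The service name and pidfile are hypothetical:

    check process mysqld with pidfile /var/run/mysqld/mysqld.pid
        if disk read rate > 10 MB/s for 5 cycles then alert
        if disk write rate > 10 MB/s for 5 cycles then alert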
1960
1961 FILE CHECKSUM TEST
1962 The checksum statement may only be used in a file service entry and can
1963 be used to check the file's MD5 or SHA1 checksum.
1964
1965 Check specific checksum:
1966
1967 IF FAILED [MD5|SHA1] CHECKSUM [EXPECT checksum] THEN action
1968
1969 Check any file changes:
1970
1971 IF CHANGED [MD5|SHA1] CHECKSUM THEN action
1972
1973 The choice of MD5 or SHA1 is optional. MD5 features a 128-bit checksum
1974 (32-character hex-encoded string) and SHA1 a 160-bit checksum
1975 (40-character hex-encoded string). If this option is omitted, Monit will
1976 try to guess the method from the EXPECT string or use MD5 by default.
1977
1978 "expect" is optional and if used, specifies the md5 or sha1 string
1979 Monit should expect when testing a file's checksum. Monit will then not
1980 compute an initial checksum for the file, but instead use the string
1981 you submit. For example:
1982
1983 if failed
1984 checksum expect 8f7f419955cefa0b33a2ba316cba3659
1985 then alert
1986
1987 You can, for example, use the GNU utility md5sum(1) or sha1sum(1) to
1988 create a checksum string for a file and use this string in the expect-
1989 statement.
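
A SHA1 variant of the example above, as a sketch. The 40-character
string used here is the well-known SHA1 digest of empty input, so this
hypothetical check alerts once the file is no longer empty:

    check file placeholder with path /var/run/myapp.placeholder
        if failed
            sha1 checksum expect da39a3ee5e6b4b0d3255bfef95601890afd80709
        then alert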
1990
1991 Reloading a server if its configuration file was changed:
1992
1993 check file apache_conf with path /etc/apache/httpd.conf
1994 if changed checksum then exec "/usr/bin/apachectl graceful"
1995
1996 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
1997 "UNMONITOR".
1998
1999 TIMESTAMP TEST
2000 The timestamp statement may only be used in a file, fifo or directory
2001 service entry.
2002
2003 Relative timestamp syntax:
2004
2005 IF <ACCESS TIME | ATIME | MODIFICATION TIME | MTIME | CHANGE TIME | CTIME | TIME[STAMP]> <operator> <value> [unit] THEN <action>
2006
2007 Timestamp change syntax:
2008
2009 IF CHANGED <ACCESS TIME | ATIME | MODIFICATION TIME | MTIME | CHANGE TIME | CTIME | TIME[STAMP]> THEN action
2010
2011 There are four timestamp test types:
2012
2013 ACCESS (ATIME)
2014 Test the timestamp which is updated whenever the object is
2015 accessed, for example when the file is read. Filesystems usually
2016 allow one to disable atime updates using mount options, so
2017 this test will work only if the filesystem performs atime
2018 updates.
2019
2020 CHANGE (CTIME)
2021 Test the timestamp which is updated whenever the object
2022 metadata such as owner, group, permissions or hard link
2023 count are changed.
2024
2025 MODIFICATION (MTIME)
2026 Test the timestamp which is updated whenever the object
2027 content is modified. The file modification timestamp is
2028 updated whenever the file is truncated or written to. The
2029 directory modification timestamp is updated whenever some
2030 files/subdirectories were added to the directory or removed
2031 from that directory.
2032
2033 DEFAULT (LATEST OF CHANGE AND MODIFICATION TIMES)
2034 If no specific timestamp type is set, the latest of change
2035 and modification timestamps is checked. This test allows
2036 for simple testing of any object modification (data and
2037 metadata).
2038
2039 operator is a choice of "<", ">", "!=", "==" in C notation, "GT", "LT",
2040 "EQ", "NE" in shell sh notation and "NEWER", "OLDER", "GREATER", "LESS",
2041 "EQUAL", "NOTEQUAL" in human readable form (if not specified, default
2042 is EQUAL).
2043
2044 value is a time watermark.
2045
2046 unit is either "SECOND(S)", "MINUTE(S)", "HOUR(S)" or "DAY(S)".
2047
2048 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2049 "UNMONITOR".
2050
2051 For example to reload apache if the configuration file changed:
2052
2053 check file apache_conf with path /etc/apache/httpd.conf
2054 if changed timestamp then exec "/usr/bin/apachectl graceful"
2055
2056 For example, to test a directory for file addition or removal:
2057
2058 check directory bar path /foo/bar
2059 if changed timestamp then alert
2060
2061 Example of sending an alert if a log file has not been updated for more
2062 than 1 hour:
2063
2064 if timestamp is older than 1 hour then alert
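
A sketch using specific timestamp types instead of the default. The
file and directory names are illustrative:

    check file myapp_log with path /var/log/myapp.log
        if mtime is older than 1 hour then alert

    check directory spool with path /var/spool/myapp
        if changed ctime then alert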
2065
2066 FILE SIZE TEST
2067 The size statement may only be used in a check file service entry. If
2068 specified in the control file, Monit will compute a size for a file.
2069
2070 Testing specific size or range:
2071
2072 IF SIZE [[operator] value [unit]] THEN action
2073
2074 Testing size changes:
2075
2076 IF CHANGED SIZE THEN action
2077
2078 operator is a choice of "<", ">", "!=", "==" in C notation, "GT", "LT",
2079 "EQ", "NE" in shell sh notation and "GREATER", "LESS", "EQUAL",
2080 "NOTEQUAL" in human readable form (if not specified, default is EQUAL).
2081
2082 value is a size watermark.
2083
2084 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
2085 "kilobyte", "megabyte", "gigabyte". If it is not specified, "byte" unit
2086 is assumed by default.
2087
2088 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2089 "UNMONITOR".
2090
2091 For example to send an alert if the file is too large:
2092
2093 check file mydb with path /data/mydatabase.db
2094 if size > 1 GB then alert
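
Size tests can also catch files that are unexpectedly small or that
should never change. A sketch, with service names and paths chosen for
illustration:

    check file backup with path /var/backups/latest.tar.gz
        if size < 1 MB then alert

    check file su_binary with path /bin/su
        if changed size then alert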
2095
2096 FILE CONTENT TEST
2097 The content statement can be used to incrementally test the content of
2098 a text file by using regular expressions.
2099
2100 Syntax:
2101
2102 IF CONTENT <operator> <regex|path> THEN action
2103
2104 operator is either a "=" for match or "!=" for no-match.
2105
2106 regex is a string containing the extended regular expression. See also
2107 regex(7).
2108
2109 path is an absolute path to a file containing an extended regular
2110 expression on every line. See also regex(7).
2111
2112 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2113 "UNMONITOR".
2114
2115 On startup the read position is set to the end of the file and Monit
2116 continues to scan to the end of the file on each cycle.
2117
2118 If the file size decreases or the inode changes, the read position is
2119 set to the start of the file.
2120
2121 Only lines ending with a newline character are inspected.
2122
2123 By default only the first 511 characters of a line are inspected. You
2124 can increase the limit using the set limits statement.
2125
2126 IGNORE CONTENT <operator> <regex|path>
2127
2128 Lines matching an IGNORE are not inspected during later evaluations.
2129 IGNORE CONTENT always has precedence over IF CONTENT.
2130
2131 All IGNORE CONTENT statements are evaluated first, in the order of
2132 their appearance. Thereafter, all the IF CONTENT statements are
2133 evaluated.
2134
2135 For example:
2136
2137 check file syslog with path /var/log/syslog
2138 ignore content = "monit"
2139 if content = "^mrcoffee" then alert
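
Because the regex is an extended regular expression, alternation can
cover several messages in one test. A sketch with illustrative
patterns and a hypothetical service name:

    check file auth_log with path /var/log/auth.log
        ignore content = "CRON"
        if content = "Failed password|authentication failure" then alert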
2140
2141 FILESYSTEM MOUNT FLAGS TEST
2142 Monit can test the filesystem mount flags for changes. This test is
2143 implicit and Monit will send an alert in case of failure by default.
2144
2145 This test is useful for detecting changes of filesystem flags, such as
2146 when the filesystem becomes read-only (on disk error) or mount flags are
2147 changed (such as nosuid).
2148
2149 The syntax for the fsflags statement is:
2150
2151 IF CHANGED FSFLAGS THEN action
2152
2153 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2154 "UNMONITOR".
2155
2156 Example:
2157
2158 check filesystem rootfs with path /
2159 if changed fsflags then exec "/my/script"
2160
2161 SPACE USAGE TEST
2162 Monit can test a filesystem or a disk for space usage. This test may
2163 only be used in the context of a filesystem service type.
2164
2165 Filesystems usually have some space reserved for the root user (ca.
2166 1-5%), so non-superusers cannot write to a nearly full filesystem. If
2167 you set a limit for the filesystem which is used by non-root users you
2168 might want to consider these reserved blocks when setting the limit.
2169 You can use Monit itself to view the reserved blocks percentage by
2170 using the CLI status command or the HTTP interface for the given
2171 filesystem.
2172
2173 Syntax:
2174
2175 IF SPACE operator value unit THEN action
2176
2177 or:
2178
2179 IF SPACE FREE operator value unit THEN action
2180
2181 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
2182 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2183 "notequal" in human readable form (if not specified, default is EQUAL).
2184
2185 unit is a choice of "B","KB","MB","GB", "%" or long alternatives
2186 "byte", "kilobyte", "megabyte", "gigabyte", "percent".
2187
2188 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2189 "UNMONITOR".
2190
2191 Example:
2192
2193 check filesystem rootfs with path /
2194 if space usage > 90% then alert
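
A sketch of the FREE form with an absolute unit, combined with a
percentage limit. The device path and the 1 GB figure are arbitrary
assumptions:

    check filesystem datafs with path /dev/sdb1
        if space usage > 90% for 5 cycles then alert
        if space free < 1 GB then alert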
2195
2196 INODE USAGE TEST
2197 Monit can test filesystem inode usage. This test may only be used in
2198 the context of a filesystem service type.
2199
2200 Syntax:
2201
2202 IF INODE(S) operator value [unit] THEN action
2203
2204 or:
2205
2206 IF INODE(S) FREE operator value [unit] THEN action
2207
2208 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
2209 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2210 "notequal" in human readable form (if not specified, default is EQUAL).
2211
2212 unit is optional. If not specified, the value is an absolute count of
2213 inodes. You can use the "%" character or the longer alternative
2214 "percent" as a unit.
2215
2216 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2217 "UNMONITOR".
2218
2219 Example:
2220
2221 check filesystem rootfs with path /
2222 if inode usage > 90% then alert
2223
2224 DISK I/O TEST
2225 Monit can test a filesystem's read and write activity. This test may
2226 only be used in the context of a filesystem service type.
2227
2228 The available I/O metrics depend on the platform and filesystem. Some
2229 platforms allow us to get I/O activity for a specific partition, others
2230 just for the whole disk. Some allow us to get metrics for network
2231 filesystems, others just for block devices.
2232
2233 Platform I/O metrics granularity and filesystem support in Monit:
2234
2235 ---------------------------------------------------------------------------------------
2236 | Platform | Granularity | Supported filesystems | TBD |
2237 ---------------------------------------------------------------------------------------
2238 | AIX | per-disk | Disk io monitoring currently not supported | JFSx |
2239 | DragonFlyBSD | per-disk | UFS | HAMMER |
2240 | FreeBSD | per-disk | UFS | ZFS |
2241 | Linux | per-filesystem | EXTx, XFS, BTRFS, ZFS, NFS, CIFS | |
2242 | MacOS | per-disk | HFS | |
2243 | NetBSD | per-disk | FFS | NFS |
2244 | OpenBSD | per-disk | FFS | |
2245 | Solaris | per-filesystem | ZFS, UFS, NFS | |
2246 ---------------------------------------------------------------------------------------
2247
2248 Read: bytes per second
2249
2250 Syntax:
2251
2252 IF READ [RATE] <operator> <number> <unit>/S THEN action
2253
2254 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
2255 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2256 "notequal" in human readable form (if not specified, default is EQUAL).
2257
2258 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
2259 "kilobyte", "megabyte", "gigabyte", "percent".
2260
2261 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2262 "UNMONITOR".
2263
2264 Example:
2265
2266 check filesystem disk1...
2267 if read rate > 1 MB/s then alert
2268
2269 Read: operations per second
2270
2271 Syntax:
2272
2273 IF READ [RATE] <operator> <number> operations/S THEN action
2274
2275 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
2276 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2277 "notequal" in human readable form (if not specified, default is EQUAL).
2278
2279 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2280 "UNMONITOR".
2281
2282 Example:
2283
2284 check filesystem disk1...
2285 if read rate > 500 operations/s then alert
2286
2287 Write: bytes per second
2288
2289 Syntax:
2290
2291 IF WRITE [RATE] <operator> <number> <unit>/S THEN action
2292
2293 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
2294 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2295 "notequal" in human readable form (if not specified, default is EQUAL).
2296
2297 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
2298 "kilobyte", "megabyte", "gigabyte", "percent".
2299
2300 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2301 "UNMONITOR".
2302
2303 Example:
2304
2305 check filesystem disk1...
2306 if write rate > 1 MB/s then alert
2307
2308 Write: operations per second
2309
2310 Syntax:
2311
2312 IF WRITE [RATE] <operator> <number> operations/S THEN action
2313
2314 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
2315 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2316 "notequal" in human readable form (if not specified, default is EQUAL).
2317
2318 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2319 "UNMONITOR".
2320
2321 Example:
2322
2323 check filesystem disk1...
2324 if write rate > 500 operations/s then alert
2325
2326 Service time per operation
2327
2328 Service Time is the time taken to complete a read or a write operation.
2329 This is a fairly important metric. If it grows, it means that the disk
2330 is not able to handle the operations fast enough. Growth charts are
2331 available in M/Monit.
2332
2333 Syntax:
2334
2335 IF SERVICE TIME <operator> <number> <unit> THEN action
2336
2337 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
2338 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2339 "notequal" in human readable form (if not specified, default is EQUAL).
2340
2341 unit is "MS" (millisecond) or "S" (second)
2342
2343 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2344 "UNMONITOR".
2345
2346 Example:
2347
2348 if service time > 10 milliseconds
2349 for 3 times within 5 cycles
2350 then alert
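
The individual I/O tests can be combined in one filesystem entry. A
sketch, assuming a data partition at /dev/sdb1 and arbitrary limits:

    check filesystem datafs with path /dev/sdb1
        if read rate > 50 MB/s for 5 cycles then alert
        if write rate > 500 operations/s for 5 cycles then alert
        if service time > 10 milliseconds
            for 3 times within 5 cycles
        then alert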
2351
2352 PERMISSION TEST
2353 Monit can test the permissions of file objects. This test may only be
2354 used in the context of a file, fifo, directory or filesystem service
2355 type.
2356
2357 Syntax for testing specific permissions:
2358
2359 IF FAILED PERM(ISSION) octalnumber THEN action
2360
2361 Syntax for testing any permission change:
2362
2363 IF CHANGED PERM(ISSION) THEN action
2364
2365 octalnumber defines permissions for a file, a directory or a filesystem
2366 as four octal digits (0-7). Valid range is 0000 - 7777. You can omit
2367 the leading zeros; Monit will add the zeros to the left. For example,
2368 "640" is a valid value and matches "0640".
2369
2370 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2371 "UNMONITOR".
2372
2373 Example:
2374
2375 check file shadow with path /etc/shadow
2376 if failed permission 0640 then alert
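
A sketch of the change test and of omitting the leading zero (755 is
treated as 0755). The service names and paths are examples only:

    check directory bin with path /bin
        if failed permission 755 then alert

    check file sudoers with path /etc/sudoers
        if changed permission then alert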
2377
2378 UID TEST
2379 Monit can monitor the owner user id (uid) of a file, fifo or directory,
2380 or the owner and effective user of a process.
2381
2382 Syntax:
2383
2384 IF FAILED [E]UID <value> THEN action
2385
2386 value defines a user id either in numeric or in string form.
2387
2388 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2389 "UNMONITOR".
2390
2391 Example:
2392
2393 check file shadow with path /etc/shadow
2394 if failed uid "root" then alert
2395
2396 GID TEST
2397 Monit can monitor the owner group id (gid) of a file, fifo, directory
2398 or process.
2399
2400 Syntax:
2401
2402 IF FAILED GID <value> THEN action
2403
2404 value defines a group id either in numeric or in string form.
2405
2406 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2407 "UNMONITOR".
2408
2409 Example:
2410
2411 check file shadow with path /etc/shadow
2412 if failed gid "shadow" then alert
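
For a process, the owner, effective user and group can be tested
together. A sketch, assuming a daemon that is expected to run entirely
as a hypothetical "myapp" user and group:

    check process myapp with pidfile /var/run/myapp.pid
        if failed uid "myapp" then alert
        if failed euid "myapp" then alert
        if failed gid "myapp" then alert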
2413
2414 PID TEST
2415 Monit can test the process' PID. This test is implicit and Monit will
2416 send an alert in case the PID changed outside of Monit's control.
2417
2418 Syntax:
2419
2420 IF CHANGED PID THEN action
2421
2422 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2423 "UNMONITOR".
2424
2425 This test is useful to detect possible process restarts which have
2426 occurred in the timeframe between two Monit testing cycles.
2427
2428 For example, if someone changes the sshd configuration and restarts sshd
2429 outside of Monit's control, you will be notified that the process was
2430 replaced by a new instance:
2431
2432 check process sshd with pidfile /var/run/sshd.pid
2433 if changed pid then alert
2434
2435 PPID TEST
2436 Monit can test the process' parent PID (PPID) for changes. This test is
2437 implicit and Monit will send an alert in case the PPID changed
2438 outside of Monit's control.
2439
2440 The syntax for the ppid statement is:
2441
2442 IF CHANGED PPID THEN action
2443
2444 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2445 "UNMONITOR".
2446
2447 Example:
2448
2449 check process myproc with pidfile /var/run/myproc.pid
2450 if changed ppid then exec "/my/script"
2451
2452 UPTIME TEST
2453 The uptime statement may only be used in a process and system service
2454 type context.
2455
2456 Syntax:
2457
2458 IF UPTIME [[operator] value [unit]] THEN action
2459
2460 operator is a choice of "<", ">", "!=", "==" in C notation, "GT", "LT",
2461 "EQ", "NE" in shell sh notation and "GREATER", "LESS", "EQUAL",
2462 "NOTEQUAL" in human readable form (if not specified, default is EQUAL).
2463
2464 value is an uptime watermark.
2465
2466 unit is either "SECOND", "MINUTE", "HOUR" or "DAY" (it is also possible
2467 to use "SECONDS", "MINUTES", "HOURS", or "DAYS").
2468
2469 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2470 "UNMONITOR".
2471
2472 Example of restarting the process every three days:
2473
2474 check process myapp with pidfile /var/run/myapp.pid
2475 start program = "/etc/init.d/myapp start"
2476 stop program = "/etc/init.d/myapp stop"
2477 if uptime > 3 days then restart
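
In a system context the same test can flag a recent reboot. A sketch
with a hypothetical service name and an arbitrary threshold:

    check system myhost
        if uptime < 10 minutes then alert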
2478
2479 SECURITY ATTRIBUTE TEST
2480 The security attribute statement may only be used in a process context.
2481
2482 Syntax:
2483
2484 IF FAILED SECURITY ATTRIBUTE <string> THEN <action>
2485
2486 string is the expected security attribute value.
2487
2488 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2489 "UNMONITOR".
2490
2491 Example for SELinux:
2492
2493 check process ntpd matching "ntpd"
2494 if failed security attribute "system_u:system_r:ntpd_t:s0" then alert
2495
2496 Example for AppArmor:
2497
2498 check process ntpd matching "ntpd"
2499 if failed security attribute "/usr/sbin/ntpd (enforce)" then alert
2500
2501 PROGRAM STATUS TEST
2502 You can check the exit status of a program or a script. This test may
2503 only be used within a check program service entry in the Monit control
2504 file.
2505
2506 Syntax for testing specific exit value:
2507
2508 IF STATUS operator value THEN action
2509
2510 Syntax for testing any exit value change:
2511
2512 IF CHANGED STATUS THEN action
2513
2514 operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
2515 "eq", "ne" in shell sh notation and "greater", "less", "equal",
2516 "notequal" in human readable form (if not specified, default is EQUAL).
2517
2518 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2519 "UNMONITOR".
2520
2521 Example:
2522
2523 check program myscript with path /usr/local/bin/myscript.sh
2524 if status != 0 then alert
2525
2526 Sample script for the above example (/usr/local/bin/myscript.sh):
2527
2528 #!/bin/sh
2529 echo test
2530 exit $?
2531
2532 You can also send parameters with the program:
2533
2534 check program list-files with path "/bin/ls -lrt /tmp/"
2535 if status != 0 then alert
2536
2537 Arguments to the program or script are given as a sequence of
2538 whitespace-separated strings. In the above example the strings '-lrt'
2539 and '/tmp/' are arguments to the program '/bin/ls'. If arguments are
2540 used, it is recommended to enclose the string in quotes ("); if no
2541 arguments are used, quotes are not needed.
2542
2543 Notes: If the program is a script, the interpreter must be specified on
2544 the first line. The program or script must also be executable.
2545
2546 If Monit is run as the super user, you can optionally run the program
2547 as a different user and/or group. In this example we run the ls program
2548 as user www and as group staff:
2549
2550 check program ls with path "/bin/ls /tmp" as uid "www"
2551 and gid "staff"
2552 if status != 0 then alert
2553
2554 Monit will execute the program periodically and if the exit status of
2555 the program does not match the expected result, Monit can perform an
2556 action. In the example above, Monit will raise an alert if the exit
2557 value is different from 0. By convention, 0 means the program exited
2558 normally.
2559
Program checks are asynchronous: Monit does not wait for the program to
exit. Instead, it starts the program in the background and immediately
continues checking the next service entry in monitrc. At the next
cycle, Monit checks whether the program has finished and, if so,
collects the program's exit status. If the status indicates a failure,
Monit raises an alert message containing the program's error (stderr)
output, if any. If the program has not exited after the first cycle,
Monit waits another cycle, and so on. If the program is still running
after 5 minutes, Monit kills it and generates a program timeout event.
It is possible to override the default timeout (see the "with timeout"
example below).
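
The timing described above can be sketched as simple arithmetic. The
function below is an illustration only, not Monit code: the name
cycles_until_collected and the assumption that the exit status is
collected at the first cycle boundary after the program exits are
simplifications made for this example.

```shell
# cycles_until_collected RUNTIME CYCLE [TIMEOUT]
# All arguments are whole seconds. Prints how many cycles after the start
# the exit status is collected, or "timeout" if the program would still be
# running when the timeout (default 300 seconds) is reached.
cycles_until_collected() {
    runtime=$1
    cycle=$2
    timeout=${3:-300}
    if [ "$runtime" -gt "$timeout" ]; then
        echo timeout
        return 0
    fi
    # Round up to the first cycle boundary at or after the program exits.
    n=$(( (runtime + cycle - 1) / cycle ))
    if [ "$n" -lt 1 ]; then
        n=1
    fi
    echo "$n"
}

cycles_until_collected 10 30    # finishes within the first cycle
cycles_until_collected 45 30    # needs a second cycle
cycles_until_collected 400 30   # exceeds the default timeout
```

With a 30 second cycle, the three calls print 1, 2 and timeout. Passing
500 as the third argument, as in a check that uses "with timeout 500
seconds", makes the last call print 14 instead.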
2571
The asynchronous nature of the program check allows for non-blocking
behaviour in the current Monit design, but it comes with a side effect:
when the program has finished executing and is waiting for Monit to
collect the result, it becomes a so-called "zombie" process. A zombie
process does not consume any system resources (only the PID remains in
use) and it is under Monit's control: it is removed from the system as
soon as Monit collects the exit status. This means that every "check
program" will be associated with either a running process or a
temporary zombie. This unwanted side effect will be removed in a later
release of Monit.
2582
2583 Multiple status tests can be used, for example:
2584
2585 check program hwtest with path /usr/local/bin/hwtest.sh
2586 with timeout 500 seconds
2587 if status = 1 then alert
2588 if status = 3 for 5 cycles then exec "/usr/local/bin/emergency.sh"
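
The hwtest.sh script used above is not shown. As a minimal sketch of how
such a script could produce distinct exit values, the helper below maps
a reading to 0, 1 or 3. The function name classify_temperature, the
thresholds and the sample readings are assumptions made for this
illustration.

```shell
#!/bin/sh
# Hypothetical helper for a script such as hwtest.sh: map a temperature
# reading (whole degrees Celsius) to an exit value for Monit's STATUS test.
#   0 = normal, 1 = warning, 3 = critical (thresholds are assumptions)
classify_temperature() {
    if [ "$1" -ge 90 ]; then
        return 3
    elif [ "$1" -ge 75 ]; then
        return 1
    fi
    return 0
}

# Demonstration with fixed sample readings instead of a real sensor.
for reading in 60 80 95; do
    status=0
    classify_temperature "$reading" || status=$?
    echo "reading=$reading status=$status"
done
```

A real script would read the sensor value, call classify_temperature
with it and end with "exit $?", so that the "status = 1" and
"status = 3" tests match the warning and critical cases.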
2589
2590 NETWORK LINK STATUS TEST
2591 You can check the network link state. This test may only be used within
2592 a check network service entry in the Monit control file.
2593
2594 Syntax:
2595
2596 IF FAILED LINK THEN action
2597
2598 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2599 "UNMONITOR".
2600
2601 The test will fail if the link/interface is down or link errors were
2602 detected.
2603
2604 Example:
2605
2606 check network eth0 with interface eth0
2607 if failed link then alert
2608
If a link fails, you can add a start and stop program so that Monit
restarts the interface automatically, which might help. (Substitute the
relevant network commands for your system.)
2612
2613 check network eth0 with interface eth0
2614 start program = '/sbin/ipup eth0'
2615 stop program = '/sbin/ipdown eth0'
2616 if failed link then restart
2617
2618 NETWORK LINK CAPACITY TEST
2619 You can check the network link mode capacity for changes. This test may
2620 only be used within a check network service entry in the Monit control
2621 file.
2622
2623 Syntax:
2624
2625 IF CHANGED LINK [CAPACITY] THEN action
2626
2627 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2628 "UNMONITOR".
2629
2630 The test will match if the link mode has changed (e.g. maximum speed
2631 dropped) or if the duplex mode has changed.
2632
NOTE: not all interface types allow for capacity monitoring. Pseudo
interfaces, such as the loopback device or VMware interfaces, do not
have a speed attribute.
2636
2637 Example:
2638
2639 check network eth0 with interface eth0
2640 if changed link capacity then alert
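
To see whether an interface reports a speed at all, you can ask the
operating system. The sketch below assumes Linux, where the kernel
exposes the value in /sys/class/net/<interface>/speed; whether Monit
reads the same file is not stated here, and the function name link_speed
is hypothetical.

```shell
# link_speed IFACE [SYSFS_ROOT]
# Prints the reported speed in Mb/s, or "unknown" when the interface has
# no usable speed attribute (missing file, read error or negative value).
link_speed() {
    iface=$1
    root=${2:-/sys/class/net}
    speed=$(cat "$root/$iface/speed" 2>/dev/null) || speed=
    case $speed in
        ''|-*|*[!0-9]*) echo unknown ;;
        *) echo "$speed" ;;
    esac
}

link_speed eth0
link_speed lo
```

On a typical Linux host the loopback device prints "unknown", which is
the situation the note above describes.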
2641
2642 NETWORK SATURATION TEST
2643 You can check the network link saturation. Monit then computes the link
2644 utilisation based on the current transfer rate vs. link capacity. This
2645 test may only be used within a check network service entry in the Monit
2646 control file.
2647
2648 Syntax:
2649
2650 IF SATURATION operator value% THEN action
2651
operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
"eq", "ne" in shell (sh) notation and "greater", "less", "equal",
"notequal" in human readable form (if not specified, default is EQUAL).
2655
2656 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2657 "UNMONITOR".
2658
NOTE: this test depends on the availability of the speed attribute and
not all interface types have this attribute. See the NETWORK LINK
CAPACITY TEST description.
2662
2663 Example:
2664
2665 check network eth0 with interface eth0
2666 if saturation > 90% then alert
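
The utilisation figure is the transfer rate divided by the link
capacity. The following sketch shows that arithmetic with whole
numbers; the function name saturation_percent is hypothetical, and how
Monit combines upload and download rates is not covered by it.

```shell
# saturation_percent BYTES_PER_SEC LINK_MBPS
# Prints the utilisation as a whole percentage. LINK_MBPS must be positive.
saturation_percent() {
    rate=$1
    mbps=$2
    # Capacity in bytes per second: megabits per second * 1,000,000 / 8.
    capacity=$(( mbps * 1000000 / 8 ))
    echo $(( rate * 100 / capacity ))
}

saturation_percent 11250000 100    # 11.25 MB/s on a 100 Mb/s link
saturation_percent 62500000 1000   # 62.5 MB/s on a 1 Gb/s link
```

The two calls print 90 and 50. With the example rule above, an alert
would be raised only once the value exceeds 90.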
2667
2668 NETWORK BANDWIDTH TEST
2669 You can check a network link upload and download bandwidth usage,
2670 current transfer speed and total data transferred in the last 24 hours.
2671 This test may only be used within a check network service entry in the
2672 Monit control file.
2673
2674 Upload speed test syntax (per second):
2675
2676 IF UPLOAD operator value unit/S THEN action
2677
2678 Download speed test syntax (per second):
2679
2680 IF DOWNLOAD operator value unit/S THEN action
2681
2682 Total upload data test syntax:
2683
2684 IF TOTAL UPLOADED operator value unit IN LAST number time-unit THEN action
2685
2686 Total download data test syntax:
2687
2688 IF TOTAL DOWNLOADED operator value unit IN LAST number time-unit THEN action
2689
operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
"eq", "ne" in shell (sh) notation and "greater", "less", "equal",
"notequal" in human readable form (if not specified, default is EQUAL).
2693
2694 unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
2695 "kilobyte", "megabyte", "gigabyte".
2696
time-unit is a choice of "MINUTE(S)", "HOUR(S)", "DAY". NOTE: Monit
maintains a rolling count of total uploaded and downloaded bytes for
the last 24 hours only. The value of time-unit can therefore not
specify a range wider than one day.
2701
2702 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2703 "UNMONITOR".
2704
2705 Examples:
2706
2707 check network eth0 with interface eth0
2708 if upload > 500 kB/s then alert
2709 if total downloaded > 1 GB in last 2 hours then alert
2710 if total downloaded > 10 GB in last day then alert
2711
2712 NETWORK PACKETS TEST
You can check the network link upload and download packet counts: the
current packet rate and the total number of packets transferred in the
last 24 hours. This test may only be used within a check network
service entry in the Monit control file.
2717
Current upload packet rate test syntax:

IF UPLOAD operator value PACKETS/S THEN action

Current download packet rate test syntax:
2723
2724 IF DOWNLOAD operator value PACKETS/S THEN action
2725
2726 Total upload test syntax:
2727
2728 IF TOTAL UPLOADED operator value PACKETS IN LAST number time-unit THEN action
2729
2730 Total download test syntax:
2731
2732 IF TOTAL DOWNLOADED operator value PACKETS IN LAST number time-unit THEN action
2733
operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
"eq", "ne" in shell (sh) notation and "greater", "less", "equal",
"notequal" in human readable form (if not specified, default is EQUAL).
2737
time-unit is a choice of "MINUTE(S)", "HOUR(S)", "DAY". NOTE: Monit
keeps total upload/download statistics only for the last 24 hours. The
time-unit value can therefore not span more than one day.
2741
2742 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2743 "UNMONITOR".
2744
2745 Examples:
2746
2747 check network eth0 with interface eth0
2748 if upload > 1000 packets/s then alert
2749 if total uploaded > 900000 packets in last hour then alert
2750
2751 NETWORK PING TEST
Monit can perform a network ping test by sending ICMP echo request
datagrams to a host and waiting for the reply. This test can only be
used within a check host statement. Monit must also run as the root
user to perform the ping test, because the test uses raw sockets, which
usually only the super user is allowed to open.
2758
2759 Syntax:
2760
2761 IF FAILED PING[4|6]
2762 [COUNT number]
2763 [SIZE number]
2764 [TIMEOUT number SECONDS]
2765 [ADDRESS string]
2766 THEN action
2767
If a DNS host name was used in the check host statement and the host
name resolves to several addresses (either IPv4 or IPv6), Monit will
ping the first available address and continue with the next address
until one ping succeeds or until there are no more addresses left to
try. You can force Monit to only ping IPv4 or IPv6 addresses by using
the PING4 or the PING6 keyword instead of PING.
2774
2775 The COUNT parameter specifies how many consecutive ping requests will
2776 be sent to the host in one cycle at maximum. The default value is 3.
2777
2778 The SIZE parameter specifies the ping request data size. Default is 64
2779 bytes.
2780
If no reply arrives within TIMEOUT seconds, Monit reports an error. If
at least one reply was received, the ping test is considered a success.
2783
The ADDRESS parameter specifies the source IP address.
2785
Monit will, by default, send up to three ping request packets in one
cycle to prevent false alarms (i.e. up to 66% packet loss is tolerated).
2788 You can set the COUNT option to a value between 1 and 20 to send more
2789 or fewer packets. If you require 100% ping success, set the count to 1
2790 (i.e. just one request will be sent, and if the packet was lost an
2791 error will be reported).
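
Because one reply is enough for success, the tolerated packet loss
follows directly from COUNT. The sketch below works this out with whole
numbers; the function name tolerated_loss_percent is hypothetical.

```shell
# tolerated_loss_percent COUNT
# Prints the largest packet loss (whole percent, rounded down) at which one
# reply out of COUNT requests still arrives. COUNT must be between 1 and 20.
tolerated_loss_percent() {
    count=$1
    if [ "$count" -lt 1 ] || [ "$count" -gt 20 ]; then
        echo "count must be between 1 and 20" >&2
        return 1
    fi
    echo $(( (count - 1) * 100 / count ))
}

for c in 1 3 5 20; do
    echo "count=$c tolerated_loss=$(tolerated_loss_percent "$c")%"
done
```

This prints 0% for a count of 1, 66% for the default of 3, 80% for 5
and 95% for 20.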
2792
2793 Note that many ISPs have started to filter out ping or ICMP packets
2794 now, in which case there will be no reply from the host.
2795
2796 If a ping test is used in a check host entry, this test is run first
2797 and if the test should fail, we assume that the connection to the host
2798 is down and Monit will not continue with any subsequent port tests.
2799
2800 Example:
2801
2802 check host mmonit.com with address mmonit.com
2803 if failed ping then alert # IPv4 or IPv6
2804
2805 check host mmonit.com with address 62.109.39.247
2806 if failed ping then alert # Address is IPv4 so IPv4 is preferred
2807
or test that the system is explicitly accessible via both IPv4 and IPv6:
2809
2810 check host mmonit.com with address mmonit.com
2811 if failed ping4 then alert # IPv4 only
2812 if failed ping6 then alert # IPv6 only
2813
or with all parameters: send five 128-byte pings to mmonit.com and wait
up to 10 seconds for a reply:
2816
2817 check host mmonit.com with address mmonit.com
2818 if failed ping count 5 size 128 with timeout 10 seconds then alert
2819
2820 CONNECTION TESTS
2821 Monit can perform connection testing via network ports or via Unix
2822 sockets. A connection test may only be used within a process or host
2823 service type context.
2824
2825 If a service listens on one or more sockets, Monit can connect to the
2826 port (using TCP or UDP) and verify that the service will accept a
2827 connection and that it is possible to write and read from the socket.
2828 If a connection is not accepted or if there is a problem with socket
2829 I/O, Monit will execute a specified action.
2830
2831 TCP/UDP port test syntax:
2832
2833 IF FAILED
2834 [HOST string]
2835 <PORT number>
2836 [ADDRESS string]
2837 [IPV4 | IPV6]
2838 [TYPE <TCP|UDP>]
[<SSL|TLS> [with options {...}]]
2840 [CERTIFICATE CHECKSUM [MD5|SHA1] string]
2841 [CERTIFICATE VALID for number DAYS]
2842 [PROTOCOL protocol | <SEND|EXPECT> "string",...]
2843 [TIMEOUT number SECONDS]
2844 [RETRY number]
2845 THEN action
2846
2847 Unix socket test syntax:
2848
2849 IF FAILED
2850 <UNIXSOCKET path>
2851 [TYPE <TCP|UDP>]
2852 [PROTOCOL protocol | <SEND|EXPECT> "string",...]
2853 [TIMEOUT number SECONDS]
2854 [RETRY number]
2855 THEN action
2856
2857 Examples:
2858
2859 if failed port 80 then alert
2860
2861 if failed port 53 type udp protocol dns then alert
2862
2863 if failed unixsocket /var/run/sophie then alert
2864
2865 Options:
2866
2867 HOST hostname. Optionally specify the host to connect to. If the host
2868 is not given then localhost is assumed if this test is used inside a
2869 process entry. If this test is used inside a remote host entry then the
2870 entry's remote host is assumed.
2871
PORT number. The port number to connect to.
2873
2874 UNIXSOCKET path. Specifies the path to a Unix socket (local machine
2875 only).
2876
2877 ADDRESS string. The source IP address to use.
2878
IPV4 | IPV6. Optionally specify the IP version Monit should use when
trying to connect to the port. If not used, Monit will try to connect
to the first available address (IPv4 or IPv6). If multiple addresses
are available and the connection to one address fails, Monit will try
the next address and so on until a connection succeeds or until there
are no more addresses left to try.
2885
TYPE <TCP | UDP>. Optionally specify the socket type Monit should use
when trying to connect to the port. The socket types are TCP, a regular
stream based socket, and UDP, a datagram socket. The default socket
type is TCP.
2890
2891 [SSL | TLS] [with options {...}]. Set SSL/TLS options and override
2892 global/default SSL options. You can set the SSL/TLS version to use,
2893 whether to verify certificates, trust self-signed certificates or set
2894 the SSL client certificates database-file for client certificate
2895 authentication.
2896
2897 CERTIFICATE CHECKSUM [MD5|SHA1] hash. Verify the SSL server certificate
2898 by checking its checksum. You can use either MD5 or SHA1 checksum (if
2899 you don't specify the type, Monit will determine the digest based on
2900 the hash length). You can use the openssl command line tool to get the
2901 checksum value for your certificate, which you can then use in Monit's
2902 control file:
2903
2904 openssl x509 -fingerprint -sha1 -in server.crt | head -1 | cut -f2 -d'='
2905
2906 Example:
2907
2908 if failed
2909 port 443
2910 protocol https
2911 and certificate checksum = "1ED948A6F4258ACAB964227EF4EB19FCC453B0F8"
2912 then alert
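
The openssl command above prints the fingerprint as colon-separated
byte pairs, while the checksum in the example is written without
separators. The sketch below strips the label and the colons. The
function name fingerprint_to_checksum is hypothetical, and the sample
input line was constructed from the checksum in the example rather than
taken from a real certificate.

```shell
# fingerprint_to_checksum
# Reads a line such as "SHA1 Fingerprint=AA:BB:..." on standard input and
# prints only the hex digits.
fingerprint_to_checksum() {
    cut -f2 -d'=' | tr -d ':'
}

echo 'SHA1 Fingerprint=1E:D9:48:A6:F4:25:8A:CA:B9:64:22:7E:F4:EB:19:FC:C4:53:B0:F8' |
    fingerprint_to_checksum
```

This prints 1ED948A6F4258ACAB964227EF4EB19FCC453B0F8. With a real
certificate, pipe the output of "openssl x509 -fingerprint -sha1 -noout
-in server.crt" into the function instead of the echo.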
2913
CERTIFICATE VALID for number DAYS. Send an alert if the certificate
will expire within the given number of days. This test is useful for
getting a notification when it is time to renew your SSL certificate.
2917
2918 Example:
2919
2920 if failed
2921 port 443
2922 protocol https
2923 and certificate valid > 30 days
2924 then alert
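
To check a certificate file by hand in a similar way, the openssl
"-checkend" option can be used; it takes a number of seconds and exits
with a non-zero status if the certificate expires within that time. The
sketch below is a manual cross-check, not a description of how Monit
performs the test; the function names and the file name server.crt are
assumptions.

```shell
# days_to_seconds DAYS: convert whole days to seconds.
days_to_seconds() {
    echo $(( $1 * 86400 ))
}

# cert_valid_for FILE DAYS
# Succeeds if the PEM certificate in FILE will still be valid DAYS from now.
cert_valid_for() {
    openssl x509 -checkend "$(days_to_seconds "$2")" -noout -in "$1" >/dev/null
}

days_to_seconds 30
```

The last line prints 2592000. A call such as
cert_valid_for server.crt 30 || echo "renew soon"
reports certificates that expire within the next 30 days.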
2925
2926 PROTOCOL protocol. Optionally specify the protocol Monit should speak
2927 when a connection is established. At the moment Monit knows how to
2928 speak:
2929 APACHE-STATUS
2930 DNS
2931 DWP
2932 FAIL2BAN
2933 FTP
2934 GPS
2935 HTTP
2936 HTTPS
2937 IMAP
2938 IMAPS
2939 CLAMAV
2940 LDAP2
2941 LDAP3
2942 LMTP
2943 MEMCACHE
2944 MONGODB
2945 MYSQL
2946 NNTP
2947 NTP3
2948 PGSQL
2949 POP
2950 POPS
2951 POSTFIX-POLICY
2952 RADIUS
2953 RDATE
2954 REDIS
2955 RSYNC
2956 SIEVE
2957 SIP
2958 SMTP
2959 SMTPS
2960 SPAMASSASSIN
2961 SSH
2962 TNS
2963 WEBSOCKET
2964
2965 If the target server's protocol is not found in this list, simply do
2966 not specify the protocol and Monit will use a default connection test.
2967
2968 TIMEOUT number SECONDS. Optionally specifies the connect and read
2969 timeout for the connection. If Monit cannot connect to the server
2970 within this time it will assume that the connection failed and execute
2971 the specified action. The default connect timeout is 5 seconds.
2972
2973 RETRY number. Optionally specifies the number of consecutive retries
2974 within the same testing cycle in the case that the connection failed.
2975 The default is fail on first error.
2976
2977 action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC" or
2978 "UNMONITOR".
2979
2980 Specific protocol test options
2981
2982 GENERIC (SEND/EXPECT)
2983
2984 If Monit does not support the protocol spoken by the server, you can
2985 write your own protocol-test using send and expect strings. The SEND
2986 statement sends a string to the server port and the EXPECT statement
2987 compares a string read from the server with the string given in the
2988 expect statement.
2989
2990 Syntax:
2991
2992 [<SEND|EXPECT> "string"]+
2993
2994 Monit will send a string as it is, and you must remember to include CR
2995 and LF in the string sent to the server if the protocol expects such
2996 characters to terminate a string (most text based protocols used over
2997 Internet do).
2998
2999 Monit will by default read up to 255 bytes from the server and use this
3000 string when comparing the EXPECT string. You can override the default
3001 value using the set limits statement.
3002
3003 You can use non-printable characters in a SEND string if needed. Use
3004 the hex notation, \0xHEXHEX to send any char in the range \0x00-\0xFF,
3005 that is, 0-255 in decimal. For example, to test a Quake 3 server:
3006
3007 send "\0xFF\0xFF\0xFF\0xFFgetstatus"
3008 expect "sv_floodProtect|sv_maxPing"
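
To see which bytes such a SEND string stands for, the same request can
be produced in the shell. This sketch uses the octal escape \377, which
POSIX printf supports for the byte 0xFF; the function name
quake_status_request is hypothetical.

```shell
# quake_status_request
# Writes the request from the example above: four 0xFF bytes followed by
# the text "getstatus".
quake_status_request() {
    printf '\377\377\377\377getstatus'
}

# Show the bytes in hex, one byte per field.
quake_status_request | od -An -tx1
```

The dump shows ff four times, followed by the hex codes of the letters
in "getstatus" (67 65 74 73 74 61 74 75 73).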
3009
3010 If your system supports POSIX regular expressions, you can use regular
3011 expressions in the EXPECT string, see regex(7) to learn more about the
3012 types of regular expressions you can use in an expect string.
3013
Since both regex and string compare operate on a zero terminated
string, you cannot test for '\0' in an EXPECT buffer, since this
character marks the end of the buffer. However, Monit escapes '\0' in
the expect buffer as "\0", which you can test for. That is, '\'
followed by the ASCII character '0'. For instance, here is how to test
for an expect string that starts with zero followed by any number of
characters:
3020
3021 expect "^[\\]0.*"
3022
3023 Here is a simple SMTP protocol example:
3024
3025 if failed
3026 port 25 and
3027 expect "^220.*"
3028 send "HELO localhost.localdomain\r\n"
3029 expect "^250.*"
3030 send "QUIT\r\n"
3031 then alert
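
The EXPECT patterns above can be tried out against sample replies with
grep. This sketch assumes that the patterns are POSIX extended regular
expressions, as the alternation in the Quake 3 example suggests. The
function name expect_matches and the server replies are made up for
illustration.

```shell
# expect_matches PATTERN RESPONSE
# Succeeds if RESPONSE matches the POSIX extended regular expression PATTERN.
expect_matches() {
    printf '%s\n' "$2" | grep -Eq -- "$1"
}

# Hypothetical server replies, checked against the first pattern above.
for reply in '220 mail.example.com ESMTP' '554 no service'; do
    if expect_matches '^220.*' "$reply"; then
        echo "match:    $reply"
    else
        echo "no match: $reply"
    fi
done
```

The first reply matches and the second does not, which in the SMTP
example would cause the test to fail at the first expect statement.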
3032
3033 SEND/EXPECT can be used with any socket type, such as TCP sockets, UNIX
3034 sockets and UDP sockets.
3035
3036 HTTP
3037
3038 Syntax:
3039
3040 PROTO(COL) HTTP
3041 [USERNAME "string"]
3042 [PASSWORD "string"]
3043 [REQUEST "string"]
3044 [METHOD <GET|HEAD>]
3045 [STATUS operator number]
3046 [CHECKSUM checksum]
3047 [HTTP HEADERS list of headers]
3048 [CONTENT < "=" | "!=" > STRING]
3049
3050 USERNAME is an optional username for Basic authentication
3051
3052 PASSWORD is an optional password for Basic authentication
3053
The REQUEST option sets a URL string specifying a document on the HTTP
server. If the request statement isn't specified, the default "/" page
will be requested.
3057
3058 For example:
3059
3060 if failed
3061 port 80
3062 protocol http
3063 request "/data/show?a=b&c=d"
3064 then restart
3065
METHOD sets the HTTP request method. If not specified, Monit prefers
the HTTP HEAD request method to save bandwidth, unless a response
content or response checksum is tested. As some web servers may not
support the HEAD method, you may want to set the method explicitly.
3070
3071 STATUS option can be used to explicitly test the HTTP status code
3072 returned by the HTTP server. If not used, the HTTP protocol test will
3073 fail if the status code returned is greater than or equal to 400. You
3074 can override this behaviour by using the status qualifier.
3075
3076 For example to test that a page does not exist (the HTTP server should
3077 return 404 in this case):
3078
3079 if failed
3080 port 80
3081 protocol http
3082 request "/non/existent.php"
3083 status = 404
3084 then alert
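
The status rule described above can be written down as a small decision
function. This is a sketch of the documented behaviour for the default
case and for an explicit "status = value" test only; other operators are
not handled, and the function name http_status_fails is hypothetical.

```shell
# http_status_fails CODE [EXPECTED]
# Succeeds (returns 0) when the HTTP test would count as failed.
http_status_fails() {
    code=$1
    expected=${2:-}
    if [ -n "$expected" ]; then
        [ "$code" -ne "$expected" ]    # explicit "status = EXPECTED" test
    else
        [ "$code" -ge 400 ]            # default rule
    fi
}

for args in '200' '404' '404 404' '200 404'; do
    # $args is deliberately unquoted so it splits into CODE and EXPECTED.
    if http_status_fails $args; then
        echo "$args: failed"
    else
        echo "$args: ok"
    fi
done
```

The output shows that a 404 response fails by default but passes when
404 is the expected status, while a 200 response fails in that case.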
3085
CHECKSUM You can test the checksum of documents returned by an HTTP
server. Either an MD5 or a SHA1 hash can be used. Monit will not test
the checksum for a document if the server does not set the HTTP
Content-Length header. An HTTP server should set this header when it
serves a static document (i.e. a file). There is no limit on the
document size, but keep in mind that Monit needs time to download the
document over the network to compute the checksum.
3093
3094 Example:
3095
3096 if failed
3097 port 80
3098 protocol http
3099 request "/page.html"
3100 checksum 8f7f419955cefa0b33a2ba316cba3659
3101 then alert
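
The checksum value for the control file has to be computed from the
document first. A minimal sketch for the MD5 case follows; it assumes
the GNU md5sum utility is installed (other systems may provide a
different tool), and the function name md5_hex is hypothetical.

```shell
# md5_hex
# Reads data on standard input and prints its MD5 digest as hex digits.
md5_hex() {
    md5sum | cut -d' ' -f1
}

printf 'hello' | md5_hex
```

This prints 5d41402abc4b2a76b9719d911017c592. For a served document you
could pipe a download into the function, for example
curl -s http://localhost/page.html | md5_hex
where the URL is a placeholder for your own page.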
3102
3103 HTTP HEADERS can be used to send a list of HTTP headers when using the
3104 HTTP protocol test. For instance, the host header. If the host header
3105 is not set, Monit will use the hostname or IP-address of the host as
3106 specified in the check host statement. Specifying a host header is
3107 useful if you want to connect to and test a name-based virtual host.
3108 The syntax for setting HTTP headers is
3109
3110 http headers [name:value, name:value,..]
3111
3112 where each name:value pair is separated with ','. If you need to use
3113 ':' in the value string, for instance to set port number for a host
3114 header, you must enclose the value in quotes. For example,
3115
3116 http headers [Host: "mmonit.com:443"]
3117
3118 In a check host context, using this statement might look like
3119
3120 check host mmonit.com with address mmonit.com
3121 if failed
3122 port 80 protocol http
3123 with http headers [Host: mmonit.com, Cache-Control: no-cache,
3124 Cookie: csrftoken=nj1bI3CnMCaiNv4beqo8ZaCfAQQvpgLH]
3125 and request /monit/ with content = "Monit [0-9.]+"
3126 then alert
3127
3128 Setting HTTP headers is associated with the HTTP protocol test and must
3129 come before request as in the example above.
3130
3131 The CONTENT option sets the pattern which is expected in the data
3132 returned by the server. If the pattern doesn't match, the test fails.
3133 In the example above, if the server does not return a page with the
3134 name Monit followed by a version number the test will fail.
3135
3136 By default, at maximum 1MB of content is inspected. You can increase
3137 this limit using the set limits statement.
3138
3139 For example:
3140
3141 if failed
3142 port 80
3143 protocol http
3144 content = "foobar [0-9.]+"
3145 then alert
3146
3147 APACHE-STATUS
3148
3149 The APACHE-STATUS test allows one to check server performance by
3150 examination of the status page generated by Apache's mod_status, which
3151 is expected to be at its default address of
3152 http://www.example.com/server-status.
3153
3154 Syntax:
3155
3156 PROTOCOL APACHE-STATUS [PATH <path>] [USERNAME <string>] [PASSWORD <string>] [<property> <operator> <number>]+
3157
3158 PATH is an optional path to apache status ("/server-status" by default)
3159
3160 USERNAME is an optional username for Basic authentication
3161
3162 PASSWORD is an optional password for Basic authentication
3163
property is an acronym for a child status:
3165
3166 (1) logging (loglimit)
3167 (2) closing connections (closelimit)
3168 (3) performing DNS lookups (dnslimit)
3169 (4) in keepalive with a client (keepalivelimit)
3170 (5) replying to a client (replylimit)
3171 (6) receiving a request (requestlimit)
3172 (7) initialising (startlimit)
3173 (8) waiting for incoming connections (waitlimit)
3174 (9) gracefully closing down (gracefullimit)
3175 (10) performing cleanup procedures (cleanuplimit)
3176
3177 operator is one of "<", "=", ">".
3178
number is a numeric limit given as a percentage.
3180
3181 Each of these limits can be compared against a value relative to the
3182 total number of active Apache child processes.
3183
3184 You can combine all of these tests into one expression or you can
3185 choose to test a certain limit only. If you combine the limits you must
3186 connect them together using the OR keyword.
3187
3188 Example:
3189
3190 if failed port 80 protocol apache-status
3191 loglimit > 10% or
3192 dnslimit > 50% or
3193 waitlimit < 20%
3194 then alert
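
Apache's mod_status reports child states as a scoreboard string with
one character per slot, for example "_" for waiting, "D" for DNS lookup,
"L" for logging and "." for an open slot without a process. The sketch
below computes the share of one state among the active children. It
illustrates the idea of a relative limit only; Monit's exact
computation is not described here, and the function name
scoreboard_percent and the sample scoreboard are assumptions.

```shell
# scoreboard_percent KEY SCOREBOARD
# Prints the whole percentage of active children (every slot except ".")
# that are in the state marked by the single character KEY.
scoreboard_percent() {
    key=$1
    board=$2
    active=$(printf '%s' "$board" | tr -d '.' | wc -c)
    hits=$(printf '%s' "$board" | tr -cd "$key" | wc -c)
    [ "$active" -gt 0 ] || { echo 0; return 0; }
    echo $(( hits * 100 / active ))
}

board='____WWKKDL..........'
echo "waiting: $(scoreboard_percent _ "$board")%"
echo "dns:     $(scoreboard_percent D "$board")%"
echo "logging: $(scoreboard_percent L "$board")%"
```

For this sample, 10 slots are active, so the script reports 40% waiting,
10% in DNS lookups and 10% logging. None of the three limits in the
example above would be exceeded.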
3195
3196 MYSQL
3197
3198 Syntax:
3199
3200 PROTOCOL MYSQL [USERNAME string PASSWORD string]
3201
3202 USERNAME MySQL username (maximum 16 characters).
3203
3204 PASSWORD MySQL password (special characters can be used, but for non-
3205 alphanumerics the password has to be quoted).
3206
3207 Username and password (credentials) are optional and if not set, Monit
3208 will perform the test using anonymous login. This can cause an
3209 authentication error to be logged in your MySQL log, depending on your
3210 MySQL configuration.
3211
If credentials are set, Monit will log in and perform a MySQL ping
test. Monit does not require any database privileges; it just needs the
database user. You might want to create a standalone user for Monit to
use when testing, for example:
3216
3217 CREATE USER 'monit'@'host_from_which_monit_performs_testing' IDENTIFIED BY 'mysecretpassword';
3218 FLUSH PRIVILEGES;
3219
3220 Example:
3221
3222 check process mysql with pidfile /var/run/mysqld/mysqld.pid
3223 start program = "/sbin/start mysql"
3224 stop program = "/sbin/stop mysql"
3225 if failed
3226 port 3306
3227 protocol mysql username "foo" password "bar"
3228 then alert
3229
3230 or with unix-socket and OS X start/stop commands
3231
3232 check process mysql with pidfile /var/run/mysqld/mysqld.pid
3233 start program = "/usr/local/mysql/support-files/mysql.server start"
3234 stop program = "/usr/local/mysql/support-files/mysql.server stop"
3235 if failed
3236 unixsocket /tmp/mysql.sock
3237 protocol mysql username "foo" password "bar"
3238 then alert
3239
3240 RADIUS
3241
3242 Syntax:
3243
3244 PROTOCOL RADIUS [SECRET string]
3245
3246 SECRET you may specify an alternative secret, default is "testing123".
3247
3248 For example:
3249
3250 check process radiusd with pidfile /var/run/radiusd.pid
3251 start program = "/etc/init.d/freeradius start"
3252 stop program = "/etc/init.d/freeradius stop"
3253 if failed
3254 host 127.0.0.1 port 1812 type udp protocol radius
3255 secret pingpong
3256 then alert
3257
3258 SIP
3259
3260 The SIP protocol is used by communication platform servers such as
3261 Asterisk and FreeSWITCH.
3262
3263 Syntax:
3264
3265 PROTOCOL SIP [TARGET valid@uri] [MAXFORWARD n]
3266
3267 TARGET you may specify an alternative recipient for the message, by
3268 adding a valid sip uri after this keyword.
3269
MAXFORWARD limits the number of proxies or gateways that can forward
the request to the next server. Its value is an integer in the range
0-255, set by default to 70. If max-forward = 0, the next server may
respond 200 OK (test succeeded) or send a 483 Too Many Hops (test
failed).
3275
3276 For example:
3277
3278 check host openser_all with address 127.0.0.1
3279 if failed
3280 port 5060 type udp protocol sip
3281 with target "localhost:5060" and maxforward 6
3282 then alert
3283
3284 SMTP
3285
3286 Syntax:
3287
3288 PROTOCOL SMTP[S] [USERNAME string PASSWORD string]
3289
3290 USERNAME SMTP username.
3291
3292 PASSWORD SMTP password (special characters can be used, but for non-
3293 alphanumerics the password has to be quoted).
3294
Credentials are optional. When set, Monit performs authentication
during testing, so you can test that authentication also works. If
authentication is used, we recommend smtps so that the communication is
encrypted. If no credentials are set, Monit will just perform a basic
protocol test.
3300
3301 Example:
3302
3303 check process postfix with pidfile /var/spool/postfix/pid/master.pid
3304 start program = "/etc/init.d/postfix start"
3305 stop program = "/etc/init.d/postfix stop"
3306 if failed
3307 port 25
3308 protocol smtp
3309 then alert
3310
3311 Example using authentication and STARTTLS/SMTPS:
3312
3313 check process postfix with pidfile /var/spool/postfix/pid/master.pid
3314 start program = "/etc/init.d/postfix start"
3315 stop program = "/etc/init.d/postfix stop"
3316 if failed
3317 port 25
3318 protocol smtps
3319 username "foo"
3320 password "bar"
3321 then alert
3322
3323 WEBSOCKET
3324
3325 Syntax:
3326
3327 PROTOCOL WEBSOCKET
3328 [REQUEST string]
3329 [HOST string]
3330 [ORIGIN string]
3331 [VERSION number]
3332
3333 HOST you may specify an alternative Host header
3334
3335 REQUEST you may specify an alternative request, default is "/"
3336
3337 ORIGIN you may specify an alternative origin, default is
3338 "http://www.mmonit.com"
3339
3340 VERSION you may specify an alternative version, default is "0"
3341
3342 For example:
3343
3344 check host websocket.org with address "echo.websocket.org"
3345 if failed
3346 port 80 protocol websocket
3347 host "echo.websocket.org"
3348 request "/"
3349 origin 'http://websocket.com'
3350 version 13
3351 then alert
3352
3354 M/Monit <https://mmonit.com> expands on Monit's capabilities and
3355 provides monitoring and management of all your Monit enabled hosts.
3356
M/Monit uses Monit as an agent. At regular intervals, Monit sends a
status message to M/Monit with a snapshot of the host it is running on.
3359
M/Monit presents the collected data in charts and event logs and gives
you the option to view key performance data of all your hosts in a
modern, clean and well designed user interface which also works on
mobile devices.
3364
3365 From M/Monit, you can also start, stop and restart services on your
3366 hosts running Monit.
3367
3368 To send data to M/Monit, add the following statement to your Monit
3369 control file:
3370
3371 SET MMONIT <url>
3372 [TIMEOUT <number> SECONDS]
3373 [REGISTER WITHOUT CREDENTIALS]
3374
3375 Example:
3376
3377 set mmonit https://monit:monit@192.168.1.10:8443/collector
3378
3379 Monit will register itself in M/Monit and will start sending status and
3380 event messages to M/Monit. We recommend using https as in the example
3381 above to ensure that the communication between Monit and M/Monit is
3382 secure.
3383
3384 The password should be URL encoded if it contains URL-significant
3385 characters like ":", "?", "@".
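
A password can be percent-encoded before it is placed in the URL. The
function below is a minimal sketch for ASCII passwords, written in
plain sh; the name urlencode and the sample password are assumptions.

```shell
# urlencode STRING
# Prints STRING with every character except letters, digits and "._~-"
# replaced by its %XX escape. Intended for ASCII input only.
urlencode() {
    s=$1
    out=
    while [ -n "$s" ]; do
        rest=${s#?}          # everything after the first character
        c=${s%"$rest"}       # the first character itself
        case $c in
            [A-Za-z0-9._~-]) out=$out$c ;;
            *) out=$out$(printf '%%%02X' "'$c") ;;
        esac
        s=$rest
    done
    printf '%s\n' "$out"
}

urlencode 'p@ss:w?rd'
```

This prints p%40ss%3Aw%3Frd, which can then be used as the password
part of the URL in the set mmonit statement.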
3386
The default timeout is 5 seconds. You can customise the timeout using
the TIMEOUT option.
3389
3390 When Monit registers itself in M/Monit it sends credentials that can be
3391 used to perform service actions from M/Monit. You can disable sending
3392 credentials by using REGISTER WITHOUT CREDENTIALS and instead manually
3393 add credentials in M/Monit.
3394
3396 The simplest form is just the check statement. In this example we check
3397 to see if our web server is running and raise an alert if not:
3398
3399 check process nginx with pidfile /var/run/nginx.pid
3400
3401 To have Monit start the server if it's not running, add a start
3402 statement:
3403
3404 check process nginx with pidfile /var/run/nginx.pid
3405 start program = "/etc/init.d/nginx start"
3406
Here's a more advanced example for monitoring an Apache web server
listening on the default port numbers for HTTP and HTTPS. In this
example Monit will restart Apache if it's not accepting connections on
those ports. To restart, Monit first executes the stop program, waits
(up to 30s) for the process to stop, then executes the start program
and waits (up to 30s) for it to start. The length of the start or stop
wait can be overridden using the 'timeout' option. If Monit is unable
to stop or start the service, a failed alert message will be sent if
you have requested alert messages to be sent.
3416
3417 check process apache with pidfile /var/run/httpd.pid
3418 start program = "/etc/init.d/httpd start" with timeout 60 seconds
3419 stop program = "/etc/init.d/httpd stop"
3420 if failed port 80 for 2 cycles then restart
3421 if failed port 443 for 2 cycles then restart
3422
This example demonstrates how you can run a program as a specified user
(uid) and with a specified group (gid). Many daemon programs can do the
uid and gid switch by themselves, but for those programs that do not
(e.g. Java programs), Monit's ability to start a program as a certain
user can be very useful. In this example we start the Tomcat Java
Servlet Engine as the standard nobody user and group. Please note that
Monit can only switch uid and gid for the program if the super user is
running Monit; otherwise Monit will simply ignore the request to change
uid and gid.
3432
3433 check process tomcat with pidfile /var/run/tomcat.pid
3434 start program = "/etc/init.d/tomcat start"
3435 as uid "nobody" and gid "nobody"
3436 stop program = "/etc/init.d/tomcat stop"
3437 # You can also use id numbers instead and write:
3438 as uid 99 and with gid 99
3439 if failed port 8080 then alert
3440
3441 In this example we use udp for connection testing to check if the name-
3442 server is running:
3443
3444 check process named with pidfile /var/run/named.pid
3445 start program = "/etc/init.d/named start"
3446 stop program = "/etc/init.d/named stop"
3447 if failed port 53 use type udp protocol dns then restart
3448
3449 The following example illustrates how to check if the service 'sophie'
3450 is answering connections on its Unix domain socket:
3451
3452 check process sophie with pidfile /var/run/sophie.pid
3453 start program = "/etc/init.d/sophie start"
3454 stop program = "/etc/init.d/sophie stop"
3455 if failed unix /var/run/sophie then restart
3456
3457 In this example we check an apache web-server running on localhost
3458 which answers for several IP-based virtual hosts or vhosts, hence the
3459 host statement before port:
3460
3461 check process apache with pidfile /var/run/httpd.pid
3462 start "/etc/init.d/httpd start"
3463 stop "/etc/init.d/httpd stop"
3464 if failed host www.sol.no port 80 then alert
3465 if failed host shop.sol.no port 443 then alert
3466 if failed host chat.sol.no port 80 then alert
3467
3468 To make sure that Monit is communicating with an HTTP server, a protocol
3469 test can be added:
3470
3471 check process apache with pidfile /var/run/httpd.pid
3472 start "/etc/init.d/httpd start"
3473 stop "/etc/init.d/httpd stop"
3474 if failed
3475 host www.sol.no port 80 protocol http
3476 then alert
3477
3478 This example demonstrates a different way to check a web-server using
3479 the send/expect mechanism:
3480
3481 check process apache with pidfile /var/run/httpd.pid
3482 start "/etc/init.d/httpd start"
3483 stop "/etc/init.d/httpd stop"
3484 if failed
3485 host www.sol.no port 80 and
3486 send "GET / HTTP/1.1\r\nHost: www.sol.no\r\n\r\n"
3487 expect "HTTP/[0-9\.]{3} 200.*"
3488 then alert
3489
3490 Here we ping a remote host to check if it is up and if not, send an
3491 alert:
3492
3493 check host www.tildeslash.com with address www.tildeslash.com
3494 if failed ping then alert
3495
3496 In the following example we ask Monit to compute and verify the
3497 checksum for the underlying apache binary used by the start and stop
3498 programs. If the checksum test should fail, monitoring will be disabled
3499 to prevent possibly restarting a compromised binary:
3500
3501 check process apache with pidfile /var/run/httpd.pid
3502 start program = "/etc/init.d/httpd start"
3503 stop program = "/etc/init.d/httpd stop"
3504 if failed host www.tildeslash.com port 80 then restart
3505 depends on apache_bin
3506
3507 check file apache_bin with path /usr/local/apache/bin/httpd
3508 if failed checksum then unmonitor
3509
3510 In this example we ask Monit to test a document's checksum on a remote
3511 server. If the checksum was changed we send an alert:
3512
3513 check host mmonit.com with address mmonit.com
3514 if failed
3515 port 80 protocol http and
3516 request "/monit/dist/monit-5.7.tar.gz"
3517 with checksum f9d26b8393736b5dfad837bb13780786
3518 then alert
3519
3520 Here are a couple of tests for some popular communication servers,
3521 using the SIP protocol. First we test a FreeSWITCH server and then an
3522 Asterisk server:
3523
3524 check process freeswitch
3525 with pidfile /usr/local/freeswitch/log/freeswitch.pid
3526 start program = "/usr/local/freeswitch/bin/freeswitch -nc -hp"
3527 stop program = "/usr/local/freeswitch/bin/freeswitch -stop"
3528 if total memory > 1000.0 MB for 5 cycles then alert
3529 if total memory > 1500.0 MB for 5 cycles then alert
3530 if total memory > 2000.0 MB for 5 cycles then restart
3531 if cpu > 60% for 5 cycles then alert
3532 if failed
3533 port 5060 type udp protocol SIP
3534 target me@foo.bar and maxforward 10
3535 then restart
3536
3537 check process asterisk
3538 with pidfile /var/run/asterisk/asterisk.pid
3539 start program = "/usr/sbin/asterisk"
3540 stop program = "/usr/sbin/asterisk -r -x 'shutdown now'"
3541 if total memory > 1000.0 MB for 5 cycles then alert
3542 if total memory > 1500.0 MB for 5 cycles then alert
3543 if total memory > 2000.0 MB for 5 cycles then restart
3544 if cpu > 60% for 5 cycles then alert
3545 if failed
3546 port 5060 type udp protocol SIP
3547 and target me@foo.bar maxforward 10
3548 then restart
3549
3550 Some servers are slow starters, for example Java-based application
3551 servers. If we want to keep the poll cycle low (i.e. < 60 seconds) but
3552 allow some services to take their time to start, the every statement is
3553 handy:
3554
3555 check process dynamo with pidfile /etc/dynamo.pid every 2 cycles
3556 start program = "/etc/init.d/dynamo start"
3557 stop program = "/etc/init.d/dynamo stop"
3558 if failed port 8840 then alert
3559
3560 Here is an example where we group together two database entries so you
3561 can manage them together, e.g. 'monit -g database start all'. The mode
3562 statement is also illustrated in the first entry and has the effect
3563 that Monit will not try to (re)start this service if it is not running:
3564
3565 check process sybase with pidfile /var/run/sybase.pid
3566 start = "/etc/init.d/sybase start"
3567 stop = "/etc/init.d/sybase stop"
3568 mode passive
3569 group database
3570
3571 check process oracle with pidfile /var/run/oracle.pid
3572 start program = "/etc/init.d/oracle start"
3573 stop program = "/etc/init.d/oracle stop"
3574 if failed
3575 port 9001 protocol tns
3576 then restart
3577 group database
3578
3579 This resource check example alerts if the CPU usage of Apache's HTTP
3580 daemon exceeds 40%, or that of the daemon and its child processes
3581 exceeds 60%, for two cycles. Apache is restarted if total CPU usage is
3582 over 80%, and stopped if memory usage is over 100 MB, for five cycles:
3583
3584 check process apache with pidfile /var/run/httpd.pid
3585 start program = "/etc/init.d/httpd start"
3586 stop program = "/etc/init.d/httpd stop"
3587 if cpu > 40% for 2 cycles then alert
3588 if total cpu > 60% for 2 cycles then alert
3589 if total cpu > 80% for 5 cycles then restart
3590 if mem > 100 MB for 5 cycles then stop
3591
3592 This example demonstrates the timestamp statement with exec and how you
3593 may restart apache if its configuration file was changed:
3594
3595 check file httpd.conf with path /etc/httpd/httpd.conf
3596 if changed timestamp
3597 then exec "/etc/init.d/httpd graceful"
3598
3599 In this example we demonstrate usage of the extended alert statement
3600 and a file check dependency:
3601
3602 check process apache with pidfile /var/run/httpd.pid
3603 start = "/etc/init.d/httpd start"
3604 stop = "/etc/init.d/httpd stop"
3605 alert admin@bar on {nonexist, timeout}
3606 with mail-format {
3607 from: bofh@$HOST
3608 subject: apache $EVENT - $ACTION
3609 message: This event occurred on $HOST at $DATE.
3610 Your faithful employee,
3611 monit
3612 }
3613 if failed host www.tildeslash.com port 80 then restart
3614 depend httpd_bin
3615 group apache
3616
3617 check file httpd_bin with path /usr/local/apache/bin/httpd
3618 alert security@bar on {checksum, timestamp,
3619 permission, uid, gid}
3620 with mail-format {subject: Alaaarrm! on $HOST}
3621 if failed checksum
3622 and expect 8f7f419955cefa0b33a2ba316cba3659
3623 then unmonitor
3624 if failed permission 755 then unmonitor
3625 if failed uid "root" then unmonitor
3626 if failed gid "root" then unmonitor
3627 if changed timestamp then alert
3628 group apache
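
The value given after 'expect' has to be computed beforehand. The
following is a minimal shell sketch, assuming that the 32-character value
in the example above is an MD5 digest and that the md5sum(1) utility
referenced at the end of this page is installed. The function name
file_md5 is hypothetical and not part of Monit:

```shell
#!/bin/sh
# file_md5 is an illustrative helper name, not a Monit command.
# It prints only the hex digest of the file given as first argument,
# so the result can be pasted after 'expect' in a checksum test.
file_md5() {
    # Reading from stdin makes md5sum print "<digest>  -";
    # cut keeps the first space-separated field, i.e. the digest.
    md5sum < "$1" | cut -d' ' -f1
}
```

For the check above, one would run: file_md5 /usr/local/apache/bin/httpd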
3629
3630 In this example, we demonstrate usage of the depend statement. In this
3631 case, we want to start oracle and apache. However, we've set up apache
3632 to use oracle as a back end, and if oracle is restarted, apache must be
3633 restarted as well.
3634
3635 check process apache with pidfile /var/run/httpd.pid
3636 start = "/etc/init.d/httpd start"
3637 stop = "/etc/init.d/httpd stop"
3638 depends on oracle
3639
3640 check process oracle with pidfile /var/run/oracle.pid
3641 start = "/etc/init.d/oracle start"
3642 stop = "/etc/init.d/oracle stop"
3643 if failed port 9001 for 5 cycles then restart
3644
3645 Next, we have 2 services, oracle-import and oracle-export that need to
3646 be restarted if oracle is restarted, but are independent of each other.
3647
3648 check process oracle with pidfile /var/run/oracle.pid
3649 start = "/etc/init.d/oracle start"
3650 stop = "/etc/init.d/oracle stop"
3651 if failed port 9001 for 3 cycles then restart
3652
3653 check process oracle-import
3654 with pidfile /var/run/oracle-import.pid
3655 start = "/etc/init.d/oracle-import start"
3656 stop = "/etc/init.d/oracle-import stop"
3657 depends on oracle
3658
3659 check process oracle-export
3660 with pidfile /var/run/oracle-export.pid
3661 start = "/etc/init.d/oracle-export start"
3662 stop = "/etc/init.d/oracle-export stop"
3663 depends on oracle
3664
3666 ~/.monitrc
3667 Default run control file
3668
3669 /etc/monitrc
3670 If the control file is not found in the default
3671 location and /etc contains a monitrc file, this
3672 file will be used instead.
3673
3674 ./monitrc
3675 If the control file is not found in either of the
3676 previous two locations, and the current working
3677 directory contains a monitrc file, this file is
3678 used instead.
3679
3680 ~/.monit.pid
3681 Lock file to help prevent concurrent runs (non-root
3682 mode).
3683
3684 /run/monit.pid
3685 Lock file to help prevent concurrent runs (root mode,
3686 Linux systems, if /run directory is available).
3687
3688 /var/run/monit.pid
3689 Lock file to help prevent concurrent runs (root mode,
3690 Linux systems).
3691
3692 /etc/monit.pid
3693 Lock file to help prevent concurrent runs (root mode,
3694 systems without /var/run).
3695
3696 ~/.monit.state
3697 Monit saves its state to this file and utilises
3698 information found in this file to recover from
3699 a crash. This is a binary file and its content is
3700 only of interest to monit. You may set the location
3701 of this file in the Monit control file or by using
3702 the -s switch when Monit is started.
3703
3704 ~/.monit.id
3705 Monit saves its unique id to this file.
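
The search order for the control file described above can be sketched as
a small shell function. This only illustrates the three locations listed
here; find_monitrc is a hypothetical helper, and the sketch does not
account for a control file named explicitly when Monit is started:

```shell
#!/bin/sh
# find_monitrc is an illustrative helper, not part of Monit.
# It prints the first control file found, in the order listed above,
# and returns non-zero if none of the three locations has one.
find_monitrc() {
    for candidate in "$HOME/.monitrc" /etc/monitrc ./monitrc; do
        if [ -f "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}
```

Calling find_monitrc from a shell prints the path a lookup in this order
would pick, or prints nothing and fails if no control file exists.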
3706
3708 No environment variables are used by Monit. However, when Monit
3709 executes a start/stop/restart program or an exec action, it will set
3710 several environment variables which can be utilised by the executable
3711 to get information about the event that triggered the action.
3712
3713 The following environment variable is set for every program executed by
3714 Monit, including check program:
3715
3716 MONIT_SERVICE
3717 The name of the service (from monitrc) for which the program is
3718 executed.
3719
3720 The following environment variables are only available in the service
3721 start/stop/restart program and exec action context:
3722
3723 MONIT_EVENT
3724 The event that occurred on the service
3725
3726 MONIT_DESCRIPTION
3727 A description of the error condition
3728
3729 MONIT_DATE
3730 The time and date (RFC 822 style) the event occurred
3731
3732 MONIT_HOST
3733 The host the event occurred on
3734
3735 The following environment variables are only available in the check
3736 process start/stop/restart program and exec action context:
3737
3738 MONIT_PROCESS_PID
3739 The process pid. This may be 0 if the process was (re)started.
3740
3741 MONIT_PROCESS_MEMORY
3742 Process memory. This may be 0 if the process was (re)started.
3743
3744 MONIT_PROCESS_CHILDREN
3745 Process children. This may be 0 if the process was (re)started.
3746
3747 MONIT_PROCESS_CPU_PERCENT
3748 Process cpu%. This may be 0 if the process was (re)started.
3749
3750 The following environment variables are only available for check
3751 program start/stop/restart program and exec action context:
3752
3753 MONIT_PROGRAM_STATUS
3754 The program status (exit value).
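
As an illustration of how an executable can use these variables, here is
a minimal sketch of a shell function that could be used in a script run
from an exec action. The function name and the output format are
assumptions made for this example; variables that are not set fall back
to a placeholder:

```shell
#!/bin/sh
# format_monit_event is a hypothetical helper, not part of Monit.
# It prints one line that summarises the event, using the environment
# variables described above. Unset variables become placeholders.
format_monit_event() {
    printf '%s [%s] %s on %s: %s\n' \
        "${MONIT_DATE:-unknown-date}" \
        "${MONIT_SERVICE:-unknown-service}" \
        "${MONIT_EVENT:-unknown-event}" \
        "${MONIT_HOST:-unknown-host}" \
        "${MONIT_DESCRIPTION:-no description}"
}
```

A handler script could call format_monit_event and append the output to
a log file of its own choosing.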
3755
3757 If a Monit daemon is running, SIGUSR1 wakes it up from its sleep phase
3758 and forces a poll of all services. SIGTERM and SIGINT will gracefully
3759 terminate a Monit daemon. The SIGTERM signal is sent to a Monit daemon
3760 if Monit is started with the quit action argument.
3761
3762 Sending a SIGHUP signal to a running Monit daemon will force the daemon
3763 to reinitialise itself; specifically, it will reread its configuration
3764 and close and reopen its log files.
3765
3766 Running Monit in foreground while a background Monit daemon is running
3767 will wake up the daemon.
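
The signals above can be sent with kill(1) using the pid stored in one
of the lock files listed earlier. The following is a minimal sketch;
signal_monit is a hypothetical helper, and the caller is assumed to know
which pid file applies to the running daemon:

```shell
#!/bin/sh
# signal_monit is an illustrative helper, not part of Monit.
# Usage: signal_monit SIGNAL PIDFILE
# Reads the pid from PIDFILE and sends SIGNAL (e.g. USR1, HUP, TERM).
signal_monit() {
    sig=$1
    pidfile=$2
    if [ ! -r "$pidfile" ]; then
        echo "cannot read pid file: $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    # Refuse anything that is not a plain decimal number.
    case $pid in
        ''|*[!0-9]*)
            echo "invalid pid in $pidfile" >&2
            return 1
            ;;
    esac
    kill -s "$sig" "$pid"
}
```

For example, signal_monit USR1 /var/run/monit.pid would wake up a daemon
that uses that pid file, and signal_monit HUP /var/run/monit.pid would
ask it to reinitialise itself.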
3768
3770 This is a very silent program. Use the -v switch if you want to see
3771 what Monit is doing, and tail -f the log file. Optionally, for testing
3772 purposes, you can start Monit with the -Iv switch. Monit will then
3773 print debug information to the console. To stop Monit in this mode,
3774 simply press CTRL+C (i.e. SIGINT) in the same console.
3775
3776 The syntax (and parser) of the control file was inspired by Eric S.
3777 Raymond et al.'s excellent fetchmail program. Some portions of this man
3778 page also draw inspiration from the same authors.
3779
3781 Copyright (C) 2001-2017 by Tildeslash Ltd. All Rights Reserved. This
3782 product is distributed in the hope that it will be useful, but WITHOUT
3783 any warranty; without even the implied warranty of MERCHANTABILITY or
3784 FITNESS for a particular purpose.
3785
3787 GNU text utilities; md5sum(1); sha1sum(1); openssl(1); glob(7);
3788 regex(7); http://www.mmonit.com/
3789
3790
3791
3792 5.25.1 www.mmonit.com MONIT(1)