monit(1)

MONIT(1)                         User Commands                        MONIT(1)

NAME

       monit - utility for monitoring services on a Unix system

SYNOPSIS

       monit [options] {arguments}

DESCRIPTION

       monit is a utility for managing and monitoring processes, files,
       directories and devices on a Unix system. Monit conducts automatic
       maintenance and repair and can execute meaningful causal actions in
       error situations. For example, monit can start a process if it is not
       running, restart a process if it does not respond, and stop a process
       if it uses too many resources. You may use monit to monitor files,
       directories and devices for changes, such as timestamp changes,
       checksum changes or size changes.

       Monit is controlled via an easy-to-configure control file based on a
       free-format, token-oriented syntax. Monit logs to syslog or to its own
       log file and notifies you about error conditions via customizable alert
       messages. Monit can perform various TCP/IP network checks and protocol
       checks, and can use SSL for such checks. Monit provides an HTTP(S)
       interface, and you may use a browser to access the monit program.

GENERAL OPERATION

       The behavior of monit is controlled by command-line options and a run
       control file, ~/.monitrc, the syntax of which we describe in a later
       section. Command-line options override .monitrc declarations.

       The following options are recognized by monit. However, it is
       recommended that you set options (when applicable) directly in the
       .monitrc control file.

       General Options and Arguments

       -c file
          Use this control file

       -d n
          Run as a daemon once per n seconds

       -g name
          Set group name for start, stop, restart and status

       -l logfile
          Print log information to this file

       -p pidfile
          Use this lock file in daemon mode

       -s statefile
          Write state information to this file

       -I
          Do not run in background (needed when run from init)

       -t
          Run syntax check for the control file

       -v
          Verbose mode; print diagnostic output

       -H [filename]
          Print MD5 and SHA1 hashes of the file, or of stdin if the
          filename is omitted; monit will exit afterwards

       -V
          Print version number and patch level

       -h
          Print a help text

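       As a sketch of the recommendation to prefer control-file settings, the
       -d, -l, -p and -s options could be expressed in the control file
       roughly as follows. The set daemon and set logfile statements appear
       elsewhere in this manual; the set pidfile and set statefile statements
       and all paths shown are assumptions made for illustration:

        set daemon 120                      # same effect as -d 120
        set logfile /var/log/monit.log      # same effect as -l /var/log/monit.log
        set pidfile /var/run/monit.pid      # assumed counterpart of -p
        set statefile /var/run/monit.state  # assumed counterpart of -s
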
       In addition to the options above, monit can be started with one of the
       following action arguments; monit will then execute the action and exit
       without transforming itself into a daemon.

       start all
          Start all services listed in the control file and
          enable monitoring for them. If the group option is
          set, only start and enable monitoring of services in
          the named group.

       start name
          Start the named service and enable monitoring for
          it. The name is a service entry name from the
          monitrc file.

       stop all
          Stop all services listed in the control file and
          disable their monitoring. If the group option is
          set, only stop and disable monitoring of the services
          in the named group.

       stop name
          Stop the named service and disable its monitoring.
          The name is a service entry name from the monitrc
          file.

       restart all
          Stop and start all services. If the group option
          is set, only restart the services in the named group.

       restart name
          Restart the named service. The name is a service entry
          name from the monitrc file.

       monitor all
          Enable monitoring of all services listed in the
          control file. If the group option is set, only start
          monitoring of services in the named group.

       monitor name
          Enable monitoring of the named service. The name is
          a service entry name from the monitrc file. Monit will
          also enable monitoring of all services this service
          depends on.

       unmonitor all
          Disable monitoring of all services listed in the
          control file. If the group option is set, only disable
          monitoring of services in the named group.

       unmonitor name
          Disable monitoring of the named service. The name is
          a service entry name from the monitrc file. Monit
          will also disable monitoring of all services that
          depend on this service.

       status
          Print full status information for each service.

       summary
          Print short status information for each service.

       reload
          Reinitialize a running monit daemon; the daemon will
          reread its configuration and close and reopen its log
          files.

       quit
          Kill a monit daemon process.

       validate
          Check all services listed in the control file. This
          action is also the default behavior when monit runs
          in daemon mode.

WHAT TO MONITOR

       You may use monit to monitor daemon processes or similar programs
       running on localhost. Monit is particularly useful for monitoring
       daemon processes, such as those started at system boot time from
       /etc/init.d/, for instance sendmail, sshd, apache and mysql. Unlike
       many monitoring systems, monit can act when an error situation occurs.
       For example, if sendmail is not running, monit can start sendmail, or
       if apache is using too many system resources (e.g. if a DoS attack is
       in progress), monit can stop or restart apache and send you an alert
       message. Monit also monitors process characteristics, such as whether
       a process has become a zombie and how much memory or how many CPU
       cycles a process is using.

       You may also use monit to monitor files, directories and devices on
       localhost. Monit can monitor these items for changes, such as
       timestamp changes, checksum changes or size changes. This is also
       useful for security reasons: you can monitor the MD5 checksum of files
       that should not change.

       You may even use monit to monitor remote hosts. First and foremost
       monit is a utility for monitoring and mending services on localhost,
       but if a service depends on a remote service, e.g. a database server or
       an application server, it might be useful to be able to test a remote
       host as well.

       You may also monitor general system-wide resources such as CPU usage,
       memory and load average.

HOW TO MONITOR

       monit is configured and controlled via a control file called monitrc.
       The default location for this file is ~/.monitrc. If this file does not
       exist, monit will try /etc/monitrc, then @sysconfdir@/monitrc and
       finally ./monitrc.

       A monit control file consists of a series of service entries and global
       option statements in a free-format, token-oriented syntax. Comments
       begin with a # and extend through the end of the line. There are three
       kinds of tokens in the control file: grammar keywords, numbers and
       strings.

       On a semantic level, the control file consists of three types of
       statements:

       1. Global set-statements
           A global set-statement starts with the keyword set and the item to
           configure.

       2. Global include-statement
           The include statement consists of the keyword include and a glob
           string.

       3. One or more service entry statements.
           A service entry starts with the keyword check followed by the
           service type.

       This is the hello galaxy version of a monit control file:

        #
        # monit control file
        #

        set daemon 120 # Poll at 2-minute intervals
        set logfile syslog facility log_daemon
        set alert foo@bar.baz
        set httpd port 2812 and use address localhost
            allow localhost   # Allow localhost to connect
            allow admin:monit # Allow Basic Auth

        check system myhost.mydomain.tld
           if loadavg (1min) > 4 then alert
           if loadavg (5min) > 2 then alert
           if memory usage > 75% then alert
           if cpu usage (user) > 70% then alert
           if cpu usage (system) > 30% then alert
           if cpu usage (wait) > 20% then alert

        check process apache
           with pidfile "/usr/local/apache/logs/httpd.pid"
           start program = "/etc/init.d/httpd start"
           stop program = "/etc/init.d/httpd stop"
           if 2 restarts within 3 cycles then timeout
           if totalmem > 100 Mb then alert
           if children > 255 for 5 cycles then stop
           if cpu usage > 95% for 3 cycles then restart
           if failed port 80 protocol http then restart
           group server
           depends on httpd.conf, httpd.bin

        check file httpd.conf
            with path /usr/local/apache/conf/httpd.conf
            # Reload apache if the httpd.conf file was changed
            if changed checksum
               then exec "/usr/local/apache/bin/apachectl graceful"

        check file httpd.bin
            with path /usr/local/apache/bin/httpd
            # Run /watch/dog in the case that the binary was changed
            # and alert in the case that the checksum value recovered
            # later
            if failed checksum then exec "/watch/dog"
               else if recovered then alert

        include /etc/monit/mysql.monitrc
        include /etc/monit/mail/*.monitrc

       This example illustrates global settings, a system check and service
       entries for monitoring the apache web server process as well as related
       files. The meaning of the various statements will be explained in the
       following sections.

LOGGING

       monit will log status and error messages to a log file. Use the set
       logfile statement in the monitrc control file. To set up monit to log
       to its own log file, use e.g. set logfile /var/log/monit.log. If
       syslog is given as the value for the -l command-line switch (or the
       statement set logfile syslog is found in the control file), monit will
       use the syslog system daemon to log messages. A priority is assigned
       to each message based on the context. To turn off logging, simply do
       not set the logfile in the control file (and, of course, do not use the
       -l switch).

DAEMON MODE

       The -d interval command-line switch runs monit in daemon mode. You must
       specify a numeric argument, which is the polling interval in seconds.

       In daemon mode, monit detaches from the console, puts itself in the
       background and runs continuously, checking each specified service and
       then sleeping for the given poll interval.

       Simply invoking

              monit -d 300

       will poll all services described in your ~/.monitrc file every 5
       minutes.

       It is strongly recommended to set the poll interval in your ~/.monitrc
       file instead, by using set daemon n, where n is an integer number of
       seconds. If you do this, monit will always start in daemon mode (as
       long as no action arguments are given).

       Monit creates a per-instance lock file in daemon mode. If you need
       more monit instances, you will need more configuration files, each
       pointing to its own lock file.

       Calling monit with a monit daemon running in the background sends a
       wake-up signal to the daemon, forcing it to check services immediately.

       The quit argument will kill a running daemon process instead of waking
       it up.

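       As a minimal sketch, the control-file counterpart of monit -d 300,
       followed by a second control file for an additional instance, might
       look like this. The second file name, its path and the set pidfile
       statement are assumptions made for illustration:

        # ~/.monitrc
        set daemon 300    # poll every 300 seconds (5 minutes)

        # /etc/monit/second.monitrc (hypothetical second instance)
        set daemon 60
        set pidfile /var/run/monit-second.pid   # assumed statement; own lock file

       The second instance would then be started with the -c option pointing
       at its own control file.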

INIT SUPPORT

       Monit can run and be controlled from init. If monit should crash, init
       will re-spawn a new monit process. Using init to start monit is
       probably the best way to run monit if you want to be certain that you
       always have a running monit daemon on your system. (It is obvious, but
       nevertheless worth stressing: make sure that the control file does not
       have any syntax errors before you start monit from init. Also, make
       sure that if you run monit from init, you do not start monit from a
       startup script as well.)

       To set up monit to run from init, you can either use the 'set init'
       statement in monit's control file or use the -I option on the command
       line. Here is what you must add to /etc/inittab:

         # Run monit in standard run-levels
         mo:2345:respawn:/usr/local/bin/monit -Ic /etc/monitrc

       After you have modified init's configuration file, you can run the
       following command to re-examine /etc/inittab and start monit:

         telinit q

       For systems without telinit:

         kill -1 1

       If monit is used to monitor services that are also started at boot time
       (e.g. services started via SYSV init rc scripts or via inittab) then,
       in some cases, a race condition could occur. That is, if a service is
       slow to start, monit can assume that the service is not running and
       possibly try to start it and raise an alert while, in fact, the service
       is already about to start or already in its startup sequence. Please
       see the FAQ for solutions to this problem.

INCLUDE FILES

       The monit control file, monitrc, can include additional configuration
       files. This feature helps to maintain a certain structure or to place
       repeating settings into one file. Include statements can be placed at
       virtually any spot. The syntax is the following:

         INCLUDE globstring

       The globstring is any kind of string as defined in glob(7). Thus, you
       can refer to a single file or you can load several files at once. If
       you want to use whitespace in your string, the globstring needs to be
       enclosed in single quotes (') or double quotes ("). For example,

        INCLUDE "/etc/monit/monit configuration files/printer.*.monitrc"

       loads any file matching the single globstring. If the globstring
       matches a directory instead of a file, it is silently ignored.

       INCLUDE statements in included files are parsed as in the main control
       file.

       If the globstring matches several results, the files are included in
       unsorted order. If you need to rely on a certain order, you might need
       to use separate include statements, one per file.

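       As a sketch of the last point, a fixed inclusion order could be
       obtained by listing each file in its own statement instead of relying
       on a glob. The file names below are hypothetical:

        # Included in exactly this order
        include /etc/monit/base.monitrc
        include /etc/monit/services.monitrc

        # Order among the matches is not guaranteed
        include /etc/monit/extra/*.monitrc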

GROUP SUPPORT

       Service entries in the control file, monitrc, can be grouped together
       by the group statement. The syntax is simply (keyword in capitals):

         GROUP groupname

       With this statement it is possible to group similar service entries
       together and manage them as a whole. Monit provides functions to start,
       stop and restart a group of services, like so:

       To start a group of services from the console:

         monit -g <groupname> start

       To stop a group of services:

         monit -g <groupname> stop

       To restart a group of services:

         monit -g <groupname> restart

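       For illustration, a service entry that joins a group might look like
       the following sketch. The service name, paths and group name are
       hypothetical; only the group statement itself is taken from the syntax
       above:

        check process mysql
           with pidfile "/var/run/mysqld/mysqld.pid"
           start program = "/etc/init.d/mysql start"
           stop program = "/etc/init.d/mysql stop"
           group database    # manage together with other "database" entries

       With such an entry, monit -g database restart would be expected to
       restart every service whose entry carries the same group name.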

MONITORING MODE

       Monit supports three monitoring modes per service: active, passive and
       manual. See also the example section below for usage of the mode
       statement.

       In active mode, monit will monitor a service and, in case of problems,
       act and raise alerts, or start, stop or restart the service. Active
       mode is the default mode.

       In passive mode, monit will passively monitor a service and
       specifically not try to fix a problem, but it will still raise alerts
       in case of a problem.

       For use in clustered environments there is also a manual mode. In this
       mode, monit will enter active mode only if a service was brought under
       monit's control, for example by executing the following command in the
       console:

         monit start sybase
         (monit will call sybase's start method and enable monitoring)

       If a service was not started by monit, or was stopped or disabled, for
       example by:

         monit stop sybase
         (monit will call sybase's stop method and disable monitoring)

       monit will not monitor the service. This allows you to have services
       configured in monitrc and start them with monit only if they should
       run. This feature can be used to build a simple failsafe cluster. To
       see how, read more about how to set up a cluster with monit using the
       heartbeat system in the examples sections below.

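       A minimal sketch of a service entry placed in manual mode follows. It
       assumes that the mode statement takes the form of the keyword mode
       followed by the mode name; the pidfile and program paths are
       hypothetical:

        check process sybase
           with pidfile "/var/run/sybase.pid"
           start program = "/etc/init.d/sybase start"
           stop program = "/etc/init.d/sybase stop"
           mode manual    # monitored only after "monit start sybase"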

ALERT MESSAGES

       Monit will raise an email alert in the following situations:

        o A service timed out
        o A service does not exist
        o A service related data access problem
        o A service related program execution problem
        o A service is of invalid object type
        o An ICMP problem
        o A port connection problem
        o A resource statement match
        o A file checksum problem
        o A file size problem
        o A file/directory timestamp problem
        o A file/directory/device permission problem
        o A file/directory/device uid problem
        o A file/directory/device gid problem

       Monit will also send an alert each time a monitored object changes.
       This includes:

        o Monit started, stopped or reloaded
        o A file checksum changed
        o A file size changed
        o A file content match
        o A file/directory timestamp changed

       Use the alert statement to tell monit that you want alert messages
       sent to an email address. If you do not specify an alert statement,
       monit will not send alert messages.

       There are two forms of alert statement:

        o Global - common for all services
        o Local  - per service

       In both cases you can use more than one alert statement. In other
       words, you can send many different emails to many different addresses.
       (In case this gives you a new business idea: monit is not really
       suitable for sending spam.)

       Recipients in the global and in the local lists are alerted when a
       service failed, recovered or changed. If the same email address is in
       the global and in the local list, monit will send only one alert.
       Locally (per service) defined alert email addresses override global
       addresses in case of a conflict. Finally, you may choose to use only a
       global alert list (recommended), a local per-service list, or both.

       It is also possible to disable the global alerts locally for particular
       service(s) and recipients.

       Setting a global alert statement

       If a change occurs on a monitored service, monit will send an alert to
       all recipients in the global list who have registered interest in the
       event type. Here is the syntax for the global alert statement:

       SET ALERT mail-address [ [NOT] {events}] [MAIL-FORMAT {mail-format}]
       [REMINDER number]

       Simply using the following in the global section of monitrc:

        set alert foo@bar

       will send a default email to the address foo@bar whenever an event
       occurs on any service. Such an event may be that a service timed out,
       a service does not exist or a service does exist (on recovery) and so
       on. If you want to send alert messages to more email addresses, add a
       set alert 'email' statement for each address.

       For explanations of the events, MAIL-FORMAT and REMINDER keywords
       above, please see below.

       If you want a global alert recipient to receive all event alerts except
       certain types, you can use the NOT negation option ahead of the events
       list, which sets the recipient for "all but the specified events" (see
       below for more details).

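       As a sketch of the negated form in a global statement, the following
       would be expected to send foo@bar every alert except those of the
       instance event type. The address is a placeholder, and on is one of
       the optional noise keywords:

        set alert foo@bar not on { instance }
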
       Setting a local alert statement

       Each service can also have its own recipient list.

       ALERT mail-address [ [NOT] {events}] [MAIL-FORMAT {mail-format}]
       [REMINDER number]

       or

       NOALERT mail-address

       If you only want an alert message sent for certain events for certain
       service(s), for example only for timeout events or only if a service
       died, then append a filter block to the alert statement:

        check process myproc with pidfile /var/run/my.pid
          alert foo@bar only on { timeout, nonexist }
          ...

       (only and on are noise keywords, ignored by monit. As a side note,
       noise keywords are used in the control file grammar to make an entry
       resemble English and thus make it easier to read (or so goes the
       philosophy). The full set of available noise keywords is listed below
       in the Control File section.)

       You can also set the alert to send all events except those specified
       by using list negation: the word not ahead of the event list. For
       example, when you want to receive alerts for all events except those
       related to the monit instance, you can write (note that the noise words
       'but' and 'on' are optional):

        check system myserver
          alert foo@bar but not on { instance }
          ...

       instead of:

          alert foo@bar on { changed
                             checksum
                             connection
                             data
                             exec
                             gid
                             icmp
                             invalid
                             match
                             nonexist
                             permission
                             resource
                             size
                             timeout
                             timestamp
                             uid }

       This will enable all alerts for foo@bar, except the monit instance
       related alerts.

       Event filtering can be used to send mail to different email addresses
       depending on the events that occurred. For instance:

        alert foo@bar { nonexist, timeout, resource, icmp, connection }
        alert security@bar on { checksum, permission, uid, gid }
        alert manager@bar

       This will send an alert message to foo@bar whenever a nonexist,
       timeout, resource, icmp or connection problem occurs and a message to
       security@bar if a checksum, permission, uid or gid problem occurs.
       Finally, a message is sent to manager@bar whenever any error event
       occurs.

       This is the list of events you can use in a mail-filter: uid, gid,
       size, nonexist, data, icmp, instance, invalid, exec, changed, timeout,
       resource, checksum, match, timestamp, connection, permission

       You can also disable the alerts locally using the NOALERT statement.
       This is useful, for example, when you monitor many services and use the
       global alert statement, but do not want to receive alerts for some
       minor subset of services:

        noalert appadmin@bar

       For example, if you place the noalert statement in the 'check system'
       entry, the given user will not receive the system related alerts (such
       as monit instance started/stopped/reloaded alerts, system overloaded
       alerts, etc.) but will receive the alerts for all other monitored
       services.

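       Put together, the noalert case described above might be sketched as
       follows. The address and the system name are placeholders:

        set alert appadmin@bar      # global: alerts for every service

        check system myserver
          noalert appadmin@bar      # but no system related alerts
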
       The following example will alert foo@bar on all events on all services
       by default, except the service mybar, which will send an alert only on
       timeout. The trick is based on the fact that a local definition of the
       same recipient overrides the global setting (including registered
       events and mail format):

        set alert foo@bar

        check process myfoo with pidfile /var/run/myfoo.pid
          ...
        check process mybar with pidfile /var/run/mybar.pid
          alert foo@bar only on { timeout }

       The 'instance' alert type reports events related to monit internals,
       such as when a monit instance was started, stopped or reloaded.

       If the MTA (mail server) for sending alerts is not available, monit can
       queue events on the local file system until the MTA recovers. Monit
       will then post the queued events in order, with their original
       timestamps, so that the events are not lost. This feature is most
       useful if monit is used together with e.g. m/monit and when event
       history is important.

       Alert message layout

       monit provides a default mail message layout that is short and to the
       point. Here's an example of a standard alert mail sent by monit:

        From: monit@tildeslash.com
        Subject: monit alert -- Does not exist apache
        To: hauk@tildeslash.com
        Date: Thu, 04 Sep 2003 02:33:03 +0200

        Does not exist Service apache

               Date:   Thu, 04 Sep 2003 02:33:03 +0200
               Action: restart
               Host:   www.tildeslash.com

        Your faithful employee,
        monit

       If you want to, you can change the format of this message with the
       optional mail-format statement. The syntax for this statement is as
       follows:

        mail-format {
             from: monit@localhost
          subject: $SERVICE $EVENT at $DATE
          message: Monit $ACTION $SERVICE at $DATE on $HOST: $DESCRIPTION.
                   Yours sincerely,
                   monit
        }

       The keyword from: is the email address monit should pretend it is
       sending from. It does not have to be a real mail address, but it must
       be a properly formatted mail address of the form name@domain. The
       keyword subject: is for the email subject line. The subject must be on
       only one line. The message: keyword denotes the mail body. If used,
       this keyword should always be the last in a mail-format statement. The
       mail body can be as long as you want, but it must not contain the '}'
       character.

       All of these format keywords are optional, but you must provide at
       least one. Thus, if you only want to change the from address monit is
       using, you can do:

        set alert foo@bar with mail-format { from: bofh@bar.baz }

       In the previous example you will notice that some special $XXX
       variables were used. If used, they will be substituted and expanded
       into the text with these values:

       * $EVENT
            A string describing the event that occurred. The values are
            fixed and are:

            Event:    | Failure state:          | Recovery state:
            ---------------------------------------------------------------
            CHANGED   | "Changed"               | "Changed back"
            CHECKSUM  | "Checksum failed"       | "Checksum passed"
            CONNECTION| "Connection failed"     | "Connection passed"
            DATA      | "Data access error"     | "Data access succeeded"
            EXEC      | "Execution failed"      | "Execution succeeded"
            GID       | "GID failed"            | "GID passed"
            ICMP      | "ICMP failed"           | "ICMP passed"
            INSTANCE  | "Monit instance changed"| "Monit instance changed not"
            INVALID   | "Invalid type"          | "Type passed"
            MATCH     | "Regex match"           | "No regex match"
            NONEXIST  | "Does not exist"        | "Exists"
            PERMISSION| "Permission failed"     | "Permission passed"
            RESOURCE  | "Resource limit matched"| "Resource limit passed"
            SIZE      | "Size failed"           | "Size passed"
            TIMEOUT   | "Timeout"               | "Timeout recovery"
            TIMESTAMP | "Timestamp failed"      | "Timestamp passed"
            UID       | "UID failed"            | "UID passed"

       * $SERVICE
            The service entry name in monitrc

       * $DATE
            The current time and date (RFC 822 date style).

       * $HOST
            The name of the host monit is running on

       * $ACTION
            The name of the action which was done. Action names are fixed
            and are:

            Action:  | Name:
            --------------------
            ALERT    | "alert"
            EXEC     | "exec"
            MONITOR  | "monitor"
            RESTART  | "restart"
            START    | "start"
            STOP     | "stop"
            UNMONITOR| "unmonitor"

       * $DESCRIPTION
            The description of the error condition

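       As a worked illustration of the substitution, assume the mail-format
       shown earlier in this section and the event from the sample alert mail
       (service apache, event NONEXIST in its failure state, action restart,
       host www.tildeslash.com). The subject and message would then be
       expected to expand roughly as follows. The $DESCRIPTION text depends on
       the error condition and is left unexpanded here, and the message line
       is wrapped only for display:

        subject: apache Does not exist at Thu, 04 Sep 2003 02:33:03 +0200
        message: Monit restart apache at Thu, 04 Sep 2003 02:33:03 +0200
                 on www.tildeslash.com: $DESCRIPTION.
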
       Setting a global mail format

       It is possible to set a standard mail format with the following global
       set-statement (keywords are in capitals):

       SET MAIL-FORMAT {mail-format}

       A format set with this statement applies to every alert statement that
       does not specify its own mail-format. This statement is most useful
       for setting a default from address for messages sent by monit, like so:

        set mail-format { from: monit@foo.bar.no }

       Setting an error reminder

       By default, monit sends just one error notification when a service
       fails and another one when it has recovered. If you want to be
       notified more than once while the service remains failed, you can use
       the reminder option of the alert statement (keywords are in capitals):

       ALERT ... [WITH] REMINDER [ON] number [CYCLES]

       For example, if you want to be notified every tenth cycle while the
       service remains failed, you can use:

         alert foo@bar with reminder on 10 cycles

       If you want to be notified on each failed cycle, you can use:

         alert foo@bar with reminder on 1 cycle

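       Following the alert statement syntax given earlier, in which the events
       block precedes the REMINDER option, an event filter and a reminder
       could be combined as in this sketch. The service name, pidfile path and
       address are placeholders:

        check process myproc with pidfile /var/run/my.pid
          # Alert on timeout or nonexist only, and repeat the alert
          # every 5 cycles while the service remains failed
          alert foo@bar only on { timeout, nonexist } with reminder on 5 cycles
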
729       Setting a mail server for alert messages
730
731       The mail server monit should use to send alert messages is defined with
732       a global set statement (keywords are in capital and optional statements
733       in [brackets]):
734
735        SET MAILSERVER {hostname|ip-address [PORT port]
736                       [USERNAME username] [PASSWORD password]
737                       [using SSLV2|SSLV3|TLSV1] [CERTMD5 checksum]}+
738                       [with TIMEOUT X SECONDS]
739
740       The port statement allows you to use SMTP servers other than those
741       listening on port 25. If omitted, port 25 is used when SSL is not
742       enabled or TLS is used; otherwise port 465 is used (for SSL v2 and v3).
743
744       Monit supports plain SMTP authentication: you can set the username and
745       password using the USERNAME and PASSWORD options.
746
747       To use secure communication, use the SSLV2, SSLV3 or TLSV1 option. You
748       can also specify the server certificate checksum using the CERTMD5
749       option.
750
751       As you can see, it is possible to set several SMTP servers. If monit
752       cannot connect to the first server in the list it will try the second
753       server and so on. Monit has a default connection timeout of 5 seconds,
754       and if the SMTP server is slow, monit could time out when connecting or
755       reading from the server. You can use the optional timeout statement to
756       explicitly set the timeout to a higher value if needed. Here is an
757       example for setting several mail servers:
758
759        set mailserver mail.tildeslash.com,
760                       mail.foo.bar port 10025 username "Rabbi" password "Loewe" using tlsv1,
761                       localhost
762                       with timeout 15 seconds
763
764       Here monit will first try to connect to the server
765       "mail.tildeslash.com". If this server is down, monit will try
766       "mail.foo.bar" on port 10025 using the given credentials via TLS, and
767       finally "localhost". We also set an explicit connect and read timeout:
768       if monit cannot connect to the first SMTP server in the list within 15
769       seconds, it will try the next server and so on. The set mailserver
770       statement is optional; if it is not defined, monit defaults to using
771       localhost as the SMTP server.
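
The default-port rule and the fallback order of the server list can be sketched in Python. This is only an illustration of the documented behaviour, not monit's implementation. The names default_port and pick_server are hypothetical, and the connect callback is an assumed stand-in for a real SMTP connection attempt.

```python
def default_port(ssl=None):
    """Port used when no PORT is given: 465 for SSLV2/SSLV3, otherwise 25."""
    return 465 if ssl in ("SSLV2", "SSLV3") else 25


def pick_server(servers, connect, timeout=5):
    """Return the first (host, port) for which connect() succeeds.

    servers: list of dicts with a "host" key and optional "port"/"ssl" keys.
    connect: hypothetical callable(host, port, timeout) -> bool.
    """
    for server in servers:
        port = server.get("port") or default_port(server.get("ssl"))
        if connect(server["host"], port, timeout):
            return server["host"], port
    return None  # no server in the list could be reached


servers = [
    {"host": "mail.tildeslash.com"},
    {"host": "mail.foo.bar", "port": 10025, "ssl": "TLSV1"},
    {"host": "localhost"},
]

# Pretend the first server is down; every other host accepts the connection.
def fake_connect(host, port, timeout):
    return host != "mail.tildeslash.com"

print(pick_server(servers, fake_connect, timeout=15))
```

With the first server unreachable, the sketch falls through to "mail.foo.bar" on port 10025, mirroring the order monit is described to follow.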
772
773       Event queue
774
775       Monit can optionally queue event alerts that cannot be sent. For
776       example, if no mail server is available at the moment, monit can store
777       events in a queue and try to reprocess them at the next cycle. As soon
778       as the mail server recovers, monit will post the queued events. The
779       queue is persistent across monit restarts and, provided that the
780       back-end filesystem is persistent too, across system restarts as well.
781
782       By default, the queue is disabled and if the alert handler fails, monit
783       will simply drop the alert message. To enable the event queue, add the
784       following statement to the monit control file:
785
786        SET EVENTQUEUE BASEDIR <path> [SLOTS <number>]
787
788       The <path> is the path to the directory where events will be stored.
789       Optionally if you want to limit the queue size (maximum events count),
790       use the slots option. If the slots option is not used, monit will store
791       as many events as the backend filesystem allows.
792
793       Example:
794
795         set eventqueue
796             basedir /var/monit
797             slots 5000
798
799       The events are stored in binary format, one file per event. The file
800       size is ca. 130 bytes or a bit more (depending on the message length).
801       The file name is composed of the unix timestamp, underscore and the
802       service name, for example:
803
804        /var/monit/1131269471_apache
805
806       If you are running more than one monit instance on the same machine,
807       you must use separate event queue directories to avoid sending alerts
808       to the wrong addresses.
809
810       If you want to purge the queue by hand (remove queued event-files),
811       monit should be stopped before the removal.
812

SERVICE TIMEOUT

814       monit provides a service timeout mechanism for situations where a ser‐
815       vice simply refuses to start or respond over a longer period. In cases
816       like this, and particularly if monit's poll-cycle is low, monit will
817       simply increase the machine load by trying to restart the service.
818
819       The timeout mechanism monit provides is based on two variables, i.e.
820       the number of times the service has been started and the number of
821       poll-cycles. For example, if a service had x restarts within y
822       poll-cycles (where x <= y) then monit will time out and not (re)start
823       the service on the next cycle. If a timeout occurs, monit will send you
824       an alert message if you have registered interest in this event.
825
826       The syntax for the timeout statement is as follows (keywords are in
827       capital):
828
829       IF NUMBER RESTART NUMBER CYCLE(S) THEN TIMEOUT
830
831       Where the first number is the number of service restarts and the sec‐
832       ond, the number of poll-cycles. If the number of cycles was reached
833       without a timeout, the service start-counter is reset to zero. This
834       provides some granularity to catch exceptional cases and do a service
835       timeout, but let occasional service start and restarts happen without
836       having an accumulated timeout.
837
838       Here is an example where monit will timeout (not check the service) if
839       the service was restarted 2 times within 3 cycles:
840
841        if 2 restarts within 3 cycles then timeout
842
843       To have monit check the service again after a timeout, run 'monit moni‐
844       tor service' from the command line. This will remove the timeout lock
845       in the daemon and make the daemon start and check the service again.
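
The restart counting described in this section can be sketched in Python. This is one reading of the text, not monit's actual code: the class name RestartTimeout and its methods are hypothetical, and the exact cycle at which the counters are reset is an assumption.

```python
class RestartTimeout:
    """Sketch of 'if X restarts within Y cycles then timeout'."""

    def __init__(self, max_restarts, cycles):
        self.max_restarts = max_restarts
        self.cycles = cycles
        self.restarts = 0      # service start-counter
        self.elapsed = 0       # poll-cycles seen in the current window
        self.timed_out = False

    def poll(self, restarted):
        """Record one poll-cycle; return True once the service has timed out."""
        if self.timed_out:
            return True        # stays locked until monitor() is called
        self.elapsed += 1
        if restarted:
            self.restarts += 1
        if self.restarts >= self.max_restarts:
            self.timed_out = True
        elif self.elapsed >= self.cycles:
            # Window passed without a timeout: reset the start-counter.
            self.restarts = 0
            self.elapsed = 0
        return self.timed_out

    def monitor(self):
        """Counterpart of running 'monit monitor service': remove the lock."""
        self.restarts = 0
        self.elapsed = 0
        self.timed_out = False


rule = RestartTimeout(max_restarts=2, cycles=3)
print([rule.poll(r) for r in (True, False, True)])
```

Occasional restarts spread over several windows never accumulate in this sketch, which is the granularity the section describes.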
846

SERVICE TESTS

848       Monit provides several tests you may utilize in a service entry to test
849       a service. There are two classes of tests: variable and constant object
850       tests.
851
852       Constant object tests are related to a failed/passed state. In the case
853       of an error, monit will watch whether the failed parameter recovers, in
854       which case it will handle the recovery-related action. General format:
855
856       IF <TEST> [[<X>] [TIMES WITHIN] <Y> CYCLES] THEN ACTION [ELSE IF PASSED
857       [[<X>] [TIMES WITHIN] <Y> CYCLES] THEN ACTION]
858
859       For constant object tests if the <TEST> should validate to true, then
860       the selected action is executed each cycle the condition remains true.
861       The value for comparison is constant. Recovery action is evaluated only
862       once (on failed->passed state change only). The 'ELSE IF PASSED' part
863       is optional - if omitted, monit will do alert action on recovery by
864       default. The alert is delivered only once on each state change unless
865       overridden by 'reminder' alert option.
866
867       Variable object tests begin with an 'IF CHANGED' statement and serve
868       for monitoring an object whose property can change legitimately; monit
869       watches whether the value will change again. You can use such a test
870       just for an alert or to trigger an automatic action, for example to
871       reload a monitored process after its configuration file was changed.
872       Variable tests are supported for 'checksum', 'size', 'pid', 'ppid' and
873       'timestamp' tests only. If you think other tests would be useful in
874       variable form too, please let us know.
875
876       IF CHANGED <TEST> [[<X>] [TIMES WITHIN] <Y> CYCLES] THEN ACTION
877
878       For variable object tests if the <TEST> should validate to true, then
879       the selected action is executed once and monit will watch for another
880       change. The value for comparison is a variable where the last result
881       becomes the actual value, which is compared in future cycles. The alert
882       is delivered each time the condition becomes true.
883
884       You can restrict the event ratio needed to change the state:
885
886       ... [[<X>] [TIMES WITHIN] <Y> CYCLES] ...
887
888       This part is optional and is supported by all testing rules.  It
889       defines how many event occurrences during how many cycles are needed to
890       trigger the following action. You can use it in several ways - the core
891       syntax is:
892
893        [<X>] <Y> CYCLES
894
895       It is possible to use filler words, which make the rule easier to read
896       at first sight. You can use filler words such as FOR, TIMES and WITHIN,
897       thus for example:
898
899        if failed port 80 for 3 times within 5 cycles then alert
900
901       or
902
903        if failed port 80 for 10 cycles then unmonitor
904
905       When you don't specify <X>, it equals <Y> by default, thus the rule
906       applies when <Y> consecutive cycles of the inverse event occurred
907       (relative to the current service state).
908
909       When you omit the option altogether, monit will by default change state
910       on the first inverse event, which is equivalent to this notation:
911
912        1 times within 1 cycles
913
914       It is possible to use this option for failed, passed/recovered or
915       changed rules. More complex examples:
916
917        check device rootfs with path /dev/hda1
918         if space usage > 80% 5 times within 15 cycles
919            then alert
920            else if passed for 10 cycles then alert
921         if space usage > 90% for 5 cycles then
922            exec '/try/to/free/the/space'
923         if space usage > 99% then exec '/stop/processess'
924
925       Note that the maximum cycle count which can be used in a rule is
926       limited by the size of the 'long long' data type on your platform. This
927       currently provides 64 cycles on usual platforms. If you use an
928       unsupported value, the configuration parser will tell you the limits
929       during monit startup.
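
One way to picture an "<X> times within <Y> cycles" rule is a sliding history with one bit per cycle. The Python sketch below is an assumption for illustration: the text only states that the limit comes from the size of 'long long', and the bitmap layout, the class name EventRatio and its method are hypothetical.

```python
MAX_CYCLES = 64  # bits in a 64-bit 'long long'


class EventRatio:
    """Sketch of '[<X>] [TIMES WITHIN] <Y> CYCLES' using a bit history."""

    def __init__(self, count=1, cycles=1):
        if not 1 <= count <= cycles <= MAX_CYCLES:
            raise ValueError("need 1 <= X <= Y <= %d" % MAX_CYCLES)
        self.count = count
        self.cycles = cycles
        self.history = 0  # bit 0 is the most recent cycle

    def record(self, occurred):
        """Shift in this cycle's result; return True if the rule fires."""
        mask = (1 << self.cycles) - 1
        self.history = ((self.history << 1) | int(bool(occurred))) & mask
        return bin(self.history).count("1") >= self.count


# "for 3 times within 5 cycles"
rule = EventRatio(count=3, cycles=5)
events = [True, False, True, False, True, False, False, False]
print([rule.record(e) for e in events])
```

A rule written as "for 10 cycles" corresponds to EventRatio(10, 10) in this sketch, and omitting the option corresponds to the default EventRatio(1, 1).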
930
931       You must select an action to be executed from this list:
932
933       ·   ALERT sends the user an alert event on each state change (for con‐
934           stant object tests) or on each change (for variable object tests).
935
936       ·   RESTART restarts the service and sends an alert. Restart is con‐
937           ducted by first calling the service's registered stop method and
938           then the service's start method.
939
940       ·   START starts the service by calling the service's registered start
941           method and sends an alert.
942
943       ·   STOP stops the service by calling the service's registered stop
944           method and sends an alert. If monit stops a service it will not be
945           checked by monit anymore nor restarted again later. To reactivate
946           monitoring of the service again you must explicitly enable monitor‐
947           ing from the web interface or from the console, e.g. 'monit monitor
948           apache'.
949
950       ·   EXEC may be used to execute an arbitrary program and send an alert.
951           If you choose this action you must state the program to be executed
952           and if the program requires arguments you must enclose the program
953           and its arguments in a quoted string. You may optionally specify
954           the uid and gid the executed program should switch to upon start.
955           For instance:
956
957            exec "/usr/local/tomcat/bin/startup.sh"
958                 as uid nobody and gid nobody
959
960           This may be useful if the program to be started cannot change to a
961           lesser privileged user and group. This is typically needed for Java
962           Servers. Remember, if monit is run by the superuser, then all pro‐
963           grams executed by monit will be started with superuser privileges
964           unless the uid and gid extension was used.
965
966       ·   MONITOR will enable monitoring of the service and send an alert.
967
968       ·   UNMONITOR will disable monitoring of the service and send an alert.
969           The service will not be checked by monit anymore nor restarted
970           again later.  To reactivate monitoring of the service you must
971           explicitly enable monitoring from monit's web interface or from the
972           console using the monitor argument.
973
974       RESOURCE TESTING
975
976       Monit can examine how much system resources a service is using. This
977       test may only be used within a system or process service entry in the
978       monit control file.
979
980       Depending on the system or process characteristics, services can be
981       stopped or restarted and alerts can be generated. Thus it is possible
982       to utilize systems which are idle and to spare systems under high load.
983
984       The full syntax for the resource-statements used for resource testing
985       is as follows (keywords are in capital and optional statements in
986       [brackets]):
987
988       IF resource operator value [[<X>] <Y> CYCLES] THEN action [ELSE IF
989       PASSED [[<X>] <Y> CYCLES] THEN action]
990
991       resource is a choice of "CPU", "CPU([user|system|wait])", "MEMORY",
992       "CHILDREN", "TOTALMEMORY", "LOADAVG([1min|5min|15min])".  Some
993       resources can be used inside of system service container, some in
994       process service container and some in both:
995
996       System only resource tests:
997
998       CPU([user|system|wait]) is the percentage of time that the system
999       spends in user or system/kernel space. Some systems, such as Linux
1000       2.6, support a 'wait' indicator as well.
1001
1002       Process only resource tests:
1003
1004       CPU is the CPU usage of the process and its children, expressed as a
1005       percentage.
1006
1007       CHILDREN is the number of child processes of the process.
1008
1009       TOTALMEMORY is the memory usage of the process and its child processes
1010       in either percent or as an amount (Byte, kB, MB, GB).
1011
1012       System and process resource tests:
1013
1014       MEMORY is the memory usage of the system or, in the process context, of
1015       the process without its child processes, in either percent (of the
1016       system's total) or as an amount (Byte, kB, MB, GB).
1017
1018       LOADAVG([1min|5min|15min]) refers to the system's load average.  The
1019       load average is the number of processes in the system run queue, aver‐
1020       aged over the specified time period.
1021
1022       operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
1023       "eq", "ne" in shell sh notation and "greater", "less", "equal", "note‐
1024       qual" in human readable form (if not specified, default is EQUAL).
1025
1026       value is either an integer or a real number (except for CHILDREN). For
1027       CPU, MEMORY and TOTALMEMORY you need to specify a unit.  This could be
1028       "%" or if applicable "B" (Byte), "kB" (1024 Byte), "MB" (1024 KiloByte)
1029       or "GB" (1024 MegaByte).
1030
1031       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1032       "MONITOR" or "UNMONITOR".
1033
1034       To calculate the cycles, a counter is raised whenever the expression
1035       above is true and it is lowered whenever it is false (but not below 0).
1036       All counters are reset in case of a restart.
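
The raise-and-lower counter described above can be sketched in Python. The function name cycles_until_action is hypothetical, the test is fixed to a "greater than" comparison, and the assumption that the action fires once the counter reaches the cycle count is an interpretation of the text.

```python
def cycles_until_action(samples, threshold, cycles):
    """Return the 1-based poll cycle at which the action would fire, or None.

    The counter is raised when 'value > threshold' is true and lowered
    (never below 0) when it is false.
    """
    counter = 0
    for cycle, value in enumerate(samples, start=1):
        if value > threshold:
            counter += 1
        else:
            counter = max(counter - 1, 0)
        if counter >= cycles:
            return cycle
    return None


# CPU percentages sampled once per poll cycle; rule: cpu > 50% for 5 cycles.
cpu = [60, 70, 40, 80, 90, 95, 99, 97]
print(cycles_until_action(cpu, threshold=50, cycles=5))
```

In this sketch a single cycle below the limit does not clear the counter; it only lowers it by one.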
1037
1038       The following is an example to check that the CPU usage of a service is
1039       not going beyond 50% during five poll cycles. If it does, monit will
1040       restart the service:
1041
1042        if cpu is greater than 50% for 5 cycles then restart
1043
1044       See also the example section below.
1045
1046       FILE CHECKSUM TESTING
1047
1048       The checksum statement may only be used in a file service entry. If
1049       specified in the control file, monit will compute a md5 or sha1 check‐
1050       sum for a file.
1051
1052       The checksum test in constant form is used to verify that a file does
1053       not change. Syntax (keywords are in capital):
1054
1055       IF FAILED [MD5|SHA1] CHECKSUM [EXPECT checksum] [[<X>] <Y> CYCLES] THEN
1056       action [ELSE IF PASSED [[<X>] <Y> CYCLES] THEN action]
1057
1058       The checksum test in variable form is used to watch for file changes.
1059       Syntax (keywords are in capital):
1060
1061       IF CHANGED [MD5|SHA1] CHECKSUM [[<X>] <Y> CYCLES] THEN action
1062
1063       The choice of MD5 or SHA1 is optional. MD5 yields a 128-bit checksum
1064       and SHA1 a 160-bit checksum. If this option is omitted, monit tries to
1065       guess the method from the EXPECT string or uses MD5 as default.
1066
1067       expect is optional and if used it specifies a md5 or sha1 string monit
1068       should expect when testing a file's checksum. If expect is used, monit
1069       will not compute an initial checksum for the file, but instead use the
1070       string you submit. For example:
1071
1072        if failed checksum and
1073           expect the sum 8f7f419955cefa0b33a2ba316cba3659
1074        then alert
1075
1076       You can, for example, use the GNU utility md5sum(1) or sha1sum(1) to
1077       create a checksum string for a file and use this string in the
1078       expect-statement.
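
As an alternative to md5sum(1) or sha1sum(1), a checksum string can be produced with Python's hashlib module. The sketch below also shows how a method could be guessed from the length of the expected string (an MD5 hex digest has 32 characters, a SHA1 hex digest 40). The function names are hypothetical and the guessing rule is an assumption, not a description of monit's parser.

```python
import hashlib


def file_checksum(path, method="md5"):
    """Return the hex digest of a file, reading it in chunks."""
    digest = hashlib.new(method)
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def guess_method(expected):
    """Guess the hash method from the hex string length; None if unknown."""
    return {32: "md5", 40: "sha1"}.get(len(expected))


def checksum_failed(path, expected, method=None):
    """True if the file's checksum differs from the expected string."""
    method = method or guess_method(expected) or "md5"
    return file_checksum(path, method) != expected.lower()


print(guess_method("8f7f419955cefa0b33a2ba316cba3659"))
```

The string returned by file_checksum can be pasted after EXPECT in the control file.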
1079
1080       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1081       "MONITOR" or "UNMONITOR".
1082
1083       The checksum statement in variable form may be used to check a file for
1084       changes and if changed, do a specified action.  For instance to reload
1085       a server if its configuration file was changed. The following
1086       illustrates this for the apache web server:
1087
1088        check file httpd.conf path /usr/local/apache/conf/httpd.conf
1089            if changed sha1 checksum
1090               then exec "/usr/local/apache/bin/apachectl graceful"
1091
1092       If you plan to use the checksum statement for security reasons, (a very
1093       good idea, by the way) and to monitor a file or files which should not
1094       change, then please use constant form and also read the DEPENDENCY TREE
1095       section below to see a detailed example on how to do this properly.
1096
1097       Monit can also test the checksum for files on a remote host via the
1098       HTTP protocol. See the CONNECTION TESTING section below.
1099
1100       TIMESTAMP TESTING
1101
1102       The timestamp statement may only be used in a file, fifo or directory
1103       service entry.
1104
1105       The timestamp test in constant form is used to verify various timestamp
1106       conditions. Syntax (keywords are in capital):
1107
1108       IF TIMESTAMP [[operator] value [unit]] [[<X>] <Y> CYCLES] THEN action
1109       [ELSE IF PASSED [[<X>] <Y> CYCLES] THEN action]
1110
1111       The timestamp statement in variable form is simply to test an existing
1112       file or directory for timestamp changes and if changed, execute an
1113       action. Syntax (keywords are in capital):
1114
1115       IF CHANGED TIMESTAMP [[<X>] <Y> CYCLES] THEN action
1116
1117       operator is a choice of "<", ">", "!=", "==" in C notation, "GT", "LT",
1118       "EQ", "NE" in shell sh notation and "GREATER", "LESS", "EQUAL", "NOTE‐
1119       QUAL" in human readable form (if not specified, default is EQUAL).
1120
1121       value is a time watermark.
1122
1123       unit is either "SECOND", "MINUTE", "HOUR" or "DAY" (it is also possible
1124       to use "SECONDS", "MINUTES", "HOURS", or "DAYS").
1125
1126       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1127       "MONITOR" or "UNMONITOR".
1128
1129       The variable timestamp statement is useful for checking a file for
1130       changes and then executing an action. This version was written
1131       particularly with configuration files in mind. For instance, if you
1132       monitor the apache web server you can use this statement to reload
1133       apache if httpd.conf (apache's configuration file) was changed. Like so:
1134
1135        check file httpd.conf with path /usr/local/apache/conf/httpd.conf
1136          if changed timestamp
1137             then exec "/usr/local/apache/bin/apachectl graceful"
1138
1139       The constant timestamp version is useful for monitoring systems able to
1140       report their state by changing the timestamp of certain state files.
1141       For instance, the stored process of the iPlanet Messaging server
1142       updates the timestamp of:
1143
1144        o stored.ckp
1145        o stored.lcu
1146        o stored.per
1147
1148       If a task should fail, the system keeps the timestamp. To report stored
1149       problems you can use the following statements:
1150
1151        check file stored.ckp with path /msg-foo/config/stored.ckp
1152          if timestamp > 1 minute then alert
1153
1154        check file stored.lcu with path /msg-foo/config/stored.lcu
1155          if timestamp > 5 minutes then alert
1156
1157        check file stored.per with path /msg-foo/config/stored.per
1158          if timestamp > 1 hour then alert
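
The constant timestamp test compares the age of a file with a watermark. A minimal Python sketch follows, assuming the modification time is the timestamp of interest (the text does not say which timestamp monit reads). The function name timestamp_test and the now parameter are hypothetical, and only the C-notation operators are mapped.

```python
import operator
import os
import time

UNITS = {"second": 1, "minute": 60, "hour": 3600, "day": 86400}
OPERATORS = {
    "<": operator.lt,
    ">": operator.gt,
    "==": operator.eq,
    "!=": operator.ne,
}


def timestamp_test(path, op, value, unit="second", now=None):
    """True if (now - mtime) <op> value*unit holds, i.e. the test matches."""
    now = time.time() if now is None else now
    age = now - os.stat(path).st_mtime
    seconds = value * UNITS[unit.lower().rstrip("s")]  # accepts "minutes" too
    return OPERATORS[op](age, seconds)
```

For the stored.ckp rule above, timestamp_test("/msg-foo/config/stored.ckp", ">", 1, "minute") would return True once the file has not been touched for more than a minute.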
1159
1160       As mentioned above, you can also use the timestamp statement for moni‐
1161       toring directories for changes. If files are added or removed from a
1162       directory, its timestamp is changed:
1163
1164        check directory mydir path /foo/directory
1165         if timestamp > 1 hour then alert
1166
1167       or
1168
1169        check directory myotherdir path /foo/secure/directory
1170         if timestamp < 1 hour then alert
1171
1172       The following example is a hack for restarting a process after a
1173       certain time. Sometimes this is a necessary workaround for some
1174       third-party applications, until the vendor fixes a problem:
1175
1176        check file server.pid path /var/run/server.pid
1177              if timestamp > 7 days
1178                 then exec "/usr/local/server/restart-server"
1179
1180       FILE SIZE TESTING
1181
1182       The size statement may only be used in a file service entry.  If speci‐
1183       fied in the control file, monit will compute a size for a file.
1184
1185       The size test in constant form is used to verify various size condi‐
1186       tions. Syntax (keywords are in capital):
1187
1188       IF SIZE [[operator] value [unit]] [[<X>] <Y> CYCLES] THEN action [ELSE
1189       IF PASSED [[<X>] <Y> CYCLES] THEN action]
1190
1191       The size statement in variable form is simply to test an existing file
1192       for size changes and if changed, execute an action. Syntax (keywords
1193       are in capital):
1194
1195       IF CHANGED SIZE [[<X>] <Y> CYCLES] THEN action
1196
1197       operator is a choice of "<", ">", "!=", "==" in C notation, "GT", "LT",
1198       "EQ", "NE" in shell sh notation and "GREATER", "LESS", "EQUAL", "NOTE‐
1199       QUAL" in human readable form (if not specified, default is EQUAL).
1200
1201       value is a size watermark.
1202
1203       unit is a choice of "B","KB","MB","GB" or long alternatives "byte",
1204       "kilobyte", "megabyte", "gigabyte". If it is not specified, "byte" unit
1205       is assumed by default.
1206
1207       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1208       "MONITOR" or "UNMONITOR".
1209
1210       The variable size test form is useful for checking a file for changes
1211       and sending an alert or executing an action. Monit will register the
1212       size of the file at startup and monitor the file for changes. As soon
1213       as the value changes, monit will perform the specified action, reset
1214       the registered value to the new result and continue to monitor whether
1215       the size changes again.
1216
1217       One example of use for this statement is to conduct security checks,
1218       for instance:
1219
1220        check file su with path /bin/su
1221              if changed size then exec "/sbin/ifconfig eth0 down"
1222
1223       which will "cut the cable" and stop a possible intruder from compromis‐
1224       ing the system further. This test is just one of many you may use to
1225       increase the security awareness on a system. If you plan to use monit
1226       for security reasons we recommend that you use this test in combination
1227       with other supported tests like checksum, timestamp, and so on.
1228
1229       The constant size test form may be useful in similar or different con‐
1230       texts. It can, for instance, be used to test if a certain file size was
1231       exceeded and then alert you or monit may execute a certain action spec‐
1232       ified by you. An example is to use this statement to rotate log files
1233       after they have reached a certain size or to check that a database file
1234       does not grow beyond a specified threshold.
1235
1236       To rotate a log file:
1237
1238        check file myapp.log with path /var/log/myapp.log
1239           if size > 50 MB then
1240              exec "/usr/local/bin/rotate /var/log/myapp.log myapp"
1241
1242       where /usr/local/bin/rotate may be a simple script, such as:
1243
1244        #!/bin/bash
1245        /bin/mv "$1" "$1.$(date +%y-%m-%d)"
1246        /usr/bin/pkill -HUP "$2"
1247
1248       Or you may use this statement to trigger the logrotate(8) program, to
1249       do an "emergency" rotate. Or to send an alert if a file becomes a known
1250       bottleneck when it grows beyond a certain size because of limits in a
1251       database engine:
1252
1253        check file mydb with path /data/mydatabase.db
1254              if size > 1 GB then alert
1255
1256       This is a more restrictive form of the first example where the size is
1257       explicitly defined (note that the real su size is system dependent):
1258
1259        check file su with path /bin/su
1260              if size != 95564 then exec "/sbin/ifconfig eth0 down"
1261
1262       FILE CONTENT TESTING
1263
1264       The match statement allows you to test the content of a text file by
1265       using regular expressions. This is a great feature if you need to
1266       periodically test files, such as log files, for certain patterns. If a
1267       pattern matches, monit raises an alert by default; other actions are
1268       also possible.
1269
1270       The syntax (keywords in capital) for using this function is:
1271
1272       IF [NOT] MATCH {regex|path} [[<X>] <Y> CYCLES] THEN action
1273
1274       regex is a string containing the extended regular expression.  See also
1275       regex(7).
1276
1277       path is an absolute path to a file containing extended regular expres‐
1278       sion on every line. See also regex(7).
1279
1280       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1281       "MONITOR" or "UNMONITOR".
1282
1283       You can use the NOT statement to invert a match.
1284
1285       The content is only checked once every cycle. If content is added and
1286       removed between two checks, the change goes unnoticed.
1287
1288       On startup the read position is set to the end of the file and monit
1289       continues to scan to the end of file on each cycle. But if the file
1290       size should decrease or the inode change, the read position is set to
1291       the start of the file.
1292
1293       Only lines ending with a newline character are inspected. Thus, lines
1294       are being ignored until they have been completed with this character.
1295       Also note that only the first 511 characters of a line are inspected.
1296
1297       IGNORE [NOT] MATCH {regex|path}
1298
1299       Lines matching an IGNORE are not inspected during later evaluations.
1300       IGNORE MATCH always has precedence over IF MATCH.
1301
1302       All IGNORE MATCH statements are evaluated first, in the order of their
1303       appearance. Thereafter, all the IF MATCH statements are evaluated.
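
The scanning rules above (read position, complete lines only, the 511 character limit and IGNORE precedence) can be sketched in Python. This is an illustration under assumptions: the class name ContentWatcher is hypothetical, and Python's re module is used in place of the POSIX extended regular expressions that monit accepts, so pattern syntax may differ.

```python
import os
import re

MAX_LINE = 511  # only the first 511 characters of a line are inspected


class ContentWatcher:
    def __init__(self, path, match, ignore=()):
        self.path = path
        self.match = [re.compile(p) for p in match]
        self.ignore = [re.compile(p) for p in ignore]
        info = os.stat(path)
        # On startup the read position is the end of the file.
        self.inode, self.pos = info.st_ino, info.st_size

    def check(self):
        """Scan newly completed lines; return those that match."""
        info = os.stat(self.path)
        if info.st_ino != self.inode or info.st_size < self.pos:
            # File was replaced or truncated: start from the beginning.
            self.inode, self.pos = info.st_ino, 0
        hits = []
        with open(self.path, "rb") as handle:
            handle.seek(self.pos)
            while True:
                raw = handle.readline()
                if not raw.endswith(b"\n"):
                    break  # end of file or incomplete line: retry next cycle
                self.pos = handle.tell()
                line = raw[:-1].decode("utf-8", "replace")[:MAX_LINE]
                if any(p.search(line) for p in self.ignore):
                    continue  # IGNORE MATCH has precedence
                if any(p.search(line) for p in self.match):
                    hits.append(line)
        return hits
```

Calling check() once per poll cycle returns the lines for which an action would be taken in this sketch.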
1304
1305       A real life example might look like this:
1306
1307         check file syslog with path /var/log/syslog
1308           ignore match
1309               "^\w{3} [ :0-9]{11} [._[:alnum:]-]+ monit\[[0-9]+\]:"
1310           ignore match /etc/monit/ignore.regex
1311           if match
1312               "^\w{3} [ :0-9]{11} [._[:alnum:]-]+ mrcoffee\[[0-9]+\]:"
1313           if match /etc/monit/active.regex then alert
1314
1315       FILESYSTEM FLAGS TESTING
1316
1317       monit tests the filesystem flags of devices for change. This test is
1318       implicit and monit will send alert in the case of failure by default.
1319
1320       You may override the default action using the rule below (it may only
1321       be used within a device service entry in the monit control file).
1322
1323       This test is useful for detecting changes of the filesystem flags such
1324       as when the filesystem became read-only based on disk errors or the
1325       mount flags were changed (such as nosuid). Each platform provides a
1326       different set of flags. POSIX defines the RDONLY and NOSUID flags,
1327       which should work on all platforms. Some platforms (such as FreeBSD)
1328       provide additional flags.
1329
1330       The syntax for the fsflags statement is:
1331
1332       IF CHANGED FSFLAGS [[<X>] <Y> CYCLES] THEN action
1333
1334       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1335       "MONITOR" or "UNMONITOR".
1336
1337       Example:
1338
1339        check device rootfs with path /
1340              if changed fsflags then exec "/my/script"
1341              alert root@localhost
1342
1343       SPACE TESTING
1344
1345       Monit can test devices/file systems and check for space usage. This
1346       test may only be used within a device service entry in the monit con‐
1347       trol file.
1348
1349       Monit will check a device's total space usage. If you only want to
1350       check available space for non-superuser, you must set the watermark
1351       appropriately (i.e. total space minus reserved blocks for the supe‐
1352       ruser).
1353
1354       You can obtain (and set) the superuser's reserved blocks size, for
1355       example by using the tune2fs utility on Linux. On Linux 5% of available
1356       blocks are reserved for the superuser by default. To list the reserved
1357       blocks for the superuser:
1358
1359        [root@berry monit]# tune2fs -l /dev/hda1| grep "Reserved block"
1360        Reserved block count:     319994
1361        Reserved blocks uid:      0 (user root)
1362        Reserved blocks gid:      0 (group root)
1363
1364       On Solaris 10% of the blocks are reserved. You can also use tunefs on
1365       Solaris to change values on a live filesystem.
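
Because the test looks at total space usage, a limit expressed in terms of the space available to non-superusers has to be scaled down by the reserved share. The arithmetic is sketched in Python below; the function name total_usage_watermark is hypothetical, and the sketch assumes that reserved blocks are not themselves counted as used.

```python
def total_usage_watermark(user_percent, reserved_percent=5.0):
    """Convert a usage limit on non-superuser space to one on total space."""
    return user_percent * (100.0 - reserved_percent) / 100.0


print(total_usage_watermark(80))      # 5% reserved  -> 76.0
print(total_usage_watermark(80, 10))  # 10% reserved -> 72.0
```

For example, to be alerted when non-superusers have filled 80% of what they may use on a filesystem with 5% reserved blocks, the watermark in the space statement would be 76%.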
1366
1367       The full syntax for the space statement is:
1368
1369       IF SPACE operator value unit [[<X>] <Y> CYCLES] THEN action [ELSE IF
1370       PASSED [[<X>] <Y> CYCLES] THEN action]
1371
1372       operator is a choice of "<",">","!=","==" in c notation, "gt", "lt",
1373       "eq", "ne" in shell sh notation and "greater", "less", "equal", "note‐
1374       qual" in human readable form (if not specified, default is EQUAL).
1375
1376       unit is a choice of "B","KB","MB","GB", "%" or long alternatives
1377       "byte", "kilobyte", "megabyte", "gigabyte", "percent".
1378
1379       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1380       "MONITOR" or "UNMONITOR".
1381
1382       INODE TESTING
1383
1384       If supported by the file-system, you can use monit to test for inode
1385       usage. This test may only be used within a device service entry in the
1386       monit control file.
1387
1388       If the device becomes unavailable, monit will call the entry's
1389       registered start method, if it is defined and if monit is running in
1390       active mode. If monit runs in passive mode or the start method is not
1391       defined, monit will just send an error alert.
1392
1393       The syntax for the inode statement is:
1394
1395       IF INODE(S) operator value [unit] [[<X>] <Y> CYCLES] THEN action [ELSE
1396       IF PASSED [[<X>] <Y> CYCLES] THEN action]
1397
           operator is a choice of "<", ">", "!=", "==" in C notation, "gt", "lt",
           "eq", "ne" in shell (sh) notation and "greater", "less", "equal",
           "notequal" in human-readable form (if not specified, the default is
           EQUAL).
1401
1402       unit is optional. If not specified, the value is an absolute count of
1403       inodes. You can use the "%" character or the longer alternative "per‐
1404       cent" as a unit.
1405
1406       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1407       "MONITOR" or "UNMONITOR".
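
           A minimal sketch of the inode statement, following the syntax above.
           The entry name, the device path and both thresholds are illustrative
           assumptions. The first test uses the percent unit, the second an
           absolute inode count:

```
check device rootfs with path /dev/hda1
      if inodes > 90% then alert
      if inodes > 100000 then alert
```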
1408
1409       PERMISSION TESTING
1410
1411       Monit can monitor the permissions. This test may only be used within a
1412       file, fifo, directory or device service entry in the monit control
1413       file.
1414
1415       The syntax for the permission statement is:
1416
1417       IF FAILED PERM(ISSION) octalnumber [[<X>] <Y> CYCLES] THEN action [ELSE
1418       IF PASSED [[<X>] <Y> CYCLES] THEN action]
1419
           octalnumber defines permissions for a file, a directory or a device as
           four octal digits (0-7). Valid range: 0000 - 7777. You can omit the
           leading zeros; monit will pad the number with zeros on the left, so
           for example "640" is a valid value and matches "0640".
1424
1425       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1426       "MONITOR" or "UNMONITOR".
1427
1428       The web interface will show a permission warning if the test failed.
1429
           We recommend that you use the UNMONITOR action in a permission
           statement. The rationale for this feature is security: monit should
           not start a possibly cracked program or script. Example:
1433
1434        check file monit.bin with path "/usr/local/bin/monit"
1435              if failed permission 0555 then unmonitor
1436              alert foo@bar
1437
1438       If the test fails, monit will simply send an alert and stop monitoring
1439       the file and propagate an unmonitor action upward in a depend tree.
1440
1441       UID TESTING
1442
1443       monit can monitor the owner user id (uid). This test may only be used
1444       within a file, fifo, directory or device service entry in the monit
1445       control file.
1446
1447       The syntax for the uid statement is:
1448
1449       IF FAILED UID user [[<X>] <Y> CYCLES] THEN action [ELSE IF PASSED
1450       [[<X>] <Y> CYCLES] THEN action]
1451
1452       user defines a user id either in numeric or in string form.
1453
1454       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1455       "MONITOR" or "UNMONITOR".
1456
1457       The web interface will show a uid warning if the test should fail.
1458
           We recommend that you use the UNMONITOR action in a uid statement. The
           rationale for this feature is security: monit should not start a
           possibly cracked program or script. Example:
1462
1463        check file passwd with path /etc/passwd
1464              if failed uid root then unmonitor
1465              alert root@localhost
1466
1467       If the test fails, monit will simply send an alert and stop monitoring
1468       the file and propagate an unmonitor action upward in a depend tree.
1469
1470       GID TESTING
1471
1472       monit can monitor the owner group id (gid). This test may only be used
1473       within a file, fifo, directory or device service entry in the monit
1474       control file.
1475
1476       The syntax for the gid statement is:
1477
1478       IF FAILED GID user [[<X>] <Y> CYCLES] THEN action [ELSE IF PASSED
1479       [[<X>] <Y> CYCLES] THEN action]
1480
1481       user defines a group id either in numeric or in string form.
1482
1483       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1484       "MONITOR" or "UNMONITOR".
1485
1486       The web interface will show a gid warning if the test should fail.
1487
           We recommend that you use the UNMONITOR action in a gid statement. The
           rationale for this feature is security: monit should not start a
           possibly cracked program or script. Example:
1491
1492        check file shadow with path /etc/shadow
1493              if failed gid root then unmonitor
1494              alert root@localhost
1495
1496       If the test fails, monit will simply send an alert and stop monitoring
1497       the file and propagate an unmonitor action upward in a depend tree.
1498
1499       PID TESTING
1500
           monit tests the process id (pid) of processes for change. This test is
           implicit and by default monit will send an alert if the pid changes.

           You may override the default action using the rule below (it may only
           be used within a process service entry in the monit control file).
1506
1507       The syntax for the pid statement is:
1508
1509       IF CHANGED PID [[<X>] <Y> CYCLES] THEN action
1510
1511       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1512       "MONITOR" or "UNMONITOR".
1513
           This test is useful for detecting process restarts which have occurred
           in the timeframe between two monit testing cycles. If the restart was
           fast and the process provides the expected service (i.e. all tests
           passed), you will be notified that the process was replaced.

           For example, the sshd daemon can restart very quickly. If someone
           changes its configuration and restarts sshd outside of monit's
           control, you will be notified that the process was replaced by a new
           instance (or you can optionally perform some other action, such as
           preventively stopping sshd).

           Another example is MySQL Cluster, which has its own watchdog with
           process restart ability. You can use monit for redundant monitoring.
           Monit will just send an alert if MySQL Cluster restarted the node
           quickly.
1529
1530       Example:
1531
1532        check process sshd with pidfile /var/run/sshd.pid
1533              if changed pid then exec "/my/script"
1534              alert root@localhost
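
           If you would rather take the preventive action mentioned above, a
           sketch of the same entry with the STOP action might look like this
           (the alert address is an assumption):

```
check process sshd with pidfile /var/run/sshd.pid
      if changed pid then stop
      alert root@localhost
```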
1535
1536       PPID TESTING
1537
           monit tests the process parent id (ppid) of processes for change. This
           test is implicit and by default monit will send an alert if the ppid
           changes.

           You may override the default action using the rule below (it may only
           be used within a process service entry in the monit control file).
1544
1545       The syntax for the ppid statement is:
1546
1547       IF CHANGED PPID [[<X>] <Y> CYCLES] THEN action
1548
1549       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1550       "MONITOR" or "UNMONITOR".
1551
1552       This test is useful for detecting changes of a process parent.
1553
1554       Example:
1555
1556        check process myproc with pidfile /var/run/myproc.pid
1557              if changed ppid then exec "/my/script"
1558              alert root@localhost
1559
1560       CONNECTION TESTING
1561
1562       Monit is able to perform connection testing via networked ports or via
1563       Unix sockets. A connection test may only be used within a process or
1564       within a host service entry in the monit control file.
1565
1566       If a service listens on one or more sockets, monit can connect to the
1567       port (using either tcp or udp) and verify that the service will accept
1568       a connection and that it is possible to write and read from the socket.
1569       If a connection is not accepted or if there is a problem with socket
1570       read/write, monit will assume that something is wrong and execute a
1571       specified action. If monit is compiled with openssl, then ssl based
1572       network services can also be tested.
1573
           The full syntax for the statement used for connection testing is as
           follows (keywords are in capitals and optional statements in
           [brackets]):
1577
1578       IF FAILED [host] port [type] [protocol|{send/expect}+] [timeout] [[<X>]
1579       <Y> CYCLES] THEN action [ELSE IF PASSED [[<X>] <Y> CYCLES] THEN action]
1580
1581       or for Unix sockets,
1582
1583       IF FAILED [unixsocket] [type] [protocol|{send/expect}+] [timeout]
1584       [[<X>] <Y> CYCLES] THEN action [ELSE IF PASSED [[<X>] <Y> CYCLES] THEN
1585       action]
1586
           host:HOST hostname. Optionally specify the host to connect to. If the
           host is not given, localhost is assumed if this test is used inside a
           process entry. If this test is used inside a remote host entry, the
           entry's remote host is assumed. Although host is intended for testing
           name-based virtual hosts in an HTTP server running on a local or
           remote host, it does allow the connection statement to be used to test
           a server running on another machine. This may be useful; for instance,
           if you use Apache httpd as a front-end and an application server as
           the back-end running on another machine, this statement may be used to
           test that the back-end server is running and, if not, raise an alert.
1597
           port:PORT number. The port number to connect to.

           unixsocket:UNIXSOCKET PATH. Specifies the path to a Unix socket.
           Servers based on Unix sockets always run on the local machine and do
           not use a port.
1603
           type:TYPE {TCP|UDP|TCPSSL}. Optionally specify the socket type monit
           should use when trying to connect to the port. The different socket
           types are TCP, UDP and TCPSSL, where TCP is a regular stream based
           socket, UDP is a datagram socket and TCPSSL specifies that monit
           should use a TCP socket with SSL when connecting to a port. The
           default socket type is TCP. If TCPSSL is used you may optionally
           specify the SSL/TLS protocol to be used and the md5 sum of the
           server's certificate. The TCPSSL options are:
1612
1613        TCPSSL [SSLAUTO|SSLV2|SSLV3|TLSV1] [CERTMD5 md5sum]
1614
           proto(col):PROTO {protocols}. Optionally specify the protocol monit
           should speak when a connection is established. At the moment monit
           knows how to speak:

            APACHE-STATUS
            DNS
            DWP
            FTP
            HTTP
            IMAP
            CLAMAV
            LDAP2
            LDAP3
            MYSQL
            NNTP
            NTP3
            POP
            POSTFIX-POLICY
            RDATE
            RSYNC
            SMTP
            SSH
            TNS
            PGSQL

           If you have compiled monit with ssl support, monit can also speak the
           SSL variants such as:

            HTTPS
            FTPS
            POPS
            IMAPS

           To use the SSL protocol support you need to define the socket as SSL
           and use the general protocol name (for example in the case of HTTPS):

            TYPE TCPSSL PROTOCOL HTTP

           If the server's protocol is not found in this list, simply do not
           specify the protocol and monit will utilize a default test, including
           testing if it is possible to read and write to the port. This default
           test is in most cases more than good enough to deduce if the server
           behind the port is up or not.
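
           For instance, a sketch of an IMAPS test combines the TCPSSL socket
           type with the general IMAP protocol name. Port 993 is the
           conventional IMAPS port and is an assumption about the server being
           tested:

```
if failed port 993 type tcpssl protocol imap then alert
```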
1650
1651       The protocol statement is:
1652
            [PROTO(COL) {name} [REQUEST {"/path"} [with CHECKSUM checksum]]]
1654
           As you can see, you may specify a request after the protocol. At the
           moment only the HTTP protocol supports the request option. See also
           below for an example.
1658
1659       In addition to the standard protocols, the APACHE-STATUS protocol is a
1660       test of a specific server type, rather than a generic protocol. Server
1661       performance is examined using the status page generated by Apache's
1662       mod_status, which is expected to be at its default address of
1663       http://www.example.com/server-status.  Currently the APACHE-STATUS pro‐
1664       tocol examines the percentage of Apache child processes which are
1665
1666        o logging (loglimit)
1667        o closing connections (closelimit)
1668        o performing DNS lookups (dnslimit)
1669        o in keepalive with a client (keepalivelimit)
1670        o replying to a client (replylimit)
1671        o receiving a request (requestlimit)
1672        o initialising (startlimit)
1673        o waiting for incoming connections (waitlimit)
1674        o gracefully closing down (gracefullimit)
1675        o performing cleanup procedures (cleanuplimit)
1676
1677       Each of these quantities can be compared against a value relative to
1678       the total number of active Apache child processes. If the comparison
1679       expression is true the chosen action is performed.
1680
1681       The apache-status protocol statement is formally defined as (keywords
1682       in uppercase):
1683
1684        PROTO(COL) {limit} OP PERCENT [OR {limit} OP PERCENT]*
1685
           where {limit} is one or more of: loglimit, closelimit, dnslimit,
           keepalivelimit, replylimit, requestlimit, startlimit, waitlimit,
           gracefullimit or cleanuplimit. The operator OP is one of: [<|=|>].

           You can combine all of these tests into one expression or you can
           choose to test a certain limit. If you combine the limits you must
           join them with the OR keyword.
1693
           Here's an example where we test for a loglimit of more than 10
           percent, a dnslimit over 50 percent and a waitlimit of less than 20
           percent of processes. See also more examples below in the example
           section.
1697
1698        protocol apache-status
1699                       loglimit > 10% or
1700                       dnslimit > 50% or
1701                       waitlimit < 20%
1702        then alert
1703
1704       Obviously, do not use this test unless the httpd server you are testing
1705       is Apache Httpd and mod_status is activated on the server.
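
           To show where the protocol fragment above fits, here is a sketch of a
           complete process entry. The pidfile path and host name are borrowed
           from other examples in this manual and are assumptions about your
           setup:

```
check process apache with pidfile /var/run/apache.pid
      if failed host www.tildeslash.com port 80
         protocol apache-status
            loglimit > 10% or
            dnslimit > 50% or
            waitlimit < 20%
      then alert
```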
1706
1707       send/expect: {SEND|EXPECT} "string" .... If monit does not support the
1708       protocol spoken by the server, you can write your own protocol-test
1709       using send and expect strings. The SEND statement sends a string to the
1710       server port and the EXPECT statement compares a string read from the
1711       server with the string given in the expect statement. If your system
1712       supports POSIX regular expressions, you can use regular expressions in
1713       the expect string, see regex(7) to learn more about the types of regu‐
1714       lar expressions you can use in an expect string. Otherwise the string
1715       is used as it is. The send/expect statement is:
1716
1717        [{SEND|EXPECT} "string"]+
1718
           Note that monit will send a string as it is, and you must remember to
           include CR and LF in the string sent to the server if the protocol
           expects such characters to terminate a string (most text based
           protocols used over the Internet do). Likewise monit will read up to
           256 bytes from the server and use this string when comparing the
           expect string. If the server sends strings terminated by CRLF (i.e.
           "\r\n"), remember to add the same terminating characters to the
           string you expect from the server.

           You can use non-printable characters in a send string if needed. Use
           the hex notation, \0xHEXHEX, to send any char in the range
           \0x00-\0xFF, that is, 0-255 in decimal. This may be useful when
           testing some network protocols, particularly those over UDP. For
           example, to test a Quake 3 server you can use the following:
1733
1734             send "\0xFF\0xFF\0xFF\0xFFgetstatus"
1735             expect "sv_floodProtect|sv_maxPing"
1736
1737       Finally, send/expect can be used with any socket type, such as TCP
1738       sockets, UNIX sockets and UDP sockets.
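
           Putting the pieces together, a sketch of a complete UDP send/expect
           test for such a server might look like the following. The entry name
           and address are hypothetical, and port 27960 is assumed to be the
           port the Quake 3 server listens on:

```
check host quake with address quake.example.com
      if failed port 27960 type udp
         send "\0xFF\0xFF\0xFF\0xFFgetstatus"
         expect "sv_floodProtect|sv_maxPing"
      then alert
```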
1739
1740       timeout:with TIMEOUT x SECONDS. Optionally specifies the connect and
1741       read timeout for the connection. If monit cannot connect to the server
1742       within this time it will assume that the connection failed and execute
1743       the specified action. The default connect timeout is 5 seconds.
1744
1745       action is a choice of "ALERT", "RESTART", "START", "STOP", "EXEC",
1746       "MONITOR" or "UNMONITOR".
1747
1748       Connection testing using the URL notation
1749
           You can test an HTTP server using the compact URL syntax. This test
           also allows you to use POSIX regular expressions to test the content
           returned by the HTTP server.
1753
1754       The full syntax for the URL statement is as follows (keywords are in
1755       capital and optional statements in [brackets]):
1756
             IF FAILED URL URL-spec
                [CONTENT {==|!=} "regular-expression"]
                [TIMEOUT number SECONDS] [[<X>] <Y> CYCLES]
                THEN action
                [ELSE IF PASSED [[<X>] <Y> CYCLES] THEN action]
1762
           Where URL-spec is a URL in the standard form as specified in RFC 2396:

            <protocol>://<authority><path>?<query>

           Here is an example of a URL where all components are used:

            http://user:password@www.foo.bar:8080/document/?querystring#ref

           If a username and password are included in the URL, monit will attempt
           to log in to the server using Basic Authentication.

           Testing the content returned by the server is optional. If used, you
           can test if the content matches or does not match a regular
           expression. Here's an example of how the URL statement can be used in
           a check service:
1778
1779        check host FOO with address www.foo.bar
1780             if failed url
1781                http://user:password@www.foo.bar:8080/?querystring
1782                and content == 'action="j_security_check"'
1783             then ...
1784
           Monit will look at the content-length header returned by the server
           and download this amount before testing the content. However, if the
           content-length is more than 1 Mb or this header is not set by the
           server, monit will download up to 1 Mb and not more.

           Only the http(s) protocol is supported in a URL statement. If the
           protocol is https, monit will use SSL when connecting to the server.
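
           As a sketch that uses the optional parts of the syntax together, the
           following test fetches a page over https, checks the content and sets
           a timeout. The path /login, the expression "Welcome" and the 10
           second timeout are assumptions for illustration:

```
check host FOO with address www.foo.bar
      if failed url https://www.foo.bar/login
         and content == "Welcome"
         timeout 10 seconds
      then alert
```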
1792
1793       Remote host ping test
1794
           In addition monit can perform ICMP Echo tests in remote host checks.
           The icmp test may only be used in a check host entry and monit must
           run with super user privileges, that is, the root user must run monit.
           The reason is that the icmp test utilizes a raw socket to send the
           icmp packet and only the super user is allowed to create a raw socket.
1800
1801       The full syntax for the ICMP Echo statement used for ping testing is as
1802       follows (keywords are in capital and optional statements in [brack‐
1803       ets]):
1804
1805         IF FAILED ICMP TYPE ECHO
1806            [COUNT number] [WITH] [TIMEOUT number SECONDS]
1807              [[<X>] <Y> CYCLES]
1808            THEN action
1809            [ELSE IF PASSED [[<X>] <Y> CYCLES] THEN action]
1810
           The rules for action and timeout are the same as those mentioned above
           in the CONNECTION TESTING section. The count parameter specifies how
           many consecutive echo requests will be sent to the host in one cycle.
           If no reply arrives within the timeout, monit reports an error. If at
           least one reply is received, the test passes. By default monit sends
           three echo requests in one cycle to prevent random packet loss from
           generating a false alarm (i.e. up to 66% packet loss is tolerated).
           You can set the count option to a value between 1 and 20, which can
           serve as an error ratio. For example if you require 100% ping
           success, set the count to 1 (i.e. just one request will be sent, and
           if the packet is lost an error will be reported).
1822
1823       An icmp ping test is useful for testing if a host is up, before testing
1824       ports at the host. If an icmp ping test is used in a check host entry,
1825       this test is run first and if the ping test should fail we assume that
1826       the connection to the host is down and monit does not continue to test
1827       any ports. Here's an example:
1828
1829        check host xyzzy with address xyzzy.org
1830              if failed icmp type echo count 5 with timeout 15 seconds
1831                 then alert
1832              if failed port 80 proto http then alert
1833              if failed port 443 type TCPSSL proto http then alert
1834              alert foo@bar
1835
1836       In this case, if the icmp test should fail you will get one alert and
1837       only one alert as long as the host is down, and equally important,
1838       monit will not test port 80 and port 443. Likewise if the icmp ping
1839       test should succeed (again) monit will continue to test both port 80
1840       and 443.
1841
           Keep in mind though that some firewalls can block icmp packets and
           thus render the test useless.
1844
1845       Examples
1846
1847       To check a port connection and receive an alert if monit cannot connect
1848       to the port, use the following statement:
1849
1850         if failed port 80 then alert
1851
           In this case the machine in question is assumed to be the default
           host. For a process entry it's localhost and for a remote host entry
           it's the address of the remote host. Monit will connect to the host at
           port 80 and use tcp by default. If you want to connect with udp, you
           can specify this after the port-statement:
1857
1858        if failed port 53 type udp protocol dns then alert
1859
           Monit will stop trying to connect to the port after 5 seconds and
           assume that the server behind the port is down. You may increase or
           decrease the connect timeout by explicitly adding a connection
           timeout. In the following example the timeout is increased to 15
           seconds and if monit cannot connect to the server within 15 seconds
           the test will fail and an alert message is sent.
1866
1867         if failed port 80 with timeout 15 seconds then alert
1868
1869       If a server is listening to a Unix socket the following statement can
1870       be used:
1871
1872        if failed unixsocket /var/run/sophie then alert
1873
1874       A Unix socket is used by some servers for fast (interprocess) communi‐
1875       cation on localhost only. A Unix socket is specified by a path and in
1876       the example above the path, /var/run/sophie, specifies a Unix socket.
1877
1878       If your machine answers for several virtual hosts you can prefix the
1879       port statement with a host-statement like so:
1880
1881        if failed host www.sol.no port 80 then alert
1882        if failed host 80.69.226.133 port 443 then alert
1883        if failed host kvasir.sol.no port 80 then alert
1884
           And as mentioned above, if you do not specify a host-statement,
           localhost or the remote host entry's address is assumed.
1887
1888       Monit also knows how to speak some of the more popular Internet proto‐
1889       cols. So, besides testing for connections, monit can also speak with
1890       the server in question to verify that the server works. For example,
1891       the following is used to test a http server:
1892
1893        if failed host www.tildeslash.com port 80 proto http
1894           then restart
1895
1896       Some protocols also support a request statement. This statement can be
1897       used to ask the server for a special document entity.
1898
1899       Currently only the HTTP protocol module supports the request statement,
1900       such as:
1901
1902        if failed host www.myhost.com port 80 protocol http
1903           and request "/data/show.php?a=b&c=d"
1904        then restart
1905
1906       The request must contain an URI string specifying a document from the
1907       http server. The string will be URL encoded by monit before it sends
1908       the request to the http server, so it's okay to use URL unsafe charac‐
1909       ters in the request. If the request statement isn't specified, the
1910       default web server page will be requested.
1911
1912       You can also test the checksum for documents returned by a http server.
1913       You can use either MD5 sums:
1914
1915        if failed port 80 protocol http
1916           and request "/page.html"
1917               with checksum 8f7f419955cefa0b33a2ba316cba3659
1918        then alert
1919
1920       Or you can use SHA1 sums:
1921
1922        if failed port 80 protocol http
1923           and request "/page.html"
1924               with checksum e428302e260e0832007d82de853aa8edf19cd872
1925        then alert
1926
           monit will compute a checksum (either MD5 or SHA1 is used, depending
           on the length of the hash) for the document (in the above case,
           /page.html) and compare the computed checksum with the expected
           checksum. If the sums do not match then the if-test's action is
           performed, in this case alert. Note that monit will not test the
           checksum for a document if the server does not set the HTTP
           Content-Length header. An HTTP server should set this header when it
           serves a static document (i.e. a file). A server will often use
           chunked transfer encoding instead when serving dynamic content (e.g. a
           document created by a CGI-script or a Servlet), but testing the
           checksum for dynamic content is not very useful. There is no limit on
           the document size, but keep in mind that monit will use time to
           download the document over the network so it's probably smart not to
           ask monit to compute a checksum for documents larger than 1 Mb or so,
           depending on your network connection of course. Tip: if you get a
           checksum error even if the document has the correct sum, the reason
           may be that the download timed out. In this case, explicitly set a
           longer timeout than the default 5 seconds.
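
           A sketch of the tip above: the same SHA1 checksum test with an
           explicit timeout added after the protocol part of the statement. The
           30 second value is an assumption; choose one that suits the document
           size and your network:

```
if failed port 80 protocol http
   and request "/page.html"
       with checksum e428302e260e0832007d82de853aa8edf19cd872
   with timeout 30 seconds
then alert
```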
1944
1945       As mentioned above, if the server protocol is not supported by monit
1946       you can write your own protocol test using send/expect strings. Here we
1947       show a protocol test using send/expect for an imaginary "Ali Baba and
1948       the Forty Thieves" protocol:
1949
1950        if failed host cave.persia.ir port 4040
1951           send "Open, Sesame!\r\n"
1952           expect "Please enter the cave\r\n"
1953           send "Shut, Sesame!\r\n"
1954           expect "See you later [A-Za-z ]+\r\n"
1955        then restart
1956
           The TCPSSL statement can optionally test the md5 sum of the server's
           certificate. You must state the md5 certificate string you expect the
           server to deliver and, upon connecting to the server, the server's
           actual md5 certificate string is tested. Any symbol other than
           [A-Fa-f0-9] is ignored in that string, so it is possible to copy and
           paste the output of e.g. openssl. If the sums do not match, the
           connection test fails. If the ssl version handshake does not work
           properly you can also force a specific ssl version, as we demonstrate
           in this example:
1965
            if failed host shop.sol.no port 443
               type TCPSSL SSLV3 # Force monit to use ssl version 3
               # We expect the server to return this md5 certificate sum
               # as either 12-34-56-78-90-AB-CD-EF-12-34-56-78-90-AB-CD-EF
               # or e.g.   1234567890ABCDEF1234567890ABCDEF
               # or e.g.   1234567890abcdef1234567890abcdef
               # whichever is more convenient (see text above)
               CERTMD5 12-34-56-78-90-AB-CD-EF-12-34-56-78-90-AB-CD-EF
               protocol http
            then restart
1976
1977       Here's an example where a connection test is used inside a process
1978       entry:
1979
1980        check process apache with pidfile /var/run/apache.pid
1981              start program = "/etc/init.d/httpd start"
1982              stop program = "/etc/init.d/httpd stop"
1983              if failed host www.tildeslash.com port 80 then restart
1984
1985       Here, a connection test is used in a remote host entry:
1986
1987        check host up2date with address ftp.redhat.com
1988              if failed port 21 and protocol ftp then alert
1989
           Since we did not explicitly specify a host in the above test, monit
           will connect to port 21 at ftp.redhat.com. Incidentally, the host
           address can be specified as a dotted IP address string or as a
           hostname in the DNS. The following is exactly[*] the same test, but
           here an ip address is used instead:
1995
1996        check host up2date with address 66.187.232.30
1997              if failed port 21 and protocol ftp then alert
1998
1999       [*] Well, not quite, since we specify an ip-address directly we will
2000       bypass any DNS round-robin setup, but that's another story.
2001
2002       For more examples, see the example section below.
2003

MONIT HTTPD

2005       If specified in the control file, monit will start a monit daemon with
2006       http support. From a Browser you can then start and stop services, dis‐
2007       able or enable service monitoring as well as view the status of each
2008       service. Also, if monit logs to its own file, you can view the content
2009       of this logfile in a Browser.
2010
2011       The control file statement for starting a monit daemon with http sup‐
2012       port is a global set-statement:
2013
2014       set httpd port 2812
2015
2016       And you can use this URL, http://localhost:2812/, to access the daemon
2017       from a browser. The port number, in this case 2812, can be any number
2018       that you are allowed to bind to.
2019
2020       If you have compiled monit with openssl, you can also start the httpd
2021       server with ssl support, using the following expression:
2022
2023        set httpd port 2812
2024            ssl enable
2025            pemfile /etc/certs/monit.pem
2026
2027       And you can use this URL, https://localhost:2812/, to access the monit
2028       web server over an ssl encrypted connection.
2029
2030       The pemfile, in the example above, holds both the server's private key
2031       and certificate. This file should be stored in a safe place on the
2032       filesystem and should have strict permissions, that is, no more than
2033       0700.
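
           One way to keep an eye on this is sketched below, using the
           permission test described in PERMISSION TESTING. The entry name and
           the expected mode 0600 are assumptions; use the mode you actually set
           on the file, within the 0700 limit:

```
check file monit.pem with path /etc/certs/monit.pem
      if failed permission 0600 then alert
```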
2034
           In addition, if you want to check for client certificates you can use
           the CLIENTPEMFILE statement. In this case, a connecting client has to
           provide a certificate known by monit in order to connect. This file
           also needs to have all necessary CA certificates. A configuration
           could look like:
2040
2041        set httpd port 2812
2042            ssl enable
2043            pemfile /etc/certs/monit.pem
2044            clientpemfile /etc/certs/monit-client.pem
2045
2046       By default self signed client certificates are not allowed. If you want
2047       to use a self signed certificate from a client it has to be allowed
2048       explicitly with the ALLOWSELFCERTIFICATION statement.
2049
2050       For more information on how to use monit with SSL and for more informa‐
2051       tion about certificates and generating pem files, please consult the
2052       README.SSL file accompanying the software.
2053
2054       If you only want the http server to accept connect requests to one host
2055       address, you can specify the bind address either as an IP number
2056       string or as a hostname. In the following example we bind the http
2057       server to the loopback device. In other words the http server will only
2058       be reachable from localhost:
2059
2060         set httpd port 2812 and use the address 127.0.0.1
2061
2062       or
2063
2064         set httpd port 2812 and use the address localhost
2065
2066       If you do not use the ADDRESS statement the http server will accept
2067       connections on any/all local addresses.
2068
2069       It is possible to hide monit's httpd server version, which usually is
2070       available in httpd header responses and in error pages.
2071
2072         set httpd port 2812
2073           ...
2074           signature {enable|disable}
2075
2076       Use disable to hide the server signature - monit will only report its
2077       name (e.g. 'monit' instead of for example 'monit 4.2'). By default the
2078       version signature is enabled. It is worth stressing that this option
2079       provides no security advantage and falls into the "security through
2080       obscurity" category.
2081
2082       If you remove the httpd statement from the config file, monit will stop
2083       the httpd server on configuration reload. Likewise if you change the
2084       port number, monit will restart the http server using the new specified
2085       port number.
2086
2087       The status page displayed by the monit web server is automatically
2088       refreshed with the same poll time set for the monit daemon.
2089
2090       Note:
2091
2092       We strongly recommend that you start monit with http support (and bind
2093       the server to localhost, only, unless you are behind a firewall). The
2094       built-in web-server is small and does not use many resources, and more
2095       importantly, monit can use the http server for interprocess communica‐
2096       tion between a monit client and a monit daemon.
2097
2098       For instance, you must start a monit daemon with http support if you
2099       want to be able to use most of the available console commands, e.g.
2100       'monit stop all', 'monit start all' etc.
2101
2102       If a monit daemon is running in the background we will ask the daemon
2103       (via the HTTP protocol) to execute the above commands.  That is, the
2104       daemon is requested to start and stop the services.  This ensures that
2105       a daemon will not restart a service that you requested to stop and that
2106       (any) timeout lock will be removed from a service when you start it.
2107
2108       Monit HTTPD Authentication
2109
2110       monit supports two authentication schemes for connecting to the httpd
2111       server (three, if you count SSL client certificate validation). The
2112       two schemes can be used together or separately. You must choose at
2113       least one.
2114
2115       Host and network allow list
2116
2117       The http server maintains an access-control list of hosts and networks
2118       allowed to connect to the server. You can add as many hosts as you want
2119       to, but each host must be given as a valid domain name or IP address.
2120       Networks require a network IP and a netmask to be accepted.
2121
2122       The http server will query a name server to check any hosts connecting
2123       to the server. If a host (client) is trying to connect to the server,
2124       but cannot be found in the access list or cannot be resolved, the
2125       server will shut down the connection to the client promptly.
2126
2127       Control file example:
2128
2129         set httpd port 2812
2130             allow localhost
2131             allow my.other.work.machine.com
2132             allow 10.1.1.1
2133             allow 192.168.1.0/255.255.255.0
2134             allow 10.0.0.0/8
2135
2136       Clients, not mentioned in the allow list, trying to connect to the
2137       server are logged with their ip-address.
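
           As an illustration only (this is not monit's implementation), the
           address part of the allow list above can be modelled in a few lines
           of Python. The names ALLOWED_HOSTS, ALLOWED_NETWORKS and is_allowed
           are hypothetical, and hostname entries are left out because they
           would require a name server lookup:

```python
import ipaddress

# Hypothetical model of the address entries in the control file example above.
ALLOWED_HOSTS = {"10.1.1.1"}
ALLOWED_NETWORKS = [
    ipaddress.ip_network("192.168.1.0/255.255.255.0"),  # network IP plus netmask
    ipaddress.ip_network("10.0.0.0/8"),                 # prefix-length notation
]


def is_allowed(client_ip):
    """Return True if client_ip matches a host entry or falls inside a network."""
    addr = ipaddress.ip_address(client_ip)
    if client_ip in ALLOWED_HOSTS:
        return True
    return any(addr in net for net in ALLOWED_NETWORKS)
```

           With this list, is_allowed("192.168.1.77") and
           is_allowed("10.200.3.4") are true, while is_allowed("172.16.0.1") is
           false, so that client would be rejected.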
2138
2139       Basic Authentication
2140
2141       This authentication scheme is HTTP specific and described in more
2142       detail in RFC 2617.
2143
2144       In short, a server challenges a client (e.g. a browser) to send
2145       authentication information (username and password) and, if accepted,
2146       the server will allow the client access to the requested document.
2147
2148       The biggest weakness of Basic Authentication is that the username and
2149       password are sent in clear-text (i.e. base64 encoded) over the network.
2150       It is therefore recommended that you do not use this authentication
2151       method unless you run the monit http server with ssl support. With ssl
2152       support it is completely safe to use Basic Authentication since all
2153       http data, including Basic Authentication headers will be encrypted.
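
           To see why base64 encoding offers no protection, the following
           Python sketch builds the Authorization header a client would send
           under RFC 2617 and then recovers the credentials from it without
           any key. The function names are illustrative assumptions, not part
           of monit:

```python
import base64


def basic_auth_header(username, password):
    """Build the request header a client sends for Basic Authentication."""
    credentials = f"{username}:{password}".encode("utf-8")
    token = base64.b64encode(credentials).decode("ascii")
    return f"Authorization: Basic {token}"


def decode_basic_auth(header):
    """Recover (username, password) from the header; no secret is needed."""
    token = header.split("Basic ", 1)[1]
    decoded = base64.b64decode(token).decode("utf-8")
    username, _, password = decoded.partition(":")
    return username, password


header = basic_auth_header("hauk", "password")
recovered = decode_basic_auth(header)  # anyone observing the header can do this
```

           Anyone who can observe the header on the wire can run the second
           function, which is why the text above recommends ssl.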
2154
2155       monit will use Basic Authentication if an allow statement contains a
2156       username and a password separated with a single ':' character, like so:
2157       allow username:password. The username and password must be written in
2158       clear-text.
2159
2160       Alternatively you can use files in "htpasswd" format (one user:passwd
2161       entry per line), like so: allow [cleartext|crypt|md5] /path [users]. By
2162       default cleartext passwords are read. In case the passwords are
2163       digested it is necessary to specify the cryptographic method. If you do
2164       not want all users in the password file to have access to monit you can
2165       specify only those users that should have access, in the allow state‐
2166       ment. Otherwise all users are added.
2167
2168       Example 1:
2169
2170         set httpd port 2812
2171             allow hauk:password
2172             allow md5 /etc/httpd/htpasswd john paul ringo george
2173
2174       If you use this method together with a host list, then only clients
2175       from the listed hosts will be allowed to connect to the monit http
2176       server and each client will be asked to provide a username and a pass‐
2177       word.
2178
2179       Example 2:
2180
2181         set httpd port 2812
2182             allow localhost
2183             allow 10.1.1.1
2184             allow hauk:password
2185
2186       If you only want to use Basic Authentication, then just provide allow
2187       entries with username and password or password files as in example 1
2188       above.
2189
2190       Finally it is possible to define some users as read-only. A read-only
2191       user can read the monit web pages but will not get access to push-but‐
2192       tons and cannot change a service from the web interface.
2193
2194         set httpd port 2812
2195             allow admin:password
2196             allow hauk:password read-only
2197
2198       A user is set to read-only by using the read-only keyword after user‐
2199       name:password. In the above example the user hauk is defined as a read-
2200       only user, while the admin user has all access rights.
2201
2202       NB! a monit client will use the first username:password pair in an
2203       allow list and you should not define the first user as a read-only
2204       user. If you do, monit console commands will not work.
2205
2206       If you use Basic Authentication it is a good idea to set the access
2207       permission for the control file (~/.monitrc) to only readable and
2208       writable for the user running monit, because the password is written in
2209       clear-text. (Use this command, /bin/chmod 600 ~/.monitrc). In fact,
2210       since monit version 3.0, monit will complain and exit if the control
2211       file is readable by others.
2212
2213       Clients that try to connect to the server but supply the wrong username
2214       and/or password are logged with their ip-address.
2215
2216       If the monit command line interface is being used, at least one cleart‐
2217       ext password is necessary. Otherwise, the monit command line interface
2218       will not be able to connect to the monit daemon server.
2219

DEPENDENCIES

2221       If specified in the control file, monit can do dependency checking
2222       before start, stop, monitoring or unmonitoring of services. The depen‐
2223       dency statement may be used within any service entries in the monit
2224       control file.
2225
2226       The syntax for the depend statement is simply:
2227
2228       DEPENDS on service[, service [,...]]
2229
2230       Where service is a service entry name, for instance apache or datafs.
2231
2232       You may add more than one service name of any type or use more than one
2233       depend statement in an entry.
2234
2235       Services specified in a depend statement will be checked during
2236       stop/start/monitor/unmonitor operations. If a service is stopped or
2237       unmonitored, monit will stop/unmonitor any services that depend on it.
2238       Likewise, if a service is started, monit will first stop any services
2239       that depend on it and, after it is started, start all dependent services
2240       again. If the service is to be monitored (enable monitoring), all ser‐
2241       vices which this service depends on will be monitored before enabling
2242       monitoring of this service.
2243
2244       Here is an example where we set up an apache service entry to depend on
2245       the underlying apache binary. If the binary should change an alert is
2246       sent and apache is not monitored anymore. The rationale is security and
2247       that monit should not execute a possibly cracked apache binary.
2248
2249        (1) check process apache
2250        (2)    with pidfile "/usr/local/apache/logs/httpd.pid"
2251        (3)    ...
2252        (4)    depends on httpd
2253        (5)
2254        (6) check file httpd with path /usr/local/apache/bin/httpd
2255        (7)    if failed checksum then unmonitor
2256
2257       The first entry is the process entry for apache shown before (abbrevi‐
2258       ated for clarity). The fourth line sets up a dependency between this
2259       entry and the service entry named httpd in line 6. A depend tree works
2260       as follows: if an action is conducted in a lower branch, it propagates
2261       upward in the tree and the same action is executed for every dependent
2262       entry. In this case, if the checksum should fail in line 7 then an
2263       unmonitor action is executed and the apache binary is not checked any‐
2264       more. But since the apache process entry depends on the httpd entry
2265       this entry will also execute the unmonitor action. In short, if the
2266       checksum test for the httpd binary file should fail, both the check
2267       file httpd entry and the check process apache entry are set to
2268       unmonitored mode.
2269
2270       A dependency tree is a general construct and can be used between all
2271       types of service entries and span many levels and propagate any sup‐
2272       ported action (except the exec action which will not propagate upward
2273       in a dependency tree for obvious reasons).
2274
2275       Here is another example. Consider the following common server
2276       setup:
2277
2278         WEB-SERVER -> APPLICATION-SERVER -> DATABASE -> FILESYSTEM
2279             (a)               (b)             (c)          (d)
2280
2281       You can set dependencies so that the web-server depends on the applica‐
2282       tion server to run before the web-server starts and the application
2283       server depends on the database server and the database depends on the
2284       file-system to be mounted before it starts. See also the example sec‐
2285       tion below for examples using the depend statement.
2286
2287       Here we describe how monit will function with the above dependencies:
2288
2289       If no servers are running
2290           monit will start the servers in the following order: d, c, b, a
2291
2292       If all servers are running
2293           When you run 'monit stop all' this is the stop order: a, b, c, d.
2294           If you run 'monit stop d' then a, b and c are also stopped because
2295           they depend on d and finally d is stopped.
2296
2297       If a does not run
2298           When monit runs it will start a
2299
2300       If b does not run
2301           When monit runs it will first stop a then start b and finally start
2302           a again.
2303
2304       If c does not run
2305           When monit runs it will first stop a and b then start c and finally
2306           start b then a.
2307
2308       If d does not run
2309           When monit runs it will first stop a, b and c then start d and
2310           finally start c, b then a.
2311
2312       If the control file contains a depend loop.
2313           A depend loop is, for example, a->b and b->a or a->b->c->a.
2314
2315           When monit starts it will check for such loops and complain and
2316           exit if a loop was found. It will also exit with a complaint if a
2317           depend statement was used that does not point to a service in the
2318           control file.
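
           The ordering rules and the loop check described above can be
           sketched in Python as a depth-first walk over the dependency map.
           This is a model under stated assumptions, not monit's code: the
           DEPENDS table mirrors the a, b, c, d chain, and start_order,
           stop_order, depends_on and dependents are hypothetical helper names.

```python
# Hypothetical dependency map: service -> services it depends on.
DEPENDS = {"a": ["b"], "b": ["c"], "c": ["d"], "d": []}


def start_order(depends):
    """Return services so that every dependency precedes its dependents."""
    order, state = [], {}

    def visit(name, path):
        if state.get(name) == "done":
            return
        if state.get(name) == "visiting":
            raise ValueError("depend loop: " + "->".join(path + [name]))
        if name not in depends:
            raise ValueError("depend statement points to unknown service: " + name)
        state[name] = "visiting"
        for dep in depends[name]:
            visit(dep, path + [name])
        state[name] = "done"
        order.append(name)

    for name in depends:
        visit(name, [])
    return order


def stop_order(depends):
    """Stopping happens in the reverse of the start order."""
    return list(reversed(start_order(depends)))


def depends_on(depends, service, target):
    """True if service depends on target directly or indirectly."""
    return any(dep == target or depends_on(depends, dep, target)
               for dep in depends[service])


def dependents(depends, name):
    """Services to stop before name; stop_order validates the map first."""
    return [s for s in stop_order(depends)
            if s != name and depends_on(depends, s, name)]
```

           For the chain above, start_order(DEPENDS) gives d, c, b, a and
           dependents(DEPENDS, "c") gives a, b, matching the cases listed. A
           map such as {"a": ["b"], "b": ["a"]} raises an error instead of
           producing an order.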
2319

THE RUN CONTROL FILE

2321       The preferred way to set up monit is to write a .monitrc file in your
2322       home directory. When there is a conflict between the command-line argu‐
2323       ments and the arguments in this file, the command-line arguments take
2324       precedence. To protect the security of your control file and passwords
2325       the control file must have permissions no more than 0700 (u=rwx,g=,o=);
2326       monit will complain and exit otherwise.
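
           A minimal Python sketch of such a permission test, assuming it
           amounts to verifying that no group or other permission bits are set.
           The name has_strict_permissions is hypothetical and monit's actual
           check may differ in detail:

```python
import os
import stat


def has_strict_permissions(path):
    """Return True if the file grants no access to group or others."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    # 0o077 masks the group and other bits; 0700 and 0600 both pass.
    return mode & 0o077 == 0
```

           A file with mode 0600 or 0700 passes this test, while a file with
           mode 0644 does not.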
2327
2328       Run Control Syntax
2329
2330       Comments begin with a '#' and extend through the end of the line.  Oth‐
2331       erwise the file consists of a series of service entries or global
2332       option statements in a free-format, token-oriented syntax.
2333
2334       There are three kinds of tokens: grammar keywords, numbers (i.e.  deci‐
2335       mal digit sequences) and strings. Strings can be either quoted or
2336       unquoted. A quoted string is bounded by double quotes and may contain
2337       whitespace (and quoted digits are treated as a string). An unquoted
2338       string is any whitespace-delimited token, containing characters and/or
2339       numbers.
2340
2341       On a semantic level, the control file consists of two types of entries:
2342
2343       1. Global set-statements
2344           A global set-statement starts with the keyword set and the item to
2345           configure.
2346
2347       2. One or more service entry statements.
2348           Each service entry consists of the keyword `check', followed by
2349           the service type. Each entry requires a <unique> descriptive name,
2350           which may be freely chosen. This name is used by monit to refer to
2351           the service internally and in all interactions with the user.
2352
2353       Currently, seven types of check statements are supported:
2354
2355       1. CHECK PROCESS <unique name> PIDFILE <path>
2356           <path> is the absolute path to the program's pidfile. If the pid‐
2357           file does not exist or does not contain the pid number of a running
2358           process, monit will call the entry's start method if defined. If
2359           monit runs in passive mode or the start method is not defined,
2360           monit will just send alerts on errors.
2361
2362       2. CHECK FILE <unique name> PATH <path>
2363           <path> is the absolute path to the file. If the file does not exist
2364           or has disappeared, monit will call the entry's start method if
2365           defined. If <path> does not point to a regular file (for
2366           instance a directory), monit will disable monitoring of this entry.
2367           If monit runs in passive mode or the start method is not defined,
2368           monit will just send alerts on errors.
2369
2370       3. CHECK FIFO <unique name> PATH <path>
2371           <path> is the absolute path to the fifo. If the fifo does not exist
2372           or has disappeared, monit will call the entry's start method if
2373           defined. If <path> does not point to a fifo (for instance a
2374           directory), monit will disable monitoring of this entry. If monit
2375           runs in passive mode or the start method is not defined, monit
2376           will just send alerts on errors.
2377
2378       4. CHECK DEVICE <unique name> PATH <path>
2379           <path> is the path to the device block special file, mount point,
2380           file or a directory which is part of a filesystem. It is recom‐
2381           mended to use a block special file directly (for example /dev/hda1
2382           on Linux or /dev/dsk/c0t0d0s1 on Solaris, etc.). If you use a mount
2383           point (for example /data), be careful: if the device is unmounted,
2384           the test will still succeed because the mount point still
2385           exists.
2386
2387           If the device becomes unavailable, monit will call the entry's
2388           start method if defined. If <path> does not point to a device,
2389           monit will disable monitoring of this entry. If monit runs in
2390           passive mode or the start method is not defined, monit will just send
2391           alerts on errors.
2392
2393       5. CHECK DIRECTORY <unique name> PATH <path>
2394           <path> is the absolute path to the directory. If the directory does
2395           not exist or has disappeared, monit will call the entry's start
2396           method if defined. If <path> does not point to a directory, monit
2397           will disable monitoring of this entry. If monit runs in passive mode
2398           or the start method is not defined, monit will just send alerts on
2399           errors.
2400
2401       6. CHECK HOST <unique name> ADDRESS <host address>
2402           The host address can be specified as a hostname string or as an
2403           IP address string in dotted decimal format, for example
2404           tildeslash.com or "64.87.72.95".
2405
2406       7. CHECK SYSTEM <unique name>
2407           The system name is usually the hostname, but any descriptive name
2408           can be used. This test lets you check general system resources such
2409           as CPU usage (percent of time spent in user, system and wait), total
2410           memory usage or load average.
2411
2412       You can use noise keywords like 'if', `and', `with(in)', `has',
2413       `using', 'use', 'on(ly)', `usage' and `program(s)' anywhere in an entry
2414       to make it resemble English. They're ignored, but can make entries much
2415       easier to read at a glance. The punctuation characters ';' ',' and '='
2416       are also ignored. Keywords are case insensitive.
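
           To make the effect of noise keywords concrete, here is a rough
           Python sketch that splits an entry into tokens and drops the ignored
           words and punctuation listed above. It is only a model: the names
           NOISE, TOKEN and significant_tokens are hypothetical, and monit's
           real lexer is not reproduced here.

```python
import re

# Noise keywords from the paragraph above, with the optional parts expanded.
NOISE = {"if", "and", "with", "within", "has", "using", "use",
         "on", "only", "usage", "program", "programs"}

# A token is either a double-quoted string or a run of characters that
# excludes whitespace and the ignored punctuation ';', ',' and '='.
TOKEN = re.compile(r'"([^"]*)"|([^\s;,=]+)')


def significant_tokens(entry):
    """Return the tokens of an entry that carry meaning."""
    tokens = []
    for match in TOKEN.finditer(entry):
        if match.group(1) is not None:
            tokens.append(match.group(1))      # quoted strings are always kept
        elif match.group(2).lower() not in NOISE:
            tokens.append(match.group(2))      # keywords are case insensitive
    return tokens
```

           Under this model, the entries 'check process apache with pidfile
           "/var/run/httpd.pid"' and 'CHECK PROCESS apache PIDFILE
           /var/run/httpd.pid' reduce to the same five tokens apart from
           letter case.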
2417
2418        Here are the legal global keywords:
2419
2420        Keyword         Function
2421        ----------------------------------------------------------------
2422        set daemon      Set a background poll interval in seconds.
2423        set init        Set monit to run from init. monit will not
2424                        transform itself into a daemon process.
2425        set logfile     Name of a file to dump error- and status-
2426                        messages to. If syslog is specified as the
2427                        file, monit will utilize the syslog daemon
2428                        to log messages. This can optionally be
2429                        followed by 'facility <facility>' where
2430                        facility is 'log_local0' - 'log_local7' or
2431                        'log_daemon'. If no facility is specified,
2432                        LOG_USER is used.
2433        set mailserver  The mailserver used for sending alert
2434                        notifications. If the mailserver is not
2435                        defined, monit will try to use 'localhost'
2436                        as the smtp-server for sending mail. You
2437                        can add more mail servers, if monit cannot
2438                        connect to the first server it will try the
2439                        next server and so on.
2440        set mail-format Set a global mail format for all alert
2441                        messages emitted by monit.
2442        set pidfile     Explicitly set the location of the monit lock
2443                        file. E.g. set pidfile /var/run/xyzmonit.pid.
2444        set statefile   Explicitly set the location of the file monit
2445                        will write state data to. If not set, the
2446                        default is $HOME/.monit.state.
2447        set httpd port  Activates monit http server at the given
2448                        port number.
2449        ssl enable      Enables ssl support for the httpd server.
2450                        Requires the use of the pemfile statement.
2451        ssl disable     Disables ssl support for the httpd server.
2452                        It is equal to omitting any ssl statement.
2453        pemfile         Set the pemfile to be used with ssl.
2454        clientpemfile   Set the pemfile to be used when client
2455                        certificates should be checked by monit.
2456        address         If specified, the http server will only
2457                        accept connect requests to this address.
2458                        This statement is an optional part of the
2459                        set httpd statement.
2460        allow           Specifies a host or IP address allowed to
2461                        connect to the http server. Can also specify
2462                        a username and password allowed to connect
2463                        to the server. More than one allow statement
2464                        is allowed. This statement is also an
2465                        optional part of the set httpd statement.
2466        read-only       Set the user defined in username:password
2467                        to read only. A read-only user cannot change
2468                        a service from the monit web interface.
2469        include         Include a file or files matching the glob string.
2470
2471        Here are the legal service entry keywords:
2472
2473        Keyword         Function
2474        ----------------------------------------------------------------
2475        check           Starts an entry and must be followed by the type
2476                        of monitored service {device|directory|fifo|file|
2477                        host|process|system} and a descriptive name for
2478                        the service.
2479        pidfile         Specify the  process pidfile. Every
2480                        process must create a pidfile with its
2481                        current process id. This statement should only
2482                        be used in a process service entry.
2483        path            Must be followed by a path to the block
2484                        special file for filesystem (device), regular
2485                        file, directory or a process's pidfile.
2486        group           Specify a groupname for a service entry.
2487        start           The program used to start the specified
2488                        service. Full path is required. This
2489                        statement is optional, but recommended.
2490        stop            The program used to stop the specified
2491                        service. Full path is required. This
2492                        statement is optional, but recommended.
2493        pid and ppid    These keywords may be used as standalone
2494                        statements in a process service entry to
2495                        override the alert action for change of
2496                        process pid and ppid.
2497        uid and gid     These keywords are either 1) an optional part of
2498                        a start, stop or exec statement. They may be
2499                        used to specify a user id and a group id the
2500                        program (process) should switch to upon start.
2501                        This feature can only be used if the superuser
2502                        is running monit. 2) uid and gid may also be
2503                        used as standalone statements in a file service
2504                        entry to test a file's uid and gid attributes.
2505        host            The hostname or IP address to test the port
2506                        at. This keyword can only be used together
2507                        with a port statement or in the check host
2508                        statement.
2509        port            Specify a TCP/IP service port number which
2510                        a process is listening on. This statement
2511                        is also optional. If this statement is not
2512                        prefixed with a host-statement, localhost is
2513                        used as the hostname to test the port at.
2514        type            Specifies the socket type monit should use when
2515                        testing a connection to a port. If the type
2516                        keyword is omitted, tcp is used. This keyword
2517                        must be followed by either tcp, udp or tcpssl.
2518        tcp             Specifies that monit should use a TCP
2519                        socket type (stream) when testing a port.
2520        tcpssl          Specifies that monit should use a TCP socket
2521                        type (stream) and the secure socket layer (ssl)
2522                        when testing a port connection.
2523        udp             Specifies that monit should use a UDP socket
2524                        type (datagram) when testing a port.
2525        certmd5         The md5 sum of a certificate a ssl forged
2526                        server has to deliver.
2527        proto(col)      This keyword specifies the type of service
2528                        found at the port. monit knows at the moment
2529                        how to speak HTTP, SMTP, FTP, POP, IMAP, MYSQL,
2530                        NNTP, SSH, DWP, LDAP2, LDAP3, RDATE, NTP3, DNS,
2531                        POSTFIX-POLICY, APACHE-STATUS, TNS, PGSQL and
2532                        RSYNC.
2533                        You're welcome to write new protocol test
2534                        modules. If no protocol is specified monit will
2535                        use a default test which in most cases is good
2536                        enough.
2537        request         Specifies a server request and must come
2538                        after the protocol keyword mentioned above.
2539                         - for http it can contain a URL and an
2540                           optional query string.
2541                         - other protocols do not support this
2542                           statement yet
2543        send/expect     These keywords specify a generic protocol.
2544                        Both require a string, either to be sent or
2545                        to be matched against (as extended regex if
2546                        supported). Send/expect cannot be used
2547                        together with the proto(col) statement.
2548        unix(socket)    Specifies a Unix socket file and used like
2549                        the port statement above to test a Unix
2550                        domain network socket connection.
2551        URL             Specify an URL string which monit will use for
2552                        connection testing.
2553        content         Optional sub-statement for the URL statement.
2554                        Specifies that monit should test the content
2555                        returned by the server against a regular
2556                        expression.
2557        timeout x sec.  Define a network port connection timeout. Must
2558                        be followed by a number in seconds and the
2559                        keyword, seconds.
2560        timeout         Define a service timeout. Must be followed by
2561                        two numbers. The first is the max number of
2562                        restarts for the service. The second
2563                        is the cycle interval to test restarts.
2564                        This statement is optional.
2565        alert           Specifies an email address for notification
2566                        if a service event occurs. Alert can also
2567                        be postfixed, to only send a message for
2568                        certain events. See the examples above. More
2569                        than one alert statement is allowed in an
2570                        entry. This statement is also optional.
2571        noalert         Specifies an email address that should not
2572                        receive alerts. This statement is also
2573                        optional.
2574        restart, stop   These keywords may be used as actions for
2575        unmonitor,      various test statements. The exec statement is
2576        start and       special in that it requires a following string
2577        exec            specifying the program to be executed. You may
2578                        also specify a UID and GID for the exec
2579                        statement. The program executed will then run
2580                        using the specified user id and group id.
2581        mail-format     Specifies a mail format for an alert message
2582                        This statement is an optional part of the
2583                        alert statement.
2584        checksum        Specify that monit should compute and monitor a
2585                        file's md5/sha1 checksum. May only be used in a
2586                        check file entry.
2587        expect          Specifies a md5/sha1 checksum string monit
2588                        should expect when testing the checksum. This
2589                        statement is an optional part of the checksum
2590                        statement.
2591        timestamp       Specifies an expected timestamp for a file
2592                        or directory. More than one timestamp statement
2593                        is allowed. May only be used in a check file or
2594                        check directory entry.
2595        changed         Part of a timestamp statement and used as an
2596                        operator to simply test for a timestamp change.
2597        every           Validate this entry only at every n-th poll cycle.
2598                        Useful in daemon mode when the cycle is short
2599                        and a service takes some time to start.
2600        mode            Must be followed either by the keyword active,
2601                        passive or manual. If active, monit will restart
2602                        the service if it is not running (this is the
2603                        default behavior). If passive, monit will not
2604                        (re)start the service if it is not running - it
2605                        will only monitor and send alerts (resource
2606                        related restart and stop options are ignored
2607                        in this mode also). If manual, monit will enter
2608                        active mode only if a service was started under
2609                        monit's control; otherwise the service isn't
2610                        monitored.
2611        cpu             Must be followed by a compare operator, a number
2612                        with "%" and an action. This statement is used
2613                        to check the cpu usage in percent of a process
2614                        with its children over a number of cycles. If
2615                        the compare expression matches then the
2616                        specified action is executed.
2617        mem             The equivalent to the cpu token for memory of a
2618                        process (w/o children!).  This token must be
2619                        followed by a compare operator, a number with
2620                        unit {B|KB|MB|GB|%|byte|kilobyte|megabyte|
2621                        gigabyte|percent} and an action.
2622        loadavg         Must be followed by [1min,5min,15min] in (), a
2623                        compare operator, a number and an action. This
2624                        statement is used to check the system load
2625                        average over a number of cycles. If the compare
2626                        expression matches then the specified action is
2627                        executed.
2628        children        This is the number of child processes spawned by a
2629                        process. The syntax is the same as above.
2630        totalmem        The equivalent of mem, except totalmem is an
2631                        aggregation of memory, not only used by a
2632                        process but also by all its child
2633                        processes. The syntax is the same as above.
2634        space           Must be followed by a compare operator, a
2635                        number, unit {B|KB|MB|GB|%|byte|kilobyte|
2636                        megabyte|gigabyte|percent} and an action.
2637        inode(s)        Must be followed by a compare operator, integer
2638                        number, optionally by percent sign (if not, the
2639                        limit is absolute) and an action.
2640        perm(ission)    Must be followed by an octal number describing
2641                        the permissions.
2642        size            Must be followed by a compare operator, a
2643                        number, unit {B|KB|MB|GB|byte|kilobyte|
2644                        megabyte|gigabyte} and an action.
2645        depends (on)    Must be followed by the name of a service this
2646                        service depends on.
2647
2648       Here's the complete list of reserved keywords used by monit:
2649
2650       if, then, else, set, daemon, logfile, syslog, address, httpd, ssl,
2651       enable, disable, pemfile, allow, read-only, check, init, count, pid‐
2652       file, statefile, group, start, stop, uid, gid, connection, port(num‐
2653       ber), unix(socket), type, proto(col), tcp, tcpssl, udp, alert, noalert,
2654       mail-format, restart, timeout, checksum, resource, expect, send,
2655       mailserver, every, mode, active, passive, manual, depends, host,
2656       default, http, ftp, smtp, pop, ntp3, nntp, imap, clamav, ssh, dwp,
2657       ldap2, ldap3, tns, request, cpu, mem, totalmem, children, loadavg,
2658       timestamp, changed, second(s), minute(s), hour(s), day(s), space,
2659       inode, pid, ppid, perm(ission), icmp, process, file, directory, device,
2660       size, unmonitor, rdate, rsync, data, invalid, exec, nonexist, policy,
2661       reminder, instance, eventqueue, basedir, slot(s), system and
2662       failed.
2663
2664       And here is a complete list of noise keywords ignored by monit:
2665
2666       is, as, are, on(ly), with(in), and, has, using, use, the, sum, pro‐
2667       gram(s), than, for, usage, was, but, of.
2668
2669       Note: If the start or stop programs are shell scripts, then the script
2670       must begin with "#!" and the remainder of the first line must specify
2671       an interpreter for the program. E.g.  "#!/bin/sh"
2672
2673       It's possible to write scripts directly into the start and stop entries
2674       by using a string of shell-commands. Like so:
2675
2676        start="/bin/bash -c 'echo $$ > pidfile; exec program'"
2677        stop="/bin/bash -c 'kill -s SIGTERM `cat pidfile`'"
2678
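       The inline start command above writes the shell's PID and then uses
       exec, so the program takes over that same PID and the pidfile stays
       accurate. A minimal sketch of the same idea as a reusable shell
       function follows. The name start_with_pidfile and its argument order
       are illustrative assumptions and are not part of monit.

```shell
# Hypothetical helper (not part of monit): record a PID, then run a program
# under that same PID.
#   $1     - pidfile to write
#   $2...  - program to run, followed by its arguments
start_with_pidfile() {
    pidfile=$1
    shift
    # The child shell writes its own PID ($$) to the pidfile. exec then
    # replaces that shell with the program, so the PID does not change.
    sh -c 'echo $$ > "$0"; exec "$@"' "$pidfile" "$@"
}
```

       Without exec the program would run as a child of the shell, and the
       pidfile would name the shell instead of the program.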
2679       CONFIGURATION EXAMPLES
2680
2681       The simplest form is just the check statement. In this example we check
2682       to see if the server is running and log a message if not:
2683
2684        check process resin with pidfile /usr/local/resin/srun.pid
2685
2686       To have monit start the server if it's not running, add a start state‐
2687       ment:
2688
2689        check process resin with pidfile /usr/local/resin/srun.pid
2690              start program = "/usr/local/resin/bin/srun.sh start"
2691
2692       Here's a more advanced example for monitoring an apache web-server lis‐
2693       tening on the default port number for HTTP and HTTPS. In this example
2694       monit will restart apache if it's not accepting connections at the port
2695       numbers. The method monit uses for a process restart is to first execute
2696       the stop-program, wait for the process to stop and then execute the
2697       start-program. (If monit was unable to stop or start the service a
2698       failed alert message will be sent if you have requested alert messages
2699       to be sent).
2700
2701        check process apache with pidfile /var/run/httpd.pid
2702              start program = "/etc/init.d/httpd start"
2703              stop program  = "/etc/init.d/httpd stop"
2704              if failed port 80 then restart
2705              if failed port 443 with timeout 15 seconds then restart
2706
2707       This example demonstrates how you can run a program as a specified user
2708       (uid) and with a specified group (gid). Many daemon programs will do
2709       the uid and gid switch by themselves, but for those programs that do
2710       not (e.g. Java programs), monit's ability to start a program as a cer‐
2711       tain user can be very useful. In this example we start the Tomcat Java
2712       Servlet Engine as the standard nobody user and group. Please note that
2713       monit will only switch uid and gid for a program if the super-user is
2714       running monit, otherwise monit will simply ignore the request to change
2715       uid and gid.
2716
2717        check process tomcat with pidfile /var/run/tomcat.pid
2718              start program = "/etc/init.d/tomcat start"
2719                    as uid nobody and gid nobody
2720              stop program  = "/etc/init.d/tomcat stop"
2721                    # You can also use id numbers instead and write:
2722                    as uid 99 and with gid 99
2723              if failed port 8080 then alert
2724
2725       In this example we use udp for connection testing to check if the
2726       name-server is running and also use restart and timeout:
2727
2728        check process named with pidfile /var/run/named.pid
2729              start program = "/etc/init.d/named start"
2730              stop program  = "/etc/init.d/named stop"
2731              if failed port 53 use type udp protocol dns then restart
2732              if 3 restarts within 5 cycles then timeout
2733
2734       The following example illustrates how to check if the service 'sophie'
2735       is answering connections on its Unix domain socket:
2736
2737        check process sophie with pidfile /var/run/sophie.pid
2738              start program = "/etc/init.d/sophie start"
2739              stop  program = "/etc/init.d/sophie stop"
2740              if failed unix /var/run/sophie then restart
2741
2742       In this example we check an apache web-server running on localhost that
2743       answers for several IP-based virtual hosts or vhosts, hence the host
2744       statement before port:
2745
2746        check process apache with pidfile /var/run/httpd.pid
2747              start "/etc/init.d/httpd start"
2748              stop  "/etc/init.d/httpd stop"
2749              if failed host www.sol.no port 80 then alert
2750              if failed host shop.sol.no port 443 then alert
2751              if failed host chat.sol.no port 80 then alert
2752              if failed host www.tildeslash.com port 80 then alert
2753
2754       To make sure that monit is communicating with a http server a protocol
2755       test can be added:
2756
2757        check process apache with pidfile /var/run/httpd.pid
2758              start "/etc/init.d/httpd start"
2759              stop  "/etc/init.d/httpd stop"
2760              if failed host www.sol.no port 80
2761                 protocol HTTP
2762                 then alert
2763
2764       This example shows a different way to check a webserver using the
2765       send/expect mechanism:
2766
2767        check process apache with pidfile /var/run/httpd.pid
2768              start "/etc/init.d/httpd start"
2769              stop  "/etc/init.d/httpd stop"
2770              if failed host www.sol.no port 80
2771                 send "GET / HTTP/1.0\r\nHost: www.sol.no\r\n\r\n"
2772                 expect "HTTP/[0-9\.]{3} 200 .*\r\n"
2773                 then alert
2774
2775       To make sure that Apache is logging successfully (i.e. no more than 60
2776       percent of child servers are logging), use its mod_status page at
2777       www.sol.no/server-status with this special protocol test:
2778
2779        check process apache with pidfile /var/run/httpd.pid
2780              start "/etc/init.d/httpd start"
2781              stop  "/etc/init.d/httpd stop"
2782              if failed host www.sol.no port 80
2783              protocol apache-status loglimit > 60% then restart
2784
2785       This configuration can be used to alert you if 25 percent or more of
2786       Apache child processes are stuck performing DNS lookups:
2787
2788        check process apache with pidfile /var/run/httpd.pid
2789              start "/etc/init.d/httpd start"
2790              stop  "/etc/init.d/httpd stop"
2791              if failed host www.sol.no port 80
2792              protocol apache-status dnslimit > 25% then alert
2793
2794       Here we use an icmp ping test to check if a remote host is up and if
2795       not send an alert:
2796
2797        check host www.tildeslash.com with address www.tildeslash.com
2798              if failed icmp type echo count 5 with timeout 15 seconds
2799                 then alert
2800
2801       In the following example we ask monit to compute and verify the
2802       checksum for the underlying apache binary used by the start and stop
2803       programs. If the checksum test should fail, monitoring will be
2804       disabled to prevent possibly starting a compromised binary:
2805
2806        check process apache with pidfile /var/run/httpd.pid
2807              start program = "/etc/init.d/httpd start"
2808              stop program  = "/etc/init.d/httpd stop"
2809              if failed host www.tildeslash.com port 80 then restart
2810              depends on apache_bin
2811
2812        check file apache_bin with path /usr/local/apache/bin/httpd
2813              if failed checksum then unmonitor
2814
2815       In this example we ask monit to test the checksum for a document on a
2816       remote server. If the checksum was changed we send an alert:
2817
2818        check host tildeslash with address www.tildeslash.com
2819              if failed port 80 protocol http
2820                 and request "/monit/dist/monit-4.0.tar.gz"
2821                     with checksum f9d26b8393736b5dfad837bb13780786
2822              then alert
2823              alert hauk@tildeslash.com with mail-format {subject:
2824                Aaaalarm! }
2825
2826       Some servers are slow starters, like for example Java based Application
2827       Servers. So if we want to keep the poll-cycle low (i.e. < 60 seconds)
2828       but allow some services to take their time to start, the every statement
2829       is handy:
2830
2831        check process dynamo with pidfile /etc/dynamo.pid
2832              start program = "/etc/init.d/dynamo start"
2833              stop program  = "/etc/init.d/dynamo stop"
2834              if failed port 8840 then alert
2835              every 2 cycles
2836
2837       Here is an example where we group together two database entries so you
2838       can manage them together, e.g. 'monit -g database start all'. The mode
2839       statement is also illustrated in the first entry and has the effect
2840       that monit will not try to (re)start this service if it is not running:
2841
2842        check process sybase with pidfile /var/run/sybase.pid
2843              start = "/etc/init.d/sybase start"
2844              stop  = "/etc/init.d/sybase stop"
2845              mode passive
2846              group database
2847
2848        check process oracle with pidfile /var/run/oracle.pid
2849              start program = "/etc/init.d/oracle start"
2850              stop program  = "/etc/init.d/oracle stop"
2851              mode active # Not necessary really, since it's the default
2852              if failed port 9001 then restart
2853              group database
2854
2855       Here is an example to show the usage of the resource checks. It will
2856       send an alert when the CPU usage of the http daemon and its child
2857       processes rises above 60% for two cycles. Apache is restarted if the
2858       CPU usage is over 80% for five cycles, and stopped if the memory usage
2859       is over 100 MB for five cycles or if the machine's load average is
2860       more than 10 for 8 cycles:
2861
2862        check process apache with pidfile /var/run/httpd.pid
2863              start program = "/etc/init.d/httpd start"
2864              stop program  = "/etc/init.d/httpd stop"
2865              if cpu > 60% for 2 cycles then alert
2866              if cpu > 80% for 5 cycles then restart
2867              if mem > 100 MB for 5 cycles then stop
2868              if loadavg(5min) greater than 10.0 for 8 cycles then stop
2869
2870       This example demonstrates the timestamp statement with exec and how you
2871       may restart apache if its configuration file was changed.
2872
2873        check file httpd.conf with path /etc/httpd/httpd.conf
2874              if changed timestamp
2875                 then exec "/etc/init.d/httpd graceful"
2876
2877       In this example we demonstrate usage of the extended alert statement
2878       and a file check dependency:
2879
2880        check process apache with pidfile /var/run/httpd.pid
2881             start = "/etc/init.d/httpd start"
2882             stop  = "/etc/init.d/httpd stop"
2883             if failed host www.tildeslash.com  port 80 then restart
2884             alert admin@bar on {nonexist, timeout}
2885               with mail-format {
2886                     from:     bofh@$HOST
2887                     subject:  apache $EVENT - $ACTION
2888                     message:  This event occurred on $HOST at $DATE.
2889                     Your faithful employee,
2890                     monit
2891             }
2892             if 3 restarts within 5 cycles then timeout
2893             depend httpd_bin
2894             group apache
2895
2896        check file httpd_bin with path /usr/local/apache/bin/httpd
2897              if failed checksum
2898                 and expect 8f7f419955cefa0b33a2ba316cba3659
2899                     then unmonitor
2900              if failed permission 755 then unmonitor
2901              if failed uid root then unmonitor
2902              if failed gid root then unmonitor
2903              if changed timestamp then alert
2904              alert security@bar on {checksum, timestamp,
2905                                     permission, uid, gid}
2906                    with mail-format {subject: Alaaarrm! on $HOST}
2907              group apache
2908
2909       In this example, we demonstrate usage of the depend statement. In this
2910       case, we want to start oracle and apache. However, we've set up apache
2911       to use oracle as a back end, and if oracle is restarted, apache must be
2912       restarted as well.
2913
2914        check process apache with pidfile /var/run/httpd.pid
2915              start = "/etc/init.d/httpd start"
2916              stop  = "/etc/init.d/httpd stop"
2917              depends on oracle
2918
2919        check process oracle with pidfile /var/run/oracle.pid
2920              start = "/etc/init.d/oracle start"
2921              stop  = "/etc/init.d/oracle stop"
2922              if failed port 9001 then restart
2923
2924       Next, we have 2 services, oracle-import and oracle-export that need to
2925       be restarted if oracle is restarted, but are independent of each other.
2926
2927        check process oracle with pidfile /var/run/oracle.pid
2928              start = "/etc/init.d/oracle start"
2929              stop  = "/etc/init.d/oracle stop"
2930              if failed port 9001 then restart
2931
2932        check process oracle-import
2933             with pidfile /var/run/oracle-import.pid
2934              start = "/etc/init.d/oracle-import start"
2935              stop  = "/etc/init.d/oracle-import stop"
2936              depends on oracle
2937
2938        check process oracle-export
2939             with pidfile /var/run/oracle-export.pid
2940              start = "/etc/init.d/oracle-export start"
2941              stop  = "/etc/init.d/oracle-export stop"
2942              depends on oracle
2943
2944       Finally, an example with all statements:
2945
2946        check process apache with pidfile /var/run/httpd.pid
2947              start program = "/etc/init.d/httpd start"
2948              stop program  = "/etc/init.d/httpd stop"
2949              if 3 restarts within 5 cycles then timeout
2950              if failed host www.sol.no  port 80 protocol http
2951                 and use the request "/login.cgi"
2952                     then alert
2953              if failed host shop.sol.no port 443 type tcpssl
2954                 protocol http and with timeout 15 seconds
2955                     then restart
2956              if cpu is greater than 60% for 2 cycles then alert
2957              if cpu > 80% for 5 cycles then restart
2958              if totalmem > 100 MB then stop
2959              if children > 200 then alert
2960              alert bofh@bar with mail-format {from: monit@foo.bar.no}
2961              every 2 cycles
2962              mode active
2963              depends on weblogic
2964              depends on httpd.pid
2965              depends on httpd.conf
2966              depends on httpd_bin
2967              depends on datafs
2968              group server
2969
2970        check file httpd.pid with path /usr/local/apache/logs/httpd.pid
2971              group server
2972              if timestamp > 7 days then restart
2973              every 2 cycles
2974              alert bofh@bar with mail-format {from: monit@foo.bar.no}
2975              depends on datafs
2976
2977        check file httpd.conf with path /etc/httpd/httpd.conf
2978              group server
2979              if timestamp was changed
2980                 then exec "/usr/local/apache/bin/apachectl graceful"
2981              every 2 cycles
2982              alert bofh@bar with mail-format {from: monit@foo.bar.no}
2983              depends on datafs
2984
2985        check file httpd_bin with path /usr/local/apache/bin/httpd
2986              group server
2987              if failed checksum and expect the sum
2988                 8f7f419955cefa0b33a2ba316cba3659 then unmonitor
2989              if failed permission 755 then unmonitor
2990              if failed uid root then unmonitor
2991              if failed gid root then unmonitor
2992              if changed size then alert
2993              if changed timestamp then alert
2994              every 2 cycles
2995              alert bofh@bar with mail-format {from: monit@foo.bar.no}
2996              alert foo@bar on { checksum, size, timestamp, uid, gid }
2997              depends on datafs
2998
2999        check device datafs with path /dev/sdb1
3000              group server
3001              start program  = "/bin/mount /data"
3002              stop program  =  "/bin/umount /data"
3003              if failed permission 660 then unmonitor
3004              if failed uid root then unmonitor
3005              if failed gid disk then unmonitor
3006              if space usage > 80 % then alert
3007              if space usage > 94 % then stop
3008              if inode usage > 80 % then alert
3009              if inode usage > 94 % then stop
3010              alert root@localhost
3011
3012        check host ftp.redhat.com with address ftp.redhat.com
3013              if failed icmp type echo with timeout 15 seconds
3014                 then alert
3015              if failed port 21 protocol ftp
3016                 then exec "/usr/X11R6/bin/xmessage -display
3017                            :0 ftp connection failed"
3018              alert foo@bar.com
3019
3020        check host www.gnu.org with address www.gnu.org
3021              if failed port 80 protocol http
3022                 and request "/pub/gnu/bash/bash-2.05b.tar.gz"
3023                     with checksum 8f7f419955cefa0b33a2ba316cba3659
3024              then alert
3025              alert rms@gnu.org with mail-format {
3026                   subject: The gnu server may be hacked again! }
3027
3028       Note: only the check type and its pidfile/path/address statement are
3029       mandatory; the other statements are optional and the order of the
3030       optional statements is not important.
3031

MONIT WITH HEARTBEAT

3033       You can download heartbeat from http://www.linux-ha.org/download/. It
3034       might be useful to have a look at The Heartbeat Getting Started Guide
3035       at: http://www.linux-ha.org/GettingStarted.html
3036
3037       Starting up a Node
3038
3039       This is the normal start sequence for a cluster-node. With this
3040       sequence, there should be no error case that is not handled by either
3041       heartbeat or monit. For example, if monit dies, initd restarts it.
3042       If heartbeat dies, monit restarts it. If the node dies, the heartbeat
3043       instance on the other node detects it and restarts the services there.
3044
3045       1. initd starts monit with group local
3046       2. monit starts heartbeat in local group
3047       3. heartbeat requests monit to start the node group
3048       4. monit starts the node group
3049
3050       Monit: /etc/monitrc
3051
3052       This example describes a cluster with 2 nodes. Services running on Node
3053       1 are in the group node1 and Node 2 services are in the node2 group.
3054
3055       The local group entries are mode active, the node group entries are
3056       mode manual and controlled by heartbeat.
3057
3058        #
3059        # local services on both hosts
3060        #
3061
3062        check process heartbeat with pidfile /var/run/heartbeat.pid
3063              start program = "/etc/init.d/heartbeat start"
3064              stop  program = "/etc/init.d/heartbeat stop"
3065              mode  active
3066              alert foo@bar
3067              group local
3068
3069        check process postfix with pidfile /var/run/postfix/master.pid
3070              start program = "/etc/init.d/postfix start"
3071              stop program  = "/etc/init.d/postfix stop"
3072              mode  active
3073              alert foo@bar
3074              group local
3075
3076        #
3077        # node1 services
3078        #
3079
3080        check process apache with pidfile /var/apache/logs/httpd.pid
3081              start program = "/etc/init.d/apache start"
3082              stop program  = "/etc/init.d/apache stop"
3083              depends named
3084              alert foo@bar
3085              mode  manual
3086              group node1
3087
3088        check process named with pidfile /var/tmp/named.pid
3089              start program = "/etc/init.d/named start"
3090              stop program  = "/etc/init.d/named stop"
3091              alert foo@bar
3092              mode  manual
3093              group node1
3094
3095        #
3096        # node2 services
3097        #
3098
3099        check process named-slave with pidfile /var/tmp/named-slave.pid
3100              start program = "/etc/init.d/named-slave start"
3101              stop program  = "/etc/init.d/named-slave stop"
3102              mode  manual
3103              alert foo@bar
3104              group node2
3105
3106        check process squid with pidfile /var/squid/logs/squid.pid
3107              start program = "/etc/init.d/squid start"
3108              stop program  = "/etc/init.d/squid stop"
3109              depends named-slave
3110              alert foo@bar
3111              mode  manual
3112              group node2
3113
3114       initd:  /etc/inittab
3115
3116       Monit is started on both nodes with initd. You will need to add an
3117       entry in /etc/inittab to start monit with the same local group that
3118       heartbeat is a member of.
3119
3120        #/etc/inittab
3121        mo:2345:respawn:/usr/local/bin/monit -d 10 -c /etc/monitrc -g local
3122
3123       heartbeat:  /etc/ha.d/haresources
3124
3125       When heartbeat starts, it looks up the node entry and starts the
3126       script /etc/init.d/monit-node1 or /etc/init.d/monit-node2. The script
3127       calls monit to start the specific group per node.
3128
3129        # /etc/ha.d/haresources
3130        node1 IPaddr::172.16.100.1  monit-node1
3131        node2 IPaddr::172.16.100.2  monit-node2
3132
3133       /etc/init.d/monit-node1
3134
3135        #!/bin/bash
3136        #
3137        # sample script for starting/stopping all services on node1
3138        #
3139        prog="/usr/local/bin/monit -g node1"
3140        start()
3141        {
3142              echo -n $"Starting $prog:"
3143              $prog start all
3144              echo
3145        }
3146
3147        stop()
3148        {
3149              echo -n $"Stopping $prog:"
3150              $prog stop all
3151              echo
3152        }
3153
3154        case "$1" in
3155              start)
3156                   start;;
3157              stop)
3158                   stop;;
3159              *)
3160                   echo $"Usage: $0 {start|stop}"
3161                   RETVAL=1
3162        esac
3163        exit $RETVAL
3164
3165       Handling state
3166
3167       As mentioned elsewhere, monit saves its state to a state file. If the
3168       monit process should die, upon restart monit will read its last known
3169       state from this file. This can be a problem if monit is used in a
3170       cluster, as illustrated in this scenario:
3171
3172       1   The active node fails, the second takes over
3173
3174       2   After a reboot, the failed node comes back, monit reads its state
3175           file and starts all the services (even manual ones) as they were
3176           running before the failure. This is a problem because services will
3177           now run on both nodes.
3178
3179       The solution to this problem is to remove the monit.state file in an
3180       rc-script called at boot time and before monit is started.
3181
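       The clean-up step described above could be sketched as the following
       rc-script fragment. The function name clear_monit_state and the
       default path are assumptions made for illustration; use the state
       file location configured for your installation.

```shell
# Hypothetical rc-script fragment, meant to run at boot before monit starts.
# The default path below is an assumption; pass the real state file location
# as the first argument if it differs.
clear_monit_state() {
    state_file=${1:-$HOME/.monit.state}
    # -f: succeed silently when the file does not exist (e.g. first boot)
    rm -f -- "$state_file"
}
```

       Calling clear_monit_state before monit is launched means monit has no
       saved state to restore, so it should not start services based on the
       state from before the failure.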

FILES

3183       ~/.monitrc
3184          Default run control file
3185
3186       /etc/monitrc
3187          If the control file is not found in the default
3188          location and /etc contains a monitrc file, this
3189          file will be used instead.
3190
3191       ./monitrc
3192          If the control file is not found in either of the
3193          previous two locations, and the current working
3194          directory contains a monitrc file, this file is
3195          used instead.
3196
3197       ~/.monitrc.pid
3198          Lock file to help prevent concurrent runs (non-root
3199          mode).
3200
3201       /var/run/monit.pid
3202          Lock file to help prevent concurrent runs (root mode,
3203          Linux systems).
3204
3205       /etc/monit.pid
3206          Lock file to help prevent concurrent runs (root mode,
3207          systems without /var/run).
3208
3209       ~/.monit.state
3210          monit saves its state to this file and utilizes
3211          information found in this file to recover from
3212          a crash. This is a binary file and its content is
3213          only of interest to monit. You may set the location
3214          of this file in the monit control file or by using
3215          the -s switch when monit is started.
3216
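       The control file lookup order listed above can be illustrated with a
       small shell function. This is only a sketch of the documented order,
       not monit's actual implementation; the name find_monitrc and its
       optional directory arguments are hypothetical, and the -c switch is
       not modelled.

```shell
# Hypothetical illustration of the documented lookup order:
#   1. ~/.monitrc   2. /etc/monitrc   3. ./monitrc
# The three optional arguments override the directories searched.
find_monitrc() {
    home_dir=${1:-$HOME}
    etc_dir=${2:-/etc}
    work_dir=${3:-.}
    for candidate in "$home_dir/.monitrc" "$etc_dir/monitrc" "$work_dir/monitrc"; do
        if [ -f "$candidate" ]; then
            # Print the first match and stop searching
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    # No control file found in any of the three locations
    return 1
}
```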

ENVIRONMENT

3218       No environment variables are used by monit. However, when monit
3219       executes a script or a program, it sets several environment variables
3220       which can be utilized by the executable. The following and only the
3221       following environment variables are available:
3222
3223       MONIT_EVENT
3224           The event that occurred on the service
3225
3226       MONIT_SERVICE
3227           The name of the service (from monitrc) on which the event occurred.
3228
3229       MONIT_DATE
3230           The time and date (rfc 822 style) the event occurred
3231
3232       MONIT_HOST
3233           The host the event occurred on
3234
3235       The following environment variables are only available for process ser‐
3236       vice entries:
3237
3238       MONIT_PROCESS_PID
3239           The process pid. This may be 0 if the process was (re)started.
3240
3241       MONIT_PROCESS_MEMORY
3242           Process memory. This may be 0 if the process was (re)started.
3243
3244       MONIT_PROCESS_CHILDREN
3245           Process children. This may be 0 if the process was (re)started.
3246
3247       MONIT_PROCESS_CPU_PERCENT
3248           Process cpu%. This may be 0 if the process was (re)started.
3249
3250       In addition the following spartan PATH environment variable is avail‐
3251       able:
3252
3253       PATH=/bin:/usr/bin:/sbin:/usr/sbin
3254
3255       Scripts or programs that depend on other environment variables or on a
3256       more verbose PATH must provide means to set these variables
3257       themselves.
3258
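       As a minimal sketch, a script run from an exec action might read the
       variables above like this. The function name format_monit_event, the
       output layout and the fallback strings are assumptions made for
       illustration; only the MONIT_* variable names come from the list
       above.

```shell
#!/bin/sh
# Hypothetical hook for an exec action: build one log line from the
# variables monit sets. The fallback values are only used when the script
# is run outside monit (e.g. by hand for testing).
format_monit_event() {
    printf '%s host=%s service=%s event=%s\n' \
        "${MONIT_DATE:-unknown-date}" \
        "${MONIT_HOST:-unknown-host}" \
        "${MONIT_SERVICE:-unknown-service}" \
        "${MONIT_EVENT:-unknown-event}"
}
```

       Because PATH is limited to the four directories shown above, a script
       like this should call any other tools by absolute path or extend PATH
       itself.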

SIGNALS

3260       If a monit daemon is running, SIGUSR1 wakes it up from its sleep phase
3261       and forces a poll of all services. SIGTERM and SIGINT will gracefully
3262       terminate a monit daemon. The SIGTERM signal is sent to a monit daemon
3263       if monit is started with the quit action argument.
3264
3265       Sending a SIGHUP signal to a running monit daemon will force the daemon
3266       to reinitialize itself; specifically, it will reread its configuration
3267       and close and reopen its log files.
3268
3269       Running monit in foreground while a background monit daemon is running
3270       will wake up the daemon.
3271
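       The signals described above can be sent with kill(1) using the PID
       stored in monit's lock file. The helper below is a sketch under
       stated assumptions: the name signal_monit is hypothetical, and the
       default pid file path is the root-mode Linux location from the FILES
       section, which may differ on your system.

```shell
# Hypothetical helper: send a signal to the monit daemon named in a pid file.
#   $1 - signal name without the SIG prefix (USR1, HUP, TERM, INT)
#   $2 - pid file (assumed default: /var/run/monit.pid)
signal_monit() {
    sig=$1
    pidfile=${2:-/var/run/monit.pid}
    if [ ! -r "$pidfile" ]; then
        echo "cannot read pid file: $pidfile" >&2
        return 1
    fi
    pid=$(cat "$pidfile")
    kill -s "$sig" "$pid"
}
```

       For example, "signal_monit USR1" would wake the daemon and force a
       poll, "signal_monit HUP" would make it reinitialize, and
       "signal_monit TERM" would terminate it gracefully.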

NOTES

3273       This is a very silent program. Use the -v switch if you want to see
3274       what monit is doing, and tail -f the logfile. Optionally, for testing
3275       purposes, you can start monit with the -Iv switch. Monit will then
3276       print debug information to the console. To stop monit in this mode,
3277       simply press CTRL+C (i.e. SIGINT) in the same console.
3278
3279       The syntax (and parser) of the control file is inspired by Eric S.
3280       Raymond et al.'s excellent fetchmail program. Some portions of this man
3281       page also draw inspiration from the same authors.
3282

AUTHORS

3284       Jan-Henrik Haukeland <hauk@tildeslash.com>, Martin Pala
3285       <martinp@tildeslash.com>, Christian Hopp <chopp@iei.tu-clausthal.de>,
3286       Rory Toma <rory@digeo.com>
3287
3288       See also http://www.tildeslash.com/monit/who.html
3289

COPYRIGHT

3291       Copyright (C) 2000-2007 by the monit project group. All Rights
3292       Reserved. This product is distributed in the hope that it will be use‐
3293       ful, but WITHOUT any warranty; without even the implied warranty of
3294       MERCHANTABILITY or FITNESS for a particular purpose.
3295