WATCHDOG.CONF(5)              File Formats Manual             WATCHDOG.CONF(5)

NAME

       watchdog.conf - configuration file for the watchdog daemon

DESCRIPTION

       This file carries all configuration options for the Linux watchdog
       daemon. Each option must be written on its own line. Comments start
       with '#'. Blanks are ignored except after the '=' sign. Empty text
       after the '=' sign disables the feature, as long as that makes sense.
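
       A minimal sketch of these syntax rules, using two options described
       under OPTIONS. The values are illustrative only, not recommendations:

```conf
# A comment line; it starts with '#'.
interval = 10
# Empty text after '=' disables the feature; here, notification by email.
admin =
```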

OPTIONS

       interval = <interval>
              Set the highest possible interval between two writes to the
              watchdog device. The device is triggered after each check
              regardless of the time it took. After finishing all checks
              watchdog goes to sleep for a full cycle of <interval> seconds.
              Default value is 1 second. The kernel driver expects a write
              command every minute. Otherwise the system will be rebooted.
              Therefore an interval of more than a minute can only be used
              with the force command-line option [--force | -f].

       logtick = <logtick>
              If you enable verbose logging, a message is written to the
              syslog or a logfile. While this is useful, a message every
              interval is unnecessary, fills up the disk and costs CPU time.
              logtick sets the number of intervals skipped before a log
              message is written. If you use logtick = 60 and interval = 10,
              a message is written only every 10 minutes (600 seconds). This
              may make the exact time of a crash harder to find, but it
              greatly reduces disk usage and spares the administrator's
              nerves when looking for a particular syslog entry among the
              watchdog messages.

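              As a sketch of the arithmetic described for logtick, the
              fragment below uses the same values as that description. They
              are illustrative, not recommended settings:

```conf
# Check and write to the watchdog device every 10 seconds.
interval = 10
# With verbose logging on, log only every 60th interval:
# 60 * 10 s = 600 s, i.e. one message every 10 minutes.
logtick = 60
```
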
       max-load-1 = <load1>
              Set the maximal allowed load average for a 1 minute span. Once
              this load average is reached the system is rebooted. Default
              value is 0, which means the load average check is disabled. Be
              careful not to set this parameter too low. To set a value less
              than the predefined minimal value of 2, you have to use the -f
              command line option.

       max-load-5 = <load5>
              Set the maximal allowed load average for a 5 minute span. Once
              this load average is reached the system is rebooted. Default
              value is 3/4*max-load-1. Be careful not to set this parameter
              too low. To set a value less than the predefined minimal value
              of 2, you have to use the -f command line option.

       max-load-15 = <load15>
              Set the maximal allowed load average for a 15 minute span. Once
              this load average is reached the system is rebooted. Default
              value is 1/2*max-load-1. Be careful not to set this parameter
              too low. To set a value less than the predefined minimal value
              of 2, you have to use the -f command line option.

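              To make the default ratios concrete, the following sketch
              assumes a 1 minute limit of 24 (an arbitrary illustrative
              value) and writes out the 5 and 15 minute limits that the
              defaults above would derive from it:

```conf
# Reboot if the 1 minute load average reaches 24 (illustrative value).
max-load-1 = 24
# Left unset, the defaults described above would be derived from it:
#   max-load-5  = 3/4 * 24 = 18
#   max-load-15 = 1/2 * 24 = 12
# They are written out here only to make the numbers visible.
max-load-5 = 18
max-load-15 = 12
```
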
       min-memory = <minpage>
              Set the minimal amount of virtual memory that has to stay free.
              Note that this is in memory pages (4kB on x86). Default value is
              0 pages, which means this test is disabled. The page size is
              taken from the system include files. This is a 'passive' test
              and works by reading /proc/meminfo.

       allocatable-memory = <minpage>
              Set the minimum amount of allocatable memory available on the
              system. Note that this is in pages. Default value is 0 pages,
              which means the test is disabled. As with min-memory, the page
              size is taken from the system include files. This is an 'active'
              test and it works by attempting to memory-map a block of the
              configured size.

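              Because both values are given in pages rather than bytes, a
              short worked conversion may help. The sketch assumes the 4 kB
              page size mentioned for x86; the page counts themselves are
              hypothetical:

```conf
# Assuming 4 kB pages: 2560 pages * 4 kB = 10240 kB (10 MB) must stay free.
min-memory = 2560
# Assuming 4 kB pages: 256 pages * 4 kB = 1024 kB (1 MB) must be mappable.
allocatable-memory = 256
```

              On a system with a different page size the same page counts
              correspond to different amounts of memory.
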
       watchdog-device = <device>
              Set the watchdog device name, typically /dev/watchdog. Default
              is to disable keep alive support. This should be tested by
              running the daemon from the command line before configuring it
              to start automatically on booting.

       watchdog-timeout = <timeout>
              Set the watchdog device timeout during startup. If not set, a
              default is used that should be set to the kernel timer margin at
              compile time.

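              A possible pairing of the two options is sketched below. The
              timeout value is hypothetical and is assumed here to be in
              seconds, which the text above does not state:

```conf
# Device name as given in the description above.
watchdog-device = /dev/watchdog
# Hypothetical timeout, assumed to be in seconds.
watchdog-timeout = 60
```
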
       temperature-sensor = <temp-virtual-file>
              Set the temperature sensor name. This is normally a 'virtual
              file' under /sys and it contains the temperature in
              milli-Celsius. Usually these are generated by the sensors
              package, but take care as device enumeration may not be fixed.
              Default is to disable temperature checking. Multiple sensors
              can be used by having repeated temperature-sensor entries.

       max-temperature = <temp>
              Set the maximal allowed temperature. Once this temperature is
              reached the system is stopped. Default value is 90 C. Watchdog
              will issue warnings once the temperature reaches 90%, 95% and
              98% of this value.

       temp-power-off = <yes|no>
              Set the watchdog action on overheating. The yes option
              (default) is to power the machine off, the no option is to halt
              the machine and allow Ctrl-Alt-Del reboot.

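              The three temperature options might be combined as in the
              sketch below. The sensor file names are hypothetical and will
              differ between machines; the warning temperatures in the
              comments are simply 90%, 95% and 98% of the configured limit:

```conf
# Hypothetical sensor files; the actual names under /sys vary by machine.
temperature-sensor = /sys/class/hwmon/hwmon0/temp1_input
temperature-sensor = /sys/class/hwmon/hwmon1/temp1_input
# With a limit of 90 C (the default), warnings would be expected at
# 81 C (90%), 85.5 C (95%) and 88.2 C (98%).
max-temperature = 90
# Power off on overheating (the default) rather than halt.
temp-power-off = yes
```
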
       file = <filename>
              Set file name for file mode. This option can be given as often
              as you like to check several files.

       change = <mtime>
              Set the change interval time for file mode. This option always
              belongs to the active filename, that is, when finding a
              'change =' line watchdog assumes it belongs to the most
              recently read 'file =' line. They don't necessarily have to
              follow each other directly, but you cannot specify a 'change ='
              before a 'file ='. The default is to only stat the file and not
              look for changes. Using this feature to monitor changes in
              /var/log/messages might require some special syslog daemon
              configuration, e.g. rsyslog needs "$ActionWriteAllMarkMessages
              on" to be set to make sure the marks are written no matter what.

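              The pairing rule between 'file =' and 'change =' can be
              sketched as follows. The second file name and the change value
              are hypothetical, and the value is assumed to be in seconds:

```conf
# Only stat this file: no 'change =' line follows it.
file = /var/log/messages
# Hypothetical second file; the 'change =' line below belongs to it because
# it is the most recently read 'file =' line. The value 1407 is illustrative
# and assumed to be in seconds.
file = /var/log/syslog
change = 1407
```
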
       pidfile = <pidfilename>
              Set pidfile name for server test mode. This option can be given
              as often as you like to check several servers. See the Systemd
              section in watchdog(8) for more information.

       ping = <ip-addr>
              Set IPv4 address for ping mode. This option can be used more
              than once to check different connections.

       interface = <if-name>
              Set interface name for network mode. This option can be used
              more than once to check different interfaces. Note it is only
              possible to check physical interfaces, and not aliased IP
              interfaces.

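              A sketch of repeated entries for these checks follows. The
              pidfile path, the addresses (taken from the IPv4 documentation
              range) and the interface name are all placeholders:

```conf
# Hypothetical pidfile of a server to be checked.
pidfile = /var/run/sshd.pid
# Documentation-range IPv4 addresses standing in for real hosts.
ping = 192.0.2.1
ping = 192.0.2.2
# Hypothetical physical interface name.
interface = eth0
```
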
       test-binary = <testbin>
              Execute the given binary to do some user defined tests. With
              enforcing SELinux policy, use the /usr/libexec/watchdog/scripts/
              directory for your test-binary configuration.

       test-timeout = <timeout in seconds>
              User defined tests may only run for <timeout> seconds. Set to 0
              for unlimited.

       repair-binary = <repbin>
              Execute the given binary in case of a problem instead of
              shutting down the system. With enforcing SELinux policy, use
              the /usr/libexec/watchdog/scripts/ directory for your
              repair-binary configuration.

       repair-timeout = <timeout in seconds>
              The repair command may only run for <timeout> seconds. Set to 0
              for 'unlimited', but note that the hardware timer is not
              refreshed in this case so the system will hard-reset at some
              point.

       retry-timeout = <timeout in seconds>
              Allow most error conditions to persist for <timeout> seconds.
              Set to 0 for immediate action (like softboot behaviour).

       repair-maximum = <count>
              This allows no more than <count> repair attempts against a given
              fault that report success (i.e. return 0), but fail to clear the
              fault, before a reboot is initiated anyway. If set to zero then
              a repairable fault can always be blocked by a repair program
              reporting success (previous daemon behaviour).

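              One way the test and repair options could fit together is
              sketched below. The script names are hypothetical, and the
              timeouts and count are illustrative values only:

```conf
# Hypothetical script names under the directory named above.
test-binary = /usr/libexec/watchdog/scripts/check-service.sh
# Let the test run for at most 60 seconds.
test-timeout = 60
repair-binary = /usr/libexec/watchdog/scripts/repair-service.sh
# A finite limit, avoiding the 'unlimited' case described above.
repair-timeout = 60
# Tolerate most error conditions for up to 60 seconds before acting.
retry-timeout = 60
# Allow at most one repair attempt that reports success without
# clearing the fault before a reboot is initiated.
repair-maximum = 1
```
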
       admin = <mail-address>
              Email address to send admin mail to. That is, who shall be
              notified that the machine is being halted or rebooted. Default
              is 'root'. If you want to disable notification via email just
              set admin to an empty string.

       realtime = <yes|no>
              If set to yes watchdog will lock itself into memory so it is
              never swapped out.

       priority = <schedule priority>
              Set the schedule priority for realtime mode.

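              The two realtime options would typically appear together, as
              in this sketch. The priority value is hypothetical:

```conf
# Lock the daemon into memory so it is never swapped out.
realtime = yes
# Hypothetical scheduling priority for realtime mode.
priority = 1
```
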
       test-directory = <test directory>
              Set the directory to run user test/repair scripts. Default is
              '/etc/watchdog.d'. The /etc/watchdog.d/ directory is recognized
              by SELinux policy. See the Test Directory section in
              watchdog(8) for more information.

       log-dir = <log directory>
              Set the log directory to capture the standard output and
              standard error from repair-binary and test-binary execution.
              Default is '/var/log/watchdog'.

       sigterm-delay = <time in seconds>
              Set the time on shut down between first sending SIGTERM to all
              processes, and then sending SIGKILL. Default is 5 seconds which
              is generally enough, but systems with large databases or virtual
              machines might need longer.

       verbose = <yes|no>
              This overrides the command line --verbose option. Generally the
              verbose mode is only enabled for debugging as it creates a lot
              of syslog chatter, so use this option with consideration.
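
              For a system of the kind mentioned under sigterm-delay, a
              longer grace period might be configured as in this sketch. The
              value of 30 seconds is hypothetical:

```conf
# Hypothetical longer grace period between SIGTERM and SIGKILL (default: 5).
sigterm-delay = 30
# Directory for output of test and repair binaries; this is the default.
log-dir = /var/log/watchdog
```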

FILES

       /etc/watchdog.conf
              The watchdog configuration file.

       /etc/watchdog.d
              A directory containing test-or-repair commands. See the Test
              Directory section in watchdog(8) for more information.

SEE ALSO

       watchdog(8)

4th Berkeley Distribution        January 2016                 WATCHDOG.CONF(5)