WATCHDOG.CONF(5)              File Formats Manual             WATCHDOG.CONF(5)

NAME

       watchdog.conf - configuration file for the watchdog daemon

DESCRIPTION

       This file carries all configuration options for the Linux watchdog
       daemon. Each option has to be written on a line by itself. Comments
       start with '#'. Blanks are ignored except after the '=' sign. An empty
       text after the '=' sign disables the feature, as long as that makes
       sense.
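
       As an illustration of the syntax rules above, a minimal sketch of a
       configuration file might look as follows. The option values are
       assumptions chosen for the example, not recommendations.

```conf
# Comments start with '#'.
watchdog-device = /dev/watchdog
interval = 1
# An empty text after '=' disables the feature where that makes sense,
# here the admin mail notification.
admin =
```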

OPTIONS

       interval = <interval>
              Set the highest possible interval between two writes to the
              watchdog device. The device is triggered after each check
              regardless of the time it took. After finishing all checks
              watchdog goes to sleep for a full cycle of <interval> seconds.
              Default value is 1 second. The kernel driver expects a write
              command every minute; otherwise the system will be rebooted.
              Therefore an interval of more than a minute can only be used
              with the force command-line option [--force | -f].

       logtick = <logtick>
              If you enable verbose logging, a message is written to the
              syslog or a logfile. While this is useful, a message every
              interval is unnecessary, fills up the disk and costs CPU time.
              logtick sets the number of intervals skipped before a log
              message is written. If you use logtick = 60 and interval = 10,
              a message is written only every 10 minutes (600 seconds). This
              may make the exact time of a crash harder to find, but it
              greatly reduces disk usage and spares the administrator's
              nerves when looking for a particular syslog entry among the
              watchdog messages.
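
              The logtick example from the text can be written as the
              following fragment. It is only a sketch of that one
              combination; other values may suit a given system better.

```conf
# Check and write to the watchdog device every 10 seconds.
interval = 10
# With verbose logging enabled, log only every 60th interval:
# 60 * 10 s = 600 s, i.e. one message every 10 minutes.
logtick = 60
```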

       max-load-1 = <load1>
              Set the maximal allowed load average for a 1 minute span. Once
              this load average is reached the system is rebooted. Default
              value is 0, which means the load average check is disabled. Be
              careful not to set this parameter too low. To set a value less
              than the predefined minimal value of 2, you have to use the -f
              command line option.

       max-load-5 = <load5>
              Set the maximal allowed load average for a 5 minute span. Once
              this load average is reached the system is rebooted. Default
              value is 3/4*max-load-1. Be careful not to set this parameter
              too low. To set a value less than the predefined minimal value
              of 2, you have to use the -f command line option.

       max-load-15 = <load15>
              Set the maximal allowed load average for a 15 minute span. Once
              this load average is reached the system is rebooted. Default
              value is 1/2*max-load-1. Be careful not to set this parameter
              too low. To set a value less than the predefined minimal value
              of 2, you have to use the -f command line option.
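
              To make the default relationship between the three load
              options concrete, here is a sketch with an assumed 1-minute
              limit of 24. The number is illustrative only; a suitable limit
              depends on the machine.

```conf
# Reboot when the 1-minute load average reaches 24 (illustrative value).
max-load-1 = 24
# Written out explicitly here; these equal the documented defaults
# 3/4 * max-load-1 = 18 and 1/2 * max-load-1 = 12.
max-load-5 = 18
max-load-15 = 12
```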

       min-memory = <minpage>
              Set the minimal amount of virtual memory that has to stay free.
              Note that this is in memory pages (4kB on x86). Default value is
              0 pages, which means this test is disabled. The page size is
              taken from the system include files. This is a 'passive' test
              and works by reading /proc/meminfo.

       allocatable-memory = <minpage>
              Set the minimum amount of allocatable memory available on the
              system. Note that this is in pages. Default value is 0 pages,
              which means the test is disabled. As with min-memory, the page
              size is taken from the system include files. This is an 'active'
              test and it works by attempting to memory-map a block of the
              configured size.
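
              Because both values are given in pages, a short worked example
              may help. The sketch below assumes the 4 kB page size mentioned
              for x86; the thresholds themselves are illustrative.

```conf
# Assuming 4 kB pages: 25600 pages * 4 kB = 102400 kB (100 MiB) must stay free.
min-memory = 25600
# Assuming 4 kB pages: the active test attempts to memory-map
# 2560 pages * 4 kB = 10240 kB (10 MiB).
allocatable-memory = 2560
```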

       watchdog-device = <device>
              Set the watchdog device name, typically /dev/watchdog. Default
              is to disable keep alive support. This should be tested by
              running the daemon from the command line before configuring it
              to start automatically on booting.

       watchdog-refresh-use-settimeout = <auto|yes|no>
              Refresh the watchdog timer by setting its timeout instead of
              using a normal watchdog refresh operation. This might help if
              your watchdog trips by itself when the first timeout interval
              elapses. Default is 'auto' for the IT87 fix-up, but this can be
              disabled with 'no' or forced for other modules with 'yes'.

       watchdog-timeout = <timeout>
              Set the watchdog device timeout during startup. If not set, a
              default is used that should be set to the kernel timer margin at
              compile time.
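
              A sketch of the device-related options taken together. The
              timeout value is illustrative, and its unit is assumed to be
              seconds, which this page does not state explicitly.

```conf
watchdog-device = /dev/watchdog
# Illustrative timeout; the unit is assumed to be seconds.
watchdog-timeout = 60
# 'auto' is the documented default and is written out only for clarity.
watchdog-refresh-use-settimeout = auto
```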

       temperature-sensor = <temp-virtual-file>
              Set the temperature sensor name. This is normally a 'virtual
              file' under /sys and it contains the temperature in
              milli-Celsius. Usually these are generated by the sensors
              package, but take care as device enumeration may not be fixed.
              Default is to disable temperature checking. Multiple sensors can
              be used by having repeated temperature-sensor entries.

       max-temperature = <temp>
              Set the maximal allowed temperature. Once this temperature is
              reached the system is stopped. Default value is 90 C. Watchdog
              will issue warnings once the temperature reaches 90%, 95% and
              98% of this value.

       temp-power-off = <yes|no>
              Set the watchdog action on overheating. 'yes' (the default)
              powers the machine off; 'no' halts the machine and allows a
              Ctrl-Alt-Del reboot.
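
              The temperature options could be combined as in the sketch
              below. The sensor path is hypothetical and differs between
              systems, and the 80 C limit is an assumed value.

```conf
# Hypothetical sensor path; check which files exist under /sys on your system.
temperature-sensor = /sys/class/hwmon/hwmon0/temp1_input
# Stop at 80 C; warnings are then due at 72 C (90%), 76 C (95%)
# and 78.4 C (98%).
max-temperature = 80
# Halt instead of powering off, so that Ctrl-Alt-Del can reboot the machine.
temp-power-off = no
```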

       file = <filename>
              Set file name for file mode. This option can be given as often
              as you like to check several files.

       change = <mtime>
              Set the change interval time for file mode. This option always
              belongs to the active filename; that is, when finding a
              'change =' line watchdog assumes it belongs to the most recently
              read 'file =' line. They don't necessarily have to follow each
              other directly, but you cannot specify a 'change =' before a
              'file ='. The default is to only stat the file and not look for
              changes. Using this feature to monitor changes in
              /var/log/messages might require some special syslog daemon
              configuration, e.g. rsyslog needs "$ActionWriteAllMarkMessages
              on" to be set to make sure the marks are written no matter what.
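
              The pairing of 'file =' and 'change =' lines might look like
              the sketch below. The first file name is hypothetical, and the
              change value is illustrative; its unit is not stated on this
              page and is assumed here to be seconds.

```conf
# Hypothetical file that is only checked with stat (no 'change =' line).
file = /var/lib/example/heartbeat
# The 'change =' line belongs to the most recently read 'file =' line,
# which here is /var/log/messages.
file = /var/log/messages
change = 1407
```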

       pidfile = <pidfilename>
              Set pidfile name for server test mode. This option can be given
              as often as you like to check several servers. See the Systemd
              section in watchdog(8) for more information.

       ping = <ip-addr>
              Set IPv4 address for ping mode. This option can be used more
              than once to check different connections.

       interface = <if-name>
              Set interface name for network mode. This option can be used
              more than once to check different interfaces. Note that it is
              only possible to check physical interfaces, not aliased IP
              interfaces.
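
              A sketch of the server and network checks, showing how
              repeatable options are simply listed several times. The
              pidfile path, addresses and interface name are hypothetical.

```conf
# Hypothetical pidfile of a server to watch.
pidfile = /var/run/sshd.pid
# Hypothetical IPv4 addresses; each 'ping =' line adds one check.
ping = 192.168.1.1
ping = 192.168.1.100
# Hypothetical physical interface; aliased IP interfaces cannot be checked.
interface = eth0
```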

       test-binary = <testbin>
              Execute the given binary to do some user defined tests. With an
              enforcing SELinux policy, please use the
              /usr/libexec/watchdog/scripts/ directory for your test-binary
              configuration.

       test-timeout = <timeout in seconds>
              User defined tests may only run for <timeout> seconds. Set to 0
              for unlimited.

       repair-binary = <repbin>
              Execute the given binary in case of a problem instead of
              shutting down the system. With an enforcing SELinux policy,
              please use the /usr/libexec/watchdog/scripts/ directory for your
              repair-binary configuration.

       repair-timeout = <timeout in seconds>
              The repair command may only run for <timeout> seconds. Set to 0
              for 'unlimited', but note that the hardware timer is not
              refreshed in this case, so the system will hard-reset at some
              point.

       retry-timeout = <timeout in seconds>
              Allow most error conditions to persist for <timeout> seconds.
              Set to 0 for immediate action (like softboot behaviour).

       repair-maximum = <count>
              Allow no more than <count> repair attempts against a given
              fault that report success (i.e. return 0) but fail to clear the
              fault, before a reboot is initiated anyway. If set to zero, a
              repairable fault can always be blocked by a repair program
              reporting success (previous daemon behaviour).
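
              The test and repair options above might be combined as in the
              following sketch. The two script names are hypothetical, and
              all timeouts and counts are illustrative values.

```conf
# Hypothetical script names in the directory suggested for SELinux.
test-binary = /usr/libexec/watchdog/scripts/check-service
test-timeout = 60
repair-binary = /usr/libexec/watchdog/scripts/repair-service
repair-timeout = 60
# Tolerate most error conditions for up to 60 seconds before acting.
retry-timeout = 60
# Allow one repair that reports success but fails to clear the fault,
# then reboot anyway.
repair-maximum = 1
```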

       admin = <mail-address>
              Email address to send admin mail to, that is, who shall be
              notified that the machine is being halted or rebooted. Default
              is 'root'. If you want to disable notification via email, just
              set admin to an empty string.

       realtime = <yes|no>
              If set to yes, watchdog will lock itself into memory so it is
              never swapped out.

       priority = <schedule priority>
              Set the schedule priority for realtime mode.
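
              A sketch of realtime mode with email notification left at its
              default recipient. The priority value is an assumption for
              illustration; this page does not state the permitted range.

```conf
# 'root' is the documented default; an empty value would disable the mail.
admin = root
# Lock the daemon into memory so it is never swapped out.
realtime = yes
# Illustrative schedule priority for realtime mode.
priority = 1
```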

       test-directory = <test directory>
              Set the directory to run user test/repair scripts. Default is
              '/etc/watchdog.d'. The /etc/watchdog.d/ directory is recognized
              by SELinux policy. See the Test Directory section in watchdog(8)
              for more information.

       log-dir = <log directory>
              Set the log directory to capture the standard output and
              standard error from repair-binary and test-binary execution.
              Default is '/var/log/watchdog'.

       sigterm-delay = <time in seconds>
              Set the time on shutdown between first sending SIGTERM to all
              processes and then sending SIGKILL. Default is 5 seconds, which
              is generally enough, but systems with large databases or virtual
              machines might need longer.

       verbose = <yes|no>
              This overrides the command line --verbose option. Generally the
              verbose mode is only enabled for debugging, as it creates a lot
              of syslog chatter, so use this option with consideration.
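
              The remaining options could be set as sketched below. The two
              directories are the documented defaults, written out for
              clarity; the 30 second delay is an assumed value for a system
              that shuts down slowly.

```conf
test-directory = /etc/watchdog.d
log-dir = /var/log/watchdog
# Wait 30 seconds between SIGTERM and SIGKILL instead of the default 5.
sigterm-delay = 30
verbose = no
```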

FILES

       /etc/watchdog.conf
              The watchdog configuration file.

       /etc/watchdog.d
              A directory containing test-or-repair commands. See the Test
              Directory section in watchdog(8) for more information.

SEE ALSO

       watchdog(8)

4th Berkeley Distribution        January 2016                 WATCHDOG.CONF(5)