WATCHDOG.CONF(5)              File Formats Manual             WATCHDOG.CONF(5)

NAME
       watchdog.conf - configuration file for the watchdog daemon

DESCRIPTION
       This file carries all configuration options for the Linux watchdog
       daemon. Each option has to be written on a line of its own. Comments
       start with '#'. Blanks are ignored except after the '=' sign. An
       empty text after the '=' sign disables the feature where that makes
       sense.
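
       As an illustration of these syntax rules, a minimal fragment might
       look like the following. The values are examples only, not
       recommendations.

```conf
# Lines starting with '#' are comments.
interval = 10
watchdog-device = /dev/watchdog
# An empty text after '=' disables the feature (here: mail notification).
admin =
```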

OPTIONS
       interval = <interval>
              Set the highest possible interval between two writes to the
              watchdog device. The device is triggered after each check
              regardless of the time it took. After finishing all checks
              watchdog goes to sleep for a full cycle of <interval> seconds.
              Default value is 1 second. The kernel drivers expect a write
              command at least once a minute. Otherwise the system will be
              rebooted. Therefore an interval of more than a minute can only
              be used with the force command-line option [--force | -f].

       logtick = <logtick>
              If you enable verbose logging, a message is written into the
              syslog or a logfile. While this is nice, it is not necessary
              to get a message every interval, which fills up the disk and
              costs CPU time. logtick allows adjustment of the number of
              intervals skipped before a log message is written. If you use
              logtick = 60 and interval = 10, a message is written only
              every 10 minutes (600 seconds). This may make the exact time
              of a crash harder to find, but it greatly reduces disk usage
              and spares the administrator's nerves when looking for a
              particular syslog entry among the watchdog messages.
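
              The combination described above could be written as follows.
              This is only a sketch of that one example. Comments are kept
              on their own lines because only lines starting with '#' are
              documented as comments.

```conf
# Write to the watchdog device at most every 10 seconds.
interval = 10
# Log only every 60th interval: 60 * 10 s = 600 s = 10 minutes.
logtick = 60
```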

       max-load-1 = <load1>
              Set the maximum allowed load average for a 1 minute span.
              Once this load average is reached the system is rebooted.
              Default value is 0. That means the load average check is
              disabled. Be careful not to set this parameter too low. To
              set a value less than the predefined minimal value of 2, you
              have to use the -f command line option.

       max-load-5 = <load5>
              Set the maximum allowed load average for a 5 minute span.
              Once this load average is reached the system is rebooted.
              Default value is 3/4*max-load-1. Be careful not to set this
              parameter too low. To set a value less than the predefined
              minimal value of 2, you have to use the -f command line
              option.

       max-load-15 = <load15>
              Set the maximum allowed load average for a 15 minute span.
              Once this load average is reached the system is rebooted.
              Default value is 1/2*max-load-1. Be careful not to set this
              parameter too low. To set a value less than the predefined
              minimal value of 2, you have to use the -f command line
              option.
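
              A possible set of load limits is sketched below. The numbers
              are illustrative and suitable values depend on the machine.
              The last two lines simply spell out the documented defaults
              of 3/4 and 1/2 of max-load-1.

```conf
# Illustrative thresholds only.
max-load-1 = 24
# 3/4 * 24 = 18
max-load-5 = 18
# 1/2 * 24 = 12
max-load-15 = 12
```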

       min-memory = <minpage>
              Set the minimal amount of virtual memory that has to stay
              free. Note that this is in memory pages (4 kB on x86).
              Default value is 0 pages which means this test is disabled.
              The page size is taken from the system include files. This
              is a 'passive' test and works by reading /proc/meminfo.

       allocatable-memory = <minpage>
              Set the minimum amount of allocatable memory available on the
              system. Note that this is in pages. Default value is 0 pages
              which means the test is disabled. As with min-memory, the
              page size is taken from the system include files. This is an
              'active' test and it works by attempting to memory-map a
              block of the configured size.
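
              Because both values are given in pages, a size in bytes has
              to be converted first. The sketch below assumes a page size
              of 4 kB, which may not hold on your architecture, and uses
              arbitrary example sizes.

```conf
# Assuming 4 kB pages: 64 MiB = 65536 kB / 4 kB = 16384 pages.
min-memory = 16384
# Assuming 4 kB pages: 1 MiB = 1024 kB / 4 kB = 256 pages.
allocatable-memory = 256
```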

       watchdog-device = <device>
              Set the watchdog device name, typically /dev/watchdog.
              Default is to disable keep-alive support. This should be
              tested by running the daemon from the command line before
              configuring it to start automatically on booting.

       watchdog-refresh-use-settimeout = <auto|yes|no>
              Refresh the watchdog timer by setting its timeout instead of
              using a normal watchdog refresh operation. Might help if your
              watchdog trips by itself when the first timeout interval
              elapses. Default is 'auto' for the IT87 fix-up, but this can
              be disabled with 'no' or forced for other modules with 'yes'.

       watchdog-timeout = <timeout>
              Set the watchdog device timeout during startup. If not set, a
              default is used that should be set to the kernel timer margin
              at compile time.
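
              A device setup might look like the sketch below. The timeout
              value is an example, and it is assumed here that the timeout
              is given in seconds, which this page does not state.

```conf
watchdog-device = /dev/watchdog
# Keep the default refresh behaviour ('auto').
watchdog-refresh-use-settimeout = auto
# Example value; the unit is assumed to be seconds.
watchdog-timeout = 60
```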

       temperature-sensor = <temp-virtual-file>
              Set the temperature sensor name. This is normally a 'virtual
              file' under /sys and it contains the temperature in
              milli-Celsius. Usually these are generated by the sensors
              package, but take care as device enumeration may not be
              fixed. Default is to disable temperature checking. Multiple
              sensors can be used by having repeated temperature-sensor
              entries.

       max-temperature = <temp>
              Set the maximum allowed temperature. Once this temperature is
              reached the system is stopped. Default value is 90 C.
              Watchdog will issue warnings once the temperature reaches
              90%, 95% and 98% of this temperature.

       temp-power-off = <yes|no>
              Set the watchdog action on overheating. The 'yes' option
              (default) powers the machine off; the 'no' option halts the
              machine and allows a Ctrl-Alt-Del reboot.
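
              The temperature options could be combined as sketched below.
              The sensor paths are hypothetical and have to be replaced by
              files that actually exist on your system. With a limit of
              90 C, the warnings would be issued at 81 C, 85.5 C and
              88.2 C.

```conf
# Hypothetical sensor paths; device enumeration may differ between boots.
temperature-sensor = /sys/class/hwmon/hwmon0/temp1_input
temperature-sensor = /sys/class/hwmon/hwmon1/temp1_input
max-temperature = 90
# Power off on overheating instead of halting.
temp-power-off = yes
```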

       file = <filename>
              Set file name for file mode. This option can be given as
              often as you like to check several files.

       change = <mtime>
              Set the change interval time for file mode. This option
              always belongs to the active filename, that is, when finding
              a 'change =' line watchdog assumes it belongs to the most
              recently read 'file =' line. They don't necessarily have to
              follow each other directly, but you cannot specify a
              'change =' before a 'file ='. The default is to only stat the
              file and not look for changes. Using this feature to monitor
              changes in /var/log/messages might require some special
              syslog daemon configuration, e.g. rsyslog needs
              "$ActionWriteAllMarkMessages on" to be set to make sure the
              marks are written no matter what.
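
              The pairing of 'file =' and 'change =' lines can be sketched
              as follows. The change value is an arbitrary example, its
              unit is assumed to be seconds, and the second file name is
              hypothetical.

```conf
# Checked for changes: the 'change =' line belongs to this file.
file = /var/log/messages
change = 1407
# Hypothetical second file; only stat is used because no 'change =' follows.
file = /var/run/example.status
```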

       pidfile = <pidfilename>
              Set pidfile name for server test mode. This option can be
              given as often as you like to check several servers. See the
              Systemd section in watchdog(8) for more information.

       ping = <ip-addr>
              Set IPv4 address for ping mode. This option can be used more
              than once to check different connections.

       interface = <if-name>
              Set interface name for network mode. This option can be used
              more than once to check different interfaces. Note it is only
              possible to check physical interfaces, and not aliased IP
              interfaces.
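
              A sketch combining the server and network checks follows.
              The pidfile path, the addresses (taken from the documentation
              range 192.0.2.0/24) and the interface name are placeholders.

```conf
# Placeholder pidfile of a server that should stay alive.
pidfile = /var/run/rsyslogd.pid
# Placeholder IPv4 addresses; repeat the option for each connection.
ping = 192.0.2.1
ping = 192.0.2.2
# Placeholder name of a physical interface.
interface = eth0
```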

       test-binary = <testbin>
              Execute the given binary to do some user defined tests. With
              an enforcing SELinux policy, please use the
              /usr/libexec/watchdog/scripts/ directory for your test-binary
              configuration.

       test-timeout = <timeout in seconds>
              User defined tests may only run for <timeout> seconds. Set to
              0 for unlimited.

       repair-binary = <repbin>
              Execute the given binary in case of a problem instead of
              shutting down the system. With an enforcing SELinux policy,
              please use the /usr/libexec/watchdog/scripts/ directory for
              your repair-binary configuration.

       repair-timeout = <timeout in seconds>
              The repair command may only run for <timeout> seconds. Set to
              0 for 'unlimited', but note that the hardware timer is not
              refreshed in this case so the system will hard-reset at some
              point.

       retry-timeout = <timeout in seconds>
              Allow most error conditions to persist for <timeout> seconds.
              Set to 0 for immediate action (like softboot behaviour).

       repair-maximum = <count>
              This allows no more than <count> repair attempts against a
              given fault that report success (i.e. return 0), but fail to
              clear the fault, before a reboot is initiated anyway. If set
              to zero then a repairable fault can always be blocked by a
              repair program reporting success (previous daemon behaviour).
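
              The test and repair options are usually set together. The
              following sketch uses hypothetical script names and example
              timeouts.

```conf
# Hypothetical scripts in the directory recommended for SELinux.
test-binary = /usr/libexec/watchdog/scripts/check-service
test-timeout = 60
repair-binary = /usr/libexec/watchdog/scripts/repair-service
repair-timeout = 60
# Let most errors persist for up to 60 seconds before acting.
retry-timeout = 60
# Reboot after one "successful" repair that did not clear the fault.
repair-maximum = 1
```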

       admin = <mail-address>
              Email address to send admin mail to. That is, who shall be
              notified that the machine is being halted or rebooted.
              Default is 'root'. If you want to disable notification via
              email just set admin to an empty string.

       realtime = <yes|no>
              If set to yes watchdog will lock itself into memory so it is
              never swapped out.

       priority = <schedule priority>
              Set the schedule priority for realtime mode.
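
              Realtime mode and its priority might be enabled as sketched
              below. The priority value is an example only.

```conf
# Lock the daemon into memory so it is never swapped out.
realtime = yes
# Example schedule priority for realtime mode.
priority = 1
```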

       test-directory = <test directory>
              Set the directory to run user test/repair scripts. Default is
              '/etc/watchdog.d'. The /etc/watchdog.d/ directory is
              recognized by the SELinux policy. See the Test Directory
              section in watchdog(8) for more information.

       log-dir = <log directory>
              Set the log directory to capture the standard output and
              standard error from repair-binary and test-binary execution.
              Default is '/var/log/watchdog'.

       sigterm-delay = <time in seconds>
              Set the time on shut down between first sending SIGTERM to
              all processes, and then sending SIGKILL. Default is 5 seconds
              which is generally enough, but systems with large databases
              or virtual machines might need longer.

       verbose = <yes|no>
              This overrides the command line --verbose option. Verbose
              mode is generally only enabled for debugging as it creates a
              lot of syslog chatter, so use this option with consideration.
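
              The last few options could be set as in the sketch below. The
              two directories are the documented defaults and the delay is
              an arbitrary example for a system that shuts down slowly.

```conf
test-directory = /etc/watchdog.d
log-dir = /var/log/watchdog
# Allow 30 seconds between SIGTERM and SIGKILL instead of the default 5.
sigterm-delay = 30
verbose = no
```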

FILES
       /etc/watchdog.conf
              The watchdog configuration file

       /etc/watchdog.d
              A directory containing test-or-repair commands. See the Test
              Directory section in watchdog(8) for more information.

SEE ALSO
       watchdog(8)

4th Berkeley Distribution        January 2016                WATCHDOG.CONF(5)