WDMD(8)                     System Manager's Manual                    WDMD(8)

NAME

       wdmd - watchdog multiplexing daemon

SYNOPSIS

       wdmd [OPTIONS]

DESCRIPTION

       This daemon opens /dev/watchdog and allows multiple independent sources
       to determine whether each KEEPALIVE is done.  Every test interval (10
       seconds), the daemon tests each source.  If any test fails, the
       KEEPALIVE is not done.  In a standard configuration, the watchdog timer
       will reset the system if no KEEPALIVE is done for 60 seconds ("fire
       timeout").  This means that if a single test fails 5-6 times in a row,
       the watchdog will fire and reset the system.  With multiple test
       sources, back-to-back failures of different sources have the same
       effect, so a reset can occur even though no single source failed that
       many times in a row, e.g.

       T seconds, P pass, F fail
       T00: test1 P, test2 P, test3 P: KEEPALIVE done
       T10: test1 F, test2 F, test3 P: KEEPALIVE skipped
       T20: test1 F, test2 P, test3 P: KEEPALIVE skipped
       T30: test1 P, test2 F, test3 P: KEEPALIVE skipped
       T40: test1 P, test2 P, test3 F: KEEPALIVE skipped
       T50: test1 F, test2 F, test3 P: KEEPALIVE skipped
       T60: test1 P, test2 F, test3 P: KEEPALIVE skipped
       T60: watchdog fires, system resets

       (Depending on timings, the system may be reset shortly before T60, in
       which case the tests at T60 would not be run.)
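       The timeline above can be reproduced with a small model.  The Python
       sketch below is illustrative only and is not taken from wdmd: the
       name simulate and the rule that the watchdog fires as soon as 60
       seconds have passed without a KEEPALIVE are simplifying assumptions.
       The 10 and 60 second values come from the text.

```python
TEST_INTERVAL = 10   # seconds between test rounds (from the text)
FIRE_TIMEOUT = 60    # seconds without a KEEPALIVE before the watchdog fires

P, F = True, False   # pass / fail, as in the timeline


def simulate(rounds):
    """Return the time (in seconds) at which the watchdog fires, or None.

    rounds holds one entry per test interval; each entry lists the result
    of every test source for that interval (True = pass, False = fail).
    """
    last_keepalive = 0
    for i, results in enumerate(rounds):
        now = i * TEST_INTERVAL
        if all(results):
            last_keepalive = now   # every source passed: KEEPALIVE done
        elif now - last_keepalive >= FIRE_TIMEOUT:
            return now             # no KEEPALIVE for the whole fire timeout
    return None


# The T00..T60 rows of the timeline, in the order test1, test2, test3.
timeline = [
    [P, P, P],
    [F, F, P],
    [F, P, P],
    [P, F, P],
    [P, P, F],
    [F, F, P],
    [P, F, P],
]
print(simulate(timeline))
```

       In this model the call returns 60 although no single source fails
       more than twice in a row, because every round after T00 contains at
       least one failure.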

       A crucial aspect of the design and function of wdmd is that if any
       single source does not pass its tests for the duration of the fire
       timeout, the watchdog is guaranteed to fire, regardless of whether
       other sources on the system have passed or failed.  A spurious reset
       caused by the combined effects of multiple failing tests, as shown
       above, is an accepted side effect.

       The wdmd init script will load the softdog module if no other watchdog
       module has been loaded.

       wdmd cannot be used on a system with any other program that needs to
       open /dev/watchdog, e.g. watchdog(8).

   Test Source: clients
       Using libwdmd, programs connect to wdmd via a unix socket and send
       regular messages to wdmd to update an expiry time for their connection.
       Every test interval, wdmd checks whether the expiry time for a
       connection has been reached.  If so, the test for that client fails.
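       The expiry check can be pictured with a short Python model.  This is
       a hedged sketch of the idea only: it does not use libwdmd, and the
       class Connection and its methods renew and passes are hypothetical
       names introduced for illustration.

```python
class Connection:
    """Hypothetical stand-in for one client connection known to the daemon."""

    def __init__(self, expire):
        self.expire = expire        # absolute time (seconds) of expiry

    def renew(self, now, lifetime):
        """Model of a client message: move the expiry time forward."""
        self.expire = now + lifetime

    def passes(self, now):
        """Model of the daemon's test: fail once expiry has been reached."""
        return now < self.expire


conn = Connection(expire=30)
print(conn.passes(20))              # expiry not yet reached
print(conn.passes(30))              # expiry reached: the test fails
conn.renew(now=30, lifetime=30)     # client updates its expiry time
print(conn.passes(40))              # passing again until time 60
```

       A client that stops sending updates keeps failing this test in every
       later interval, so it eventually causes the watchdog to fire.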

   Test Source: scripts
       wdmd runs scripts from a designated directory every test interval.
       If a script exits with 0, the test is considered a success; any other
       exit status is a failure.  A script that has not exited by the end of
       the test interval is also considered a failure.
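       A test script only has to report its result through its exit status.
       The Python sketch below shows one possible check, assuming a service
       that touches a heartbeat file; the path, the 30 second limit and the
       function name heartbeat_status are illustrative assumptions, not part
       of wdmd.

```python
#!/usr/bin/env python3
import os
import sys
import time

HEARTBEAT_PATH = "/run/myservice/heartbeat"  # hypothetical file, an assumption
MAX_AGE = 30                                  # seconds, also an assumption


def heartbeat_status(path, max_age, now=None):
    """Return 0 if path was modified within max_age seconds, otherwise 1."""
    if now is None:
        now = time.time()
    try:
        age = now - os.stat(path).st_mtime
    except OSError:
        return 1        # a missing or unreadable file counts as a failure
    return 0 if age <= max_age else 1


# As a test script, the file would end with:
#     sys.exit(heartbeat_status(HEARTBEAT_PATH, MAX_AGE))
```

       The check should finish well within the test interval, since a script
       that is still running at the end of the interval counts as a failure.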

OPTIONS

       --version, -V
                Print version.

       --help, -h
                Print usage.

       --dump, -d
                Print debug information from the daemon.

       --probe, -p
                Print path of functional watchdog device.  Exit code 0
                indicates a functional device was found.  Exit code 1
                indicates a functional device was not found.

       -D
                Enable debugging to stderr and don't fork.

       -H 0|1
                Enable (1) or disable (0) high priority features such as
                realtime scheduling priority and mlockall.

       -G name
                Group ownership for the socket.

       -S 0|1
                Enable (1) or disable (0) script tests.

       -s path
                Path to scripts dir.

       -k num
                Kill unfinished scripts after num seconds.

       -w path
                The path to the watchdog device to try first.
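       The exit codes of --probe make it easy to check for a watchdog device
       from another program.  The Python sketch below is one way to do so;
       the function name probe_watchdog is hypothetical, and it assumes that
       the device path is printed on standard output.

```python
import subprocess


def probe_watchdog(cmd=("wdmd", "--probe")):
    """Run the probe command; return the device path, or None if none found."""
    try:
        result = subprocess.run(list(cmd), capture_output=True, text=True)
    except FileNotFoundError:
        return None                 # the command itself is not installed
    if result.returncode != 0:
        return None                 # non-zero exit: no functional device
    # Assumption: the path is written to standard output.
    return result.stdout.strip() or None
```

       A caller can then test the return value, for example
       "if probe_watchdog() is None", before starting the daemon.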

                                  2011-08-01                           WDMD(8)