fenced(8)               System Manager's Manual               fenced(8)

NAME
fenced - the I/O Fencing daemon

SYNOPSIS
fenced [OPTION]...

DESCRIPTION
The fencing daemon, fenced, should be run on every node that will use
CLVM or GFS. It should be started after the node has joined the CMAN
cluster. (fenced is only used with CMAN; it is not used with
GULM/SLM/RLM.) A node that is not running fenced is not permitted to
mount GFS file systems.

All fencing daemons running in the cluster form a group called the
"fence domain". Any member of the fence domain that fails is fenced by
a remaining domain member. The actual fencing does not occur unless
the cluster has quorum, so if a node failure causes the loss of quorum,
the failed node will not be fenced until quorum has been regained. If
a failed domain member (due to be fenced) rejoins the cluster before
the actual fencing operation is carried out, the fencing operation is
bypassed.
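
As an illustration only, the quorum and rejoin rules above can be sketched
in Python. The function name and the returned labels are hypothetical and
are not part of fenced; the sketch models the decision described in the
text, not the daemon's code.

```python
def fencing_action(has_quorum: bool, victim_rejoined: bool) -> str:
    """Model what happens to a failed fence domain member (illustrative only)."""
    if victim_rejoined:
        # The node rejoined the cluster before fencing was carried out.
        return "bypass"
    if not has_quorum:
        # Fencing is held back until the cluster regains quorum.
        return "defer"
    return "fence"
```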

The fencing daemon depends on CMAN for cluster membership information
and on CCS to provide cluster.conf information. The fencing daemon
calls fencing agents according to cluster.conf information.

Node failure
When a domain member fails, the actual fencing must be completed before
GFS recovery can begin. This means any delay in carrying out the
fencing operation will also delay the completion of GFS file system
operations; most file system operations will hang during this period.

When a domain member fails, the actual fencing operation can be delayed
by a configurable number of seconds (post_fail_delay or -f). Within
this time the failed node can rejoin the cluster to avoid being fenced.
This delay is 0 by default to minimize the time that applications using
GFS are stalled by recovery. A delay of -1 causes the fence daemon to
wait indefinitely for the failed node to rejoin the cluster. In this
case the node is not fenced and all recovery must wait until the failed
node rejoins the cluster.
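
The effect of post_fail_delay can be sketched as a small Python model. This
is a minimal sketch under stated assumptions: the function and parameter
names are hypothetical, quorum is assumed to be held throughout, and a
rejoin counts as "within the delay" only if it happens strictly before the
delay expires.

```python
from typing import Optional


def is_fenced_after_failure(post_fail_delay: int,
                            rejoin_after: Optional[float]) -> bool:
    """Return True if a failed domain member ends up being fenced.

    post_fail_delay: seconds to wait before fencing; -1 waits indefinitely.
    rejoin_after:    seconds after the failure at which the node rejoins
                     the cluster, or None if it never rejoins.
    """
    if post_fail_delay == -1:
        # The node is never fenced; recovery waits for it to rejoin.
        return False
    if rejoin_after is None:
        return True
    # Rejoining within the delay avoids fencing (assumed strict comparison).
    return rejoin_after >= post_fail_delay
```

With the default delay of 0 the model fences a node that never rejoins
immediately, which matches the goal of keeping GFS recovery stalls short.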

Domain startup
When the domain is first created in the cluster (by the first node to
join it) and subsequently enabled (by the cluster gaining quorum), any
nodes listed in cluster.conf that are not presently members of the CMAN
cluster are fenced. The status of these nodes is unknown, and to be on
the side of safety they are assumed to be in need of fencing. This
startup fencing can be disabled, but it is only truly safe to do so if
an operator is present to verify that no cluster nodes are in need of
fencing. (Dangerous nodes that need to be fenced are those that had
GFS mounted, did not cleanly unmount, and are now either hung or unable
to communicate with other nodes over the network.)

The first way to avoid fencing nodes unnecessarily on startup is to
ensure that all nodes have joined the cluster before any of the nodes
start the fence daemon. This method is difficult to automate.

A second way to avoid fencing nodes unnecessarily on startup is to use
the post_join_delay parameter (or -j option). This is the number of
seconds the fence daemon will delay before actually fencing any victims
after nodes join the domain. This delay gives any nodes that have
been tagged for fencing the chance to join the cluster and avoid being
fenced. A delay of -1 here will cause the daemon to wait indefinitely
for all nodes to join the cluster, and no nodes will actually be fenced
on startup.

To disable fencing at domain-creation time entirely, the -c option can
be used to declare that all nodes are in a clean or safe state to
start. The clean_start cluster.conf option can also be set to do this,
but automatically disabling startup fencing in cluster.conf can risk
file system corruption.
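
The startup rules above can be summarized in a short Python sketch. It is
illustrative only: the function, its parameters, and the node names in the
usage lines are hypothetical, and the daemon's real victim selection may
differ in detail.

```python
from typing import Iterable, List


def startup_victims(configured: Iterable[str],
                    cman_members: Iterable[str],
                    joined_during_delay: Iterable[str] = (),
                    post_join_delay: int = 3,
                    clean_start: bool = False) -> List[str]:
    """Return the nodes that would be fenced when the domain starts."""
    if clean_start or post_join_delay == -1:
        # Either all nodes are declared clean, or the daemon waits
        # indefinitely for them to join; nobody is fenced on startup.
        return []
    # Nodes already in the CMAN cluster, or joining before the delay
    # expires, avoid being fenced.
    safe = set(cman_members) | set(joined_during_delay)
    return [node for node in configured if node not in safe]


# Hypothetical three-node cluster in which only node1 has joined so far.
nodes = ["node1", "node2", "node3"]
victims = startup_victims(nodes, ["node1"], joined_during_delay=["node2"])
```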

Avoiding unnecessary fencing at startup is primarily a concern when
nodes are fenced by power cycling. If nodes are fenced by disabling
their SAN access, then unnecessarily fencing a node is usually less
disruptive.

CONFIGURATION FILE
Fencing daemon behavior can be controlled by setting options in the
cluster.conf file under the section <fence_daemon> </fence_daemon>.
See above for complete descriptions of these values. The delay values
are in seconds; -1 means an unlimited delay. The values shown
are the defaults.

Post-join delay is the number of seconds the daemon will wait before
fencing any victims after a node joins the domain.

<fence_daemon post_join_delay="3">
</fence_daemon>

Post-fail delay is the number of seconds the daemon will wait before
fencing any victims after a domain member fails.

<fence_daemon post_fail_delay="0">
</fence_daemon>

Clean-start is used to prevent any startup fencing the daemon might do.
It indicates that the daemon should assume all nodes are in a clean
state to start.

<fence_daemon clean_start="0">
</fence_daemon>

OPTIONS
Command line options override corresponding values in cluster.conf.

-j secs
       Post-join fencing delay.

-f secs
       Post-fail fencing delay.

-c     All nodes are in a clean state to start.

-D     Enable debugging code and don't fork into the background.

-n name
       Name of the fence domain, "default" if none.

-V     Print the version information and exit.

-h     Print out a help message describing available options, then
       exit.
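
The precedence rule stated at the top of this section (command line options
override cluster.conf values, which in turn override the defaults) can be
sketched in Python. The dictionary keys reuse the cluster.conf attribute
names; the function name and the idea of passing parsed options as
dictionaries are assumptions made for illustration.

```python
from typing import Dict

# Defaults as documented for the <fence_daemon> section.
DEFAULTS: Dict[str, int] = {
    "post_join_delay": 3,
    "post_fail_delay": 0,
    "clean_start": 0,
}


def effective_settings(conf: Dict[str, int],
                       cli: Dict[str, int]) -> Dict[str, int]:
    """Merge settings so that cli beats conf, and conf beats the defaults."""
    settings = dict(DEFAULTS)
    settings.update(conf)  # values read from cluster.conf
    settings.update(cli)   # values given with -j, -f or -c
    return settings


# cluster.conf sets post_join_delay="30"; the daemon is started with -j 60 -c.
merged = effective_settings({"post_join_delay": 30},
                            {"post_join_delay": 60, "clean_start": 1})
```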

SEE ALSO
gfs(8), fence(8)