CMAN_TOOL(8)              System Manager's Manual              CMAN_TOOL(8)

NAME
       cman_tool - Cluster Management Tool

SYNOPSIS
       cman_tool join | leave | kill | expected | votes | version | wait |
       status | nodes | services | debug [options]

DESCRIPTION
       cman_tool is a program that manages the cluster management subsystem
       CMAN. cman_tool can be used to join the node to a cluster, leave the
       cluster, kill another cluster node or change the value of expected
       votes of a cluster.
       Be careful that you understand the consequences of the commands issued
       via cman_tool, as they can affect all nodes in your cluster. Most of
       the time cman_tool will only be invoked from your startup and shutdown
       scripts.

SUBCOMMANDS
       join   This is the main use of cman_tool. It instructs the cluster
              manager to attempt to join an existing cluster or, if no
              cluster exists, to form a new one on its own.
              If no options are given to this command then it will take the
              cluster configuration information from cluster.conf. However,
              it is possible to provide all the information on the command
              line, or to override cluster.conf values from the command line.

       leave  Tells CMAN to leave the cluster. You cannot do this if there
              are subsystems (e.g. DLM, GFS) active. You should unmount all
              GFS filesystems and shut down CLVM, fenced and anything else
              using the cluster manager before using cman_tool leave. Look at
              'cman_tool status' and group_tool to see how many (and which)
              subsystems are active.
              When a node leaves the cluster, the remaining nodes recalculate
              quorum, and this may block cluster activity if the required
              number of votes is not present. If this node is to be down for
              an extended period of time and you need to keep the cluster
              running, add the remove option, and the remaining nodes will
              recalculate quorum such that activity can continue.

       kill   Tells CMAN to kill another node in the cluster. This will cause
              the local node to send a "KILL" message to that node and it
              will shut down. Recovery will occur for the killed node as if
              it had failed. This is a sort of remote version of "leave
              force", so only use it if you really know what you are doing.

       expected
              Tells CMAN a new value of expected votes and instructs it to
              recalculate quorum based on this value.
              The recalculation takes into account the number of currently
              active nodes, so this option should not be used in an attempt
              to artificially lower the quorum value in advance of a planned
              shutdown of cluster nodes. Instead, the 'cman_tool leave
              remove' command should be used (see the 'leave' subcommand
              above).
              Use this option if your cluster has lost quorum due to nodes
              failing and you need to get it running again in a hurry.

       version
              Used alone this will report the major, minor, patch and config
              versions used by CMAN (also displayed in 'cman_tool status').
              It can also be used with -r to tell cluster members to update
              the cluster configuration.
              If -r is specified, cman will read the configuration file,
              validate it, distribute it around the cluster (if necessary)
              and activate it. See the VERSION OPTIONS section below for
              additional options to the version command.

       wait   Waits until the node is a member of the cluster and then
              returns.

       status Displays the local view of the cluster status.

       nodes  Displays the local view of the cluster nodes.

       services
              Displays the local view of subsystems using cman (deprecated;
              group_tool should be used instead).

       debug  Sets the debug level of the running cman daemon. Debug output
              will be sent to syslog level LOG_DEBUG. The -d switch specifies
              the new logging level. This is the same bitmask used for
              cman_tool join -d.

"LEAVE" OPTIONS
       -w     Normally, "cman_tool leave" will fail if the cluster is in
              transition (i.e. another node is joining or leaving the
              cluster). By adding the -w flag, cman_tool will wait and retry
              the leave operation repeatedly until it succeeds or a more
              serious error occurs.

       -t <seconds>
              If -w is also specified then -t dictates the maximum amount of
              time cman_tool is prepared to wait. If the operation times out
              then a status of 2 is returned.

       force  Shuts down the cluster manager without first telling any of the
              subsystems to close down. Use this option with extreme care as
              it could easily cause data loss.

       remove Tells the rest of the cluster to recalculate quorum such that
              activity can continue without this node.

"EXPECTED" OPTIONS
       -e <expected-votes>
              The new value of expected votes to use. This will usually be
              enough to bring the cluster back to life. Values that would
              cause incorrect quorum will be rejected.

"KILL" OPTIONS
       -n <nodename>
              The node name of the node to be killed. This should be the
              unqualified node name as it appears in 'cman_tool nodes'.

"VERSION" OPTIONS
       -r     Update config version. You don't need to use this when adding a
              new node; the new cman node will tell the rest of the cluster
              to read the latest version of the config file automatically.
              The version present in the new configuration must be higher
              than the one currently in use by cman.

              cman_tool version on its own will always show the current
              version and not the one being looked for, so be aware that the
              display may not update immediately after you have run cman_tool
              version -r.

       -D<option>
              See "JOIN" options.

       -S     By default cman_tool version will try to distribute the new
              cluster.conf file using ccs_sync and ricci. If you have
              distributed the file yourself and/or do not have ricci
              installed then the -S option will skip this step. NOTE: it is
              still important that all nodes in the cluster have the same
              version of the file. Make sure that this is the case before
              using this option.

"WAIT" OPTIONS
       -q     Waits until the cluster is quorate before returning.

       -t <seconds>
              Dictates the maximum amount of time cman_tool is prepared to
              wait. If the operation times out then a status of 2 is
              returned.

"JOIN" OPTIONS
       -c <clustername>
              Provides a text name for the cluster. You can have several
              clusters on one LAN and they are distinguished by this name.
              Note that the name is hashed to provide a unique number which
              is what actually distinguishes the cluster, so it is possible
              that two different names can clash. If this happens, the node
              will not be allowed into the existing cluster and you will have
              to pick another name or use a different port number for cluster
              communication.

       -p <port>
              UDP port number used for cluster communication. This defaults
              to 5405.

       -v <votes>
              Number of votes this node has in the cluster. Defaults to 1.

       -e <expected votes>
              Number of expected votes for the whole cluster. If different
              nodes provide different values then the highest is used. The
              cluster will only operate when quorum is reached, that is, when
              more than half of the expected votes are available to the
              cluster. The default for this value is the total number of
              votes for all nodes in the configuration file.
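
As a rough illustration of the quorum rule described for -e, the sketch below computes the smallest vote count that is strictly more than half of the expected votes. The function name quorum_for is made up for this example, and the arithmetic is an assumption derived from the "more than half" wording, not taken from the cman source.

```shell
# Hypothetical helper: smallest number of votes that is strictly more than
# half of the expected votes. Assumes a non-negative integer argument.
quorum_for() {
    echo $(( $1 / 2 + 1 ))
}

quorum_for 5    # three of five votes are needed
quorum_for 2    # both votes are needed, so one node of two is inquorate
```

Under this assumption, a cluster with two expected votes loses quorum as soon as either node disappears, which is the situation the two-node mode is meant to address.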

       -2     Sets the cluster up for a special "two node only" mode. Because
              of the quorum requirements mentioned above, a two-node cluster
              cannot normally be valid: with one vote each, losing either
              node leaves the survivor without more than half of the votes.
              This option tells the cluster manager that there will only ever
              be two nodes in the cluster and relies on fencing to ensure
              cluster integrity. If you specify this you cannot add more
              nodes without taking down the existing cluster and
              reconfiguring it. Expected votes should be set to 1 for a
              two-node cluster.

       -n <nodename>
              Overrides the node name. By default the unqualified hostname is
              used. This option is also used to specify which interface is
              used for cluster communication.

       -N <nodeid>
              Overrides the node ID for this node. Normally, nodes are
              assigned a node ID in cluster.conf. If you specify an incorrect
              node ID here, the node might not be allowed to join the
              cluster. Setting node IDs in the configuration is a far better
              way to do this. Note that the node's application to join the
              cluster may be rejected if you try to set the node ID to one
              that has already been used, or if the node was previously a
              member of the cluster but with a different node ID.

       -o <nodename>
              Overrides the name this node will have in the cluster. This
              will normally be the hostname or the first name specified by
              -n. Note how this differs from -n: -n tells cman_tool how to
              find the host address and/or the entry in the configuration
              file. -o simply changes the name the node will have in the
              cluster and has no bearing on the actual name of the machine.
              Use this option with extreme caution.

       -m <multicast-address>
              Specifies a multicast address to use for cluster communication.
              This is required for IPv6 operation. You should also specify an
              ethernet interface to bind to this multicast address using the
              -i option.

       -w     Join and wait until the node is a cluster member.

       -q     Join and wait until the cluster is quorate. If the cluster join
              fails and -w (or -q) is specified, then it will be retried.
              Note that cman_tool cannot tell whether the cluster join was
              rejected by another node for a good reason or timed out for
              some benign reason, so it is strongly recommended that a
              timeout is also given with the wait options to join. If you
              don't want join to retry on failure but do want to wait, use
              the cman_tool join command without -w, followed by cman_tool
              wait.

       -k <keyfile>
              All traffic sent out by cman/corosync is encrypted. By default
              the security key used is simply the cluster name. If you need
              more security you can specify a key file that contains the key
              used to encrypt cluster communications. Of course, the contents
              of the key file must be the same on all nodes in the cluster.
              It is up to you to securely copy the file to the nodes.

       -t <seconds>
              If -w or -q is also specified then -t dictates the maximum
              amount of time cman_tool is prepared to wait. If the operation
              times out then a status of 2 is returned. Note that just
              because cman_tool has given up does not mean that cman itself
              has stopped trying to join a cluster.
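
A startup script might want to distinguish a timeout from other outcomes of a waiting join. The following is a minimal sketch of that idea. Only status 2 (timeout) is documented above; treating 0 as success and any other value as a generic failure is an assumption, and the function name describe_status is hypothetical.

```shell
# Map the exit status of a waiting cman_tool command to a short message.
# Only status 2 (timeout) comes from the text above; treating 0 as success
# and everything else as a generic failure is an assumption.
describe_status() {
    case "$1" in
        0) echo "joined" ;;
        2) echo "timed out; cman may still be trying to join" ;;
        *) echo "failed with status $1" ;;
    esac
}

# Intended use on a configured cluster node (not executed here):
#   cman_tool join -q -t 60
#   describe_status $?
describe_status 2
```

Because a timeout does not stop cman itself, a script that sees status 2 may prefer to follow up with cman_tool wait and a further timeout instead of issuing a second join.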

       -X     Tells cman not to use the configuration file to get cluster
              information. If you use this option then cman will apply
              several defaults to the cluster to get it going. The cluster
              name will be "RHCluster", node IDs will default to the IP
              address of the node and remote node names will show up as
              Node<nodeid>. All of these, apart from the node names, can be
              overridden on the cman_tool command line if required.
              If you have to set up fence devices, services or anything else
              in cluster.conf then this option is probably not worthwhile to
              you: the extra readability of sensible node names and numbers
              will make it worth using cluster.conf for the cluster too. But
              for a simple failover cluster this might save you some effort.
              On each node using this configuration you will need to have the
              same authorization key installed. To create this key run
                     corosync-keygen
                     mv /etc/ais/authkey /etc/cluster/cman_authkey
              then copy that file to all nodes you want to join the cluster.

       -C     Overrides the default configuration module. Usually cman uses
              xmlconfig (cluster.conf) to load its configuration. If you have
              your configuration database held elsewhere (e.g. LDAP) and have
              a configuration plugin for it, then you should specify the name
              of the module here (see the documentation for the module for
              its name; it is not necessarily the same as the filename).
              It is possible to chain configuration modules by separating
              them with colons. So to add two modules (e.g. 'ldapconfig' and
              'ldappreproc') to the chain, start cman with
              -C ldapconfig:ldappreproc
              The default value for this is 'xmlconfig'. Note that if -X is
              on the command line then -C will be ignored.

       -A     Don't load openais services. Normally cman_tool join will load
              the configuration module 'openaisserviceenablestable' which
              will load the services installed by openais. If you don't want
              to use these services or have not installed openais then this
              switch will disable them.

       -D     Tells cman_tool whether to validate the configuration before
              loading or reloading it. By default the configuration is
              validated, which is equivalent to -Dfail.
              -Dwarn will validate the configuration and print any messages
              arising, but will attempt to use it regardless of its validity.
              -Dnone (or just -D) will skip the validation completely.
              The -D switch does not take a space between -D and the
              parameter, so '-D fail' will cause an error. Use -Dfail.

"NODES" OPTIONS
       -a     Shows the IP address(es) the nodes are communicating on.

       -n <nodename>
              Shows node information for a specific node. This should be the
              unqualified node name as it appears in 'cman_tool nodes'.

       -F <format>
              Specify the format of the output. The format string may contain
              one or more format options, each separated by a comma. Valid
              format options include: id, name, type, and addr.

"DEBUG" OPTIONS
       -d<value>
              The value is a bitmask of
              2   Barriers
              4   Membership messages
              8   Daemon operation, including command-line interaction
              16  Interaction with Corosync
              32  Startup debugging (cman_tool join operations only)
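
Since the debug level is a bitmask, several categories can be enabled at once by OR-ing their values together. The sketch below builds such a value. The variable names are illustrative only, and the command is printed and not executed.

```shell
# Bit values copied from the list above; the variable names are illustrative.
BARRIERS=2
MEMBERSHIP=4
DAEMON=8
COROSYNC=16
STARTUP=32

# Combine membership messages and daemon operation into one bitmask.
mask=$(( MEMBERSHIP | DAEMON ))

# Print the command rather than running it, so the sketch is safe to try.
echo "cman_tool debug -d$mask"
```

Here the mask is 12 (4 + 8), so the printed command enables membership and daemon debugging together.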

NOTES
       The nodes subcommand shows a list of nodes known to cman. The state
       is one of the following:
       M   The node is a member of the cluster.
       X   The node is not a member of the cluster.
       d   The node is known to the cluster but disallowed access to it.

ENVIRONMENT VARIABLES
       cman_tool removes most environment variables before forking and
       running Corosync, and adds some of its own to set up configuration
       parameters that were overridden on the command line. The exception is
       that variables with names starting COROSYNC_ are passed down intact,
       as they are assumed to be used for configuring the daemon.

DISALLOWED NODES
       Occasionally (but very infrequently I hope) you may see nodes marked
       as "Disallowed" in cman_tool status or "d" in cman_tool nodes. This is
       a bit of a nasty hack to get around a mismatch between what the upper
       layers expect of the cluster manager and what corosync does.

       If a node experiences a momentary lack of connectivity, but one that
       is long enough to trigger the token timeouts, then it will be removed
       from the cluster. When connectivity is restored corosync will happily
       let it rejoin the cluster with no fuss. Sadly the upper layers don't
       like this very much. They may have (indeed probably will have) changed
       their internal state while the other node was away, and there is no
       straightforward way to bring the rejoined node up to date with that
       state. When this happens the node is marked "Disallowed" and is not
       permitted to take part in cman operations.

       If the remainder of the cluster is quorate then the node will be sent
       a kill message and it will be forced to leave the cluster that way.
       Note that fencing should kick in to remove the node permanently
       anyway, but it may take longer than the network outage for this to
       complete.

       If the remainder of the cluster is inquorate then we have a problem.
       The likelihood is that we will have two (or more) partitioned clusters
       and we cannot decide which is the "right" one. In this case we need to
       defer to the system administrator to kill an appropriate selection of
       nodes to restore the cluster to sensible operation.

       The latter scenario should be very rare and may indicate a bug
       somewhere in the code. If the local network is very flaky or busy it
       may be necessary to increase some of the protocol timeouts for
       corosync. We are trying to think of better solutions to this problem.

       Recovering from this state can, unfortunately, be complicated.
       Fortunately, in the majority of cases, fencing will do the job for
       you, and the disallowed state will only be temporary. If it persists,
       the recommended approach is to run cman_tool nodes on all systems in
       the cluster and determine the largest common subset of nodes that are
       valid members to each other. Then reboot the others and let them
       rejoin correctly. In the case of a single-node disconnection this
       should be straightforward; with a large cluster that has experienced a
       network partition it could get very complicated!

       Example:

       In this example we have a five node cluster that has experienced a
       network partition. Here is the output of cman_tool nodes from all
       systems:

       Node  Sts   Inc   Joined               Name
          1   M   2372   2007-11-05 02:58:55  node-01.example.com
          2   d   2376   2007-11-05 02:58:56  node-02.example.com
          3   d   2376   2007-11-05 02:58:56  node-03.example.com
          4   M   2376   2007-11-05 02:58:56  node-04.example.com
          5   M   2376   2007-11-05 02:58:56  node-05.example.com

       Node  Sts   Inc   Joined               Name
          1   d   2372   2007-11-05 02:58:55  node-01.example.com
          2   M   2376   2007-11-05 02:58:56  node-02.example.com
          3   M   2376   2007-11-05 02:58:56  node-03.example.com
          4   d   2376   2007-11-05 02:58:56  node-04.example.com
          5   d   2376   2007-11-05 02:58:56  node-05.example.com

       Node  Sts   Inc   Joined               Name
          1   d   2372   2007-11-05 02:58:55  node-01.example.com
          2   M   2376   2007-11-05 02:58:56  node-02.example.com
          3   M   2376   2007-11-05 02:58:56  node-03.example.com
          4   d   2376   2007-11-05 02:58:56  node-04.example.com
          5   d   2376   2007-11-05 02:58:56  node-05.example.com

       Node  Sts   Inc   Joined               Name
          1   M   2372   2007-11-05 02:58:55  node-01.example.com
          2   d   2376   2007-11-05 02:58:56  node-02.example.com
          3   d   2376   2007-11-05 02:58:56  node-03.example.com
          4   M   2376   2007-11-05 02:58:56  node-04.example.com
          5   M   2376   2007-11-05 02:58:56  node-05.example.com

       Node  Sts   Inc   Joined               Name
          1   M   2372   2007-11-05 02:58:55  node-01.example.com
          2   d   2376   2007-11-05 02:58:56  node-02.example.com
          3   d   2376   2007-11-05 02:58:56  node-03.example.com
          4   M   2376   2007-11-05 02:58:56  node-04.example.com
          5   M   2376   2007-11-05 02:58:56  node-05.example.com

       In this scenario we should kill nodes node-02 and node-03. Of course,
       the three-node cluster of node-01, node-04 and node-05 should remain
       quorate and be able to fence the two rejoined nodes anyway, but it is
       possible that the cluster has a qdisk setup that precludes this.
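
The "largest common subset" step can be sketched with standard shell tools. The example below assumes the five listings above come from node-01 through node-05 in that order, and records for each node the set of node IDs it marks as M. The data layout and variable names are invented for this illustration. Ties between equally large groups, quorum and qdisk are not considered.

```shell
# Member sets (node IDs marked M) as seen by each node in the example above.
# Assumption: the five listings are from node-01 to node-05, in order.
views='node-01 1,4,5
node-02 2,3
node-03 2,3
node-04 1,4,5
node-05 1,4,5'

# Count how many nodes report each member set and keep the most common one.
largest=$(printf '%s\n' "$views" | awk '{print $2}' | sort | uniq -c |
          sort -rn | head -n 1 | awk '{print $2}')
echo "keep:   $largest"

# Nodes reporting any other view are the candidates for a reboot.
printf '%s\n' "$views" | awk -v keep="$largest" '$2 != keep {print "reboot: " $1}'
```

With the example data this selects the group of nodes 1, 4 and 5 and lists node-02 and node-03 as the nodes to reboot, matching the conclusion above.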

CONFIGURATION SYSTEMS
       This section details how the configuration systems work in cman. You
       might need to know this if you are using the -C option to cman_tool,
       or writing your own configuration subsystem.
       By default cman uses two configuration plugins to corosync. The first,
       'xmlconfig', reads the configuration information stored in
       cluster.conf and stores it in an internal database, in the same schema
       as it finds in cluster.conf. The second plugin, 'cmanpreconfig', takes
       the information in that database, adds several cman defaults,
       determines the corosync node name and node ID and formats the
       information in a similar manner to corosync.conf(5). Corosync then
       reads those keys to start the cluster protocol. cmanpreconfig also
       reads several environment variables that might be set by cman_tool,
       which can override information in the configuration.
       In the absence of xmlconfig, i.e. when 'cman_tool join' is run with
       the -X switch (this removes xmlconfig from the module list),
       cmanpreconfig also generates several defaults so that the cluster can
       be got running without any configuration information; see above for
       the details.
       Note that cmanpreconfig will not overwrite corosync keys that are
       explicitly set in the configuration file, allowing you to provide
       custom values for token timeouts etc., even though cman has its own
       defaults for some of those values. The exception to this is the node
       name/address and multicast values, which are always taken from the
       cman configuration keys.
       Most of the extra keys that cmanpreconfig adds are outside of the
       /cluster/ tree and will only be seen if you dump the whole of
       corosync's object database. However, it does add some keys into
       /cluster/cman that you would not normally see in a normal cluster.conf
       file. These are harmless, though could be confusing. The most obvious
       of these is the "nodename" option, which is passed from cmanpreconfig
       to the cman module, to save it recalculating the node name.

Cluster utilities                 Nov 8 2007                 CMAN_TOOL(8)