1CTDB(1) CTDB - clustered TDB database CTDB(1)
2
3
4
6 ctdb - CTDB management utility
7
9 ctdb [OPTION...] {COMMAND} [COMMAND-ARGS]
10
12 ctdb is a utility to view and manage a CTDB cluster.
13
14 The following terms are used when referring to nodes in a cluster:
15
16 PNN
17 Physical Node Number. The physical node number is an integer that
18 describes the node in the cluster. The first node has physical node
19 number 0.
20
21 PNN-LIST
22 This is either a single PNN, a comma-separated list of PNNs or
23 "all".
24
25 Commands that reference a database use the following terms:
26
27 DB
28 This is either a database name, such as locking.tdb or a database
29 ID such as "0x42fe72c5".
30
31 DB-LIST
32 A space-separated list of at least one DB.
33
35 -n PNN
36 The node specified by PNN should be queried for the requested
37 information. Default is to query the daemon running on the local
38 host.
39
40 -Y
41 Produce output in machine readable form for easier parsing by
42 scripts. This uses a field delimiter of ':'. Not all commands
43 support this option.
44
45 -x SEPARATOR
46 Use SEPARATOR to delimit fields in machine readable output. This
47 implies -Y.
48
49 -X
50 Produce output in machine readable form for easier parsing by
51 scripts. This uses a field delimiter of '|'. Not all commands
52 support this option.
53
54 This is equivalent to "-x|" and avoids some shell quoting issues.
55
56 -t TIMEOUT
57 Indicates that ctdb should wait up to TIMEOUT seconds for a
58 response to most commands sent to the CTDB daemon. The default is
59 10 seconds.
60
61 -T TIMELIMIT
62 Indicates that TIMELIMIT is the maximum run time (in seconds) for
63 the ctdb command. When TIMELIMIT is exceeded the ctdb command will
64 terminate with an error. The default is 120 seconds.
65
66 -? --help
67 Print some help text to the screen.
68
69 --usage
70 Print usage information to the screen.
71
72 -d --debug=DEBUGLEVEL
73 Change the debug level for the command. Default is NOTICE.
74
75 --socket=FILENAME
76 Specify that FILENAME is the name of the Unix domain socket to use
77 when connecting to the local CTDB daemon. The default is
78 /var/run/ctdb/ctdbd.socket.
79
81 These are commands used to monitor and administer a CTDB cluster.
82
83 pnn
84 This command displays the PNN of the current node.
85
86 status
87 This command shows the current status of all CTDB nodes based on
88 information from the queried node.
89
90 Note: If the queried node is INACTIVE then the status might not be
91 current.
92
93 Node status
94 This includes the number of physical nodes and the status of each
95 node. See ctdb(7) for information about node states.
96
97 Generation
98 The generation id is a number that indicates the current generation
99 of a cluster instance. Each time a cluster goes through a
100 reconfiguration or a recovery its generation id will be changed.
101
102 This number does not have any particular meaning other than to keep
103 track of when a cluster has gone through a recovery. It is a random
104 number that represents the current instance of a ctdb cluster and
105 its databases. The CTDB daemon uses this number internally to be
106 able to tell when commands to operate on the cluster and the
107 databases were issued in a different generation of the cluster, to
108 ensure that commands that operate on the databases will not survive
109 across a cluster database recovery. After a recovery, all old
110 outstanding commands will automatically become invalid.
111
112 Sometimes this number will be shown as "INVALID". This only means
113 that the ctdbd daemon has started but it has not yet merged with
114 the cluster through a recovery. All nodes start with generation
115 "INVALID" and are not assigned a real generation id until they have
116 successfully been merged with a cluster through a recovery.
117
118 Virtual Node Number (VNN) map
119 Consists of the number of virtual nodes and mapping from virtual
120 node numbers to physical node numbers. Virtual nodes host CTDB
121 databases. Only nodes that are participating in the VNN map can
122 become lmaster or dmaster for database records.
123
124 Recovery mode
125 This is the current recovery mode of the cluster. There are two
126 possible modes:
127
128 NORMAL - The cluster is fully operational.
129
130 RECOVERY - The cluster databases have all been frozen, pausing all
131 services while the cluster awaits a recovery process to complete. A
132 recovery process should finish within seconds. If a cluster is
133 stuck in the RECOVERY state this would indicate a cluster
134 malfunction which needs to be investigated.
135
136 Once the recovery master detects an inconsistency, for example a
137 node becomes disconnected/connected, the recovery daemon will
138 trigger a cluster recovery process, where all databases are
139 remerged across the cluster. When this process starts, the recovery
140 master will first "freeze" all databases to prevent applications
141 such as samba from accessing the databases and it will also mark
142 the recovery mode as RECOVERY.
143
144 When the CTDB daemon starts up, it will start in RECOVERY mode.
145 Once the node has been merged into a cluster and all databases have
146 been recovered, the node mode will change into NORMAL mode and the
147 databases will be "thawed", allowing samba to access the databases
148 again.
149
150 Recovery master
151 This is the cluster node that is currently designated as the
152 recovery master. This node is responsible for monitoring the
153 consistency of the cluster and for performing the actual recovery
154 process when required.
155
156 Only one node at a time can be the designated recovery master.
157 Which node is designated the recovery master is decided by an
158 election process in the recovery daemons running on each node.
159
160 Example
161 # ctdb status
162 Number of nodes:4
163 pnn:0 192.168.2.200 OK (THIS NODE)
164 pnn:1 192.168.2.201 OK
165 pnn:2 192.168.2.202 OK
166 pnn:3 192.168.2.203 OK
167 Generation:1362079228
168 Size:4
169 hash:0 lmaster:0
170 hash:1 lmaster:1
171 hash:2 lmaster:2
172 hash:3 lmaster:3
173 Recovery mode:NORMAL (0)
174 Recovery master:0
175
176
177 nodestatus [PNN-LIST]
178 This command is similar to the status command. It displays the "node
179 status" subset of output. The main differences are:
180
181 · The exit code is the bitwise-OR of the flags for each specified
182 node, while ctdb status exits with 0 if it was able to retrieve
183 status for all nodes.
184
185 · ctdb status provides status information for all nodes. ctdb
186 nodestatus defaults to providing status for only the current node.
187 If PNN-LIST is provided then status is given for the indicated
188 node(s).
189
190 A common invocation in scripts is ctdb nodestatus all to check whether
191 all nodes in a cluster are healthy.
192
193 Example
194 # ctdb nodestatus
195 pnn:0 10.0.0.30 OK (THIS NODE)
196
197 # ctdb nodestatus all
198 Number of nodes:2
199 pnn:0 10.0.0.30 OK (THIS NODE)
200 pnn:1 10.0.0.31 OK
201
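Because ctdb nodestatus folds the node flags into its exit code, a script
can treat a zero exit status as "no flags set on any listed node". The
following is a minimal sketch of such a check. The CTDB variable and the
function name cluster_healthy are hypothetical conveniences introduced
here for illustration; they are not part of ctdb.

```shell
# Hypothetical override so that the command can be swapped for a stub.
CTDB=${CTDB:-ctdb}

# Succeeds only if "ctdb nodestatus all" exits with 0, that is, if the
# bitwise-OR of the flags of all nodes is zero. A failure to contact the
# daemon is assumed to produce a non-zero exit status as well.
cluster_healthy() {
    "$CTDB" nodestatus all >/dev/null 2>&1
}

# Possible use in a script:
#   cluster_healthy || echo "at least one node is not OK" >&2
```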
202
203 recmaster
204 This command shows the pnn of the node which is currently the
205 recmaster.
206
207 Note: If the queried node is INACTIVE then the status might not be
208 current.
209
210 uptime
211 This command shows the uptime for the ctdb daemon, when the last
212 recovery or ip-failover completed and how long it took. If the
213 "duration" is shown as a negative number, this indicates that there is
214 a recovery/failover in progress and it started that many seconds ago.
215
216 Example
217 # ctdb uptime
218 Current time of node : Thu Oct 29 10:38:54 2009
219 Ctdbd start time : (000 16:54:28) Wed Oct 28 17:44:26 2009
220 Time of last recovery/failover: (000 16:53:31) Wed Oct 28 17:45:23 2009
221 Duration of last recovery/failover: 2.248552 seconds
222
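The negative-duration convention can be tested from a script. The sketch
below assumes the output format shown in the example above; the function
name recovery_in_progress is hypothetical.

```shell
# Reads "ctdb uptime" output on standard input. Succeeds if the reported
# duration is negative, which indicates a recovery/failover in progress.
recovery_in_progress() {
    awk -F': *' '
        /Duration of last recovery\/failover/ { if ($2 + 0 < 0) neg = 1 }
        END { exit (neg ? 0 : 1) }'
}

# Possible use:
#   ctdb uptime | recovery_in_progress && echo "recovery still running"
```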
223
224 listnodes
225 This command lists the IP addresses of all the nodes in the
226 cluster.
227
228 Example
229 # ctdb listnodes
230 192.168.2.200
231 192.168.2.201
232 192.168.2.202
233 192.168.2.203
234
235
236 natgw {master|list|status}
237 This command shows different aspects of NAT gateway status. For an
238 overview of CTDB's NAT gateway functionality please see the NAT GATEWAY
239 section in ctdb(7).
240
241 master
242 Show the PNN and private IP address of the current NAT gateway
243 master node.
244
245 Example output:
246
247 1 192.168.2.201
248
249
250 list
251 List the private IP addresses of nodes in the current NAT gateway
252 group, annotating the master node.
253
254 Example output:
255
256 192.168.2.200
257 192.168.2.201 MASTER
258 192.168.2.202
259 192.168.2.203
260
261
262 status
263 List the nodes in the current NAT gateway group and their status.
264
265 Example output:
266
267 pnn:0 192.168.2.200 UNHEALTHY (THIS NODE)
268 pnn:1 192.168.2.201 OK
269 pnn:2 192.168.2.202 OK
270 pnn:3 192.168.2.203 OK
271
272
273 ping
274 This command will "ping" specified CTDB nodes in the cluster to verify
275 that they are running.
276
277 Example
278 # ctdb ping
279 response from 0 time=0.000054 sec (3 clients)
280
281
282 ifaces
283 This command will display the list of network interfaces, which could
284 host public addresses, along with their status.
285
286 Example
287 # ctdb ifaces
288 Interfaces on node 0
289 name:eth5 link:up references:2
290 name:eth4 link:down references:0
291 name:eth3 link:up references:1
292 name:eth2 link:up references:1
293
294 # ctdb -X ifaces
295 |Name|LinkStatus|References|
296 |eth5|1|2|
297 |eth4|0|0|
298 |eth3|1|1|
299 |eth2|1|1|
300
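The machine readable form is convenient for filtering. As a sketch, and
assuming the column layout shown in the -X example above, the following
hypothetical helper prints the interfaces whose link is down.

```shell
# Reads "ctdb -X ifaces" output on standard input and prints the name of
# every interface whose LinkStatus column is 0.
down_ifaces() {
    # Each line starts with "|", so field 1 is empty, field 2 is the
    # interface name and field 3 is the link status. Line 1 is the header.
    awk -F'|' 'NR > 1 && $3 == "0" { print $2 }'
}

# Possible use:
#   ctdb -X ifaces | down_ifaces
```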
301
302 ip
303 This command will display the list of public addresses that are
304 provided by the cluster and which physical node is currently serving
305 this ip. By default this command will ONLY show those public addresses
306 that are known to the node itself. To see the full list of all public
307 ips across the cluster you must use "ctdb ip all".
308
309 Example
310 # ctdb ip -v
311 Public IPs on node 0
312 172.31.91.82 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
313 172.31.91.83 node[0] active[eth3] available[eth2,eth3] configured[eth2,eth3]
314 172.31.91.84 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
315 172.31.91.85 node[0] active[eth2] available[eth2,eth3] configured[eth2,eth3]
316 172.31.92.82 node[1] active[] available[eth5] configured[eth4,eth5]
317 172.31.92.83 node[0] active[eth5] available[eth5] configured[eth4,eth5]
318 172.31.92.84 node[1] active[] available[eth5] configured[eth4,eth5]
319 172.31.92.85 node[0] active[eth5] available[eth5] configured[eth4,eth5]
320
321 # ctdb -X ip -v
322 |Public IP|Node|ActiveInterface|AvailableInterfaces|ConfiguredInterfaces|
323 |172.31.91.82|1||eth2,eth3|eth2,eth3|
324 |172.31.91.83|0|eth3|eth2,eth3|eth2,eth3|
325 |172.31.91.84|1||eth2,eth3|eth2,eth3|
326 |172.31.91.85|0|eth2|eth2,eth3|eth2,eth3|
327 |172.31.92.82|1||eth5|eth4,eth5|
328 |172.31.92.83|0|eth5|eth5|eth4,eth5|
329 |172.31.92.84|1||eth5|eth4,eth5|
330 |172.31.92.85|0|eth5|eth5|eth4,eth5|
331
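To see how the public addresses are spread over the nodes, the machine
readable output can be summarised. This is a sketch that assumes the
column layout of the -X example above; the function name ips_per_node is
hypothetical.

```shell
# Reads "ctdb -X ip" output on standard input and prints one line per
# node in the form "NODE COUNT", sorted numerically by node.
ips_per_node() {
    # Field 2 is the public IP and field 3 is the node currently serving
    # it (field 1 is empty because each line starts with "|").
    awk -F'|' 'NR > 1 && $3 != "" { count[$3]++ }
               END { for (n in count) print n, count[n] }' | sort -n
}

# Possible use:
#   ctdb -X ip all | ips_per_node
```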
332
333 ipinfo IP
334 This command will display details about the specified public address.
335
336 Example
337 # ctdb ipinfo 172.31.92.85
338 Public IP[172.31.92.85] info on node 0
339 IP:172.31.92.85
340 CurrentNode:0
341 NumInterfaces:2
342 Interface[1]: Name:eth4 Link:down References:0
343 Interface[2]: Name:eth5 Link:up References:2 (active)
344
345
346 event run|status|script list|script enable|script disable
347 This command is used to control the event daemon and to inspect the
348 status of various events.
349
350 run EVENT TIMEOUT [ARGUMENTS]
351 This command can be used to manually run specified EVENT with
352 optional ARGUMENTS. The event will be allowed to run a maximum of
353 TIMEOUT seconds. If TIMEOUT is 0, then there is no time limit for
354 running the event.
355
356 status [EVENT] [lastrun|lastpass|lastfail]
357 This command displays the last execution status of the specified
358 EVENT. If no event is specified, then the status of last executed
359 monitor event will be displayed.
360
361 To see the last successful execution of the event, lastpass can be
362 specified. Similarly lastfail can be specified to see the last
363 unsuccessful execution of the event. The optional lastrun can be
364 specified to query the last execution of the event.
365
366 The command will terminate with the exit status corresponding to
367 the overall status of event that is displayed. If lastpass is
368 specified, then the command will always terminate with 0. If
369 lastfail is specified then the command will always terminate with
370 non-zero exit status. If lastrun is specified, then the command
371 will terminate with 0 or not depending on if the last execution of
372 the event was successful or not.
373
374 The output is the list of event scripts executed. Each line shows
375 the name, status, duration and start time for each script.
376
377 Example output:
378
379 00.ctdb OK 0.014 Sat Dec 17 19:39:11 2016
380 01.reclock OK 0.013 Sat Dec 17 19:39:11 2016
381 05.system OK 0.029 Sat Dec 17 19:39:11 2016
382 06.nfs OK 0.014 Sat Dec 17 19:39:11 2016
383 10.external DISABLED
384 10.interface OK 0.037 Sat Dec 17 19:39:11 2016
385 11.natgw OK 0.011 Sat Dec 17 19:39:11 2016
386 11.routing OK 0.007 Sat Dec 17 19:39:11 2016
387 13.per_ip_routing OK 0.007 Sat Dec 17 19:39:11 2016
388 20.multipathd OK 0.007 Sat Dec 17 19:39:11 2016
389 31.clamd OK 0.007 Sat Dec 17 19:39:11 2016
390 40.vsftpd OK 0.013 Sat Dec 17 19:39:11 2016
391 41.httpd OK 0.018 Sat Dec 17 19:39:11 2016
392 49.winbind OK 0.023 Sat Dec 17 19:39:11 2016
393 50.samba OK 0.100 Sat Dec 17 19:39:12 2016
394 60.nfs OK 0.376 Sat Dec 17 19:39:12 2016
395 70.iscsi OK 0.009 Sat Dec 17 19:39:12 2016
396 91.lvs OK 0.007 Sat Dec 17 19:39:12 2016
397 99.timeout OK 0.007 Sat Dec 17 19:39:12 2016
398
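The per-script lines can be filtered to find the scripts that did not
pass. The sketch below assumes the line format shown in the example
output and assumes that script names start with two digits and a dot, as
they do in that example. The function name failed_scripts is
hypothetical.

```shell
# Reads "ctdb event status" output on standard input and prints the name
# of every script whose status is neither OK nor DISABLED.
failed_scripts() {
    # Only lines that begin with a script name such as "50.samba" are
    # considered, so any script output lines are skipped.
    awk '$1 ~ /^[0-9][0-9]\./ && NF >= 2 && $2 != "OK" && $2 != "DISABLED" { print $1 }'
}

# Possible use:
#   ctdb event status | failed_scripts
```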
399
400 script list
401 List the available event scripts.
402
403 Example output:
404
405 00.ctdb
406 01.reclock
407 05.system
408 06.nfs
409 10.external DISABLED
410 10.interface
411 11.natgw
412 11.routing
413 13.per_ip_routing
414 20.multipathd
415 31.clamd
416 40.vsftpd
417 41.httpd
418 49.winbind
419 50.samba
420 60.nfs
421 70.iscsi
422 91.lvs
423 99.timeout
424
425
426 script enable SCRIPT
427 Enable the specified event SCRIPT. Only enabled scripts will be
428 executed when running any event.
429
430 script disable SCRIPT
431 Disable the specified event SCRIPT. This will prevent the script
432 from executing when running any event.
433
434 scriptstatus
435 This command displays which event scripts were run in the previous
436 monitoring cycle and the result of each script. If a script failed with
437 an error, causing the node to become unhealthy, the output from that
438 script is also shown.
439
440 This command is deprecated. It's provided for backward compatibility.
441 In place of ctdb scriptstatus, use ctdb event status.
442
443 Example
444 # ctdb scriptstatus
445 00.ctdb OK 0.011 Sat Dec 17 19:40:46 2016
446 01.reclock OK 0.010 Sat Dec 17 19:40:46 2016
447 05.system OK 0.030 Sat Dec 17 19:40:46 2016
448 06.nfs OK 0.014 Sat Dec 17 19:40:46 2016
449 10.external DISABLED
450 10.interface OK 0.041 Sat Dec 17 19:40:46 2016
451 11.natgw OK 0.008 Sat Dec 17 19:40:46 2016
452 11.routing OK 0.007 Sat Dec 17 19:40:46 2016
453 13.per_ip_routing OK 0.007 Sat Dec 17 19:40:46 2016
454 20.multipathd OK 0.007 Sat Dec 17 19:40:46 2016
455 31.clamd OK 0.007 Sat Dec 17 19:40:46 2016
456 40.vsftpd OK 0.013 Sat Dec 17 19:40:46 2016
457 41.httpd OK 0.015 Sat Dec 17 19:40:46 2016
458 49.winbind OK 0.022 Sat Dec 17 19:40:46 2016
459 50.samba ERROR 0.077 Sat Dec 17 19:40:46 2016
460 OUTPUT: ERROR: samba tcp port 445 is not responding
461
462
463 listvars
464 List all tuneable variables, except the values of the obsolete tunables
465 like VacuumMinInterval. The obsolete tunables can be retrieved only
466 explicitly with the "ctdb getvar" command.
467
468 Example
469 # ctdb listvars
470 SeqnumInterval = 1000
471 ControlTimeout = 60
472 TraverseTimeout = 20
473 KeepaliveInterval = 5
474 KeepaliveLimit = 5
475 RecoverTimeout = 120
476 RecoverInterval = 1
477 ElectionTimeout = 3
478 TakeoverTimeout = 9
479 MonitorInterval = 15
480 TickleUpdateInterval = 20
481 EventScriptTimeout = 30
482 MonitorTimeoutCount = 20
483 RecoveryGracePeriod = 120
484 RecoveryBanPeriod = 300
485 DatabaseHashSize = 100001
486 DatabaseMaxDead = 5
487 RerecoveryTimeout = 10
488 EnableBans = 1
489 NoIPFailback = 0
490 DisableIPFailover = 0
491 VerboseMemoryNames = 0
492 RecdPingTimeout = 60
493 RecdFailCount = 10
494 LogLatencyMs = 0
495 RecLockLatencyMs = 1000
496 RecoveryDropAllIPs = 120
497 VacuumInterval = 10
498 VacuumMaxRunTime = 120
499 RepackLimit = 10000
500 VacuumLimit = 5000
501 VacuumFastPathCount = 60
502 MaxQueueDropMsg = 1000000
503 AllowUnhealthyDBRead = 0
504 StatHistoryInterval = 1
505 DeferredAttachTO = 120
506 AllowClientDBAttach = 1
507 RecoverPDBBySeqNum = 1
508 DeferredRebalanceOnNodeAdd = 300
509 FetchCollapse = 1
510 HopcountMakeSticky = 50
511 StickyDuration = 600
512 StickyPindown = 200
513 NoIPTakeover = 0
514 DBRecordCountWarn = 100000
515 DBRecordSizeWarn = 10000000
516 DBSizeWarn = 100000000
517 PullDBPreallocation = 10485760
518 NoIPHostOnAllDisabled = 0
519 TDBMutexEnabled = 1
520 LockProcessesPerDB = 200
521 RecBufferSizeLimit = 1000000
522 QueueBufferSize = 1024
523 IPAllocAlgorithm = 2
524
525
526 getvar NAME
527 Get the runtime value of a tuneable variable.
528
529 Example
530 # ctdb getvar MonitorInterval
531 MonitorInterval = 15
532
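Scripts usually want only the value. The following sketch extracts it
from the "NAME = VALUE" form shown above and tolerates extra padding
around the equals sign. The function name tunable_value is hypothetical.

```shell
# Reads "ctdb getvar NAME" output on standard input and prints the value.
tunable_value() {
    # Split on "=", then strip blanks and tabs from the value part.
    awk -F'=' 'NF == 2 { gsub(/[ \t]/, "", $2); print $2 }'
}

# Possible use:
#   interval=$(ctdb getvar MonitorInterval | tunable_value)
```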
533
534 setvar NAME VALUE
535 Set the runtime value of a tuneable variable.
536
537 Example
538 # ctdb setvar MonitorInterval 20
539
540
541 lvs {master|list|status}
542 This command shows different aspects of LVS status. For an overview of
543 CTDB's LVS functionality please see the LVS section in ctdb(7).
544
545 master
546 Shows the PNN of the current LVS master node.
547
548 Example output:
549
550 2
551
552
553 list
554 Lists the currently usable LVS nodes.
555
556 Example output:
557
558 2 10.0.0.13
559 3 10.0.0.14
560
561
562 status
563 List the nodes in the current LVS group and their status.
564
565 Example output:
566
567 pnn:0 10.0.0.11 UNHEALTHY (THIS NODE)
568 pnn:1 10.0.0.12 UNHEALTHY
569 pnn:2 10.0.0.13 OK
570 pnn:3 10.0.0.14 OK
571
572
573 getcapabilities
574 This command shows the capabilities of the current node. See the
575 CAPABILITIES section in ctdb(7) for more details.
576
577 Example output:
578
579 RECMASTER: YES
580 LMASTER: YES
581
582
583 statistics
584 Collect statistics from the CTDB daemon about how many calls it has
585 served. Information about various fields in statistics can be found in
586 ctdb-statistics(7).
587
588 Example
589 # ctdb statistics
590 CTDB version 1
591 Current time of statistics : Tue Mar 8 15:18:51 2016
592 Statistics collected since : (003 21:31:32) Fri Mar 4 17:47:19 2016
593 num_clients 9
594 frozen 0
595 recovering 0
596 num_recoveries 2
597 client_packets_sent 8170534
598 client_packets_recv 7166132
599 node_packets_sent 16549998
600 node_packets_recv 5244418
601 keepalive_packets_sent 201969
602 keepalive_packets_recv 201969
603 node
604 req_call 26
605 reply_call 0
606 req_dmaster 9
607 reply_dmaster 12
608 reply_error 0
609 req_message 1339231
610 req_control 8177506
611 reply_control 6831284
612 client
613 req_call 15
614 req_message 334809
615 req_control 6831308
616 timeouts
617 call 0
618 control 0
619 traverse 0
620 locks
621 num_calls 8
622 num_current 0
623 num_pending 0
624 num_failed 0
625 total_calls 15
626 pending_calls 0
627 childwrite_calls 0
628 pending_childwrite_calls 0
629 memory_used 394879
630 max_hop_count 1
631 total_ro_delegations 0
632 total_ro_revokes 0
633 hop_count_buckets: 8 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0
634 lock_buckets: 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0
635 locks_latency MIN/AVG/MAX 0.010005/0.010418/0.011010 sec out of 8
636 reclock_ctdbd MIN/AVG/MAX 0.002538/0.002538/0.002538 sec out of 1
637 reclock_recd MIN/AVG/MAX 0.000000/0.000000/0.000000 sec out of 0
638 call_latency MIN/AVG/MAX 0.000044/0.002142/0.011702 sec out of 15
639 childwrite_latency MIN/AVG/MAX 0.000000/0.000000/0.000000 sec out of 0
640
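A single counter can be pulled out of this output for monitoring. This
is a sketch that assumes the "name value" line format shown above; the
function name stat_counter is hypothetical. Names such as req_call occur
in more than one group, and only the first occurrence is printed.

```shell
# Reads "ctdb statistics" output on standard input and prints the value
# of the first counter whose name equals $1.
stat_counter() {
    awk -v name="$1" '!seen && $1 == name { print $2; seen = 1 }'
}

# Possible use:
#   ctdb statistics | stat_counter num_recoveries
```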
641
642 statisticsreset
643 This command is used to clear all statistics counters in a node.
644
645 Example: ctdb statisticsreset
646
647 dbstatistics DB
648 Display statistics about the database DB. Information about various
649 fields in dbstatistics can be found in ctdb-statistics(7).
650
651 Example
652 # ctdb dbstatistics locking.tdb
653 DB Statistics: locking.tdb
654 ro_delegations 0
655 ro_revokes 0
656 locks
657 total 14356
658 failed 0
659 current 0
660 pending 0
661 hop_count_buckets: 28087 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0
662 lock_buckets: 0 14188 38 76 32 19 3 0 0 0 0 0 0 0 0 0
663 locks_latency MIN/AVG/MAX 0.001066/0.012686/4.202292 sec out of 14356
664 vacuum_latency MIN/AVG/MAX 0.000472/0.002207/15.243570 sec out of 224530
665 Num Hot Keys: 1
666 Count:8 Key:ff5bd7cb3ee3822edc1f0000000000000000000000000000
667
668
669 getreclock
670 Show details of the recovery lock, if any.
671
672 Example output:
673
674 /clusterfs/.ctdb/recovery.lock
675
676
677 getdebug
678 Get the current debug level for the node. The debug level controls what
679 information is written to the log file.
680
681 The debug levels are mapped to the corresponding syslog levels. When a
682 debug level is set, only those messages at that level and higher levels
683 will be printed.
684
685 The debug levels, from highest to lowest, are:
686
687 ERROR WARNING NOTICE INFO DEBUG
688
689 setdebug DEBUGLEVEL
690 Set the debug level of a node. This controls what information will be
691 logged.
692
693 The debuglevel is one of ERROR WARNING NOTICE INFO DEBUG
694
695 getpid
696 This command will return the process id of the ctdb daemon.
697
698 disable
699 This command is used to administratively disable a node in the cluster.
700 A disabled node will still participate in the cluster and host
701 clustered TDB records but its public ip address has been taken over by
702 a different node and it no longer hosts any services.
703
704 enable
705 Re-enable a node that has been administratively disabled.
706
707 stop
708 This command is used to administratively STOP a node in the cluster. A
709 STOPPED node is connected to the cluster but will not host any public
710 ip addresses, nor does it participate in the VNNMAP. The difference
711 between a DISABLED node and a STOPPED node is that a STOPPED node does
712 not host any parts of the database which means that a recovery is
713 required to stop/continue nodes.
714
715 continue
716 Re-start a node that has been administratively stopped.
717
718 addip IPADDR/mask IFACE
719 This command is used to add a new public ip to a node during runtime.
720 It should be followed by a ctdb ipreallocate. This allows public
721 addresses to be added to a cluster without having to restart the ctdb
722 daemons.
723
724 Note that this only updates the runtime instance of ctdb. Any changes
725 will be lost next time ctdb is restarted and the public addresses file
726 is re-read. If you want this change to be permanent you must also
727 update the public addresses file manually.
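The two steps described above, adding the address and then triggering a
reallocation, can be wrapped together. This is a minimal sketch; the
CTDB variable, the function name add_public_ip and the address in the
usage comment are hypothetical. It changes only the runtime state, so
the public addresses file still has to be updated separately.

```shell
# Hypothetical override so that the command can be swapped for a stub.
CTDB=${CTDB:-ctdb}

# $1: IPADDR/mask, $2: IFACE
# Runs "ctdb ipreallocate" only if "ctdb addip" succeeded.
add_public_ip() {
    "$CTDB" addip "$1" "$2" && "$CTDB" ipreallocate
}

# Possible use:
#   add_public_ip 10.0.0.50/24 eth2
```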
728
729 delip IPADDR
730 This command flags IPADDR for deletion from a node at runtime. It
731 should be followed by a ctdb ipreallocate. If IPADDR is currently
732 hosted by the node it is being removed from, this ensures that the IP
733 will first be failed over to another node, if possible, and that it is
734 then actually removed.
735
736 Note that this only updates the runtime instance of CTDB. Any changes
737 will be lost next time CTDB is restarted and the public addresses file
738 is re-read. If you want this change to be permanent you must also
739 update the public addresses file manually.
740
741 moveip IPADDR PNN
742 This command can be used to manually fail a public ip address to a
743 specific node.
744
745 In order to manually override the "automatic" distribution of public ip
746 addresses that ctdb normally provides, this command only works when you
747 have changed the tunables for the daemon to:
748
749 IPAllocAlgorithm != 0
750
751 NoIPFailback = 1
752
753 shutdown
754 This command will shutdown a specific CTDB daemon.
755
756 setlmasterrole on|off
757 This command is used to enable/disable the LMASTER capability for a
758 node at runtime. This capability determines whether or not a node can
759 be used as an LMASTER for records in the database. A node that does not
760 have the LMASTER capability will not show up in the vnnmap.
761
762 Nodes will by default have this capability, but it can be stripped off
763 nodes by the setting in the sysconfig file or by using this command.
764
765 Once this setting has been enabled/disabled, you need to perform a
766 recovery for it to take effect.
767
768 See also "ctdb getcapabilities"
769
770 setrecmasterrole on|off
771 This command is used to enable/disable the RECMASTER capability for a
772 node at runtime. This capability determines whether or not a node can
773 be used as a RECMASTER for the cluster. A node that does not have the
774 RECMASTER capability can not win a recmaster election. A node that
775 already is the recmaster for the cluster when the capability is
776 stripped off the node will remain the recmaster until the next cluster
777 election.
778
779 Nodes will by default have this capability, but it can be stripped off
780 nodes by the setting in the sysconfig file or by using this command.
781
782 See also "ctdb getcapabilities"
783
784 reloadnodes
785 This command is used when adding new nodes, or removing existing nodes
786 from an existing cluster.
787
788 Procedure to add nodes:
789
790 1. To expand an existing cluster, first ensure with ctdb status that
791 all nodes are up and running and that they are all healthy. Do not
792 try to expand a cluster unless it is completely healthy!
793
794 2. On all nodes, edit /etc/ctdb/nodes and add the new nodes at the end
795 of this file.
796
797 3. Verify that all the nodes have identical /etc/ctdb/nodes files
798 after adding the new nodes.
799
800 4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
801
802 5. Use ctdb status on all nodes and verify that they now show the
803 additional nodes.
804
805 6. Install and configure the new node and bring it online.
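Step 3 of the procedure above can be scripted by comparing one checksum
per node. The sketch below only does the comparison; how the checksums
are collected (for example, by running md5sum on /etc/ctdb/nodes on each
node over ssh) is an assumption left to the administrator. The function
name all_identical and the host names in the usage comment are
hypothetical.

```shell
# Reads one checksum per line on standard input and succeeds only if
# there is exactly one distinct value. Empty input counts as a failure.
all_identical() {
    [ "$(sort -u | wc -l | tr -d ' ')" -eq 1 ]
}

# Possible use, with hypothetical host names:
#   for h in node0 node1 node2; do
#       ssh "$h" md5sum /etc/ctdb/nodes | awk '{ print $1 }'
#   done | all_identical || echo "nodes files differ" >&2
```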
806
807 Procedure to remove nodes:
808
809 1. To remove nodes from an existing cluster, first ensure with ctdb
810 status that all nodes, except the node to be deleted, are up and
811 running and that they are all healthy. Do not try to remove nodes
812 from a cluster unless the cluster is completely healthy!
813
814 2. Shutdown and power off the node to be removed.
815
816 3. On all other nodes, edit the /etc/ctdb/nodes file and comment out
817 the nodes to be removed. Do not delete the lines for the deleted
818 nodes, just comment them out by adding a '#' at the beginning of
819 the lines.
820
821 4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
822
823 5. Use ctdb status on all nodes and verify that the deleted nodes are
824 no longer listed.
825
826 reloadips [PNN-LIST]
827 This command reloads the public addresses configuration file on the
828 specified nodes. When it completes addresses will be reconfigured and
829 reassigned across the cluster as necessary.
830
831 This command is currently unable to make changes to the netmask or
832 interfaces associated with existing addresses. Such changes must be
833 made in 2 steps by deleting addresses in question and re-adding them.
834 Unfortunately this will disrupt connections to the changed addresses.
835
836 getdbmap
837 This command lists all clustered TDB databases that the CTDB daemon has
838 attached to. Some databases are flagged as PERSISTENT, this means that
839 the database stores data persistently and the data will remain across
840 reboots. One example of such a database is secrets.tdb where
841 information about how the cluster was joined to the domain is stored.
842 Some databases are flagged as REPLICATED; this means that the data in
843 that database is replicated across all the nodes, but the data will not
844 remain across reboots. This type of database is used by CTDB to store
845 its internal state.
846
847 If a PERSISTENT database is not in a healthy state the database is
848 flagged as UNHEALTHY. If there's at least one completely healthy node
849 running in the cluster, it's possible that the content is restored by a
850 recovery run automatically. Otherwise an administrator needs to analyze
851 the problem.
852
853 See also "ctdb getdbstatus", "ctdb backupdb", "ctdb restoredb", "ctdb
854 dumpbackup", "ctdb wipedb", "ctdb setvar AllowUnhealthyDBRead 1" and
855 (if samba or tdb-utils are installed) "tdbtool check".
856
857 Most databases are not persistent and only store the state information
858 that the currently running samba daemons need. These databases are
859 always wiped when ctdb/samba starts and when a node is rebooted.
860
861 Example
862 # ctdb getdbmap
863 Number of databases:10
864 dbid:0x435d3410 name:notify.tdb path:/var/lib/ctdb/notify.tdb.0
865 dbid:0x42fe72c5 name:locking.tdb path:/var/lib/ctdb/locking.tdb.0
866 dbid:0x1421fb78 name:brlock.tdb path:/var/lib/ctdb/brlock.tdb.0
867 dbid:0x17055d90 name:connections.tdb path:/var/lib/ctdb/connections.tdb.0
868 dbid:0xc0bdde6a name:sessionid.tdb path:/var/lib/ctdb/sessionid.tdb.0
869 dbid:0x122224da name:test.tdb path:/var/lib/ctdb/test.tdb.0
870 dbid:0x2672a57f name:idmap2.tdb path:/var/lib/ctdb/persistent/idmap2.tdb.0 PERSISTENT
871 dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT
872 dbid:0xe98e08b6 name:group_mapping.tdb path:/var/lib/ctdb/persistent/group_mapping.tdb.0 PERSISTENT
873 dbid:0x7bbbd26c name:passdb.tdb path:/var/lib/ctdb/persistent/passdb.tdb.0 PERSISTENT
874
875 # ctdb getdbmap # example for unhealthy database
876 Number of databases:1
877 dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT UNHEALTHY
878
879 # ctdb -X getdbmap
880 |ID|Name|Path|Persistent|Unhealthy|
881 |0x7bbbd26c|passdb.tdb|/var/lib/ctdb/persistent/passdb.tdb.0|1|0|
882
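The machine readable output makes it straightforward to list databases
flagged as UNHEALTHY. The sketch below looks the columns up by their
header names rather than by position, on the assumption that other
versions may print additional columns. The function name unhealthy_dbs
is hypothetical.

```shell
# Reads "ctdb -X getdbmap" output on standard input and prints the name
# of every database whose Unhealthy column is 1.
unhealthy_dbs() {
    awk -F'|' '
        # Header line: remember which fields hold Name and Unhealthy.
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "Name") n = i
                if ($i == "Unhealthy") u = i
            }
            next
        }
        n && u && $u == "1" { print $n }'
}

# Possible use:
#   ctdb -X getdbmap | unhealthy_dbs
```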
883
884 backupdb DB FILE
885 Copy the contents of database DB to FILE. FILE can later be read back
886 using restoredb. This is mainly useful for backing up persistent
887 databases such as secrets.tdb and similar.
888
889 restoredb FILE [DB]
890 This command restores a persistent database that was previously backed
891 up using backupdb. By default the data will be restored back into the
892 same database as it was created from. By specifying DB you can
893 restore the data into a different database.
894
895 setdbreadonly DB
896 This command will enable the read-only record support for a database.
897 This is an experimental feature to improve performance for contended
898 records primarily in locking.tdb and brlock.tdb. When enabling this
899 feature you must set it on all nodes in the cluster.
900
901 setdbsticky DB
902 This command will enable the sticky record support for the specified
903 database. This is an experimental feature to improve performance for
904 contended records primarily in locking.tdb and brlock.tdb. When
905 enabling this feature you must set it on all nodes in the cluster.
906
908 Internal commands are used by CTDB's scripts and are not required for
909 managing a CTDB cluster. Their parameters and behaviour are subject to
910 change.
911
912 gettickles IPADDR
913 Show TCP connections that are registered with CTDB to be "tickled" if
914 there is a failover.
915
916 gratarp IPADDR INTERFACE
917 Send out a gratuitous ARP for the specified IP address through the
918 specified interface. This command is mainly used by the ctdb
919 eventscripts.
920
921 pdelete DB KEY
922 Delete KEY from DB.
923
924 pfetch DB KEY
925 Print the value associated with KEY in DB.
926
927 pstore DB KEY FILE
928 Store KEY in DB with contents of FILE as the associated value.
929
930 ptrans DB [FILE]
931 Read a list of key-value pairs, one per line from FILE, and store them
932 in DB using a single transaction. An empty value is equivalent to
933 deleting the given key.
934
935 The key and value should be separated by spaces or tabs. Each key/value
936 should be a printable string enclosed in double-quotes.
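
The input format described above can be generated with a small helper. This is a sketch only: ptrans_line is a hypothetical function, example.tdb is a placeholder database name, and keys or values that themselves contain double quotes are not handled.

```shell
ptrans_line() {
    # Usage: ptrans_line KEY [VALUE]
    # Emits one line with the key and value in double quotes, separated
    # by a space. An omitted VALUE yields "", which ptrans treats as a
    # request to delete KEY.
    printf '"%s" "%s"\n' "$1" "${2-}"
}

input=$(mktemp)
{
    ptrans_line colour blue     # store colour=blue
    ptrans_line stale-key       # delete stale-key
} > "$input"

cat "$input"
# To apply the file in a single transaction:
#   ctdb ptrans example.tdb "$input"
```

Because all pairs in the file are applied in one transaction, related updates can be grouped in a single input file.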
937
938 runstate [setup|first_recovery|startup|running]
939 Print the runstate of the specified node. Runstates are used to
940 serialise important state transitions in CTDB, particularly during
941 startup.
942
943 If one or more optional runstate arguments are specified then the node
944 must be in one of these runstates for the command to succeed.
945
946 Example
947 # ctdb runstate
948 RUNNING
949
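Because the command only succeeds when the node is in one of the listed runstates, it can serve as a readiness check. The following is a minimal sketch of a polling loop. The function name wait_for_runstate, the CTDB variable and the one-second retry interval are assumptions made for illustration.

```shell
CTDB=${CTDB:-ctdb}

wait_for_runstate() {
    # Usage: wait_for_runstate STATE [TRIES]
    # Returns 0 as soon as the node reports STATE, or 1 after TRIES
    # unsuccessful attempts (default 30).
    state=$1
    tries=${2:-30}
    while [ "$tries" -gt 0 ]; do
        # "ctdb runstate STATE" succeeds only if the node is in STATE.
        if "$CTDB" runstate "$state" >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        [ "$tries" -gt 0 ] && sleep 1
    done
    return 1
}
```

A startup script could then call "wait_for_runstate running 60" before starting services that depend on CTDB.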
950
951 setifacelink IFACE up|down
952 Set the internal state of network interface IFACE. This is typically
953 used in the 10.interface script in the "monitor" event.
954
955 Example: ctdb setifacelink eth0 up
956
957 tickle
958 Read a list of TCP connections, one per line, from standard input and
959 send a TCP tickle to the source host for each connection. A connection
960 is specified as:
961
962 SRC-IPADDR:SRC-PORT DST-IPADDR:DST-PORT
963
964
965 A single connection can be specified on the command-line rather than on
966 standard input.
967
968 A TCP tickle is a TCP ACK packet with an invalid sequence number and
969 acknowledgement number. When the source host receives it, it
970 immediately sends a correct ACK back to the other end.
971
972 TCP tickles are useful to "tickle" clients after an IP failover has
973 occurred, since they make the client recognize immediately that the TCP
974 connection has been disrupted and must be reestablished. This greatly
975 reduces the time it takes for a client to detect the failover and
976 reconnect.
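
The standard input format can be produced from separate address and port values. The sketch below assumes IPv4 addresses; tickle_line and tickle_all are hypothetical helpers, and the CTDB variable exists only so the binary can be substituted during testing.

```shell
CTDB=${CTDB:-ctdb}

tickle_line() {
    # Usage: tickle_line SRC-IPADDR SRC-PORT DST-IPADDR DST-PORT
    # Prints one connection as "SRC-IPADDR:SRC-PORT DST-IPADDR:DST-PORT".
    printf '%s:%s %s:%s\n' "$1" "$2" "$3" "$4"
}

tickle_all() {
    # Reads whitespace-separated rows "SRC-IPADDR SRC-PORT DST-IPADDR
    # DST-PORT" on standard input and feeds the formatted connections,
    # one per line, to "ctdb tickle".
    while read -r src sport dst dport; do
        tickle_line "$src" "$sport" "$dst" "$dport"
    done | "$CTDB" tickle
}
```

For example, "echo '192.0.2.10 49152 192.0.2.1 445' | tickle_all" would send a tickle for one client connection to port 445.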
977
978 version
979 Display the CTDB version.
980
982 These commands are primarily used for CTDB development and testing and
983 should not be used for normal administration.
984
985 OPTIONS
986 --print-emptyrecords
987 This enables printing of empty records when dumping databases with
988 the catdb, cattdb and dumpdbbackup commands. Records with an empty
989 data segment are considered deleted by ctdb and cleaned by the
990 vacuuming mechanism, so this switch can come in handy for debugging
991 the vacuuming behaviour.
992
993 --print-datasize
994 This lets database dumps (catdb, cattdb, dumpdbbackup) print the
995 size of the record data instead of dumping the data contents.
996
997 --print-lmaster
998 This lets catdb print the lmaster for each record.
999
1000 --print-hash
1001 This lets database dumps (catdb, cattdb, dumpdbbackup) print the
1002 hash for each record.
1003
1004 --print-recordflags
1005 This lets catdb and dumpdbbackup print the record flags for each
1006 record. Note that cattdb always prints the flags.
1007
1008 process-exists PID [SRVID]
1009 This command checks if a specific process exists on the CTDB host. This
1010 is mainly used by Samba to check if remote instances of Samba are still
1011 running or not. When the optional SRVID argument is specified, the
1012 command checks if a specific process exists on the CTDB host and has
1013 registered for the specified SRVID.
1014
1015 getdbstatus DB
1016 This command displays more details about a database.
1017
1018 Example
1019 # ctdb getdbstatus test.tdb.0
1020 dbid: 0x122224da
1021 name: test.tdb
1022 path: /var/lib/ctdb/test.tdb.0
1023 PERSISTENT: no
1024 HEALTH: OK
1025
1026 # ctdb getdbstatus registry.tdb # with a corrupted TDB
1027 dbid: 0xf2a58948
1028 name: registry.tdb
1029 path: /var/lib/ctdb/persistent/registry.tdb.0
1030 PERSISTENT: yes
1031 HEALTH: NO-HEALTHY-NODES - ERROR - Backup of corrupted TDB in '/var/lib/ctdb/persistent/registry.tdb.0.corrupted.20091208091949.0Z'
1032
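In a test harness the HEALTH line can be extracted from this output. The sketch below assumes the output layout shown in the examples above; db_health and db_is_healthy are hypothetical helpers, and the CTDB variable exists only so the binary can be substituted during testing.

```shell
CTDB=${CTDB:-ctdb}

db_health() {
    # Usage: db_health DB
    # Prints the text that follows "HEALTH: " in the getdbstatus output.
    "$CTDB" getdbstatus "$1" | sed -n 's/^[[:space:]]*HEALTH: //p'
}

db_is_healthy() {
    # Usage: db_is_healthy DB
    # Succeeds only if the reported health is exactly "OK".
    [ "$(db_health "$1")" = "OK" ]
}
```

With the first example above, "db_is_healthy test.tdb.0" would succeed, while the corrupted registry.tdb would fail the check.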
1033
1034 catdb DB
1035 Print a dump of the clustered TDB database DB.
1036
1037 cattdb DB
1038 Print a dump of the contents of the local TDB database DB.
1039
1040 dumpdbbackup FILE
1041 Print a dump of the contents from database backup FILE, similar to
1042 catdb.
1043
1044 wipedb DB
1045 Remove all contents of database DB.
1046
1047 recover
1048 This command will trigger the recovery daemon to do a cluster recovery.
1049
1050 ipreallocate, sync
1051 This command will force the recovery master to perform a full IP
1052 reallocation process and redistribute all IP addresses. This is useful
1053 to "reset" the allocations back to their default state if they have been
1054 changed using the "moveip" command. While a "recover" will also perform
1055 this reallocation, a recovery is much more heavyweight since it will
1056 also rebuild all the databases.
1057
1058 attach DBNAME [persistent|replicated]
1059 Create a new CTDB database called DBNAME and attach to it on all nodes.
1060
1061 detach DB-LIST
1062 Detach specified non-persistent database(s) from the cluster. This
1063 command will disconnect specified database(s) on all nodes in the
1064 cluster. This command should only be used when none of the specified
1065 database(s) are in use.
1066
1067 All nodes should be active and tunable AllowClientDBAccess should be
1068 disabled on all nodes before detaching databases.
1069
1070 dumpmemory
1071 This is a debugging command. This command will make the ctdb daemon
1072 write a full memory allocation map to standard output.
1073
1074 rddumpmemory
1075 This is a debugging command. This command will dump the talloc memory
1076 allocation tree for the recovery daemon to standard output.
1077
1078 ban BANTIME
1079 Administratively ban a node for BANTIME seconds. The node will be
1080 unbanned after BANTIME seconds have elapsed.
1081
1082 A banned node does not participate in the cluster. It does not host any
1083 records for the clustered TDB and does not host any public IP
1084 addresses.
1085
1086 Nodes are automatically banned if they misbehave. For example, a node
1087 may be banned if it causes too many cluster recoveries.
1088
1089 To administratively exclude a node from a cluster use the stop command.
1090
1091 unban
1092 This command is used to unban a node that has either been
1093 administratively banned using the ban command or has been automatically
1094 banned.
1095
1097 ctdbd(1), onnode(1), ctdb(7), ctdb-statistics(7), ctdb-tunables(7),
1098 http://ctdb.samba.org/
1099
1101 This documentation was written by Ronnie Sahlberg, Amitay Isaacs,
1102 Martin Schwenke
1103
1105 Copyright © 2007 Andrew Tridgell, Ronnie Sahlberg
1106
1107 This program is free software; you can redistribute it and/or modify it
1108 under the terms of the GNU General Public License as published by the
1109 Free Software Foundation; either version 3 of the License, or (at your
1110 option) any later version.
1111
1112 This program is distributed in the hope that it will be useful, but
1113 WITHOUT ANY WARRANTY; without even the implied warranty of
1114 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1115 General Public License for more details.
1116
1117 You should have received a copy of the GNU General Public License along
1118 with this program; if not, see http://www.gnu.org/licenses.
1119
1120
1121
1122
1123ctdb 10/30/2018 CTDB(1)