CTDB(1)                  CTDB - clustered TDB database                 CTDB(1)

NAME

       ctdb - CTDB management utility

SYNOPSIS

       ctdb [OPTION...] {COMMAND} [COMMAND-ARGS]

DESCRIPTION

       ctdb is a utility to view and manage a CTDB cluster.

       The following terms are used when referring to nodes in a cluster:

       PNN
           Physical Node Number. The physical node number is an integer that
           identifies the node in the cluster. The first node has physical
           node number 0.

       PNN-LIST
           This is either a single PNN, a comma-separated list of PNNs or
           "all".

       Commands that reference a database use the following terms:

       DB
           This is either a database name, such as locking.tdb, or a database
           ID, such as "0x42fe72c5".

       DB-LIST
           A space-separated list of at least one DB.

OPTIONS

       -n PNN
           The node specified by PNN should be queried for the requested
           information. Default is to query the daemon running on the local
           host.

       -Y
           Produce output in machine readable form for easier parsing by
           scripts. This uses a field delimiter of ':'. Not all commands
           support this option.

       -x SEPARATOR
           Use SEPARATOR to delimit fields in machine readable output. This
           implies -Y.

       -X
           Produce output in machine readable form for easier parsing by
           scripts. This uses a field delimiter of '|'. Not all commands
           support this option.

           This is equivalent to "-x|" and avoids some shell quoting issues.

       -t TIMEOUT
           Indicates that ctdb should wait up to TIMEOUT seconds for a
           response to most commands sent to the CTDB daemon. The default is
           10 seconds.

       -T TIMELIMIT
           Indicates that TIMELIMIT is the maximum run time (in seconds) for
           the ctdb command. When TIMELIMIT is exceeded the ctdb command will
           terminate with an error. The default is 120 seconds.

       -? --help
           Print some help text to the screen.

       --usage
           Print usage information to the screen.

       -d --debug=DEBUGLEVEL
           Change the debug level for the command. Default is NOTICE.

       --socket=FILENAME
           Specify that FILENAME is the name of the Unix domain socket to use
           when connecting to the local CTDB daemon. The default is
           /var/run/ctdb/ctdbd.socket.

ADMINISTRATIVE COMMANDS

       These are commands used to monitor and administer a CTDB cluster.

   pnn
       This command displays the PNN of the current node.

   status
       This command shows the current status of all CTDB nodes based on
       information from the queried node.

       Note: If the queried node is INACTIVE then the status might not be
       current.

       Node status
           This includes the number of physical nodes and the status of each
           node. See ctdb(7) for information about node states.

       Generation
           The generation id is a number that indicates the current generation
           of a cluster instance. Each time a cluster goes through a
           reconfiguration or a recovery its generation id will be changed.

           This number does not have any particular meaning other than to keep
           track of when a cluster has gone through a recovery. It is a random
           number that represents the current instance of a ctdb cluster and
           its databases. The CTDB daemon uses this number internally to
           detect when commands that operate on the cluster and the databases
           were issued in a different generation of the cluster. This ensures
           that commands that operate on the databases will not survive
           across a cluster database recovery. After a recovery, all old
           outstanding commands will automatically become invalid.

           Sometimes this number will be shown as "INVALID". This only means
           that the ctdbd daemon has started but it has not yet merged with
           the cluster through a recovery. All nodes start with generation
           "INVALID" and are not assigned a real generation id until they have
           successfully been merged with a cluster through a recovery.

       Virtual Node Number (VNN) map
           Consists of the number of virtual nodes and the mapping from
           virtual node numbers to physical node numbers. Virtual nodes host
           CTDB databases. Only nodes that are participating in the VNN map
           can become lmaster or dmaster for database records.

       Recovery mode
           This is the current recovery mode of the cluster. There are two
           possible modes:

           NORMAL - The cluster is fully operational.

           RECOVERY - The cluster databases have all been frozen, pausing all
           services while the cluster awaits a recovery process to complete. A
           recovery process should finish within seconds. If a cluster is
           stuck in the RECOVERY state this would indicate a cluster
           malfunction which needs to be investigated.

           Once the recovery master detects an inconsistency, for example
           when a node becomes disconnected or connected, the recovery daemon
           will trigger a cluster recovery process, where all databases are
           remerged across the cluster. When this process starts, the recovery
           master will first "freeze" all databases to prevent applications
           such as samba from accessing the databases and it will also mark
           the recovery mode as RECOVERY.

           When the CTDB daemon starts up, it will start in RECOVERY mode.
           Once the node has been merged into a cluster and all databases have
           been recovered, the node mode will change into NORMAL mode and the
           databases will be "thawed", allowing samba to access the databases
           again.

       Recovery master
           This is the cluster node that is currently designated as the
           recovery master. This node is responsible for monitoring the
           consistency of the cluster and for performing the actual recovery
           process when required.

           Only one node at a time can be the designated recovery master.
           Which node is designated the recovery master is decided by an
           election process in the recovery daemons running on each node.

       Example
               # ctdb status
               Number of nodes:4
               pnn:0 192.168.2.200       OK (THIS NODE)
               pnn:1 192.168.2.201       OK
               pnn:2 192.168.2.202       OK
               pnn:3 192.168.2.203       OK
               Generation:1362079228
               Size:4
               hash:0 lmaster:0
               hash:1 lmaster:1
               hash:2 lmaster:2
               hash:3 lmaster:3
               Recovery mode:NORMAL (0)
               Recovery master:0
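
       The status fields described above can be pulled out of the
       human-readable output by a script. The following is a minimal
       sketch, assuming the output has exactly the layout of the example;
       parse_status is a hypothetical helper, not part of CTDB.

```python
import re

# Sample text copied from the "ctdb status" example in this page.
SAMPLE = """\
Number of nodes:4
pnn:0 192.168.2.200       OK (THIS NODE)
pnn:1 192.168.2.201       OK
pnn:2 192.168.2.202       OK
pnn:3 192.168.2.203       OK
Generation:1362079228
Size:4
hash:0 lmaster:0
hash:1 lmaster:1
hash:2 lmaster:2
hash:3 lmaster:3
Recovery mode:NORMAL (0)
Recovery master:0
"""


def parse_status(text):
    """Parse "ctdb status" text into a dict (hypothetical helper)."""
    result = {"nodes": {}, "vnnmap": {}}
    for line in text.splitlines():
        line = line.strip()
        m = re.match(r"pnn:(\d+)\s+(\S+)\s+(.+)$", line)
        if m:
            # Drop the "(THIS NODE)" marker so only the state remains.
            state = m.group(3).replace("(THIS NODE)", "").strip()
            result["nodes"][int(m.group(1))] = {
                "address": m.group(2),
                "state": state,
            }
            continue
        m = re.match(r"hash:(\d+)\s+lmaster:(\d+)$", line)
        if m:
            # VNN map entry: virtual node number -> physical node number.
            result["vnnmap"][int(m.group(1))] = int(m.group(2))
            continue
        m = re.match(r"Generation:(\S+)$", line)
        if m:
            # Kept as a string because it may be "INVALID".
            result["generation"] = m.group(1)
            continue
        m = re.match(r"Recovery mode:(\w+)", line)
        if m:
            result["recovery_mode"] = m.group(1)
            continue
        m = re.match(r"Recovery master:(\d+)$", line)
        if m:
            result["recmaster"] = int(m.group(1))
    return result


status = parse_status(SAMPLE)
print(status["recovery_mode"], status["recmaster"])
```

       Where a command supports it, the machine readable output selected
       with -X or -Y is usually a more robust input for scripts than this
       layout.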

   nodestatus [PNN-LIST]
       This command is similar to the status command. It displays the "node
       status" subset of output. The main differences are:

       ·   The exit code is the bitwise-OR of the flags for each specified
           node, while ctdb status exits with 0 if it was able to retrieve
           status for all nodes.

       ·   ctdb status provides status information for all nodes.  ctdb
           nodestatus defaults to providing status for only the current node.
           If PNN-LIST is provided then status is given for the indicated
           node(s).

       A common invocation in scripts is ctdb nodestatus all to check whether
       all nodes in a cluster are healthy.

       Example
               # ctdb nodestatus
               pnn:0 10.0.0.30        OK (THIS NODE)

               # ctdb nodestatus all
               Number of nodes:2
               pnn:0 10.0.0.30        OK (THIS NODE)
               pnn:1 10.0.0.31        OK
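
       The health check mentioned above can be sketched in a script as
       follows. This assumes that a node in the OK state has no flags
       set, so that an exit code of 0 means every node is healthy;
       all_nodes_healthy and its run parameter are hypothetical names
       introduced for illustration.

```python
import subprocess


def all_nodes_healthy(run=subprocess.run):
    """Return True if "ctdb nodestatus all" exits with status 0.

    The exit code is the bitwise-OR of the per-node flags, so 0 is
    taken to mean that no node has any flag set (an assumption).
    The run parameter exists only so the call can be replaced in tests.
    """
    proc = run(["ctdb", "nodestatus", "all"], stdout=subprocess.DEVNULL)
    return proc.returncode == 0
```

       A caller would typically use the result to decide whether to
       raise an alert, without needing to parse the textual output.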

   recmaster
       This command shows the PNN of the node which is currently the
       recmaster.

       Note: If the queried node is INACTIVE then the status might not be
       current.

   uptime
       This command shows the uptime for the ctdb daemon, when the last
       recovery or IP failover completed, and how long it took. If the
       "duration" is shown as a negative number, this indicates that there is
       a recovery/failover in progress and it started that many seconds ago.

       Example
               # ctdb uptime
               Current time of node          :                Thu Oct 29 10:38:54 2009
               Ctdbd start time              : (000 16:54:28) Wed Oct 28 17:44:26 2009
               Time of last recovery/failover: (000 16:53:31) Wed Oct 28 17:45:23 2009
               Duration of last recovery/failover: 2.248552 seconds
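
       The sign convention for the duration can be checked by a script.
       The following is a minimal sketch, assuming the duration line has
       the layout shown in the example and that a negative value is
       printed with a leading minus sign; both function names are
       hypothetical.

```python
import re

# Sample text copied from the "ctdb uptime" example in this page.
SAMPLE = """\
Current time of node          :                Thu Oct 29 10:38:54 2009
Ctdbd start time              : (000 16:54:28) Wed Oct 28 17:44:26 2009
Time of last recovery/failover: (000 16:53:31) Wed Oct 28 17:45:23 2009
Duration of last recovery/failover: 2.248552 seconds
"""

DURATION = re.compile(
    r"Duration of last recovery/failover:\s*(-?\d+(?:\.\d+)?) seconds"
)


def recovery_duration(text):
    """Return the reported duration in seconds as a float."""
    m = DURATION.search(text)
    if m is None:
        raise ValueError("no duration line found")
    return float(m.group(1))


def recovery_in_progress(text):
    """A negative duration means a recovery/failover is still running."""
    return recovery_duration(text) < 0


print(recovery_duration(SAMPLE), recovery_in_progress(SAMPLE))
```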

   listnodes
       This command lists the IP addresses of all the nodes in the cluster.

       Example
               # ctdb listnodes
               192.168.2.200
               192.168.2.201
               192.168.2.202
               192.168.2.203

   natgw {master|list|status}
       This command shows different aspects of NAT gateway status. For an
       overview of CTDB's NAT gateway functionality please see the NAT GATEWAY
       section in ctdb(7).

       master
           Show the PNN and private IP address of the current NAT gateway
           master node.

           Example output:

               1 192.168.2.201

       list
           List the private IP addresses of nodes in the current NAT gateway
           group, annotating the master node.

           Example output:

               192.168.2.200
               192.168.2.201  MASTER
               192.168.2.202
               192.168.2.203

       status
           List the nodes in the current NAT gateway group and their status.

           Example output:

               pnn:0 192.168.2.200       UNHEALTHY (THIS NODE)
               pnn:1 192.168.2.201       OK
               pnn:2 192.168.2.202       OK
               pnn:3 192.168.2.203       OK

   ping
       This command will "ping" the specified CTDB nodes in the cluster to
       verify that they are running.

       Example
               # ctdb ping
               response from 0 time=0.000054 sec  (3 clients)

   ifaces
       This command will display the list of network interfaces that could
       host public addresses, along with their status.

       Example
               # ctdb ifaces
               Interfaces on node 0
               name:eth5 link:up references:2
               name:eth4 link:down references:0
               name:eth3 link:up references:1
               name:eth2 link:up references:1

               # ctdb -X ifaces
               |Name|LinkStatus|References|
               |eth5|1|2|
               |eth4|0|0|
               |eth3|1|1|
               |eth2|1|1|
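
       Output produced with -X is intended for scripts. The following is
       a minimal sketch of reading it, assuming that the first line is a
       header row and that every line starts and ends with the
       separator, as in the example; parse_machine_readable is a
       hypothetical helper.

```python
# Sample text copied from the "ctdb -X ifaces" example in this page.
SAMPLE = """\
|Name|LinkStatus|References|
|eth5|1|2|
|eth4|0|0|
|eth3|1|1|
|eth2|1|1|
"""


def parse_machine_readable(text, sep="|"):
    """Turn separator-delimited output into a list of dicts."""
    header = None
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        fields = line.split(sep)
        # Lines begin and end with the separator, which yields one
        # empty field at each edge; drop exactly those two.
        if fields and fields[0] == "":
            fields = fields[1:]
        if fields and fields[-1] == "":
            fields = fields[:-1]
        if header is None:
            header = fields
        else:
            rows.append(dict(zip(header, fields)))
    return rows


ifaces = parse_machine_readable(SAMPLE)
down = [row["Name"] for row in ifaces if row["LinkStatus"] == "0"]
print(down)  # ['eth4']
```

       Passing sep=":" would handle output produced with -Y in the same
       way, provided no field itself contains the separator.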

   ip
       This command will display the list of public addresses that are
       provided by the cluster and which physical node is currently serving
       each address. By default this command will ONLY show those public
       addresses that are known to the node itself. To see the full list of
       all public IPs across the cluster you must use "ctdb ip all".

       Example
               # ctdb ip -v
               Public IPs on node 0
               172.31.91.82 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
               172.31.91.83 node[0] active[eth3] available[eth2,eth3] configured[eth2,eth3]
               172.31.91.84 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
               172.31.91.85 node[0] active[eth2] available[eth2,eth3] configured[eth2,eth3]
               172.31.92.82 node[1] active[] available[eth5] configured[eth4,eth5]
               172.31.92.83 node[0] active[eth5] available[eth5] configured[eth4,eth5]
               172.31.92.84 node[1] active[] available[eth5] configured[eth4,eth5]
               172.31.92.85 node[0] active[eth5] available[eth5] configured[eth4,eth5]

               # ctdb -X ip -v
               |Public IP|Node|ActiveInterface|AvailableInterfaces|ConfiguredInterfaces|
               |172.31.91.82|1||eth2,eth3|eth2,eth3|
               |172.31.91.83|0|eth3|eth2,eth3|eth2,eth3|
               |172.31.91.84|1||eth2,eth3|eth2,eth3|
               |172.31.91.85|0|eth2|eth2,eth3|eth2,eth3|
               |172.31.92.82|1||eth5|eth4,eth5|
               |172.31.92.83|0|eth5|eth5|eth4,eth5|
               |172.31.92.84|1||eth5|eth4,eth5|
               |172.31.92.85|0|eth5|eth5|eth4,eth5|

   ipinfo IP
       This command will display details about the specified public address.

       Example
               # ctdb ipinfo 172.31.92.85
               Public IP[172.31.92.85] info on node 0
               IP:172.31.92.85
               CurrentNode:0
               NumInterfaces:2
               Interface[1]: Name:eth4 Link:down References:0
               Interface[2]: Name:eth5 Link:up References:2 (active)

   event run|status|script list|script enable|script disable
       This command is used to control the event daemon and to inspect the
       status of various events.

       run EVENT TIMEOUT [ARGUMENTS]
           This command can be used to manually run the specified EVENT with
           optional ARGUMENTS. The event will be allowed to run a maximum of
           TIMEOUT seconds. If TIMEOUT is 0, then there is no time limit for
           running the event.

       status [EVENT] [lastrun|lastpass|lastfail]
           This command displays the last execution status of the specified
           EVENT. If no event is specified, then the status of the last
           executed monitor event will be displayed.

           To see the last successful execution of the event, lastpass can be
           specified. Similarly lastfail can be specified to see the last
           unsuccessful execution of the event. The optional lastrun can be
           specified to query the last execution of the event.

           The command will terminate with the exit status corresponding to
           the overall status of the event that is displayed. If lastpass is
           specified, then the command will always terminate with 0. If
           lastfail is specified then the command will always terminate with
           a non-zero exit status. If lastrun is specified, then the command
           will terminate with 0 if the last execution of the event was
           successful and with a non-zero exit status otherwise.

           The output is the list of event scripts executed. Each line shows
           the name, status, duration and start time for each script.

           Example output:

               00.ctdb              OK         0.014 Sat Dec 17 19:39:11 2016
               01.reclock           OK         0.013 Sat Dec 17 19:39:11 2016
               05.system            OK         0.029 Sat Dec 17 19:39:11 2016
               06.nfs               OK         0.014 Sat Dec 17 19:39:11 2016
               10.external          DISABLED
               10.interface         OK         0.037 Sat Dec 17 19:39:11 2016
               11.natgw             OK         0.011 Sat Dec 17 19:39:11 2016
               11.routing           OK         0.007 Sat Dec 17 19:39:11 2016
               13.per_ip_routing    OK         0.007 Sat Dec 17 19:39:11 2016
               20.multipathd        OK         0.007 Sat Dec 17 19:39:11 2016
               31.clamd             OK         0.007 Sat Dec 17 19:39:11 2016
               40.vsftpd            OK         0.013 Sat Dec 17 19:39:11 2016
               41.httpd             OK         0.018 Sat Dec 17 19:39:11 2016
               49.winbind           OK         0.023 Sat Dec 17 19:39:11 2016
               50.samba             OK         0.100 Sat Dec 17 19:39:12 2016
               60.nfs               OK         0.376 Sat Dec 17 19:39:12 2016
               70.iscsi             OK         0.009 Sat Dec 17 19:39:12 2016
               91.lvs               OK         0.007 Sat Dec 17 19:39:12 2016
               99.timeout           OK         0.007 Sat Dec 17 19:39:12 2016
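
           The per-script lines (name, status, duration, start time)
           can be split by a script. The following is a minimal sketch,
           assuming the column layout of the example output, where a
           DISABLED script has no duration or start time;
           parse_event_status is a hypothetical helper and the sample is
           a shortened copy of the example.

```python
import re

# Shortened copy of the "ctdb event status" example output.
SAMPLE = """\
00.ctdb              OK         0.014 Sat Dec 17 19:39:11 2016
10.external          DISABLED
50.samba             OK         0.100 Sat Dec 17 19:39:12 2016
"""

# name, status, then an optional "duration start-time" pair.
LINE = re.compile(r"^(\S+)\s+(\S+)(?:\s+(\d+\.\d+)\s+(.+))?$")


def parse_event_status(text):
    """Return one dict per event script line; other lines are skipped."""
    scripts = []
    for line in text.splitlines():
        m = LINE.match(line.strip())
        if m is None:
            # Anything that is not a script line, such as script output.
            continue
        name, status, duration, started = m.groups()
        scripts.append({
            "name": name,
            "status": status,
            "duration": float(duration) if duration else None,
            "started": started,
        })
    return scripts


scripts = parse_event_status(SAMPLE)
not_ok = [s["name"] for s in scripts if s["status"] != "OK"]
print(not_ok)  # ['10.external']
```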

       script list
           List the available event scripts.

           Example output:

               00.ctdb
               01.reclock
               05.system
               06.nfs
               10.external          DISABLED
               10.interface
               11.natgw
               11.routing
               13.per_ip_routing
               20.multipathd
               31.clamd
               40.vsftpd
               41.httpd
               49.winbind
               50.samba
               60.nfs
               70.iscsi
               91.lvs
               99.timeout

       script enable SCRIPT
           Enable the specified event SCRIPT. Only enabled scripts will be
           executed when running any event.

       script disable SCRIPT
           Disable the specified event SCRIPT. This will prevent the script
           from executing when running any event.

   scriptstatus
       This command displays which event scripts were run in the previous
       monitoring cycle and the result of each script. If a script failed with
       an error, causing the node to become unhealthy, the output from that
       script is also shown.

       This command is deprecated. It's provided for backward compatibility.
       In place of ctdb scriptstatus, use ctdb event status.

       Example
               # ctdb scriptstatus
               00.ctdb              OK         0.011 Sat Dec 17 19:40:46 2016
               01.reclock           OK         0.010 Sat Dec 17 19:40:46 2016
               05.system            OK         0.030 Sat Dec 17 19:40:46 2016
               06.nfs               OK         0.014 Sat Dec 17 19:40:46 2016
               10.external          DISABLED
               10.interface         OK         0.041 Sat Dec 17 19:40:46 2016
               11.natgw             OK         0.008 Sat Dec 17 19:40:46 2016
               11.routing           OK         0.007 Sat Dec 17 19:40:46 2016
               13.per_ip_routing    OK         0.007 Sat Dec 17 19:40:46 2016
               20.multipathd        OK         0.007 Sat Dec 17 19:40:46 2016
               31.clamd             OK         0.007 Sat Dec 17 19:40:46 2016
               40.vsftpd            OK         0.013 Sat Dec 17 19:40:46 2016
               41.httpd             OK         0.015 Sat Dec 17 19:40:46 2016
               49.winbind           OK         0.022 Sat Dec 17 19:40:46 2016
               50.samba             ERROR      0.077 Sat Dec 17 19:40:46 2016
                 OUTPUT: ERROR: samba tcp port 445 is not responding

   listvars
       List all tunable variables, except obsolete tunables such as
       VacuumMinInterval. Obsolete tunables can only be retrieved explicitly
       with the "ctdb getvar" command.

       Example
               # ctdb listvars
               SeqnumInterval          = 1000
               ControlTimeout          = 60
               TraverseTimeout         = 20
               KeepaliveInterval       = 5
               KeepaliveLimit          = 5
               RecoverTimeout          = 120
               RecoverInterval         = 1
               ElectionTimeout         = 3
               TakeoverTimeout         = 9
               MonitorInterval         = 15
               TickleUpdateInterval    = 20
               EventScriptTimeout      = 30
               MonitorTimeoutCount     = 20
               RecoveryGracePeriod     = 120
               RecoveryBanPeriod       = 300
               DatabaseHashSize        = 100001
               DatabaseMaxDead         = 5
               RerecoveryTimeout       = 10
               EnableBans              = 1
               NoIPFailback            = 0
               DisableIPFailover       = 0
               VerboseMemoryNames      = 0
               RecdPingTimeout         = 60
               RecdFailCount           = 10
               LogLatencyMs            = 0
               RecLockLatencyMs        = 1000
               RecoveryDropAllIPs      = 120
               VacuumInterval          = 10
               VacuumMaxRunTime        = 120
               RepackLimit             = 10000
               VacuumLimit             = 5000
               VacuumFastPathCount     = 60
               MaxQueueDropMsg         = 1000000
               AllowUnhealthyDBRead    = 0
               StatHistoryInterval     = 1
               DeferredAttachTO        = 120
               AllowClientDBAttach     = 1
               RecoverPDBBySeqNum      = 1
               DeferredRebalanceOnNodeAdd = 300
               FetchCollapse           = 1
               HopcountMakeSticky      = 50
               StickyDuration          = 600
               StickyPindown           = 200
               NoIPTakeover            = 0
               DBRecordCountWarn       = 100000
               DBRecordSizeWarn        = 10000000
               DBSizeWarn              = 100000000
               PullDBPreallocation     = 10485760
               NoIPHostOnAllDisabled   = 0
               TDBMutexEnabled         = 1
               LockProcessesPerDB      = 200
               RecBufferSizeLimit      = 1000000
               QueueBufferSize         = 1024
               IPAllocAlgorithm        = 2
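
       The "NAME = VALUE" layout is easy to load into a script. The
       following is a minimal sketch, assuming every value is an integer
       as in the example; parse_tunables is a hypothetical helper and
       the sample is a shortened copy of the example.

```python
# Shortened copy of the "ctdb listvars" example output.
SAMPLE = """\
MonitorInterval         = 15
DeferredRebalanceOnNodeAdd = 300
IPAllocAlgorithm        = 2
"""


def parse_tunables(text):
    """Map each tunable name to its integer value."""
    tunables = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            # Not a "NAME = VALUE" line.
            continue
        tunables[name.strip()] = int(value.strip())
    return tunables


tunables = parse_tunables(SAMPLE)
print(tunables["MonitorInterval"])  # 15
```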

   getvar NAME
       Get the runtime value of a tunable variable.

       Example
               # ctdb getvar MonitorInterval
               MonitorInterval         = 15

   setvar NAME VALUE
       Set the runtime value of a tunable variable.

       Example
               # ctdb setvar MonitorInterval 20

   lvs {master|list|status}
       This command shows different aspects of LVS status. For an overview of
       CTDB's LVS functionality please see the LVS section in ctdb(7).

       master
           Shows the PNN of the current LVS master node.

           Example output:

               2

       list
           Lists the currently usable LVS nodes.

           Example output:

               2 10.0.0.13
               3 10.0.0.14

       status
           List the nodes in the current LVS group and their status.

           Example output:

               pnn:0 10.0.0.11        UNHEALTHY (THIS NODE)
               pnn:1 10.0.0.12        UNHEALTHY
               pnn:2 10.0.0.13        OK
               pnn:3 10.0.0.14        OK

   getcapabilities
       This command shows the capabilities of the current node. See the
       CAPABILITIES section in ctdb(7) for more details.

       Example output:

           RECMASTER: YES
           LMASTER: YES

   statistics
       Collect statistics from the CTDB daemon about how many calls it has
       served. Information about various fields in statistics can be found in
       ctdb-statistics(7).

       Example
               # ctdb statistics
               CTDB version 1
               Current time of statistics  :                Tue Mar  8 15:18:51 2016
               Statistics collected since  : (003 21:31:32) Fri Mar  4 17:47:19 2016
                num_clients                        9
                frozen                             0
                recovering                         0
                num_recoveries                     2
                client_packets_sent          8170534
                client_packets_recv          7166132
                node_packets_sent           16549998
                node_packets_recv            5244418
                keepalive_packets_sent        201969
                keepalive_packets_recv        201969
                node
                    req_call                      26
                    reply_call                     0
                    req_dmaster                    9
                    reply_dmaster                 12
                    reply_error                    0
                    req_message              1339231
                    req_control              8177506
                    reply_control            6831284
                client
                    req_call                      15
                    req_message               334809
                    req_control              6831308
                timeouts
                    call                           0
                    control                        0
                    traverse                       0
                locks
                    num_calls                      8
                    num_current                    0
                    num_pending                    0
                    num_failed                     0
                total_calls                       15
                pending_calls                      0
                childwrite_calls                   0
                pending_childwrite_calls             0
                memory_used                   394879
                max_hop_count                      1
                total_ro_delegations               0
                total_ro_revokes                   0
                hop_count_buckets: 8 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0
                lock_buckets: 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0
                locks_latency      MIN/AVG/MAX     0.010005/0.010418/0.011010 sec out of 8
                reclock_ctdbd      MIN/AVG/MAX     0.002538/0.002538/0.002538 sec out of 1
                reclock_recd       MIN/AVG/MAX     0.000000/0.000000/0.000000 sec out of 0
                call_latency       MIN/AVG/MAX     0.000044/0.002142/0.011702 sec out of 15
                childwrite_latency MIN/AVG/MAX     0.000000/0.000000/0.000000 sec out of 0
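
       The latency lines at the end of the output follow a fixed
       pattern. The following is a minimal sketch of extracting them,
       assuming the "MIN/AVG/MAX ... sec out of N" layout of the
       example; parse_latencies is a hypothetical helper and the sample
       is a shortened copy of the example.

```python
import re

# Shortened copy of the latency lines from the "ctdb statistics" example.
SAMPLE = """\
 locks_latency      MIN/AVG/MAX     0.010005/0.010418/0.011010 sec out of 8
 reclock_recd       MIN/AVG/MAX     0.000000/0.000000/0.000000 sec out of 0
 call_latency       MIN/AVG/MAX     0.000044/0.002142/0.011702 sec out of 15
"""

LATENCY = re.compile(
    r"^\s*(\w+)\s+MIN/AVG/MAX\s+([\d.]+)/([\d.]+)/([\d.]+) sec out of (\d+)\s*$"
)


def parse_latencies(text):
    """Map each latency name to its min, avg, max (seconds) and count."""
    latencies = {}
    for line in text.splitlines():
        m = LATENCY.match(line)
        if m is None:
            continue
        name, low, avg, high, count = m.groups()
        latencies[name] = {
            "min": float(low),
            "avg": float(avg),
            "max": float(high),
            "count": int(count),
        }
    return latencies


latencies = parse_latencies(SAMPLE)
print(latencies["call_latency"]["max"])  # 0.011702
```

       The same pattern also matches the locks_latency and
       vacuum_latency lines printed by ctdb dbstatistics.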

   statisticsreset
       This command is used to clear all statistics counters in a node.

       Example: ctdb statisticsreset

   dbstatistics DB
       Display statistics about the database DB. Information about various
       fields in dbstatistics can be found in ctdb-statistics(7).

       Example
               # ctdb dbstatistics locking.tdb
               DB Statistics: locking.tdb
                ro_delegations                     0
                ro_revokes                         0
                locks
                    total                      14356
                    failed                         0
                    current                        0
                    pending                        0
                hop_count_buckets: 28087 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0
                lock_buckets: 0 14188 38 76 32 19 3 0 0 0 0 0 0 0 0 0
                locks_latency      MIN/AVG/MAX     0.001066/0.012686/4.202292 sec out of 14356
                vacuum_latency     MIN/AVG/MAX     0.000472/0.002207/15.243570 sec out of 224530
                Num Hot Keys:     1
                    Count:8 Key:ff5bd7cb3ee3822edc1f0000000000000000000000000000

   getreclock
       Show details of the recovery lock, if any.

       Example output:

                /clusterfs/.ctdb/recovery.lock

   getdebug
       Get the current debug level for the node. The debug level controls what
       information is written to the log file.

       The debug levels are mapped to the corresponding syslog levels. When a
       debug level is set, only those messages at that level and higher levels
       will be printed.

       The debug levels, from highest to lowest, are:

       ERROR WARNING NOTICE INFO DEBUG
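
       The filtering rule can be illustrated with a short sketch. It
       encodes only the ordering listed above; is_logged is a
       hypothetical helper and does not reflect how the daemon is
       implemented.

```python
# Debug levels from highest to lowest, in the order listed above.
DEBUG_LEVELS = ["ERROR", "WARNING", "NOTICE", "INFO", "DEBUG"]


def is_logged(message_level, configured_level):
    """Return True if a message at message_level would be printed.

    A message is printed when its level is the configured level or a
    higher one, i.e. when it appears at or before the configured level
    in DEBUG_LEVELS.
    """
    return DEBUG_LEVELS.index(message_level) <= DEBUG_LEVELS.index(configured_level)


print(is_logged("ERROR", "NOTICE"))  # True
print(is_logged("INFO", "NOTICE"))   # False
```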
688
689   setdebug DEBUGLEVEL
690       Set the debug level of a node. This controls what information will be
691       logged.
692
693       The debuglevel is one of ERROR WARNING NOTICE INFO DEBUG
694
695   getpid
696       This command will return the process id of the ctdb daemon.
697
698   disable
699       This command is used to administratively disable a node in the cluster.
700       A disabled node will still participate in the cluster and host
701       clustered TDB records but its public ip address has been taken over by
702       a different node and it no longer hosts any services.
703
704   enable
705       Re-enable a node that has been administratively disabled.
706
707   stop
708       This command is used to administratively STOP a node in the cluster. A
709       STOPPED node is connected to the cluster but will not host any public
710       ip addresse, nor does it participate in the VNNMAP. The difference
711       between a DISABLED node and a STOPPED node is that a STOPPED node does
712       not host any parts of the database which means that a recovery is
713       required to stop/continue nodes.
714
715   continue
716       Re-start a node that has been administratively stopped.
717
718   addip IPADDR/mask IFACE
719       This command is used to add a new public ip to a node during runtime.
720       It should be followed by a ctdb ipreallocate. This allows public
721       addresses to be added to a cluster without having to restart the ctdb
722       daemons.
723
724       Note that this only updates the runtime instance of ctdb. Any changes
725       will be lost next time ctdb is restarted and the public addresses file
726       is re-read. If you want this change to be permanent you must also
727       update the public addresses file manually.
728
729   delip IPADDR
730       This command flags IPADDR for deletion from a node at runtime. It
731       should be followed by a ctdb ipreallocate. If IPADDR is currently
732       hosted by the node it is being removed from, this ensures that the IP
733       will first be failed over to another node, if possible, and that it is
734       then actually removed.
735
736       Note that this only updates the runtime instance of CTDB. Any changes
737       will be lost next time CTDB is restarted and the public addresses file
738       is re-read. If you want this change to be permanent you must also
739       update the public addresses file manually.
740
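The addip and delip entries above both call for a follow-up ctdb ipreallocate. A minimal shell sketch of that pairing follows. The function names and the CTDB override variable are illustrative and not part of ctdb, and the address and interface in the usage comment are placeholders.

```shell
#!/bin/sh
# Sketch: change a public address at runtime, then rebalance.
# CTDB can be overridden so the sequence can be previewed without a cluster.
CTDB=${CTDB:-ctdb}

# add_public_ip IPADDR/MASK IFACE  (hypothetical helper)
add_public_ip() {
    $CTDB addip "$1" "$2" && $CTDB ipreallocate
}

# del_public_ip IPADDR  (hypothetical helper)
del_public_ip() {
    $CTDB delip "$1" && $CTDB ipreallocate
}

# Preview the commands without contacting a daemon:
#   CTDB="echo ctdb" add_public_ip 10.0.0.50/24 eth0
```

Neither helper touches the public addresses file, so the change is lost at the next restart unless that file is also edited by hand.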
741   moveip IPADDR PNN
742       This command can be used to manually fail a public ip address to a
743       specific node.
744
745       In order to manually override the "automatic" distribution of public ip
746       addresses that ctdb normally provides, this command only works when you
747       have changed the tunables for the daemon to:
748
749       IPAllocAlgorithm != 0
750
751       NoIPFailback = 1
752
753   shutdown
754       This command will shut down a specific CTDB daemon.
755
756   setlmasterrole on|off
757       This command is used to enable/disable the LMASTER capability for a
758       node at runtime. This capability determines whether or not a node can
759       be used as an LMASTER for records in the database. A node that does not
760       have the LMASTER capability will not show up in the vnnmap.
761
762       Nodes will by default have this capability, but it can be stripped off
763       nodes by the setting in the sysconfig file or by using this command.
764
765       Once this setting has been enabled/disabled, you need to perform a
766       recovery for it to take effect.
767
768       See also "ctdb getcapabilities"
769
770   setrecmasterrole on|off
771       This command is used to enable/disable the RECMASTER capability for a
772       node at runtime. This capability determines whether or not a node can
773       be used as a RECMASTER for the cluster. A node that does not have the
774       RECMASTER capability can not win a recmaster election. A node that
775       already is the recmaster for the cluster when the capability is
776       stripped off the node will remain the recmaster until the next cluster
777       election.
778
779       Nodes will by default have this capability, but it can be stripped off
780       nodes by the setting in the sysconfig file or by using this command.
781
782       See also "ctdb getcapabilities"
783
784   reloadnodes
785       This command is used when adding new nodes, or removing existing nodes
786       from an existing cluster.
787
788       Procedure to add nodes:
789
790        1. To expand an existing cluster, first ensure with ctdb status that
791           all nodes are up and running and that they are all healthy. Do not
792           try to expand a cluster unless it is completely healthy!
793
794        2. On all nodes, edit /etc/ctdb/nodes and add the new nodes at the end
795           of this file.
796
797        3. Verify that all the nodes have identical /etc/ctdb/nodes files
798           after adding the new nodes.
799
800        4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
801
802        5. Use ctdb status on all nodes and verify that they now show the
803           additional nodes.
804
805        6. Install and configure the new node and bring it online.
806
807       Procedure to remove nodes:
808
809        1. To remove nodes from an existing cluster, first ensure with ctdb
810           status that all nodes, except the node to be deleted, are up and
811           running and that they are all healthy. Do not try to remove nodes
812           from a cluster unless the cluster is completely healthy!
813
814        2. Shut down and power off the node to be removed.
815
816        3. On all other nodes, edit the /etc/ctdb/nodes file and comment out
817           the nodes to be removed.  Do not delete the lines for the deleted
818           nodes, just comment them out by adding a '#' at the beginning of
819           the lines.
820
821        4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
822
823        5. Use ctdb status on all nodes and verify that the deleted nodes are
824           no longer listed.
825
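Step 3 of the removal procedure (comment the node out, do not delete its line) can be sketched as a small shell helper. This is an illustration only: comment_out_node is a hypothetical name, and it assumes each line of the nodes file holds exactly one address.

```shell
#!/bin/sh
# comment_out_node NODES_FILE ADDR  (hypothetical helper)
# Prefix the line that consists exactly of ADDR with '#', keeping the line
# in place, as the removal procedure above requires.
comment_out_node() {
    nodes_file=$1
    addr=$2
    # Escape dots so the address is matched literally by sed.
    pattern=$(printf '%s\n' "$addr" | sed 's/\./\\./g')
    tmp=$(mktemp) || return 1
    if sed "s/^${pattern}\$/#&/" "$nodes_file" > "$tmp"; then
        # Overwrite in place so ownership and permissions are preserved.
        cat "$tmp" > "$nodes_file"
    fi
    rm -f "$tmp"
}

# Example (run on every remaining node, then run ctdb reloadnodes):
#   comment_out_node /etc/ctdb/nodes 10.0.0.2
```

The match is anchored at both ends of the line, so commenting out 10.0.0.2 leaves an entry such as 10.0.0.20 untouched.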
826   reloadips [PNN-LIST]
827       This command reloads the public addresses configuration file on the
828       specified nodes. When it completes, addresses will be reconfigured and
829       reassigned across the cluster as necessary.
830
831       This command is currently unable to make changes to the netmask or
832       interfaces associated with existing addresses. Such changes must be
833       made in 2 steps by deleting the addresses in question and re-adding them.
834       Unfortunately this will disrupt connections to the changed addresses.
835
836   getdbmap
837       This command lists all clustered TDB databases that the CTDB daemon has
838       attached to. Some databases are flagged as PERSISTENT, which means that
839       the database stores data persistently and the data will remain across
840       reboots. One example of such a database is secrets.tdb where
841       information about how the cluster was joined to the domain is stored.
842       Some databases are flagged as REPLICATED, which means that the data in
843       that database is replicated across all the nodes but will not
844       remain across reboots. This type of database is used by CTDB to store
845       its internal state.
846
847       If a PERSISTENT database is not in a healthy state the database is
848       flagged as UNHEALTHY. If there's at least one completely healthy node
849       running in the cluster, it's possible that the content is restored by a
850       recovery run automatically. Otherwise an administrator needs to analyze
851       the problem.
852
853       See also "ctdb getdbstatus", "ctdb backupdb", "ctdb restoredb", "ctdb
854       dumpdbbackup", "ctdb wipedb", "ctdb setvar AllowUnhealthyDBRead 1" and
855       (if samba or tdb-utils are installed) "tdbtool check".
856
857       Most databases are not persistent and only store the state information
858       that the currently running samba daemons need. These databases are
859       always wiped when ctdb/samba starts and when a node is rebooted.
860
861       Example
862               # ctdb getdbmap
863               Number of databases:10
864               dbid:0x435d3410 name:notify.tdb path:/var/lib/ctdb/notify.tdb.0
865               dbid:0x42fe72c5 name:locking.tdb path:/var/lib/ctdb/locking.tdb.0
866               dbid:0x1421fb78 name:brlock.tdb path:/var/lib/ctdb/brlock.tdb.0
867               dbid:0x17055d90 name:connections.tdb path:/var/lib/ctdb/connections.tdb.0
868               dbid:0xc0bdde6a name:sessionid.tdb path:/var/lib/ctdb/sessionid.tdb.0
869               dbid:0x122224da name:test.tdb path:/var/lib/ctdb/test.tdb.0
870               dbid:0x2672a57f name:idmap2.tdb path:/var/lib/ctdb/persistent/idmap2.tdb.0 PERSISTENT
871               dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT
872               dbid:0xe98e08b6 name:group_mapping.tdb path:/var/lib/ctdb/persistent/group_mapping.tdb.0 PERSISTENT
873               dbid:0x7bbbd26c name:passdb.tdb path:/var/lib/ctdb/persistent/passdb.tdb.0 PERSISTENT
874
875               # ctdb getdbmap  # example for unhealthy database
876               Number of databases:1
877               dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT UNHEALTHY
878
879               # ctdb -X getdbmap
880               |ID|Name|Path|Persistent|Unhealthy|
881               |0x7bbbd26c|passdb.tdb|/var/lib/ctdb/persistent/passdb.tdb.0|1|0|
882
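The machine readable form shown above lends itself to scripting. The following sketch filters it for unhealthy databases. The helper name unhealthy_dbs is hypothetical, and the column names are taken from the example header; other CTDB versions may print additional columns, which is why the sketch looks columns up by name instead of by position.

```shell
#!/bin/sh
# unhealthy_dbs: read `ctdb -X getdbmap` output on stdin and print the name
# of every database whose Unhealthy column is 1. Columns are located by
# their header names rather than by fixed position.
unhealthy_dbs() {
    awk -F'|' '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "Name") name = i
                if ($i == "Unhealthy") bad = i
            }
            next
        }
        name && bad && $bad == "1" { print $name }
    '
}

# Usage:
#   ctdb -X getdbmap | unhealthy_dbs
```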
883
884   backupdb DB FILE
885       Copy the contents of database DB to FILE. FILE can later be read back
886       using restoredb. This is mainly useful for backing up persistent
887       databases such as secrets.tdb and similar.
888
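Because backupdb is mainly useful for persistent databases, one possible approach is to derive the backup commands from the getdbmap listing. The sketch below only prints the commands so they can be reviewed first. The helper name backup_commands, the target directory and the .bak suffix are all assumptions made for illustration.

```shell
#!/bin/sh
# backup_commands DIR: read `ctdb -X getdbmap` output on stdin and print one
# `ctdb backupdb` command per database whose Persistent column is 1.
backup_commands() {
    awk -F'|' -v dir="$1" '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "Name") name = i
                if ($i == "Persistent") pers = i
            }
            next
        }
        name && pers && $pers == "1" {
            printf "ctdb backupdb %s %s/%s.bak\n", $name, dir, $name
        }
    '
}

# Usage (review the output, then pipe it to sh to run it):
#   ctdb -X getdbmap | backup_commands /root/ctdb-backup
```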
889   restoredb FILE [DB]
890       This command restores a persistent database that was previously backed
891       up using backupdb. By default the data will be restored back into the
892       same database as it was created from. By specifying DB you can
893       restore the data into a different database.
894
895   setdbreadonly DB
896       This command will enable the read-only record support for a database.
897       This is an experimental feature to improve performance for contended
898       records primarily in locking.tdb and brlock.tdb. When enabling this
899       feature you must set it on all nodes in the cluster.
900
901   setdbsticky DB
902       This command will enable the sticky record support for the specified
903       database. This is an experimental feature to improve performance for
904       contended records primarily in locking.tdb and brlock.tdb. When
905       enabling this feature you must set it on all nodes in the cluster.
906

INTERNAL COMMANDS

908       Internal commands are used by CTDB's scripts and are not required for
909       managing a CTDB cluster. Their parameters and behaviour are subject to
910       change.
911
912   gettickles IPADDR
913       Show TCP connections that are registered with CTDB to be "tickled" if
914       there is a failover.
915
916   gratarp IPADDR INTERFACE
917       Send out a gratuitous ARP for the specified IP address through the
918       specified interface. This command is mainly used by the ctdb
919       eventscripts.
920
921   pdelete DB KEY
922       Delete KEY from DB.
923
924   pfetch DB KEY
925       Print the value associated with KEY in DB.
926
927   pstore DB KEY FILE
928       Store KEY in DB with contents of FILE as the associated value.
929
930   ptrans DB [FILE]
931       Read a list of key-value pairs, one per line from FILE, and store them
932       in DB using a single transaction. An empty value is equivalent to
933       deleting the given key.
934
935       The key and value should be separated by spaces or tabs. Each key/value
936       should be a printable string enclosed in double-quotes.
937
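The ptrans input format described above (a quoted key, whitespace, a quoted value) can be produced with a small helper. This is a sketch under stated assumptions: ptrans_line is a hypothetical name, and because the text above does not say how embedded double quotes are handled, the helper refuses them instead of guessing at an escaping rule.

```shell
#!/bin/sh
# ptrans_line KEY [VALUE]: print one input line for `ctdb ptrans`.
# An omitted or empty VALUE produces "", which the text above describes as
# deleting the key. Input containing double quotes is rejected because this
# sketch does not attempt any escaping.
ptrans_line() {
    case "$1${2:-}" in
        *'"'*)
            echo "ptrans_line: double quotes are not supported" >&2
            return 1
            ;;
    esac
    printf '"%s" "%s"\n' "$1" "${2:-}"
}

# Usage (database and file names are placeholders):
#   { ptrans_line colour blue; ptrans_line oldkey; } > /tmp/changes.txt
#   ctdb ptrans mydb.tdb /tmp/changes.txt
```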
938   runstate [setup|first_recovery|startup|running]
939       Print the runstate of the specified node. Runstates are used to
940       serialise important state transitions in CTDB, particularly during
941       startup.
942
943       If one or more optional runstate arguments are specified then the node
944       must be in one of these runstates for the command to succeed.
945
946       Example
947               # ctdb runstate
948               RUNNING
949
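Since runstate succeeds only when the node is in one of the listed runstates, its exit status can drive a wait loop. A minimal sketch follows. The function name, the CTDB override variable and the one second polling interval are illustrative choices, not part of ctdb.

```shell
#!/bin/sh
# CTDB can be overridden, for example with a stub, when no daemon is present.
CTDB=${CTDB:-ctdb}

# wait_for_running TRIES: poll up to TRIES times, one second apart, until
# `ctdb runstate running` succeeds. Returns 0 on success, 1 otherwise.
wait_for_running() {
    tries=$1
    while [ "$tries" -gt 0 ]; do
        if $CTDB runstate running >/dev/null 2>&1; then
            return 0
        fi
        tries=$((tries - 1))
        if [ "$tries" -gt 0 ]; then
            sleep 1
        fi
    done
    return 1
}

# Usage:
#   wait_for_running 30 || echo "node did not reach RUNNING" >&2
```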
950
951   setifacelink IFACE up|down
952       Set the internal state of network interface IFACE. This is typically
953       used in the 10.interface script in the "monitor" event.
954
955       Example: ctdb setifacelink eth0 up
956
957   tickle
958       Read a list of TCP connections, one per line, from standard input and
959       send a TCP tickle to the source host for each connection. A connection
960       is specified as:
961
962                SRC-IPADDR:SRC-PORT DST-IPADDR:DST-PORT
963
964
965       A single connection can be specified on the command-line rather than on
966       standard input.
967
968       A TCP tickle is a TCP ACK packet with an invalid sequence and
969       acknowledgement number. When the source host receives it, the host
970       immediately sends a correct ACK back to the other end.
971
972       TCP tickles are useful to "tickle" clients after an IP failover has
973       occurred, since this makes the client immediately recognize that the
974       TCP connection has been disrupted and that it needs to reestablish
975       it. This greatly reduces the time it takes for a client to detect
976       the disruption and reconnect after an IP failover in the ctdb cluster.
977
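The connection format that tickle reads can be generated with a one-line helper. This is only a sketch: tickle_line is a hypothetical name, the addresses and ports in the usage comment are placeholders, and only IPv4 style addresses are considered.

```shell
#!/bin/sh
# tickle_line SRC_IP SRC_PORT DST_IP DST_PORT  (hypothetical helper)
# Print one connection in the "SRC-IPADDR:SRC-PORT DST-IPADDR:DST-PORT" form.
tickle_line() {
    printf '%s:%s %s:%s\n' "$1" "$2" "$3" "$4"
}

# Usage:
#   tickle_line 10.0.0.9 51234 10.0.0.50 445 | ctdb tickle
```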
978   version
979       Display the CTDB version.
980

DEBUGGING COMMANDS

982       These commands are primarily used for CTDB development and testing and
983       should not be used for normal administration.
984
985   OPTIONS
986       --print-emptyrecords
987           This enables printing of empty records when dumping databases with
988           the catdb, cattdb and dumpdbbackup commands. Records with empty
989           data segment are considered deleted by ctdb and cleaned by the
990           vacuuming mechanism, so this switch can come in handy for debugging
991           the vacuuming behaviour.
992
993       --print-datasize
994           This lets database dumps (catdb, cattdb, dumpdbbackup) print the
995           size of the record data instead of dumping the data contents.
996
997       --print-lmaster
998           This lets catdb print the lmaster for each record.
999
1000       --print-hash
1001           This lets database dumps (catdb, cattdb, dumpdbbackup) print the
1002           hash for each record.
1003
1004       --print-recordflags
1005           This lets catdb and dumpdbbackup print the record flags for each
1006           record. Note that cattdb always prints the flags.
1007
1008   process-exists PID [SRVID]
1009       This command checks if a specific process exists on the CTDB host. This
1010       is mainly used by Samba to check if remote instances of samba are still
1011       running or not. When the optional SRVID argument is specified, the
1012       command checks if a specific process exists on the CTDB host and has
1013       registered for the specified SRVID.
1014
1015   getdbstatus DB
1016       This command displays more details about a database.
1017
1018       Example
1019               # ctdb getdbstatus test.tdb.0
1020               dbid: 0x122224da
1021               name: test.tdb
1022               path: /var/lib/ctdb/test.tdb.0
1023               PERSISTENT: no
1024               HEALTH: OK
1025
1026               # ctdb getdbstatus registry.tdb  # with a corrupted TDB
1027               dbid: 0xf2a58948
1028               name: registry.tdb
1029               path: /var/lib/ctdb/persistent/registry.tdb.0
1030               PERSISTENT: yes
1031               HEALTH: NO-HEALTHY-NODES - ERROR - Backup of corrupted TDB in '/var/lib/ctdb/persistent/registry.tdb.0.corrupted.20091208091949.0Z'
1032
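The HEALTH line in the examples above is easy to extract in a script. The following sketch assumes the output keeps the "HEALTH: " prefix shown in those examples; db_health is a hypothetical helper name.

```shell
#!/bin/sh
# db_health: read `ctdb getdbstatus DB` output on stdin and print the text
# that follows "HEALTH: ".
db_health() {
    sed -n 's/^HEALTH: //p'
}

# Usage:
#   [ "$(ctdb getdbstatus test.tdb | db_health)" = "OK" ] || echo "check test.tdb" >&2
```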
1033
1034   catdb DB
1035       Print a dump of the clustered TDB database DB.
1036
1037   cattdb DB
1038       Print a dump of the contents of the local TDB database DB.
1039
1040   dumpdbbackup FILE
1041       Print a dump of the contents from database backup FILE, similar to
1042       catdb.
1043
1044   wipedb DB
1045       Remove all contents of database DB.
1046
1047   recover
1048       This command will trigger the recovery daemon to do a cluster recovery.
1049
1050   ipreallocate, sync
1051       This command will force the recovery master to perform a full ip
1052       reallocation process and redistribute all ip addresses. This is useful
1053       to "reset" the allocations back to their default state if they have been
1054       changed using the "moveip" command. While a "recover" will also perform
1055       this reallocation, a recovery is much more heavyweight since it will
1056       also rebuild all the databases.
1057
1058   attach DBNAME [persistent|replicated]
1059       Create a new CTDB database called DBNAME and attach to it on all nodes.
1060
1061   detach DB-LIST
1062       Detach specified non-persistent database(s) from the cluster. This
1063       command will disconnect specified database(s) on all nodes in the
1064       cluster. This command should only be used when none of the specified
1065       database(s) are in use.
1066
1067       All nodes should be active and tunable AllowClientDBAccess should be
1068       disabled on all nodes before detaching databases.
1069
1070   dumpmemory
1071       This is a debugging command. This command will make the ctdb daemon
1072       write a full memory allocation map to standard output.
1073
1074   rddumpmemory
1075       This is a debugging command. This command will dump the talloc memory
1076       allocation tree for the recovery daemon to standard output.
1077
1078   ban BANTIME
1079       Administratively ban a node for BANTIME seconds. The node will be
1080       unbanned after BANTIME seconds have elapsed.
1081
1082       A banned node does not participate in the cluster. It does not host any
1083       records for the clustered TDB and does not host any public IP
1084       addresses.
1085
1086       Nodes are automatically banned if they misbehave. For example, a node
1087       may be banned if it causes too many cluster recoveries.
1088
1089       To administratively exclude a node from a cluster use the stop command.
1090
1091   unban
1092       This command is used to unban a node that has either been
1093       administratively banned using the ban command or has been automatically
1094       banned.
1095

AUTHOR

1101       This documentation was written by Ronnie Sahlberg, Amitay Isaacs,
1102       Martin Schwenke
1103

COPYRIGHT

1105       Copyright © 2007 Andrew Tridgell, Ronnie Sahlberg
1106
1107       This program is free software; you can redistribute it and/or modify it
1108       under the terms of the GNU General Public License as published by the
1109       Free Software Foundation; either version 3 of the License, or (at your
1110       option) any later version.
1111
1112       This program is distributed in the hope that it will be useful, but
1113       WITHOUT ANY WARRANTY; without even the implied warranty of
1114       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1115       General Public License for more details.
1116
1117       You should have received a copy of the GNU General Public License along
1118       with this program; if not, see http://www.gnu.org/licenses.
1119
1120
1121
1122
1123ctdb                              10/30/2018                           CTDB(1)