1 CTDB(1) CTDB - clustered TDB database CTDB(1)
2
3
4
5 NAME
6 ctdb - CTDB management utility
7
8 SYNOPSIS
9 ctdb [OPTION...] {COMMAND} [COMMAND-ARGS]
10
11 DESCRIPTION
12 ctdb is a utility to view and manage a CTDB cluster.
13
14 The following terms are used when referring to nodes in a cluster:
15
16 PNN
17 Physical Node Number. The physical node number is an integer that
18 describes the node in the cluster. The first node has physical node
19 number 0.
20
21 PNN-LIST
22 This is either a single PNN, a comma-separated list of PNNs or
23 "all".
24
25 Commands that reference a database use the following terms:
26
27 DB
28 This is either a database name, such as locking.tdb or a database
29 ID such as "0x42fe72c5".
30
31 DB-LIST
32 A space-separated list of at least one DB.
33
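As an illustration of these terms, a PNN-LIST and a DB might appear on the command line as follows (the node numbers and database name are examples only):

      # ctdb nodestatus 0,2
      # ctdb nodestatus all
      # ctdb dbstatistics locking.tdb
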
34 OPTIONS
35 -n PNN
36 The node specified by PNN should be queried for the requested
37 information. Default is to query the daemon running on the local
38 host.
39
40 -Y
41 Produce output in machine readable form for easier parsing by
42 scripts. This uses a field delimiter of ':'. Not all commands
43 support this option.
44
45 -x SEPARATOR
46 Use SEPARATOR to delimit fields in machine readable output. This
47 implies -Y.
48
49 -X
50 Produce output in machine readable form for easier parsing by
51 scripts. This uses a field delimiter of '|'. Not all commands
52 support this option.
53
54 This is equivalent to "-x|" and avoids some shell quoting issues.
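
For example, a script might split the '|'-delimited output of "ctdb -X ifaces" (shown later in this page) with awk, skipping the header line; this is only a sketch:

      # ctdb -X ifaces | awk -F'|' 'NR > 1 { print $2, $3 }'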
55
56 -t TIMEOUT
57 Indicates that ctdb should wait up to TIMEOUT seconds for a
58 response to most commands sent to the CTDB daemon. The default is
59 10 seconds.
60
61 -T TIMELIMIT
62 Indicates that TIMELIMIT is the maximum run time (in seconds) for
63 the ctdb command. When TIMELIMIT is exceeded the ctdb command will
64 terminate with an error. The default is 120 seconds.
65
66 -? --help
67 Print some help text to the screen.
68
69 --usage
70 Print usage information to the screen.
71
72 -d --debug=DEBUGLEVEL
73 Change the debug level for the command. Default is NOTICE.
74
75 ADMINISTRATIVE COMMANDS
76 These are commands used to monitor and administer a CTDB cluster.
77
78 pnn
79 This command displays the PNN of the current node.
80
81 status
82 This command shows the current status of all CTDB nodes based on
83 information from the queried node.
84
85 Note: If the queried node is INACTIVE then the status might not be
86 current.
87
88 Node status
89 This includes the number of physical nodes and the status of each
90 node. See ctdb(7) for information about node states.
91
92 Generation
93 The generation id is a number that indicates the current generation
94 of a cluster instance. Each time a cluster goes through a
95 reconfiguration or a recovery its generation id will be changed.
96
97 This number does not have any particular meaning other than to keep
98 track of when a cluster has gone through a recovery. It is a random
99 number that represents the current instance of a ctdb cluster and
100 its databases. The CTDB daemon uses this number internally to be
101 able to tell when commands to operate on the cluster and the
102 databases were issued in a different generation of the cluster, to
103 ensure that commands that operate on the databases will not survive
104 across a cluster database recovery. After a recovery, all old
105 outstanding commands will automatically become invalid.
106
107 Sometimes this number will be shown as "INVALID". This only means
108 that the ctdbd daemon has started but it has not yet merged with
109 the cluster through a recovery. All nodes start with generation
110 "INVALID" and are not assigned a real generation id until they have
111 successfully been merged with a cluster through a recovery.
112
113 Virtual Node Number (VNN) map
114 Consists of the number of virtual nodes and mapping from virtual
115 node numbers to physical node numbers. Only nodes that are
116 participating in the VNN map can become lmaster for database
117 records.
118
119 Recovery mode
120 This is the current recovery mode of the cluster. There are two
121 possible modes:
122
123 NORMAL - The cluster is fully operational.
124
125 RECOVERY - The cluster databases have all been frozen, pausing all
126 services while the cluster waits for a recovery process to complete. A
127 recovery process should finish within seconds. If a cluster is
128 stuck in the RECOVERY state, this indicates a cluster
129 malfunction that needs to be investigated.
130
131 Once the recovery master detects an inconsistency, for example when a
132 node becomes disconnected or connected, the recovery daemon will
133 trigger a cluster recovery process, where all databases are
134 remerged across the cluster. When this process starts, the recovery
135 master will first "freeze" all databases to prevent applications
136 such as samba from accessing the databases and it will also mark
137 the recovery mode as RECOVERY.
138
139 When the CTDB daemon starts up, it will start in RECOVERY mode.
140 Once the node has been merged into a cluster and all databases have
141 been recovered, the recovery mode will change back to NORMAL and the
142 databases will be "thawed", allowing samba to access the databases
143 again.
144
145 Recovery master
146 This is the cluster node that is currently designated as the
147 recovery master. This node is responsible for monitoring the
148 consistency of the cluster and for performing the actual recovery
149 process when required.
150
151 Only one node at a time can be the designated recovery master.
152 Which node is designated the recovery master is decided by an
153 election process in the recovery daemons running on each node.
154
155 Example
156 # ctdb status
157 Number of nodes:4
158 pnn:0 192.168.2.200 OK (THIS NODE)
159 pnn:1 192.168.2.201 OK
160 pnn:2 192.168.2.202 OK
161 pnn:3 192.168.2.203 OK
162 Generation:1362079228
163 Size:4
164 hash:0 lmaster:0
165 hash:1 lmaster:1
166 hash:2 lmaster:2
167 hash:3 lmaster:3
168 Recovery mode:NORMAL (0)
169 Recovery master:0
170
171
172 nodestatus [PNN-LIST]
173 This command is similar to the status command. It displays the "node
174 status" subset of output. The main differences are:
175
176 • The exit code is the bitwise-OR of the flags for each specified
177 node, while ctdb status exits with 0 if it was able to retrieve
178 status for all nodes.
179
180 • ctdb status provides status information for all nodes. ctdb
181 nodestatus defaults to providing status for only the current node.
182 If PNN-LIST is provided then status is given for the indicated
183 node(s).
184
185 A common invocation in scripts is ctdb nodestatus all to check whether
186 all nodes in a cluster are healthy.
187
188 Example
189 # ctdb nodestatus
190 pnn:0 10.0.0.30 OK (THIS NODE)
191
192 # ctdb nodestatus all
193 Number of nodes:2
194 pnn:0 10.0.0.30 OK (THIS NODE)
195 pnn:1 10.0.0.31 OK
196
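A minimal health check in a script, relying on the exit code described above, might look like this sketch (the error handling is illustrative):

      # ctdb nodestatus all || echo "cluster is not healthy"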
197
198 recmaster
199 This command shows the PNN of the node which is currently the
200 recmaster.
201
202 Note: If the queried node is INACTIVE then the status might not be
203 current.
204
205 uptime
206 This command shows the uptime of the ctdb daemon, when the last
207 recovery or IP failover completed and how long it took. If the
208 "duration" is shown as a negative number, this indicates that a
209 recovery or failover is in progress and that it started that many seconds ago.
210
211 Example
212 # ctdb uptime
213 Current time of node : Thu Oct 29 10:38:54 2009
214 Ctdbd start time : (000 16:54:28) Wed Oct 28 17:44:26 2009
215 Time of last recovery/failover: (000 16:53:31) Wed Oct 28 17:45:23 2009
216 Duration of last recovery/failover: 2.248552 seconds
217
218
219 listnodes
220 This command lists the IP addresses of all the nodes in the
221 cluster.
222
223 Example
224 # ctdb listnodes
225 192.168.2.200
226 192.168.2.201
227 192.168.2.202
228 192.168.2.203
229
230
231 natgw {leader|list|status}
232 This command shows different aspects of NAT gateway status. For an
233 overview of CTDB's NAT gateway functionality please see the NAT GATEWAY
234 section in ctdb(7).
235
236 leader
237 Show the PNN and private IP address of the current NAT gateway
238 leader node.
239
240 Example output:
241
242 1 192.168.2.201
243
244
245 list
246 List the private IP addresses of nodes in the current NAT gateway
247 group, annotating the leader node.
248
249 Example output:
250
251 192.168.2.200
252 192.168.2.201 LEADER
253 192.168.2.202
254 192.168.2.203
255
256
257 status
258 List the nodes in the current NAT gateway group and their status.
259
260 Example output:
261
262 pnn:0 192.168.2.200 UNHEALTHY (THIS NODE)
263 pnn:1 192.168.2.201 OK
264 pnn:2 192.168.2.202 OK
265 pnn:3 192.168.2.203 OK
266
267
268 ping
269 This command will "ping" specified CTDB nodes in the cluster to verify
270 that they are running.
271
272 Example
273 # ctdb ping
274 response from 0 time=0.000054 sec (3 clients)
275
276
277 ifaces
278 This command will display the list of network interfaces that can
279 host public addresses, along with their status.
280
281 Example
282 # ctdb ifaces
283 Interfaces on node 0
284 name:eth5 link:up references:2
285 name:eth4 link:down references:0
286 name:eth3 link:up references:1
287 name:eth2 link:up references:1
288
289 # ctdb -X ifaces
290 |Name|LinkStatus|References|
291 |eth5|1|2|
292 |eth4|0|0|
293 |eth3|1|1|
294 |eth2|1|1|
295
296
297 ip
298 This command will display the list of public addresses that are
299 provided by the cluster and which physical node is currently serving
300 each IP. By default this command will ONLY show those public addresses
301 that are known to the node itself. To see the full list of all public
302 IPs across the cluster you must use "ctdb ip all".
303
304 Example
305 # ctdb ip -v
306 Public IPs on node 0
307 172.31.91.82 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
308 172.31.91.83 node[0] active[eth3] available[eth2,eth3] configured[eth2,eth3]
309 172.31.91.84 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
310 172.31.91.85 node[0] active[eth2] available[eth2,eth3] configured[eth2,eth3]
311 172.31.92.82 node[1] active[] available[eth5] configured[eth4,eth5]
312 172.31.92.83 node[0] active[eth5] available[eth5] configured[eth4,eth5]
313 172.31.92.84 node[1] active[] available[eth5] configured[eth4,eth5]
314 172.31.92.85 node[0] active[eth5] available[eth5] configured[eth4,eth5]
315
316 # ctdb -X ip -v
317 |Public IP|Node|ActiveInterface|AvailableInterfaces|ConfiguredInterfaces|
318 |172.31.91.82|1||eth2,eth3|eth2,eth3|
319 |172.31.91.83|0|eth3|eth2,eth3|eth2,eth3|
320 |172.31.91.84|1||eth2,eth3|eth2,eth3|
321 |172.31.91.85|0|eth2|eth2,eth3|eth2,eth3|
322 |172.31.92.82|1||eth5|eth4,eth5|
323 |172.31.92.83|0|eth5|eth5|eth4,eth5|
324 |172.31.92.84|1||eth5|eth4,eth5|
325 |172.31.92.85|0|eth5|eth5|eth4,eth5|
326
327
328 ipinfo IP
329 This command will display details about the specified public address.
330
331 Example
332 # ctdb ipinfo 172.31.92.85
333 Public IP[172.31.92.85] info on node 0
334 IP:172.31.92.85
335 CurrentNode:0
336 NumInterfaces:2
337 Interface[1]: Name:eth4 Link:down References:0
338 Interface[2]: Name:eth5 Link:up References:2 (active)
339
340
341 event run|status|script list|script enable|script disable
342 This command is used to control the event daemon and to inspect the
343 status of various events.
344
345 The commands below require a component to be specified. In the current
346 version the only valid component is legacy.
347
348 run TIMEOUT COMPONENT EVENT [ARGUMENTS]
349 This command can be used to manually run the specified EVENT in
350 COMPONENT with optional ARGUMENTS. The event will be allowed to run
351 a maximum of TIMEOUT seconds. If TIMEOUT is 0, then there is no
352 time limit for running the event.
353
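For example, to run the monitor event by hand with a 30 second limit (the timeout value is only illustrative):

      # ctdb event run 30 legacy monitor
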
354 status COMPONENT EVENT
355 This command displays the last execution status of the specified
356 EVENT in COMPONENT.
357
358 The command will terminate with the exit status corresponding to
359 the overall status of the event that is displayed.
360
361 The output is the list of event scripts executed. Each line shows
362 the name, status, duration and start time for each script.
363
364 Example
365
366 # ctdb event status legacy monitor
367 00.ctdb OK 0.014 Sat Dec 17 19:39:11 2016
368 01.reclock OK 0.013 Sat Dec 17 19:39:11 2016
369 05.system OK 0.029 Sat Dec 17 19:39:11 2016
370 06.nfs OK 0.014 Sat Dec 17 19:39:11 2016
371 10.interface OK 0.037 Sat Dec 17 19:39:11 2016
372 11.natgw OK 0.011 Sat Dec 17 19:39:11 2016
373 11.routing OK 0.007 Sat Dec 17 19:39:11 2016
374 13.per_ip_routing OK 0.007 Sat Dec 17 19:39:11 2016
375 20.multipathd OK 0.007 Sat Dec 17 19:39:11 2016
376 31.clamd OK 0.007 Sat Dec 17 19:39:11 2016
377 40.vsftpd OK 0.013 Sat Dec 17 19:39:11 2016
378 41.httpd OK 0.018 Sat Dec 17 19:39:11 2016
379 49.winbind OK 0.023 Sat Dec 17 19:39:11 2016
380 50.samba OK 0.100 Sat Dec 17 19:39:12 2016
381 60.nfs OK 0.376 Sat Dec 17 19:39:12 2016
382 70.iscsi OK 0.009 Sat Dec 17 19:39:12 2016
383 91.lvs OK 0.007 Sat Dec 17 19:39:12 2016
384
385
386 script list COMPONENT
387 List the available event scripts in COMPONENT. Enabled scripts are
388 flagged with a '*'.
389
390 Generally, event scripts are provided by CTDB. However, local or
391 3rd party event scripts may also be available. These are shown in a
392 separate section after those provided by CTDB.
393
394 Example
395
396 # ctdb event script list legacy
397 * 00.ctdb
398 * 01.reclock
399 * 05.system
400 * 06.nfs
401 * 10.interface
402 11.natgw
403 11.routing
404 13.per_ip_routing
405 20.multipathd
406 31.clamd
407 40.vsftpd
408 41.httpd
409 * 49.winbind
410 * 50.samba
411 * 60.nfs
412 70.iscsi
413 91.lvs
414
415 * 02.local
416
417
418 script enable COMPONENT SCRIPT
419 Enable the specified event SCRIPT in COMPONENT. Only enabled
420 scripts will be executed when running any event.
421
422 script disable COMPONENT SCRIPT
423 Disable the specified event SCRIPT in COMPONENT. This will prevent
424 the script from executing when running any event.
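
For example, to disable the iSCSI event script and later re-enable it (the script name is taken from the listing above):

      # ctdb event script disable legacy 70.iscsi
      # ctdb event script enable legacy 70.iscsi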
425
426 scriptstatus
427 This command displays which event scripts were run in the previous
428 monitoring cycle and the result of each script. If a script failed with
429 an error, causing the node to become unhealthy, the output from that
430 script is also shown.
431
432 This command is deprecated. It's provided for backward compatibility.
433 In place of ctdb scriptstatus, use ctdb event status.
434
435 Example
436 # ctdb scriptstatus
437 00.ctdb OK 0.011 Sat Dec 17 19:40:46 2016
438 01.reclock OK 0.010 Sat Dec 17 19:40:46 2016
439 05.system OK 0.030 Sat Dec 17 19:40:46 2016
440 06.nfs OK 0.014 Sat Dec 17 19:40:46 2016
441 10.interface OK 0.041 Sat Dec 17 19:40:46 2016
442 11.natgw OK 0.008 Sat Dec 17 19:40:46 2016
443 11.routing OK 0.007 Sat Dec 17 19:40:46 2016
444 13.per_ip_routing OK 0.007 Sat Dec 17 19:40:46 2016
445 20.multipathd OK 0.007 Sat Dec 17 19:40:46 2016
446 31.clamd OK 0.007 Sat Dec 17 19:40:46 2016
447 40.vsftpd OK 0.013 Sat Dec 17 19:40:46 2016
448 41.httpd OK 0.015 Sat Dec 17 19:40:46 2016
449 49.winbind OK 0.022 Sat Dec 17 19:40:46 2016
450 50.samba ERROR 0.077 Sat Dec 17 19:40:46 2016
451 OUTPUT: ERROR: samba tcp port 445 is not responding
452
453
454 listvars
455 List all tuneable variables, except the values of the obsolete tunables
456 like VacuumMinInterval. The obsolete tunables can be retrieved only
457 explicitly with the "ctdb getvar" command.
458
459 Example
460 # ctdb listvars
461 SeqnumInterval = 1000
462 ControlTimeout = 60
463 TraverseTimeout = 20
464 KeepaliveInterval = 5
465 KeepaliveLimit = 5
466 RecoverTimeout = 120
467 RecoverInterval = 1
468 ElectionTimeout = 3
469 TakeoverTimeout = 9
470 MonitorInterval = 15
471 TickleUpdateInterval = 20
472 EventScriptTimeout = 30
473 MonitorTimeoutCount = 20
474 RecoveryGracePeriod = 120
475 RecoveryBanPeriod = 300
476 DatabaseHashSize = 100001
477 DatabaseMaxDead = 5
478 RerecoveryTimeout = 10
479 EnableBans = 1
480 NoIPFailback = 0
481 VerboseMemoryNames = 0
482 RecdPingTimeout = 60
483 RecdFailCount = 10
484 LogLatencyMs = 0
485 RecLockLatencyMs = 1000
486 RecoveryDropAllIPs = 120
487 VacuumInterval = 10
488 VacuumMaxRunTime = 120
489 RepackLimit = 10000
490 VacuumFastPathCount = 60
491 MaxQueueDropMsg = 1000000
492 AllowUnhealthyDBRead = 0
493 StatHistoryInterval = 1
494 DeferredAttachTO = 120
495 AllowClientDBAttach = 1
496 RecoverPDBBySeqNum = 1
497 DeferredRebalanceOnNodeAdd = 300
498 FetchCollapse = 1
499 HopcountMakeSticky = 50
500 StickyDuration = 600
501 StickyPindown = 200
502 NoIPTakeover = 0
503 DBRecordCountWarn = 100000
504 DBRecordSizeWarn = 10000000
505 DBSizeWarn = 100000000
506 PullDBPreallocation = 10485760
507 LockProcessesPerDB = 200
508 RecBufferSizeLimit = 1000000
509 QueueBufferSize = 1024
510 IPAllocAlgorithm = 2
511
512
513 getvar NAME
514 Get the runtime value of a tuneable variable.
515
516 Example
517 # ctdb getvar MonitorInterval
518 MonitorInterval = 15
519
520
521 setvar NAME VALUE
522 Set the runtime value of a tuneable variable.
523
524 Example
525 # ctdb setvar MonitorInterval 20
526
527
528 lvs {leader|list|status}
529 This command shows different aspects of LVS status. For an overview of
530 CTDB's LVS functionality please see the LVS section in ctdb(7).
531
532 leader
533 Shows the PNN of the current LVS leader node.
534
535 Example output:
536
537 2
538
539
540 list
541 Lists the currently usable LVS nodes.
542
543 Example output:
544
545 2 10.0.0.13
546 3 10.0.0.14
547
548
549 status
550 List the nodes in the current LVS group and their status.
551
552 Example output:
553
554 pnn:0 10.0.0.11 UNHEALTHY (THIS NODE)
555 pnn:1 10.0.0.12 UNHEALTHY
556 pnn:2 10.0.0.13 OK
557 pnn:3 10.0.0.14 OK
558
559
560 getcapabilities
561 This command shows the capabilities of the current node. See the
562 CAPABILITIES section in ctdb(7) for more details.
563
564 Example output:
565
566 RECMASTER: YES
567 LMASTER: YES
568
569
570 statistics
571 Collect statistics from the CTDB daemon about how many calls it has
572 served. Information about various fields in statistics can be found in
573 ctdb-statistics(7).
574
575 Example
576 # ctdb statistics
577 CTDB version 1
578 Current time of statistics : Tue Mar 8 15:18:51 2016
579 Statistics collected since : (003 21:31:32) Fri Mar 4 17:47:19 2016
580 num_clients 9
581 frozen 0
582 recovering 0
583 num_recoveries 2
584 client_packets_sent 8170534
585 client_packets_recv 7166132
586 node_packets_sent 16549998
587 node_packets_recv 5244418
588 keepalive_packets_sent 201969
589 keepalive_packets_recv 201969
590 node
591 req_call 26
592 reply_call 0
593 req_dmaster 9
594 reply_dmaster 12
595 reply_error 0
596 req_message 1339231
597 req_control 8177506
598 reply_control 6831284
599 client
600 req_call 15
601 req_message 334809
602 req_control 6831308
603 timeouts
604 call 0
605 control 0
606 traverse 0
607 locks
608 num_calls 8
609 num_current 0
610 num_pending 0
611 num_failed 0
612 total_calls 15
613 pending_calls 0
614 childwrite_calls 0
615 pending_childwrite_calls 0
616 memory_used 394879
617 max_hop_count 1
618 total_ro_delegations 0
619 total_ro_revokes 0
620 hop_count_buckets: 8 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0
621 lock_buckets: 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0
622 locks_latency MIN/AVG/MAX 0.010005/0.010418/0.011010 sec out of 8
623 reclock_ctdbd MIN/AVG/MAX 0.002538/0.002538/0.002538 sec out of 1
624 reclock_recd MIN/AVG/MAX 0.000000/0.000000/0.000000 sec out of 0
625 call_latency MIN/AVG/MAX 0.000044/0.002142/0.011702 sec out of 15
626 childwrite_latency MIN/AVG/MAX 0.000000/0.000000/0.000000 sec out of 0
627
628
629 statisticsreset
630 This command is used to clear all statistics counters in a node.
631
632 Example: ctdb statisticsreset
633
634 dbstatistics DB
635 Display statistics about the database DB. Information about various
636 fields in dbstatistics can be found in ctdb-statistics(7).
637
638 Example
639 # ctdb dbstatistics locking.tdb
640 DB Statistics: locking.tdb
641 ro_delegations 0
642 ro_revokes 0
643 locks
644 total 14356
645 failed 0
646 current 0
647 pending 0
648 hop_count_buckets: 28087 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0
649 lock_buckets: 0 14188 38 76 32 19 3 0 0 0 0 0 0 0 0 0
650 locks_latency MIN/AVG/MAX 0.001066/0.012686/4.202292 sec out of 14356
651 vacuum_latency MIN/AVG/MAX 0.000472/0.002207/15.243570 sec out of 224530
652 Num Hot Keys: 1
653 Count:8 Key:ff5bd7cb3ee3822edc1f0000000000000000000000000000
654
655
656 getreclock
657 Show details of the recovery lock, if any.
658
659 Example output:
660
661 /clusterfs/.ctdb/recovery.lock
662
663
664 getdebug
665 Get the current debug level for the node. The debug level controls what
666 information is written to the log file.
667
668 The debug levels are mapped to the corresponding syslog levels. When a
669 debug level is set, only those messages at that level and higher levels
670 will be printed.
671
672 The list of debug levels from highest to lowest is:
673
674 ERROR WARNING NOTICE INFO DEBUG
675
676 setdebug DEBUGLEVEL
677 Set the debug level of a node. This controls what information will be
678 logged.
679
680 DEBUGLEVEL is one of ERROR, WARNING, NOTICE, INFO or DEBUG.
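
For example, to raise the log verbosity while debugging and then confirm the change:

      # ctdb setdebug INFO
      # ctdb getdebug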
681
682 getpid
683 This command will return the process id of the ctdb daemon.
684
685 disable
686 This command is used to administratively disable a node in the cluster.
687 A disabled node will still participate in the cluster and host
688 clustered TDB records, but its public IP addresses are taken over by
689 a different node and it no longer hosts any services.
690
691 enable
692 Re-enable a node that has been administratively disabled.
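
A typical maintenance sequence might be sketched as follows (PNN 1 is only an example; -n selects the node to act on):

      # ctdb -n 1 disable
      # ctdb nodestatus 1
      # ctdb -n 1 enable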
693
694 stop
695 This command is used to administratively STOP a node in the cluster. A
696 STOPPED node is connected to the cluster but will not host any public
697 IP addresses, nor does it participate in the VNNMAP. The difference
698 between a DISABLED node and a STOPPED node is that a STOPPED node does
699 not host any part of the database, which means that a recovery is
700 required to stop/continue nodes.
701
702 continue
703 Re-start a node that has been administratively stopped.
704
705 addip IPADDR/mask IFACE
706 This command is used to add a new public ip to a node during runtime.
707 It should be followed by a ctdb ipreallocate. This allows public
708 addresses to be added to a cluster without having to restart the ctdb
709 daemons.
710
711 Note that this only updates the runtime instance of ctdb. Any changes
712 will be lost next time ctdb is restarted and the public addresses file
713 is re-read. If you want this change to be permanent you must also
714 update the public addresses file manually.
715
716 delip IPADDR
717 This command flags IPADDR for deletion from a node at runtime. It
718 should be followed by a ctdb ipreallocate. If IPADDR is currently
719 hosted by the node it is being removed from, this ensures that the IP
720 will first be failed over to another node, if possible, and that it is
721 then actually removed.
722
723 Note that this only updates the runtime instance of CTDB. Any changes
724 will be lost next time CTDB is restarted and the public addresses file
725 is re-read. If you want this change to be permanent you must also
726 update the public addresses file manually.
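
A sketch of this runtime workflow, using a purely illustrative address, netmask and interface:

      # ctdb addip 10.1.1.1/24 eth0
      # ctdb ipreallocate
      # ctdb delip 10.1.1.1
      # ctdb ipreallocate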
727
728 moveip IPADDR PNN
729 This command can be used to manually fail a public ip address to a
730 specific node.
731
732 Because this command manually overrides the "automatic" distribution of
733 public IP addresses that ctdb normally provides, it only works when the
734 following tunables are set:
735
736 IPAllocAlgorithm != 0
737
738 NoIPFailback = 1
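
A sketch of the sequence, first checking that IPAllocAlgorithm is already non-zero and using an illustrative address and target node:

      # ctdb getvar IPAllocAlgorithm
      # ctdb setvar NoIPFailback 1
      # ctdb moveip 10.1.1.1 2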
739
740 shutdown
741 This command will shut down a specific CTDB daemon.
742
743 setlmasterrole on|off
744 This command is used to enable/disable the LMASTER capability for a
745 node at runtime. This capability determines whether or not a node can
746 be used as an LMASTER for records in the database. A node that does not
747 have the LMASTER capability will not show up in the vnnmap.
748
749 Nodes will by default have this capability, but it can be stripped off
750 nodes by the setting in the sysconfig file or by using this command.
751
752 Once this setting has been enabled/disabled, you need to perform a
753 recovery for it to take effect.
754
755 See also "ctdb getcapabilities"
756
757 setrecmasterrole on|off
758 This command is used to enable/disable the RECMASTER capability for a
759 node at runtime. This capability determines whether or not a node can
760 be used as the RECMASTER for the cluster. A node that does not have the
761 RECMASTER capability cannot win a recmaster election. A node that is
762 already the recmaster for the cluster when the capability is
763 stripped off the node will remain the recmaster until the next cluster
764 election.
765
766 Nodes will by default have this capability, but it can be stripped off
767 nodes by the setting in the sysconfig file or by using this command.
768
769 See also "ctdb getcapabilities"
770
771 reloadnodes
772 This command is used when adding new nodes, or removing existing nodes
773 from an existing cluster.
774
775 Procedure to add nodes (a command sketch follows the removal procedure below):
776
777 1. To expand an existing cluster, first ensure with ctdb status that
778 all nodes are up and running and that they are all healthy. Do not
779 try to expand a cluster unless it is completely healthy!
780
781 2. On all nodes, edit /etc/ctdb/nodes and add the new nodes at the end
782 of this file.
783
784 3. Verify that all the nodes have identical /etc/ctdb/nodes files
785 after adding the new nodes.
786
787 4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
788
789 5. Use ctdb status on all nodes and verify that they now show the
790 additional nodes.
791
792 6. Install and configure the new node and bring it online.
793
794 Procedure to remove nodes:
795
796 1. To remove nodes from an existing cluster, first ensure with ctdb
797 status that all nodes, except the node to be deleted, are up and
798 running and that they are all healthy. Do not try to remove nodes
799 from a cluster unless the cluster is completely healthy!
800
801 2. Shutdown and power off the node to be removed.
802
803 3. On all other nodes, edit the /etc/ctdb/nodes file and comment out
804 the nodes to be removed. Do not delete the lines for the deleted
805 nodes, just comment them out by adding a '#' at the beginning of
806 the lines.
807
808 4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
809
810 5. Use ctdb status on all nodes and verify that the deleted nodes are
811 no longer listed.
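
On one node, the expansion procedure above might be sketched as follows (the editor is illustrative, and the edit to /etc/ctdb/nodes must be repeated on every node):

      # vi /etc/ctdb/nodes
      # ctdb reloadnodes
      # ctdb status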
812
813 reloadips [PNN-LIST]
814 This command reloads the public addresses configuration file on the
815 specified nodes. When it completes addresses will be reconfigured and
816 reassigned across the cluster as necessary.
817
818 This command is currently unable to make changes to the netmask or
819 interfaces associated with existing addresses. Such changes must be
820 made in 2 steps by deleting the addresses in question and re-adding them.
821 Unfortunately this will disrupt connections to the changed addresses.
822
823 getdbmap
824 This command lists all clustered TDB databases that the CTDB daemon has
825 attached to. Some databases are flagged as PERSISTENT; this means that
826 the database stores data persistently and the data will remain across
827 reboots. One example of such a database is secrets.tdb, where
828 information about how the cluster was joined to the domain is stored.
829 Some databases are flagged as REPLICATED; this means that the data in
830 that database is replicated across all the nodes, but the data will not
831 remain across reboots. This type of database is used by CTDB to store
832 its internal state.
833
834 If a PERSISTENT database is not in a healthy state the database is
835 flagged as UNHEALTHY. If there's at least one completely healthy node
836 running in the cluster, the content may be restored automatically by a
837 recovery run. Otherwise an administrator needs to analyze
838 the problem.
839
840 See also "ctdb getdbstatus", "ctdb backupdb", "ctdb restoredb", "ctdb
841 dumpbackup", "ctdb wipedb", "ctdb setvar AllowUnhealthyDBRead 1" and
842 (if samba or tdb-utils are installed) "tdbtool check".
843
844 Most databases are not persistent and only store the state information
845 that the currently running samba daemons need. These databases are
846 always wiped when ctdb/samba starts and when a node is rebooted.
847
848 Example
849 # ctdb getdbmap
850 Number of databases:10
851 dbid:0x435d3410 name:notify.tdb path:/var/lib/ctdb/notify.tdb.0
852 dbid:0x42fe72c5 name:locking.tdb path:/var/lib/ctdb/locking.tdb.0
853 dbid:0x1421fb78 name:brlock.tdb path:/var/lib/ctdb/brlock.tdb.0
854 dbid:0x17055d90 name:connections.tdb path:/var/lib/ctdb/connections.tdb.0
855 dbid:0xc0bdde6a name:sessionid.tdb path:/var/lib/ctdb/sessionid.tdb.0
856 dbid:0x122224da name:test.tdb path:/var/lib/ctdb/test.tdb.0
857 dbid:0x2672a57f name:idmap2.tdb path:/var/lib/ctdb/persistent/idmap2.tdb.0 PERSISTENT
858 dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT
859 dbid:0xe98e08b6 name:group_mapping.tdb path:/var/lib/ctdb/persistent/group_mapping.tdb.0 PERSISTENT
860 dbid:0x7bbbd26c name:passdb.tdb path:/var/lib/ctdb/persistent/passdb.tdb.0 PERSISTENT
861
862 # ctdb getdbmap # example for unhealthy database
863 Number of databases:1
864 dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT UNHEALTHY
865
866 # ctdb -X getdbmap
867 |ID|Name|Path|Persistent|Unhealthy|
868 |0x7bbbd26c|passdb.tdb|/var/lib/ctdb/persistent/passdb.tdb.0|1|0|
869
870
871 backupdb DB FILE
872 Copy the contents of database DB to FILE. FILE can later be read back
873 using restoredb. This is mainly useful for backing up persistent
874 databases such as secrets.tdb and similar.
875
876 restoredb FILE [DB]
877 This command restores a persistent database that was previously backed
878 up using backupdb. By default the data will be restored back into the
879 same database as it was created from. By specifying DB you can
880 restore the data into a different database.
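
For example, to back up and later restore a persistent database (the backup file path is only illustrative):

      # ctdb backupdb secrets.tdb /root/secrets.tdb.backup
      # ctdb restoredb /root/secrets.tdb.backup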
881
882 setdbreadonly DB
883 This command will enable the read-only record support for a database.
884 This is an experimental feature to improve performance for contended
885 records primarily in locking.tdb and brlock.tdb. When enabling this
886 feature you must set it on all nodes in the cluster.
887
888 setdbsticky DB
889 This command will enable the sticky record support for the specified
890 database. This is an experimental feature to improve performance for
891 contended records primarily in locking.tdb and brlock.tdb. When
892 enabling this feature you must set it on all nodes in the cluster.
893
894 INTERNAL COMMANDS
895 Internal commands are used by CTDB's scripts and are not required for
896 managing a CTDB cluster. Their parameters and behaviour are subject to
897 change.
898
899 gettickles IPADDR
900 Show TCP connections that are registered with CTDB to be "tickled" if
901 there is a failover.
902
903 gratarp IPADDR INTERFACE
904 Send out a gratuitous ARP for the specified IP address through the
905 specified interface. This command is mainly used by the ctdb
906 eventscripts.
907
908 pdelete DB KEY
909 Delete KEY from DB.
910
911 pfetch DB KEY
912 Print the value associated with KEY in DB.
913
914 pstore DB KEY FILE
915 Store KEY in DB with contents of FILE as the associated value.
916
917 ptrans DB [FILE]
918 Read a list of key-value pairs, one per line from FILE, and store them
919 in DB using a single transaction. An empty value is equivalent to
920 deleting the given key.
921
922 The key and value should be separated by spaces or tabs. Each key/value
923 should be a printable string enclosed in double-quotes.
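
For example (the database name, keys and values are illustrative), a file of quoted pairs can be applied in a single transaction; the empty value deletes key2:

      # echo '"key1" "value1"' > /tmp/pairs.txt
      # echo '"key2" ""' >> /tmp/pairs.txt
      # ctdb ptrans test.tdb /tmp/pairs.txt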
924
925 runstate [setup|first_recovery|startup|running]
926 Print the runstate of the specified node. Runstates are used to
927 serialise important state transitions in CTDB, particularly during
928 startup.
929
930 If one or more optional runstate arguments are specified then the node
931 must be in one of these runstates for the command to succeed.
932
933 Example
934 # ctdb runstate
935 RUNNING
936
937
938 setifacelink IFACE up|down
939 Set the internal state of network interface IFACE. This is typically
940 used in the 10.interface script in the "monitor" event.
941
942 Example: ctdb setifacelink eth0 up
943
944 tickle
945 Read a list of TCP connections, one per line, from standard input and
946 send a TCP tickle to the source host for each connection. A connection
947 is specified as:
948
949 SRC-IPADDR:SRC-PORT DST-IPADDR:DST-PORT
950
951
952 A single connection can be specified on the command-line rather than on
953 standard input.
954
955 A TCP tickle is a TCP ACK packet with an invalid sequence and
956 acknowledgement number. When received by the source host, it causes the
957 host to send an immediate correct ACK back to the other end.
958
959 TCP tickles are useful to "tickle" clients after an IP failover has
960 occurred, since this makes the client immediately recognize that the TCP
961 connection has been disrupted and needs to be re-established. This
962 greatly reduces the time it takes for a client to detect the failover and
963 re-establish its connection.
964
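For example, to tickle a single connection given on the command line (the addresses and ports are purely illustrative):

      # ctdb tickle 10.0.0.99:49152 10.0.0.30:445
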
965 version
966 Display the CTDB version.
967
968 DEBUGGING COMMANDS
969 These commands are primarily used for CTDB development and testing and
970 should not be used for normal administration.
971
972 OPTIONS
973 --print-emptyrecords
974 This enables printing of empty records when dumping databases with
975 the catdb, cattdb and dumpdbbackup commands. Records with empty
976 data segment are considered deleted by ctdb and cleaned by the
977 vacuuming mechanism, so this switch can come in handy for debugging
978 the vacuuming behaviour.
979
980 --print-datasize
981 This lets database dumps (catdb, cattdb, dumpdbbackup) print the
982 size of the record data instead of dumping the data contents.
983
984 --print-lmaster
985 This lets catdb print the lmaster for each record.
986
987 --print-hash
988 This lets database dumps (catdb, cattdb, dumpdbbackup) print the
989 hash for each record.
990
991 --print-recordflags
992 This lets catdb and dumpdbbackup print the record flags for each
993 record. Note that cattdb always prints the flags.
994
995 process-exists PID [SRVID]
996 This command checks if a specific process exists on the CTDB host. This
997 is mainly used by Samba to check if remote instances of samba are still
998 running or not. When the optional SRVID argument is specified, the
999 command checks if the specified process exists on the CTDB host and has
1000 registered for the specified SRVID.
1001
1002 getdbstatus DB
1003 This command displays more details about a database.
1004
1005 Example
1006 # ctdb getdbstatus test.tdb.0
1007 dbid: 0x122224da
1008 name: test.tdb
1009 path: /var/lib/ctdb/test.tdb.0
1010 PERSISTENT: no
1011 HEALTH: OK
1012
1013 # ctdb getdbstatus registry.tdb # with a corrupted TDB
1014 dbid: 0xf2a58948
1015 name: registry.tdb
1016 path: /var/lib/ctdb/persistent/registry.tdb.0
1017 PERSISTENT: yes
1018 HEALTH: NO-HEALTHY-NODES - ERROR - Backup of corrupted TDB in '/var/lib/ctdb/persistent/registry.tdb.0.corrupted.20091208091949.0Z'
1019
1020
1021 catdb DB
1022 Print a dump of the clustered TDB database DB.
1023
1024 cattdb DB
1025 Print a dump of the contents of the local TDB database DB.
1026
1027 dumpdbbackup FILE
1028 Print a dump of the contents from database backup FILE, similar to
1029 catdb.
1030
1031 wipedb DB
1032 Remove all contents of database DB.
1033
1034 recover
1035 This command will trigger the recovery daemon to do a cluster recovery.
1036
1037 ipreallocate, sync
1038 This command will force the recovery master to perform a full ip
1039 reallocation process and redistribute all ip addresses. This is useful
1040 to "reset" the allocations back to its default state if they have been
1041 changed using the "moveip" command. While a "recover" will also perform
1042 this reallocation, a recovery is much more hevyweight since it will
1043 also rebuild all the databases.
1044
1045 attach DBNAME [persistent|replicated]
1046 Create a new CTDB database called DBNAME and attach to it on all nodes.
1047
1048 detach DB-LIST
1049 Detach specified non-persistent database(s) from the cluster. This
1050 command will disconnect the specified database(s) on all nodes in the
1051 cluster. This command should only be used when none of the specified
1052 database(s) are in use.
1053
1054 All nodes should be active and the tunable AllowClientDBAttach should be
1055 disabled on all nodes before detaching databases.
1056
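For example (the database name is illustrative), a volatile test database can be created and later detached:

      # ctdb attach mytest.tdb
      # ctdb detach mytest.tdb
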
1057 dumpmemory
1058 This is a debugging command. This command will make the ctdb daemon
1059 write a full memory allocation map to standard output.
1060
1061 rddumpmemory
1062 This is a debugging command. This command will dump the talloc memory
1063 allocation tree for the recovery daemon to standard output.
1064
1065 ban BANTIME
1066 Administratively ban a node for BANTIME seconds. The node will be
1067 unbanned after BANTIME seconds have elapsed.
1068
1069 A banned node does not participate in the cluster. It does not host any
1070 records for the clustered TDB and does not host any public IP
1071 addresses.
1072
1073 Nodes are automatically banned if they misbehave. For example, a node
1074 may be banned if it causes too many cluster recoveries.
1075
1076 To administratively exclude a node from a cluster use the stop command.
1077
1078 unban
1079 This command is used to unban a node that has either been
1080 administratively banned using the ban command or has been automatically
1081 banned.
1082
1083 SEE ALSO
1084 ctdbd(1), onnode(1), ctdb(7), ctdb-statistics(7), ctdb-tunables(7),
1085 http://ctdb.samba.org/
1086
1087 AUTHOR
1088 This documentation was written by Ronnie Sahlberg, Amitay Isaacs,
1089 Martin Schwenke
1090
1091 COPYRIGHT
1092 Copyright © 2007 Andrew Tridgell, Ronnie Sahlberg
1093
1094 This program is free software; you can redistribute it and/or modify it
1095 under the terms of the GNU General Public License as published by the
1096 Free Software Foundation; either version 3 of the License, or (at your
1097 option) any later version.
1098
1099 This program is distributed in the hope that it will be useful, but
1100 WITHOUT ANY WARRANTY; without even the implied warranty of
1101 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1102 General Public License for more details.
1103
1104 You should have received a copy of the GNU General Public License along
1105 with this program; if not, see http://www.gnu.org/licenses.
1106
1107
1108
1109
1110 ctdb 11/13/2021 CTDB(1)