ctdb(1) - f37

CTDB(1)                  CTDB - clustered TDB database                 CTDB(1)

NAME

       ctdb - CTDB management utility

SYNOPSIS

       ctdb [OPTION...] {COMMAND} [COMMAND-ARGS]

DESCRIPTION

       ctdb is a utility to view and manage a CTDB cluster.

       The following terms are used when referring to nodes in a cluster:

       PNN
           Physical Node Number. The physical node number is an integer that
           describes the node in the cluster. The first node in a cluster has
           physical node number 0.

       PNN-LIST
           This is either a single PNN, a comma-separated list of PNNs or
           "all".

       Commands that reference a database use the following terms:

       DB
           This is either a database name, such as locking.tdb, or a database
           ID such as "0x42fe72c5".

       DB-LIST
           A space-separated list of at least one DB.

OPTIONS

       -n PNN
           The node specified by PNN should be queried for the requested
           information. Default is to query the daemon running on the local
           host.

       -Y
           Produce output in machine readable form for easier parsing by
           scripts. This uses a field delimiter of ':'. Not all commands
           support this option.

       -x SEPARATOR
           Use SEPARATOR to delimit fields in machine readable output. This
           implies -Y.

       -X
           Produce output in machine readable form for easier parsing by
           scripts. This uses a field delimiter of '|'. Not all commands
           support this option.

           This is equivalent to "-x|" and avoids some shell quoting issues.

       -t TIMEOUT
           Indicates that ctdb should wait up to TIMEOUT seconds for a
           response to most commands sent to the CTDB daemon. The default is
           10 seconds.

       -T TIMELIMIT
           Indicates that TIMELIMIT is the maximum run time (in seconds) for
           the ctdb command. When TIMELIMIT is exceeded the ctdb command will
           terminate with an error. The default is 120 seconds.

       -? --help
           Print some help text to the screen.

       --usage
           Print usage information to the screen.

       -d --debug=DEBUGLEVEL
           Change the debug level for the command. Default is NOTICE.

ADMINISTRATIVE COMMANDS

       These are commands used to monitor and administer a CTDB cluster.

   pnn
       This command displays the PNN of the current node.

   status
       This command shows the current status of all CTDB nodes based on
       information from the queried node.

       Note: If the queried node is INACTIVE then the status might not be
       current.

       Node status
           This includes the number of physical nodes and the status of each
           node. See ctdb(7) for information about node states.

       Generation
           The generation id is a number that indicates the current generation
           of a cluster instance. Each time a cluster goes through a
           reconfiguration or a recovery its generation id will be changed.

           This number does not have any particular meaning other than to keep
           track of when a cluster has gone through a recovery. It is a random
           number that represents the current instance of a ctdb cluster and
           its databases. The CTDB daemon uses this number internally to be
           able to tell when commands to operate on the cluster and the
           databases were issued in a different generation of the cluster, to
           ensure that commands that operate on the databases will not survive
           across a cluster database recovery. After a recovery, all old
           outstanding commands will automatically become invalid.

           Sometimes this number will be shown as "INVALID". This only means
           that the ctdbd daemon has started but it has not yet merged with
           the cluster through a recovery. All nodes start with generation
           "INVALID" and are not assigned a real generation id until they have
           successfully been merged with a cluster through a recovery.

       Virtual Node Number (VNN) map
           Consists of the number of virtual nodes and mapping from virtual
           node numbers to physical node numbers. Only nodes that are
           participating in the VNN map can become lmaster for database
           records.

       Recovery mode
           This is the current recovery mode of the cluster. There are two
           possible modes:

           NORMAL - The cluster is fully operational.

           RECOVERY - The cluster databases have all been frozen, pausing all
           services while the cluster awaits a recovery process to complete. A
           recovery process should finish within seconds. If a cluster is
           stuck in the RECOVERY state this would indicate a cluster
           malfunction which needs to be investigated.

           Once the leader detects an inconsistency, for example a node
           becomes disconnected/connected, the recovery daemon will trigger a
           cluster recovery process, where all databases are remerged across
           the cluster. When this process starts, the leader will first
           "freeze" all databases to prevent applications such as samba from
           accessing the databases and it will also mark the recovery mode as
           RECOVERY.

           When the CTDB daemon starts up, it will start in RECOVERY mode.
           Once the node has been merged into a cluster and all databases have
           been recovered, the node mode will change into NORMAL mode and the
           databases will be "thawed", allowing samba to access the databases
           again.

       Leader
           This is the cluster node that is currently designated as the
           leader. This node is responsible for monitoring the consistency of
           the cluster and for performing the actual recovery process when
           required.

           Only one node at a time can be the designated leader. Which node is
           designated the leader is decided by an election process in the
           recovery daemons running on each node.

       Example
               # ctdb status
               Number of nodes:4
               pnn:0 192.168.2.200       OK (THIS NODE)
               pnn:1 192.168.2.201       OK
               pnn:2 192.168.2.202       OK
               pnn:3 192.168.2.203       OK
               Generation:1362079228
               Size:4
               hash:0 lmaster:0
               hash:1 lmaster:1
               hash:2 lmaster:2
               hash:3 lmaster:3
               Recovery mode:NORMAL (0)
               Leader:0

   nodestatus [PNN-LIST]
       This command is similar to the status command. It displays the "node
       status" subset of output. The main differences are:

       •   The exit code is the bitwise-OR of the flags for each specified
           node, while ctdb status exits with 0 if it was able to retrieve
           status for all nodes.

       •   ctdb status provides status information for all nodes. ctdb
           nodestatus defaults to providing status for only the current node.
           If PNN-LIST is provided then status is given for the indicated
           node(s).

       A common invocation in scripts is ctdb nodestatus all to check whether
       all nodes in a cluster are healthy.

       Example
               # ctdb nodestatus
               pnn:0 10.0.0.30        OK (THIS NODE)

               # ctdb nodestatus all
               Number of nodes:2
               pnn:0 10.0.0.30        OK (THIS NODE)
               pnn:1 10.0.0.31        OK
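       The exit-code behaviour of ctdb nodestatus lends itself to a scripted
       health check. The following is a minimal sketch, not part of CTDB: the
       function name cluster_healthy and the CTDB variable (which allows a
       different binary or a test stub to be substituted) are assumptions made
       for illustration. A non-zero result means that at least one flag is set
       on some node, or that the ctdb command itself failed.

```shell
# Hypothetical helper: succeeds only when "ctdb nodestatus all" exits with 0,
# i.e. when the bitwise-OR of the flags of all nodes is 0.
# CTDB may be set to override the command that is run (assumption).
cluster_healthy() {
    "${CTDB:-ctdb}" nodestatus all >/dev/null 2>&1
}
```

       A script might then use it as: cluster_healthy || echo "cluster needs
       attention".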

   leader
       This command shows the PNN of the node which is currently the leader.

       Note: If the queried node is INACTIVE then the status might not be
       current.

   uptime
       This command shows the uptime for the ctdb daemon, when the last
       recovery or IP failover completed, and how long it took. If the
       "duration" is shown as a negative number, this indicates that there is
       a recovery/failover in progress and it started that many seconds ago.

       Example
               # ctdb uptime
               Current time of node          :                Thu Oct 29 10:38:54 2009
               Ctdbd start time              : (000 16:54:28) Wed Oct 28 17:44:26 2009
               Time of last recovery/failover: (000 16:53:31) Wed Oct 28 17:45:23 2009
               Duration of last recovery/failover: 2.248552 seconds
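       The negative-duration convention can be tested from a script. This is a
       minimal sketch that assumes the output layout shown in the example
       above; the function name recovery_in_progress is hypothetical. It reads
       the output of ctdb uptime on standard input.

```shell
# Hypothetical helper: reads "ctdb uptime" output on stdin and succeeds
# (exit status 0) only if the reported duration is negative, which indicates
# that a recovery/failover is still in progress.
recovery_in_progress() {
    awk -F': ' '
        /^ *Duration of last recovery\/failover/ {
            found = 1
            negative = ($2 + 0 < 0)   # "$2 + 0" drops the trailing " seconds"
        }
        END { exit !(found && negative) }
    '
}
```

       Example use: ctdb uptime | recovery_in_progress && echo "recovery
       running".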

   listnodes
       This command lists the IP addresses of all the nodes in the cluster.

       Example
               # ctdb listnodes
               192.168.2.200
               192.168.2.201
               192.168.2.202
               192.168.2.203

   natgw {leader|list|status}
       This command shows different aspects of NAT gateway status. For an
       overview of CTDB's NAT gateway functionality please see the NAT GATEWAY
       section in ctdb(7).

       leader
           Show the PNN and private IP address of the current NAT gateway
           leader node.

           Example output:

               1 192.168.2.201

       list
           List the private IP addresses of nodes in the current NAT gateway
           group, annotating the leader node.

           Example output:

               192.168.2.200
               192.168.2.201  LEADER
               192.168.2.202
               192.168.2.203

       status
           List the nodes in the current NAT gateway group and their status.

           Example output:

               pnn:0 192.168.2.200       UNHEALTHY (THIS NODE)
               pnn:1 192.168.2.201       OK
               pnn:2 192.168.2.202       OK
               pnn:3 192.168.2.203       OK

   ping
       This command will "ping" specified CTDB nodes in the cluster to verify
       that they are running.

       Example
               # ctdb ping
               response from 0 time=0.000054 sec  (3 clients)

   ifaces
       This command will display the list of network interfaces, which could
       host public addresses, along with their status.

       Example
               # ctdb ifaces
               Interfaces on node 0
               name:eth5 link:up references:2
               name:eth4 link:down references:0
               name:eth3 link:up references:1
               name:eth2 link:up references:1

               # ctdb -X ifaces
               |Name|LinkStatus|References|
               |eth5|1|2|
               |eth4|0|0|
               |eth3|1|1|
               |eth2|1|1|
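       The machine readable form is convenient for scripts. The following
       minimal sketch assumes the column layout shown in the -X example above,
       where a LinkStatus of 0 corresponds to "link:down"; the function name
       down_ifaces is hypothetical. It reads the output of ctdb -X ifaces on
       standard input.

```shell
# Hypothetical helper: reads "ctdb -X ifaces" output on stdin and prints the
# name of every interface whose LinkStatus column is 0 (link down).
# With '|' as separator, field 1 is empty, field 2 is Name, field 3 is
# LinkStatus. The first line is the header and is skipped.
down_ifaces() {
    awk -F'|' 'NR > 1 && $3 == "0" { print $2 }'
}
```

       Example use: ctdb -X ifaces | down_ifaces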

   ip
       This command will display the list of public addresses that are
       provided by the cluster and which physical node is currently serving
       each address. By default this command will ONLY show those public
       addresses that are known to the node itself. To see the full list of
       all public IPs across the cluster you must use "ctdb ip all".

       Example
               # ctdb ip -v
               Public IPs on node 0
               172.31.91.82 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
               172.31.91.83 node[0] active[eth3] available[eth2,eth3] configured[eth2,eth3]
               172.31.91.84 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
               172.31.91.85 node[0] active[eth2] available[eth2,eth3] configured[eth2,eth3]
               172.31.92.82 node[1] active[] available[eth5] configured[eth4,eth5]
               172.31.92.83 node[0] active[eth5] available[eth5] configured[eth4,eth5]
               172.31.92.84 node[1] active[] available[eth5] configured[eth4,eth5]
               172.31.92.85 node[0] active[eth5] available[eth5] configured[eth4,eth5]

               # ctdb -X ip -v
               |Public IP|Node|ActiveInterface|AvailableInterfaces|ConfiguredInterfaces|
               |172.31.91.82|1||eth2,eth3|eth2,eth3|
               |172.31.91.83|0|eth3|eth2,eth3|eth2,eth3|
               |172.31.91.84|1||eth2,eth3|eth2,eth3|
               |172.31.91.85|0|eth2|eth2,eth3|eth2,eth3|
               |172.31.92.82|1||eth5|eth4,eth5|
               |172.31.92.83|0|eth5|eth5|eth4,eth5|
               |172.31.92.84|1||eth5|eth4,eth5|
               |172.31.92.85|0|eth5|eth5|eth4,eth5|
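       To see how public addresses are distributed, the machine readable
       output can be summarised per node. This is a minimal sketch that
       assumes the column layout of the -X example above; the function name
       ips_per_node is hypothetical. It reads the output of ctdb -X ip (with
       or without -v) on standard input.

```shell
# Hypothetical helper: reads "ctdb -X ip" output on stdin and prints one line
# per node in the form "<node> <number of public addresses served>".
# With '|' as separator, field 2 is the address and field 3 is the node.
ips_per_node() {
    awk -F'|' '
        NR > 1 && $3 != "" { count[$3]++ }   # skip the header line
        END { for (n in count) print n, count[n] }
    ' | sort -n
}
```

       Example use: ctdb -X ip all | ips_per_node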

   ipinfo IP
       This command will display details about the specified public address.

       Example
               # ctdb ipinfo 172.31.92.85
               Public IP[172.31.92.85] info on node 0
               IP:172.31.92.85
               CurrentNode:0
               NumInterfaces:2
               Interface[1]: Name:eth4 Link:down References:0
               Interface[2]: Name:eth5 Link:up References:2 (active)

   event run|status|script list|script enable|script disable
       This command is used to control the event daemon and to inspect the
       status of various events.

       The commands below require a component to be specified. In the current
       version the only valid component is legacy.

       run TIMEOUT COMPONENT EVENT [ARGUMENTS]
           This command can be used to manually run the specified EVENT in
           COMPONENT with optional ARGUMENTS. The event will be allowed to run
           a maximum of TIMEOUT seconds. If TIMEOUT is 0, then there is no
           time limit for running the event.

       status COMPONENT EVENT
           This command displays the last execution status of the specified
           EVENT in COMPONENT.

           The command will terminate with the exit status corresponding to
           the overall status of the event that is displayed.

           The output is the list of event scripts executed. Each line shows
           the name, status, duration and start time for each script.

           Example

               # ctdb event status legacy monitor
               00.ctdb              OK         0.014 Sat Dec 17 19:39:11 2016
               01.reclock           OK         0.013 Sat Dec 17 19:39:11 2016
               05.system            OK         0.029 Sat Dec 17 19:39:11 2016
               06.nfs               OK         0.014 Sat Dec 17 19:39:11 2016
               10.interface         OK         0.037 Sat Dec 17 19:39:11 2016
               11.natgw             OK         0.011 Sat Dec 17 19:39:11 2016
               11.routing           OK         0.007 Sat Dec 17 19:39:11 2016
               13.per_ip_routing    OK         0.007 Sat Dec 17 19:39:11 2016
               20.multipathd        OK         0.007 Sat Dec 17 19:39:11 2016
               31.clamd             OK         0.007 Sat Dec 17 19:39:11 2016
               40.vsftpd            OK         0.013 Sat Dec 17 19:39:11 2016
               41.httpd             OK         0.018 Sat Dec 17 19:39:11 2016
               49.winbind           OK         0.023 Sat Dec 17 19:39:11 2016
               50.samba             OK         0.100 Sat Dec 17 19:39:12 2016
               60.nfs               OK         0.376 Sat Dec 17 19:39:12 2016
               70.iscsi             OK         0.009 Sat Dec 17 19:39:12 2016
               91.lvs               OK         0.007 Sat Dec 17 19:39:12 2016
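           Because each line starts with the script name followed by its
           status, failing scripts can be picked out with a small filter. This
           is a minimal sketch, assuming that a failed script is reported with
           the status ERROR (as in the ctdb scriptstatus example in this
           manual) and that script output lines begin with "OUTPUT:". The
           function name failed_scripts is hypothetical.

```shell
# Hypothetical helper: reads "ctdb event status COMPONENT EVENT" output on
# stdin and prints the name of every script whose status column is ERROR.
# "OUTPUT:" continuation lines are ignored because their second field is
# "ERROR:" (with a colon) at most, never exactly "ERROR".
failed_scripts() {
    awk '$2 == "ERROR" { print $1 }'
}
```

           Example use: ctdb event status legacy monitor | failed_scripts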

       script list COMPONENT
           List the available event scripts in COMPONENT. Enabled scripts are
           flagged with a '*'.

           Generally, event scripts are provided by CTDB. However, local or
           3rd party event scripts may also be available. These are shown in a
           separate section after those provided by CTDB.

           Example

               # ctdb event script list legacy
               * 00.ctdb
               * 01.reclock
               * 05.system
               * 06.nfs
               * 10.interface
                 11.natgw
                 11.routing
                 13.per_ip_routing
                 20.multipathd
                 31.clamd
                 40.vsftpd
                 41.httpd
               * 49.winbind
               * 50.samba
               * 60.nfs
                 70.iscsi
                 91.lvs

               * 02.local

       script enable COMPONENT SCRIPT
           Enable the specified event SCRIPT in COMPONENT. Only enabled
           scripts will be executed when running any event.

       script disable COMPONENT SCRIPT
           Disable the specified event SCRIPT in COMPONENT. This will prevent
           the script from executing when running any event.

   scriptstatus
       This command displays which event scripts were run in the previous
       monitoring cycle and the result of each script. If a script failed with
       an error, causing the node to become unhealthy, the output from that
       script is also shown.

       This command is deprecated. It's provided for backward compatibility.
       In place of ctdb scriptstatus, use ctdb event status.

       Example
               # ctdb scriptstatus
               00.ctdb              OK         0.011 Sat Dec 17 19:40:46 2016
               01.reclock           OK         0.010 Sat Dec 17 19:40:46 2016
               05.system            OK         0.030 Sat Dec 17 19:40:46 2016
               06.nfs               OK         0.014 Sat Dec 17 19:40:46 2016
               10.interface         OK         0.041 Sat Dec 17 19:40:46 2016
               11.natgw             OK         0.008 Sat Dec 17 19:40:46 2016
               11.routing           OK         0.007 Sat Dec 17 19:40:46 2016
               13.per_ip_routing    OK         0.007 Sat Dec 17 19:40:46 2016
               20.multipathd        OK         0.007 Sat Dec 17 19:40:46 2016
               31.clamd             OK         0.007 Sat Dec 17 19:40:46 2016
               40.vsftpd            OK         0.013 Sat Dec 17 19:40:46 2016
               41.httpd             OK         0.015 Sat Dec 17 19:40:46 2016
               49.winbind           OK         0.022 Sat Dec 17 19:40:46 2016
               50.samba             ERROR      0.077 Sat Dec 17 19:40:46 2016
                 OUTPUT: ERROR: samba tcp port 445 is not responding

   listvars
       List all tuneable variables, except the values of the obsolete tunables
       like VacuumMinInterval. The obsolete tunables can be retrieved only
       explicitly with the "ctdb getvar" command.

       Example
               # ctdb listvars
               SeqnumInterval          = 1000
               ControlTimeout          = 60
               TraverseTimeout         = 20
               KeepaliveInterval       = 5
               KeepaliveLimit          = 5
               RecoverTimeout          = 120
               RecoverInterval         = 1
               ElectionTimeout         = 3
               TakeoverTimeout         = 9
               MonitorInterval         = 15
               TickleUpdateInterval    = 20
               EventScriptTimeout      = 30
               MonitorTimeoutCount     = 20
               RecoveryGracePeriod     = 120
               RecoveryBanPeriod       = 300
               DatabaseHashSize        = 100001
               DatabaseMaxDead         = 5
               RerecoveryTimeout       = 10
               EnableBans              = 1
               NoIPFailback            = 0
               VerboseMemoryNames      = 0
               RecdPingTimeout         = 60
               RecdFailCount           = 10
               LogLatencyMs            = 0
               RecLockLatencyMs        = 1000
               RecoveryDropAllIPs      = 120
               VacuumInterval          = 10
               VacuumMaxRunTime        = 120
               RepackLimit             = 10000
               VacuumFastPathCount     = 60
               MaxQueueDropMsg         = 1000000
               AllowUnhealthyDBRead    = 0
               StatHistoryInterval     = 1
               DeferredAttachTO        = 120
               AllowClientDBAttach     = 1
               RecoverPDBBySeqNum      = 1
               DeferredRebalanceOnNodeAdd = 300
               FetchCollapse           = 1
               HopcountMakeSticky      = 50
               StickyDuration          = 600
               StickyPindown           = 200
               NoIPTakeover            = 0
               DBRecordCountWarn       = 100000
               DBRecordSizeWarn        = 10000000
               DBSizeWarn              = 100000000
               PullDBPreallocation     = 10485760
               LockProcessesPerDB      = 200
               RecBufferSizeLimit      = 1000000
               QueueBufferSize         = 1024
               IPAllocAlgorithm        = 2

   getvar NAME
       Get the runtime value of a tuneable variable.

       Example
               # ctdb getvar MonitorInterval
               MonitorInterval         = 15
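       Scripts usually need only the bare value. The following minimal sketch
       assumes the "NAME = VALUE" layout shown in the examples above; the
       function name tunable_value is hypothetical. It reads the output of
       ctdb getvar or ctdb listvars on standard input.

```shell
# Hypothetical helper: reads "NAME = VALUE" lines (as printed by
# "ctdb getvar" or "ctdb listvars") on stdin and prints the value of the
# tunable whose name is given as the first argument.
tunable_value() {
    awk -v name="$1" '$1 == name && $2 == "=" { print $3 }'
}
```

       Example use: ctdb getvar MonitorInterval | tunable_value
       MonitorInterval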

   setvar NAME VALUE
       Set the runtime value of a tuneable variable.

       Example
               # ctdb setvar MonitorInterval 20

   lvs {leader|list|status}
       This command shows different aspects of LVS status. For an overview of
       CTDB's LVS functionality please see the LVS section in ctdb(7).

       leader
           Shows the PNN of the current LVS leader node.

           Example output:

               2

       list
           Lists the currently usable LVS nodes.

           Example output:

               2 10.0.0.13
               3 10.0.0.14

       status
           List the nodes in the current LVS group and their status.

           Example output:

               pnn:0 10.0.0.11        UNHEALTHY (THIS NODE)
               pnn:1 10.0.0.12        UNHEALTHY
               pnn:2 10.0.0.13        OK
               pnn:3 10.0.0.14        OK

   getcapabilities
       This command shows the capabilities of the current node. See the
       CAPABILITIES section in ctdb(7) for more details.

       Example output:

           LEADER: YES
           LMASTER: YES

   statistics
       Collect statistics from the CTDB daemon about how many calls it has
       served. Information about various fields in statistics can be found in
       ctdb-statistics(7).

       Example
               # ctdb statistics
               CTDB version 1
               Current time of statistics  :                Tue Mar  8 15:18:51 2016
               Statistics collected since  : (003 21:31:32) Fri Mar  4 17:47:19 2016
                num_clients                        9
                frozen                             0
                recovering                         0
                num_recoveries                     2
                client_packets_sent          8170534
                client_packets_recv          7166132
                node_packets_sent           16549998
                node_packets_recv            5244418
                keepalive_packets_sent        201969
                keepalive_packets_recv        201969
                node
                    req_call                      26
                    reply_call                     0
                    req_dmaster                    9
                    reply_dmaster                 12
                    reply_error                    0
                    req_message              1339231
                    req_control              8177506
                    reply_control            6831284
                client
                    req_call                      15
                    req_message               334809
                    req_control              6831308
                timeouts
                    call                           0
                    control                        0
                    traverse                       0
                locks
                    num_calls                      8
                    num_current                    0
                    num_pending                    0
                    num_failed                     0
                total_calls                       15
                pending_calls                      0
                childwrite_calls                   0
                pending_childwrite_calls             0
                memory_used                   394879
                max_hop_count                      1
                total_ro_delegations               0
                total_ro_revokes                   0
                hop_count_buckets: 8 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0
                lock_buckets: 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0
                locks_latency      MIN/AVG/MAX     0.010005/0.010418/0.011010 sec out of 8
                reclock_ctdbd      MIN/AVG/MAX     0.002538/0.002538/0.002538 sec out of 1
                reclock_recd       MIN/AVG/MAX     0.000000/0.000000/0.000000 sec out of 0
                call_latency       MIN/AVG/MAX     0.000044/0.002142/0.011702 sec out of 15
                childwrite_latency MIN/AVG/MAX     0.000000/0.000000/0.000000 sec out of 0

   statisticsreset
       This command is used to clear all statistics counters in a node.

       Example: ctdb statisticsreset

   dbstatistics DB
       Display statistics about the database DB. Information about various
       fields in dbstatistics can be found in ctdb-statistics(7).

       Example
               # ctdb dbstatistics locking.tdb
               DB Statistics: locking.tdb
                ro_delegations                     0
                ro_revokes                         0
                locks
                    total                      14356
                    failed                         0
                    current                        0
                    pending                        0
                hop_count_buckets: 28087 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0
                lock_buckets: 0 14188 38 76 32 19 3 0 0 0 0 0 0 0 0 0
                locks_latency      MIN/AVG/MAX     0.001066/0.012686/4.202292 sec out of 14356
                vacuum_latency     MIN/AVG/MAX     0.000472/0.002207/15.243570 sec out of 224530
                Num Hot Keys:     1
                    Count:8 Key:ff5bd7cb3ee3822edc1f0000000000000000000000000000
653
654
655   getreclock
656       Show details of the recovery lock, if any.
657
658       Example output:
659
660                /clusterfs/.ctdb/recovery.lock
661
662
663   getdebug
664       Get the current debug level for the node. the debug level controls what
665       information is written to the log file.
666
667       The debug levels are mapped to the corresponding syslog levels. When a
668       debug level is set, only those messages at that level and higher levels
669       will be printed.
670
671       The list of debug levels from highest to lowest are :
672
673       ERROR WARNING NOTICE INFO DEBUG
674
675   setdebug DEBUGLEVEL
676       Set the debug level of a node. This controls what information will be
677       logged.
678
679       The debuglevel is one of ERROR WARNING NOTICE INFO DEBUG
680
681   getpid
682       This command will return the process id of the ctdb daemon.
683
684   disable
685       This command is used to administratively disable a node in the cluster.
686       A disabled node will still participate in the cluster and host
687       clustered TDB records but its public ip address has been taken over by
688       a different node and it no longer hosts any services.
689
690   enable
691       Re-enable a node that has been administratively disabled.
692
693   stop
694       This command is used to administratively STOP a node in the cluster. A
695       STOPPED node is connected to the cluster but will not host any public
696       ip addresse, nor does it participate in the VNNMAP. The difference
697       between a DISABLED node and a STOPPED node is that a STOPPED node does
698       not host any parts of the database which means that a recovery is
699       required to stop/continue nodes.
700
701   continue
702       Re-start a node that has been administratively stopped.
703
704   addip IPADDR/mask IFACE
705       This command is used to add a new public ip to a node during runtime.
706       It should be followed by a ctdb ipreallocate. This allows public
707       addresses to be added to a cluster without having to restart the ctdb
708       daemons.
709
710       Note that this only updates the runtime instance of ctdb. Any changes
711       will be lost next time ctdb is restarted and the public addresses file
712       is re-read. If you want this change to be permanent you must also
713       update the public addresses file manually.
714
715   delip IPADDR
716       This command flags IPADDR for deletion from a node at runtime. It
717       should be followed by a ctdb ipreallocate. If IPADDR is currently
718       hosted by the node it is being removed from, this ensures that the IP
719       will first be failed over to another node, if possible, and that it is
720       then actually removed.
721
722       Note that this only updates the runtime instance of CTDB. Any changes
723       will be lost next time CTDB is restarted and the public addresses file
724       is re-read. If you want this change to be permanent you must also
725       update the public addresses file manually.
726
727   moveip IPADDR PNN
728       This command can be used to manually fail a public ip address to a
729       specific node.
730
731       In order to manually override the "automatic" distribution of public ip
732       addresses that ctdb normally provides, this command only works when you
733       have changed the tunables for the daemon to:
734
735       IPAllocAlgorithm != 0
736
737       NoIPFailback = 1
738
739   shutdown
740       This command will shutdown a specific CTDB daemon.
741
742   setlmasterrole on|off
743       This command is used to enable/disable the LMASTER capability for a
744       node at runtime. This capability determines whether or not a node can
745       be used as an LMASTER for records in the database. A node that does not
746       have the LMASTER capability will not show up in the vnnmap.
747
748       Nodes will by default have this capability, but it can be stripped off
749       nodes by the setting in the sysconfig file or by using this command.
750
751       Once this setting has been enabled/disabled, you need to perform a
752       recovery for it to take effect.
753
754       See also "ctdb getcapabilities"
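
          Since the change only takes effect after a recovery, the steps can
          be chained. The following is a sketch only; the function name
          drop_lmaster_role is hypothetical.

```shell
# Hypothetical helper: strip the LMASTER capability from the local node,
# trigger the recovery that makes the change effective, then print the
# node's capabilities so the result can be checked.
drop_lmaster_role() {
    ctdb setlmasterrole off || return 1
    ctdb recover || return 1
    ctdb getcapabilities
}
```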
755
756   setleaderrole on|off
757       This command is used to enable/disable the LEADER capability for a node
758       at runtime. This capability determines whether or not a node can be
759       elected leader of the cluster. A node that does not have the LEADER
760       capability cannot be elected leader. If the current leader has this
761       capability removed then an election will occur.
762
763       Nodes have this capability enabled by default, but it can be removed
764       via the cluster:leader capability configuration setting or by using
765       this command.
766
767       See also "ctdb getcapabilities"
768
769   reloadnodes
770       This command is used when adding new nodes, or removing existing nodes
771       from an existing cluster.
772
773       Procedure to add nodes:
774
775        1. To expand an existing cluster, first ensure with ctdb status that
776           all nodes are up and running and that they are all healthy. Do not
777           try to expand a cluster unless it is completely healthy!
778
779        2. On all nodes, edit /etc/ctdb/nodes and add the new nodes at the end
780           of this file.
781
782        3. Verify that all the nodes have identical /etc/ctdb/nodes files
783           after adding the new nodes.
784
785        4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
786
787        5. Use ctdb status on all nodes and verify that they now show the
788           additional nodes.
789
790        6. Install and configure the new node and bring it online.
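
          Step 3 of the procedure above can be checked mechanically. The
          sketch below assumes that a copy of each node's nodes file has
          already been fetched to the local machine (for example with scp).
          The function name and the file names in the usage note are
          hypothetical.

```shell
# Hypothetical helper: succeed only if every file given as an argument
# has exactly the same content as the first one.
nodes_files_identical() {
    first=$1
    shift
    for f in "$@"; do
        cmp -s "$first" "$f" || return 1
    done
    return 0
}

# Usage (illustrative file names):
#   nodes_files_identical nodes.node0 nodes.node1 nodes.node2 \
#       && ctdb reloadnodes
```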
791
792       Procedure to remove nodes:
793
794        1. To remove nodes from an existing cluster, first ensure with ctdb
795           status that all nodes, except the node to be deleted, are up and
796           running and that they are all healthy. Do not try to remove nodes
797           from a cluster unless the cluster is completely healthy!
798
799        2. Shut down and power off the node to be removed.
800
801        3. On all other nodes, edit the /etc/ctdb/nodes file and comment out
802           the nodes to be removed.  Do not delete the lines for the deleted
803           nodes, just comment them out by adding a '#' at the beginning of
804           the lines.
805
806        4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
807
808        5. Use ctdb status on all nodes and verify that the deleted nodes are
809           no longer listed.
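
          Step 3 of the removal procedure can be scripted. The sketch below
          assumes that each line of the nodes file holds exactly one IP
          address with no trailing whitespace. The function name
          comment_out_node and the address in the usage note are hypothetical.

```shell
# Hypothetical helper: comment out the line for NODE_IP in NODES_FILE.
# The line is kept, prefixed with '#', rather than deleted.
comment_out_node() {
    nodes_file=$1
    node_ip=$2
    tmp=$(mktemp) || return 1
    # Only an exact whole-line match is changed; other lines pass through.
    awk -v ip="$node_ip" '$0 == ip { print "#" $0; next } { print }' \
        "$nodes_file" > "$tmp" && cat "$tmp" > "$nodes_file"
    rc=$?
    rm -f "$tmp"
    return $rc
}

# Usage (illustrative address), on every remaining node:
#   comment_out_node /etc/ctdb/nodes 10.0.0.13
```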
810
811   reloadips [PNN-LIST]
812       This command reloads the public addresses configuration file on the
813       specified nodes. When it completes, addresses will be reconfigured and
814       reassigned across the cluster as necessary.
815
816       This command is currently unable to make changes to the netmask or
817       interfaces associated with existing addresses. Such changes must be
818       made in two steps: delete the addresses in question, then re-add them.
819       Unfortunately this will disrupt connections to the changed addresses.
820
821   getdbmap
822       This command lists all clustered TDB databases that the CTDB daemon has
823       attached to. Some databases are flagged as PERSISTENT; this means that
824       the database stores data persistently and the data will remain across
825       reboots. One example of such a database is secrets.tdb where
826       information about how the cluster was joined to the domain is stored.
827       Some databases are flagged as REPLICATED; this means that the data in
828       that database is replicated across all the nodes, but the data will not
829       remain across reboots. This type of database is used by CTDB to store
830       its internal state.
831
832       If a PERSISTENT database is not in a healthy state the database is
833       flagged as UNHEALTHY. If there's at least one completely healthy node
834       running in the cluster, it's possible that the content is restored by a
835       recovery run automatically. Otherwise an administrator needs to analyze
836       the problem.
837
838       See also "ctdb getdbstatus", "ctdb backupdb", "ctdb restoredb", "ctdb
839       dumpdbbackup", "ctdb wipedb", "ctdb setvar AllowUnhealthyDBRead 1" and
840       (if samba or tdb-utils are installed) "tdbtool check".
841
842       Most databases are not persistent and only store the state information
843       that the currently running samba daemons need. These databases are
844       always wiped when ctdb/samba starts and when a node is rebooted.
845
846       Example
847               # ctdb getdbmap
848               Number of databases:10
849               dbid:0x435d3410 name:notify.tdb path:/var/lib/ctdb/notify.tdb.0
850               dbid:0x42fe72c5 name:locking.tdb path:/var/lib/ctdb/locking.tdb.0
851               dbid:0x1421fb78 name:brlock.tdb path:/var/lib/ctdb/brlock.tdb.0
852               dbid:0x17055d90 name:connections.tdb path:/var/lib/ctdb/connections.tdb.0
853               dbid:0xc0bdde6a name:sessionid.tdb path:/var/lib/ctdb/sessionid.tdb.0
854               dbid:0x122224da name:test.tdb path:/var/lib/ctdb/test.tdb.0
855               dbid:0x2672a57f name:idmap2.tdb path:/var/lib/ctdb/persistent/idmap2.tdb.0 PERSISTENT
856               dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT
857               dbid:0xe98e08b6 name:group_mapping.tdb path:/var/lib/ctdb/persistent/group_mapping.tdb.0 PERSISTENT
858               dbid:0x7bbbd26c name:passdb.tdb path:/var/lib/ctdb/persistent/passdb.tdb.0 PERSISTENT
859
860               # ctdb getdbmap  # example for unhealthy database
861               Number of databases:1
862               dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT UNHEALTHY
863
864               # ctdb -X getdbmap
865               |ID|Name|Path|Persistent|Unhealthy|
866               |0x7bbbd26c|passdb.tdb|/var/lib/ctdb/persistent/passdb.tdb.0|1|0|
867
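          The machine readable form shown above lends itself to scripting. As
          a sketch, the following lists the names of databases flagged as
          unhealthy. The function name unhealthy_dbs is hypothetical. Columns
          are located by header name, on the assumption that the header line
          always contains "Name" and "Unhealthy" even if other columns differ
          between versions.

```shell
# Hypothetical helper: print the name of every database whose
# "Unhealthy" column is 1 in the output of "ctdb -X getdbmap".
unhealthy_dbs() {
    ctdb -X getdbmap | awk -F'|' '
        # Header line: remember which fields hold the name and the flag.
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "Name") n = i
                if ($i == "Unhealthy") u = i
            }
            next
        }
        n && u && $u == 1 { print $n }
    '
}
```

          An empty result means that no attached database is flagged
          UNHEALTHY on the queried node.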
868
869   backupdb DB FILE
870       Copy the contents of database DB to FILE. FILE can later be read back
871       using restoredb. This is mainly useful for backing up persistent
872       databases such as secrets.tdb and similar.
873
874   restoredb FILE [DB]
875       This command restores a persistent database that was previously backed
876       up using backupdb. By default the data will be restored back into the
877       same database as it was created from. By specifying DB you can
878       restore the data into a different database.
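
          Backing up every persistent database could be sketched as below.
          The function name backup_persistent_dbs, the ".bak" suffix and the
          directory in the usage note are hypothetical. The sketch assumes
          that database names contain no whitespace and that the header of
          "ctdb -X getdbmap" contains "Name" and "Persistent" columns.

```shell
# Hypothetical helper: copy every PERSISTENT database into DEST_DIR,
# one backup file per database.
backup_persistent_dbs() {
    dest_dir=$1
    names=$(ctdb -X getdbmap | awk -F'|' '
        NR == 1 {
            for (i = 1; i <= NF; i++) {
                if ($i == "Name") n = i
                if ($i == "Persistent") p = i
            }
            next
        }
        n && p && $p == 1 { print $n }') || return 1
    for name in $names; do
        ctdb backupdb "$name" "$dest_dir/$name.bak" || return 1
    done
}

# Usage (illustrative directory):
#   backup_persistent_dbs /var/backups/ctdb
# A single file can later be read back with:
#   ctdb restoredb /var/backups/ctdb/secrets.tdb.bak
```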
879
880   setdbreadonly DB
881       This command will enable the read-only record support for a database.
882       This is an experimental feature to improve performance for contended
883       records primarily in locking.tdb and brlock.tdb. When enabling this
884       feature you must set it on all nodes in the cluster.
885
886   setdbsticky DB
887       This command will enable the sticky record support for the specified
888       database. This is an experimental feature to improve performance for
889       contended records primarily in locking.tdb and brlock.tdb. When
890       enabling this feature you must set it on all nodes in the cluster.
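
          Because the feature has to be enabled on every node, a loop over
          the node numbers is one way to apply it. This is a sketch only: the
          function name set_sticky_on_nodes is hypothetical and the caller is
          assumed to supply the complete list of PNNs.

```shell
# Hypothetical helper: enable sticky record support for DB on each PNN
# given after the database name, using the -n option to address a node.
set_sticky_on_nodes() {
    db=$1
    shift
    for pnn in "$@"; do
        ctdb -n "$pnn" setdbsticky "$db" || return 1
    done
}

# Usage (illustrative three-node cluster):
#   set_sticky_on_nodes locking.tdb 0 1 2
```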
891

INTERNAL COMMANDS

893       Internal commands are used by CTDB's scripts and are not required for
894       managing a CTDB cluster. Their parameters and behaviour are subject to
895       change.
896
897   gettickles IPADDR
898       Show TCP connections that are registered with CTDB to be "tickled" if
899       there is a failover.
900
901   gratarp IPADDR INTERFACE
902       Send out a gratuitous ARP for the specified IP address through the
903       specified interface. This command is mainly used by the ctdb
904       eventscripts.
905
906   pdelete DB KEY
907       Delete KEY from DB.
908
909   pfetch DB KEY
910       Print the value associated with KEY in DB.
911
912   pstore DB KEY FILE
913       Store KEY in DB with contents of FILE as the associated value.
914
915   ptrans DB [FILE]
916       Read a list of key-value pairs, one per line from FILE, and store them
917       in DB using a single transaction. An empty value is equivalent to
918       deleting the given key.
919
920       The key and value should be separated by spaces or tabs. Each key/value
921       should be a printable string enclosed in double-quotes.
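
          The input format described above can be produced with a small
          helper. This is a sketch: the function name ptrans_line and the file
          name in the usage note are hypothetical, and keys and values are
          assumed to contain no double-quote characters.

```shell
# Hypothetical helper: emit one ptrans input line for KEY and VALUE.
# An empty VALUE is written as "", which requests deletion of KEY.
ptrans_line() {
    printf '"%s" "%s"\n' "$1" "$2"
}

# Usage: store one key and delete another in a single transaction.
#   { ptrans_line colour blue; ptrans_line stale ""; } > /tmp/updates.txt
#   ctdb ptrans DB /tmp/updates.txt
```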
922
923   runstate [setup|first_recovery|startup|running]
924       Print the runstate of the specified node. Runstates are used to
925       serialise important state transitions in CTDB, particularly during
926       startup.
927
928       If one or more optional runstate arguments are specified then the node
929       must be in one of these runstates for the command to succeed.
930
931       Example
932               # ctdb runstate
933               RUNNING
934
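          Because the command fails unless the node is in one of the listed
          runstates, it can be used to wait for startup to finish. The sketch
          below polls once per second. The function name wait_for_running and
          the default of 30 attempts are hypothetical.

```shell
# Hypothetical helper: poll until the local node reaches the "running"
# runstate. Gives up after MAX_TRIES failed attempts (default 30).
wait_for_running() {
    max_tries=${1:-30}
    tries=0
    until ctdb runstate running >/dev/null 2>&1; do
        tries=$((tries + 1))
        [ "$tries" -ge "$max_tries" ] && return 1
        sleep 1
    done
    return 0
}

# Usage:
#   wait_for_running 60 || echo "node did not reach running state" >&2
```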
935
936   setifacelink IFACE up|down
937       Set the internal state of network interface IFACE. This is typically
938       used in the 10.interface script in the "monitor" event.
939
940       Example: ctdb setifacelink eth0 up
941
942   tickle
943       Read a list of TCP connections, one per line, from standard input and
944       send a TCP tickle to the source host for each connection. A connection
945       is specified as:
946
947                SRC-IPADDR:SRC-PORT DST-IPADDR:DST-PORT
948
949
950       A single connection can be specified on the command-line rather than on
951       standard input.
952
953       A TCP tickle is a TCP ACK packet with invalid sequence and
954       acknowledgement numbers. When the source host receives it, the host
955       immediately sends a correct ACK back to the other end.
956
957       TCP tickles are useful to "tickle" clients after an IP failover has
958       occurred, since they make the client immediately recognize that the
959       TCP connection has been disrupted and needs to be reestablished. This
960       greatly reduces the time it takes for a client to detect the failover
961       and reconnect to the ctdb cluster.
962
963   version
964       Display the CTDB version.
965

DEBUGGING COMMANDS

967       These commands are primarily used for CTDB development and testing and
968       should not be used for normal administration.
969
970   OPTIONS
971       --print-emptyrecords
972           This enables printing of empty records when dumping databases with
973           the catdb, cattdb and dumpdbbackup commands. Records with an empty
974           data segment are considered deleted by ctdb and cleaned by the
975           vacuuming mechanism, so this switch can come in handy for debugging
976           the vacuuming behaviour.
977
978       --print-datasize
979           This lets database dumps (catdb, cattdb, dumpdbbackup) print the
980           size of the record data instead of dumping the data contents.
981
982       --print-lmaster
983           This lets catdb print the lmaster for each record.
984
985       --print-hash
986           This lets database dumps (catdb, cattdb, dumpdbbackup) print the
987           hash for each record.
988
989       --print-recordflags
990           This lets catdb and dumpdbbackup print the record flags for each
991           record. Note that cattdb always prints the flags.
992
993   process-exists PID [SRVID]
994       This command checks if a specific process exists on the CTDB host. This
995       is mainly used by Samba to check if remote instances of samba are still
996       running or not. When the optional SRVID argument is specified, the
997       command checks if a specific process exists on the CTDB host and has
998       registered for the specified SRVID.
999
1000   getdbstatus DB
1001       This command displays more details about a database.
1002
1003       Example
1004               # ctdb getdbstatus test.tdb.0
1005               dbid: 0x122224da
1006               name: test.tdb
1007               path: /var/lib/ctdb/test.tdb.0
1008               PERSISTENT: no
1009               HEALTH: OK
1010
1011               # ctdb getdbstatus registry.tdb  # with a corrupted TDB
1012               dbid: 0xf2a58948
1013               name: registry.tdb
1014               path: /var/lib/ctdb/persistent/registry.tdb.0
1015               PERSISTENT: yes
1016               HEALTH: NO-HEALTHY-NODES - ERROR - Backup of corrupted TDB in '/var/lib/ctdb/persistent/registry.tdb.0.corrupted.20091208091949.0Z'
1017
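          The HEALTH line in the output above can be tested from a script. As
          a sketch, and assuming that a healthy database is always reported as
          "HEALTH: OK" exactly as in the first example, a check might look
          like this. The function name db_is_healthy is hypothetical.

```shell
# Hypothetical helper: succeed if "ctdb getdbstatus DB" reports
# a HEALTH line that starts with "HEALTH: OK".
db_is_healthy() {
    ctdb getdbstatus "$1" | grep -q '^HEALTH: OK'
}

# Usage:
#   db_is_healthy registry.tdb || echo "registry.tdb needs attention" >&2
```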
1018
1019   catdb DB
1020       Print a dump of the clustered TDB database DB.
1021
1022   cattdb DB
1023       Print a dump of the contents of the local TDB database DB.
1024
1025   dumpdbbackup FILE
1026       Print a dump of the contents from database backup FILE, similar to
1027       catdb.
1028
1029   wipedb DB
1030       Remove all contents of database DB.
1031
1032   recover
1033       This command will trigger the recovery daemon to do a cluster recovery.
1034
1035   ipreallocate, sync
1036       This command will force the leader to perform a full IP reallocation
1037       process and redistribute all IP addresses. This is useful to "reset"
1038       the allocations back to their default state if they have been changed
1039       using the "moveip" command. While a "recover" will also perform this
1040       reallocation, a recovery is much more heavyweight since it will also
1041       rebuild all the databases.
1042
1043   attach DBNAME [persistent|replicated]
1044       Create a new CTDB database called DBNAME and attach to it on all nodes.
1045
1046   detach DB-LIST
1047       Detach specified non-persistent database(s) from the cluster. This
1048       command will disconnect specified database(s) on all nodes in the
1049       cluster. This command should only be used when none of the specified
1050       database(s) are in use.
1051
1052       All nodes should be active and tunable AllowClientDBAccess should be
1053       disabled on all nodes before detaching databases.
1054
1055   dumpmemory
1056       This is a debugging command. This command will make the ctdb daemon
1057       write a full memory allocation map to standard output.
1058
1059   rddumpmemory
1060       This is a debugging command. This command will dump the talloc memory
1061       allocation tree for the recovery daemon to standard output.
1062
1063   ban BANTIME
1064       Administratively ban a node for BANTIME seconds. The node will be
1065       unbanned after BANTIME seconds have elapsed.
1066
1067       A banned node does not participate in the cluster. It does not host any
1068       records for the clustered TDB and does not host any public IP
1069       addresses.
1070
1071       Nodes are automatically banned if they misbehave. For example, a node
1072       may be banned if it causes too many cluster recoveries.
1073
1074       To administratively exclude a node from a cluster use the stop command.
1075
1076   unban
1077       This command is used to unban a node that has either been
1078       administratively banned using the ban command or has been automatically
1079       banned.
1080

AUTHOR

1086       This documentation was written by Ronnie Sahlberg, Amitay Isaacs,
1087       Martin Schwenke
1088

COPYRIGHT

1090       Copyright © 2007 Andrew Tridgell, Ronnie Sahlberg
1091
1092       This program is free software; you can redistribute it and/or modify it
1093       under the terms of the GNU General Public License as published by the
1094       Free Software Foundation; either version 3 of the License, or (at your
1095       option) any later version.
1096
1097       This program is distributed in the hope that it will be useful, but
1098       WITHOUT ANY WARRANTY; without even the implied warranty of
1099       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1100       General Public License for more details.
1101
1102       You should have received a copy of the GNU General Public License along
1103       with this program; if not, see http://www.gnu.org/licenses.
1104
1105
1106
1107
1108ctdb                              01/26/2023                           CTDB(1)