CTDB(1)                  CTDB - clustered TDB database                 CTDB(1)
2
3
4
NAME
    ctdb - CTDB management utility
7
SYNOPSIS
    ctdb [OPTION...] {COMMAND} [COMMAND-ARGS]
10
DESCRIPTION
    ctdb is a utility to view and manage a CTDB cluster.
13
14 The following terms are used when referring to nodes in a cluster:
15
PNN
    Physical Node Number. The physical node number is an integer that
    describes the node in the cluster. The first node in a cluster has
    physical node number 0.
20
PNN-LIST
    This is either a single PNN, a comma-separated list of PNNs or
    "all".
24
25 Commands that reference a database use the following terms:
26
27 DB
28 This is either a database name, such as locking.tdb or a database
29 ID such as "0x42fe72c5".
30
DB-LIST
    A space-separated list of one or more DBs.
33
OPTIONS
    -n PNN
36 The node specified by PNN should be queried for the requested
37 information. Default is to query the daemon running on the local
38 host.
39
40 -Y
41 Produce output in machine readable form for easier parsing by
42 scripts. This uses a field delimiter of ':'. Not all commands
43 support this option.
44
45 -x SEPARATOR
46 Use SEPARATOR to delimit fields in machine readable output. This
47 implies -Y.
48
49 -X
50 Produce output in machine readable form for easier parsing by
51 scripts. This uses a field delimiter of '|'. Not all commands
52 support this option.
53
54 This is equivalent to "-x|" and avoids some shell quoting issues.
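
For example, the '|'-separated output can be processed with standard text tools. The following pipeline is purely illustrative (not a CTDB feature) and prints the interface name and link status columns from "ctdb -X ifaces", skipping the header row:

    # ctdb -X ifaces | awk -F'|' 'NR > 1 { print $2, $3 }'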
55
56 -t TIMEOUT
57 Indicates that ctdb should wait up to TIMEOUT seconds for a
58 response to most commands sent to the CTDB daemon. The default is
59 10 seconds.
60
61 -T TIMELIMIT
62 Indicates that TIMELIMIT is the maximum run time (in seconds) for
63 the ctdb command. When TIMELIMIT is exceeded the ctdb command will
64 terminate with an error. The default is 120 seconds.
65
66 -? --help
67 Print some help text to the screen.
68
69 --usage
70 Print usage information to the screen.
71
72 -d --debug=DEBUGLEVEL
73 Change the debug level for the command. Default is NOTICE.
74
ADMINISTRATIVE COMMANDS
    These are commands used to monitor and administer a CTDB cluster.
77
78 pnn
79 This command displays the PNN of the current node.
80
81 status
82 This command shows the current status of all CTDB nodes based on
83 information from the queried node.
84
85 Note: If the queried node is INACTIVE then the status might not be
86 current.
87
88 Node status
89 This includes the number of physical nodes and the status of each
90 node. See ctdb(7) for information about node states.
91
92 Generation
93 The generation id is a number that indicates the current generation
94 of a cluster instance. Each time a cluster goes through a
95 reconfiguration or a recovery its generation id will be changed.
96
97 This number does not have any particular meaning other than to keep
98 track of when a cluster has gone through a recovery. It is a random
99 number that represents the current instance of a ctdb cluster and
100 its databases. The CTDB daemon uses this number internally to be
101 able to tell when commands to operate on the cluster and the
databases were issued in a different generation of the cluster, to
103 ensure that commands that operate on the databases will not survive
104 across a cluster database recovery. After a recovery, all old
105 outstanding commands will automatically become invalid.
106
107 Sometimes this number will be shown as "INVALID". This only means
108 that the ctdbd daemon has started but it has not yet merged with
109 the cluster through a recovery. All nodes start with generation
110 "INVALID" and are not assigned a real generation id until they have
111 successfully been merged with a cluster through a recovery.
112
113 Virtual Node Number (VNN) map
Consists of the number of virtual nodes and the mapping from virtual
115 node numbers to physical node numbers. Only nodes that are
116 participating in the VNN map can become lmaster for database
117 records.
118
119 Recovery mode
120 This is the current recovery mode of the cluster. There are two
121 possible modes:
122
123 NORMAL - The cluster is fully operational.
124
125 RECOVERY - The cluster databases have all been frozen, pausing all
126 services while the cluster awaits a recovery process to complete. A
127 recovery process should finish within seconds. If a cluster is
128 stuck in the RECOVERY state this would indicate a cluster
129 malfunction which needs to be investigated.
130
Once the recovery master detects an inconsistency, for example when a
node becomes disconnected or connected, the recovery daemon will
133 trigger a cluster recovery process, where all databases are
134 remerged across the cluster. When this process starts, the recovery
135 master will first "freeze" all databases to prevent applications
136 such as samba from accessing the databases and it will also mark
137 the recovery mode as RECOVERY.
138
139 When the CTDB daemon starts up, it will start in RECOVERY mode.
140 Once the node has been merged into a cluster and all databases have
been recovered, the recovery mode will change to NORMAL and the
142 databases will be "thawed", allowing samba to access the databases
143 again.
144
145 Recovery master
146 This is the cluster node that is currently designated as the
recovery master. This node is responsible for monitoring the
consistency of the cluster and for performing the actual recovery
process when required.
150
151 Only one node at a time can be the designated recovery master.
152 Which node is designated the recovery master is decided by an
153 election process in the recovery daemons running on each node.
154
155 Example
156 # ctdb status
157 Number of nodes:4
158 pnn:0 192.168.2.200 OK (THIS NODE)
159 pnn:1 192.168.2.201 OK
160 pnn:2 192.168.2.202 OK
161 pnn:3 192.168.2.203 OK
162 Generation:1362079228
163 Size:4
164 hash:0 lmaster:0
165 hash:1 lmaster:1
166 hash:2 lmaster:2
167 hash:3 lmaster:3
168 Recovery mode:NORMAL (0)
169 Recovery master:0
170
171
172 nodestatus [PNN-LIST]
173 This command is similar to the status command. It displays the "node
174 status" subset of output. The main differences are:
175
176 · The exit code is the bitwise-OR of the flags for each specified
177 node, while ctdb status exits with 0 if it was able to retrieve
178 status for all nodes.
179
180 · ctdb status provides status information for all nodes. ctdb
181 nodestatus defaults to providing status for only the current node.
182 If PNN-LIST is provided then status is given for the indicated
183 node(s).
184
185 A common invocation in scripts is ctdb nodestatus all to check whether
186 all nodes in a cluster are healthy.
187
188 Example
189 # ctdb nodestatus
190 pnn:0 10.0.0.30 OK (THIS NODE)
191
192 # ctdb nodestatus all
193 Number of nodes:2
194 pnn:0 10.0.0.30 OK (THIS NODE)
195 pnn:1 10.0.0.31 OK
196
197
198 recmaster
199 This command shows the pnn of the node which is currently the
200 recmaster.
201
202 Note: If the queried node is INACTIVE then the status might not be
203 current.
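
Example (the output is simply the PNN of the current recovery master; the value shown here is illustrative and matches the status example above):

    # ctdb recmaster
    0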
204
205 uptime
This command shows the uptime of the ctdb daemon, when the last
recovery or IP failover completed and how long it took. If the
"duration" is shown as a negative number, this indicates that a
recovery/failover is in progress and that it started that many seconds ago.
210
211 Example
212 # ctdb uptime
213 Current time of node : Thu Oct 29 10:38:54 2009
214 Ctdbd start time : (000 16:54:28) Wed Oct 28 17:44:26 2009
215 Time of last recovery/failover: (000 16:53:31) Wed Oct 28 17:45:23 2009
216 Duration of last recovery/failover: 2.248552 seconds
217
218
219 listnodes
This command lists the IP addresses of all the nodes in the
cluster.
222
223 Example
224 # ctdb listnodes
225 192.168.2.200
226 192.168.2.201
227 192.168.2.202
228 192.168.2.203
229
230
231 natgw {master|list|status}
232 This command shows different aspects of NAT gateway status. For an
233 overview of CTDB's NAT gateway functionality please see the NAT GATEWAY
234 section in ctdb(7).
235
236 master
237 Show the PNN and private IP address of the current NAT gateway
238 master node.
239
240 Example output:
241
242 1 192.168.2.201
243
244
245 list
246 List the private IP addresses of nodes in the current NAT gateway
247 group, annotating the master node.
248
249 Example output:
250
251 192.168.2.200
252 192.168.2.201 MASTER
253 192.168.2.202
254 192.168.2.203
255
256
257 status
258 List the nodes in the current NAT gateway group and their status.
259
260 Example output:
261
262 pnn:0 192.168.2.200 UNHEALTHY (THIS NODE)
263 pnn:1 192.168.2.201 OK
264 pnn:2 192.168.2.202 OK
265 pnn:3 192.168.2.203 OK
266
267
268 ping
269 This command will "ping" specified CTDB nodes in the cluster to verify
270 that they are running.
271
272 Example
273 # ctdb ping
274 response from 0 time=0.000054 sec (3 clients)
275
276
277 ifaces
278 This command will display the list of network interfaces, which could
279 host public addresses, along with their status.
280
281 Example
282 # ctdb ifaces
283 Interfaces on node 0
284 name:eth5 link:up references:2
285 name:eth4 link:down references:0
286 name:eth3 link:up references:1
287 name:eth2 link:up references:1
288
289 # ctdb -X ifaces
290 |Name|LinkStatus|References|
291 |eth5|1|2|
292 |eth4|0|0|
293 |eth3|1|1|
294 |eth2|1|1|
295
296
297 ip
298 This command will display the list of public addresses that are
provided by the cluster and which physical node is currently serving
each IP. By default this command will ONLY show those public addresses
that are known to the node itself. To see the full list of all public
IPs across the cluster you must use "ctdb ip all".
303
304 Example
305 # ctdb ip -v
306 Public IPs on node 0
307 172.31.91.82 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
308 172.31.91.83 node[0] active[eth3] available[eth2,eth3] configured[eth2,eth3]
309 172.31.91.84 node[1] active[] available[eth2,eth3] configured[eth2,eth3]
310 172.31.91.85 node[0] active[eth2] available[eth2,eth3] configured[eth2,eth3]
311 172.31.92.82 node[1] active[] available[eth5] configured[eth4,eth5]
312 172.31.92.83 node[0] active[eth5] available[eth5] configured[eth4,eth5]
313 172.31.92.84 node[1] active[] available[eth5] configured[eth4,eth5]
314 172.31.92.85 node[0] active[eth5] available[eth5] configured[eth4,eth5]
315
316 # ctdb -X ip -v
317 |Public IP|Node|ActiveInterface|AvailableInterfaces|ConfiguredInterfaces|
318 |172.31.91.82|1||eth2,eth3|eth2,eth3|
319 |172.31.91.83|0|eth3|eth2,eth3|eth2,eth3|
320 |172.31.91.84|1||eth2,eth3|eth2,eth3|
321 |172.31.91.85|0|eth2|eth2,eth3|eth2,eth3|
322 |172.31.92.82|1||eth5|eth4,eth5|
323 |172.31.92.83|0|eth5|eth5|eth4,eth5|
324 |172.31.92.84|1||eth5|eth4,eth5|
325 |172.31.92.85|0|eth5|eth5|eth4,eth5|
326
327
328 ipinfo IP
329 This command will display details about the specified public addresses.
330
331 Example
332 # ctdb ipinfo 172.31.92.85
333 Public IP[172.31.92.85] info on node 0
334 IP:172.31.92.85
335 CurrentNode:0
336 NumInterfaces:2
337 Interface[1]: Name:eth4 Link:down References:0
338 Interface[2]: Name:eth5 Link:up References:2 (active)
339
340
341 event run|status|script list|script enable|script disable
This command is used to control the event daemon and to inspect the
status of various events.
344
345 The commands below require a component to be specified. In the current
346 version the only valid component is legacy.
347
348 run TIMEOUT COMPONENT EVENT [ARGUMENTS]
349 This command can be used to manually run specified EVENT in
350 COMPONENT with optional ARGUMENTS. The event will be allowed to run
351 a maximum of TIMEOUT seconds. If TIMEOUT is 0, then there is no
352 time limit for running the event.
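
For example, the standard monitor event in the legacy component could be run manually with a 30 second timeout (the timeout value is chosen for illustration):

    # ctdb event run 30 legacy monitor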
353
354 status COMPONENT EVENT
355 This command displays the last execution status of the specified
356 EVENT in COMPONENT.
357
358 The command will terminate with the exit status corresponding to
the overall status of the event that is displayed.
360
361 The output is the list of event scripts executed. Each line shows
362 the name, status, duration and start time for each script.
363
364 Example output:
365
366 00.ctdb OK 0.014 Sat Dec 17 19:39:11 2016
367 01.reclock OK 0.013 Sat Dec 17 19:39:11 2016
368 05.system OK 0.029 Sat Dec 17 19:39:11 2016
369 06.nfs OK 0.014 Sat Dec 17 19:39:11 2016
370 10.interface OK 0.037 Sat Dec 17 19:39:11 2016
371 11.natgw OK 0.011 Sat Dec 17 19:39:11 2016
372 11.routing OK 0.007 Sat Dec 17 19:39:11 2016
373 13.per_ip_routing OK 0.007 Sat Dec 17 19:39:11 2016
374 20.multipathd OK 0.007 Sat Dec 17 19:39:11 2016
375 31.clamd OK 0.007 Sat Dec 17 19:39:11 2016
376 40.vsftpd OK 0.013 Sat Dec 17 19:39:11 2016
377 41.httpd OK 0.018 Sat Dec 17 19:39:11 2016
378 49.winbind OK 0.023 Sat Dec 17 19:39:11 2016
379 50.samba OK 0.100 Sat Dec 17 19:39:12 2016
380 60.nfs OK 0.376 Sat Dec 17 19:39:12 2016
381 70.iscsi OK 0.009 Sat Dec 17 19:39:12 2016
382 91.lvs OK 0.007 Sat Dec 17 19:39:12 2016
383
384
385 script list COMPONENT
386 List the available event scripts in COMPONENT. Enabled scripts are
387 flagged with a '*'.
388
389 Generally, event scripts are provided by CTDB. However, local or
390 3rd party event scripts may also be available. These are shown in a
391 separate section after those provided by CTDB.
392
393 Example output:
394
395 * 00.ctdb
396 * 01.reclock
397 * 05.system
398 * 06.nfs
399 * 10.interface
400 11.natgw
401 11.routing
402 13.per_ip_routing
403 20.multipathd
404 31.clamd
405 40.vsftpd
406 41.httpd
407 * 49.winbind
408 * 50.samba
409 * 60.nfs
410 70.iscsi
411 91.lvs
412
413 * 02.local
414
415
416 script enable COMPONENT SCRIPT
417 Enable the specified event SCRIPT in COMPONENT. Only enabled
418 scripts will be executed when running any event.
419
420 script disable COMPONENT SCRIPT
421 Disable the specified event SCRIPT in COMPONENT. This will prevent
422 the script from executing when running any event.
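
For example, the 70.iscsi script that appears as disabled in the listing above could be enabled, and the change verified, as follows (illustrative invocation):

    # ctdb event script enable legacy 70.iscsi
    # ctdb event script list legacy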
423
424 scriptstatus
This command displays which event scripts were run in the previous
426 monitoring cycle and the result of each script. If a script failed with
427 an error, causing the node to become unhealthy, the output from that
428 script is also shown.
429
430 This command is deprecated. It's provided for backward compatibility.
431 In place of ctdb scriptstatus, use ctdb event status.
432
433 Example
434 # ctdb scriptstatus
435 00.ctdb OK 0.011 Sat Dec 17 19:40:46 2016
436 01.reclock OK 0.010 Sat Dec 17 19:40:46 2016
437 05.system OK 0.030 Sat Dec 17 19:40:46 2016
438 06.nfs OK 0.014 Sat Dec 17 19:40:46 2016
439 10.interface OK 0.041 Sat Dec 17 19:40:46 2016
440 11.natgw OK 0.008 Sat Dec 17 19:40:46 2016
441 11.routing OK 0.007 Sat Dec 17 19:40:46 2016
442 13.per_ip_routing OK 0.007 Sat Dec 17 19:40:46 2016
443 20.multipathd OK 0.007 Sat Dec 17 19:40:46 2016
444 31.clamd OK 0.007 Sat Dec 17 19:40:46 2016
445 40.vsftpd OK 0.013 Sat Dec 17 19:40:46 2016
446 41.httpd OK 0.015 Sat Dec 17 19:40:46 2016
447 49.winbind OK 0.022 Sat Dec 17 19:40:46 2016
448 50.samba ERROR 0.077 Sat Dec 17 19:40:46 2016
449 OUTPUT: ERROR: samba tcp port 445 is not responding
450
451
452 listvars
453 List all tuneable variables, except the values of the obsolete tunables
454 like VacuumMinInterval. The obsolete tunables can be retrieved only
455 explicitly with the "ctdb getvar" command.
456
457 Example
458 # ctdb listvars
459 SeqnumInterval = 1000
460 ControlTimeout = 60
461 TraverseTimeout = 20
462 KeepaliveInterval = 5
463 KeepaliveLimit = 5
464 RecoverTimeout = 120
465 RecoverInterval = 1
466 ElectionTimeout = 3
467 TakeoverTimeout = 9
468 MonitorInterval = 15
469 TickleUpdateInterval = 20
470 EventScriptTimeout = 30
471 MonitorTimeoutCount = 20
472 RecoveryGracePeriod = 120
473 RecoveryBanPeriod = 300
474 DatabaseHashSize = 100001
475 DatabaseMaxDead = 5
476 RerecoveryTimeout = 10
477 EnableBans = 1
478 NoIPFailback = 0
479 VerboseMemoryNames = 0
480 RecdPingTimeout = 60
481 RecdFailCount = 10
482 LogLatencyMs = 0
483 RecLockLatencyMs = 1000
484 RecoveryDropAllIPs = 120
485 VacuumInterval = 10
486 VacuumMaxRunTime = 120
487 RepackLimit = 10000
488 VacuumFastPathCount = 60
489 MaxQueueDropMsg = 1000000
490 AllowUnhealthyDBRead = 0
491 StatHistoryInterval = 1
492 DeferredAttachTO = 120
493 AllowClientDBAttach = 1
494 RecoverPDBBySeqNum = 1
495 DeferredRebalanceOnNodeAdd = 300
496 FetchCollapse = 1
497 HopcountMakeSticky = 50
498 StickyDuration = 600
499 StickyPindown = 200
500 NoIPTakeover = 0
501 DBRecordCountWarn = 100000
502 DBRecordSizeWarn = 10000000
503 DBSizeWarn = 100000000
504 PullDBPreallocation = 10485760
505 LockProcessesPerDB = 200
506 RecBufferSizeLimit = 1000000
507 QueueBufferSize = 1024
508 IPAllocAlgorithm = 2
509
510
511 getvar NAME
512 Get the runtime value of a tuneable variable.
513
514 Example
515 # ctdb getvar MonitorInterval
516 MonitorInterval = 15
517
518
519 setvar NAME VALUE
520 Set the runtime value of a tuneable variable.
521
522 Example
523 # ctdb setvar MonitorInterval 20
524
525
526 lvs {master|list|status}
527 This command shows different aspects of LVS status. For an overview of
528 CTDB's LVS functionality please see the LVS section in ctdb(7).
529
530 master
531 Shows the PNN of the current LVS master node.
532
533 Example output:
534
535 2
536
537
538 list
539 Lists the currently usable LVS nodes.
540
541 Example output:
542
543 2 10.0.0.13
544 3 10.0.0.14
545
546
547 status
548 List the nodes in the current LVS group and their status.
549
550 Example output:
551
552 pnn:0 10.0.0.11 UNHEALTHY (THIS NODE)
553 pnn:1 10.0.0.12 UNHEALTHY
554 pnn:2 10.0.0.13 OK
555 pnn:3 10.0.0.14 OK
556
557
558 getcapabilities
559 This command shows the capabilities of the current node. See the
560 CAPABILITIES section in ctdb(7) for more details.
561
562 Example output:
563
564 RECMASTER: YES
565 LMASTER: YES
566
567
568 statistics
569 Collect statistics from the CTDB daemon about how many calls it has
570 served. Information about various fields in statistics can be found in
571 ctdb-statistics(7).
572
573 Example
574 # ctdb statistics
575 CTDB version 1
576 Current time of statistics : Tue Mar 8 15:18:51 2016
577 Statistics collected since : (003 21:31:32) Fri Mar 4 17:47:19 2016
578 num_clients 9
579 frozen 0
580 recovering 0
581 num_recoveries 2
582 client_packets_sent 8170534
583 client_packets_recv 7166132
584 node_packets_sent 16549998
585 node_packets_recv 5244418
586 keepalive_packets_sent 201969
587 keepalive_packets_recv 201969
588 node
589 req_call 26
590 reply_call 0
591 req_dmaster 9
592 reply_dmaster 12
593 reply_error 0
594 req_message 1339231
595 req_control 8177506
596 reply_control 6831284
597 client
598 req_call 15
599 req_message 334809
600 req_control 6831308
601 timeouts
602 call 0
603 control 0
604 traverse 0
605 locks
606 num_calls 8
607 num_current 0
608 num_pending 0
609 num_failed 0
610 total_calls 15
611 pending_calls 0
612 childwrite_calls 0
613 pending_childwrite_calls 0
614 memory_used 394879
615 max_hop_count 1
616 total_ro_delegations 0
617 total_ro_revokes 0
618 hop_count_buckets: 8 5 0 0 0 0 0 0 0 0 0 0 0 0 0 0
619 lock_buckets: 0 0 8 0 0 0 0 0 0 0 0 0 0 0 0 0
620 locks_latency MIN/AVG/MAX 0.010005/0.010418/0.011010 sec out of 8
621 reclock_ctdbd MIN/AVG/MAX 0.002538/0.002538/0.002538 sec out of 1
622 reclock_recd MIN/AVG/MAX 0.000000/0.000000/0.000000 sec out of 0
623 call_latency MIN/AVG/MAX 0.000044/0.002142/0.011702 sec out of 15
624 childwrite_latency MIN/AVG/MAX 0.000000/0.000000/0.000000 sec out of 0
625
626
627 statisticsreset
628 This command is used to clear all statistics counters in a node.
629
630 Example: ctdb statisticsreset
631
632 dbstatistics DB
633 Display statistics about the database DB. Information about various
634 fields in dbstatistics can be found in ctdb-statistics(7).
635
636 Example
637 # ctdb dbstatistics locking.tdb
638 DB Statistics: locking.tdb
639 ro_delegations 0
640 ro_revokes 0
641 locks
642 total 14356
643 failed 0
644 current 0
645 pending 0
646 hop_count_buckets: 28087 2 1 0 0 0 0 0 0 0 0 0 0 0 0 0
647 lock_buckets: 0 14188 38 76 32 19 3 0 0 0 0 0 0 0 0 0
648 locks_latency MIN/AVG/MAX 0.001066/0.012686/4.202292 sec out of 14356
649 vacuum_latency MIN/AVG/MAX 0.000472/0.002207/15.243570 sec out of 224530
650 Num Hot Keys: 1
651 Count:8 Key:ff5bd7cb3ee3822edc1f0000000000000000000000000000
652
653
654 getreclock
655 Show details of the recovery lock, if any.
656
657 Example output:
658
659 /clusterfs/.ctdb/recovery.lock
660
661
662 getdebug
Get the current debug level for the node. The debug level controls what
information is written to the log file.
665
666 The debug levels are mapped to the corresponding syslog levels. When a
667 debug level is set, only those messages at that level and higher levels
668 will be printed.
669
670 The list of debug levels from highest to lowest are :
671
672 ERROR WARNING NOTICE INFO DEBUG
673
674 setdebug DEBUGLEVEL
675 Set the debug level of a node. This controls what information will be
676 logged.
677
678 The debuglevel is one of ERROR WARNING NOTICE INFO DEBUG
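
For example, to raise the debug level of the local node to INFO and then confirm the new setting:

    # ctdb setdebug INFO
    # ctdb getdebug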
679
680 getpid
681 This command will return the process id of the ctdb daemon.
682
683 disable
684 This command is used to administratively disable a node in the cluster.
685 A disabled node will still participate in the cluster and host
clustered TDB records, but its public IP addresses are taken over by
other nodes and it no longer hosts any services.
688
689 enable
690 Re-enable a node that has been administratively disabled.
691
692 stop
693 This command is used to administratively STOP a node in the cluster. A
694 STOPPED node is connected to the cluster but will not host any public
ip addresses, nor does it participate in the VNNMAP. The difference
696 between a DISABLED node and a STOPPED node is that a STOPPED node does
697 not host any parts of the database which means that a recovery is
698 required to stop/continue nodes.
699
700 continue
701 Re-start a node that has been administratively stopped.
702
703 addip IPADDR/mask IFACE
704 This command is used to add a new public ip to a node during runtime.
705 It should be followed by a ctdb ipreallocate. This allows public
706 addresses to be added to a cluster without having to restart the ctdb
707 daemons.
708
709 Note that this only updates the runtime instance of ctdb. Any changes
710 will be lost next time ctdb is restarted and the public addresses file
711 is re-read. If you want this change to be permanent you must also
712 update the public addresses file manually.
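
For example, assuming an unused address 172.31.91.86/24 (address, netmask and interface are illustrative) should be served via interface eth2:

    # ctdb addip 172.31.91.86/24 eth2
    # ctdb ipreallocate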
713
714 delip IPADDR
715 This command flags IPADDR for deletion from a node at runtime. It
716 should be followed by a ctdb ipreallocate. If IPADDR is currently
717 hosted by the node it is being removed from, this ensures that the IP
718 will first be failed over to another node, if possible, and that it is
719 then actually removed.
720
721 Note that this only updates the runtime instance of CTDB. Any changes
722 will be lost next time CTDB is restarted and the public addresses file
723 is re-read. If you want this change to be permanent you must also
724 update the public addresses file manually.
725
726 moveip IPADDR PNN
This command can be used to manually fail over a public IP address to a
728 specific node.
729
Because this manually overrides the "automatic" distribution of public
IP addresses that ctdb normally provides, this command only works when
you have changed the tunables for the daemon to:
733
734 IPAllocAlgorithm != 0
735
736 NoIPFailback = 1
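
For example, with the default non-zero IPAllocAlgorithm (2, as in the listvars example above) and NoIPFailback enabled, the address 172.31.91.82 from the earlier "ctdb ip" example could be moved to node 2 (the target node number is illustrative):

    # ctdb setvar NoIPFailback 1
    # ctdb moveip 172.31.91.82 2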
737
738 shutdown
739 This command will shutdown a specific CTDB daemon.
740
741 setlmasterrole on|off
742 This command is used to enable/disable the LMASTER capability for a
743 node at runtime. This capability determines whether or not a node can
744 be used as an LMASTER for records in the database. A node that does not
745 have the LMASTER capability will not show up in the vnnmap.
746
747 Nodes will by default have this capability, but it can be stripped off
748 nodes by the setting in the sysconfig file or by using this command.
749
750 Once this setting has been enabled/disabled, you need to perform a
751 recovery for it to take effect.
752
753 See also "ctdb getcapabilities"
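
For example, to strip the LMASTER capability from the local node and make the change take effect (a recovery is required, as noted above):

    # ctdb setlmasterrole off
    # ctdb recover
    # ctdb getcapabilities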
754
755 setrecmasterrole on|off
756 This command is used to enable/disable the RECMASTER capability for a
757 node at runtime. This capability determines whether or not a node can
be used as the RECMASTER for the cluster. A node that does not have the
759 RECMASTER capability can not win a recmaster election. A node that
760 already is the recmaster for the cluster when the capability is
761 stripped off the node will remain the recmaster until the next cluster
762 election.
763
764 Nodes will by default have this capability, but it can be stripped off
765 nodes by the setting in the sysconfig file or by using this command.
766
767 See also "ctdb getcapabilities"
768
769 reloadnodes
770 This command is used when adding new nodes, or removing existing nodes
771 from an existing cluster.
772
773 Procedure to add nodes:
774
775 1. To expand an existing cluster, first ensure with ctdb status that
776 all nodes are up and running and that they are all healthy. Do not
777 try to expand a cluster unless it is completely healthy!
778
779 2. On all nodes, edit /etc/ctdb/nodes and add the new nodes at the end
780 of this file.
781
782 3. Verify that all the nodes have identical /etc/ctdb/nodes files
783 after adding the new nodes.
784
785 4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
786
787 5. Use ctdb status on all nodes and verify that they now show the
788 additional nodes.
789
790 6. Install and configure the new node and bring it online.
791
792 Procedure to remove nodes:
793
794 1. To remove nodes from an existing cluster, first ensure with ctdb
795 status that all nodes, except the node to be deleted, are up and
796 running and that they are all healthy. Do not try to remove nodes
797 from a cluster unless the cluster is completely healthy!
798
799 2. Shutdown and power off the node to be removed.
800
801 3. On all other nodes, edit the /etc/ctdb/nodes file and comment out
802 the nodes to be removed. Do not delete the lines for the deleted
803 nodes, just comment them out by adding a '#' at the beginning of
804 the lines.
805
806 4. Run ctdb reloadnodes to force all nodes to reload the nodes file.
807
808 5. Use ctdb status on all nodes and verify that the deleted nodes are
809 no longer listed.
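
As an illustration of step 3 when removing a node, /etc/ctdb/nodes might look like this after the node to be removed has been commented out (addresses taken from the listnodes example above):

    192.168.2.200
    192.168.2.201
    #192.168.2.202
    192.168.2.203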
810
811 reloadips [PNN-LIST]
812 This command reloads the public addresses configuration file on the
813 specified nodes. When it completes addresses will be reconfigured and
814 reassigned across the cluster as necessary.
815
This command is currently unable to make changes to the netmask or
interfaces associated with existing addresses. Such changes must be
made in two steps, by deleting the addresses in question and re-adding
them. Unfortunately this will disrupt connections to the changed addresses.
820
821 getdbmap
This command lists all clustered TDB databases that the CTDB daemon has
attached to. Some databases are flagged as PERSISTENT, which means that
the database stores data persistently and the data will remain across
reboots. One example of such a database is secrets.tdb, where
information about how the cluster was joined to the domain is stored.
Some databases are flagged as REPLICATED, which means that the data in
that database is replicated across all the nodes, but the data will not
remain across reboots. This type of database is used by CTDB to store
its internal state.
831
832 If a PERSISTENT database is not in a healthy state the database is
833 flagged as UNHEALTHY. If there's at least one completely healthy node
running in the cluster, the content may be restored automatically by a
recovery run. Otherwise an administrator needs to analyze
836 the problem.
837
838 See also "ctdb getdbstatus", "ctdb backupdb", "ctdb restoredb", "ctdb
839 dumpbackup", "ctdb wipedb", "ctdb setvar AllowUnhealthyDBRead 1" and
840 (if samba or tdb-utils are installed) "tdbtool check".
841
842 Most databases are not persistent and only store the state information
843 that the currently running samba daemons need. These databases are
844 always wiped when ctdb/samba starts and when a node is rebooted.
845
846 Example
847 # ctdb getdbmap
848 Number of databases:10
849 dbid:0x435d3410 name:notify.tdb path:/var/lib/ctdb/notify.tdb.0
850 dbid:0x42fe72c5 name:locking.tdb path:/var/lib/ctdb/locking.tdb.0
851 dbid:0x1421fb78 name:brlock.tdb path:/var/lib/ctdb/brlock.tdb.0
852 dbid:0x17055d90 name:connections.tdb path:/var/lib/ctdb/connections.tdb.0
853 dbid:0xc0bdde6a name:sessionid.tdb path:/var/lib/ctdb/sessionid.tdb.0
854 dbid:0x122224da name:test.tdb path:/var/lib/ctdb/test.tdb.0
855 dbid:0x2672a57f name:idmap2.tdb path:/var/lib/ctdb/persistent/idmap2.tdb.0 PERSISTENT
856 dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT
857 dbid:0xe98e08b6 name:group_mapping.tdb path:/var/lib/ctdb/persistent/group_mapping.tdb.0 PERSISTENT
858 dbid:0x7bbbd26c name:passdb.tdb path:/var/lib/ctdb/persistent/passdb.tdb.0 PERSISTENT
859
860 # ctdb getdbmap # example for unhealthy database
861 Number of databases:1
862 dbid:0xb775fff6 name:secrets.tdb path:/var/lib/ctdb/persistent/secrets.tdb.0 PERSISTENT UNHEALTHY
863
864 # ctdb -X getdbmap
865 |ID|Name|Path|Persistent|Unhealthy|
866 |0x7bbbd26c|passdb.tdb|/var/lib/ctdb/persistent/passdb.tdb.0|1|0|
867
868
869 backupdb DB FILE
870 Copy the contents of database DB to FILE. FILE can later be read back
871 using restoredb. This is mainly useful for backing up persistent
872 databases such as secrets.tdb and similar.
873
874 restoredb FILE [DB]
875 This command restores a persistent database that was previously backed
876 up using backupdb. By default the data will be restored back into the
same database as it was created from. By specifying DB you can
878 restore the data into a different database.
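
For example, a persistent database could be backed up and later restored like this (the backup file path is illustrative):

    # ctdb backupdb secrets.tdb /var/backups/secrets.tdb.backup
    # ctdb restoredb /var/backups/secrets.tdb.backup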
879
880 setdbreadonly DB
881 This command will enable the read-only record support for a database.
882 This is an experimental feature to improve performance for contended
883 records primarily in locking.tdb and brlock.tdb. When enabling this
884 feature you must set it on all nodes in the cluster.
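
For example, one way to enable the feature on every node is via onnode(1):

    # onnode all ctdb setdbreadonly locking.tdb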
885
886 setdbsticky DB
887 This command will enable the sticky record support for the specified
888 database. This is an experimental feature to improve performance for
889 contended records primarily in locking.tdb and brlock.tdb. When
890 enabling this feature you must set it on all nodes in the cluster.
891
INTERNAL COMMANDS
    Internal commands are used by CTDB's scripts and are not required for
    managing a CTDB cluster. Their parameters and behaviour are subject to
    change.
896
897 gettickles IPADDR
898 Show TCP connections that are registered with CTDB to be "tickled" if
899 there is a failover.
900
901 gratarp IPADDR INTERFACE
Send out a gratuitous ARP for the address IPADDR through the
specified interface. This command is mainly used by the ctdb
904 eventscripts.
905
906 pdelete DB KEY
907 Delete KEY from DB.
908
909 pfetch DB KEY
910 Print the value associated with KEY in DB.
911
912 pstore DB KEY FILE
913 Store KEY in DB with contents of FILE as the associated value.
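
For example, using an illustrative key name, a temporary file and the test.tdb database from the getdbmap example:

    # echo -n "some value" > /tmp/value
    # ctdb pstore test.tdb testkey /tmp/value
    # ctdb pfetch test.tdb testkey
    # ctdb pdelete test.tdb testkey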
914
915 ptrans DB [FILE]
916 Read a list of key-value pairs, one per line from FILE, and store them
917 in DB using a single transaction. An empty value is equivalent to
918 deleting the given key.
919
920 The key and value should be separated by spaces or tabs. Each key/value
921 should be a printable string enclosed in double-quotes.
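
For example, assuming standard input is read when FILE is omitted, a single key-value pair could be stored in the test.tdb database like this (key and value are illustrative):

    # echo '"testkey" "testvalue"' | ctdb ptrans test.tdb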
922
923 runstate [setup|first_recovery|startup|running]
924 Print the runstate of the specified node. Runstates are used to
925 serialise important state transitions in CTDB, particularly during
926 startup.
927
928 If one or more optional runstate arguments are specified then the node
929 must be in one of these runstates for the command to succeed.
930
931 Example
932 # ctdb runstate
933 RUNNING
934
935
936 setifacelink IFACE up|down
937 Set the internal state of network interface IFACE. This is typically
938 used in the 10.interface script in the "monitor" event.
939
940 Example: ctdb setifacelink eth0 up
941
942 tickle
943 Read a list of TCP connections, one per line, from standard input and
944 send a TCP tickle to the source host for each connection. A connection
945 is specified as:
946
947 SRC-IPADDR:SRC-PORT DST-IPADDR:DST-PORT
948
949
950 A single connection can be specified on the command-line rather than on
951 standard input.
952
A TCP tickle is a TCP ACK packet with an invalid sequence and
acknowledgement number. When received by the source host, it results in
that host sending an immediate correct ACK back to the other end.

TCP tickles are useful to "tickle" clients after an IP failover has
occurred, since this makes the client immediately recognize that the TCP
connection has been disrupted and needs to be re-established. This
greatly speeds up the time it takes for a client to detect and
re-establish connections after an IP failover in the ctdb cluster.
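
For example, to tickle the client at 10.1.1.50 (illustrative) that had a connection to public address 172.31.91.83 on port 445 (address taken from the "ctdb ip" example; ports are illustrative):

    # ctdb tickle 10.1.1.50:49152 172.31.91.83:445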
962
963 version
964 Display the CTDB version.
965
DEBUGGING COMMANDS
    These commands are primarily used for CTDB development and testing and
    should not be used for normal administration.
969
970 OPTIONS
971 --print-emptyrecords
972 This enables printing of empty records when dumping databases with
the catdb, cattdb and dumpdbbackup commands. Records with an empty
data segment are considered deleted by ctdb and cleaned by the
975 vacuuming mechanism, so this switch can come in handy for debugging
976 the vacuuming behaviour.
977
978 --print-datasize
979 This lets database dumps (catdb, cattdb, dumpdbbackup) print the
980 size of the record data instead of dumping the data contents.
981
982 --print-lmaster
983 This lets catdb print the lmaster for each record.
984
985 --print-hash
986 This lets database dumps (catdb, cattdb, dumpdbbackup) print the
987 hash for each record.
988
989 --print-recordflags
990 This lets catdb and dumpdbbackup print the record flags for each
991 record. Note that cattdb always prints the flags.
992
993 process-exists PID [SRVID]
994 This command checks if a specific process exists on the CTDB host. This
995 is mainly used by Samba to check if remote instances of samba are still
996 running or not. When the optional SRVID argument is specified, the
command checks if a specific process exists on the CTDB host and has
registered for the specified SRVID.
999
1000 getdbstatus DB
1001 This command displays more details about a database.
1002
1003 Example
1004 # ctdb getdbstatus test.tdb.0
1005 dbid: 0x122224da
1006 name: test.tdb
1007 path: /var/lib/ctdb/test.tdb.0
1008 PERSISTENT: no
1009 HEALTH: OK
1010
1011 # ctdb getdbstatus registry.tdb # with a corrupted TDB
1012 dbid: 0xf2a58948
1013 name: registry.tdb
1014 path: /var/lib/ctdb/persistent/registry.tdb.0
1015 PERSISTENT: yes
1016 HEALTH: NO-HEALTHY-NODES - ERROR - Backup of corrupted TDB in '/var/lib/ctdb/persistent/registry.tdb.0.corrupted.20091208091949.0Z'
1017
1018
1019 catdb DB
1020 Print a dump of the clustered TDB database DB.
1021
1022 cattdb DB
1023 Print a dump of the contents of the local TDB database DB.
1024
1025 dumpdbbackup FILE
1026 Print a dump of the contents from database backup FILE, similar to
1027 catdb.
1028
1029 wipedb DB
1030 Remove all contents of database DB.
1031
1032 recover
1033 This command will trigger the recovery daemon to do a cluster recovery.
1034
1035 ipreallocate, sync
1036 This command will force the recovery master to perform a full ip
1037 reallocation process and redistribute all ip addresses. This is useful
to "reset" the allocations back to their default state if they have been
changed using the "moveip" command. While a "recover" will also perform
this reallocation, a recovery is much more heavyweight since it will
1041 also rebuild all the databases.
1042
1043 attach DBNAME [persistent|replicated]
1044 Create a new CTDB database called DBNAME and attach to it on all nodes.
1045
1046 detach DB-LIST
1047 Detach specified non-persistent database(s) from the cluster. This
command will disconnect the specified database(s) on all nodes in the
1049 cluster. This command should only be used when none of the specified
1050 database(s) are in use.
1051
All nodes should be active and the tunable AllowClientDBAttach should be
1053 disabled on all nodes before detaching databases.
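
For example, a scratch database could be created and later detached again, subject to the conditions described above (the database name is illustrative):

    # ctdb attach mytest.tdb
    # ctdb detach mytest.tdb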
1054
1055 dumpmemory
This is a debugging command. This command makes the ctdb daemon
write a full memory allocation map to standard output.
1058
1059 rddumpmemory
1060 This is a debugging command. This command will dump the talloc memory
1061 allocation tree for the recovery daemon to standard output.
1062
1063 ban BANTIME
1064 Administratively ban a node for BANTIME seconds. The node will be
1065 unbanned after BANTIME seconds have elapsed.
1066
1067 A banned node does not participate in the cluster. It does not host any
1068 records for the clustered TDB and does not host any public IP
1069 addresses.
1070
1071 Nodes are automatically banned if they misbehave. For example, a node
1072 may be banned if it causes too many cluster recoveries.
1073
1074 To administratively exclude a node from a cluster use the stop command.
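
For example, to ban the local node for five minutes and then lift the ban early:

    # ctdb ban 300
    # ctdb unban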
1075
1076 unban
1077 This command is used to unban a node that has either been
1078 administratively banned using the ban command or has been automatically
1079 banned.
1080
SEE ALSO
    ctdbd(1), onnode(1), ctdb(7), ctdb-statistics(7), ctdb-tunables(7),
    http://ctdb.samba.org/
1084
AUTHOR
    This documentation was written by Ronnie Sahlberg, Amitay Isaacs,
    Martin Schwenke
1088
COPYRIGHT
    Copyright © 2007 Andrew Tridgell, Ronnie Sahlberg
1091
1092 This program is free software; you can redistribute it and/or modify it
1093 under the terms of the GNU General Public License as published by the
1094 Free Software Foundation; either version 3 of the License, or (at your
1095 option) any later version.
1096
1097 This program is distributed in the hope that it will be useful, but
1098 WITHOUT ANY WARRANTY; without even the implied warranty of
1099 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
1100 General Public License for more details.
1101
1102 You should have received a copy of the GNU General Public License along
1103 with this program; if not, see http://www.gnu.org/licenses.
1104
1105
1106
1107
ctdb                                04/28/2020                         CTDB(1)