ctdb-tunables(7)

CTDB-TUNABLES(7)         CTDB - clustered TDB database        CTDB-TUNABLES(7)

NAME

       ctdb-tunables - CTDB tunable configuration variables

DESCRIPTION

       CTDB's behaviour can be configured by setting run-time tunable
       variables. This page lists and describes all tunables. See the ctdb(1)
       listvars, setvar and getvar commands for more details.

       Unless otherwise stated, tunables should be set to the same value on
       all nodes. Setting tunables to different values across nodes may
       produce unexpected results. Future releases may set (some or most)
       tunables globally across the cluster, but doing so is currently a
       manual process.

       Tunables can be set at startup from the /etc/ctdb/ctdb.tunables
       configuration file. Each setting takes the form:

           TUNABLE=VALUE

       Comment lines beginning with '#' are permitted. Whitespace may be used
       for formatting/alignment. VALUE must be a non-negative integer and must
       be the last thing on a line (i.e. no trailing garbage; trailing
       comments are not permitted).

       For example:

           MonitorInterval=20

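       The file format rules above can be sketched as a small validator. This
       is an illustrative Python sketch, not CTDB's own parser. The function
       name parse_tunables is hypothetical, and the sketch assumes that
       tunable names are alphanumeric, that whitespace is allowed around '=',
       and that blank lines are skipped.

```python
import re

# Assumed line shape: NAME=VALUE, optional whitespace around each part,
# VALUE a non-negative integer, and nothing after VALUE.
_TUNABLE_LINE = re.compile(r"\s*([A-Za-z][A-Za-z0-9]*)\s*=\s*([0-9]+)\s*")


def parse_tunables(text):
    """Return {name: value}; raise ValueError on the first malformed line."""
    tunables = {}
    for lineno, line in enumerate(text.splitlines(), start=1):
        stripped = line.strip()
        if not stripped or stripped.startswith("#"):
            continue  # blank line (assumed to be allowed) or comment line
        match = _TUNABLE_LINE.fullmatch(line)
        if match is None:
            raise ValueError(f"line {lineno}: expected TUNABLE=VALUE, got {line!r}")
        tunables[match.group(1)] = int(match.group(2))
    return tunables
```

       With this sketch, a line such as "MonitorInterval=20 # note" is
       rejected because the comment trails the value.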
       The available tunable variables are listed alphabetically below.

   AllowClientDBAttach
       Default: 1

       When set to 0, clients are not allowed to attach to any databases. This
       can be used to temporarily block any new processes from attaching to
       and accessing the databases. This is mainly used for detaching a
       volatile database using 'ctdb detach'.

   AllowMixedVersions
       Default: 0

       CTDB will not allow incompatible versions to co-exist in a cluster. If
       a version mismatch is found, then the losing CTDB daemon will shut
       down. To disable the incompatible version check, set this tunable to 1.

       For version checking, CTDB uses the major and minor version numbers.
       For example, CTDB 4.6.1 and CTDB 4.6.2 are matching versions; CTDB
       4.5.x and CTDB 4.6.y do not match.

       CTDB with version check support will lose to CTDB without version check
       support. Between two different CTDB versions with version check
       support, the one that has been running for less time will lose. If the
       running time for both CTDB versions with version check support is equal
       (to the second), then the older version will lose. The losing CTDB
       daemon will shut down.

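       The version rules for AllowMixedVersions can be written out as a short
       decision function. This is a Python sketch of the rules as described
       here, not code taken from CTDB. The Daemon class and the function names
       are hypothetical, and the case where neither daemon supports the
       version check is an assumption (the pair is left alone).

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Daemon:
    name: str
    version: Optional[Tuple[int, int, int]]  # None: no version check support
    uptime_seconds: int


def versions_match(a, b):
    """Only the major and minor components are compared."""
    return a[:2] == b[:2]


def losing_daemon(x, y, allow_mixed_versions=False):
    """Return the daemon that shuts down, or None if both keep running."""
    if allow_mixed_versions:
        return None  # check disabled (tunable set to 1)
    if x.version is None and y.version is None:
        return None  # assumption: nobody can perform the check
    if x.version is None:
        return y  # the side with version check support loses
    if y.version is None:
        return x
    if versions_match(x.version, y.version):
        return None
    if x.uptime_seconds != y.uptime_seconds:
        return x if x.uptime_seconds < y.uptime_seconds else y
    return x if x.version < y.version else y  # equal uptime: older loses
```

       For example, a 4.6.2 daemon that has run for 50 seconds loses to a
       4.5.0 daemon that has run for 500 seconds, because uptime is compared
       before version age.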
   AllowUnhealthyDBRead
       Default: 0

       When set to 1, ctdb allows database traverses to read unhealthy
       databases. By default, ctdb does not allow reading records from
       unhealthy databases.

   ControlTimeout
       Default: 60

       This is the default timeout used when sending a control message to
       either the local or a remote ctdb daemon.

   DatabaseHashSize
       Default: 100001

       Number of hash chains for the local store of the tdbs that ctdb
       manages.

   DatabaseMaxDead
       Default: 5

       Maximum number of dead records per hash chain for the tdb databases
       managed by ctdb.

   DBRecordCountWarn
       Default: 100000

       When set to non-zero, ctdb will log a warning during recovery if a
       database has more than this many records. This will produce a warning
       if a database grows uncontrollably with orphaned records.

   DBRecordSizeWarn
       Default: 10000000

       When set to non-zero, ctdb will log a warning during recovery if a
       single record is bigger than this size. This will produce a warning if
       a database record grows uncontrollably.

   DBSizeWarn
       Default: 1000000000

       When set to non-zero, ctdb will log a warning during recovery if a
       database size is bigger than this. This will produce a warning if a
       database grows uncontrollably.

   DeferredAttachTO
       Default: 120

       When databases are frozen we do not allow clients to attach to the
       databases. Instead of returning an error immediately to the client, the
       attach request from the client is deferred until the database becomes
       available again, at which stage we respond to the client.

       This timeout controls how long we will defer the request from the
       client before timing it out and returning an error to the client.

   ElectionTimeout
       Default: 3

       The number of seconds to wait for the election of the recovery master
       to complete. If the election is not completed during this interval,
       then that round of election fails and ctdb starts a new election.

   EnableBans
       Default: 1

       This parameter allows ctdb to ban a node if the node is misbehaving.

       When set to 0, this disables banning completely in the cluster and thus
       nodes cannot get banned, even if they break. Don't set to 0 unless you
       know what you are doing.

   EventScriptTimeout
       Default: 30

       Maximum time in seconds to allow an event to run before timing out.
       This is the total time for all enabled scripts that are run for an
       event, not just a single event script.

       Note that timeouts are ignored for some events ("takeip", "releaseip",
       "startrecovery", "recovered") and converted to success. The logic here
       is that the callers of these events implement their own additional
       timeout.

   FetchCollapse
       Default: 1

       This parameter is used to avoid multiple migration requests for the
       same record from a single node. All the record requests for the same
       record are queued up and processed when the record is migrated to the
       current node.

       When many clients across many nodes try to access the same record at
       the same time, this can lead to a fetch storm where the record becomes
       very active and bounces between nodes very fast. This leads to high CPU
       utilization of the ctdbd daemon, trying to bounce that record around
       very fast, and poor performance. Enabling FetchCollapse can improve
       performance and reduce CPU utilization for certain workloads.

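       The queueing idea behind FetchCollapse can be illustrated with a small
       model. This is a conceptual Python sketch only, not how ctdbd
       implements it. The FetchCollapser class and its callback parameters are
       hypothetical names introduced for illustration.

```python
from collections import defaultdict


class FetchCollapser:
    """Collapse concurrent local requests for the same record key."""

    def __init__(self, send_migration_request):
        # Hypothetical callback that asks the cluster to migrate a record.
        self._send = send_migration_request
        self._waiting = defaultdict(list)  # key -> queued client callbacks

    def request(self, key, on_ready):
        first = key not in self._waiting
        self._waiting[key].append(on_ready)
        if first:
            self._send(key)  # only the first request leaves this node

    def record_arrived(self, key, value):
        # The record is now local: serve every queued request at once.
        for on_ready in self._waiting.pop(key, []):
            on_ready(value)
```

       In this model, three local requests for the same key produce one
       migration request, and all three are answered when the record arrives.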
   HopcountMakeSticky
       Default: 50

       For database(s) marked STICKY (using 'ctdb setdbsticky'), any record
       that is migrating so fast that hopcount exceeds this limit is marked as
       a STICKY record for StickyDuration seconds. This means that after each
       migration the sticky record will be kept on the node for StickyPindown
       milliseconds and prevented from being migrated off the node.

       This will improve performance for certain workloads, such as
       locking.tdb if many clients are opening/closing the same file
       concurrently.

   IPAllocAlgorithm
       Default: 2

       Selects the algorithm that CTDB should use when doing public IP address
       allocation. Meaningful values are:

       0
           Deterministic IP address allocation.

           This is a simple and fast option. However, it can cause unnecessary
           address movement during fail-over because each address has a "home"
           node. Works badly when some nodes do not have any addresses
           defined. Should be used with care when addresses are defined across
           multiple networks.

       1
           Non-deterministic IP address allocation.

           This is a relatively fast option that attempts to minimise
           unnecessary address movements. Addresses do not have a "home" node.
           Rebalancing is limited but is usually adequate. Works badly when
           addresses are defined across multiple networks.

       2
           LCP2 IP address allocation.

           Uses a heuristic to assign addresses defined across multiple
           networks, usually balancing addresses on each network evenly across
           nodes. Addresses do not have a "home" node. Minimises unnecessary
           address movements. The algorithm is complex, so is slower than
           other choices for a large number of addresses. However, it can
           calculate an optimal assignment of 900 addresses in under 10
           seconds on modern hardware.

       If the specified value is not one of these then the default will be
       used.

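       The fallback rule for IPAllocAlgorithm can be sketched as a lookup
       with a default. This Python snippet is illustrative only. The
       dictionary and function names are hypothetical, and the algorithm
       labels are taken from the descriptions above.

```python
IP_ALLOC_ALGORITHMS = {
    0: "deterministic",
    1: "non-deterministic",
    2: "LCP2",
}
DEFAULT_IP_ALLOC_ALGORITHM = 2


def effective_ip_alloc_algorithm(value):
    """Map a tunable value to an algorithm label, falling back to the default."""
    if value not in IP_ALLOC_ALGORITHMS:
        value = DEFAULT_IP_ALLOC_ALGORITHM
    return IP_ALLOC_ALGORITHMS[value]
```

       Under this sketch a setting such as IPAllocAlgorithm=7 behaves like
       the default value 2.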
   KeepaliveInterval
       Default: 5

       How often, in seconds, the nodes send keepalive packets to each other.

   KeepaliveLimit
       Default: 5

       The number of keepalive intervals without any traffic after which a
       node marks a peer as DISCONNECTED.

       If a node has hung, it can take KeepaliveInterval * (KeepaliveLimit +
       1) seconds before ctdb determines that the node is DISCONNECTED and
       performs a recovery. This limit should not be set too high, so that a
       hung node is detected early and application timeouts (e.g. SMB1) do not
       kick in before the failover is completed.

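       The detection delay formula above is easy to evaluate for a planned
       configuration. The following Python helper is a small sketch of that
       arithmetic. The function name is hypothetical, and the default
       arguments are the documented tunable defaults.

```python
def disconnect_detection_seconds(keepalive_interval=5, keepalive_limit=5):
    """Upper bound, in seconds, before a hung node is marked DISCONNECTED."""
    return keepalive_interval * (keepalive_limit + 1)
```

       With the default values this gives 5 * (5 + 1) = 30 seconds, which can
       be compared against the timeouts of the applications being served.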
   LockProcessesPerDB
       Default: 200

       This is the maximum number of lock helper processes ctdb will create
       for obtaining record locks. When ctdb cannot get a record lock without
       blocking, it creates a helper process that waits for the lock to be
       obtained.

   LogLatencyMs
       Default: 0

       When set to non-zero, ctdb will log if certain operations take longer
       than this value, in milliseconds, to complete. These operations include
       "process a record request from client", "take a record or database
       lock", "update a persistent database record" and "vacuum a database".

   MaxQueueDropMsg
       Default: 1000000

       This is the maximum number of messages to be queued up for a client
       before ctdb will treat the client as hung and will terminate the client
       connection.

   MonitorInterval
       Default: 15

       How often, in seconds, ctdb runs the 'monitor' event to check a node's
       health.

   MonitorTimeoutCount
       Default: 20

       How many 'monitor' events in a row need to time out before a node is
       flagged as UNHEALTHY. This setting is useful if scripts cannot be
       written so that they do not hang for benign reasons.

   NoIPFailback
       Default: 0

       When set to 1, ctdb will not perform failback of IP addresses when a
       node becomes healthy. When a node becomes UNHEALTHY, ctdb WILL perform
       failover of public IP addresses, but when the node becomes HEALTHY
       again, ctdb will not fail the addresses back.

       Use with caution! Normally when a node becomes available to the cluster
       ctdb will try to reassign public IP addresses onto the new node as a
       way to distribute the workload evenly across the cluster nodes. ctdb
       tries to make sure that all running nodes host approximately the same
       number of public addresses.

       When you enable this tunable, ctdb will no longer attempt to rebalance
       the cluster by failing IP addresses back to the new nodes. An
       unbalanced cluster will therefore remain unbalanced until there is
       manual intervention from the administrator. When this parameter is set,
       you can manually fail public IP addresses over to the new node(s) using
       the 'ctdb moveip' command.

   NoIPTakeover
       Default: 0

       When set to 1, ctdb will not allow IP addresses to be failed over to
       other nodes. Any IP addresses already hosted on healthy nodes will
       remain. Any IP addresses hosted on unhealthy nodes will be released by
       unhealthy nodes and will become un-hosted.

   PullDBPreallocation
       Default: 10*1024*1024

       This is the size of a record buffer to pre-allocate for sending the
       reply to a PULLDB control. Usually the record buffer starts with the
       size of the first record and gets reallocated every time a new record
       is added to the record buffer. For a large number of records, growing
       the record buffer one record at a time can be very inefficient.

   QueueBufferSize
       Default: 1024

       This is the maximum amount of data (in bytes) ctdb will read from a
       socket at a time.

       For a busy setup, if ctdb is not able to process the TCP sockets fast
       enough (large amount of data in Recv-Q for tcp sockets), then this
       tunable value should be increased. However, large values can keep ctdb
       busy processing packets and prevent ctdb from handling other events.

   RecBufferSizeLimit
       Default: 1000000

       This is the limit on the size of the record buffer to be sent in
       various controls. This limit is used by new controls used for recovery
       and controls used in vacuuming.

   RecdFailCount
       Default: 10

       If the recovery daemon has failed to ping the main daemon for this many
       consecutive intervals, the main daemon will consider the recovery
       daemon as hung and will try to restart it to recover.

   RecdPingTimeout
       Default: 60

       If the main daemon has not heard a "ping" from the recovery daemon for
       this many seconds, the main daemon will log a message that the recovery
       daemon is potentially hung. This also increments a counter which is
       checked against RecdFailCount to detect a hung recovery daemon.

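       The interaction between RecdPingTimeout and RecdFailCount can be
       approximated with a short model. This Python sketch is a rough
       illustration under stated assumptions, not CTDB's implementation: it
       assumes one counter increment per full RecdPingTimeout of silence and
       a restart attempt once the counter reaches RecdFailCount. The function
       name is hypothetical.

```python
def simulate_recd_watchdog(silent_seconds, recd_ping_timeout=60, recd_fail_count=10):
    """Model the main daemon's view of a silent recovery daemon."""
    # Assumption: the counter increases once per elapsed ping timeout.
    missed_intervals = silent_seconds // recd_ping_timeout
    return {
        "warnings_logged": missed_intervals,
        # Assumption: the restart is attempted when the counter reaches the limit.
        "restart_attempted": missed_intervals >= recd_fail_count,
    }
```

       Under these assumptions and the default values, a restart would be
       attempted after roughly 60 * 10 = 600 seconds of silence.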
   RecLockLatencyMs
       Default: 1000

       When using a reclock file for split brain prevention, if set to
       non-zero this tunable will make the recovery daemon log a message if
       the fcntl() call to lock/testlock the recovery file takes longer than
       this number of milliseconds.

   RecoverInterval
       Default: 1

       How frequently, in seconds, the recovery daemon performs the
       consistency checks that determine whether it should perform a recovery.

   RecoverTimeout
       Default: 120

       This is the default setting for timeouts for controls when sent from
       the recovery daemon. We allow longer control timeouts from the recovery
       daemon than from normal use since the recovery daemon often uses
       controls that can take a lot longer than normal controls.

   RecoveryBanPeriod
       Default: 300

       The duration in seconds for which a node is banned if the node fails
       during recovery. After this time has elapsed the node will
       automatically get unbanned and will attempt to rejoin the cluster.

       A node usually gets banned due to real problems with the node. Don't
       set this value too small. Otherwise, a problematic node will try to
       rejoin the cluster too soon, causing unnecessary recoveries.

   RecoveryDropAllIPs
       Default: 120

       If a node is stuck in recovery, or stopped, or banned, for this many
       seconds, then ctdb will release all public addresses on that node.

   RecoveryGracePeriod
       Default: 120

       During recoveries, if a node has not caused recovery failures during
       the last grace period in seconds, any record of earlier recovery
       failures caused by that node will be forgiven. This resets the
       ban-counter back to zero for that node.

   RepackLimit
       Default: 10000

       During vacuuming, if the number of freelist records is more than
       RepackLimit, then the database is repacked to get rid of the freelist
       records to avoid fragmentation.

   RerecoveryTimeout
       Default: 10

       Once a recovery has completed, no additional recoveries are permitted
       until this timeout in seconds has expired.

   SeqnumInterval
       Default: 1000

       Some databases have seqnum tracking enabled, so that samba will be able
       to detect asynchronously when there have been updates to the database.
       Every time a database is updated its sequence number is increased.

       This tunable is used to specify in milliseconds how frequently ctdb
       will send out updates to remote nodes to inform them that the sequence
       number has increased.

   StatHistoryInterval
       Default: 1

       Granularity of the statistics collected in the statistics history. This
       is reported by the 'ctdb stats' command.

   StickyDuration
       Default: 600

       Once a record has been marked STICKY, this is the duration in seconds
       for which the record will be flagged as a STICKY record.

   StickyPindown
       Default: 200

       Once a STICKY record has been migrated onto a node, it will be pinned
       down on that node for this number of milliseconds. Any request from
       other nodes to migrate the record off the node will be deferred.

   TakeoverTimeout
       Default: 9

       This is the duration in seconds in which ctdb tries to complete IP
       failover.

   TickleUpdateInterval
       Default: 20

       Every TickleUpdateInterval seconds, ctdb synchronizes the client
       connection information across nodes.

   TraverseTimeout
       Default: 20

       This is the duration in seconds for which a database traverse is
       allowed to run. If the traverse does not complete during this interval,
       ctdb will abort the traverse.

   VacuumFastPathCount
       Default: 60

       During a vacuuming run, ctdb usually processes only the records marked
       for deletion; this is called fast path vacuuming. After finishing
       VacuumFastPathCount fast path vacuuming runs, ctdb will trigger a scan
       of the complete database for any empty records that need to be deleted.

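       The alternation between fast path runs and full scans described for
       VacuumFastPathCount can be modelled as a simple counter. This Python
       sketch is illustrative only. It assumes that the full scan happens on
       the run that follows VacuumFastPathCount fast path runs and that the
       counter then resets; the function name is hypothetical.

```python
def vacuum_run_kinds(total_runs, vacuum_fast_path_count=60):
    """Return 'fast' or 'full' for each of total_runs vacuuming runs."""
    kinds = []
    fast_runs = 0
    for _ in range(total_runs):
        if fast_runs >= vacuum_fast_path_count:
            kinds.append("full")  # scan the complete database
            fast_runs = 0         # assumption: counter resets after a full scan
        else:
            kinds.append("fast")  # only records marked for deletion
            fast_runs += 1
    return kinds
```

       Under these assumptions, the defaults give one full scan for every 60
       fast path runs of a volatile database.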
   VacuumInterval
       Default: 10

       Periodic interval in seconds when vacuuming is triggered for volatile
       databases.

   VacuumMaxRunTime
       Default: 120

       The maximum time in seconds for which the vacuuming process is allowed
       to run. If the vacuuming process takes longer than this value, then the
       vacuuming process is terminated.

   VerboseMemoryNames
       Default: 0

       When set to non-zero, ctdb assigns verbose names for some of the talloc
       allocated memory objects. These names are visible in the talloc memory
       report generated by 'ctdb dumpmemory'.

FILES

           /etc/ctdb/ctdb.tunables

AUTHOR

       This documentation was written by Ronnie Sahlberg, Amitay Isaacs,
       Martin Schwenke

COPYRIGHT

       Copyright © 2007 Andrew Tridgell, Ronnie Sahlberg

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either version 3 of the License, or (at your
       option) any later version.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License along
       with this program; if not, see http://www.gnu.org/licenses.

ctdb                              11/30/2023                  CTDB-TUNABLES(7)
