ctdb-tunables(7)

CTDB-TUNABLES(7)         CTDB - clustered TDB database        CTDB-TUNABLES(7)

NAME

       ctdb-tunables - CTDB tunable configuration variables

DESCRIPTION

       CTDB's behaviour can be configured by setting run-time tunable
       variables. This lists and describes all tunables. See the ctdb(1)
       listvars, setvar and getvar commands for more details.

       Unless otherwise stated, tunables should be set to the same value on
       all nodes. Setting tunables to different values across nodes may
       produce unexpected results. Future releases may set (some or most)
       tunables globally across the cluster but doing so is currently a manual
       process.

       Tunables can be set at startup from the /etc/ctdb/ctdb.tunables
       configuration file.

           TUNABLE=VALUE

       For example:

           MonitorInterval=20

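       The TUNABLE=VALUE format above can be read with a few lines of
       Python. This is only a sketch for inspecting such a file: the
       function name parse_tunables is hypothetical, and skipping blank
       lines and '#' comments is an assumption, not documented CTDB
       behaviour.

```python
def parse_tunables(text):
    """Parse TUNABLE=VALUE lines into a dict mapping name -> int."""
    tunables = {}
    for raw in text.splitlines():
        line = raw.strip()
        # Assumption: blank lines and '#' comment lines are ignored.
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep:
            raise ValueError(f"not a TUNABLE=VALUE line: {raw!r}")
        tunables[name.strip()] = int(value.strip())
    return tunables


example = "MonitorInterval=20\n"
print(parse_tunables(example))  # {'MonitorInterval': 20}
```

       A script like this could be used to compare the file across nodes,
       since tunables should normally have the same value everywhere.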
       The available tunable variables are listed alphabetically below.

   AllowClientDBAttach
       Default: 1

       When set to 0, clients are not allowed to attach to any databases. This
       can be used to temporarily block any new processes from attaching to
       and accessing the databases. This is mainly used for detaching a
       volatile database using 'ctdb detach'.

   AllowMixedVersions
       Default: 0

       CTDB will not allow incompatible versions to co-exist in a cluster. If
       a version mismatch is found, then the losing CTDB daemon will shut
       down. To disable the incompatible version check, set this tunable to 1.

       For version checking, CTDB uses the major and minor version. For
       example, CTDB 4.6.1 and CTDB 4.6.2 are matching versions; CTDB 4.5.x
       and CTDB 4.6.y do not match.

       CTDB with version check support will lose to CTDB without version check
       support. Between two different CTDB versions with version check
       support, the one running for less time will lose. If the running time
       for both CTDB versions with version check support is equal (to the
       second), then the older version will lose. The losing CTDB daemon will
       shut down.

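       The matching and losing rules above can be sketched in Python. This
       is an illustration of the rules as described, not CTDB's code. The
       function names are hypothetical, both peers are assumed to support
       version checking, and running times are assumed to be whole seconds.

```python
def major_minor(version):
    """Return (major, minor) from a version string such as '4.6.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def versions_match(a, b):
    """Versions match when major and minor are equal."""
    return major_minor(a) == major_minor(b)


def loser(a, a_uptime, b, b_uptime):
    """Return the losing version, or None if the versions match.

    Assumes both peers support version checking.
    """
    if versions_match(a, b):
        return None
    if a_uptime != b_uptime:
        # The daemon that has been running for less time loses.
        return a if a_uptime < b_uptime else b
    # Equal running time: the older version loses.
    return a if major_minor(a) < major_minor(b) else b


print(versions_match("4.6.1", "4.6.2"))    # True
print(versions_match("4.5.0", "4.6.0"))    # False
print(loser("4.5.0", 100, "4.6.0", 50))    # 4.6.0
print(loser("4.5.0", 100, "4.6.0", 100))   # 4.5.0
```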
   AllowUnhealthyDBRead
       Default: 0

       When set to 1, ctdb allows database traverses to read unhealthy
       databases. By default, ctdb does not allow reading records from
       unhealthy databases.

   ControlTimeout
       Default: 60

       This is the default timeout used when sending a control message to
       either the local or a remote ctdb daemon.

   DatabaseHashSize
       Default: 100001

       Number of hash chains for the local store of the tdbs that ctdb
       manages.

   DatabaseMaxDead
       Default: 5

       Maximum number of dead records per hash chain for the tdb databases
       managed by ctdb.

   DBRecordCountWarn
       Default: 100000

       When set to non-zero, ctdb will log a warning during recovery if a
       database has more than this many records. This will produce a warning
       if a database grows uncontrollably with orphaned records.

   DBRecordSizeWarn
       Default: 10000000

       When set to non-zero, ctdb will log a warning during recovery if a
       single record is bigger than this size. This will produce a warning if
       a database record grows uncontrollably.

   DBSizeWarn
       Default: 1000000000

       When set to non-zero, ctdb will log a warning during recovery if a
       database size is bigger than this. This will produce a warning if a
       database grows uncontrollably.

   DeferredAttachTO
       Default: 120

       When databases are frozen we do not allow clients to attach to the
       databases. Instead of returning an error immediately to the client, the
       attach request from the client is deferred until the database becomes
       available again, at which stage we respond to the client.

       This timeout controls how long we will defer the request from the
       client before timing it out and returning an error to the client.

   ElectionTimeout
       Default: 3

       The number of seconds to wait for the election of the recovery master
       to complete. If the election is not completed during this interval,
       then that round of election fails and ctdb starts a new election.

   EnableBans
       Default: 1

       This parameter allows ctdb to ban a node if the node is misbehaving.

       When set to 0, this disables banning completely in the cluster and thus
       nodes cannot get banned, even if they break. Don't set to 0 unless you
       know what you are doing.

   EventScriptTimeout
       Default: 30

       Maximum time in seconds to allow an event to run before timing out.
       This is the total time for all enabled scripts that are run for an
       event, not just a single event script.

       Note that timeouts are ignored for some events ("takeip", "releaseip",
       "startrecovery", "recovered") and converted to success. The logic here
       is that the callers of these events implement their own additional
       timeout.

   FetchCollapse
       Default: 1

       This parameter is used to avoid multiple migration requests for the
       same record from a single node. All the record requests for the same
       record are queued up and processed when the record is migrated to the
       current node.

       When many clients across many nodes try to access the same record at
       the same time this can lead to a fetch storm where the record becomes
       very active and bounces between nodes very fast. This leads to high CPU
       utilization of the ctdbd daemon, trying to bounce that record around
       very fast, and poor performance. Collapsing fetches can improve
       performance and reduce CPU utilization for certain workloads.

   HopcountMakeSticky
       Default: 50

       For database(s) marked STICKY (using 'ctdb setdbsticky'), any record
       that is migrating so fast that hopcount exceeds this limit is marked as
       a STICKY record for StickyDuration seconds. This means that after each
       migration the sticky record will be kept on the node for StickyPindown
       milliseconds and prevented from being migrated off the node.

       This will improve performance for certain workloads, such as
       locking.tdb if many clients are opening/closing the same file
       concurrently.

   IPAllocAlgorithm
       Default: 2

       Selects the algorithm that CTDB should use when doing public IP address
       allocation. Meaningful values are:

       0
           Deterministic IP address allocation.

           This is a simple and fast option. However, it can cause unnecessary
           address movement during fail-over because each address has a "home"
           node. Works badly when some nodes do not have any addresses
           defined. Should be used with care when addresses are defined across
           multiple networks.

       1
           Non-deterministic IP address allocation.

           This is a relatively fast option that attempts to minimise
           unnecessary address movements. Addresses do not have a "home" node.
           Rebalancing is limited but is usually adequate. Works badly when
           addresses are defined across multiple networks.

       2
           LCP2 IP address allocation.

           Uses a heuristic to assign addresses defined across multiple
           networks, usually balancing addresses on each network evenly across
           nodes. Addresses do not have a "home" node. Minimises unnecessary
           address movements. The algorithm is complex, so is slower than
           other choices for a large number of addresses. However, it can
           calculate an optimal assignment of 900 addresses in under 10
           seconds on modern hardware.

       If the specified value is not one of these then the default will be
       used.

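       The fallback rule for IPAllocAlgorithm can be expressed as a small
       lookup. This is a sketch of the documented behaviour only; the names
       used here are hypothetical and do not come from CTDB.

```python
# Values and descriptions taken from the list above.
IP_ALLOC_ALGORITHMS = {
    0: "deterministic",
    1: "non-deterministic",
    2: "LCP2",
}
DEFAULT_IP_ALLOC_ALGORITHM = 2


def effective_ip_alloc_algorithm(value):
    """Return the algorithm number that would take effect for 'value'."""
    if value in IP_ALLOC_ALGORITHMS:
        return value
    # Any other value falls back to the default.
    return DEFAULT_IP_ALLOC_ALGORITHM


print(effective_ip_alloc_algorithm(1))  # 1
print(effective_ip_alloc_algorithm(7))  # 2
```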
   KeepaliveInterval
       Default: 5

       How often, in seconds, the nodes should send keep-alive packets to
       each other.

   KeepaliveLimit
       Default: 5

       The number of keepalive intervals without any traffic that a node
       waits before marking the peer as DISCONNECTED.

       If a node has hung, it can take KeepaliveInterval * (KeepaliveLimit +
       1) seconds before ctdb determines that the node is DISCONNECTED and
       performs a recovery. This limit should not be set too high, so that
       a hung node is detected early and application timeouts (e.g. SMB1) do
       not kick in before the fail-over is completed.

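       As a worked example of the formula above, the following Python
       snippet computes the worst-case detection time. The function name is
       hypothetical; the defaults are the ones listed for KeepaliveInterval
       and KeepaliveLimit.

```python
def disconnect_detection_seconds(keepalive_interval=5, keepalive_limit=5):
    """Worst-case seconds before a hung node is marked DISCONNECTED."""
    return keepalive_interval * (keepalive_limit + 1)


print(disconnect_detection_seconds())      # 30
print(disconnect_detection_seconds(5, 2))  # 15
```

       With the defaults, a hung node can therefore go unnoticed for up to
       30 seconds, which is worth comparing against client-side timeouts.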
   LockProcessesPerDB
       Default: 200

       This is the maximum number of lock helper processes ctdb will create
       for obtaining record locks. When ctdb cannot get a record lock without
       blocking, it creates a helper process that waits for the lock to be
       obtained.

   LogLatencyMs
       Default: 0

       When set to non-zero, ctdb will log if certain operations take longer
       than this value, in milliseconds, to complete. These operations include
       "process a record request from client", "take a record or database
       lock", "update a persistent database record" and "vacuum a database".

   MaxQueueDropMsg
       Default: 1000000

       This is the maximum number of messages to be queued up for a client
       before ctdb will treat the client as hung and will terminate the client
       connection.

   MonitorInterval
       Default: 15

       How often, in seconds, ctdb should run the 'monitor' event to check a
       node's health.

   MonitorTimeoutCount
       Default: 20

       How many 'monitor' events in a row need to time out before a node is
       flagged as UNHEALTHY. This setting is useful if scripts cannot be
       written so that they do not hang for benign reasons.

   NoIPFailback
       Default: 0

       When set to 1, ctdb will not perform failback of IP addresses when a
       node becomes healthy. When a node becomes UNHEALTHY, ctdb WILL perform
       failover of public IP addresses, but when the node becomes HEALTHY
       again, ctdb will not fail the addresses back.

       Use with caution! Normally when a node becomes available to the cluster
       ctdb will try to reassign public IP addresses onto the new node as a
       way to distribute the workload evenly across the cluster nodes. ctdb
       tries to make sure that all running nodes host approximately the same
       number of public addresses.

       When you enable this tunable, ctdb will no longer attempt to rebalance
       the cluster by failing IP addresses back to the new nodes. An
       unbalanced cluster will therefore remain unbalanced until there is
       manual intervention from the administrator. When this parameter is set,
       you can manually fail public IP addresses over to the new node(s) using
       the 'ctdb moveip' command.

   NoIPTakeover
       Default: 0

       When set to 1, ctdb will not allow IP addresses to be failed over to
       other nodes. Any IP addresses already hosted on healthy nodes will
       remain. Any IP addresses hosted on unhealthy nodes will be released by
       unhealthy nodes and will become un-hosted.

   PullDBPreallocation
       Default: 10*1024*1024

       This is the size of a record buffer to pre-allocate for sending the
       reply to a PULLDB control. Usually the record buffer starts with the
       size of the first record and gets reallocated every time a new record
       is added to the record buffer. For a large number of records, growing
       the record buffer one record at a time can be very inefficient.

   QueueBufferSize
       Default: 1024

       This is the maximum amount of data (in bytes) ctdb will read from a
       socket at a time.

       For a busy setup, if ctdb is not able to process the TCP sockets fast
       enough (large amount of data in Recv-Q for tcp sockets), then this
       tunable value should be increased. However, large values can keep ctdb
       busy processing packets and prevent ctdb from handling other events.

   RecBufferSizeLimit
       Default: 1000000

       This is the limit on the size of the record buffer to be sent in
       various controls. This limit is used by new controls used for recovery
       and controls used in vacuuming.

   RecdFailCount
       Default: 10

       If the recovery daemon has failed to ping the main daemon for this many
       consecutive intervals, the main daemon will consider the recovery
       daemon as hung and will try to restart it to recover.

   RecdPingTimeout
       Default: 60

       If the main daemon has not heard a "ping" from the recovery daemon for
       this many seconds, the main daemon will log a message that the recovery
       daemon is potentially hung. This also increments a counter which is
       checked against RecdFailCount for detection of a hung recovery daemon.

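       RecdPingTimeout and RecdFailCount together determine roughly how
       long a hung recovery daemon goes undetected. The Python sketch below
       assumes that each "interval" counted by RecdFailCount is one
       RecdPingTimeout period, which the descriptions above suggest but do
       not state outright. The function name is hypothetical and the result
       is an estimate, not an exact bound.

```python
def recd_hung_after_seconds(recd_ping_timeout=60, recd_fail_count=10):
    """Approximate seconds of silence before the recovery daemon is
    considered hung, assuming one counter increment per timeout period."""
    return recd_ping_timeout * recd_fail_count


print(recd_hung_after_seconds())  # 600
```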
   RecLockLatencyMs
       Default: 1000

       When using a reclock file for split brain prevention, if set to
       non-zero this tunable will make the recovery daemon log a message if
       the fcntl() call to lock/testlock the recovery file takes longer than
       this number of milliseconds.

   RecoverInterval
       Default: 1

       How frequently, in seconds, the recovery daemon should perform the
       consistency checks to determine if it should perform a recovery.

   RecoverTimeout
       Default: 120

       This is the default setting for timeouts for controls when sent from
       the recovery daemon. We allow longer control timeouts from the recovery
       daemon than from normal use since the recovery daemon often uses
       controls that can take a lot longer than normal controls.

   RecoveryBanPeriod
       Default: 300

       The duration in seconds for which a node is banned if the node fails
       during recovery. After this time has elapsed the node will
       automatically get unbanned and will attempt to rejoin the cluster.

       A node usually gets banned due to real problems with the node. Don't
       set this value too small. Otherwise, a problematic node will try to
       re-join the cluster too soon, causing unnecessary recoveries.

   RecoveryDropAllIPs
       Default: 120

       If a node is stuck in recovery, or stopped, or banned, for this many
       seconds, then ctdb will release all public addresses on that node.

   RecoveryGracePeriod
       Default: 120

       During recoveries, if a node has not caused recovery failures during
       the last grace period in seconds, any record of recovery failures that
       the node has caused will be forgiven. This resets the ban-counter back
       to zero for that node.

   RepackLimit
       Default: 10000

       During vacuuming, if the number of freelist records is more than
       RepackLimit, then the database is repacked to get rid of the freelist
       records to avoid fragmentation.

       Databases are repacked only if both RepackLimit and VacuumLimit are
       exceeded.

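       The combined repack condition can be written as a single predicate.
       This Python sketch only restates the rule above. The function and
       parameter names are hypothetical, and the defaults are the documented
       defaults for RepackLimit and VacuumLimit.

```python
def should_repack(freelist_records, deleted_records,
                  repack_limit=10000, vacuum_limit=5000):
    """True only when both RepackLimit and VacuumLimit are exceeded."""
    return freelist_records > repack_limit and deleted_records > vacuum_limit


print(should_repack(20000, 6000))  # True
print(should_repack(20000, 100))   # False
print(should_repack(500, 6000))    # False
```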
   RerecoveryTimeout
       Default: 10

       Once a recovery has completed, no additional recoveries are permitted
       until this timeout in seconds has expired.

   SeqnumInterval
       Default: 1000

       Some databases have seqnum tracking enabled, so that samba will be able
       to detect asynchronously when there have been updates to the database.
       Every time a database is updated its sequence number is increased.

       This tunable is used to specify in milliseconds how frequently ctdb
       will send out updates to remote nodes to inform them that the sequence
       number has increased.

   StatHistoryInterval
       Default: 1

       Granularity of the statistics collected in the statistics history. This
       is reported by the 'ctdb stats' command.

   StickyDuration
       Default: 600

       Once a record has been marked STICKY, this is the duration in seconds
       for which the record will be flagged as a STICKY record.

   StickyPindown
       Default: 200

       Once a STICKY record has been migrated onto a node, it will be pinned
       down on that node for this number of milliseconds. Any request from
       other nodes to migrate the record off the node will be deferred.

   TakeoverTimeout
       Default: 9

       This is the duration in seconds in which ctdb tries to complete IP
       failover.

   TickleUpdateInterval
       Default: 20

       Every TickleUpdateInterval seconds, ctdb synchronizes the client
       connection information across nodes.

   TraverseTimeout
       Default: 20

       This is the duration in seconds for which a database traverse is
       allowed to run. If the traverse does not complete during this interval,
       ctdb will abort the traverse.

   VacuumFastPathCount
       Default: 60

       During a vacuuming run, ctdb usually processes only the records marked
       for deletion, also called fast path vacuuming. After finishing
       VacuumFastPathCount fast path vacuuming runs, ctdb will trigger a scan
       of the complete database for any empty records that need to be
       deleted.

   VacuumInterval
       Default: 10

       Periodic interval in seconds at which vacuuming is triggered for
       volatile databases.

   VacuumLimit
       Default: 5000

       During vacuuming, if the number of deleted records is more than
       VacuumLimit, then databases are repacked to avoid fragmentation.

       Databases are repacked only if both RepackLimit and VacuumLimit are
       exceeded.

   VacuumMaxRunTime
       Default: 120

       The maximum time in seconds for which the vacuuming process is allowed
       to run. If the vacuuming process takes longer than this value, then
       the vacuuming process is terminated.

   VerboseMemoryNames
       Default: 0

       When set to non-zero, ctdb assigns verbose names for some of the talloc
       allocated memory objects. These names are visible in the talloc memory
       report generated by 'ctdb dumpmemory'.


FILES

           /etc/ctdb/ctdb.tunables

AUTHOR

       This documentation was written by Ronnie Sahlberg, Amitay Isaacs,
       Martin Schwenke

COPYRIGHT

       Copyright © 2007 Andrew Tridgell, Ronnie Sahlberg

       This program is free software; you can redistribute it and/or modify it
       under the terms of the GNU General Public License as published by the
       Free Software Foundation; either version 3 of the License, or (at your
       option) any later version.

       This program is distributed in the hope that it will be useful, but
       WITHOUT ANY WARRANTY; without even the implied warranty of
       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License along
       with this program; if not, see http://www.gnu.org/licenses.

ctdb                              05/14/2019                  CTDB-TUNABLES(7)
