1 CTDB-TUNABLES(7) CTDB - clustered TDB database CTDB-TUNABLES(7)
2
3
4
6 ctdb-tunables - CTDB tunable configuration variables
7
9 CTDB's behaviour can be configured by setting run-time tunable
10 variables. This lists and describes all tunables. See the
11 ctdb(1) listvars, setvar and getvar commands for more details.
12
13 Unless otherwise stated, tunables should be set to the same value on
14 all nodes. Setting tunables to different values across nodes may
15 produce unexpected results. Future releases may set (some or most)
16 tunables globally across the cluster but doing so is currently a manual
17 process.
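
Because keeping tunables identical across nodes is a manual process, a
small consistency check can help. The sketch below is illustrative only.
It assumes that the output of 'ctdb listvars' has already been collected
from each node by some other means, and that each output line has the
form "Name = Value" (an assumption about the output format). The function
names parse_listvars and mismatched_tunables are hypothetical.

```python
def parse_listvars(text):
    """Parse lines of the assumed form 'Name = Value' into a dict of ints."""
    tunables = {}
    for line in text.splitlines():
        name, sep, value = line.partition("=")
        if not sep:
            continue  # skip lines that do not look like 'Name = Value'
        tunables[name.strip()] = int(value.strip())
    return tunables


def mismatched_tunables(per_node):
    """Return {name: {node: value}} for tunables that differ across nodes.

    per_node maps a node identifier to the dict returned by parse_listvars.
    A tunable missing on a node is reported with the value None.
    """
    names = set()
    for tunables in per_node.values():
        names.update(tunables)
    mismatches = {}
    for name in sorted(names):
        values = {node: t.get(name) for node, t in per_node.items()}
        if len(set(values.values())) > 1:
            mismatches[name] = values
    return mismatches
```

An empty result means every collected tunable has the same value on all
nodes that were checked.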
18
19 The tunable variables are listed alphabetically.
20
21 AllowClientDBAttach
22 Default: 1
23
24 When set to 0, clients are not allowed to attach to any databases. This
25 can be used to temporarily block any new processes from attaching to
26 and accessing the databases. This is mainly used for detaching a
27 volatile database using 'ctdb detach'.
28
29 AllowMixedVersions
30 Default: 0
31
32 CTDB will not allow incompatible versions to co-exist in a cluster. If
33 a version mismatch is found, then the losing CTDB will shut down. To
34 disable the incompatible version check, set this tunable to 1.
35
36 For version checking, CTDB uses major and minor version. For example,
37 CTDB 4.6.1 and CTDB 4.6.2 are matching versions; CTDB 4.5.x and
38 CTDB 4.6.y do not match.
39
40 CTDB with version check support will lose to CTDB without version check
41 support. Between two different CTDB versions with version check
42 support, one running for less time will lose. If the running time for
43 both CTDB versions with version check support is equal (to seconds),
44 then the older version will lose. The losing CTDB daemon will shut down.
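
The matching and losing rules above can be sketched as follows. This is a
simplified model of the description, not the daemon's implementation. The
Daemon class, its field names and the function names are hypothetical.
The case where neither daemon supports version checking is not covered by
the text, so the sketch returns None for it as an assumption.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


def major_minor(version: str) -> Tuple[int, int]:
    """Return (major, minor) from a dotted version string such as '4.6.1'."""
    major, minor = version.split(".")[:2]
    return int(major), int(minor)


def versions_match(a: str, b: str) -> bool:
    """Versions match when their major and minor components are equal."""
    return major_minor(a) == major_minor(b)


@dataclass
class Daemon:
    name: str                 # hypothetical label for the daemon
    version: str              # e.g. "4.6.1"
    has_version_check: bool   # whether this CTDB supports version checking
    uptime_seconds: int       # running time, to second granularity


def loser(x: Daemon, y: Daemon) -> Optional[Daemon]:
    """Return the daemon that shuts down, or None if nothing needs to."""
    if versions_match(x.version, y.version):
        return None
    if x.has_version_check != y.has_version_check:
        # The side with version check support loses to the side without it.
        return x if x.has_version_check else y
    if not x.has_version_check:
        return None  # assumption: nobody can detect the mismatch
    if x.uptime_seconds != y.uptime_seconds:
        # The daemon that has been running for less time loses.
        return x if x.uptime_seconds < y.uptime_seconds else y
    # Equal running time: the older version loses.
    return x if major_minor(x.version) < major_minor(y.version) else y
```

For example, versions_match("4.6.1", "4.6.2") is true, while a 4.5.x and
a 4.6.y daemon do not match and one of them is selected by loser().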
45
46 AllowUnhealthyDBRead
47 Default: 0
48
49 When set to 1, ctdb allows database traverses to read unhealthy
50 databases. By default, ctdb does not allow reading records from
51 unhealthy databases.
52
53 ControlTimeout
54 Default: 60
55
56 This is the default timeout used when sending a control message to
57 either the local or a remote ctdb daemon.
58
59 DatabaseHashSize
60 Default: 100001
61
62 Number of hash chains for the local store of the tdbs that ctdb
63 manages.
64
65 DatabaseMaxDead
66 Default: 5
67
68 Maximum number of dead records per hash chain for the tdb databases
69 managed by ctdb.
70
71 DBRecordCountWarn
72 Default: 100000
73
74 When set to non-zero, ctdb will log a warning during recovery if a
75 database has more than this many records. This will produce a warning
76 if a database grows uncontrollably with orphaned records.
77
78 DBRecordSizeWarn
79 Default: 10000000
80
81 When set to non-zero, ctdb will log a warning during recovery if a
82 single record is bigger than this size. This will produce a warning if
83 a database record grows uncontrollably.
84
85 DBSizeWarn
86 Default: 1000000000
87
88 When set to non-zero, ctdb will log a warning during recovery if a
89 database size is bigger than this. This will produce a warning if a
90 database grows uncontrollably.
91
92 DeferredAttachTO
93 Default: 120
94
95 When databases are frozen we do not allow clients to attach to the
96 databases. Instead of returning an error immediately to the client, the
97 attach request from the client is deferred until the database becomes
98 available again at which stage we respond to the client.
99
100 This timeout controls how long we will defer the request from the
101 client before timing it out and returning an error to the client.
102
103 DisableIPFailover
104 Default: 0
105
106 When set to non-zero, ctdb will not perform failover or failback. Even
107 if a node fails while holding public IPs, ctdb will not recover the IPs
108 or assign them to another node.
109
110 When this tunable is enabled, ctdb will no longer attempt to recover
111 the cluster by failing IP addresses over to other nodes. This leads to
112 a service outage until the administrator has manually performed IP
113 failover to replacement nodes using the 'ctdb moveip' command.
114
115 ElectionTimeout
116 Default: 3
117
118 The number of seconds to wait for the election of recovery master to
119 complete. If the election is not completed during this interval, then
120 that round of election fails and ctdb starts a new election.
121
122 EnableBans
123 Default: 1
124
125 This parameter allows ctdb to ban a node if the node is misbehaving.
126
127 When set to 0, this disables banning completely in the cluster and thus
128 nodes cannot get banned, even if they break. Don't set to 0 unless you
129 know what you are doing.
130
131 EventScriptTimeout
132 Default: 30
133
134 Maximum time in seconds to allow an event to run before timing out.
135 This is the total time for all enabled scripts that are run for an
136 event, not just a single event script.
137
138 Note that timeouts are ignored for some events ("takeip", "releaseip",
139 "startrecovery", "recovered") and converted to success. The logic here
140 is that the callers of these events implement their own additional
141 timeout.
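
The timeout rule can be illustrated with a small model. This is a sketch
under simplifying assumptions: it only considers elapsed time, ignores
scripts that fail for other reasons, and the function name event_outcome
is hypothetical.

```python
# Events whose timeouts are converted to success, as listed in the text.
TIMEOUT_IGNORED_EVENTS = {"takeip", "releaseip", "startrecovery", "recovered"}


def event_outcome(event, script_seconds, event_script_timeout=30):
    """Classify an event run from the run time of each enabled script.

    The timeout applies to the total time of all scripts run for the
    event, not to each script individually.
    """
    total = sum(script_seconds)
    if total <= event_script_timeout:
        return "ok"
    if event in TIMEOUT_IGNORED_EVENTS:
        return "ok"  # timeout is ignored and converted to success
    return "timeout"
```

With the default of 30 seconds, three scripts taking 10, 10 and 15
seconds exceed the limit for a "monitor" event even though no single
script runs for more than 30 seconds.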
142
143 FetchCollapse
144 Default: 1
145
146 This parameter is used to avoid multiple migration requests for the
147 same record from a single node. All the record requests for the same
148 record are queued up and processed when the record is migrated to the
149 current node.
150
151 When many clients across many nodes try to access the same record at
152 the same time this can lead to a fetch storm where the record becomes
153 very active and bounces between nodes very fast. This leads to high CPU
154 utilization of the ctdbd daemon, trying to bounce that record around
155 very fast, and poor performance. Enabling FetchCollapse can improve
156 performance and reduce CPU utilization for certain workloads.
157
158 HopcountMakeSticky
159 Default: 50
160
161 For database(s) marked STICKY (using 'ctdb setdbsticky'), any record
162 that is migrating so fast that hopcount exceeds this limit is marked as
163 a STICKY record for StickyDuration seconds. This means that after each
164 migration the sticky record will be kept on the node for
165 StickyPindown milliseconds and prevented from being migrated off the
166 node.
167
168 This will improve performance for certain workloads, such as
169 locking.tdb if many clients are opening/closing the same file
170 concurrently.
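
The interaction of HopcountMakeSticky, StickyDuration and StickyPindown
can be sketched as below. This is a simplified model of the description
above, not the daemon's code. The Record class, its fields and the two
function names are hypothetical, and timestamps are plain seconds.

```python
from dataclasses import dataclass


@dataclass
class Record:
    hopcount: int = 0
    sticky_until: float = 0.0   # time until which the record is STICKY
    pinned_until: float = 0.0   # time until which migration off is deferred


def on_migrated_here(rec, now, db_sticky, hopcount_make_sticky=50,
                     sticky_duration=600, sticky_pindown_ms=200):
    """Update a record that has just migrated onto this node."""
    if db_sticky and rec.hopcount > hopcount_make_sticky:
        # Record bounces too fast: mark it STICKY for StickyDuration seconds.
        rec.sticky_until = now + sticky_duration
    if now < rec.sticky_until:
        # A STICKY record is pinned down for StickyPindown milliseconds.
        rec.pinned_until = now + sticky_pindown_ms / 1000.0


def may_migrate_off(rec, now):
    """Requests to migrate the record away are deferred while it is pinned."""
    return now >= rec.pinned_until
```

With the defaults, a record whose hopcount exceeds 50 stays STICKY for
600 seconds and is held on the node for 200 milliseconds after each
migration.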
171
172 IPAllocAlgorithm
173 Default: 2
174
175 Selects the algorithm that CTDB should use when doing public IP address
176 allocation. Meaningful values are:
177
178 0
179 Deterministic IP address allocation.
180
181 This is a simple and fast option. However, it can cause unnecessary
182 address movement during fail-over because each address has a "home"
183 node. Works badly when some nodes do not have any addresses
184 defined. Should be used with care when addresses are defined across
185 multiple networks.
186
187 1
188 Non-deterministic IP address allocation.
189
190 This is a relatively fast option that attempts to minimise
191 unnecessary address movements. Addresses do not have a "home" node.
192 Rebalancing is limited but is usually adequate. Works badly when
193 addresses are defined across multiple networks.
194
195 2
196 LCP2 IP address allocation.
197
198 Uses a heuristic to assign addresses defined across multiple
199 networks, usually balancing addresses on each network evenly across
200 nodes. Addresses do not have a "home" node. Minimises unnecessary
201 address movements. The algorithm is complex, so is slower than
202 other choices for a large number of addresses. However, it can
203 calculate an optimal assignment of 900 addresses in under 10
204 seconds on modern hardware.
205
206 If the specified value is not one of these then the default will be
207 used.
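
The fallback rule for unrecognised values can be written out as a short
sketch. The mapping and the function name are illustrative only and are
not taken from the ctdb source.

```python
# Meaningful values of IPAllocAlgorithm as described above.
IP_ALLOC_ALGORITHMS = {
    0: "deterministic",
    1: "non-deterministic",
    2: "LCP2",
}
DEFAULT_IP_ALLOC_ALGORITHM = 2


def effective_ip_alloc_algorithm(value):
    """Return the algorithm number actually used for a configured value."""
    if value in IP_ALLOC_ALGORITHMS:
        return value
    return DEFAULT_IP_ALLOC_ALGORITHM  # any other value selects the default
```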
208
209 KeepaliveInterval
210 Default: 5
211
212 How often in seconds should the nodes send keep-alive packets to each
213 other.
214
215 KeepaliveLimit
216 Default: 5
217
218 After how many keepalive intervals without any traffic should a node
219 wait until marking the peer as DISCONNECTED.
220
221 If a node has hung, it can take KeepaliveInterval * (KeepaliveLimit +
222 1) seconds before ctdb determines that the node is DISCONNECTED and
223 performs a recovery. This limit should not be set too high, so that
224 failures are detected early and application timeouts (e.g. SMB1) do
225 not kick in before the failover is completed.
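
As a worked example of the formula above, the following sketch computes
the worst-case detection time. The function name is hypothetical.

```python
def disconnect_detection_seconds(keepalive_interval=5, keepalive_limit=5):
    """Worst-case seconds before a hung node is marked DISCONNECTED.

    Implements KeepaliveInterval * (KeepaliveLimit + 1) from the text.
    """
    return keepalive_interval * (keepalive_limit + 1)
```

With the defaults (KeepaliveInterval 5, KeepaliveLimit 5) this gives
5 * (5 + 1) = 30 seconds, which is worth comparing against the timeouts
of the applications served by the cluster.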
226
227 LockProcessesPerDB
228 Default: 200
229
230 This is the maximum number of lock helper processes ctdb will create
231 for obtaining record locks. When ctdb cannot get a record lock without
232 blocking, it creates a helper process that waits for the lock to be
233 obtained.
234
235 LogLatencyMs
236 Default: 0
237
238 When set to non-zero, ctdb will log if certain operations take longer
239 than this value, in milliseconds, to complete. These operations include
240 "process a record request from client", "take a record or database
241 lock", "update a persistent database record" and "vacuum a database".
242
243 MaxQueueDropMsg
244 Default: 1000000
245
246 This is the maximum number of messages to be queued up for a client
247 before ctdb will treat the client as hung and will terminate the client
248 connection.
249
250 MonitorInterval
251 Default: 15
252
253 How often should ctdb run the 'monitor' event in seconds to check for a
254 node's health.
255
256 MonitorTimeoutCount
257 Default: 20
258
259 How many 'monitor' events in a row need to timeout before a node is
260 flagged as UNHEALTHY. This setting is useful if scripts cannot be
261 written so that they do not hang for benign reasons.
262
263 NoIPFailback
264 Default: 0
265
266 When set to 1, ctdb will not perform failback of IP addresses when a
267 node becomes healthy. When a node becomes UNHEALTHY, ctdb WILL perform
268 failover of public IP addresses, but when the node becomes HEALTHY
269 again, ctdb will not fail the addresses back.
270
271 Use with caution! Normally when a node becomes available to the cluster
272 ctdb will try to reassign public IP addresses onto the new node as a
273 way to distribute the workload evenly across the cluster nodes. ctdb
274 tries to make sure that all running nodes host approximately the same
275 number of public addresses.
276
277 When you enable this tunable, ctdb will no longer attempt to rebalance
278 the cluster by failing IP addresses back to the new nodes. An
279 unbalanced cluster will therefore remain unbalanced until there is
280 manual intervention from the administrator. When this parameter is set,
281 you can manually fail public IP addresses over to the new node(s) using
282 the 'ctdb moveip' command.
283
284 NoIPHostOnAllDisabled
285 Default: 0
286
287 If no nodes are HEALTHY then by default ctdb will happily host public
288 IPs on disabled (unhealthy or administratively disabled) nodes. This
289 can cause problems, for example if the underlying cluster filesystem is
290 not mounted. When set to 1 and a node is disabled, any IPs hosted by
291 this node will be released and the node will not take over any IPs until
292 it is no longer disabled.
293
294 NoIPTakeover
295 Default: 0
296
297 When set to 1, ctdb will not allow IP addresses to be failed over to
298 other nodes. Any IP addresses already hosted on healthy nodes will
299 remain. Usually IP addresses hosted on unhealthy nodes will also
300 remain, if NoIPHostOnAllDisabled is 0. However, if
301 NoIPHostOnAllDisabled is 1 then IP addresses will be released by
302 unhealthy nodes and will become un-hosted.
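
The combined effect of NoIPTakeover and NoIPHostOnAllDisabled on an
address that is already hosted can be summarised in a small sketch. It
models only the cases described above, the text itself says "usually"
for the unhealthy case, and the function name is hypothetical.

```python
def ip_stays_hosted(node_healthy, no_ip_host_on_all_disabled):
    """With NoIPTakeover=1, report whether an already-hosted IP remains.

    node_healthy: whether the node currently hosting the IP is healthy.
    no_ip_host_on_all_disabled: value of NoIPHostOnAllDisabled (0 or 1).
    """
    if node_healthy:
        return True  # addresses on healthy nodes remain
    # On an unhealthy node the address remains only if
    # NoIPHostOnAllDisabled is 0; otherwise it is released and,
    # because takeover is disabled, becomes un-hosted.
    return no_ip_host_on_all_disabled == 0
```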
303
304 PullDBPreallocation
305 Default: 10*1024*1024
306
307 This is the size of the record buffer to pre-allocate for sending a
308 reply to the PULLDB control. Usually the record buffer starts at the
309 size of the first record and is reallocated every time a new record is
310 added. For a large number of records, growing the record buffer one
311 record at a time can be very inefficient.
312
313 QueueBufferSize
314 Default: 1024
315
316 This is the maximum amount of data (in bytes) ctdb will read from a
317 socket at a time.
318
319 For a busy setup, if ctdb is not able to process the TCP sockets fast
320 enough (large amount of data in Recv-Q for tcp sockets), then this
321 tunable value should be increased. However, large values can keep ctdb
322 busy processing packets and prevent ctdb from handling other events.
323
324 RecBufferSizeLimit
325 Default: 1000000
326
327 This is the limit on the size of the record buffer to be sent in
328 various controls. This limit is used by new controls used for recovery
329 and controls used in vacuuming.
330
331 RecdFailCount
332 Default: 10
333
334 If the recovery daemon has failed to ping the main daemon for this many
335 consecutive intervals, the main daemon will consider the recovery
336 daemon as hung and will try to restart it to recover.
337
338 RecdPingTimeout
339 Default: 60
340
341 If the main daemon has not heard a "ping" from the recovery daemon for
342 this many seconds, the main daemon will log a message that the recovery
343 daemon is potentially hung. This also increments a counter which is
344 checked against RecdFailCount for detection of a hung recovery daemon.
345
346 RecLockLatencyMs
347 Default: 1000
348
349 When using a reclock file for split brain prevention, if set to
350 non-zero this tunable will make the recovery daemon log a message if
351 the fcntl() call to lock/testlock the recovery file takes longer than
352 this number of milliseconds.
353
354 RecoverInterval
355 Default: 1
356
357 How frequently in seconds should the recovery daemon perform the
358 consistency checks to determine if it should perform a recovery.
359
360 RecoverTimeout
361 Default: 120
362
363 This is the default setting for timeouts for controls when sent from
364 the recovery daemon. We allow longer control timeouts from the recovery
365 daemon than from normal use since the recovery daemon often uses
366 controls that can take a lot longer than normal controls.
367
368 RecoveryBanPeriod
369 Default: 300
370
371 The duration in seconds for which a node is banned if the node fails
372 during recovery. After this time has elapsed the node will
373 automatically get unbanned and will attempt to rejoin the cluster.
374
375 A node usually gets banned due to real problems with the node. Don't
376 set this value too small. Otherwise, a problematic node will try to
377 re-join the cluster too soon, causing unnecessary recoveries.
378
379 RecoveryDropAllIPs
380 Default: 120
381
382 If a node is stuck in recovery, or stopped, or banned, for this many
383 seconds, then ctdb will release all public addresses on that node.
384
385 RecoveryGracePeriod
386 Default: 120
387
388 During recoveries, if a node has not caused recovery failures during
389 the last grace period in seconds, any record of recovery failures
390 previously caused by that node is forgiven. This resets the
391 ban-counter back to zero for that node.
392
393 RepackLimit
394 Default: 10000
395
396 During vacuuming, if the number of freelist records is more than
397 RepackLimit, then the database is repacked to get rid of the freelist
398 records to avoid fragmentation.
399
400 Databases are repacked only if both RepackLimit and VacuumLimit are
401 exceeded.
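
The repack condition can be expressed as a single predicate. This sketch
follows the wording of this page: the freelist record count is compared
with RepackLimit and the deleted record count with VacuumLimit. The
function and parameter names are hypothetical.

```python
def should_repack(freelist_records, deleted_records,
                  repack_limit=10000, vacuum_limit=5000):
    """Repack only if both RepackLimit and VacuumLimit are exceeded."""
    return freelist_records > repack_limit and deleted_records > vacuum_limit
```

For instance, with the defaults, 12000 freelist records and 4000 deleted
records do not trigger a repack because only RepackLimit is exceeded.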
402
403 RerecoveryTimeout
404 Default: 10
405
406 Once a recovery has completed, no additional recoveries are permitted
407 until this timeout in seconds has expired.
408
409 SeqnumInterval
410 Default: 1000
411
412 Some databases have seqnum tracking enabled, so that samba will be able
413 to detect asynchronously when there have been updates to the database.
414 Every time a database is updated its sequence number is increased.
415
416 This tunable is used to specify in milliseconds how frequently ctdb
417 will send out updates to remote nodes to inform them that the sequence
418 number has increased.
419
420 StatHistoryInterval
421 Default: 1
422
423 Granularity of the statistics collected in the statistics history. This
424 is reported by 'ctdb stats' command.
425
426 StickyDuration
427 Default: 600
428
429 Once a record has been marked STICKY, this is the duration in seconds
430 for which the record will be flagged as a STICKY record.
431
432 StickyPindown
433 Default: 200
434
435 Once a STICKY record has been migrated onto a node, it will be pinned
436 down on that node for this number of milliseconds. Any request from
437 other nodes to migrate the record off the node will be deferred.
438
439 TakeoverTimeout
440 Default: 9
441
442 This is the duration in seconds in which ctdb tries to complete IP
443 failover.
444
445 TDBMutexEnabled
446 Default: 1
447
448 This parameter enables the TDB_MUTEX_LOCKING feature on volatile
449 databases if robust mutexes are supported. This optimizes record locking
450 using robust mutexes and is much more efficient than using POSIX locks.
451
452 TickleUpdateInterval
453 Default: 20
454
455 Every TickleUpdateInterval seconds, ctdb synchronizes the client
456 connection information across nodes.
457
458 TraverseTimeout
459 Default: 20
460
461 This is the duration in seconds for which a database traverse is
462 allowed to run. If the traverse does not complete during this interval,
463 ctdb will abort the traverse.
464
465 VacuumFastPathCount
466 Default: 60
467
468 During a vacuuming run, ctdb usually processes only the records marked
469 for deletion, also called fast path vacuuming. After finishing
470 VacuumFastPathCount fast path vacuuming runs, ctdb will trigger a
471 scan of the complete database for any empty records that need to
472 be deleted.
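
The alternation between fast path runs and full scans can be modelled
with a counter. This is a sketch of the behaviour described above. The
exact point at which the daemon resets its counter is an assumption, and
the function name is hypothetical.

```python
def vacuum_schedule(runs, fast_path_count=60):
    """Return the kind of each vacuuming run: 'fast' or 'full'.

    After fast_path_count fast path runs, the next run is modelled as a
    full database scan and the counter starts again.
    """
    schedule = []
    fast_done = 0
    for _ in range(runs):
        if fast_done >= fast_path_count:
            schedule.append("full")
            fast_done = 0
        else:
            schedule.append("fast")
            fast_done += 1
    return schedule
```

In this model, with VacuumFastPathCount set to 3, eight consecutive runs
are three fast path runs, a full scan, three more fast path runs and
another full scan.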
473
474 VacuumInterval
475 Default: 10
476
477 Periodic interval in seconds when vacuuming is triggered for volatile
478 databases.
479
480 VacuumLimit
481 Default: 5000
482
483 During vacuuming, if the number of deleted records is more than
484 VacuumLimit, then databases are repacked to avoid fragmentation.
485
486 Databases are repacked only if both RepackLimit and VacuumLimit are
487 exceeded.
488
489 VacuumMaxRunTime
490 Default: 120
491
492 The maximum time in seconds for which the vacuuming process is allowed
493 to run. If the vacuuming process takes longer than this value, then the
494 vacuuming process is terminated.
495
496 VerboseMemoryNames
497 Default: 0
498
499 When set to non-zero, ctdb assigns verbose names for some of the talloc
500 allocated memory objects. These names are visible in the talloc memory
501 report generated by 'ctdb dumpmemory'.
502
504 ctdb(1), ctdbd(1), ctdbd.conf(5), ctdb(7), http://ctdb.samba.org/
505
507 This documentation was written by Ronnie Sahlberg, Amitay Isaacs,
508 Martin Schwenke
509
511 Copyright © 2007 Andrew Tridgell, Ronnie Sahlberg
512
513 This program is free software; you can redistribute it and/or modify it
514 under the terms of the GNU General Public License as published by the
515 Free Software Foundation; either version 3 of the License, or (at your
516 option) any later version.
517
518 This program is distributed in the hope that it will be useful, but
519 WITHOUT ANY WARRANTY; without even the implied warranty of
520 MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
521 General Public License for more details.
522
523 You should have received a copy of the GNU General Public License along
524 with this program; if not, see http://www.gnu.org/licenses.
525
526
527
528
529 ctdb 10/30/2018 CTDB-TUNABLES(7)