CTDB-TUNABLES(7) CTDB - clustered TDB database CTDB-TUNABLES(7)

NAME
ctdb-tunables - CTDB tunable configuration variables

DESCRIPTION
CTDB's behaviour can be configured by setting run-time tunable
variables. This page lists and describes all tunables. See the ctdb(1)
listvars, setvar and getvar commands for more details.

Unless otherwise stated, tunables should be set to the same value on
all nodes. Setting tunables to different values across nodes may
produce unexpected results. Future releases may set (some or most)
tunables globally across the cluster but doing so is currently a manual
process.

Tunables can be set at startup from the /etc/ctdb/ctdb.tunables
configuration file. Each setting has the form:

    TUNABLE=VALUE

For example:

    MonitorInterval=20
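As a rough illustration of the TUNABLE=VALUE format, the following Python sketch parses such lines into a dictionary. The function name parse_tunables is hypothetical. Skipping blank lines and '#' comments, tolerating spaces around '=', and treating every value as an integer are assumptions made for the sketch, not documented CTDB behaviour.

```python
def parse_tunables(text):
    """Parse TUNABLE=VALUE lines into a dict mapping name -> int.

    Assumptions (not taken from the CTDB documentation): blank lines and
    lines starting with '#' are skipped, and every value is an integer.
    """
    tunables = {}
    for lineno, raw in enumerate(text.splitlines(), start=1):
        line = raw.strip()
        if not line or line.startswith("#"):
            continue
        name, sep, value = line.partition("=")
        if not sep or not name.strip():
            raise ValueError(f"line {lineno}: expected TUNABLE=VALUE")
        tunables[name.strip()] = int(value.strip())
    return tunables


print(parse_tunables("MonitorInterval=20\n"))  # {'MonitorInterval': 20}
```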

The available tunable variables are listed alphabetically below.

AllowClientDBAttach
Default: 1

When set to 0, clients are not allowed to attach to any databases. This
can be used to temporarily block any new processes from attaching to
and accessing the databases. This is mainly used for detaching a
volatile database using 'ctdb detach'.

AllowMixedVersions
Default: 0

CTDB will not allow incompatible versions to co-exist in a cluster. If
a version mismatch is found, then the losing CTDB will shut down. To
disable the incompatible version check, set this tunable to 1.

For version checking, CTDB uses the major and minor version. For
example, CTDB 4.6.1 and CTDB 4.6.2 are matching versions; CTDB 4.5.x
and CTDB 4.6.y do not match.

A CTDB with version check support will lose to a CTDB without version
check support. Between two different CTDB versions with version check
support, the one running for less time will lose. If the running time
for both CTDB versions with version check support is equal (to the
second), then the older version will lose. The losing CTDB daemon will
shut down.
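
The matching and tie-breaking rules above can be sketched in Python as follows. The helper names (parse_version, versions_match, loser) and the (version, uptime) tuples are hypothetical and chosen for illustration. The sketch covers only the case where both daemons support the version check.

```python
def parse_version(version):
    """Split a dotted version string such as '4.6.1' into integers."""
    return tuple(int(part) for part in version.split("."))


def versions_match(a, b):
    """Versions match when major and minor agree; the patch level is ignored."""
    return parse_version(a)[:2] == parse_version(b)[:2]


def loser(a, b):
    """Pick the losing daemon of two mismatched daemons.

    Each argument is a (version, uptime_seconds) tuple. Both daemons are
    assumed to support the version check.
    """
    (ver_a, up_a), (ver_b, up_b) = a, b
    if up_a != up_b:
        return a if up_a < up_b else b  # shorter running time loses
    # Equal running time: the older version loses.
    return a if parse_version(ver_a) < parse_version(ver_b) else b


print(versions_match("4.6.1", "4.6.2"))        # True
print(versions_match("4.5.9", "4.6.0"))        # False
print(loser(("4.5.3", 120), ("4.6.0", 3600)))  # ('4.5.3', 120)
```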

AllowUnhealthyDBRead
Default: 0

When set to 1, ctdb allows database traverses to read unhealthy
databases. By default, ctdb does not allow reading records from
unhealthy databases.

ControlTimeout
Default: 60

This is the default timeout used when sending a control message to
either the local or a remote ctdb daemon.

DatabaseHashSize
Default: 100001

Number of hash chains for the local store of the tdbs that ctdb
manages.

DatabaseMaxDead
Default: 5

Maximum number of dead records per hash chain for the tdb databases
managed by ctdb.

DBRecordCountWarn
Default: 100000

When set to non-zero, ctdb will log a warning during recovery if a
database has more than this many records. This will produce a warning
if a database grows uncontrollably with orphaned records.

DBRecordSizeWarn
Default: 10000000

When set to non-zero, ctdb will log a warning during recovery if a
single record is bigger than this size. This will produce a warning if
a database record grows uncontrollably.

DBSizeWarn
Default: 1000000000

When set to non-zero, ctdb will log a warning during recovery if a
database size is bigger than this. This will produce a warning if a
database grows uncontrollably.
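
A minimal Python sketch of how the three warning thresholds interact, assuming sizes are plain integers in whatever unit the tunables use. The function recovery_warnings and its inputs are hypothetical and do not mirror the ctdb implementation. A threshold of 0 disables the corresponding check, as described above.

```python
# Defaults taken from the entries above; 0 disables a check.
DEFAULTS = {
    "DBRecordCountWarn": 100000,
    "DBRecordSizeWarn": 10000000,
    "DBSizeWarn": 1000000000,
}


def recovery_warnings(record_sizes, db_size, tunables=DEFAULTS):
    """Return the names of the thresholds that a database exceeds.

    record_sizes: sizes of the individual records (hypothetical input)
    db_size: total database size, in the same unit as DBSizeWarn
    """
    warnings = []
    limit = tunables["DBRecordCountWarn"]
    if limit and len(record_sizes) > limit:
        warnings.append("DBRecordCountWarn")
    limit = tunables["DBRecordSizeWarn"]
    if limit and any(size > limit for size in record_sizes):
        warnings.append("DBRecordSizeWarn")
    limit = tunables["DBSizeWarn"]
    if limit and db_size > limit:
        warnings.append("DBSizeWarn")
    return warnings


print(recovery_warnings([512, 20000000], db_size=20000512))  # ['DBRecordSizeWarn']
```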

DeferredAttachTO
Default: 120

When databases are frozen we do not allow clients to attach to the
databases. Instead of returning an error immediately to the client, the
attach request from the client is deferred until the database becomes
available again, at which point we respond to the client.

This timeout controls how long we will defer the request from the
client before timing it out and returning an error to the client.

ElectionTimeout
Default: 3

The number of seconds to wait for the election of the recovery master
to complete. If the election is not completed during this interval,
then that round of election fails and ctdb starts a new election.

EnableBans
Default: 1

This parameter allows ctdb to ban a node if the node is misbehaving.

When set to 0, this disables banning completely in the cluster and thus
nodes cannot get banned, even if they break. Don't set to 0 unless you
know what you are doing.

EventScriptTimeout
Default: 30

Maximum time in seconds to allow an event to run before timing out.
This is the total time for all enabled scripts that are run for an
event, not just a single event script.

Note that timeouts are ignored for some events ("takeip", "releaseip",
"startrecovery", "recovered") and converted to success. The logic here
is that the callers of these events implement their own additional
timeout.

FetchCollapse
Default: 1

This parameter is used to avoid multiple migration requests for the
same record from a single node. All the record requests for the same
record are queued up and processed when the record is migrated to the
current node.

When many clients across many nodes try to access the same record at
the same time, this can lead to a fetch storm where the record becomes
very active and bounces between nodes very fast. This leads to high CPU
utilization of the ctdbd daemon, trying to bounce that record around
very fast, and poor performance. Enabling this tunable can improve
performance and reduce CPU utilization for certain workloads.
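
The queueing idea behind FetchCollapse can be illustrated with a toy Python model. The class FetchCollapser, its methods and the record key are hypothetical. The sketch only shows how several local requests for one record collapse into a single migration request, and is not how ctdbd is implemented.

```python
from collections import defaultdict


class FetchCollapser:
    """Toy model: one migration request per record, however many clients ask."""

    def __init__(self):
        self.waiting = defaultdict(list)  # record key -> queued client requests
        self.migrations_sent = 0

    def request(self, key, client):
        """Queue a client request; start a migration only for the first one."""
        first = key not in self.waiting
        self.waiting[key].append(client)
        if first:
            self.migrations_sent += 1  # stands in for asking the remote node

    def record_arrived(self, key):
        """The record has migrated here: return all queued requests to process."""
        return self.waiting.pop(key, [])


fc = FetchCollapser()
for client in ("c1", "c2", "c3"):
    fc.request("key42", client)
print(fc.migrations_sent)          # 1
print(fc.record_arrived("key42"))  # ['c1', 'c2', 'c3']
```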

HopcountMakeSticky
Default: 50

For database(s) marked STICKY (using 'ctdb setdbsticky'), any record
that is migrating so fast that hopcount exceeds this limit is marked as
a STICKY record for StickyDuration seconds. This means that after each
migration the sticky record will be kept on the node for StickyPindown
milliseconds and prevented from being migrated off the node.

This will improve performance for certain workloads, such as
locking.tdb if many clients are opening/closing the same file
concurrently.
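
The interplay of HopcountMakeSticky, StickyDuration and StickyPindown can be sketched as below. The Record class, its method names and the use of integer millisecond timestamps are assumptions made for illustration. Only the three default values come from this page.

```python
HOPCOUNT_MAKE_STICKY = 50        # default HopcountMakeSticky
STICKY_DURATION_MS = 600 * 1000  # default StickyDuration (600 s), in ms
STICKY_PINDOWN_MS = 200          # default StickyPindown, in ms


class Record:
    """Hypothetical per-record state; all times are integer milliseconds."""

    def __init__(self):
        self.sticky_until = 0  # the record counts as STICKY before this time
        self.pinned_until = 0  # migration off the node is deferred before this

    def on_migrated_here(self, hopcount, now):
        """Called when the record arrives on this node at time `now`."""
        if hopcount > HOPCOUNT_MAKE_STICKY:
            self.sticky_until = now + STICKY_DURATION_MS
        if now < self.sticky_until:
            self.pinned_until = now + STICKY_PINDOWN_MS

    def may_migrate_away(self, now):
        return now >= self.pinned_until


rec = Record()
rec.on_migrated_here(hopcount=51, now=100_000)
print(rec.may_migrate_away(100_100))  # False: still pinned down
print(rec.may_migrate_away(100_200))  # True: 200 ms have elapsed
```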

IPAllocAlgorithm
Default: 2

Selects the algorithm that CTDB should use when doing public IP address
allocation. Meaningful values are:

0
    Deterministic IP address allocation.

    This is a simple and fast option. However, it can cause
    unnecessary address movement during fail-over because each
    address has a "home" node. Works badly when some nodes do not
    have any addresses defined. Should be used with care when
    addresses are defined across multiple networks.

1
    Non-deterministic IP address allocation.

    This is a relatively fast option that attempts to minimise
    unnecessary address movements. Addresses do not have a "home"
    node. Rebalancing is limited but is usually adequate. Works
    badly when addresses are defined across multiple networks.

2
    LCP2 IP address allocation.

    Uses a heuristic to assign addresses defined across multiple
    networks, usually balancing addresses on each network evenly
    across nodes. Addresses do not have a "home" node. Minimises
    unnecessary address movements. The algorithm is complex, so is
    slower than other choices for a large number of addresses.
    However, it can calculate an optimal assignment of 900
    addresses in under 10 seconds on modern hardware.

If the specified value is not one of these then the default will be
used.
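
The fallback rule for unrecognised values can be written as a small Python helper. The names below are hypothetical. The sketch only encodes the mapping and the default described above, and implements none of the allocation algorithms.

```python
IP_ALLOC_ALGORITHMS = {
    0: "deterministic",
    1: "non-deterministic",
    2: "LCP2",
}
DEFAULT_IP_ALLOC_ALGORITHM = 2


def effective_ip_alloc_algorithm(value):
    """Map an IPAllocAlgorithm setting to the algorithm that ends up in use."""
    if value not in IP_ALLOC_ALGORITHMS:
        # Values other than 0, 1 and 2 fall back to the default.
        value = DEFAULT_IP_ALLOC_ALGORITHM
    return IP_ALLOC_ALGORITHMS[value]


print(effective_ip_alloc_algorithm(1))  # non-deterministic
print(effective_ip_alloc_algorithm(7))  # LCP2
```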

KeepaliveInterval
Default: 5

How often, in seconds, nodes send keep-alive packets to each other.

KeepaliveLimit
Default: 5

The number of keepalive intervals without any traffic that a node
waits before marking the peer as DISCONNECTED.

If a node has hung, it can take KeepaliveInterval * (KeepaliveLimit +
1) seconds before ctdb determines that the node is DISCONNECTED and
performs a recovery. This limit should not be set too high, so that
failures are detected early and application timeouts (e.g. SMB1) do not
kick in before the failover is completed.
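
As a worked example of the formula above, the following Python snippet computes the worst-case detection time. The function name max_detection_seconds is hypothetical. With both defaults set to 5, the result is 5 * (5 + 1) = 30 seconds.

```python
def max_detection_seconds(keepalive_interval=5, keepalive_limit=5):
    """Worst-case time before a hung node is marked DISCONNECTED."""
    return keepalive_interval * (keepalive_limit + 1)


print(max_detection_seconds())      # 30
print(max_detection_seconds(2, 3))  # 8
```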

LockProcessesPerDB
Default: 200

This is the maximum number of lock helper processes ctdb will create
for obtaining record locks. When ctdb cannot get a record lock without
blocking, it creates a helper process that waits for the lock to be
obtained.

LogLatencyMs
Default: 0

When set to non-zero, ctdb will log if certain operations take longer
than this value, in milliseconds, to complete. These operations include
"process a record request from client", "take a record or database
lock", "update a persistent database record" and "vacuum a database".

MaxQueueDropMsg
Default: 1000000

This is the maximum number of messages to be queued up for a client
before ctdb will treat the client as hung and will terminate the client
connection.

MonitorInterval
Default: 15

How often, in seconds, ctdb runs the 'monitor' event to check a node's
health.

MonitorTimeoutCount
Default: 20

How many 'monitor' events in a row need to time out before a node is
flagged as UNHEALTHY. This setting is useful if scripts cannot be
written so that they do not hang for benign reasons.

NoIPFailback
Default: 0

When set to 1, ctdb will not perform failback of IP addresses when a
node becomes healthy. When a node becomes UNHEALTHY, ctdb WILL perform
failover of public IP addresses, but when the node becomes HEALTHY
again, ctdb will not fail the addresses back.

Use with caution! Normally when a node becomes available to the cluster
ctdb will try to reassign public IP addresses onto the new node as a
way to distribute the workload evenly across the cluster nodes. Ctdb
tries to make sure that all running nodes host approximately the same
number of public addresses.

When you enable this tunable, ctdb will no longer attempt to rebalance
the cluster by failing IP addresses back to the new nodes. An
unbalanced cluster will therefore remain unbalanced until there is
manual intervention from the administrator. When this parameter is set,
you can manually fail public IP addresses over to the new node(s) using
the 'ctdb moveip' command.

NoIPTakeover
Default: 0

When set to 1, ctdb will not allow IP addresses to be failed over to
other nodes. Any IP addresses already hosted on healthy nodes will
remain. Any IP addresses hosted on unhealthy nodes will be released by
those nodes and will become un-hosted.

PullDBPreallocation
Default: 10*1024*1024

This is the size of a record buffer to pre-allocate for sending the
reply to a PULLDB control. Usually the record buffer starts with the
size of the first record and gets reallocated every time a new record
is added to the record buffer. For a large number of records, growing
the record buffer one record at a time can be very inefficient.

QueueBufferSize
Default: 1024

This is the maximum amount of data (in bytes) ctdb will read from a
socket at a time.

For a busy setup, if ctdb is not able to process the TCP sockets fast
enough (large amount of data in Recv-Q for tcp sockets), then this
tunable value should be increased. However, large values can keep ctdb
busy processing packets and prevent ctdb from handling other events.

RecBufferSizeLimit
Default: 1000000

This is the limit on the size of the record buffer to be sent in
various controls. This limit is used by new controls used for recovery
and controls used in vacuuming.

RecdFailCount
Default: 10

If the recovery daemon has failed to ping the main daemon for this many
consecutive intervals, the main daemon will consider the recovery
daemon as hung and will try to restart it to recover.

RecdPingTimeout
Default: 60

If the main daemon has not heard a "ping" from the recovery daemon for
this many seconds, the main daemon will log a message that the recovery
daemon is potentially hung. This also increments a counter which is
checked against RecdFailCount for detection of a hung recovery daemon.
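
A hedged Python sketch of how RecdPingTimeout and RecdFailCount might combine. The function check_recd is hypothetical. It assumes one check per RecdPingTimeout interval, that a received ping resets the counter, and that the restart happens once the counter reaches RecdFailCount. None of these details are confirmed by this page. Under those assumptions, the defaults give roughly 60 * 10 = 600 seconds before a restart is attempted.

```python
RECD_PING_TIMEOUT = 60  # default RecdPingTimeout, seconds
RECD_FAIL_COUNT = 10    # default RecdFailCount


def check_recd(seconds_since_last_ping, fail_count):
    """One periodic check by the main daemon.

    Returns (new_fail_count, action). Assumption: the check runs once per
    RecdPingTimeout interval and a recent ping resets the counter.
    """
    if seconds_since_last_ping < RECD_PING_TIMEOUT:
        return 0, "ok"
    fail_count += 1
    if fail_count >= RECD_FAIL_COUNT:
        return fail_count, "restart recovery daemon"
    return fail_count, "log: recovery daemon potentially hung"


count = 0
for _ in range(RECD_FAIL_COUNT):
    count, action = check_recd(RECD_PING_TIMEOUT, count)
print(count, action)  # 10 restart recovery daemon
```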

RecLockLatencyMs
Default: 1000

When using a reclock file for split brain prevention, if set to
non-zero this tunable will make the recovery daemon log a message if
the fcntl() call to lock/testlock the recovery file takes longer than
this number of milliseconds.

RecoverInterval
Default: 1

How frequently, in seconds, the recovery daemon performs the
consistency checks to determine if it should perform a recovery.

RecoverTimeout
Default: 120

This is the default setting for timeouts for controls when sent from
the recovery daemon. We allow longer control timeouts from the recovery
daemon than from normal use since the recovery daemon often uses
controls that can take a lot longer than normal controls.

RecoveryBanPeriod
Default: 300

The duration in seconds for which a node is banned if the node fails
during recovery. After this time has elapsed the node will
automatically get unbanned and will attempt to rejoin the cluster.

A node usually gets banned due to real problems with the node. Don't
set this value too small. Otherwise, a problematic node will try to
rejoin the cluster too soon, causing unnecessary recoveries.

RecoveryDropAllIPs
Default: 120

If a node is stuck in recovery, or stopped, or banned, for this many
seconds, then ctdb will release all public addresses on that node.

RecoveryGracePeriod
Default: 120

During recoveries, if a node has not caused recovery failures during
the last grace period (in seconds), any record of earlier recovery
failures caused by that node is forgiven. This resets the ban-counter
back to zero for that node.
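
The forgiveness rule can be pictured with the following Python sketch. The BanState class and its fields are hypothetical bookkeeping invented for illustration. How ctdb actually tracks failures, and when it checks the grace period, is not specified on this page.

```python
RECOVERY_GRACE_PERIOD = 120  # default RecoveryGracePeriod, seconds


class BanState:
    """Hypothetical per-node bookkeeping of recovery failures."""

    def __init__(self):
        self.count = 0
        self.last_failure = None  # time of the most recent failure, seconds

    def forgive_if_quiet(self, now):
        """Reset the counter if no failure occurred within the grace period."""
        if (self.last_failure is not None
                and now - self.last_failure > RECOVERY_GRACE_PERIOD):
            self.count = 0

    def record_failure(self, now):
        self.forgive_if_quiet(now)
        self.count += 1
        self.last_failure = now


state = BanState()
state.record_failure(now=0)
state.record_failure(now=60)   # within the grace period: counter grows
print(state.count)             # 2
state.record_failure(now=300)  # 240 s of quiet: earlier failures forgiven
print(state.count)             # 1
```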

RepackLimit
Default: 10000

During vacuuming, if the number of freelist records is more than
RepackLimit, then the database is repacked to get rid of the freelist
records and avoid fragmentation.

RerecoveryTimeout
Default: 10

Once a recovery has completed, no additional recoveries are permitted
until this timeout in seconds has expired.

SeqnumInterval
Default: 1000

Some databases have seqnum tracking enabled, so that samba will be able
to detect asynchronously when there have been updates to the database.
Every time a database is updated its sequence number is increased.

This tunable is used to specify in milliseconds how frequently ctdb
will send out updates to remote nodes to inform them that the sequence
number has increased.

StatHistoryInterval
Default: 1

Granularity of the statistics collected in the statistics history. This
is reported by the 'ctdb stats' command.

StickyDuration
Default: 600

Once a record has been marked STICKY, this is the duration, in seconds,
for which the record will be flagged as a STICKY record.

StickyPindown
Default: 200

Once a STICKY record has been migrated onto a node, it will be pinned
down on that node for this number of milliseconds. Any request from
other nodes to migrate the record off the node will be deferred.

TakeoverTimeout
Default: 9

This is the duration in seconds in which ctdb tries to complete IP
failover.

TickleUpdateInterval
Default: 20

Every TickleUpdateInterval seconds, ctdb synchronizes the client
connection information across nodes.

TraverseTimeout
Default: 20

This is the duration in seconds for which a database traverse is
allowed to run. If the traverse does not complete during this interval,
ctdb will abort the traverse.

VacuumFastPathCount
Default: 60

During a vacuuming run, ctdb usually processes only the records marked
for deletion, also called fast path vacuuming. After finishing
VacuumFastPathCount fast path vacuuming runs, ctdb will trigger a scan
of the complete database for any empty records that need to be deleted.
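
The alternation between fast path runs and full scans can be sketched with a small Python generator. The function vacuum_modes is hypothetical. Treating the full scan as a separate run that resets the counter is an assumption, since this page does not say whether the full scan replaces or accompanies a fast path run.

```python
VACUUM_FAST_PATH_COUNT = 60  # default VacuumFastPathCount


def vacuum_modes(runs, fast_path_count=VACUUM_FAST_PATH_COUNT):
    """Yield 'fast' or 'full' for each successive vacuuming run."""
    fast_runs = 0
    for _ in range(runs):
        if fast_runs >= fast_path_count:
            fast_runs = 0
            yield "full"  # scan the complete database for empty records
        else:
            fast_runs += 1
            yield "fast"  # only records already marked for deletion


print(list(vacuum_modes(runs=7, fast_path_count=3)))
# ['fast', 'fast', 'fast', 'full', 'fast', 'fast', 'fast']
```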

VacuumInterval
Default: 10

Periodic interval in seconds when vacuuming is triggered for volatile
databases.

VacuumMaxRunTime
Default: 120

The maximum time in seconds for which the vacuuming process is allowed
to run. If the vacuuming process takes longer than this value, then the
vacuuming process is terminated.

VerboseMemoryNames
Default: 0

When set to non-zero, ctdb assigns verbose names for some of the talloc
allocated memory objects. These names are visible in the talloc memory
report generated by 'ctdb dumpmemory'.

FILES
/etc/ctdb/ctdb.tunables

SEE ALSO
ctdb(1), ctdbd(1), ctdb.conf(5), ctdb(7), http://ctdb.samba.org/

AUTHOR
This documentation was written by Ronnie Sahlberg, Amitay Isaacs,
Martin Schwenke

COPYRIGHT
Copyright © 2007 Andrew Tridgell, Ronnie Sahlberg

This program is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published by the
Free Software Foundation; either version 3 of the License, or (at your
option) any later version.

This program is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
General Public License for more details.

You should have received a copy of the GNU General Public License along
with this program; if not, see http://www.gnu.org/licenses.

ctdb 06/01/2021 CTDB-TUNABLES(7)