COROSYNC-QDEVICE(8)      System Manager's Manual      COROSYNC-QDEVICE(8)

corosync-qdevice - QDevice daemon

corosync-qdevice [-dfh] [-S option=value[,option2=value2,...]]

corosync-qdevice is a daemon running on each node of a cluster. It
provides a configured number of votes to the quorum subsystem based on a
third-party arbitrator's decision. Its primary use is to allow a cluster
to sustain more node failures than standard quorum rules allow. It is
recommended for clusters with an even number of nodes and highly
recommended for 2-node clusters.

-d  Forcefully turn on debug information without the need to change
    corosync.conf. To bump the priority of syslog messages to info,
    use this parameter twice.

-f  Do not daemonize; run in the foreground.

-h  Show short help text.

-S  Set advanced settings, described in their own section below. This
    option generally shouldn't be used because most of the settings
    are not safe to change.

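The option=value[,option2=value2,...] form accepted by -S can be
assembled mechanically. The following is a minimal sketch, not part of
corosync-qdevice; the helper names are hypothetical, and rejecting
commas and equals signs is an assumption, since this page does not say
how such characters would be handled.

```python
def format_advanced_settings(settings):
    """Join settings into the option=value[,option2=value2,...] form."""
    parts = []
    for name, value in settings.items():
        text = str(value)
        # Assumption: a comma or '=' in a name, or a comma in a value,
        # would be ambiguous in this simple form, so refuse to encode it.
        if "," in name or "=" in name or "," in text:
            raise ValueError(f"cannot encode {name!r}={text!r}")
        parts.append(f"{name}={text}")
    return ",".join(parts)


def build_command(settings, foreground=True):
    """Build an argv list for corosync-qdevice (hypothetical helper)."""
    argv = ["corosync-qdevice"]
    if foreground:
        argv.append("-f")
    if settings:
        argv += ["-S", format_advanced_settings(settings)]
    return argv
```

For example, build_command({"heuristics_use_execvp": "on"}) yields an
argument list ending in -S heuristics_use_execvp=on. Keep in mind that
most advanced settings are not safe to change.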
corosync-qdevice reads its configuration from the corosync.conf file.

The main configuration is within the quorum.device sub-key. Each model
also has its own configuration within a similarly named sub-key.

model
    Specifies the model to be used. This parameter is required.
    corosync-qdevice is modular and can support multiple different
    models. The model defines what type of arbitrator is used.
    Currently only net is supported.

timeout
    Specifies how often corosync-qdevice should call the
    votequorum_qdevice_poll function. It is also used by the net model
    to adjust its heartbeat timeout. It is recommended that you don't
    change this value. Default is 10000.

sync_timeout
    Specifies how often corosync-qdevice should call the
    votequorum_qdevice_poll function during a sync phase. It is
    recommended that you don't change this value. Default is 30000.

votes
    The number of votes provided to the cluster by qdevice. Default is
    (number_of_nodes - 1) or, more generally, sum(votes_per_node) - 1.

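The default for votes is simple arithmetic over the per-node votes. A
minimal sketch follows; the function name is hypothetical and the list
of per-node votes is assumed to come from the nodelist.

```python
def default_qdevice_votes(votes_per_node):
    """Return sum(votes_per_node) - 1, the documented default for votes."""
    return sum(votes_per_node) - 1


# Four nodes with one vote each: number_of_nodes - 1.
four_node_default = default_qdevice_votes([1, 1, 1, 1])

# Nodes with unequal votes use the general form.
mixed_default = default_qdevice_votes([2, 1, 1])
```

With one vote per node the result equals number_of_nodes - 1, which is
the first form given above.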
The quorum.device.heuristics subkey holds the configuration of the
heuristics. Heuristics are a set of commands executed locally on
startup, on cluster membership change, on successful connection to
corosync-qnetd and, optionally, at regular intervals. Commands are
executed in parallel. When all commands finish successfully (their exit
code is zero) on time, heuristics have passed; otherwise they have
failed. The heuristics result is sent to corosync-qnetd, where it is
used in calculations to determine which partition should be quorate.

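The pass/fail rule can be illustrated outside the daemon. This is a
sketch of the rule as described here, not the corosync-qdevice
implementation: the function name is hypothetical, and treating a
command that cannot be started as a failure is an assumption.

```python
import subprocess
import time


def run_heuristics(commands, timeout_ms):
    """Return True only if every command exits with 0 before the timeout."""
    deadline = time.monotonic() + timeout_ms / 1000.0
    passed = True
    procs = []
    # Start every command before waiting so that they run in parallel.
    for argv in commands:
        try:
            procs.append(subprocess.Popen(
                argv, stdout=subprocess.DEVNULL, stderr=subprocess.DEVNULL))
        except OSError:
            passed = False  # assumption: an unstartable command fails
    for proc in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            if proc.wait(timeout=remaining) != 0:
                passed = False
        except subprocess.TimeoutExpired:
            # Not finished on time: kill it and fail the heuristics.
            proc.kill()
            proc.wait()
            passed = False
    return passed
```

Each entry in commands is an argument list, for example
["/usr/bin/test", "-f", "/tmp/test.txt"].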
timeout
    Specifies the maximum time in milliseconds that corosync-qdevice
    waits for the heuristics commands to finish. If a command doesn't
    finish before the timeout, it is killed and heuristics fail. This
    timeout is used for heuristics executed at regular times. Default
    value is half of quorum.device.timeout, so 5000.

sync_timeout
    Similar to quorum.device.heuristics.timeout but used during
    membership changes. Default value is half of
    quorum.device.sync_timeout, so 15000.

interval
    Specifies the interval between two regular heuristics executions.
    Default value is 3 * quorum.device.timeout, so 30000.

mode
    Can be one of on, sync or off and specifies the mode of operation
    of heuristics. Default is off, which means heuristics are disabled.
    When sync is set, heuristics are executed only during startup,
    membership change and when the connection to corosync-qnetd is
    established. To also run heuristics on a regular basis, set this
    option to on.

exec_NAME
    Defines executables. NAME can be an arbitrary valid cmap key name
    string and has no special meaning. The value of this variable must
    contain a command to execute. The value is parsed (split) into
    arguments similarly to how the Bourne shell would do it. Quoting is
    possible by using backslash and double quotes.

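To get a feel for how an exec_NAME value is split into arguments,
Python's shlex module can serve as an approximation. It is not the
parser used by corosync-qdevice: shlex also honors single quotes, which
are not mentioned above, so results may differ for such values. The
function name and the dictionary input are hypothetical.

```python
import shlex


def heuristics_commands(heuristics_cfg):
    """Map exec_NAME keys to argument lists, ignoring all other keys."""
    commands = {}
    for key, value in heuristics_cfg.items():
        if key.startswith("exec_"):
            # NAME has no special meaning; it only labels the command.
            commands[key[len("exec_"):]] = shlex.split(value)
    return commands


cfg = {
    "mode": "sync",
    "exec_ping": '/bin/ping -q -c 1 "www.example.org"',
    "exec_test_txt_exists": "/usr/bin/test -f /tmp/test.txt",
}
commands = heuristics_commands(cfg)
```

Here commands["ping"] is an argument list whose last element is
www.example.org with the double quotes removed.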
The quorum.device.net subkey holds the configuration for model net.

tls
    Can be one of on, off or required and specifies if TLS should be
    used. on means a connection with TLS is attempted first, but if the
    server doesn't advertise TLS support then non-TLS will be used. off
    means TLS is not required and is not even attempted. This mode is
    the only one which doesn't need a properly initialized NSS
    database. required means TLS is required and if the server doesn't
    support TLS, qdevice will exit with an error message. Default is
    on.

host
    Specifies the IP address or host name of the qnetd server to be
    used. This parameter is required.

port
    Specifies the TCP port of the qnetd server. Default is 5403.

algorithm
    Decision algorithm. Can be one of ffsplit or lms. (There are also
    test and 2nodelms, both of which are mainly for developers and
    shouldn't be used for production clusters.) For a description of
    what each algorithm means and how the algorithms differ, see their
    individual sections. Default value is ffsplit.

tie_breaker
    Can be one of lowest, highest or valid_node_id (number) values.
    It's used as a fallback if qdevice has to decide between two or
    more equal partitions. lowest means the partition with the lowest
    node id is chosen. highest means the partition with the highest
    node id is chosen. valid_node_id means that the partition
    containing the node with the given node id is chosen. Default is
    lowest.

connect_timeout
    Timeout when corosync-qdevice is trying to connect to the
    corosync-qnetd host. Default is 0.8 * quorum.device.timeout.

force_ip_version
    Can be one of 0, 4 or 6 and forces the software to use the given IP
    version. 0 (default value) means IPv6 is preferred and IPv4 should
    be used as a fallback.

keep_active_partition_tie_breaker
    Can be one of on or off and specifies if the keep active partition
    tie breaker should be used. When this option is enabled and a tie
    happens, QNetd will prefer the partition with members of the
    previously active (quorate) partition. This is hard-coded behavior
    of the LMS algorithm, so this setting affects only the FFSplit
    algorithm. Default is on.

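Several defaults in this configuration are derived from
quorum.device.timeout and quorum.device.sync_timeout. The sketch below
only restates that arithmetic; the function name and the dictionary
keys are hypothetical, and the use of integer division is an
assumption, since the rounding behavior is not described here.

```python
def derived_defaults(timeout=10000, sync_timeout=30000):
    """Compute the defaults documented as derived from the two timeouts."""
    return {
        # half of quorum.device.timeout
        "heuristics.timeout": timeout // 2,
        # half of quorum.device.sync_timeout
        "heuristics.sync_timeout": sync_timeout // 2,
        # 3 * quorum.device.timeout
        "heuristics.interval": 3 * timeout,
        # 0.8 * quorum.device.timeout, kept in integer arithmetic
        "net.connect_timeout": timeout * 8 // 10,
    }


defaults = derived_defaults()
```

With the stock timeouts this gives 5000, 15000 and 30000 for the
heuristics values, matching the numbers quoted in their descriptions.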
Logging configuration is within the logging directive. corosync-qdevice
parses and supports only the debug option. The logger_subsys
sub-directive can also be used if subsys is set to QDEVICE.

For corosync-qdevice to work correctly, the nodelist directive has to
be used and properly configured. Also, the net model requires that the
totem.cluster_name option is set.

For model net to work using TLS, it's necessary to create the NSS
database, import the Qnetd CA certificate, and get/distribute a valid
client certificate.

If pcs is used (recommended), the following steps are not needed
because pcs does them automatically.

corosync-qdevice-net-certutil is the tool to perform the required
actions semi-automatically. Please consult its help output and its man
page. For a first-time configuration it may make sense to start with
the -Q option.

If TLS is not required, just edit the corosync.conf file and set
quorum.device.net.tls to off.

Depending on the configuration of NSS (stored in the nss.config file,
usually in the /etc/crypto-policies/back-ends/ directory), disabled
ciphers or too-short keys may be rejected. The proper solution is to
regenerate the NSS databases for both the corosync-qnetd and
corosync-qdevice daemons. As a quick workaround it's also possible to
set the environment variable NSS_IGNORE_SYSTEM_POLICY=1 before running
the corosync-qdevice daemon.

When NSS is updated, it may also be necessary to upgrade the database
to the new format. There is no consensus on the recommended way, but
the following command seems to work just fine (if the qdevice
sysconfdir is set to /etc):

    # certutil -N -d /etc/corosync/qdevice/net/nssdb -f /etc/corosync/qdevice/net/nssdb/pwdfile.txt

Algorithms are used to change the behavior of how corosync-qnetd
provides votes to a given node/partition. Currently there are two
algorithms supported.

ffsplit
    This one makes sense only for clusters with an even number of
    nodes. It provides exactly one vote to the partition with the
    highest number of active nodes. If there are two exactly equal
    partitions, it provides its vote to the partition with the higher
    score. The score is computed as
    (number_of_connected_nodes +
    number_of_connected_nodes_with_passed_heuristics -
    number_of_connected_nodes_with_failed_heuristics). If the scores
    are equal, the vote is provided to the partition with the most
    clients connected to the qnetd server. If this number is also
    equal, then the tie_breaker is used. It is able to transition its
    vote if the currently active partition becomes partitioned and a
    non-active partition still has at least 50% of the active nodes.
    Because of this, a vote is not provided if the qnetd connection is
    not active.

    To use this algorithm, the number of votes per node has to be 1
    (default) and the qdevice number of votes also has to be 1. The
    latter is achieved by setting the quorum.device.votes key in the
    corosync.conf file to 1.

lms
    Last-man-standing. If the node is the only one left in the cluster
    that can see the qnetd server, then a vote is returned.

    If more than one node can see the qnetd server but some nodes can't
    see each other, then the cluster is divided up into 'partitions'
    based on their ring_id and this algorithm returns a vote to the
    partition with the highest heuristics score (computed the same way
    as for the ffsplit algorithm) or, if there is more than one
    partition with equal scores, to the largest active partition or, if
    there is more than one equal partition, to the partition that
    contains the tie_breaker node (lowest, highest, etc.). For LMS to
    work, the number of qdevice votes has to be set to the default (so
    just delete the quorum.device.votes key from corosync.conf).

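The order in which the two algorithms compare partitions can be
sketched as a chain of criteria that ends with the tie_breaker. This is
an illustration of the rules as worded above, not the corosync-qnetd
implementation. The Partition type and all function names are
hypothetical, and the sketch deliberately leaves out vote transition,
the keep active partition tie breaker and the lms single-node case.

```python
from dataclasses import dataclass


@dataclass
class Partition:
    node_ids: list   # ids of the nodes that form the partition
    connected: int   # nodes connected to the qnetd server
    passed: int      # connected nodes whose heuristics passed
    failed: int      # connected nodes whose heuristics failed


def score(p):
    """connected + passed heuristics - failed heuristics."""
    return p.connected + p.passed - p.failed


def tie_break(partitions, tie_breaker="lowest"):
    if tie_breaker == "lowest":
        return min(partitions, key=lambda p: min(p.node_ids))
    if tie_breaker == "highest":
        return max(partitions, key=lambda p: max(p.node_ids))
    node_id = int(tie_breaker)  # a specific node id
    for p in partitions:
        if node_id in p.node_ids:
            return p
    return None  # assumption: no partition holds that node


def pick(partitions, criteria, tie_breaker="lowest"):
    """Narrow candidates by each criterion in turn, then tie-break."""
    candidates = list(partitions)
    for key in criteria:
        top = max(key(p) for p in candidates)
        candidates = [p for p in candidates if key(p) == top]
        if len(candidates) == 1:
            return candidates[0]
    return tie_break(candidates, tie_breaker)


# ffsplit: partition size, then score, then clients connected to qnetd.
FFSPLIT = [lambda p: len(p.node_ids), score, lambda p: p.connected]
# lms: heuristics score first, then partition size.
LMS = [score, lambda p: len(p.node_ids)]
```

The only difference between the two lists is which criterion is
consulted first, which is why the same partitions can receive the vote
under one algorithm and not under the other.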
Set by using the -S option. The default value is shown in parentheses.
Options beginning with the net_ prefix are specific to model net.

lock_file
    Lock file location.
    (/var/run/corosync-qdevice/corosync-qdevice.pid)

local_socket_file
    Internal IPC socket file location.
    (/var/run/corosync-qdevice/corosync-qdevice.sock)

local_socket_backlog
    Parameter passed to the listen syscall. (10)

max_cs_try_again
    How many times to retry the call to a corosync function which has
    returned CS_ERR_TRY_AGAIN. (10)

votequorum_device_name
    Name used for qdevice registration. (Qdevice)

ipc_max_clients
    Maximum allowed simultaneous IPC clients. (10)

ipc_max_receive_size
    Maximum size of a message received by IPC client. (4096)

ipc_max_send_size
    Maximum size of a message allowed to be sent to an IPC client.
    (65536)

master_wins
    Force enable/disable master wins. (default is model)

heuristics_ipc_max_send_buffers
    Maximum number of heuristics worker send buffers. (128)

heuristics_ipc_max_send_receive_size
    Maximum size of a message allowed to be sent to, or received from,
    the heuristics worker. (4096)

heuristics_min_timeout
    Minimum heuristics timeout accepted by client in ms. (1000)

heuristics_max_timeout
    Maximum heuristics timeout accepted by client in ms. (120000)

heuristics_min_interval
    Minimum heuristics interval accepted by client in ms. (1000)

heuristics_max_interval
    Maximum heuristics interval accepted by client in ms. (3600000)

heuristics_max_execs
    Maximum number of exec_ commands. (32)

heuristics_use_execvp
    Use execvp instead of execv for executing commands. (off)

heuristics_max_processes
    Maximum number of processes running at one time. (160)

heuristics_kill_list_interval
    Interval in ms at which status is gathered and a signal is
    eventually sent to processes which didn't finish on time. (5000)

net_nss_db_dir
    NSS database directory. (/etc/corosync/qdevice/net/nssdb)

net_initial_msg_receive_size
    Initial (used during connection parameter negotiation) maximum
    size of the receive buffer for a message (maximum allowed message
    size received from qnetd). (32768)

net_initial_msg_send_size
    Initial (used during connection parameter negotiation) maximum
    size of one send buffer (message) to be sent to server. (32768)

net_min_msg_send_size
    Minimum required size of one send buffer (message) to be sent to
    server. (32768)

net_max_msg_receive_size
    Maximum allowed size of receive buffer for a message sent by
    server. (16777216)

net_max_send_buffers
    Maximum number of send buffers. (10)

net_nss_qnetd_cn
    Canonical name of qnetd server certificate. (Qnetd Server)

net_nss_client_cert_nickname
    NSS nickname of qdevice client certificate. (Cluster Cert)

net_heartbeat_interval_min
    Minimum heartbeat timeout accepted by client in ms. (1000)

net_heartbeat_interval_max
    Maximum heartbeat timeout accepted by client in ms. (120000)

net_min_connect_timeout
    Minimum connection timeout accepted by client in ms. (1000)

net_max_connect_timeout
    Maximum connection timeout accepted by client in ms. (120000)

net_test_algorithm_enabled
    Enable test algorithm. (if built with --enable-debug on,
    otherwise off)

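The min/max pairs above bound which timeouts and intervals the client
accepts. A minimal range check using the listed defaults might look
like the sketch below. The dictionary labels and the function name are
hypothetical, and treating both bounds as inclusive is an assumption
that this page does not confirm.

```python
# Default (minimum, maximum) pairs in ms, taken from the settings above.
LIMITS_MS = {
    "heuristics_timeout": (1000, 120000),
    "heuristics_interval": (1000, 3600000),
    "net_heartbeat_interval": (1000, 120000),
    "net_connect_timeout": (1000, 120000),
}


def is_accepted(kind, value_ms, limits=LIMITS_MS):
    """Return True if value_ms lies within the limits for this kind."""
    low, high = limits[kind]
    # Assumption: both ends of the range are accepted.
    return low <= value_ms <= high
```

For example, the default heuristics timeout of 5000 falls inside its
range, while a value of 500 would be below heuristics_min_timeout.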
Define qdevice with the net model connecting to qnetd running on the
qnetd.example.org host, using the ffsplit algorithm. Heuristics are set
to sync mode and execute two commands.

    quorum {
      provider: corosync_votequorum
      device {
        votes: 1
        model: net
        net {
          tls: on
          host: qnetd.example.org
          algorithm: ffsplit
        }
        heuristics {
          mode: sync
          exec_ping: /bin/ping -q -c 1 "www.example.org"
          exec_test_txt_exists: /usr/bin/test -f /tmp/test.txt
        }
      }
    }

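A quick way to sanity-check a fragment like this one, for instance to
catch an unbalanced brace, is to read it into nested dictionaries. The
sketch below is a simplified reader written for this illustration and
is not the corosync.conf parser: it assumes one item per line, keeps
all values as strings, and overwrites repeated keys or blocks (so it
would not preserve multiple node entries of a nodelist).

```python
def parse_blocks(text):
    """Parse 'name {', 'key: value' and '}' lines into nested dicts."""
    root = {}
    stack = [root]
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        if line.endswith("{"):
            child = {}
            stack[-1][line[:-1].strip()] = child
            stack.append(child)
        elif line == "}":
            if len(stack) == 1:
                raise ValueError("unbalanced closing brace")
            stack.pop()
        else:
            key, sep, value = line.partition(":")
            if not sep:
                raise ValueError(f"unexpected line: {line!r}")
            stack[-1][key.strip()] = value.strip()
    if len(stack) != 1:
        raise ValueError("missing closing brace")
    return root
```

After parsing a quorum section, the qnetd host can be read with
cfg["quorum"]["device"]["net"]["host"].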
corosync-qdevice-tool(8) corosync-qdevice-net-certutil(8)
corosync-qnetd(8) corosync.conf(5) votequorum_qdevice_poll(3)

Jan Friesse

2020-10-27                                            COROSYNC-QDEVICE(8)