COROSYNC-QDEVICE(8)        System Manager's Manual        COROSYNC-QDEVICE(8)

corosync-qdevice - QDevice daemon

corosync-qdevice [-dfh] [-S option=value[,option2=value2,...]]

corosync-qdevice is a daemon running on each node of a cluster. It
provides a configured number of votes to the quorum subsystem based on a
third-party arbitrator's decision. Its primary use is to allow a cluster
to sustain more node failures than standard quorum rules allow. It is
recommended for clusters with an even number of nodes and highly
recommended for 2 node clusters.

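The effect of the extra votes can be illustrated with a little arithmetic. The sketch below assumes the usual majority rule (quorum is reached with more than half of the expected votes); that rule comes from votequorum and is not spelled out in this page, and the function names are hypothetical.

```python
def quorum_threshold(expected_votes):
    """Votes needed for quorum, assuming a strict-majority rule."""
    return expected_votes // 2 + 1


def is_quorate(surviving_node_votes, qdevice_votes, expected_votes):
    """True if the surviving nodes plus the qdevice votes reach quorum."""
    return surviving_node_votes + qdevice_votes >= quorum_threshold(expected_votes)


# Two nodes with one vote each and no qdevice: losing one node loses quorum.
print(is_quorate(1, 0, expected_votes=2))  # False
# The same cluster with a one-vote qdevice: expected votes become 3.
print(is_quorate(1, 1, expected_votes=3))  # True
```

Under these assumptions a two-node cluster survives the loss of one node only when the arbitrator grants its vote to the survivor.
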
-d     Forcefully turn on debug information without the need to change
       corosync.conf.

-f     Do not daemonize, run in the foreground.

-h     Show short help text.

-S     Set advanced settings described in their own section below. This
       option should generally not be used because most of the settings
       are not safe to change.

corosync-qdevice reads its configuration from the corosync.conf file.

The main configuration is within the quorum.device sub-key. Each model
also has its own configuration within a similarly named sub-key.

model  Specifies the model to be used. This parameter is required.
       corosync-qdevice is modular and is able to support multiple
       different models. The model defines what type of arbitrator is
       used. Currently only net is supported.

timeout
       Specifies how often corosync-qdevice should call the
       votequorum_poll function. It is also used by the net model to
       adjust its heartbeat timeout. It is recommended that you don't
       change this value. Default is 10000.

sync_timeout
       Specifies how often corosync-qdevice should call the
       votequorum_poll function during a sync phase. It is recommended
       that you don't change this value. Default is 30000.

votes  The number of votes provided to the cluster by qdevice. Default
       is (number_of_nodes - 1) or generally sum(votes_per_node) - 1.


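As a small worked example of the default for votes, the following sketch evaluates sum(votes_per_node) - 1. The function name is hypothetical and the list of per-node votes is illustrative input, not something read from corosync.conf.

```python
def default_qdevice_votes(votes_per_node):
    """Default for quorum.device.votes: sum of the node votes minus one."""
    return sum(votes_per_node) - 1


# Four nodes with one vote each: same as number_of_nodes - 1.
print(default_qdevice_votes([1, 1, 1, 1]))  # 3
# Nodes with unequal votes use the general form.
print(default_qdevice_votes([2, 1, 1]))     # 3
```
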
The quorum.device.heuristics subkey holds the configuration of the
heuristics. Heuristics are a set of commands executed locally on
startup, on cluster membership change, on successful connection to
corosync-qnetd and optionally also at regular intervals. Commands are
executed in parallel. When all commands finish successfully (their exit
code is zero) in time, heuristics have passed; otherwise they have
failed. The heuristics result is sent to corosync-qnetd, where it is
used in calculations to determine which partition should be quorate.

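The pass/fail rule described above can be sketched roughly as follows. This is only an illustration of the rule, not the daemon's implementation: the function name is hypothetical, commands are given as already split argument lists, and treating a command that cannot be started as a failure is an assumption.

```python
import subprocess
import time


def run_heuristics(commands, timeout_ms):
    """Run every argv list in parallel; pass only if all exit 0 in time."""
    procs = []
    passed = True
    for argv in commands:
        try:
            procs.append(subprocess.Popen(argv))
        except OSError:
            # Assumption: a command that cannot be started counts as failed.
            passed = False
    deadline = time.monotonic() + timeout_ms / 1000.0
    for proc in procs:
        remaining = max(0.0, deadline - time.monotonic())
        try:
            if proc.wait(timeout=remaining) != 0:
                passed = False
        except subprocess.TimeoutExpired:
            # The command overran the timeout: kill it, heuristics fail.
            proc.kill()
            proc.wait()
            passed = False
    return passed
```

Calling run_heuristics([["/usr/bin/test", "-f", "/tmp/test.txt"]], 5000) would then return True only if the file exists and the check completes within five seconds.
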
timeout
       Specifies the maximum time in milliseconds that corosync-qdevice
       waits for the heuristics commands to finish. If a command doesn't
       finish before the timeout, it is killed and heuristics fail. This
       timeout is used for heuristics executed at regular intervals.
       Default value is half of quorum.device.timeout, so 5000.

sync_timeout
       Similar to quorum.device.heuristics.timeout but used during
       membership changes. Default value is half of
       quorum.device.sync_timeout, so 15000.

interval
       Specifies the interval between two regular heuristics executions.
       Default value is 3 * quorum.device.timeout, so 30000.

mode   Can be one of on, sync or off and specifies the mode of operation
       of heuristics. Default is off, which means heuristics are
       disabled. When sync is set, heuristics are executed only during
       startup, membership change and when the connection to
       corosync-qnetd is established. To also run heuristics at regular
       intervals, set this option to on.

exec_NAME
       Defines an executable. NAME can be an arbitrary valid cmap key
       name string and has no special meaning. The value of this
       variable must contain a command to execute. The value is parsed
       (split) into arguments similarly to how a Bourne shell would do
       it. Quoting is possible by using backslash and double quotes.


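To make the derived defaults and the exec_NAME splitting concrete, here is a minimal sketch. The function names are hypothetical, and Python's shlex.split is used only as an approximation of the Bourne-shell-like splitting described above (it also accepts single quotes, which this page does not mention).

```python
import shlex


def heuristics_defaults(device_timeout=10000, device_sync_timeout=30000):
    """Derive heuristics defaults from the quorum.device timeouts (ms)."""
    return {
        "timeout": device_timeout // 2,            # half of quorum.device.timeout
        "sync_timeout": device_sync_timeout // 2,  # half of quorum.device.sync_timeout
        "interval": 3 * device_timeout,            # 3 * quorum.device.timeout
        "mode": "off",
    }


def parse_exec_entries(heuristics):
    """Collect exec_NAME keys and split each value into an argument list."""
    return {
        key[len("exec_"):]: shlex.split(value)
        for key, value in heuristics.items()
        if key.startswith("exec_")
    }


print(heuristics_defaults())
print(parse_exec_entries({
    "mode": "sync",
    "exec_ping": '/bin/ping -q -c 1 "www.example.org"',
}))
```
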
The quorum.device.net subkey holds the configuration for model net.

tls    Can be one of on, off or required and specifies if TLS should be
       used. on means a connection with TLS is attempted first, but if
       the server doesn't advertise TLS support then non-TLS will be
       used. off means TLS is not required and is not even attempted.
       This mode is the only one which doesn't need a properly
       initialized NSS database. required means TLS is required and if
       the server doesn't support TLS, qdevice will exit with an error
       message. Default is on.

host   Specifies the IP address or host name of the qnetd server to be
       used. This parameter is required.

port   Specifies the TCP port of the qnetd server. Default is 5403.

algorithm
       Decision algorithm. Can be one of ffsplit or lms. (There are
       also test and 2nodelms, both of which are mainly for developers
       and shouldn't be used for production clusters.) For a
       description of what each algorithm means and how the algorithms
       differ, see their individual sections. Default value is ffsplit.

tie_breaker
       Can be one of lowest, highest or valid_node_id (number) values.
       It's used as a fallback if qdevice has to decide between two or
       more equal partitions. lowest means the partition with the
       lowest node id is chosen. highest means the partition with the
       highest node id is chosen. valid_node_id means that the
       partition containing the node with the given node id is chosen.
       Default is lowest.

connect_timeout
       Timeout when corosync-qdevice is trying to connect to the
       corosync-qnetd host. Default is 0.8 * quorum.sync_timeout.

force_ip_version
       Can be one of 0|4|6 and forces the software to use the given IP
       version. 0 (default value) means IPv6 is preferred and IPv4
       should be used as a fallback.


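Two of the net options above describe small decision rules, which the following sketch spells out: the tls setting and the tie_breaker fallback. The function names and the representation of partitions as sets of node ids are assumptions made for illustration; the real decision is taken by corosync-qdevice and corosync-qnetd.

```python
def use_tls(mode, server_advertises_tls):
    """Decide whether the connection uses TLS for a given tls setting."""
    if mode == "off":
        return False                      # TLS is not even attempted
    if mode == "on":
        return server_advertises_tls      # fall back to non-TLS if unsupported
    if mode == "required":
        if not server_advertises_tls:
            raise RuntimeError("TLS is required but the server does not support it")
        return True
    raise ValueError(f"unknown tls mode: {mode!r}")


def break_tie(partitions, tie_breaker="lowest"):
    """Pick one of several otherwise equal partitions (sets of node ids)."""
    if tie_breaker == "lowest":
        return min(partitions, key=min)   # partition holding the lowest node id
    if tie_breaker == "highest":
        return max(partitions, key=max)   # partition holding the highest node id
    node_id = int(tie_breaker)            # a specific node id
    for partition in partitions:
        if node_id in partition:
            return partition
    return None                           # the given node is in no partition


print(use_tls("on", server_advertises_tls=False))  # False
print(break_tie([{1, 4}, {2, 3}], "3"))            # {2, 3}
```
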
Logging configuration is within the logging directive. corosync-qdevice
parses and supports most of the options with the exception of
to_logfile, logfile and logfile_priority. The logger_subsys
sub-directive can also be used if subsys is set to QDEVICE.


For corosync-qdevice to work correctly, the nodelist directive has to
be used and properly configured. Also, the net model requires that the
totem.cluster_name option is set.


For model net to work using TLS, it's necessary to create the NSS
database, import the Qnetd CA certificate, and get/distribute a valid
client certificate.

If pcs is used (recommended), the following steps are not needed because
pcs does them automatically.

corosync-qdevice-net-certutil is the tool to perform the required
actions semi-automatically. Please consult its help output and its man
page. For a first-time configuration it may make sense to start with
the -Q option.

If TLS is not required, just edit the corosync.conf file and set
quorum.device.net.tls to off.

Depending on the configuration of NSS (stored in the nss.config file,
usually in the /etc/crypto-policies/back-ends/ directory), disabled
ciphers or too short keys may be rejected. The proper solution is to
regenerate the NSS databases for both the corosync-qnetd and
corosync-qdevice daemons. As a quick workaround it's also possible to
set the environment variable NSS_IGNORE_SYSTEM_POLICY=1 before running
the corosync-qdevice daemon.

When NSS is updated it may also be necessary to upgrade the database
into the new format. There is no consensus on the recommended way, but
the following command seems to work just fine (if the qdevice sysconfdir
is set to /etc):

# certutil -N -d /etc/corosync/qdevice/net/nssdb -f /etc/corosync/qdevice/net/nssdb/pwdfile.txt


Algorithms are used to change the behavior of how corosync-qnetd
provides votes to a given node/partition. Currently there are two
algorithms supported.

ffsplit
       This one makes sense only for clusters with an even number of
       nodes. It provides exactly one vote to the partition with the
       highest number of active nodes. If there are two partitions of
       exactly equal size, it provides its vote to the partition with
       the higher score. The score is computed as
       (number_of_connected_nodes +
       number_of_connected_nodes_with_passed_heuristics -
       number_of_connected_nodes_with_failed_heuristics). If the scores
       are equal, the vote is provided to the partition with the most
       clients connected to the qnetd server. If this number is also
       equal, then the tie_breaker is used. It is able to transition
       its vote if the currently active partition becomes partitioned
       and a non-active partition still has at least 50% of the active
       nodes. Because of this, a vote is not provided if the qnetd
       connection is not active.

       To use this algorithm it's required to set the number of votes
       per node to 1 (default) and the qdevice number of votes also has
       to be 1. This is achieved by setting the quorum.device.votes key
       in the corosync.conf file to 1.

lms    Last-man-standing. If the node is the only one left in the
       cluster that can see the qnetd server then we return a vote.

       If more than one node can see the qnetd server but some nodes
       can't see each other then the cluster is divided up into
       'partitions' based on their ring_id and this algorithm returns a
       vote to the partition with the highest heuristics score
       (computed the same way as for the ffsplit algorithm), or if
       there is more than one partition with equal scores, to the
       largest active partition or, if there is more than one equally
       large partition, to the partition that contains the tie_breaker
       node (lowest, highest, etc). For LMS to work, the number of
       qdevice votes has to be set to default (so just delete the
       quorum.device.votes key from corosync.conf).


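The ordering of criteria used by the two algorithms can be sketched as sort keys. This is a simplified model and not the qnetd implementation: the Partition class and all function names are hypothetical, "active nodes" is assumed to mean the members of a partition, and vote transitions and the connection-state rules are left out.

```python
from dataclasses import dataclass, field


@dataclass
class Partition:
    members: set      # node ids in this membership (assumed "active nodes")
    connected: set    # members currently connected to qnetd
    heuristics: dict = field(default_factory=dict)  # node id -> "pass" or "fail"


def score(p):
    """connected + connected with passed heuristics - connected with failed."""
    passed = sum(1 for n in p.connected if p.heuristics.get(n) == "pass")
    failed = sum(1 for n in p.connected if p.heuristics.get(n) == "fail")
    return len(p.connected) + passed - failed


def ffsplit_key(p):
    # Size first, then score, then number of clients connected to qnetd.
    return (len(p.members), score(p), len(p.connected))


def lms_key(p):
    # Heuristics score first, then the size of the partition.
    return (score(p), len(p.members))


def best_candidates(partitions, key):
    """Return every partition sharing the best key; more than one result
    means the configured tie_breaker has to decide."""
    best = max(key(p) for p in partitions)
    return [p for p in partitions if key(p) == best]
```

For two equally sized halves where one node in the second half has failed heuristics, best_candidates(parts, ffsplit_key) returns only the first half; if both halves score the same, both are returned and the tie_breaker setting decides.
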
Set by using the -S option. The default value is shown in parentheses.
Options beginning with the net_ prefix are specific to model net.

lock_file
       Lock file location.
       (/var/run/corosync-qdevice/corosync-qdevice.pid)

local_socket_file
       Internal IPC socket file location.
       (/var/run/corosync-qdevice/corosync-qdevice.sock)

local_socket_backlog
       Parameter passed to listen syscall. (10)

max_cs_try_again
       How many times to retry the call to a corosync function which
       has returned CS_ERR_TRY_AGAIN. (10)

votequorum_device_name
       Name used for qdevice registration. (Qdevice)

ipc_max_clients
       Maximum allowed simultaneous IPC clients. (10)

ipc_max_receive_size
       Maximum size of a message received by IPC client. (4096)

ipc_max_send_size
       Maximum size of a message allowed to be sent to an IPC client.
       (65536)

master_wins
       Force enable/disable master wins. (default is model)

heuristics_ipc_max_send_buffers
       Maximum number of heuristics worker send buffers. (128)

heuristics_ipc_max_send_receive_size
       Maximum size of a message allowed to be sent to, or received
       from, the heuristics worker. (4096)

heuristics_min_timeout
       Minimum heuristics timeout accepted by client in ms. (1000)

heuristics_max_timeout
       Maximum heuristics timeout accepted by client in ms. (120000)

heuristics_min_interval
       Minimum heuristics interval accepted by client in ms. (1000)

heuristics_max_interval
       Maximum heuristics interval accepted by client in ms. (3600000)

heuristics_max_execs
       Maximum number of exec_ commands. (32)

heuristics_use_execvp
       Use execvp instead of execv for executing commands. (off)

heuristics_max_processes
       Maximum number of processes running at one time. (160)

heuristics_kill_list_interval
       Interval in ms at which process status is gathered and, if
       needed, a signal is sent to processes which didn't finish on
       time. (5000)

net_nss_db_dir
       NSS database directory. (/etc/corosync/qdevice/net/nssdb)

net_initial_msg_receive_size
       Initial (used during connection parameter negotiation) maximum
       size of the receive buffer for a message (maximum allowed
       message size received from qnetd). (32768)

net_initial_msg_send_size
       Initial (used during connection parameter negotiation) maximum
       size of one send buffer (message) to be sent to server. (32768)

net_min_msg_send_size
       Minimum required size of one send buffer (message) to be sent to
       server. (32768)

net_max_msg_receive_size
       Maximum allowed size of receive buffer for a message sent by
       server. (16777216)

net_max_send_buffers
       Maximum number of send buffers. (10)

net_nss_qnetd_cn
       Canonical name of qnetd server certificate. (Qnetd Server)

net_nss_client_cert_nickname
       NSS nickname of qdevice client certificate. (Cluster Cert)

net_heartbeat_interval_min
       Minimum heartbeat timeout accepted by client in ms. (1000)

net_heartbeat_interval_max
       Maximum heartbeat timeout accepted by client in ms. (120000)

net_min_connect_timeout
       Minimum connection timeout accepted by client in ms. (1000)

net_max_connect_timeout
       Maximum connection timeout accepted by client in ms. (120000)

net_test_algorithm_enabled
       Enable test algorithm. (on if built with --enable-debug,
       otherwise off)


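For completeness, the -S argument format from the synopsis (option=value[,option2=value2,...]) can be assembled as in the sketch below. The helper name is hypothetical, the values are purely illustrative, and no escaping of commas inside values is attempted since this page does not describe any. As noted earlier, most of these settings are not safe to change.

```python
def advanced_settings_args(settings):
    """Return the argv fragment ['-S', 'k=v,k2=v2'] for the given settings."""
    if not settings:
        return []
    pairs = ",".join(f"{name}={value}" for name, value in settings.items())
    return ["-S", pairs]


argv = ["corosync-qdevice", "-f"] + advanced_settings_args(
    {"heuristics_max_execs": 64, "net_max_send_buffers": 20}
)
print(argv)
# ['corosync-qdevice', '-f', '-S', 'heuristics_max_execs=64,net_max_send_buffers=20']
```
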
Define qdevice with the net model connecting to qnetd running on the
qnetd.example.org host, using the ffsplit algorithm. Heuristics are set
to sync mode and execute two commands.

quorum {
    provider: corosync_votequorum
    device {
        votes: 1
        model: net
        net {
            tls: on
            host: qnetd.example.org
            algorithm: ffsplit
        }
        heuristics {
            mode: sync
            exec_ping: /bin/ping -q -c 1 "www.example.org"
            exec_test_txt_exists: /usr/bin/test -f /tmp/test.txt
        }
    }
}

corosync-qdevice-tool(8) corosync-qdevice-net-certutil(8)
corosync-qnetd(8) corosync.conf(5)

Jan Friesse

2018-08-09                                                COROSYNC-QDEVICE(8)