COROSYNC-QDEVICE(8)         System Manager's Manual        COROSYNC-QDEVICE(8)

NAME

       corosync-qdevice - QDevice daemon

SYNOPSIS

       corosync-qdevice [-dfh] [-S option=value[,option2=value2,...]]

DESCRIPTION

       corosync-qdevice is a daemon running on each node of a cluster. It
       provides a configured number of votes to the quorum subsystem based on
       a third-party arbitrator's decision. Its primary use is to allow a
       cluster to sustain more node failures than standard quorum rules
       allow. It is recommended for clusters with an even number of nodes and
       highly recommended for two-node clusters.

OPTIONS

       -d     Turn on debug output without needing to change corosync.conf.

       -f     Do not daemonize; run in the foreground.

       -h     Show short help text.

       -S     Set advanced settings, described in the ADVANCED SETTINGS
              section below. This option should generally not be used,
              because most of the settings are not safe to change.

CONFIGURATION

       corosync-qdevice reads its configuration from the corosync.conf file.

       The main configuration is within the quorum.device sub-key. Each model
       also has its own configuration within a similarly named sub-key.

       model  Specifies the model to be used. This parameter is required.
              corosync-qdevice is modular and can support multiple models.
              The model defines what type of arbitrator is used. Currently
              only net is supported.

       timeout
              Specifies how often corosync-qdevice should call the
              votequorum_poll function. It is also used by the net model to
              adjust its heartbeat timeout. It is recommended that you don't
              change this value. Default is 10000.

       sync_timeout
              Specifies how often corosync-qdevice should call the
              votequorum_poll function during a sync phase. It is recommended
              that you don't change this value. Default is 30000.

       votes  The number of votes provided to the cluster by qdevice. Default
              is (number_of_nodes - 1) or, more generally,
              sum(votes_per_node) - 1.

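       As a rough illustration of the keys above, the following fragment
       spells out the documented defaults explicitly. It is a sketch, not a
       recommended configuration: it assumes a two-node cluster with one vote
       per node, so the default votes formula gives 2 - 1 = 1, and it omits
       the model-specific net sub-key that a working setup also needs.

       ```
       quorum {
         provider: corosync_votequorum
         device {
           # Required; net is currently the only supported model.
           model: net
           # Documented defaults, written out only for illustration.
           timeout: 10000
           sync_timeout: 30000
           # Assumed two-node cluster, one vote per node: 2 - 1 = 1.
           votes: 1
         }
       }
       ```
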
       The quorum.device.heuristics subkey holds the configuration of the
       heuristics. Heuristics are a set of commands executed locally on
       startup, on cluster membership change, on successful connection to
       corosync-qnetd and, optionally, at regular intervals. When all
       commands finish successfully (their exit code is zero) and on time,
       the heuristics have passed; otherwise they have failed. The heuristics
       result is sent to corosync-qnetd, where it is used in calculations to
       determine which partition should be quorate.

       timeout
              Specifies the maximum time in milliseconds that corosync-qdevice
              waits for the heuristics commands to finish. If a command
              doesn't finish before the timeout, it is killed and the
              heuristics fail. This timeout is used for heuristics executed
              at regular intervals. Default value is half of
              quorum.device.timeout, so 5000.

       sync_timeout
              Similar to quorum.device.heuristics.timeout but used during
              membership changes. Default value is half of
              quorum.device.sync_timeout, so 15000.

       interval
              Specifies the interval between two regular heuristics
              executions. Default value is 3 * quorum.device.timeout, so
              30000.

       mode   Can be one of on, sync or off and specifies the mode of
              operation of heuristics. Default is off, which means heuristics
              are disabled. When sync is set, heuristics are executed only
              during startup, on membership change and when the connection to
              corosync-qnetd is established. To also run heuristics at
              regular intervals, set this option to on.

       exec_NAME
              Defines an executable. NAME can be an arbitrary valid cmap key
              name string and has no special meaning. The value of this
              variable must contain a command to execute. The value is parsed
              (split) into arguments similarly to how a Bourne shell would do
              it. Quoting is possible by using backslash and double quotes.

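       A minimal sketch of a heuristics sub-key with regular execution
       enabled might look as follows. The timing values are simply the
       documented defaults written out; the exec_ names, the address
       192.0.2.254 and the file path are hypothetical and chosen only to show
       double-quote quoting of an argument that contains a space.

       ```
       quorum {
         device {
           model: net
           heuristics {
             # Run on startup, membership change, connect and regularly.
             mode: on
             # Documented defaults, in milliseconds.
             timeout: 5000
             sync_timeout: 15000
             interval: 30000
             # Hypothetical commands; each must exit with 0 to pass.
             exec_ping_gw: /bin/ping -q -c 1 192.0.2.254
             exec_flag_file: /usr/bin/test -f "/tmp/flag file.txt"
           }
         }
       }
       ```

       With mode set to sync instead, the same commands would run only on
       startup, on membership change and on connection to corosync-qnetd, so
       the regular-execution interval should not matter.
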
       The quorum.device.net subkey holds the configuration for model 'net'.

       tls    Can be one of on, off or required and specifies whether TLS
              should be used. on means a connection with TLS is attempted
              first, but if the server doesn't advertise TLS support then
              non-TLS will be used. off means TLS is not required and is not
              even tried. This mode is the only one which doesn't need a
              properly initialized NSS database. required means TLS is
              required and if the server doesn't support TLS, qdevice will
              exit with an error message. Default is on.

       host   Specifies the IP address or host name of the qnetd server to be
              used. This parameter is required.

       port   Specifies the TCP port of the qnetd server. Default is 5403.

       algorithm
              Decision algorithm. Can be one of ffsplit or lms (there are
              also test and 2nodelms, both of which are mainly for developers
              and shouldn't be used for production clusters). For a
              description of what each algorithm means and how the algorithms
              differ, see the MODEL NET ALGORITHMS section. Default value is
              ffsplit.

       tie_breaker
              Can be one of lowest, highest or valid_node_id (a number). It
              is used as a fallback if qdevice has to decide between two or
              more equal partitions. lowest means the partition with the
              lowest node id is chosen. highest means the partition with the
              highest node id is chosen. valid_node_id means that the
              partition containing the node with the given node id is chosen.
              Default is lowest.

       connect_timeout
              Timeout when corosync-qdevice is trying to connect to the
              corosync-qnetd host. Default is 0.8 * quorum.sync_timeout.

       force_ip_version
              Can be one of 0, 4 or 6 and forces the software to use the
              given IP version. 0 (the default) means IPv6 is preferred and
              IPv4 should be used as a fallback.

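       For illustration, a net sub-key that insists on TLS and names an
       explicit tie-breaker node could be sketched as below. The host name
       qnetd.example.org is a placeholder, and the node id 2 is hypothetical:
       it assumes a node with that id exists in the nodelist.

       ```
       quorum {
         device {
           model: net
           net {
             # Exit with an error if the server does not support TLS.
             tls: required
             host: qnetd.example.org
             # Documented default port, written out explicitly.
             port: 5403
             algorithm: ffsplit
             # Hypothetical node id; prefer the partition containing it.
             tie_breaker: 2
             # 0 (default) prefers IPv6 with IPv4 fallback; 4 forces IPv4.
             force_ip_version: 4
           }
         }
       }
       ```
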
       Logging configuration is within the logging directive.
       corosync-qdevice parses and supports most of the options, with the
       exception of to_logfile, logfile and logfile_priority. The
       logger_subsys sub-directive can also be used if subsys is set to
       QDEVICE.

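       A sketch of a logging directive that raises verbosity only for
       corosync-qdevice could look like this. It assumes the to_syslog and
       debug options described in corosync.conf(5); only the subsys value
       QDEVICE comes from this page.

       ```
       logging {
         to_syslog: yes
         logger_subsys {
           # Applies only to corosync-qdevice.
           subsys: QDEVICE
           debug: on
         }
       }
       ```
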
       For corosync-qdevice to work correctly, the nodelist directive has to
       be used and properly configured. The net model also requires that the
       totem.cluster_name option is set.
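
       A minimal sketch of these two prerequisites, assuming a two-node
       cluster: the cluster name, the node addresses (taken from the
       documentation address range) and the node ids are hypothetical, and
       the version, ring0_addr and nodeid keys are the ones described in
       corosync.conf(5).

       ```
       totem {
         version: 2
         # Required by the net model; the name itself is hypothetical.
         cluster_name: examplecluster
       }

       nodelist {
         node {
           ring0_addr: 192.0.2.1
           nodeid: 1
         }
         node {
           ring0_addr: 192.0.2.2
           nodeid: 2
         }
       }
       ```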

MODEL NET TLS CONFIGURATION

       For model net to work using TLS, it's necessary to create the NSS
       database, import the Qnetd CA certificate, and get/distribute a valid
       client certificate.

       If pcs is used (recommended), the following steps are not needed
       because pcs performs them automatically.

       corosync-qdevice-net-certutil is the tool to perform the required
       actions semi-automatically. Please consult its help output and its man
       page. For a first-time configuration it may make sense to start with
       the -Q option.

       If TLS is not required, just edit the corosync.conf file and set
       quorum.device.net.tls to off.

MODEL NET ALGORITHMS

       Algorithms change how corosync-qnetd provides votes to a given
       node/partition. Currently two algorithms are supported.

       ffsplit
              This one makes sense only for clusters with an even number of
              nodes. It provides exactly one vote to the partition with the
              highest number of active nodes. If there are two equal-sized
              partitions, it provides its vote to the partition with the
              higher score. The score is computed as
              (number_of_connected_nodes +
              number_of_connected_nodes_with_passed_heuristics -
              number_of_connected_nodes_with_failed_heuristics). If the
              scores are equal, the vote is provided to the partition with
              the most clients connected to the qnetd server. If this number
              is also equal, then the tie_breaker is used. It is able to
              transition its vote if the currently active partition becomes
              partitioned and a non-active partition still has at least 50%
              of the active nodes. Because of this, a vote is not provided if
              the qnetd connection is not active.

              To use this algorithm, the number of votes per node must be 1
              (the default) and the qdevice number of votes must also be 1.
              The latter is achieved by setting the quorum.device.votes key
              in the corosync.conf file to 1.

       lms    Last-man-standing. If the node is the only one left in the
              cluster that can see the qnetd server, then a vote is returned
              to it.

              If more than one node can see the qnetd server but some nodes
              can't see each other, then the cluster is divided into
              'partitions' based on their ring_id, and this algorithm returns
              a vote to the partition with the highest heuristics score
              (computed the same way as for the ffsplit algorithm). If more
              than one partition has the same score, the vote goes to the
              largest active partition; if more than one partition is equally
              large, it goes to the partition that contains the tie_breaker
              node (lowest, highest, etc). For LMS to work, the number of
              qdevice votes has to be left at its default (so just delete the
              quorum.device.votes key from corosync.conf).
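
       To make the ffsplit scoring concrete, the fragment below pairs a
       matching configuration with a worked score in its comments. The
       four-node cluster, its even split and the heuristics results are
       assumptions made up for this illustration; the host name is a
       placeholder.

       ```
       # Assumed 4-node cluster split into two partitions of 2 nodes each,
       # with all nodes still connected to qnetd:
       #   partition A: both nodes passed heuristics -> 2 + 2 - 0 = 4
       #   partition B: one passed, one failed       -> 2 + 1 - 1 = 2
       # A has the higher score, so under ffsplit A would get the vote.
       quorum {
         device {
           # ffsplit requires exactly one qdevice vote.
           votes: 1
           model: net
           net {
             host: qnetd.example.org
             algorithm: ffsplit
           }
         }
       }
       ```

       For lms, the same fragment would set algorithm to lms and drop the
       votes key so that the default number of votes applies.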

ADVANCED SETTINGS

       Set by using the -S option. The default value is shown in
       parentheses. Options beginning with the net_ prefix are specific to
       model net.

       lock_file
              Lock file location.
              (/var/run/corosync-qdevice/corosync-qdevice.pid)

       local_socket_file
              Internal IPC socket file location.
              (/var/run/corosync-qdevice/corosync-qdevice.sock)

       local_socket_backlog
              Parameter passed to the listen syscall. (10)

       max_cs_try_again
              How many times to retry a call to a corosync function which
              has returned CS_ERR_TRY_AGAIN. (10)

       votequorum_device_name
              Name used for qdevice registration. (Qdevice)

       ipc_max_clients
              Maximum allowed simultaneous IPC clients. (10)

       ipc_max_receive_size
              Maximum size of a message received by IPC client. (4096)

       ipc_max_send_size
              Maximum size of a message allowed to be sent to an IPC client.
              (65536)

       master_wins
              Force enable/disable master wins. (default is model)

       heuristics_ipc_max_send_buffers
              Maximum number of heuristics worker send buffers. (128)

       heuristics_ipc_max_send_receive_size
              Maximum size of a message allowed to be sent to, or received
              from, a heuristics worker. (4096)

       heuristics_min_timeout
              Minimum heuristics timeout accepted by client in ms. (1000)

       heuristics_max_timeout
              Maximum heuristics timeout accepted by client in ms. (120000)

       heuristics_min_interval
              Minimum heuristics interval accepted by client in ms. (1000)

       heuristics_max_interval
              Maximum heuristics interval accepted by client in ms. (3600000)

       heuristics_max_execs
              Maximum number of exec_ commands. (32)

       heuristics_use_execvp
              Use execvp instead of execv for executing commands. (off)

       heuristics_max_processes
              Maximum number of processes running at one time. (160)

       heuristics_kill_list_interval
              Interval, in ms, at which process status is gathered and, if
              needed, a signal is sent to processes which didn't finish on
              time. (5000)

       net_nss_db_dir
              NSS database directory. (/etc/corosync/qdevice/net/nssdb)

       net_initial_msg_receive_size
              Initial (used during connection parameter negotiation) maximum
              size of the receive buffer for a message (maximum allowed
              message size received from qnetd). (32768)

       net_initial_msg_send_size
              Initial (used during connection parameter negotiation) maximum
              size of one send buffer (message) to be sent to the server.
              (32768)

       net_min_msg_send_size
              Minimum required size of one send buffer (message) to be sent
              to the server. (32768)

       net_max_msg_receive_size
              Maximum allowed size of the receive buffer for a message sent
              by the server. (16777216)

       net_max_send_buffers
              Maximum number of send buffers. (10)

       net_nss_qnetd_cn
              Canonical name of qnetd server certificate. (Qnetd Server)

       net_nss_client_cert_nickname
              NSS nickname of qdevice client certificate. (Cluster Cert)

       net_heartbeat_interval_min
              Minimum heartbeat timeout accepted by client in ms. (1000)

       net_heartbeat_interval_max
              Maximum heartbeat timeout accepted by client in ms. (120000)

       net_min_connect_timeout
              Minimum connection timeout accepted by client in ms. (1000)

       net_max_connect_timeout
              Maximum connection timeout accepted by client in ms. (120000)

       net_test_algorithm_enabled
              Enable test algorithm. (on if built with --enable-debug,
              otherwise off)

EXAMPLE

       Define a qdevice with the net model connecting to qnetd running on the
       qnetd.example.org host, using the ffsplit algorithm. Heuristics are
       set to sync mode and execute two commands.

       quorum {
         provider: corosync_votequorum
         device {
           votes: 1
           model: net
           net {
             tls: on
             host: qnetd.example.org
             algorithm: ffsplit
           }
           heuristics {
             mode: sync
             exec_ping: /bin/ping -q -c 1 "www.example.org"
             exec_test_txt_exists: /usr/bin/test -f /tmp/test.txt
           }
         }
       }

SEE ALSO

       corosync-qdevice-tool(8) corosync-qdevice-net-certutil(8)
       corosync-qnetd(8) corosync.conf(5)

AUTHOR

       Jan Friesse

                                  2017-10-17               COROSYNC-QDEVICE(8)