openais.conf(5)

OPENAIS_CONF(5)           Openais Programmer's Manual          OPENAIS_CONF(5)

NAME

       openais.conf - openais executive configuration file

SYNOPSIS

       /etc/ais/openais.conf

DESCRIPTION

       The openais.conf file supplies the openais executive with the various
       parameters needed to control its operation.  The configuration file
       consists of bracketed top level directives.  The possible directive
       choices are totem { }, logging { }, event { }, and amf { }.  These
       directives are described below.
19
20
21       totem { }
22              This top level directive contains configuration options for  the
23              totem protocol.
24
25       logging { }
26              This top level directive contains configuration options for log‐
27              ging.
28
29       event { }
30              This top level directive contains configuration options for  the
31              event service.
32
33       amf { }
34              This  top level directive contains configuration options for the
35              AMF service.
36
37
       Within the totem directive, an interface directive is required.  One
       configuration option, version, is also required; it is described with
       the other totem options below.

       Within the interface sub-directive of totem there are four required
       parameters:
43
44
45       ringnumber
46              This specifies the ring number for the  interface.   When  using
47              the redundant ring protocol, each interface should specify sepa‐
48              rate ring numbers to uniquely identify to the membership  proto‐
49              col which interface to use for which redundant ring.
50
51
52       bindnetaddr
              This specifies the address to which the openais executive should
              bind.  This address should always end in zero.  If the totem
              traffic should be routed over 192.168.5.92, set bindnetaddr to
              192.168.5.0.

              This may also be an IPv6 address, in which case IPv6 networking
              will be used.  In this case, the full address must be specified
              and there is no automatic selection of the network interface
              within a specific subnet as with IPv4.

              If IPv6 networking is used, the nodeid field must be specified.
64
65
66       mcastaddr
              This is the multicast address used by the openais executive.  The
              default should work for most networks, but the network
              administrator should be consulted about which multicast address
              to use.  Avoid 224.x.x.x because this is a "config" multicast
              address.

              This may also be an IPv6 multicast address, in which case IPv6
              networking will be used.  If IPv6 networking is used, the nodeid
              field must be specified.
75
76
77       mcastport
78              This specifies the UDP port number.  It is possible to  use  the
79              same  multicast  address  on a network with the openais services
80              configured for different UDP ports.
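
       As a minimal sketch, the required totem options described above might
       be combined as follows.  The multicast address and port are
       illustrative assumptions, not recommendations, and the key: value
       layout and # comment lines are assumed from typical openais sample
       files.

```conf
totem {
	version: 2
	interface {
		ringnumber: 0
		# Network address for a host whose address is 192.168.5.92
		bindnetaddr: 192.168.5.0
		# Illustrative values; consult the network administrator
		mcastaddr: 226.94.1.1
		mcastport: 5405
	}
}
```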
81
82
       Within the totem directive, there are seven configuration options, of
       which one is required, five are optional, and one is required only when
       IPv6 is configured in the interface sub-directive.  The required option
       controls the version of the totem configuration.  The nodeid option,
       which is optional unless IPv6 is used, controls identification of the
       processor.  The optional options control secrecy and authentication,
       the redundant ring mode of operation, the maximum network MTU, the
       number of sending threads, and the virtual synchrony filter type.
91
92
93       version
94              This specifies the version of the configuration file.  Currently
95              the only valid version for this directive is 2.
96
97
       nodeid This configuration option is optional when using IPv4 and
              required when using IPv6.  This is a 32 bit value specifying the
              node identifier delivered to the cluster membership service.  If
              this is not specified with IPv4, the node id will be determined
              from the 32 bit IP address to which the system is bound on the
              ring with identifier 0.  The node identifier value of zero is
              reserved and should not be used.
105
106
107       secauth
108              This specifies that HMAC/SHA1 authentication should be  used  to
109              authenticate  all  messages.  It further specifies that all data
110              should be encrypted with the sober128  encryption  algorithm  to
111              protect data from eavesdropping.
112
113              Enabling this option adds a 36 byte header to every message sent
114              by totem which reduces total throughput.  Encryption and authen‐
115              tication  consume  75% of CPU cycles in aisexec as measured with
116              gprof when enabled.
117
              For 100Mbit networks with 1500 MTU frame transmissions: A
              throughput of 9MB/sec is possible with 100% CPU utilization when
              this option is enabled on 3GHz CPUs.  A throughput of 10MB/sec
              is possible with 20% CPU utilization when this option is
              disabled on 3GHz CPUs.

              For gigabit Ethernet networks with large frame transmissions: A
              throughput of 20MB/sec is possible when this option is enabled
              on 3GHz CPUs.  A throughput of 60MB/sec is possible when this
              option is disabled on 3GHz CPUs.
128
129              The default is on.
130
131
132       rrp_mode
133              This  specifies  the  mode of redundant ring, which may be none,
134              active, or passive.  Active replication  offers  slightly  lower
135              latency from transmit to delivery in faulty network environments
136              but with less performance.  Passive replication may nearly  dou‐
137              ble  the  speed  of  the  totem protocol if the protocol doesn't
138              become cpu bound.  The final option is none, in which case  only
139              one  network  interface will be used to operate the totem proto‐
140              col.
141
142              If only one interface directive is specified, none is  automati‐
143              cally  chosen.   If multiple interface directives are specified,
144              only active or passive may be chosen.
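
              As a sketch of the multiple-interface case, a redundant ring
              setup might declare two interface directives with distinct ring
              numbers and select passive replication.  The subnets, multicast
              addresses, and port below are hypothetical placeholders, and the
              key: value layout is assumed from typical openais sample files.

```conf
totem {
	version: 2
	rrp_mode: passive
	interface {
		ringnumber: 0
		bindnetaddr: 192.168.5.0
		mcastaddr: 226.94.1.1
		mcastport: 5405
	}
	interface {
		ringnumber: 1
		bindnetaddr: 192.168.6.0
		mcastaddr: 226.94.1.2
		mcastport: 5405
	}
}
```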
145
146
       netmtu This specifies the network maximum transmission unit.  Setting
              this value beyond 1500, the regular frame MTU, requires ethernet
              devices that support large (also called jumbo) frames.  If any
              device in the network doesn't support large frames, the protocol
              will not operate properly.  The hosts must also have their MTU
              size changed from 1500 to whatever frame size is specified here.

              Please note that while some NICs or switches claim large frame
              support, they support 9000 MTU as the maximum frame size
              including the IP header.  Setting the netmtu and host MTUs to
              9000 will cause totem to use the full 9000 bytes of the frame.
              Linux will then add an 18 byte header, moving the full frame
              size to 9018.  As a result some hardware will not operate
              properly with this size of data.  A netmtu of 8982 seems to work
              for the few large frame devices that have been tested.  Some
              manufacturers claim large frame support when in fact they
              support frame sizes of 4500 bytes.
164
165              Increasing the MTU from 1500 to 8982 doubles throughput  perfor‐
166              mance  from  30MB/sec to 60MB/sec as measured with evsbench with
167              175000 byte messages with the secauth directive set to off.
168
169              When sending multicast traffic, if the network frequently recon‐
170              figures,  chances  are  that  some device in the network doesn't
171              support large frames.
172
173              Choose hardware carefully if intending to use large  frame  sup‐
174              port.
175
176              The default is 1500.
177
178
179       threads
180              This directive controls how many threads are used to encrypt and
181              send multicast messages.  If secauth is off, the  protocol  will
182              never  use  threaded  sending.  If secauth is on, this directive
183              allows systems to be  configured  to  use  multiple  threads  to
184              encrypt and send multicast messages.
185
              A threads directive of 0 indicates that no threaded send should
              be used.  This mode offers best performance for non-SMP systems.
188
189              The default is 0.
190
191
192       vsftype
              This directive controls the virtual synchrony filter type used
              to identify a primary component.  The preferred choice is YKD
              dynamic linear voting; however, for clusters larger than 32
              nodes YKD consumes a lot of memory.  For large scale clusters
              that are created by changing the MAX_PROCESSORS_COUNT #define in
              the C code totem.h file, the virtual synchrony filter "none" is
              recommended, but then the AMF and DLCK services (which are
              currently experimental) are not safe for use.
201
202              The default is ykd.  The vsftype can also be set to none.
203
       Within the totem directive, there are several configuration options
       which are used to control the operation of the protocol.  It is
       generally not recommended to change any of these values without proper
       guidance and sufficient testing.  Some networks may require larger
       values if suffering from frequent reconfigurations.  Some applications
       may require faster failure detection times which can be achieved by
       reducing the token timeout.
211
212
       token  This timeout specifies the time in milliseconds after which a
              token loss is declared when no token has been received.  This is
              the time spent detecting a failure of a processor in the current
              configuration.  Reforming a new configuration takes about 50
              milliseconds in addition to this timeout.
218
219              The default is 1000 milliseconds.
220
221
222       token_retransmit
              This timeout specifies the time in milliseconds without
              receiving a token after which the token is retransmitted.  This
              will be automatically calculated if token is modified.  It is
              not recommended to alter this value without guidance from the
              openais community.
228
229              The default is 238 milliseconds.
230
231
232       hold   This timeout specifies in milliseconds how long the token should
233              be held by the representative when the  protocol  is  under  low
234              utilization.   It is not recommended to alter this value without
235              guidance from the openais community.
236
237              The default is 180 milliseconds.
238
239
240       retransmits_before_loss
              This value identifies how many token retransmits should be
              attempted before forming a new configuration.  If this value is
              set, token_retransmit and hold will be automatically calculated
              from retransmits_before_loss and token.
245
246              The default is 4 retransmissions.
247
248
249       join   This timeout specifies in milliseconds how long to wait for join
250              messages in the membership protocol.
251
252              The default is 100 milliseconds.
253
254
255       send_join
              This timeout specifies in milliseconds the upper limit of the
              range, from 0 to send_join, to wait before sending a join
              message.  For configurations with fewer than 32 nodes, this
              parameter is not necessary.  For larger rings, this parameter is
              necessary to ensure the NIC is not overflowed with join messages
              on formation of a new ring.  A reasonable value for large rings
              (128 nodes) would be 80 milliseconds.  Other timer values must
              also change if this value is changed.  Seek advice from the
              openais mailing list if trying to run larger configurations.
265
266              The default is 0 milliseconds.
267
268
269       consensus
270              This timeout specifies in milliseconds how long to wait for con‐
271              sensus  to be achieved before starting a new round of membership
272              configuration.
273
274              The default is 200 milliseconds.
275
276
277       merge  This timeout specifies in milliseconds how long to  wait  before
278              checking  for  a  partition  when  no multicast traffic is being
279              sent.  If multicast traffic is being sent, the  merge  detection
280              happens automatically as a function of the protocol.
281
282              The default is 200 milliseconds.
283
284
285       downcheck
286              This  timeout  specifies in milliseconds how long to wait before
287              checking that a network interface is back up after it  has  been
288              downed.
289
              The default is 1000 milliseconds.
291
292
293       fail_to_recv_const
              This constant specifies how many rotations of the token may
              occur without receiving any messages, when messages should be
              received, before a new configuration is formed.
297
298              The default is 50 failures to receive a message.
299
300
301       seqno_unchanged_const
302              This  constant specifies how many rotations of the token without
303              any multicast traffic should occur before  the  merge  detection
304              timeout is started.
305
306              The default is 30 rotations.
307
308
309       heartbeat_failures_allowed
              [HeartBeating mechanism] Configures the optional HeartBeating
              mechanism for faster failure detection.  Keep in mind that
              engaging this mechanism in lossy networks could cause faulty
              loss declaration, as the mechanism relies on the network for
              heartbeating.

              As a rule of thumb, use this mechanism if you require improved
              failure detection in networks with low to medium utilization.

              This constant specifies the number of heartbeat failures the
              system should tolerate before declaring heartbeat failure, e.g.
              3.  If this value is not set or is 0, the heartbeat mechanism is
              not engaged in the system and token rotation is the method of
              failure detection.
324
325              The default is 0 (disabled).
326
327
328       max_network_delay
              [HeartBeating mechanism] This constant specifies in milliseconds
              the approximate delay that your network takes to transport one
              packet from one machine to another.  This value is to be set by
              system engineers.  Do not change it unless you are sure, as it
              affects the failure detection mechanism using heartbeat.
334
335              The default is 50 milliseconds.
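
              A hedged sketch of engaging the heartbeat mechanism follows.  It
              uses the example tolerance of 3 failures mentioned above and the
              default network delay; the interface directive is omitted for
              brevity, and the key: value layout is assumed from typical
              openais sample files.

```conf
totem {
	version: 2
	# Any non-zero value engages heartbeating; 0 leaves it disabled
	heartbeat_failures_allowed: 3
	# Approximate one-way packet delay in milliseconds (the default)
	max_network_delay: 50
}
```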
336
337
338       window_size
              This constant specifies the maximum number of messages that may
              be sent on one token rotation.  If all processors perform
              equally well, this value could be large (300), which would
              introduce higher latency from origination to delivery for very
              large rings.  To reduce latency in large rings (16+), the
              defaults are a safe compromise.  If one or more slow processors
              are present among fast processors, window_size should be no
              larger than 256000 / netmtu to avoid overflow of the kernel
              receive buffers.  The user is notified of this by the display of
              a retransmit list in the notification logs.  There is no loss of
              data, but performance is reduced when these errors occur.
350
351              The default is 50 messages.
352
353
354       max_messages
355              This constant specifies the maximum number of messages that  may
356              be  sent by one processor on receipt of the token.  The max_mes‐
357              sages parameter is limited to 256000 / netmtu to  prevent  over‐
358              flow of the kernel transmit buffers.
359
360              The default is 17 messages.
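
              To illustrate the 256000 / netmtu limit, the sketch below
              assumes a large frame network using the netmtu of 8982
              discussed earlier, with some slow processors present.  The
              interface directive is omitted for brevity, and the # comment
              syntax is assumed from typical openais sample files.

```conf
totem {
	version: 2
	netmtu: 8982
	# 256000 / 8982 is about 28.5, so stay at or below 28
	window_size: 28
	# The default of 17 is already within the same limit
	max_messages: 17
}
```

              With the default netmtu of 1500 the same limit works out to
              about 170, so the default window_size of 50 is well within it.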
361
362
363       rrp_problem_count_timeout
364              This  specifies  the  time in milliseconds to wait before decre‐
365              menting the problem count by 1 for a particular ring to ensure a
366              link is not marked faulty for transient network failures.
367
368              The default is 1000 milliseconds.
369
370
371       rrp_problem_count_threshold
372              This  specifies the number of times a problem is detected with a
373              link before setting the link faulty.  Once a link is set faulty,
374              no  more data is transmitted upon it.  Also, the problem counter
375              is no longer decremented when the problem count timeout expires.
376
              A problem is detected whenever all tokens from the preceding
              processor have not been received within the
              rrp_token_expired_timeout.  The rrp_problem_count_threshold *
              rrp_token_expired_timeout should be at least 50 milliseconds less
              than the token timeout, or a complete reconfiguration may occur.
382
383              The default is 20 problem counts.
384
385
386       rrp_token_expired_timeout
387              This specifies the time in milliseconds to increment the problem
388              counter  for  the  redundant  ring  protocol  after  not  having
389              received a token from all rings for a particular processor.
390
              This value will automatically be calculated from the token
              timeout and rrp_problem_count_threshold but may be overridden.
              It is not recommended to override this value without guidance
              from the openais community.
395
396              The default is 47 milliseconds.
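
       As a worked check of the relationship between the redundant ring
       timers, the sketch below writes out the documented defaults
       explicitly.  The interface directives are omitted for brevity, and the
       # comment syntax is assumed from typical openais sample files.

```conf
totem {
	version: 2
	token: 1000
	rrp_problem_count_threshold: 20
	rrp_token_expired_timeout: 47
	# 20 * 47 = 940 ms, which is 60 ms below the 1000 ms token timeout,
	# so the "at least 50 milliseconds less" rule is satisfied
}
```

       If token is lowered, the product of the two rrp values would need to
       shrink with it to keep the same margin.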
397
398
       Within the logging directive, there are eight configuration options
       which are all optional:
401
402       to_stderr
403
404       to_file
405
406       to_syslog
407              These specify the destination of logging output. Any combination
408              of these options may be specified. Valid options are yes and no.
409
410              The default is syslog and stderr.
411
412
413
414       logfile
              If the to_file directive is set to yes, this option specifies
              the pathname of the log file.
417
418              No default.
419
420
421       debug  This specifies whether debug output is logged for all  services.
422              This  is generally a bad idea, unless there is some specific bug
423              or problem that must be found in the executive. Set the value to
424              on  to  debug, off to turn off debugging. If enabled, individual
425              loggers can be disabled using a logger_subsys directive.
426
427              The default is off.
428
429
430       timestamp
431              This specifies that a timestamp is placed on all log messages.
432
433              The default is off.
434
435
436       fileline
437              This specifies that file and line should be printed  instead  of
438              logger name.
439
440              The default is off.
441
442
443       syslog_facility
              This specifies the syslog facility type that will be used for
              any messages sent to syslog.  Options are daemon, local0, local1,
              local2, local3, local4, local5, local6 and local7.
447
448              The default is daemon.
449
450
451       Within the logging directive, logger directives are optional.
452
       Within the logger_subsys sub-directive of logging there are four
       configuration options:
455
456
       subsys This specifies the subsystem identity (name) for which logging
              is specified.  This is the name used by a service in the
              log_init() call, e.g. 'CKPT'.  This directive is required.
460
461
462       debug  This specifies whether debug output is logged for this  particu‐
463              lar logger.
464
465              The default is off.
466
467
468       syslog_level
469              This  specifies  the syslog level for this particular subsystem.
470              Ignored if debug is on.  Possible values are: alert, crit, debug
471              (same as debug = on), emerg, err, info, notice, warning.
472
473              The default is: info.
474
475
476       tags   This  specifies  which tags should be traced for this particular
477              logger.  Set debug directive to on in order  to  enable  tracing
478              using tags.  Values are specified using a vertical bar as a log‐
479              ical OR separator:
480
481              enter|leave|trace1|trace2|trace3|...
482
483              The default is none.
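
       The logging options above might be combined as in the following
       sketch.  The log file path is a hypothetical placeholder, CKPT is the
       example subsystem name used above, and the key: value layout is
       assumed from typical openais sample files.

```conf
logging {
	to_stderr: no
	to_syslog: yes
	to_file: yes
	# Hypothetical path; there is no default
	logfile: /var/log/openais.log
	timestamp: on
	syslog_facility: daemon
	debug: off
	logger_subsys {
		subsys: CKPT
		# Tags are only traced when debug is on for this logger
		debug: on
		tags: enter|leave|trace1
	}
}
```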
484
485
       Within the event directive, there are two configuration options, both
       of which are optional:
488
489       delivery_queue_size
490              This  directive describes the full size of the outgoing delivery
491              queue to the application.  If applications are slow  to  process
492              messages,  they  will  be  delivered  event  loss  messages.  By
493              increasing this value, the applications that are slowly process‐
494              ing messages may have an opportunity to catch up.
495
496
497       delivery_queue_resume
498              This  directive describes when new events can be accepted by the
499              event service when the delivery queue count of pending  messages
500              has reached this value.  Please note this is not cluster wide.
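
       A sketch of an event directive follows.  Both values are hypothetical,
       since no defaults are stated here; they are chosen only so that the
       resume point sits below the full queue size.

```conf
event {
	# Hypothetical sizes, not documented defaults
	delivery_queue_size: 1000
	delivery_queue_resume: 500
}
```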
501
502
503       Within  the  amf  directive, there is one configuration option which is
504       optional:
505
506       mode   This can either contain the value  enabled  or  disabled.   When
507              enabled,  AMF  will  start  the  applications  specified  in the
508              /etc/ais/amf.conf file.  The default is disabled.
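
       For example, a minimal amf directive that starts the applications
       listed in /etc/ais/amf.conf might look like the following sketch,
       assuming the same key: value layout as the other directives.

```conf
amf {
	mode: enabled
}
```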
509
510

FILES

512       /etc/ais/openais.conf
513              The openais executive configuration file.
514
515       /etc/ais/amf.conf
516              The openais AMF configuration file.
517
518
