EVS_OVERVIEW(8)   Corosync Cluster Engine Programmer's Manual  EVS_OVERVIEW(8)

NAME

       evs_overview - EVS Library Overview

OVERVIEW

       The EVS library is delivered with the corosync project.  This library
       is used to create distributed applications that operate properly during
       partitions, merges, and faults.

       The library provides mechanisms to:

       *  handle abstraction for multiple instances of an EVS library in one
          application
       *  deliver messages
       *  deliver configuration changes
       *  join one or more groups
       *  leave one or more groups
       *  send messages to one or more groups
       *  send messages to currently joined groups

       The EVS library implements a messaging model known as Extended Virtual
       Synchrony.  This model allows one sender to transmit to many receivers
       using standard UDP/IP.  UDP/IP is unreliable and unordered, so the EVS
       library adds ordering and reliability to messages.  Hardware multicast
       is used so that a packet addressed to two or more receivers is sent
       once rather than duplicated.  Messages lost or damaged in transit are
       recovered automatically by the library.

       The EVS library provides guarantees about message delivery and
       configuration change delivery.

DEFINITIONS

       multicast
              A multicast occurs when a network interface card sends a UDP
              packet to multiple receivers simultaneously.

       processor
              A processor is the entity that executes the extended virtual
              synchrony algorithms.

       configuration
              A configuration is the current description of the processors
              executing the extended virtual synchrony algorithm.

       configuration change
              A configuration change occurs when a new configuration is
              delivered.

       partition
              A partition occurs when a configuration splits into two or more
              configurations, or a processor fails or is stopped and leaves
              the configuration.

       merge  A merge occurs when two or more configurations join into a
              larger new configuration.  When a new processor starts up, it is
              treated as a configuration with only one processor and a merge
              occurs.

       fifo ordering
              A message is FIFO ordered when one sender and one receiver agree
              on the order of the messages sent.

       agreed ordering
              A message is AGREED ordered when all processors agree on the
              order of the messages sent.

       safe ordering
              A message is SAFE ordered when all processors agree on the order
              of messages sent and those messages are not delivered until all
              processors have a copy of the message to deliver.

       virtual synchrony
              Virtual synchrony is obtained when all processors agree on the
              order of messages sent and configuration changes sent for each
              new configuration.
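       The difference between agreed and safe ordering defined above can be
       illustrated with a small Python sketch.  It is not part of the EVS
       API.  The function names are hypothetical, and the sketch assumes a
       global view of what every processor has received, which a real
       processor would only learn through the protocol.

```python
def agreed_deliverable(received, delivered_up_to):
    """Sequence numbers one processor may deliver under agreed ordering.

    received: set of sequence numbers this processor holds.
    delivered_up_to: highest sequence number already delivered.
    Delivery follows sequence order and stops at the first gap.
    """
    out = []
    seq = delivered_up_to + 1
    while seq in received:
        out.append(seq)
        seq += 1
    return out


def contiguous_high(received):
    """Highest sequence number held with no gap, counting from 1."""
    seq = 0
    while seq + 1 in received:
        seq += 1
    return seq


def safe_deliverable(all_received, me, delivered_up_to):
    """Sequence numbers `me` may deliver under safe ordering.

    all_received maps each processor name to the set of sequence numbers
    it holds.  A message is safe once every processor has a copy of it
    and of everything ordered before it.
    """
    limit = min(contiguous_high(r) for r in all_received.values())
    candidates = agreed_deliverable(all_received[me], delivered_up_to)
    return [s for s in candidates if s <= limit]


state = {
    "A": {1, 2, 3, 4},
    "B": {1, 2, 4},      # B is missing message 3
    "C": {1, 2, 3},
}
print(agreed_deliverable(state["A"], 0))   # [1, 2, 3, 4]
print(agreed_deliverable(state["B"], 0))   # [1, 2]
print(safe_deliverable(state, "A", 0))     # [1, 2]
```

       Processor A could deliver all four messages under agreed ordering,
       but under safe ordering it holds back messages 3 and 4 until B has
       recovered message 3.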

USING VIRTUAL SYNCHRONY

       The virtual synchrony messaging model has many benefits for developing
       distributed applications.  Applications designed around replication
       benefit the most.  Applications that must be able to partition and
       merge also benefit from the virtual synchrony messaging model.

       All applications receive a copy of transmitted messages even if there
       are errors on the transmission media.  This allows optimizations when
       every processor must receive a copy of the message for replication.

       All messages are ordered according to agreed ordering.  This avoids
       race conditions.  Consider a lock service implemented over several
       processors.  Two requests, A and B, occur at the same time on two
       separate processors.  The requests are delivered to every processor
       in the same order.  All processors therefore get request A before
       request B and can reject request B.  Any type of creation or deletion
       of a shared data structure can benefit from this mechanism.
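       The lock service example can be sketched in Python as follows.  This
       is an illustration only and does not use the EVS API.  The LockReplica
       class, the node names, and the lock name are all hypothetical, and the
       agreed order is supplied by hand rather than by the library.

```python
class LockReplica:
    """One replica of a hypothetical lock table, driven by agreed-order messages."""

    def __init__(self):
        self.owner = {}          # lock name -> processor holding it

    def deliver(self, sender, lock_name):
        """Handle one delivered lock request; return True if granted."""
        if lock_name in self.owner:
            return False         # a request ordered earlier already won
        self.owner[lock_name] = sender
        return True


# Requests A and B are sent at the same time from two processors.
# Agreed ordering fixes one order for everyone, here A before B.
agreed_order = [("node1", "db"), ("node2", "db")]

replicas = {name: LockReplica() for name in ("node1", "node2", "node3")}
results = {
    name: [replica.deliver(sender, lock) for sender, lock in agreed_order]
    for name, replica in replicas.items()
}
print(results)   # every replica grants the first request and rejects the second
```

       Because every replica applies the same requests in the same order,
       no replica needs to consult another to decide who holds the lock.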

       Self delivery ensures that messages sent by a processor are also
       delivered back to that processor.  This allows the sending processor
       to execute logic when the message is self delivered according to
       agreed ordering and the virtual synchrony rules.  It also permits all
       logic to be placed in one message handler instead of two separate
       places.

       Virtual synchrony allows the current configuration to be used to make
       decisions in partitions and merges.  Since the configuration is sent in
       the stream of messages to the application, the application can alter
       its behavior based upon the configuration changes.
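       One way an application might react to configuration changes in the
       message stream is sketched below in Python.  The majority rule, the
       assumed cluster size, and every name in the sketch are illustrative
       assumptions and are not prescribed by the EVS library.

```python
TOTAL_NODES = 5   # assumed cluster size, used only for this illustration


class Replica:
    def __init__(self):
        self.members = frozenset()
        self.log = []

    def has_majority(self):
        return len(self.members) > TOTAL_NODES // 2

    def on_config_change(self, members):
        self.members = frozenset(members)

    def on_message(self, sender, payload):
        # Apply updates only while this configuration holds a majority.
        if self.has_majority():
            self.log.append((sender, payload))
            return True
        return False


# Events arrive in one ordered stream: configurations and messages interleaved.
stream = [
    ("config", ["n1", "n2", "n3", "n4", "n5"]),
    ("msg", "n1", "x=1"),
    ("config", ["n1", "n2"]),          # partition: only two of five remain
    ("msg", "n2", "x=2"),
    ("config", ["n1", "n2", "n3"]),    # merge: majority regained
    ("msg", "n3", "x=3"),
]

replica = Replica()
for event in stream:
    if event[0] == "config":
        replica.on_config_change(event[1])
    else:
        replica.on_message(event[1], event[2])

print(replica.log)   # [('n1', 'x=1'), ('n3', 'x=3')]
```

       The update sent while only two processors were present is ignored,
       because the configuration delivered just before it did not hold a
       majority.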

ARCHITECTURE AND ALGORITHM

       The EVS library is a thin IPC interface to the corosync executive.  The
       corosync executive provides services for the SA Forum AIS libraries as
       well as the EVS library.

       The corosync executive uses a ring protocol and membership protocol to
       send messages according to the semantics required by extended virtual
       synchrony.  The ring protocol creates a virtual ring of processors.  A
       token is rotated around the ring of processors.  When the token is
       possessed by a processor, that processor may multicast messages to
       other processors in the system.

       The token is called the ORF token (for ordering, reliability, flow
       control).  The ORF token orders all messages by increasing a sequence
       number every time a message is multicast.  In this way, an ordering is
       placed on all messages that all processors agree to.  The token also
       contains a retransmission list.  If a token is received by a processor
       that has not yet received a message it should have, that message's
       sequence number is added to the retransmission list.  A processor that
       has a copy of the message then retransmits the message.  The ORF token
       provides configuration-wide flow control by tracking the number of
       messages sent and limiting the number of messages that may be sent by
       one processor on each possession of the token.
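       The three roles of the ORF token can be illustrated with a toy
       simulation in Python.  This is a simplified sketch and not the
       corosync implementation.  The class and function names, the per-visit
       limit of two messages, and the way message loss is modelled are all
       assumptions made for the example.

```python
class Token:
    def __init__(self):
        self.seq = 0      # highest sequence number assigned so far
        self.rtr = []     # retransmission list: sequence numbers someone is missing


class Processor:
    def __init__(self, name, outbox):
        self.name = name
        self.outbox = list(outbox)   # payloads waiting to be sent
        self.received = {}           # seq -> payload


def multicast(ring, seq, payload, drops):
    """Send to every processor (including the sender) unless (seq, name) is dropped."""
    for p in ring:
        if (seq, p.name) in drops:
            drops.discard((seq, p.name))   # drop only the first transmission
        else:
            p.received[seq] = payload


def hold_token(proc, token, ring, drops, max_per_visit=2):
    # 1. Reliability: retransmit requested messages this processor has a copy of.
    for seq in list(token.rtr):
        if seq in proc.received:
            multicast(ring, seq, proc.received[seq], drops)
            token.rtr.remove(seq)
    # 2. Reliability: request anything this processor should have but does not.
    for seq in range(1, token.seq + 1):
        if seq not in proc.received and seq not in token.rtr:
            token.rtr.append(seq)
    # 3. Ordering and flow control: send new messages, at most max_per_visit.
    for _ in range(min(max_per_visit, len(proc.outbox))):
        token.seq += 1
        multicast(ring, token.seq, proc.outbox.pop(0), drops)


ring = [Processor("A", ["a1", "a2", "a3"]), Processor("B", ["b1"]), Processor("C", [])]
drops = {(2, "C")}            # C misses the first transmission of message 2
token = Token()
for _ in range(3):            # three full rotations of the token
    for proc in ring:
        hold_token(proc, token, ring, drops)

for proc in ring:
    print(proc.name, [proc.received[s] for s in sorted(proc.received)])
```

       In this run, A may send only two of its three messages on its first
       possession of the token, C asks for message 2 through the
       retransmission list, and A retransmits it on the next rotation.  All
       three processors end with the same four messages in the same order.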

       The membership protocol is responsible for ring formation and detecting
       when a processor within a ring has failed.  If the token fails to make
       a rotation within a timeout period known as the token rotation timeout,
       the membership protocol will form a new ring.  If a new processor
       starts, it will also form a new ring.  Two or more configurations may
       be used to form a new ring, allowing many partitions to merge together
       into one new configuration.

PERFORMANCE

       The EVS library obtains 8.5MB/sec throughput on 100 Mbit network links
       with many processors.  Larger messages obtain better throughput results
       because the time to access Ethernet is about the same for a small
       message as it is for a larger message.  Smaller messages obtain a
       better rate of messages per second, because a small message still
       takes somewhat less time to send than a larger one.

       80% of CPU utilization is due to encryption and authentication.  The
       corosync executive can be built without encryption and authentication
       for those with no security requirements and low CPU utilization
       requirements.  Even without encryption or authentication, under heavy
       load, processor utilization can reach 25% on 1.5 GHz processors.

       The current corosync executive supports 16 processors.  Support for
       more processors is possible by changing defines in the corosync
       executive, but this is untested.

SECURITY

       The EVS library encrypts all messages sent over the network using the
       SOBER-128 stream cipher.  The EVS library uses HMAC with SHA1 to
       authenticate all messages.  The EVS library also uses SOBER-128 as a
       pseudorandom number generator (PRNG), and seeds the PRNG from the
       /dev/random Linux device.
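       The authentication step can be illustrated with Python's standard
       hmac module.  This sketch shows HMAC with SHA1 only.  The packet
       layout (tag followed by payload), the key size, and the function
       names are assumptions and do not describe the corosync wire format.
       SOBER-128 encryption is omitted because the Python standard library
       does not provide it.

```python
import hashlib
import hmac
import os

DIGEST_LEN = hashlib.sha1().digest_size   # 20 bytes for SHA1


def seal(key, payload):
    """Prepend an HMAC-SHA1 tag to the payload."""
    tag = hmac.new(key, payload, hashlib.sha1).digest()
    return tag + payload


def open_sealed(key, packet):
    """Return the payload if the tag verifies, otherwise None."""
    tag, payload = packet[:DIGEST_LEN], packet[DIGEST_LEN:]
    expected = hmac.new(key, payload, hashlib.sha1).digest()
    return payload if hmac.compare_digest(tag, expected) else None


key = os.urandom(32)                      # stand-in for the shared cluster key
packet = seal(key, b"join group A")
print(open_sealed(key, packet))           # b'join group A'

tampered = packet[:-1] + bytes([packet[-1] ^ 1])
print(open_sealed(key, tampered))         # None
```

       A receiver that does not hold the shared key cannot produce a valid
       tag, and a packet altered in transit fails verification and is
       discarded.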

BUGS

       This software is not yet production quality, so there may still be
       some bugs.  There appear to be very few, however, since no new bugs
       are being reported at this point.

SEE ALSO

       evs_initialize(3), evs_finalize(3), evs_fd_get(3), evs_dispatch(3),
       evs_join(3), evs_leave(3), evs_mcast_joined(3), evs_mcast_groups(3),
       evs_membership_get(3), evs_context_get(3), evs_context_set(3)

corosync Man Page                 2004-08-31                   EVS_OVERVIEW(8)