COROSYNC_OVERVIEW(8)    Corosync Cluster Engine Programmer's Manual    COROSYNC_OVERVIEW(8)

NAME
corosync_overview - Corosync overview

OVERVIEW
The corosync project implements a production-quality, "Revised BSD"
licensed version of the SA Forum's most recent Application Interface
Specification (AIS). The Application Interface Specification is a
software API and set of policies used to develop applications that
maintain service during faults. The API consists of the Availability
Management Framework (AMF), which provides application failover,
Cluster Membership (CLM), Checkpointing (CKPT), Eventing (EVT),
Messaging (MSG), and Distributed Locking (DLOCK).

Currently, Messaging is not fully implemented.

Faults occur for various reasons:

* Application Faults

* Middleware Faults

* Operating System Faults

* Hardware Faults

In the past, the major focus of high availability has been to mask
hardware faults. Faults in other components of the system went
unaddressed until AIS. AIS can mask many types of faults in
applications, middleware, operating systems, or even hardware by
providing a simple framework that lets developers create redundant
applications. These redundant applications can be distributed over
multiple nodes so that if any one node faults, another node can
recover.

Application programmers develop applications that periodically record
their state using the checkpointing service. When an active application
fails, a standby application recovers the state of the application.
This technique, called stateful application failover, is the
fundamental difference between corosync and the systems that came
before it. With stateful application failover, the end-application
user doesn't have to reload the application or redial a telephone. The
full state is recorded, so the end-application user sees no
interruption in service.

Because programmers can now distribute applications across multiple
processes or nodes, a mechanism must exist for them to communicate.
This mechanism is provided by two services. The event service provides
a publish/subscribe model for events. The messaging service provides
end-to-end messaging. Finally, a mechanism to synchronize access is
provided by the distributed lock service.

The corosync project also provides a group messaging toolkit called
EVS. The EVS service implements a messaging model known as Extended
Virtual Synchrony. This model allows one sender to transmit to many
receivers. Certain guarantees are provided for message and membership
delivery, which make virtual synchrony ideal for developing distributed
applications.

QUICKSTART
The corosync executive must be configured. The conf directory in the
source distribution contains several files that must be copied to the
/etc/corosync directory. If corosync is packaged by a distro, this step
may already be complete.

The directory contains the file corosync.conf. Please read the
corosync.conf(5) man page for details on the configuration options.
The corosync project will work out of the box with the default
configuration options, although the administrator may desire different
options.

The corosync executive uses cryptographic techniques to ensure the
authenticity and privacy of messages. For corosync to be secure and to
operate, a private key must be generated and shared with all
processors.

First, generate the key on one of the nodes:

unix# corosync-keygen
Corosync Cluster Engine Authentication key generator.
Gathering 1024 bits for key from /dev/random.
Press keys on your keyboard to generate entropy.
Writing corosync key to /etc/corosync/authkey.

After this operation, a private key will be in the file
/etc/corosync/authkey. This private key must be copied to every
processor in the cluster. If the private key isn't the same on every
node, the nodes with nonmatching private keys will not be able to join
the same configuration.

Copy the key to secure, transportable storage or use ssh to transmit
the key from node to node. Then install the key with the command:

unix# install -D --group=0 --owner=0 --mode=0400 \
      /path_to_authkey/authkey /etc/corosync/authkey

If the message "Invalid digest" appears from the corosync executive,
the keys are not consistent between processors.

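One way to catch an inconsistent key before "Invalid digest" shows up
is to compare a digest of the key file from each node. The following is
a minimal sketch, assuming a POSIX shell and the coreutils sha256sum
utility. The helper names key_digest and keys_match are hypothetical
and are not part of corosync.

```shell
# Hypothetical helpers (not part of corosync); assumes sha256sum exists.

# Print the SHA-256 digest of the key file given as $1.
key_digest() {
    sha256sum "$1" | cut -d' ' -f1
}

# Succeed only when both files are readable and have identical contents.
keys_match() {
    [ -r "$1" ] && [ -r "$2" ] &&
        [ "$(key_digest "$1")" = "$(key_digest "$2")" ]
}
```

On a live cluster, the digest of a remote copy could be gathered with
something like "ssh node2 sha256sum /etc/corosync/authkey" (node2 is a
placeholder host name) and compared with the local value, so the key
itself never has to be displayed.
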
Finally, run the corosync executive. If corosync is packaged by a
distro, it may be set to start on system start. It may also be turned
off by default, in which case the init script for corosync must be
enabled.

Once the executive is running, a list of the IP addresses of all
processors running the corosync executive and configured on the same
multicast address will appear. If they don't appear, there may be a
problem with multicast in the distro or hardware. If this happens,
participation in the corosync mailing list may help solve the problem.
The email address is openais@lists.osdl.org.

USING LIBRARIES
The corosync AIS libraries have header files which must be included in
the developer's application. Once the header file is included, the
developer can reference the AIS interfaces.

The corosync project recommends that distros place include files in
/usr/include/corosync. The following include lines must be added to
the application to use each of the following services:

#include <corosync/saClm.h>     For the Cluster Membership B.01.01 service.

#include <corosync/saCkpt.h>    For the Checkpointing B.01.01 service.

#include <corosync/saEvt.h>     For the Eventing B.01.01 service.

#include <corosync/ais_amf.h>   For the AMF A.01.01 service.

The corosync project recommends that distros place library files in
/usr/lib. The following link lines must be added to the LDFLAGS
section of the makefile:

-lsaClm     For the Cluster Membership B.01.01 service.

-lsaCkpt    For the Checkpointing B.01.01 service.

-lsaEvt     For the Eventing B.01.01 service.

-lsaAmf     For the AMF A.01.01 service.

-lais       Specify this to get access to all AIS libraries without
            specifying each library individually.

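The mapping from services to link flags listed above can be sketched as
a small shell helper for build scripts. This is only an illustration:
the function name ais_ldflags and the short service keywords (clm,
ckpt, evt, amf, all) are hypothetical and are not provided by corosync.

```shell
# Hypothetical helper: translate service keywords into the link flags
# listed above. Prints the flags on one line; fails on unknown keywords.
ais_ldflags() {
    flags=""
    for svc in "$@"; do
        case "$svc" in
            clm)  flags="$flags -lsaClm" ;;
            ckpt) flags="$flags -lsaCkpt" ;;
            evt)  flags="$flags -lsaEvt" ;;
            amf)  flags="$flags -lsaAmf" ;;
            all)  flags="$flags -lais" ;;
            *)    echo "unknown service: $svc" >&2; return 1 ;;
        esac
    done
    # Drop the leading space left by the first append.
    printf '%s\n' "${flags# }"
}
```

An application using membership and checkpointing might then be linked
with something like "cc -o app app.c $(ais_ldflags clm ckpt)", where
app.c stands in for the developer's own source file.
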
IPv6
The corosync project supports both IPv4 and IPv6 network addresses.
The entire cluster must use either IPv4 or IPv6 for the cluster
communication mechanism. In order to use IPv6, IPv6 addresses must be
specified in the bindnetaddr and mcastaddr fields in the configuration
file. The nodeid field must also be set.

An example of this is:

nodeid: 2
bindnetaddr: fec0::1:a800:4ff:fe00:20
mcastaddr: ff05::1

To configure a host for IPv6, use the ifconfig program to add
interfaces:

box20: ifconfig eth0 add fec0::1:a800:4ff:fe00:20/64
box30: ifconfig eth0 add fec0::1:a800:4ff:fe00:30/64

If the /64 is not specified, a route for the IPv6 network will not be
configured, which will cause significant problems. Make sure a route is
available for IPv6 traffic.

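Because a missing prefix length is easy to overlook, a provisioning
script could refuse addresses that lack one before calling ifconfig.
The following is a sketch under that assumption; has_prefix_length and
ipv6_add_command are hypothetical names, and the check only looks for a
trailing slash followed by one to three digits.

```shell
# Hypothetical guard: succeed only when the IPv6 address in $1 ends in
# an explicit prefix length such as /64 (one to three digits).
has_prefix_length() {
    case "$1" in
        *:*/[0-9]|*:*/[0-9][0-9]|*:*/[0-9][0-9][0-9]) return 0 ;;
        *) return 1 ;;
    esac
}

# Print the ifconfig command for interface $1 and address $2, refusing
# addresses without a prefix length. Nothing is executed here.
ipv6_add_command() {
    if has_prefix_length "$2"; then
        printf 'ifconfig %s add %s\n' "$1" "$2"
    else
        echo "refusing $2: no prefix length, so no route would be set up" >&2
        return 1
    fi
}
```

The helper only prints the command so it can be reviewed first. On
systems with iproute2, "ip -6 route show" is one way to confirm
afterwards that a route for the network exists.
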
ARCHITECTURE
The AIS libraries are a thin IPC interface to the corosync executive.
The corosync executive provides services for the SA Forum AIS libraries
as well as the EVS and CPG libraries.

The corosync executive uses the Totem extended virtual synchrony
protocol. The advantage to the end user is excellent performance
together with a proven, highly reliable protocol. This protocol
connects the processors in a configuration together so they may
communicate.

ENVIRONMENT VARIABLES
The corosync executive process uses four environment variables during
startup. If these environment variables are not set, defaults will be
used.

COROSYNC_MAIN_CONFIG_FILE
       This specifies the fully qualified path to the corosync
       configuration file.

       The default is /etc/corosync/corosync.conf.

COROSYNC_AMF_CONFIG_FILE
       This specifies the fully qualified path to the corosync
       Availability Management Framework configuration file.

       The default is /etc/corosync/amf.conf.

COROSYNC_DEFAULT_CONFIG_IFACE
       This specifies the LCRSO that is used to parse the configuration
       file. This allows other configuration file parsers to be
       implemented within the system.

       The default is to use the default corosync configuration file
       parser, which parses the format specified in corosync.conf(5).

COROSYNC_TOTEM_AUTHKEY_FILE
       This specifies the fully qualified path to the shared key used
       to authenticate and encrypt data used within the Totem protocol.

       The default is /etc/corosync/authkey.

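The fallback to defaults described above can be sketched in shell for
the three variables whose default paths are given. The helper name
corosync_effective_paths is hypothetical and is not a corosync tool.
The sketch also treats an empty value like an unset one, which is an
assumption about the executive's behavior rather than something stated
here.

```shell
# Hypothetical helper: print the paths that would be in effect, using
# the documented default when a variable is unset or empty.
corosync_effective_paths() {
    echo "main config: ${COROSYNC_MAIN_CONFIG_FILE:-/etc/corosync/corosync.conf}"
    echo "amf config:  ${COROSYNC_AMF_CONFIG_FILE:-/etc/corosync/amf.conf}"
    echo "authkey:     ${COROSYNC_TOTEM_AUTHKEY_FILE:-/etc/corosync/authkey}"
}
```

Exporting one of the variables before calling the helper, for example
"export COROSYNC_MAIN_CONFIG_FILE=/tmp/test/corosync.conf" (a
placeholder path), changes only that line of the output. The same
export placed before starting the executive is how an alternate file
would be selected.
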
SECURITY
The corosync executive optionally encrypts all messages sent over the
network using the SOBER-128 stream cipher. The corosync executive uses
HMAC with SHA1 to authenticate all messages. The corosync executive
library uses SOBER-128 as a pseudo-random number generator. The EVS
library seeds the PRNG from the /dev/random Linux device.

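To illustrate why mismatched keys lead to rejected messages, the sketch
below computes an HMAC-SHA1 tag with the openssl command-line tool and
checks it under a second key. It is an illustration of HMAC in general
and does not reproduce corosync's packet format. The function names
hmac_sha1 and verify_tag are hypothetical, and openssl is assumed to be
installed.

```shell
# Illustration only: this is NOT corosync's wire format. A key passed on
# the command line is visible to other local users, so never do this
# with a real authkey.

# Print the hex HMAC-SHA1 tag of message $2 under shared key $1.
hmac_sha1() {
    printf '%s' "$2" | openssl dgst -sha1 -hmac "$1" | awk '{print $NF}'
}

# Succeed only when tag $3 equals the tag recomputed from $1 and $2.
verify_tag() {
    [ "$(hmac_sha1 "$1" "$2")" = "$3" ]
}
```

A receiver holding the same key recomputes the same tag and accepts the
message. A receiver holding a different key computes a different tag,
which is the situation the "Invalid digest" message reports.
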
If membership messages can be captured by intruders, it is possible to
execute a denial of service (DoS) attack on the cluster. In this
scenario, the cluster is likely already compromised, and a DoS attack
is the least of the administrator's worries.

The security in corosync does not offer perfect forward secrecy because
the keys are reused. An intruder who captures packets in an automated
fashion may be able to determine the shared key. No such automated
attack has been published so far. In this scenario, the cluster is
likely already compromised, since the intruder is able to capture
transmitted data over the long term.

For security reasons, the corosync executive binary should NEVER be
setuid or setgid in the filesystem.

SAFTEST COMPLIANCE
The corosync libraries are now nearly compliant with every aspect of
the SA Forum's AIS specification. The AMF service, however, is not
compliant with the B.01.01 specification. The remaining services pass
most of the tests of the saftest suite against the B.01.01
specification.

BUGS
The messaging service is partially implemented and not suitable for
deployment. The distributed locking service is buggy and not suitable
for deployment. The Availability Management Framework is under
development and not suitable for deployment.

SEE ALSO
corosync.conf(5), corosync-keygen(8), evs_overview(8)

corosync Man Page                 2006-05-10                 COROSYNC_OVERVIEW(8)