o2cb(7)                       OCFS2 Manual Pages                       o2cb(7)

NAME
       o2cb - Default cluster stack of the OCFS2 file system.

DESCRIPTION
       o2cb is the default cluster stack of the OCFS2 file system. It is an
       in-kernel cluster stack that includes a node manager (o2nm) to keep
       track of the nodes in the cluster, a disk heartbeat agent (o2hb) to
       detect node liveness, a network agent (o2net) for intra-cluster node
       communication and a distributed lock manager (o2dlm) to keep track of
       lock resources. It also includes a synthetic file system, dlmfs, to
       allow applications to access the in-kernel dlm.

CONFIGURATION
       The stack is configured using the o2cb(8) cluster configuration utility
       and operated (online/offline/status) using the o2cb init service.

   CLUSTER CONFIGURATION
       The stack has two configuration files: one for the cluster layout
       (/etc/ocfs2/cluster.conf) and the other for the cluster timeouts,
       etc. (/etc/sysconfig/o2cb). More information about these two files
       can be found in ocfs2.cluster.conf(5) and o2cb.sysconfig(5).

       The o2cb cluster stack supports two heartbeat modes, local and
       global. Only one heartbeat mode can be active at any one time.

       Local heartbeat refers to disk heartbeating on all shared devices.
       In this mode, the heartbeat is started during mount and stopped
       during umount. This mode is easy to set up as it does not require
       configuring heartbeat devices. The one drawback of this mode is the
       overhead on servers having a large number of OCFS2 mounts. For
       example, a server with 50 mounts will have 50 heartbeat threads.
       This is the default heartbeat mode.

       Global heartbeat, on the other hand, refers to heartbeating on
       specific shared devices. These devices are normal OCFS2 formatted
       volumes that could also be mounted and used as clustered file
       systems. In this mode, the heartbeat is started during cluster
       online and stopped during cluster offline. While this mode can be
       used for all clusters, it is strongly recommended for clusters
       having a large number of mounts.

       More information on disk heartbeat is provided below.

   KERNEL CONFIGURATION
       Two sysctl values need to be set for o2cb to function properly.
       The first, panic_on_oops, must be enabled to turn a kernel oops
       into a panic. If a kernel thread required for o2cb to function
       crashes, the system must be reset to prevent a cluster hang. If
       it is not set, another node may not be able to distinguish
       whether a node is unable to respond or slow to respond.

       The other related sysctl parameter is panic, which specifies the
       number of seconds after a panic that the system will be
       auto-reset. Setting this parameter to zero disables autoreset; the
       cluster will require manual intervention. This is not preferred
       in a cluster environment.

       To manually enable panic on oops and set a 30 sec timeout for
       reboot on panic, do:

           # echo 1 > /proc/sys/kernel/panic_on_oops
           # echo 30 > /proc/sys/kernel/panic

       To enable the above on every boot, add the following to
       /etc/sysctl.conf:

           kernel.panic_on_oops = 1
           kernel.panic = 30
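
The two settings above can be checked with a small helper. This is a
minimal sketch, not part of the o2cb tools: the function name
check_panic_settings is hypothetical, and it only encodes the two
recommendations stated above (panic_on_oops enabled, panic not zero).

```shell
#!/bin/sh
# check_panic_settings OOPS PANIC   (hypothetical helper, not an o2cb command)
# OOPS  is the value of kernel.panic_on_oops
# PANIC is the value of kernel.panic
# Prints "ok" and returns 0 when both recommendations are met;
# otherwise prints the reason and returns 1.
check_panic_settings() {
    oops=$1
    panic=$2
    if [ "$oops" != "1" ]; then
        echo "panic_on_oops is '$oops', expected 1"
        return 1
    fi
    if [ -z "$panic" ] || [ "$panic" = "0" ]; then
        echo "panic is '$panic', autoreset is disabled"
        return 1
    fi
    echo "ok"
}
```

On a live system the current values can be read from the same /proc
files written above:

    check_panic_settings "$(cat /proc/sys/kernel/panic_on_oops)" \
                         "$(cat /proc/sys/kernel/panic)"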

   OS CONFIGURATION
       The o2cb cluster stack also requires iptables (firewalling) to
       be either disabled or modified to allow network traffic on the
       private network interface. The port used by o2cb is specified in
       /etc/ocfs2/cluster.conf.
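
As an illustration of modifying the firewall rather than disabling it,
the sketch below reads the port from cluster.conf and prints a matching
iptables rule without applying it. It assumes the port is stored under
an ip_port key as described in ocfs2.cluster.conf(5) and that o2cb
traffic is TCP; the function names and the interface name eth1 are
hypothetical.

```shell
#!/bin/sh
# o2cb_ports CONF   (hypothetical helper)
# Print each distinct ip_port value found in the given cluster.conf.
o2cb_ports() {
    awk -F= '/^[ \t]*ip_port[ \t]*=/ { gsub(/[ \t]/, "", $2); print $2 }' "$1" |
        sort -u
}

# print_o2cb_firewall_rules CONF IFACE   (hypothetical helper)
# Print, but do not run, one iptables rule per port that accepts
# incoming TCP traffic on the private interface.
print_o2cb_firewall_rules() {
    conf=$1
    iface=$2
    for port in $(o2cb_ports "$conf"); do
        echo "iptables -I INPUT -i $iface -p tcp --dport $port -j ACCEPT"
    done
}
```

Example:

    print_o2cb_firewall_rules /etc/ocfs2/cluster.conf eth1

Review the printed rules before running them as root, and persist them
using whatever mechanism the distribution provides.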

HEARTBEAT
       O2CB uses disk heartbeat to detect node liveness. The disk heartbeat
       thread, o2hb, periodically reads and writes to a heartbeat file in an
       OCFS2 file system. Its write payload contains a sequence number that
       it increments in each write. This allows other nodes reading the same
       heartbeat file to detect the change and associate that with a live
       node. Conversely, a node whose sequence number has stopped changing
       is marked as a possibly dead node. It is not yet confirmed dead,
       because the node could merely be experiencing slow I/O.

       To differentiate between a dead node and one that has slow I/O, O2CB
       has a disk heartbeat threshold (timeout). Only nodes whose sequence
       number has not incremented for that duration are marked dead.

       However, a node marked dead in this way may in fact be alive and just
       experiencing slow I/O. To guard against that, the heartbeat thread on
       each node keeps track of the time elapsed since its last completed
       write. If that time exceeds the timeout, the node forces a
       self-fence. It does so to ensure that other nodes never mark it as
       dead while it is still alive.

       This self-fencing scheme has proven to be very reliable as it relies
       on kernel timers and PCI bus reset. External fencing, while
       attractive, is rarely as reliable as it relies on external hardware
       and software that is prone to failure due to misconfiguration, etc.

       Having said that, O2CB disk heartbeat has had its share of problems
       with self-fencing. A node experiencing slow I/O on only one of
       multiple devices has to initiate a self-fence. This is because in
       the default local heartbeat scheme, nodes in a cluster may not be
       heartbeating on the same set of devices.

       The global heartbeat mode addresses this shortcoming by introducing a
       scheme that forces all nodes to heartbeat on the same set of devices.
       In this scheme, a node experiencing a slowdown in I/O on a device may
       not need to initiate a self-fence. It will only have to do so if it
       encounters a slowdown on 50% or more of the heartbeat devices. In a
       cluster with 3 heartbeat regions, a slowdown in 1 region will be
       tolerated. In a cluster with 5 regions, a slowdown in 2 will be
       tolerated.
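
The 50% rule above reduces to simple integer arithmetic. The sketch
below only restates that rule; both function names are hypothetical
and nothing here queries a running cluster.

```shell
#!/bin/sh
# tolerated_slow_regions REGIONS   (hypothetical helper)
# Largest number of slow heartbeat regions that stays below 50% of
# REGIONS, i.e. that does not force a self-fence.
tolerated_slow_regions() {
    echo $(( ($1 - 1) / 2 ))
}

# must_self_fence REGIONS SLOW   (hypothetical helper)
# Succeeds (returns 0) when SLOW regions amount to 50% or more of
# REGIONS, the condition under which the node has to self-fence.
must_self_fence() {
    [ $(( $2 * 2 )) -ge "$1" ]
}
```

For the two cases given above, tolerated_slow_regions 3 prints 1 and
tolerated_slow_regions 5 prints 2. With an even count such as 4
regions, 2 slow regions already reach 50%, so only 1 is tolerated.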

       For this reason, this mode is recommended for users that have 3 or
       more OCFS2 mounts.

       O2CB allows up to 32 heartbeat regions to be configured in the global
       heartbeat mode.

ONLINE CLUSTER MODIFICATION
       The O2CB cluster stack allows adding and removing nodes in an online
       cluster when run in the global heartbeat mode. Use the o2cb(8)
       utility to make the changes in the configuration and (re)online the
       cluster using the o2cb init script. The user must do the same on all
       nodes in the cluster. The cluster will not allow any new cluster
       mounts if the node configuration on all nodes is not the same.

       The removal of a node will only succeed if that node is no longer in
       use. If the user removes an active node from the configuration, the
       re-online will fail.

       The cluster stack also allows adding and removing heartbeat regions
       in an online cluster. Use the o2cb(8) utility to make the changes in
       the configuration file and (re)online the cluster using the o2cb init
       script. The user must do the same on all nodes in the cluster. The
       cluster will not allow any new cluster mounts if the heartbeat region
       configuration on all nodes is not the same.

       The removal of heartbeat regions will only succeed if the active
       heartbeat region count is greater than 3. This is to protect against
       edge conditions that can destabilize the cluster.
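
Because new mounts are refused when the configuration differs between
nodes, it can help to compare copies of the configuration file before
bringing the cluster back online. The following is a minimal sketch
under stated assumptions: the function name configs_match is
hypothetical, and the per-node copies are assumed to have been fetched
already (for example with scp) into local files.

```shell
#!/bin/sh
# configs_match REFERENCE COPY...   (hypothetical helper)
# Compare each COPY byte for byte against REFERENCE. Prints one
# "differs: FILE" line per mismatch and returns 1 if any copy
# differs, 0 if all are identical.
configs_match() {
    ref=$1
    shift
    status=0
    for copy in "$@"; do
        if ! cmp -s "$ref" "$copy"; then
            echo "differs: $copy"
            status=1
        fi
    done
    return $status
}
```

Example, with node2.conf and node3.conf being copies fetched from two
other nodes (hypothetical file names):

    configs_match /etc/ocfs2/cluster.conf node2.conf node3.conf

A byte-for-byte comparison is stricter than necessary, since it also
flags differences in whitespace, but it errs on the safe side.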

GETTING STARTED
       The first step in configuring o2cb is deciding whether to set up
       local or global heartbeat. For global heartbeat, one has to format
       at least one heartbeat device.

       To format an OCFS2 volume with global heartbeat enabled, do:

           # mkfs.ocfs2 --cluster-stack=o2cb --cluster-name=webcluster --global-heartbeat -L "hbvol1" /dev/sdb1

       Once formatted, set up /etc/ocfs2/cluster.conf following the example
       provided in ocfs2.cluster.conf(5).

       For local heartbeat, one can set up cluster.conf without any
       heartbeat devices. The next step is starting the cluster.

       To online the cluster stack, do:

           # service o2cb online
           Loading stack plugin "o2cb": OK
           Loading filesystem "ocfs2_dlmfs": OK
           Mounting ocfs2_dlmfs filesystem at /dlm: OK
           Setting cluster stack "o2cb": OK
           Registering O2CB cluster "webcluster": OK
           Setting O2CB cluster timeouts : OK
           Starting global heartbeat for cluster "webcluster": OK

       Once the cluster stack is online, new OCFS2 volumes can be formatted
       normally without specifying the cluster stack information.
       mkfs.ocfs2(8) will pick up that information automatically.

           # mkfs.ocfs2 -L "datavol" /dev/sdc1

       Meanwhile, existing volumes can be converted to the new cluster stack
       using the tunefs.ocfs2(8) utility.

           # tunefs.ocfs2 --update-cluster-stack /dev/sdd1
           Updating on-disk cluster information to match the running cluster.
           DANGER: YOU MUST BE ABSOLUTELY SURE THAT NO OTHER NODE IS USING THIS FILESYSTEM
           BEFORE MODIFYING ITS CLUSTER CONFIGURATION.
           Update the on-disk cluster information? y

       Another utility, mounted.ocfs2(8), is useful for listing all the
       OCFS2 volumes along with the cluster stack information.

       To get a list of OCFS2 volumes, do:

           # mounted.ocfs2 -d
           Device     Stack  Cluster     F  UUID                              Label
           /dev/sdb1  o2cb   webcluster  G  DCDA2845177F4D59A0F2DCD8DE507CC3  hbvol1
           /dev/sdc1  None                  23878C320CF3478095D1318CB5C99EED  localmount
           /dev/sdd1  o2cb   webcluster  G  8AB016CD59FC4327A2CDAB69F08518E3  webvol
           /dev/sdg1  o2cb   webcluster  G  77D95EF51C0149D2823674FCC162CF8B  logsvol
           /dev/sdh1  o2cb   webcluster  G  BBA1DBD0F73F449384CE75197D9B7098  scratch

       The o2cb init script can also be used to check the status of the
       cluster, offline the cluster, etc.

       To check the status of the cluster stack, do:

           # service o2cb status
           Driver for "configfs": Loaded
           Filesystem "configfs": Mounted
           Stack glue driver: Loaded
           Stack plugin "o2cb": Loaded
           Driver for "ocfs2_dlmfs": Loaded
           Filesystem "ocfs2_dlmfs": Mounted
           Checking O2CB cluster "webcluster": Online
             Heartbeat dead threshold: 62
             Network idle timeout: 60000
             Network keepalive delay: 2000
             Network reconnect delay: 2000
             Heartbeat mode: Global
           Checking O2CB heartbeat: Active
             77D95EF51C0149D2823674FCC162CF8B /dev/sdg1
             DCDA2845177F4D59A0F2DCD8DE507CC3 /dev/sdk1
             BBA1DBD0F73F449384CE75197D9B7098 /dev/sdh1
           Nodes in O2CB cluster: 6 7 10
           Active userdlm domains: ovm
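
For scripting, the heartbeat mode and the number of active heartbeat
regions can be pulled out of the status report. This is a sketch only:
both function names are hypothetical, and the parsing assumes the
report looks like the sample above (a "Heartbeat mode:" line, and one
line per region made of a 32-character hexadecimal UUID followed by a
device path). Other versions may print a different layout.

```shell
#!/bin/sh
# o2cb_heartbeat_mode   (hypothetical helper)
# Read a status report on stdin and print the heartbeat mode.
o2cb_heartbeat_mode() {
    awk -F': *' '/Heartbeat mode:/ { print $2 }'
}

# o2cb_heartbeat_regions   (hypothetical helper)
# Read a status report on stdin and print how many heartbeat
# regions (UUID + device lines) it lists.
o2cb_heartbeat_regions() {
    grep -c -E '^[[:space:]]*[0-9A-F]{32}[[:space:]]+/dev/'
}
```

Example:

    service o2cb status | o2cb_heartbeat_mode
    service o2cb status | o2cb_heartbeat_regions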

       To offline and unload the cluster stack, do:

           # service o2cb offline
           Clean userdlm domains: OK
           Stopping global heartbeat on cluster "webcluster": OK
           Stopping O2CB cluster webcluster: OK
           Unregistering O2CB cluster "webcluster": OK

           # service o2cb unload
           Clean userdlm domains: OK
           Unmounting ocfs2_dlmfs filesystem: OK
           Unloading module "ocfs2_dlmfs": OK
           Unloading module "ocfs2_stack_o2cb": OK

SEE ALSO
       o2cb(8) o2cb.sysconfig(5) ocfs2.cluster.conf(5) o2hbmonitor(8)

AUTHORS
       Oracle Corporation

COPYRIGHT
       Copyright © 2004, 2011 Oracle. All rights reserved.

Version 1.8.5                     August 2011                          o2cb(7)