OCF_LINBIT_DRBD(7) OCF resource agents OCF_LINBIT_DRBD(7)

ocf_linbit_drbd - Manages a DRBD device as a Master/Slave resource

drbd [start | stop | monitor | promote | demote | meta-data |
     validate-all]

This resource agent manages a DRBD resource as a master/slave resource.
DRBD is a shared-nothing replicated storage device.

NOTE: To avoid data divergence, you should either enable DRBD "quorum"
and "on-no-quorum io-error" (recommended), or configure proper fencing
policies in both DRBD *and* Pacemaker (fencing resource-and-stonith).
This cannot be done from this resource agent alone.

See the DRBD User's Guide for more information.
https://docs.linbit.com/

drbd_resource
    The name of the drbd resource from the drbd.conf file.

    (unique, required, string, no default)

drbdconf
    Full path to the drbd.conf file.

    (optional, string, default "/etc/drbd.conf")

adjust_master_score
    Space separated list of four master score adjustments for different
    scenarios:

    - only access to 'consistent' data
    - only remote access to 'uptodate' data
    - currently Secondary, local access to 'uptodate' data, but remote is
      unknown
    - local access to 'uptodate' data, and currently Primary or remote is
      known

    Numeric values are expected to be non-decreasing.

    The first value is 0 by default to prevent Pacemaker from trying to
    promote while it is unclear whether the data is really the most
    recent copy. (DRBD knows it is "consistent", but is unsure about
    "uptodate"ness.) Please configure proper fencing methods both in
    DRBD (fencing resource-and-stonith; appropriate (un)fence-peer
    handlers) AND in Pacemaker to make this work reliably.

    Advanced use: Adjust the other values to better fit into complex
    dependency score calculations.

    Intentionally diskless nodes ("Diskless Clients") with access to
    good data via some (or all) of their peers will use the 3rd or 4th
    value (minus one) when they are (Secondary, not all peers
    up-to-date) or (ALL peers are up-to-date, or they are Primary
    themselves), respectively. This may need to change if this should
    become a frequent use case.

    Special considerations:

    If a Secondary DRBD is connected to a peer in Primary role, but
    Pacemaker does not know about any Primary (using crm_resource
    --locate), we conclude that there likely is a cluster-split-brain,
    and may try to "help" Pacemaker by removing the master-score. Also
    see "remove_master_score_if_peer_primary".

    (optional, string, default "0 10 1000 10000")

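As a rough illustration of how the four adjust_master_score values map to the scenarios listed above, the following sketch selects one value by position and checks the "non-decreasing" expectation. The function names and the scenario labels (consistent, remote_uptodate, local_uptodate_peer_unknown, local_uptodate) are hypothetical and chosen for this example only; they are not taken from the resource agent, which derives the scenario from the actual DRBD state.

```shell
# Hypothetical helper: print the score for a simplified scenario label.
# $1: space separated list of four scores, e.g. "0 10 1000 10000"
# $2: scenario label (names invented for this sketch)
pick_master_score() {
    scores=$1
    scenario=$2
    # Split the list into the function's positional parameters $1..$4.
    set -- $scores
    [ $# -eq 4 ] || { echo "expected four scores, got $#" >&2; return 1; }
    case $scenario in
        consistent)                  echo "$1" ;;
        remote_uptodate)             echo "$2" ;;
        local_uptodate_peer_unknown) echo "$3" ;;
        local_uptodate)              echo "$4" ;;
        *) echo "unknown scenario: $scenario" >&2; return 1 ;;
    esac
}

# Hypothetical helper: succeed if the scores never decrease from left to right.
scores_are_non_decreasing() {
    prev=
    for s in $1; do
        if [ -n "$prev" ] && [ "$s" -lt "$prev" ]; then
            return 1
        fi
        prev=$s
    done
    return 0
}
```

With the default list, `pick_master_score "0 10 1000 10000" local_uptodate` prints 10000, and `scores_are_non_decreasing "0 10 1000 10000"` succeeds.
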
stop_outdates_secondary
    Recommended setting: leave at default (disabled).

    Note that this feature depends on the information passed in
    OCF_RESKEY_CRM_meta_notify_master_uname being correct, which
    unfortunately is not reliable for Pacemaker versions up to at least
    1.0.10 / 1.1.4.

    If a Secondary is stopped (unconfigured), it may be marked as
    outdated in the drbd meta data, if we know there is still a Primary
    running in the cluster. Note that this does not affect fencing
    policies set in drbd config, but is an additional safety feature of
    this resource agent only. You can enable this behaviour by setting
    the parameter to true.

    If this feature seems to not do what you expect, make sure you have
    defined fencing policies in the drbd configuration as well.

    (optional, boolean, default false)

ignore_missing_notifications
    Some setups do not benefit from notifications. This parameter allows
    you to disable notifications without patching this resource agent.

    (optional, boolean, default false)

wfc_timeout
    Unless set to the empty string or to a value containing non-digits,
    wait (at most) this many seconds for the connection(s) to be
    established after bringing them up during "start".

    (optional, integer, default 5)

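The "empty string or non-digits" rule for wfc_timeout can be expressed as a simple pattern match. This is a minimal sketch of such a check; the function name is hypothetical and the agent's own implementation may differ.

```shell
# Hypothetical helper: succeed if the given wfc_timeout value asks for a wait.
wfc_timeout_enabled() {
    case $1 in
        ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit: do not wait
        *)           return 0 ;;  # digits only: wait at most this many seconds
    esac
}
```

For example, `wfc_timeout_enabled 5` succeeds, while `wfc_timeout_enabled ""` and `wfc_timeout_enabled 5s` both fail.
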
remove_master_score_if_peer_primary
    See also "adjust_master_score" and
    "fail_promote_early_if_peer_primary".

    To prevent a potentially failed promotion attempt in case of
    cluster split-brain (Pacemaker communication loss) while DRBD is
    still connected to a Primary, you can request to remove any master
    score while DRBD is connected to a Primary (and that Primary peer
    looks like it has all disks up-to-date).

    This may delay legitimate failovers after a Primary crash by up to
    some TCP timeout (until DRBD realizes that the Primary is gone)
    plus one monitoring interval.

    This parameter is interpreted almost as an "ocf boolean", with the
    exception of a literal "unexpected", that is:

    - (yes|true|1) [actually, according to the OCF spec, also
      (YES|TRUE|True|ja|ON), but please don't go there]: is "true":
      remove (or never assign) master scores, if DRBD appears to see a
      (healthy) Primary

    - "unexpected": assign master scores as described under
      "adjust_master_score", while removing it if DRBD appears to see a
      (healthy) Primary that Pacemaker does not know about (as determined
      by crm_resource --locate).

    - everything else is "false": ignore the peer role while assigning
      master scores.

    (optional, string, default "false")

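The three-way interpretation of remove_master_score_if_peer_primary described above can be sketched as a single case statement. The function name is hypothetical; the accepted spellings are exactly those listed in the description, and anything else falls through to "false".

```shell
# Hypothetical helper: normalize the parameter value to one of
# "true", "unexpected" or "false".
remove_master_score_mode() {
    case $1 in
        yes|true|1|YES|TRUE|True|ja|ON) echo true ;;
        unexpected)                     echo unexpected ;;
        *)                              echo false ;;
    esac
}
```

For example, `remove_master_score_mode unexpected` prints unexpected, and an empty or unrecognized value prints false.
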
fail_promote_early_if_peer_primary
    See also "adjust_master_score" and
    "remove_master_score_if_peer_primary".

    To avoid a useless retry loop during promotion attempts in case of
    cluster split-brain (Pacemaker communication loss) while DRBD is
    still connected to a Primary, you can choose to give up after the
    first try if this situation is detected.

    If a Primary "vanishes", TCP may not immediately detect this, and
    an idle DRBD may take some time until it does in-DRBD-protocol
    "pings". Pacemaker may well detect Primary loss earlier than DRBD,
    and try to promote while DRBD thinks it can still see a Primary.
    This means that, in general, trying to promote at least once is
    necessary, as that implies an in-DRBD-protocol "peer alive" check.

    But if that does not succeed, re-trying until we hit the operation
    timeout may not be desired, so you can disable it.

    (optional, boolean, default false)

unfence_if_all_uptodate
    If all volumes of this resource report to be UpToDate, call an
    unfence script hook, just in case some stale fencing constraint or
    similar is still around.

    - With DRBD utils version <= 8.9.4, this is hardcoded to
      /usr/lib/drbd/crm-unfence-peer.sh -r $DRBD_RESOURCE

    - With DRBD utils version >= 8.9.5, this is dispatched to $DRBDADM
      unfence-peer $DRBD_RESOURCE

    In any case, the hook itself is responsible for fetching
    $OCF_RESKEY_unfence_extra_args from its environment.

    (optional, boolean, default false)

unfence_extra_args
    This may be used to pass extra hints to the unfence hook. See the
    description of unfence_if_all_uptodate.

    (optional, string, default "--quiet --flock-required
    --flock-timeout 0 --unfence-only-if-owner-match")

require_drbd_module_version_ge
    Use this if you want to force failure of this resource agent if the
    detected DRBD kernel (module) driver version is lower than a
    required minimum.

    Example: use require_drbd_module_version_ge=9.0.16 to fail unless
    DRBD module version >= 9.0.16 is available (effectively requires
    DRBD 9).

    The intention of this is to give a more useful failure message
    after accidentally downgrading the DRBD version by
    installing/upgrading a new kernel.

    Note: "ge", "greater-or-equal", inclusive. Required format: x.y.z

    Set empty to skip this check.

    (optional, string, default "8.0.0")

require_drbd_module_version_lt
    Use this if you want to force failure of this resource agent if the
    detected DRBD kernel (module) driver version is higher than a
    required maximum.

    Example: use require_drbd_module_version_lt=9.0.0 to fail unless
    DRBD module version < 9.0 is available (effectively requires DRBD
    8.4).

    Note: "lt", "less-than", exclusive. Required format: x.y.z

    Set empty to skip this check.

    (optional, string, default "10.0.0")

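The combined effect of require_drbd_module_version_ge (inclusive lower bound) and require_drbd_module_version_lt (exclusive upper bound) can be sketched as follows. Both function names are hypothetical, and the sketch assumes that every component of an x.y.z version is a plain decimal number below 1000 without leading zeros; the agent's own comparison may be implemented differently.

```shell
# Hypothetical helper: turn "x.y.z" into one comparable integer.
# Assumes each component is below 1000 and has no leading zeros.
version_to_int() {
    case $1 in
        ''|*[!0-9.]*) return 1 ;;   # only digits and dots are accepted
    esac
    old_ifs=$IFS; IFS=.
    set -- $1                        # split on dots into $1 $2 $3
    IFS=$old_ifs
    [ $# -eq 3 ] && [ -n "$1" ] && [ -n "$2" ] && [ -n "$3" ] || return 1
    echo $(( $1 * 1000000 + $2 * 1000 + $3 ))
}

# Hypothetical helper: $1 = detected version, $2 = "ge" bound, $3 = "lt" bound.
# An empty bound skips that check. Returns 0 if acceptable, 1 if out of
# range, 2 if a version string is not in x.y.z format.
drbd_version_acceptable() {
    detected=$(version_to_int "$1") || return 2
    if [ -n "$2" ]; then
        min=$(version_to_int "$2") || return 2
        [ "$detected" -ge "$min" ] || return 1   # inclusive lower bound
    fi
    if [ -n "$3" ]; then
        max=$(version_to_int "$3") || return 2
        [ "$detected" -lt "$max" ] || return 1   # exclusive upper bound
    fi
    return 0
}
```

For example, `drbd_version_acceptable 9.0.16 9.0.16 10.0.0` succeeds because the lower bound is inclusive, while `drbd_version_acceptable 9.0.0 8.0.0 9.0.0` fails because the upper bound is exclusive.
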
connect_only_after_promote
    This may be useful for "stacked" setups without proper fencing on
    the lower layer (which we obviously do not recommend), to avoid
    some of the ugly side effects that may arise after resolving a
    split-brain on the lower layer.

    Keep this DRBD instance disconnected until it is promoted. After
    promotion we issue an additional "adjust", which is supposed to
    initiate the connection attempts.

    This causes a new data generation identifier ("current uuid") to be
    generated after the failover of a "healthy" DRBD.

    (optional, boolean, default false)

This resource agent supports the following actions (operations):

start
    Starts the resource. Suggested minimum timeout: 240.

reload
    Suggested minimum timeout: 30.

promote
    Promotes the resource to the Master role. Suggested minimum
    timeout: 90.

demote
    Demotes the resource to the Slave role. Suggested minimum timeout:
    90.

notify
    Suggested minimum timeout: 90.

stop
    Stops the resource. Suggested minimum timeout: 100.

monitor (Slave role)
    Performs a detailed status check. Suggested minimum timeout: 20.
    Suggested interval: 20.

monitor (Master role)
    Performs a detailed status check. Suggested minimum timeout: 20.
    Suggested interval: 10.

meta-data
    Retrieves resource agent metadata (internal use only). Suggested
    minimum timeout: 5.

validate-all
    Performs a validation of the resource configuration.

The following is an example configuration for a drbd resource using the
crm(8) shell:

    primitive p_drbd ocf:linbit:drbd \
      params \
        drbd_resource=string \
      op monitor timeout="20" interval="20" role="Slave" \
      op monitor timeout="20" interval="10" role="Master"

    ms ms_drbd p_drbd \
      meta notify="true" interleave="true"

The following is an example configuration for a drbd resource using
pcs(8):

    pcs resource create p_drbd ocf:linbit:drbd \
      drbd_resource=string \
      op monitor timeout="20" interval="20" role="Slave" \
      op monitor timeout="20" interval="10" role="Master" --master

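As a variation on the pcs(8) example, the following sketch also sets one of the optional parameters and applies the suggested minimum timeouts from the list of supported actions. The function name create_drbd_resource and the resource name r0 in the usage note are hypothetical; the call is wrapped in a function so that nothing is executed until it is invoked, and the option syntax follows the example above, which may need adjusting for your pcs version.

```shell
# Hypothetical wrapper around the pcs(8) call shown in the example.
# $1: name of the DRBD resource as defined in drbd.conf
create_drbd_resource() {
    pcs resource create p_drbd ocf:linbit:drbd \
        drbd_resource="$1" \
        require_drbd_module_version_ge=9.0.16 \
        op start timeout=240 \
        op promote timeout=90 \
        op demote timeout=90 \
        op stop timeout=100 \
        op monitor timeout=20 interval=20 role=Slave \
        op monitor timeout=20 interval=10 role=Master --master
}
```

Calling `create_drbd_resource r0` would then create the resource for a DRBD resource named r0.
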
https://docs.linbit.com/, https://clusterlabs.org/,
https://www.linbit.com/drbd-community/

LINBIT HA Solutions GmbH

drbd-pacemaker 9.22.0 10/15/2022 OCF_LINBIT_DRBD(7)