OCF_LINBIT_DRBD(7)            OCF resource agents           OCF_LINBIT_DRBD(7)

NAME

       ocf_linbit_drbd - Manages a DRBD device as a Master/Slave resource

SYNOPSIS

       drbd [start | stop | monitor | promote | demote | meta-data |
            validate-all]

DESCRIPTION

       This resource agent manages a DRBD resource as a master/slave resource.
       DRBD is a shared-nothing replicated storage device.

       NOTE: To avoid data-divergence, you should enable either DRBD "quorum"
       and "on-no-quorum io-error" (recommended), or configure proper fencing
       policies in both DRBD *and* Pacemaker (fencing resource-and-stonith).
       This cannot be done from this resource agent alone.

       See the DRBD User's Guide for more information.
       https://docs.linbit.com/

SUPPORTED PARAMETERS

       drbd_resource
           The name of the drbd resource from the drbd.conf file.

           (unique, required, string, no default)

       drbdconf
           Full path to the drbd.conf file.

           (optional, string, default "/etc/drbd.conf")

       adjust_master_score
           Space separated list of four master score adjustments for different
           scenarios:

           - only access to 'consistent' data

           - only remote access to 'uptodate' data

           - currently Secondary, local access to 'uptodate' data, but remote
           is unknown

           - local access to 'uptodate' data, and currently Primary or remote
           is known

           Numeric values are expected to be non-decreasing.

           The first value is 0 by default to prevent Pacemaker from trying to
           promote while it is unclear whether the data is really the most
           recent copy. (DRBD knows it is "consistent", but is unsure about
           "uptodate"ness). Please configure proper fencing methods both in
           DRBD (fencing resource-and-stonith; appropriate (un)fence-peer
           handlers) AND in Pacemaker to make this work reliably.

           Advanced use: Adjust the other values to better fit into complex
           dependency score calculations.

           Intentionally diskless nodes ("Diskless Clients") with access to
           good data via some (or all) of their peers will use the 3rd or 4th
           value (minus one) when they are (Secondary, not all peers
           up-to-date) or (ALL peers are up-to-date, or they are Primary
           themselves). This may need to change if this should become a
           frequent use case.

           Special considerations:

           If a Secondary DRBD is connected to a peer in Primary role, but
           Pacemaker does not know about any Primary (using crm_resource
           --locate), we conclude that there likely is a cluster-split-brain,
           and may try to "help" Pacemaker by removing the master-score. Also
           see "remove_master_score_if_peer_primary".

           (optional, string, default "0 10 1000 10000")
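
           The format constraints above (four space separated values,
           non-decreasing) can be checked before a value is put into the
           cluster configuration. The following is a minimal sketch only; the
           function name check_master_scores is hypothetical and not part of
           this resource agent, and it assumes that all four values are
           non-negative integers.

```shell
# Hypothetical helper, not part of the resource agent:
# validate a candidate adjust_master_score value.
# Assumption: all four values are non-negative integers.
check_master_scores() {
    # Split the single argument on whitespace into positional parameters.
    set -- $1
    if [ "$#" -ne 4 ]; then
        echo "expected four values, got $#" >&2
        return 1
    fi
    prev=
    for v in "$@"; do
        case $v in
            ''|*[!0-9]*)
                echo "not a non-negative integer: $v" >&2
                return 1 ;;
        esac
        # Each value must be greater than or equal to the one before it.
        if [ -n "$prev" ] && [ "$v" -lt "$prev" ]; then
            echo "values must be non-decreasing: $v < $prev" >&2
            return 1
        fi
        prev=$v
    done
    echo ok
}
```

           For example, check_master_scores "0 10 1000 10000" accepts the
           default, while "0 10 5 10000" is rejected because 5 is lower than
           the value before it.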

       stop_outdates_secondary
           Recommended setting: leave at default (disabled).

           Note that this feature depends on the information passed in
           OCF_RESKEY_CRM_meta_notify_master_uname being correct, which
           unfortunately is not reliable for Pacemaker versions up to at least
           1.0.10 / 1.1.4.

           If a Secondary is stopped (unconfigured), it may be marked as
           outdated in the DRBD meta data, if we know there is still a Primary
           running in the cluster. Note that this does not affect fencing
           policies set in the DRBD configuration, but is an additional safety
           feature of this resource agent only. You can enable this behaviour
           by setting the parameter to true.

           If this feature does not seem to do what you expect, make sure you
           have defined fencing policies in the DRBD configuration as well.

           (optional, boolean, default false)

       ignore_missing_notifications
           Some setups do not benefit from notifications. This parameter
           allows disabling notifications without patching this resource
           agent.

           (optional, boolean, default false)

       wfc_timeout
           Unless set to the empty string or any non-digits, wait (at most)
           this many seconds for the connection(s) to be established after
           bringing them up during "start".

           (optional, integer, default 5)

       remove_master_score_if_peer_primary
           See also "adjust_master_score" and
           "fail_promote_early_if_peer_primary".

           To prevent a potentially failed promotion attempt in case of
           cluster split-brain (Pacemaker communication loss) while DRBD is
           still connected to a Primary, you can request to remove any master
           score while DRBD is connected to a Primary (and that Primary peer
           looks like it has all disks up-to-date).

           This may delay legitimate failovers after Primary crash by up to
           some TCP timeout (until DRBD realizes that the Primary is gone)
           plus one monitoring interval.

           This parameter is interpreted almost as an "ocf boolean", with the
           exception of a literal "unexpected", that is:

           - (yes|true|1) [actually, according to the OCF spec, also
           (YES|TRUE|True|ja|ON), but please don't go there]: is "true":
           remove (or never assign) master scores, if DRBD appears to see a
           (healthy) Primary

           - "unexpected": assign master scores as described under
           "adjust_master_score", while removing it if DRBD appears to see a
           (healthy) Primary that Pacemaker does not know about (as determined
           by crm_resource --locate).

           - everything else is "false": ignore the peer role while assigning
           master scores.

           (optional, string, default "false")
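
           The three-way interpretation described above can be sketched as a
           shell case statement. This is an illustration only, not the code
           used by the resource agent; the function name
           interpret_peer_primary_param is hypothetical, and the list of
           "true" spellings is limited to the ones named above.

```shell
# Hypothetical helper, not the agent's own code: map a
# remove_master_score_if_peer_primary value to one of the three
# behaviours described in this section.
interpret_peer_primary_param() {
    case $1 in
        # Spellings listed above as "true".
        yes|true|1|YES|TRUE|True|ja|ON) echo true ;;
        # The one literal that is not an OCF boolean.
        unexpected) echo unexpected ;;
        # Everything else, including the empty string, is "false".
        *) echo false ;;
    esac
}
```

           With this sketch, "yes" maps to true, "unexpected" maps to
           unexpected, and values such as "no", "maybe" or the empty string
           all map to false.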

       fail_promote_early_if_peer_primary
           See also "adjust_master_score" and
           "remove_master_score_if_peer_primary".

           To avoid a useless retry loop during promotion attempts in case of
           cluster split-brain (Pacemaker communication loss) while DRBD is
           still connected to a Primary, you can choose to give up after the
           first try if this situation is detected.

           If a Primary "vanishes", TCP may not immediately detect this, and
           an idle DRBD may take some time until it does in-DRBD-protocol
           "pings". Pacemaker may well detect Primary loss earlier than DRBD,
           and try to promote while DRBD thinks it can still see a Primary.
           This means that, in general, trying to promote at least once is
           necessary, as that implies an in-DRBD-protocol "peer alive" check.

           But if that does not succeed, re-trying until we hit the operation
           timeout may not be desired, so you can disable it.

           (optional, boolean, default false)

       unfence_if_all_uptodate
           If all volumes of this resource report to be UpToDate, call an
           unfence script hook, just in case some stale fencing constraint or
           similar is still around.

           - With DRBD utils version <= 8.9.4, this is hardcoded to
           /usr/lib/drbd/crm-unfence-peer.sh -r $DRBD_RESOURCE

           - With DRBD utils version >= 8.9.5, this is dispatched to $DRBDADM
           unfence-peer $DRBD_RESOURCE

           In any case, the hook itself is responsible for fetching
           $OCF_RESKEY_unfence_extra_args from its environment.

           (optional, boolean, default false)

       unfence_extra_args
           This may be used to pass extra hints to the unfence hook. See
           description of unfence_if_all_uptodate.

           (optional, boolean, default --quiet --flock-required
           --flock-timeout 0 --unfence-only-if-owner-match)

       require_drbd_module_version_ge
           Use this if you want to force failure of this resource agent if the
           detected DRBD kernel (module) driver version is lower than a
           required minimum.

           Example: use require_drbd_module_version_ge=9.0.16 to fail unless
           DRBD module version >= 9.0.16 is available (effectively requires
           DRBD 9).

           The intention of this is to give a more useful failure message
           after accidentally downgrading the DRBD version by
           installing/upgrading a new kernel.

           Note: "ge", "greater-or-equal", inclusive. Required format: x.y.z

           Set empty to skip this check.

           (optional, string, default "8.0.0")

       require_drbd_module_version_lt
           Use this if you want to force failure of this resource agent if the
           detected DRBD kernel (module) driver version is equal to or higher
           than a required upper bound.

           Example: use require_drbd_module_version_lt=9.0.0 to fail unless
           DRBD module version < 9.0 is available (effectively requires DRBD
           8.4).

           Note: "lt", "less-than", exclusive. Required format: x.y.z

           Set empty to skip this check.

           (optional, string, default "10.0.0")
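
           Taken together, the two version parameters describe a half-open
           range: the minimum is inclusive, the upper bound is exclusive, and
           an empty value skips the respective check. The following is a
           minimal sketch of that comparison for versions in x.y.z format. It
           is not the agent's own implementation; the function names
           version_ge and module_version_ok are hypothetical.

```shell
# Hypothetical helpers, not the agent's own code.

# Succeed if version $1 is greater than or equal to version $2.
# Both arguments are assumed to be in x.y.z format.
version_ge() {
    a=$1 b=$2
    for i in 1 2 3; do
        # Compare the leading component numerically.
        x=${a%%.*} y=${b%%.*}
        [ "$x" -gt "$y" ] && return 0
        [ "$x" -lt "$y" ] && return 1
        # Components are equal: drop them and look at the next one.
        a=${a#*.} b=${b#*.}
    done
    return 0
}

# Usage: module_version_ok DETECTED MIN MAX
# MIN is inclusive ("ge"), MAX is exclusive ("lt");
# an empty MIN or MAX skips that check.
module_version_ok() {
    detected=$1 min=$2 max=$3
    if [ -n "$min" ] && ! version_ge "$detected" "$min"; then
        return 1
    fi
    if [ -n "$max" ] && version_ge "$detected" "$max"; then
        return 1
    fi
    return 0
}
```

           With the defaults ("8.0.0" and "10.0.0"), a detected version of
           9.1.2 passes. With require_drbd_module_version_lt=9.0.0, a
           detected version of exactly 9.0.0 fails, because the upper bound
           is exclusive.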

       connect_only_after_promote
           This may be useful for "stacked" setups without proper fencing on
           the lower layer (which we obviously do not recommend), to avoid
           some of the ugly side effects that may arise after resolving a
           split-brain on the lower layer.

           Keep this DRBD instance disconnected until it is promoted. After
           promotion we issue an additional "adjust", which is supposed to
           initiate the connection attempts.

           This causes a new data generation identifier ("current uuid") to be
           generated after the failover of a "healthy" DRBD.

           (optional, boolean, default false)

SUPPORTED ACTIONS

       This resource agent supports the following actions (operations):

       start
           Starts the resource. Suggested minimum timeout: 240.

       reload
           Suggested minimum timeout: 30.

       promote
           Promotes the resource to the Master role. Suggested minimum
           timeout: 90.

       demote
           Demotes the resource to the Slave role. Suggested minimum timeout:
           90.

       notify
           Suggested minimum timeout: 90.

       stop
           Stops the resource. Suggested minimum timeout: 100.

       monitor (Slave role)
           Performs a detailed status check. Suggested minimum timeout: 20.
           Suggested interval: 20.

       monitor (Master role)
           Performs a detailed status check. Suggested minimum timeout: 20.
           Suggested interval: 10.

       meta-data
           Retrieves resource agent metadata (internal use only). Suggested
           minimum timeout: 5.

       validate-all
           Performs a validation of the resource configuration.

EXAMPLE CRM SHELL

       The following is an example configuration for a drbd resource using the
       crm(8) shell:

           primitive p_drbd ocf:linbit:drbd \
             params \
               drbd_resource=string \
             op monitor timeout="20" interval="20" role="Slave" \
             op monitor timeout="20" interval="10" role="Master"

           ms ms_drbd p_drbd \
             meta notify="true" interleave="true"
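
       The suggested minimum timeouts listed under SUPPORTED ACTIONS can be
       made explicit in the same configuration. The following is a sketch
       only: the resource name r0 is a hypothetical placeholder, and the
       timeout values are the suggested minimums from this page, not tuned
       recommendations.

```
primitive p_drbd ocf:linbit:drbd \
  params \
    drbd_resource=r0 \
  op start timeout="240" interval="0" \
  op promote timeout="90" interval="0" \
  op demote timeout="90" interval="0" \
  op stop timeout="100" interval="0" \
  op monitor timeout="20" interval="20" role="Slave" \
  op monitor timeout="20" interval="10" role="Master"

ms ms_drbd p_drbd \
  meta notify="true" interleave="true"
```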

EXAMPLE PCS

       The following is an example configuration for a drbd resource using
       pcs(8):

           pcs resource create p_drbd ocf:linbit:drbd \
             drbd_resource=string \
             op monitor timeout="20" interval="20" role="Slave" \
             op monitor timeout="20" interval="10" role="Master" --master

SEE ALSO

       https://docs.linbit.com/, https://clusterlabs.org/,
       https://www.linbit.com/drbd-community/

AUTHORS

       LINBIT HA Solutions GmbH

drbd-pacemaker 9.26.0             11/05/2023                OCF_LINBIT_DRBD(7)