1DRBD.CONF(5) Configuration Files DRBD.CONF(5)
2
3
4
6 drbd.conf - DRBD Configuration Files
7
9 DRBD implements block devices which replicate their data to all nodes
10 of a cluster. The actual data and associated metadata are usually
11 stored redundantly on "ordinary" block devices on each cluster node.
12
13 Replicated block devices are called /dev/drbdminor by default. They are
14 grouped into resources, with one or more devices per resource.
15 Replication among the devices in a resource takes place in
16 chronological order. With DRBD, we refer to the devices inside a
17 resource as volumes.
18
19 In DRBD 9, a resource can be replicated between two or more cluster
20 nodes. The connections between cluster nodes are point-to-point links,
21 and use TCP or a TCP-like protocol. All nodes must be directly
22 connected.
23
24 DRBD consists of low-level user-space components which interact with
25 the kernel and perform basic operations (drbdsetup, drbdmeta), a
26 high-level user-space component which understands and processes the
27 DRBD configuration and translates it into basic operations of the
28 low-level components (drbdadm), and a kernel component.
29
30 The default DRBD configuration consists of /etc/drbd.conf and of
31 additional files included from there, usually global_common.conf and
32 all *.res files inside /etc/drbd.d/. It has turned out to be useful to
33 define each resource in a separate *.res file.
34
35 The configuration files are designed so that each cluster node can
36 contain an identical copy of the entire cluster configuration. The host
37 name of each node determines which parts of the configuration apply
38 (uname -n). It is highly recommended to keep the cluster configuration
39 on all nodes in sync by manually copying it to all nodes, or by
40 automating the process with csync2 or a similar tool.
41
43 global {
44 usage-count yes;
45 udev-always-use-vnr;
46 }
47 resource r0 {
48 net {
49 cram-hmac-alg sha1;
50 shared-secret "FooFunFactory";
51 }
52 volume 0 {
53 device "/dev/drbd1";
54 disk "/dev/sda7";
55 meta-disk internal;
56 }
57 on "alice" {
58 node-id 0;
59 address 10.1.1.31:7000;
60 }
61 on "bob" {
62 node-id 1;
63 address 10.1.1.32:7000;
64 }
65 connection {
66 host "alice" port 7000;
67 host "bob" port 7000;
68 net {
69 protocol C;
70 }
71 }
72 }
73
74 This example defines a resource r0 which contains a single replicated
75 device with volume number 0. The resource is replicated among hosts
76 alice and bob, which have the IPv4 addresses 10.1.1.31 and 10.1.1.32
77 and the node identifiers 0 and 1, respectively. On both hosts, the
78 replicated device is called /dev/drbd1, and the actual data and
79 metadata are stored on the lower-level device /dev/sda7. The connection
80 between the hosts uses protocol C.
81
82 Enclose strings within double-quotation marks (") to differentiate them
83 from resource keywords. Please refer to the DRBD User's Guide[1] for
84 more examples.
85
87 DRBD configuration files consist of sections, which contain other
88 sections and parameters depending on the section types. Each section
89 consists of one or more keywords, sometimes a section name, an opening
90 brace (“{”), the section's contents, and a closing brace (“}”).
91 Parameters inside a section consist of a keyword, followed by one or
92 more keywords or values, and a semicolon (“;”).
93
94 Some parameter values have a default scale which applies when a plain
95 number is specified (for example Kilo, or 1024 times the numeric
96 value). Such default scales can be overridden by using a suffix (for
97 example, M for Mega). The common suffixes K = 2^10 = 1024, M = 1024 K,
98 and G = 1024 M are supported.
99
100 Comments start with a hash sign (“#”) and extend to the end of the
101 line. In addition, any section can be prefixed with the keyword skip,
102 which causes the section and any sub-sections to be ignored.
103
104 Additional files can be included with the include file-pattern
105 statement (see glob(7) for the expressions supported in file-pattern).
106 Include statements are only allowed outside of sections.
107
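For illustration, a minimal sketch of an include statement and a skipped section, following the /etc/drbd.d/ layout described above (the resource name scratch is purely hypothetical):

        include "drbd.d/global_common.conf";
        include "drbd.d/*.res";

        skip resource scratch {
            # this section and all of its sub-sections are ignored
        }
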
108 The following sections are defined (indentation indicates in which
109 context):
110
111 common
112 [disk]
113 [handlers]
114 [net]
115 [options]
116 [startup]
117 global
118 [require-drbd-module-version-{eq,ne,gt,ge,lt,le}]
119 resource
120 connection
121 multiple path | 2 host
122 [net]
123 [volume]
124 [peer-device-options]
125 [peer-device-options]
126 connection-mesh
127 [net]
128 [disk]
129 floating
130 handlers
131 [net]
132 on
133 volume
134 disk
135 [disk]
136 options
137 stacked-on-top-of
138 startup
139
140 Sections in brackets affect other parts of the configuration: inside
141 the common section, they apply to all resources. A disk section inside
142 a resource or on section applies to all volumes of that resource, and a
143 net section inside a resource section applies to all connections of
that resource. This makes it possible to avoid repeating identical options for each resource, connection, or volume. Options can be overridden in a more specific resource, connection, on, or volume section.
147
The peer-device-options are resync-rate, c-plan-ahead, c-delay-target, c-fill-target, c-max-rate and c-min-rate. For backward compatibility, they can also be specified in any disk options section. They are inherited into all relevant connections. If they are given at the connection level, they are inherited by all volumes on that connection. A peer-device-options section is opened with the disk keyword.
155
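As an illustrative sketch of this inheritance (resource, host names, and values are assumptions), peer-device options in a resource-level disk section apply to all connections, while a disk section inside a connection section applies only to the volumes replicated over that connection:

        resource r0 {
            disk {
                c-max-rate 100M;       # inherited by every connection of r0
            }
            connection {
                host "alice";
                host "bob";
                disk {
                    resync-rate 30M;   # peer-device option for this connection only
                }
            }
        }
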
156 Sections
157 common
158
This section can contain a disk, a handlers, a net, an options, and a startup section (one of each). All resources inherit the parameters in these sections as their default values.
162
163 connection
164
165 Define a connection between two hosts. This section must contain
166 two host parameters or multiple path sections.
167
168 path
169
170 Define a path between two hosts. This section must contain two host
171 parameters.
172
173 connection-mesh
174
175 Define a connection mesh between multiple hosts. This section must
176 contain a hosts parameter, which has the host names as arguments.
177 This section is a shortcut to define many connections which share
178 the same network options.
179
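A sketch of a three-node mesh (host names are hypothetical); this is equivalent to defining the three pairwise connections individually with the same network options:

        connection-mesh {
            hosts "alice" "bob" "charlie";
            net {
                protocol C;
            }
        }
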
180 disk
181
182 Define parameters for a volume. All parameters in this section are
183 optional.
184
185 floating [address-family] addr:port
186
187 Like the on section, except that instead of the host name a network
188 address is used to determine if it matches a floating section.
189
190 The node-id parameter in this section is required. If the address
191 parameter is not provided, no connections to peers will be created
192 by default. The device, disk, and meta-disk parameters must be
193 defined in, or inherited by, this section.
194
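A sketch of a floating setup, reusing the addresses and backing devices from the example configuration above; the matching section is selected by address rather than by host name:

        resource r0 {
            floating 10.1.1.31:7000 {
                node-id 0;
                device "/dev/drbd1";
                disk "/dev/sda7";
                meta-disk internal;
            }
            floating 10.1.1.32:7000 {
                node-id 1;
                device "/dev/drbd1";
                disk "/dev/sda7";
                meta-disk internal;
            }
        }
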
195 global
196
197 Define some global parameters. All parameters in this section are
198 optional. Only one global section is allowed in the configuration.
199
200 require-drbd-module-version-{eq,ne,gt,ge,lt,le}
201
This statement contains one of the valid forms and a three-component version number (e.g., require-drbd-module-version-eq 9.0.16;). If the currently loaded DRBD kernel module does not match the specification, parsing is aborted. The comparison operator names have the same semantics as in test(1).
207
208 handlers
209
210 Define handlers to be invoked when certain events occur. The kernel
211 passes the resource name in the first command-line argument and
212 sets the following environment variables depending on the event's
213 context:
214
215 • For events related to a particular device: the device's minor
216 number in DRBD_MINOR, the device's volume number in
217 DRBD_VOLUME.
218
219 • For events related to a particular device on a particular peer:
220 the connection endpoints in DRBD_MY_ADDRESS, DRBD_MY_AF,
221 DRBD_PEER_ADDRESS, and DRBD_PEER_AF; the device's local minor
222 number in DRBD_MINOR, and the device's volume number in
223 DRBD_VOLUME.
224
225 • For events related to a particular connection: the connection
226 endpoints in DRBD_MY_ADDRESS, DRBD_MY_AF, DRBD_PEER_ADDRESS,
227 and DRBD_PEER_AF; and, for each device defined for that
228 connection: the device's minor number in
229 DRBD_MINOR_volume-number.
230
231 • For events that identify a device, if a lower-level device is
232 attached, the lower-level device's device name is passed in
233 DRBD_BACKING_DEV (or DRBD_BACKING_DEV_volume-number).
234
235 All parameters in this section are optional. Only a single handler
236 can be defined for each event; if no handler is defined, nothing
237 will happen.
238
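A sketch of a handlers section; the notification scripts shipped with drbd-utils are used here as an assumed example and may be installed under a different path on your system:

        handlers {
            split-brain "/usr/lib/drbd/notify-split-brain.sh root";
            out-of-sync "/usr/lib/drbd/notify-out-of-sync.sh root";
        }
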
239 net
240
241 Define parameters for a connection. All parameters in this section
242 are optional.
243
244 on host-name [...]
245
246 Define the properties of a resource on a particular host or set of
247 hosts. Specifying more than one host name can make sense in a setup
248 with IP address failover, for example. The host-name argument must
249 match the Linux host name (uname -n).
250
251 Usually contains or inherits at least one volume section. The
252 node-id and address parameters must be defined in this section. The
253 device, disk, and meta-disk parameters must be defined in, or
254 inherited by, this section.
255
256 A normal configuration file contains two or more on sections for
257 each resource. Also see the floating section.
258
259 options
260
261 Define parameters for a resource. All parameters in this section
262 are optional.
263
264 resource name
265
266 Define a resource. Usually contains at least two on sections and at
267 least one connection section.
268
269 stacked-on-top-of resource
270
271 Used instead of an on section for configuring a stacked resource
272 with three to four nodes.
273
274 Starting with DRBD 9, stacking is deprecated. It is advised to use
275 resources which are replicated among more than two nodes instead.
276
277 startup
278
279 The parameters in this section determine the behavior of a resource
280 at startup time.
281
282 volume volume-number
283
284 Define a volume within a resource. The volume numbers in the
285 various volume sections of a resource define which devices on which
286 hosts form a replicated device.
287
288 Section connection Parameters
289 host name [address [address-family] address] [port port-number]
290
291 Defines an endpoint for a connection. Each host statement refers to
292 an on section in a resource. If a port number is defined, this
293 endpoint will use the specified port instead of the port defined in
294 the on section. Each connection section must contain exactly two
295 host parameters. Instead of two host parameters the connection may
296 contain multiple path sections.
297
298 Section path Parameters
299 host name [address [address-family] address] [port port-number]
300
301 Defines an endpoint for a connection. Each host statement refers to
302 an on section in a resource. If a port number is defined, this
303 endpoint will use the specified port instead of the port defined in
304 the on section. Each path section must contain exactly two host
305 parameters.
306
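A sketch of a connection with two paths over separate networks (host names, addresses, and the port are assumptions); each path section names the same two hosts:

        connection {
            path {
                host "alice" address 10.1.1.31 port 7010;
                host "bob"   address 10.1.1.32 port 7010;
            }
            path {
                host "alice" address 10.2.1.31 port 7010;
                host "bob"   address 10.2.1.32 port 7010;
            }
        }
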
307 Section connection-mesh Parameters
308 hosts name...
309
310 Defines all nodes of a mesh. Each name refers to an on section in a
311 resource. The port that is defined in the on section will be used.
312
313 Section disk Parameters
314 al-extents extents
315
316 DRBD automatically maintains a "hot" or "active" disk area likely
317 to be written to again soon based on the recent write activity. The
318 "active" disk area can be written to immediately, while "inactive"
319 disk areas must be "activated" first, which requires a meta-data
320 write. We also refer to this active disk area as the "activity
321 log".
322
323 The activity log saves meta-data writes, but the whole log must be
324 resynced upon recovery of a failed node. The size of the activity
325 log is a major factor of how long a resync will take and how fast a
326 replicated disk will become consistent after a crash.
327
328 The activity log consists of a number of 4-Megabyte segments; the
329 al-extents parameter determines how many of those segments can be
330 active at the same time. The default value for al-extents is 1237,
331 with a minimum of 7 and a maximum of 65536.
332
Note that the effective maximum may be smaller, depending on how you created the device metadata; see also drbdmeta(8). The effective maximum is 919 * (available on-disk activity-log ring-buffer area / 4 KiB - 1); with the default 32 KiB ring buffer, this yields a maximum of 6433 (covering more than 25 GiB of data). We recommend keeping this well within the amount your backend storage and replication link can resync within about 5 minutes.
340
341 al-updates {yes | no}
342
343 With this parameter, the activity log can be turned off entirely
344 (see the al-extents parameter). This will speed up writes because
fewer meta-data writes will be necessary, but the entire device needs to be resynchronized upon recovery of a failed primary node.
347 The default value for al-updates is yes.
348
349 disk-barrier,
350 disk-flushes,
351 disk-drain
352 DRBD has three methods of handling the ordering of dependent write
353 requests:
354
355 disk-barrier
356 Use disk barriers to make sure that requests are written to
357 disk in the right order. Barriers ensure that all requests
358 submitted before a barrier make it to the disk before any
359 requests submitted after the barrier. This is implemented using
360 'tagged command queuing' on SCSI devices and 'native command
361 queuing' on SATA devices. Only some devices and device stacks
362 support this method. The device mapper (LVM) only supports
363 barriers in some configurations.
364
365 Note that on systems which do not support disk barriers,
366 enabling this option can lead to data loss or corruption. Until
DRBD 8.4.1, disk-barrier was turned on if the I/O stack below DRBD supported barriers. Kernels since linux-2.6.36 (or 2.6.32 RHEL6) no longer make it possible to detect whether barriers are supported. Since drbd-8.4.2, this option is off by default and needs to be enabled explicitly.
372
373 disk-flushes
374 Use disk flushes between dependent write requests, also
375 referred to as 'force unit access' by drive vendors. This
376 forces all data to disk. This option is enabled by default.
377
378 disk-drain
379 Wait for the request queue to "drain" (that is, wait for the
380 requests to finish) before submitting a dependent write
381 request. This method requires that requests are stable on disk
382 when they finish. Before DRBD 8.0.9, this was the only method
383 implemented. This option is enabled by default. Do not disable
384 in production environments.
385
Of these three methods, DRBD will use the first that is enabled and supported by the backing storage device. If all three of these
388 options are turned off, DRBD will submit write requests without
389 bothering about dependencies. Depending on the I/O stack, write
390 requests can be reordered, and they can be submitted in a different
391 order on different cluster nodes. This can result in data loss or
392 corruption. Therefore, turning off all three methods of controlling
393 write ordering is strongly discouraged.
394
395 A general guideline for configuring write ordering is to use disk
396 barriers or disk flushes when using ordinary disks (or an ordinary
397 disk array) with a volatile write cache. On storage without cache
398 or with a battery backed write cache, disk draining can be a
399 reasonable choice.
400
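As a sketch of that guideline (the choices shown are assumptions, not recommendations for any particular hardware), a backend with a battery-backed write cache might disable barriers and flushes and rely on draining:

        disk {
            disk-barrier no;
            disk-flushes no;   # only safe with a battery-backed/non-volatile write cache
            disk-drain   yes;
        }
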
401 disk-timeout
402 If the lower-level device on which a DRBD device stores its data
403 does not finish an I/O request within the defined disk-timeout,
404 DRBD treats this as a failure. The lower-level device is detached,
405 and the device's disk state advances to Diskless. If DRBD is
406 connected to one or more peers, the failed request is passed on to
407 one of them.
408
409 This option is dangerous and may lead to kernel panic!
410
411 "Aborting" requests, or force-detaching the disk, is intended for
412 completely blocked/hung local backing devices which do no longer
413 complete requests at all, not even do error completions. In this
414 situation, usually a hard-reset and failover is the only way out.
415
416 By "aborting", basically faking a local error-completion, we allow
for a more graceful switchover by cleanly migrating services. Still, the affected node has to be rebooted "soon".
419
420 By completing these requests, we allow the upper layers to re-use
421 the associated data pages.
422
423 If later the local backing device "recovers", and now DMAs some
424 data from disk into the original request pages, in the best case it
425 will just put random data into unused pages; but typically it will
426 corrupt meanwhile completely unrelated data, causing all sorts of
427 damage.
428
This means that a delayed successful completion, especially of READ requests, is a reason to panic(). We assume that a delayed *error* completion is OK, though we will still complain noisily about it.
432
433 The default value of disk-timeout is 0, which stands for an
434 infinite timeout. Timeouts are specified in units of 0.1 seconds.
435 This option is available since DRBD 8.3.12.
436
437 md-flushes
438 Enable disk flushes and disk barriers on the meta-data device. This
439 option is enabled by default. See the disk-flushes parameter.
440
441 on-io-error handler
442
443 Configure how DRBD reacts to I/O errors on a lower-level device.
444 The following policies are defined:
445
446 pass_on
447 Change the disk status to Inconsistent, mark the failed block
448 as inconsistent in the bitmap, and retry the I/O operation on a
449 remote cluster node.
450
451 call-local-io-error
452 Call the local-io-error handler (see the handlers section).
453
454 detach
455 Detach the lower-level device and continue in diskless mode.
456
457
458 read-balancing policy
459 Distribute read requests among cluster nodes as defined by policy.
460 The supported policies are prefer-local (the default),
461 prefer-remote, round-robin, least-pending, when-congested-remote,
462 32K-striping, 64K-striping, 128K-striping, 256K-striping,
463 512K-striping and 1M-striping.
464
465 This option is available since DRBD 8.4.1.
466
467 resync-after res-name/volume
468
469 Define that a device should only resynchronize after the specified
470 other device. By default, no order between devices is defined, and
471 all devices will resynchronize in parallel. Depending on the
472 configuration of the lower-level devices, and the available network
473 and disk bandwidth, this can slow down the overall resync process.
474 This option can be used to form a chain or tree of dependencies
475 among devices.
476
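A sketch of such a dependency (the resource and volume names are hypothetical): the device below starts resynchronizing only after volume 0 of resource r0 has finished.

        disk {
            resync-after r0/0;
        }
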
477 rs-discard-granularity byte
When rs-discard-granularity is set to a non-zero, positive value, DRBD tries to perform resync operations in requests of this size. If such a block contains only zero bytes on the sync source node, the sync target node will issue a discard/trim/unmap command for the area.
483
The value is constrained by the discard granularity of the backing block device. If rs-discard-granularity is not a multiple of the discard granularity of the backing block device, DRBD rounds it up. The feature only becomes active if the backing block device reads back zeroes after a discard command.
489
490 The usage of rs-discard-granularity may cause c-max-rate to be
491 exceeded. In particular, the resync rate may reach 10x the value of
492 rs-discard-granularity per second.
493
494 The default value of rs-discard-granularity is 0. This option is
495 available since 8.4.7.
496
497 discard-zeroes-if-aligned {yes | no}
498
499 There are several aspects to discard/trim/unmap support on linux
500 block devices. Even if discard is supported in general, it may fail
501 silently, or may partially ignore discard requests. Devices also
502 announce whether reading from unmapped blocks returns defined data
503 (usually zeroes), or undefined data (possibly old data, possibly
504 garbage).
505
506 If on different nodes, DRBD is backed by devices with differing
507 discard characteristics, discards may lead to data divergence (old
508 data or garbage left over on one backend, zeroes due to unmapped
509 areas on the other backend). Online verify would now potentially
510 report tons of spurious differences. While probably harmless for
511 most use cases (fstrim on a file system), DRBD cannot have that.
512
To play it safe, we have to disable discard support if our local backend (on a Primary) does not support "discard_zeroes_data=true". We also have to translate discards to explicit zero-out on the receiving side, unless the receiving side (Secondary) supports "discard_zeroes_data=true", thereby allocating areas that were supposed to be unmapped.
519
520 There are some devices (notably the LVM/DM thin provisioning) that
521 are capable of discard, but announce discard_zeroes_data=false. In
522 the case of DM-thin, discards aligned to the chunk size will be
523 unmapped, and reading from unmapped sectors will return zeroes.
524 However, unaligned partial head or tail areas of discard requests
525 will be silently ignored.
526
527 If we now add a helper to explicitly zero-out these unaligned
528 partial areas, while passing on the discard of the aligned full
529 chunks, we effectively achieve discard_zeroes_data=true on such
530 devices.
531
532 Setting discard-zeroes-if-aligned to yes will allow DRBD to use
533 discards, and to announce discard_zeroes_data=true, even on
534 backends that announce discard_zeroes_data=false.
535
536 Setting discard-zeroes-if-aligned to no will cause DRBD to always
537 fall-back to zero-out on the receiving side, and to not even
538 announce discard capabilities on the Primary, if the respective
539 backend announces discard_zeroes_data=false.
540
541 We used to ignore the discard_zeroes_data setting completely. To
542 not break established and expected behaviour, and suddenly cause
543 fstrim on thin-provisioned LVs to run out-of-space instead of
544 freeing up space, the default value is yes.
545
546 This option is available since 8.4.7.
547
548 disable-write-same {yes | no}
549
550 Some disks announce WRITE_SAME support to the kernel but fail with
551 an I/O error upon actually receiving such a request. This mostly
552 happens when using virtualized disks -- notably, this behavior has
553 been observed with VMware's virtual disks.
554
When disable-write-same is set to yes, WRITE_SAME detection is manually overridden and support is disabled.
557
558 The default value of disable-write-same is no. This option is
559 available since 8.4.7.
560
561 Section peer-device-options Parameters
562 Please note that you open the section with the disk keyword.
563
564 c-delay-target delay_target,
565 c-fill-target fill_target,
566 c-max-rate max_rate,
567 c-plan-ahead plan_time
568 Dynamically control the resync speed. The following modes are
569 available:
570
571 • Dynamic control with fill target (default). Enabled when
572 c-plan-ahead is non-zero and c-fill-target is non-zero. The
573 goal is to fill the buffers along the data path with a defined
574 amount of data. This mode is recommended when DRBD-proxy is
575 used. Configured with c-plan-ahead, c-fill-target and
576 c-max-rate.
577
578 • Dynamic control with delay target. Enabled when c-plan-ahead is
579 non-zero (default) and c-fill-target is zero. The goal is to
580 have a defined delay along the path. Configured with
581 c-plan-ahead, c-delay-target and c-max-rate.
582
583 • Fixed resync rate. Enabled when c-plan-ahead is zero. DRBD will
584 try to perform resync I/O at a fixed rate. Configured with
585 resync-rate.
586
587 The c-plan-ahead parameter defines how fast DRBD adapts to changes
588 in the resync speed. It should be set to five times the network
589 round-trip time or more. The default value of c-plan-ahead is 20,
590 in units of 0.1 seconds.
591
The c-fill-target parameter defines how much resync data DRBD should aim to have in flight at all times. Common values for "normal" data paths range from 4K to 100K. The default value of c-fill-target is 100, in units of sectors.
596
597 The c-delay-target parameter defines the delay in the resync path
598 that DRBD should aim for. This should be set to five times the
599 network round-trip time or more. The default value of
600 c-delay-target is 10, in units of 0.1 seconds.
601
602 The c-max-rate parameter limits the maximum bandwidth used by
603 dynamically controlled resyncs. Setting this to zero removes the
604 limitation (since DRBD 9.0.28). It should be set to either the
605 bandwidth available between the DRBD hosts and the machines hosting
606 DRBD-proxy, or to the available disk bandwidth. The default value
607 of c-max-rate is 102400, in units of KiB/s.
608
609 Dynamic resync speed control is available since DRBD 8.3.9.
610
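A sketch of a dynamically controlled resync (all values are assumptions); since these are peer-device options, the section is opened with the disk keyword:

        disk {
            c-plan-ahead  20;     # 2 seconds
            c-fill-target 100K;
            c-max-rate    100M;
            c-min-rate    4M;
        }
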
611 c-min-rate min_rate
612 A node which is primary and sync-source has to schedule application
613 I/O requests and resync I/O requests. The c-min-rate parameter
614 limits how much bandwidth is available for resync I/O; the
615 remaining bandwidth is used for application I/O.
616
617 A c-min-rate value of 0 means that there is no limit on the resync
618 I/O bandwidth. This can slow down application I/O significantly.
619 Use a value of 1 (1 KiB/s) for the lowest possible resync rate.
620
621 The default value of c-min-rate is 250, in units of KiB/s.
622
623 resync-rate rate
624
625 Define how much bandwidth DRBD may use for resynchronizing. DRBD
626 allows "normal" application I/O even during a resync. If the resync
627 takes up too much bandwidth, application I/O can become very slow.
This parameter allows you to avoid that. Please note that this option only works when the dynamic resync controller is disabled.
630
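A sketch of a fixed-rate configuration (the rate is an arbitrary example): setting c-plan-ahead to zero disables the dynamic controller so that resync-rate takes effect.

        disk {
            c-plan-ahead 0;
            resync-rate  10M;
        }
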
631 Section global Parameters
632 dialog-refresh time
633
634 The DRBD init script can be used to configure and start DRBD
635 devices, which can involve waiting for other cluster nodes. While
waiting, the init script shows the remaining waiting time. The dialog-refresh parameter defines the number of seconds between updates of
638 that countdown. The default value is 1; a value of 0 turns off the
639 countdown.
640
641 disable-ip-verification
642 Normally, DRBD verifies that the IP addresses in the configuration
643 match the host names. Use the disable-ip-verification parameter to
644 disable these checks.
645
646 usage-count {yes | no | ask}
As explained on DRBD's Online Usage Counter[2] web page, DRBD
648 includes a mechanism for anonymously counting how many
649 installations are using which versions of DRBD. The results are
650 available on the web page for anyone to see.
651
652 This parameter defines if a cluster node participates in the usage
653 counter; the supported values are yes, no, and ask (ask the user,
654 the default).
655
656 We would like to ask users to participate in the online usage
counter, as this provides us with valuable feedback for steering the
658 development of DRBD.
659
660 udev-always-use-vnr
661 When udev asks drbdadm for a list of device related symlinks,
662 drbdadm would suggest symlinks with differing naming conventions,
663 depending on whether the resource has explicit volume VNR { }
664 definitions, or only one single volume with the implicit volume
665 number 0:
666
667 # implicit single volume without "volume 0 {}" block
668 DEVICE=drbd<minor>
669 SYMLINK_BY_RES=drbd/by-res/<resource-name>
670 SYMLINK_BY_DISK=drbd/by-disk/<backing-disk-name>
671
672 # explicit volume definition: volume VNR { }
673 DEVICE=drbd<minor>
674 SYMLINK_BY_RES=drbd/by-res/<resource-name>/VNR
675 SYMLINK_BY_DISK=drbd/by-disk/<backing-disk-name>
676
If you define this parameter in the global section, drbdadm will always add the .../VNR part, regardless of whether the volume definition was implicit or explicit.

For legacy backward compatibility, this is off by default, but we recommend enabling it.
683
684 Section handlers Parameters
685 after-resync-target cmd
686
Called on a resync target when its node state changes from Inconsistent to Consistent, that is, when a resync finishes. This handler can be used for removing the snapshot created in the before-resync-target handler.
691
692 before-resync-target cmd
693
694 Called on a resync target before a resync begins. This handler can
695 be used for creating a snapshot of the lower-level device for the
696 duration of the resync: if the resync source becomes unavailable
697 during a resync, reverting to the snapshot can restore a consistent
698 state.
699
700 before-resync-source cmd
701
702 Called on a resync source before a resync begins.
703
704 out-of-sync cmd
705
706 Called on all nodes after a verify finishes and out-of-sync blocks
707 were found. This handler is mainly used for monitoring purposes. An
708 example would be to call a script that sends an alert SMS.
709
710 quorum-lost cmd
711
712 Called on a Primary that lost quorum. This handler is usually used
713 to reboot the node if it is not possible to restart the application
714 that uses the storage on top of DRBD.
715
716 fence-peer cmd
717
718 Called when a node should fence a resource on a particular peer.
719 The handler should not use the same communication path that DRBD
720 uses for talking to the peer.
721
722 unfence-peer cmd
723
724 Called when a node should remove fencing constraints from other
725 nodes.
726
727 initial-split-brain cmd
728
729 Called when DRBD connects to a peer and detects that the peer is in
730 a split-brain state with the local node. This handler is also
731 called for split-brain scenarios which will be resolved
732 automatically.
733
734 local-io-error cmd
735
736 Called when an I/O error occurs on a lower-level device.
737
738 pri-lost cmd
739
740 The local node is currently primary, but DRBD believes that it
741 should become a sync target. The node should give up its primary
742 role.
743
744 pri-lost-after-sb cmd
745
746 The local node is currently primary, but it has lost the
747 after-split-brain auto recovery procedure. The node should be
748 abandoned.
749
750 pri-on-incon-degr cmd
751
752 The local node is primary, and neither the local lower-level device
753 nor a lower-level device on a peer is up to date. (The primary has
754 no device to read from or to write to.)
755
756 split-brain cmd
757
758 DRBD has detected a split-brain situation which could not be
759 resolved automatically. Manual recovery is necessary. This handler
760 can be used to call for administrator attention.
761
762 disconnected cmd
763
764 A connection to a peer went down. The handler can learn about the
765 reason for the disconnect from the DRBD_CSTATE environment
766 variable.
767
768 Section net Parameters
769 after-sb-0pri policy
770 Define how to react if a split-brain scenario is detected and none
771 of the two nodes is in primary role. (We detect split-brain
772 scenarios when two nodes connect; split-brain decisions are always
773 between two nodes.) The defined policies are:
774
775 disconnect
776 No automatic resynchronization; simply disconnect.
777
778 discard-younger-primary,
779 discard-older-primary
780 Resynchronize from the node which became primary first
781 (discard-younger-primary) or last (discard-older-primary). If
782 both nodes became primary independently, the
783 discard-least-changes policy is used.
784
785 discard-zero-changes
786 If only one of the nodes wrote data since the split brain
787 situation was detected, resynchronize from this node to the
788 other. If both nodes wrote data, disconnect.
789
790 discard-least-changes
791 Resynchronize from the node with more modified blocks.
792
793 discard-node-nodename
794 Always resynchronize to the named node.
795
796 after-sb-1pri policy
797 Define how to react if a split-brain scenario is detected, with one
798 node in primary role and one node in secondary role. (We detect
799 split-brain scenarios when two nodes connect, so split-brain
800 decisions are always among two nodes.) The defined policies are:
801
802 disconnect
803 No automatic resynchronization, simply disconnect.
804
805 consensus
806 Discard the data on the secondary node if the after-sb-0pri
807 algorithm would also discard the data on the secondary node.
808 Otherwise, disconnect.
809
810 violently-as0p
811 Always take the decision of the after-sb-0pri algorithm, even
812 if it causes an erratic change of the primary's view of the
813 data. This is only useful if a single-node file system (i.e.,
814 not OCFS2 or GFS) with the allow-two-primaries flag is used.
815 This option can cause the primary node to crash, and should not
816 be used.
817
818 discard-secondary
819 Discard the data on the secondary node.
820
821 call-pri-lost-after-sb
822 Always take the decision of the after-sb-0pri algorithm. If the
823 decision is to discard the data on the primary node, call the
824 pri-lost-after-sb handler on the primary node.
825
826 after-sb-2pri policy
827 Define how to react if a split-brain scenario is detected and both
828 nodes are in primary role. (We detect split-brain scenarios when
829 two nodes connect, so split-brain decisions are always among two
830 nodes.) The defined policies are:
831
832 disconnect
833 No automatic resynchronization, simply disconnect.
834
835 violently-as0p
836 See the violently-as0p policy for after-sb-1pri.
837
838 call-pri-lost-after-sb
839 Call the pri-lost-after-sb helper program on one of the
840 machines unless that machine can demote to secondary. The
841 helper program is expected to reboot the machine, which brings
842 the node into a secondary role. Which machine runs the helper
843 program is determined by the after-sb-0pri strategy.
844
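A sketch of a frequently cited set of automatic split-brain recovery policies (whether these are acceptable depends entirely on how much data loss you can tolerate):

        net {
            after-sb-0pri discard-zero-changes;
            after-sb-1pri discard-secondary;
            after-sb-2pri disconnect;
        }
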
845 allow-two-primaries
846
847 The most common way to configure DRBD devices is to allow only one
848 node to be primary (and thus writable) at a time.
849
850 In some scenarios it is preferable to allow two nodes to be primary
851 at once; a mechanism outside of DRBD then must make sure that
852 writes to the shared, replicated device happen in a coordinated
853 way. This can be done with a shared-storage cluster file system
854 like OCFS2 and GFS, or with virtual machine images and a virtual
855 machine manager that can migrate virtual machines between physical
856 machines.
857
858 The allow-two-primaries parameter tells DRBD to allow two nodes to
859 be primary at the same time. Never enable this option when using a
860 non-distributed file system; otherwise, data corruption and node
861 crashes will result!
862
863 always-asbp
Normally, the automatic after-split-brain policies are only used if the current states of the UUIDs do not indicate the presence of a third node.

With this option, you request that the automatic after-split-brain policies are used as long as the data sets of the nodes are somehow related. This might cause a full sync if the UUIDs indicate the presence of a third node (or if double faults led to strange UUID sets).
873
874 connect-int time
875
876 As soon as a connection between two nodes is configured with
877 drbdsetup connect, DRBD immediately tries to establish the
878 connection. If this fails, DRBD waits for connect-int seconds and
879 then repeats. The default value of connect-int is 10 seconds.
880
881 cram-hmac-alg hash-algorithm
882
883 Configure the hash-based message authentication code (HMAC) or
884 secure hash algorithm to use for peer authentication. The kernel
885 supports a number of different algorithms, some of which may be
886 loadable as kernel modules. See the shash algorithms listed in
887 /proc/crypto. By default, cram-hmac-alg is unset. Peer
888 authentication also requires a shared-secret to be configured.
889
890 csums-alg hash-algorithm
891
892 Normally, when two nodes resynchronize, the sync target requests a
893 piece of out-of-sync data from the sync source, and the sync source
894 sends the data. With many usage patterns, a significant number of
895 those blocks will actually be identical.
896
897 When a csums-alg algorithm is specified, when requesting a piece of
898 out-of-sync data, the sync target also sends along a hash of the
899 data it currently has. The sync source compares this hash with its
900 own version of the data. It sends the sync target the new data if
901 the hashes differ, and tells it that the data are the same
902 otherwise. This reduces the network bandwidth required, at the cost
903 of higher cpu utilization and possibly increased I/O on the sync
904 target.
905
906 The csums-alg can be set to one of the secure hash algorithms
907 supported by the kernel; see the shash algorithms listed in
908 /proc/crypto. By default, csums-alg is unset.
909
910 csums-after-crash-only
911
Enabling this option (and csums-alg, above) makes it possible to use the checksum-based resync only for the first resync after a primary crash, but not for later "network hiccups".

In most cases, blocks that are marked as need-to-be-resynced have in fact changed, so calculating checksums, and both reading and writing the blocks on the resync target, is all effective overhead.
919
920 The advantage of checksum based resync is mostly after primary
921 crash recovery, where the recovery marked larger areas (those
922 covered by the activity log) as need-to-be-resynced, just in case.
923 Introduced in 8.4.5.
924
925 data-integrity-alg alg
926 DRBD normally relies on the data integrity checks built into the
927 TCP/IP protocol, but if a data integrity algorithm is configured,
928 it will additionally use this algorithm to make sure that the data
929 received over the network match what the sender has sent. If a data
930 integrity error is detected, DRBD will close the network connection
931 and reconnect, which will trigger a resync.
932
933 The data-integrity-alg can be set to one of the secure hash
934 algorithms supported by the kernel; see the shash algorithms listed
935 in /proc/crypto. By default, this mechanism is turned off.
936
937 Because of the CPU overhead involved, we recommend not to use this
938 option in production environments. Also see the notes on data
939 integrity below.
940
941 fencing fencing_policy
942
943 Fencing is a preventive measure to avoid situations where both
944 nodes are primary and disconnected. This is also known as a
945 split-brain situation. DRBD supports the following fencing
946 policies:
947
948 dont-care
949 No fencing actions are taken. This is the default policy.
950
951 resource-only
952 If a node becomes a disconnected primary, it tries to fence the
953 peer. This is done by calling the fence-peer handler. The
954 handler is supposed to reach the peer over an alternative
955 communication path and call 'drbdadm outdate minor' there.
956
957 resource-and-stonith
958 If a node becomes a disconnected primary, it freezes all its IO
959 operations and calls its fence-peer handler. The fence-peer
960 handler is supposed to reach the peer over an alternative
961 communication path and call 'drbdadm outdate minor' there. In
962 case it cannot do that, it should stonith the peer. IO is
963 resumed as soon as the situation is resolved. In case the
964 fence-peer handler fails, I/O can be resumed manually with
965 'drbdadm resume-io'.
966
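A sketch combining a fencing policy with matching handlers; the crm-fence-peer scripts shipped with drbd-utils for Pacemaker clusters are used as an assumed example:

        resource r0 {
            net {
                fencing resource-and-stonith;
            }
            handlers {
                fence-peer   "/usr/lib/drbd/crm-fence-peer.9.sh";
                unfence-peer "/usr/lib/drbd/crm-unfence-peer.9.sh";
            }
        }
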
967 ko-count number
968
969 If a secondary node fails to complete a write request in ko-count
970 times the timeout parameter, it is excluded from the cluster. The
971 primary node then sets the connection to this secondary node to
972 Standalone. To disable this feature, you should explicitly set it
973 to 0; defaults may change between versions.
974
975 max-buffers number
976
977 Limits the memory usage per DRBD minor device on the receiving
978 side, or for internal buffers during resync or online-verify. Unit
979 is PAGE_SIZE, which is 4 KiB on most systems. The minimum possible
980 setting is hard coded to 32 (=128 KiB). These buffers are used to
981 hold data blocks while they are written to/read from disk. To avoid
982 possible distributed deadlocks on congestion, this setting is used
983 as a throttle threshold rather than a hard limit. Once more than
984 max-buffers pages are in use, further allocation from this pool is
985 throttled. You want to increase max-buffers if you cannot saturate
986 the IO backend on the receiving side.
987
988 max-epoch-size number
989
990 Define the maximum number of write requests DRBD may issue before
991 issuing a write barrier. The default value is 2048, with a minimum
992 of 1 and a maximum of 20000. Setting this parameter to a value
993 below 10 is likely to decrease performance.
994
995 on-congestion policy,
996 congestion-fill threshold,
997 congestion-extents threshold
998 By default, DRBD blocks when the TCP send queue is full. This
999 prevents applications from generating further write requests until
1000 more buffer space becomes available again.
1001
1002 When DRBD is used together with DRBD-proxy, it can be better to use
1003 the pull-ahead on-congestion policy, which can switch DRBD into
1004 ahead/behind mode before the send queue is full. DRBD then records
1005 the differences between itself and the peer in its bitmap, but it
1006 no longer replicates them to the peer. When enough buffer space
1007 becomes available again, the node resynchronizes with the peer and
1008 switches back to normal replication.
1009
1010 This has the advantage of not blocking application I/O even when
1011 the queues fill up, and the disadvantage that peer nodes can fall
1012 behind much further. Also, while resynchronizing, peer nodes will
1013 become inconsistent.
1014
1015 The available congestion policies are block (the default) and
1016 pull-ahead. The congestion-fill parameter defines how much data is
1017 allowed to be "in flight" in this connection. The default value is
1018 0, which disables this mechanism of congestion control, with a
1019 maximum of 10 GiBytes. The congestion-extents parameter defines how
1020 many bitmap extents may be active before switching into
1021 ahead/behind mode, with the same default and limits as the
1022 al-extents parameter. The congestion-extents parameter is effective
1023 only when set to a value smaller than al-extents.
1024
1025 Ahead/behind mode is available since DRBD 8.3.10.
1026
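A sketch of ahead/behind mode as it might be configured for a DRBD-proxy link (the thresholds are assumptions):

        net {
            on-congestion      pull-ahead;
            congestion-fill    400M;
            congestion-extents 1000;
        }
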
1027 ping-int interval
1028
1029 When the TCP/IP connection to a peer is idle for more than ping-int
1030 seconds, DRBD will send a keep-alive packet to make sure that a
1031 failed peer or network connection is detected reasonably soon. The
1032 default value is 10 seconds, with a minimum of 1 and a maximum of
1033 120 seconds. The unit is seconds.
1034
1035 ping-timeout timeout
1036
1037 Define the timeout for replies to keep-alive packets. If the peer
1038 does not reply within ping-timeout, DRBD will close and try to
1039 reestablish the connection. The default value is 0.5 seconds, with
1040 a minimum of 0.1 seconds and a maximum of 30 seconds. The unit is
1041 tenths of a second.
1042
1043 socket-check-timeout timeout
In setups involving a DRBD-proxy and connections that experience a lot of buffer-bloat, it might be necessary to set ping-timeout to an unusually high value. By default, DRBD uses the same value to wait until a newly established TCP connection is stable. Since the DRBD-proxy is usually located in the same data center, such a long wait time may hinder DRBD's connect process.

In such setups, socket-check-timeout should be set to at least the round-trip time between DRBD and DRBD-proxy; in most cases that means 1.
1054
1055 The default unit is tenths of a second, the default value is 0
1056 (which causes DRBD to use the value of ping-timeout instead).
1057 Introduced in 8.4.5.
1058
1059 protocol name
1060 Use the specified protocol on this connection. The supported
1061 protocols are:
1062
1063 A
1064 Writes to the DRBD device complete as soon as they have reached
1065 the local disk and the TCP/IP send buffer.
1066
1067 B
1068 Writes to the DRBD device complete as soon as they have reached
1069 the local disk, and all peers have acknowledged the receipt of
1070 the write requests.
1071
1072 C
1073 Writes to the DRBD device complete as soon as they have reached
1074 the local and all remote disks.
1075
1076
1077 rcvbuf-size size
1078
1079 Configure the size of the TCP/IP receive buffer. A value of 0 (the
1080 default) causes the buffer size to adjust dynamically. This
1081 parameter usually does not need to be set, but it can be set to a
1082 value up to 10 MiB. The default unit is bytes.
1083
1084 rr-conflict policy
1085 This option helps to solve the cases when the outcome of the resync
1086 decision is incompatible with the current role assignment in the
1087 cluster. The defined policies are:
1088
1089 disconnect
1090 No automatic resynchronization, simply disconnect.
1091
1092 retry-connect
Disconnect now, and retry to connect immediately afterwards.
1094
1095 violently
1096 Resync to the primary node is allowed, violating the assumption
1097 that data on a block device are stable for one of the nodes.
1098 Do not use this option, it is dangerous.
1099
1100 call-pri-lost
1101 Call the pri-lost handler on one of the machines. The handler
1102 is expected to reboot the machine, which puts it into secondary
1103 role.
1104
1105 auto-discard
1106 Auto-discard reverses the resync direction, so that DRBD
1107 resyncs the current primary to the current secondary.
1108 Auto-discard only applies when protocol A is in use and the
1109 resync decision is based on the principle that a crashed
1110 primary should be the source of a resync. When a primary node
1111 crashes, it might have written some last updates to its disk,
which were not received by a protocol A secondary. By promoting the secondary in the meantime, the user accepted that those last updates have been lost. By using auto-discard, you consent to having the last updates (from before the crash of the primary) rolled back automatically.
1117
1118 shared-secret secret
1119
1120 Configure the shared secret used for peer authentication. The
1121 secret is a string of up to 64 characters. Peer authentication also
1122 requires the cram-hmac-alg parameter to be set.
1123
1124 sndbuf-size size
1125
1126 Configure the size of the TCP/IP send buffer. Since DRBD 8.0.13 /
1127 8.2.7, a value of 0 (the default) causes the buffer size to adjust
1128 dynamically. Values below 32 KiB are harmful to the throughput on
1129 this connection. Large buffer sizes can be useful especially when
1130 protocol A is used over high-latency networks; the maximum value
1131 supported is 10 MiB.
1132
1133 tcp-cork
1134 By default, DRBD uses the TCP_CORK socket option to prevent the
1135 kernel from sending partial messages; this results in fewer and
1136 bigger packets on the network. Some network stacks can perform
1137 worse with this optimization. On these, the tcp-cork parameter can
1138 be used to turn this optimization off.
1139
1140 timeout time
1141
1142 Define the timeout for replies over the network: if a peer node
1143 does not send an expected reply within the specified timeout, it is
1144 considered dead and the TCP/IP connection is closed. The timeout
1145 value must be lower than connect-int and lower than ping-int. The
1146 default is 6 seconds; the value is specified in tenths of a second.
1147
1148 transport type
1149
With DRBD 9, the network transport used by DRBD is loaded as a separate module. With this option you can specify which transport and module to load. At present only two options exist, tcp and rdma. Please note that currently the RDMA transport module is only available with a license purchased from LINBIT. The default is tcp.
1155
1156 use-rle
1157
1158 Each replicated device on a cluster node has a separate bitmap for
1159 each of its peer devices. The bitmaps are used for tracking the
1160 differences between the local and peer device: depending on the
1161 cluster state, a disk range can be marked as different from the
1162 peer in the device's bitmap, in the peer device's bitmap, or in
1163 both bitmaps. When two cluster nodes connect, they exchange each
1164 other's bitmaps, and they each compute the union of the local and
1165 peer bitmap to determine the overall differences.
1166
1167 Bitmaps of very large devices are also relatively large, but they
1168 usually compress very well using run-length encoding. This can save
1169 time and bandwidth for the bitmap transfers.
1170
1171 The use-rle parameter determines if run-length encoding should be
1172 used. It is on by default since DRBD 8.4.0.
1173
1174 verify-alg hash-algorithm
1175 Online verification (drbdadm verify) computes and compares
1176 checksums of disk blocks (i.e., hash values) in order to detect if
1177 they differ. The verify-alg parameter determines which algorithm to
1178 use for these checksums. It must be set to one of the secure hash
1179 algorithms supported by the kernel before online verify can be
1180 used; see the shash algorithms listed in /proc/crypto.
1181
1182 We recommend to schedule online verifications regularly during
1183 low-load periods, for example once a month. Also see the notes on
1184 data integrity below.
1185
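A sketch of enabling online verification (the algorithm must be available in your kernel; sha256 is an assumption):

        net {
            verify-alg sha256;
        }

A verify run can then be started manually or from a monthly cron job with, for example, drbdadm verify r0.
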
1186 allow-remote-read bool-value
1187 Allows or disallows DRBD to read from a peer node.
1188
1189 When the disk of a primary node is detached, DRBD will try to
1190 continue reading and writing from another node in the cluster. For
1191 this purpose, it searches for nodes with up-to-date data, and uses
1192 any found node to resume operations. In some cases it may not be
1193 desirable to read back data from a peer node, because the node
1194 should only be used as a replication target. In this case, the
1195 allow-remote-read parameter can be set to no, which would prohibit
1196 this node from reading data from the peer node.
1197
1198 The allow-remote-read parameter is available since DRBD 9.0.19, and
1199 defaults to yes.
1200
1201 Section on Parameters
1202 address [address-family] address:port
1203
1204 Defines the address family, address, and port of a connection
1205 endpoint.
1206
1207 The address families ipv4, ipv6, ssocks (Dolphin Interconnect
1208 Solutions' "super sockets"), sdp (Infiniband Sockets Direct
1209 Protocol), and sci are supported (sci is an alias for ssocks). If
1210 no address family is specified, ipv4 is assumed. For all address
families except ipv6, the address is specified in IPv4 address
1212 notation (for example, 1.2.3.4). For ipv6, the address is enclosed
1213 in brackets and uses IPv6 address notation (for example,
1214 [fd01:2345:6789:abcd::1]). The port is always specified as a
1215 decimal number from 1 to 65535.
1216
1217 On each host, the port numbers must be unique for each address;
1218 ports cannot be shared.
1219
1220 node-id value
1221
1222 Defines the unique node identifier for a node in the cluster. Node
1223 identifiers are used to identify individual nodes in the network
1224 protocol, and to assign bitmap slots to nodes in the metadata.
1225
Node identifiers can only be reassigned in a cluster when the
1227 cluster is down. It is essential that the node identifiers in the
1228 configuration and in the device metadata are changed consistently
1229 on all hosts. To change the metadata, dump the current state with
1230 drbdmeta dump-md, adjust the bitmap slot assignment, and update the
1231 metadata with drbdmeta restore-md.
1232
1233 The node-id parameter exists since DRBD 9. Its value ranges from 0
1234 to 16; there is no default.
1235
1236 Section options Parameters (Resource Options)
1237 auto-promote bool-value
1238 A resource must be promoted to primary role before any of its
1239 devices can be mounted or opened for writing.
1240
1241 Before DRBD 9, this could only be done explicitly ("drbdadm
primary"). Since DRBD 9, the auto-promote parameter allows a resource to be promoted to primary role automatically when one of its devices is mounted or opened for writing. As soon as all devices
1245 are unmounted or closed with no more remaining users, the role of
1246 the resource changes back to secondary.
1247
1248 Automatic promotion only succeeds if the cluster state allows it
1249 (that is, if an explicit drbdadm primary command would succeed).
1250 Otherwise, mounting or opening the device fails as it already did
1251 before DRBD 9: the mount(2) system call fails with errno set to
1252 EROFS (Read-only file system); the open(2) system call fails with
1253 errno set to EMEDIUMTYPE (wrong medium type).
1254
1255 Irrespective of the auto-promote parameter, if a device is promoted
1256 explicitly (drbdadm primary), it also needs to be demoted
1257 explicitly (drbdadm secondary).
1258
1259 The auto-promote parameter is available since DRBD 9.0.0, and
1260 defaults to yes.
1261
1262 cpu-mask cpu-mask
1263
1264 Set the cpu affinity mask for DRBD kernel threads. The cpu mask is
1265 specified as a hexadecimal number. The default value is 0, which
1266 lets the scheduler decide which kernel threads run on which CPUs.
1267 CPU numbers in cpu-mask which do not exist in the system are
1268 ignored.
1269
1270 on-no-data-accessible policy
1271 Determine how to deal with I/O requests when the requested data is
1272 not available locally or remotely (for example, when all disks have
1273 failed). When quorum is enabled, on-no-data-accessible should be
1274 set to the same value as on-no-quorum. The defined policies are:
1275
1276 io-error
1277 System calls fail with errno set to EIO.
1278
1279 suspend-io
1280 The resource suspends I/O. I/O can be resumed by (re)attaching
1281 the lower-level device, by connecting to a peer which has
1282 access to the data, or by forcing DRBD to resume I/O with
1283 drbdadm resume-io res. When no data is available, forcing I/O
1284 to resume will result in the same behavior as the io-error
1285 policy.
1286
1287 This setting is available since DRBD 8.3.9; the default policy is
1288 io-error.
1289
1290 peer-ack-window value
1291
1292 On each node and for each device, DRBD maintains a bitmap of the
1293 differences between the local and remote data for each peer device.
1294 For example, in a three-node setup (nodes A, B, C) each with a
1295 single device, every node maintains one bitmap for each of its
1296 peers.
1297
1298 When nodes receive write requests, they know how to update the
1299 bitmaps for the writing node, but not how to update the bitmaps
1300 between themselves. In this example, when a write request
1301 propagates from node A to B and C, nodes B and C know that they
1302 have the same data as node A, but not whether or not they both have
1303 the same data.
1304
1305 As a remedy, the writing node occasionally sends peer-ack packets
1306 to its peers which tell them which state they are in relative to
1307 each other.
1308
1309 The peer-ack-window parameter specifies how much data a primary
1310 node may send before sending a peer-ack packet. A low value causes
1311 increased network traffic; a high value causes less network traffic
1312 but higher memory consumption on secondary nodes and higher resync
1313 times between the secondary nodes after primary node failures.
1314 (Note: peer-ack packets may be sent due to other reasons as well,
1315 e.g. membership changes or expiry of the peer-ack-delay timer.)
1316
1317 The default value for peer-ack-window is 2 MiB, the default unit is
1318 sectors. This option is available since 9.0.0.
1319
1320 peer-ack-delay expiry-time
1321
1322 If after the last finished write request no new write request gets
1323 issued for expiry-time, then a peer-ack packet is sent. If a new
1324 write request is issued before the timer expires, the timer gets
1325 reset to expiry-time. (Note: peer-ack packets may be sent due to
1326 other reasons as well, e.g. membership changes or the
1327 peer-ack-window option.)
1328
1329 This parameter may influence resync behavior on remote nodes. Peer
nodes need to wait until they receive a peer-ack before releasing a lock on an AL-extent. Resync operations between peers may need to wait for these locks.
1333
1334 The default value for peer-ack-delay is 100 milliseconds, the
1335 default unit is milliseconds. This option is available since 9.0.0.
1336
1337 quorum value
1338
1339 When activated, a cluster partition requires quorum in order to
1340 modify the replicated data set. That means a node in the cluster
1341 partition can only be promoted to primary if the cluster partition
1342 has quorum. Every node that has a disk and is directly connected to
1343 the node to be promoted counts toward quorum. If a primary node
1344 needs to execute a write request but the cluster partition has lost
1345 quorum, it will freeze I/O or reject the write request with an
1346 error (depending on the on-no-quorum setting). Upon losing quorum,
1347 a primary always invokes the quorum-lost handler. The handler is
1348 intended for notification purposes; its return code is ignored.
1349
1350 The option's value can be set to off, majority, all or a numeric
1351 value. If you set it to a numeric value, make sure that the value
1352 is greater than half of your number of nodes. Quorum is a mechanism
1353 to avoid data divergence; it can be used instead of fencing when
1354 there are more than two replicas. It defaults to off.
1355
1356 If all missing nodes are marked as outdated, a partition always has
1357 quorum, no matter how small it is. That is, if you gracefully
1358 disconnect all secondary nodes, a single primary continues to
1359 operate. The moment a single secondary is lost, however, it has to
1360 be assumed that it forms a partition with all the missing outdated
1361 nodes. If the remaining partition might be smaller than the other
1362 one, quorum is lost at that moment.
1363
1364 In case you want to allow permanently diskless nodes to gain
1365 quorum, it is recommended not to use majority or all, but to
1366 specify an absolute number, since DRBD's heuristic to determine the
1367 total number of diskful nodes in the cluster is unreliable.
1368
1369 The quorum implementation is available starting with the DRBD
1370 kernel driver version 9.0.7.
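
     As a sketch, quorum with the majority policy plus a notification
     handler might be configured as follows; the handler path is a
     hypothetical example, and the handlers section is only needed if a
     quorum-lost notification is wanted:

         options {
             quorum majority;
         }
         handlers {
             # called for notification only; its exit code is ignored
             quorum-lost "/usr/local/sbin/drbd-quorum-lost.sh";
         }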
1371
1372 quorum-minimum-redundancy value
1373
1374 This option sets the minimum number of nodes with an UpToDate disk
1375 that is required for a partition to gain quorum. This is a
1376 different requirement than the plain quorum option expresses.
1377
1378 The option's value might be set to off, majority, all or a numeric
1379 value. If you set it to a numeric value, make sure that the value
1380 is greater than half of your number of nodes.
1381
1382 In case you want to allow permanently diskless nodes to gain
1383 quorum, it is recommended not to use majority or all, but to
1384 specify an absolute number, since DRBD's heuristic to determine the
1385 total number of diskful nodes in the cluster is unreliable.
1386
1387 This option is available starting with the DRBD kernel driver
1388 version 9.0.10.
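
     For example, in a five-node cluster where two nodes are permanently
     diskless, absolute numbers could be used as recommended above (the
     values are illustrative):

         options {
             quorum 3;                      # more than half of 5 nodes
             quorum-minimum-redundancy 2;   # at least 2 UpToDate disks
         }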
1389
1390 on-no-quorum {io-error | suspend-io}
1391
1392 By default, DRBD freezes I/O on a device that has lost quorum.
1393 Setting on-no-quorum to io-error causes all I/O operations to be
1394 completed with an error if quorum is lost.
1395
1396 Usually, on-no-data-accessible should be set to the same value as
1397 on-no-quorum, because on-no-quorum takes precedence.
1398
1399 The on-no-quorum option is available starting with the DRBD kernel
1400 driver version 9.0.8.
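
     A minimal sketch that completes I/O with errors instead of freezing
     it when quorum is lost, for example for applications that handle
     I/O errors themselves:

         options {
             quorum majority;
             on-no-quorum io-error;
         }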
1401
1402 on-suspended-primary-outdated {disconnect | force-secondary}
1403
1404 This setting is only relevant when on-no-quorum is set to
1405 suspend-io. It applies in the following scenario: a primary node
1406 loses quorum and therefore has all its I/O requests frozen. This
1407 primary node then connects to another, quorate partition. It
1408 detects that a node in this quorate partition was promoted to
1409 primary and started a newer data generation there. As a result,
1410 the first primary learns that it has to consider itself outdated.
1411
1412 When this option is set to force-secondary, the node demotes to
1413 secondary immediately and fails all pending (and new) I/O requests
1414 with I/O errors. It refuses to allow any process to open the DRBD
1415 devices until all openers have closed the device. This state is
1416 visible in status and events2 output under the name force-io-failures.
1417
1418 The disconnect setting simply causes that node to reject
1419 connection attempts and to stay isolated.
1420
1421 The on-suspended-primary-outdated option is available starting with
1422 the DRBD kernel driver version 9.1.7. It has a default value of
1423 disconnect.
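
     A sketch that combines suspended I/O with forced demotion, so that
     an isolated primary which later learns that it is outdated demotes
     itself and fails its pending I/O:

         options {
             on-no-quorum suspend-io;
             on-suspended-primary-outdated force-secondary;
         }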
1424
1425 Section startup Parameters
1426 The parameters in this section define the behavior of DRBD at system
1427 startup time, in the DRBD init script. They have no effect once the
1428 system is up and running.
1429
1430 degr-wfc-timeout timeout
1431
1432 Define how long to wait until all peers are connected in case the
1433 cluster consisted of only a single node when the system went down.
1434 This parameter is usually set to a value smaller than wfc-timeout.
1435 The assumption here is that peers which were unreachable before a
1436 reboot are less likely to be reachable after the reboot, so waiting
1437 is less likely to help.
1438
1439 The timeout is specified in seconds. The default value is 0, which
1440 stands for an infinite timeout. Also see the wfc-timeout parameter.
1441
1442 outdated-wfc-timeout timeout
1443
1444 Define how long to wait until all peers are connected if all peers
1445 were outdated when the system went down. This parameter is usually
1446 set to a value smaller than wfc-timeout. The assumption here is
1447 that an outdated peer cannot have become primary in the meantime,
1448 so we don't need to wait for it as long as for a node which was
1449 alive before.
1450
1451 The timeout is specified in seconds. The default value is 0, which
1452 stands for an infinite timeout. Also see the wfc-timeout parameter.
1453
1454 stacked-timeouts
1455 On stacked devices, the wfc-timeout and degr-wfc-timeout parameters
1456 in the configuration are usually ignored, and both timeouts are set
1457 to twice the connect-int timeout. The stacked-timeouts parameter
1458 tells DRBD to use the wfc-timeout and degr-wfc-timeout parameters
1459 as defined in the configuration, even on stacked devices. Only use
1460 this parameter if the peer of the stacked resource is usually not
1461 available, or will not become primary. Incorrect use of this
1462 parameter can lead to unexpected split-brain scenarios.
1463
1464 wait-after-sb
1465 This parameter causes DRBD to continue waiting in the init script
1466 even when a split-brain situation has been detected, and the nodes
1467 therefore refuse to connect to each other.
1468
1469 wfc-timeout timeout
1470
1471 Define how long the init script waits until all peers are
1472 connected. This can be useful in combination with a cluster manager
1473 which cannot manage DRBD resources: when the cluster manager
1474 starts, the DRBD resources will already be up and running. With a
1475 more capable cluster manager such as Pacemaker, it makes more sense
1476 to let the cluster manager control DRBD resources. The timeout is
1477 specified in seconds. The default value is 0, which stands for an
1478 infinite timeout. Also see the degr-wfc-timeout parameter.
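
     For example, a startup section that waits up to two minutes for
     peers in the normal case, but only 30 seconds if the cluster was
     degraded or all peers were outdated before the reboot, might look
     like this (all timeouts are illustrative and given in seconds):

         startup {
             wfc-timeout 120;
             degr-wfc-timeout 30;
             outdated-wfc-timeout 30;
         }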
1479
1480 Section volume Parameters
1481 device /dev/drbdminor-number
1482
1483 Define the device name and minor number of a replicated block
1484 device. This is the device that applications are supposed to
1485 access; in most cases, the device is not used directly, but as a
1486 file system. This parameter is required and the standard device
1487 naming convention is assumed.
1488
1489 In addition to this device, udev will create
1490 /dev/drbd/by-res/resource/volume and
1491 /dev/drbd/by-disk/lower-level-device symlinks to the device.
1492
1493 disk {[disk] | none}
1494
1495 Define the lower-level block device that DRBD will use for storing
1496 the actual data. While the replicated drbd device is configured,
1497 the lower-level device must not be used directly. Even read-only
1498 access with tools like dumpe2fs(8) and similar is not allowed. The
1499 keyword none specifies that no lower-level block device is
1500 configured; this also overrides inheritance of the lower-level
1501 device.
1502
1503 meta-disk internal,
1504 meta-disk device,
1505 meta-disk device [index]
1506
1507 Define where the metadata of a replicated block device resides: it
1508 can be internal, meaning that the lower-level device contains both
1509 the data and the metadata, or on a separate device.
1510
1511 When the index form of this parameter is used, multiple replicated
1512 devices can share the same metadata device, each using a separate
1513 index. Each index occupies 128 MiB of data, which corresponds to a
1514 replicated device size of at most 4 TiB with two cluster nodes. We
1515 recommend not sharing metadata devices anymore, and instead using
1516 the LVM volume manager to create metadata devices as needed.
1517
1518 When the index form of this parameter is not used, the size of the
1519 lower-level device determines the size of the metadata. The size
1520 needed is 36 KiB + (size of lower-level device) / 32K * (number of
1521 nodes - 1). If the metadata device is bigger than that, the extra
1522 space is not used.
1523
1524 This parameter is required if a disk other than none is specified,
1525 and ignored if disk is set to none. A meta-disk parameter without a
1526 disk parameter is not allowed.
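
     As a worked example of the size formula above: for a 1 TiB
     lower-level device in a three-node cluster, external metadata
     needs about 36 KiB + 1 TiB / 32K * 2, which is roughly 64 MiB. A
     volume with such an external, non-indexed metadata device might be
     defined as in the following sketch; the device and LVM volume
     names are hypothetical:

         volume 0 {
             device "/dev/drbd2";
             disk "/dev/vg0/data";
             meta-disk "/dev/vg0/data-md";   # external metadata, no index
         }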
1527
1528NOTES ON DATA INTEGRITY
1529 DRBD supports two different mechanisms for data integrity checking:
1530 first, the data-integrity-alg network parameter allows adding a
1531 checksum to the data sent over the network. Second, the online
1532 verification mechanism (drbdadm verify and the verify-alg
1533 parameter) allows checking for differences in the on-disk data.
1534
1535 Both mechanisms can produce false positives if the data is modified
1536 during I/O (i.e., while it is being sent over the network or written to
1537 disk). This does not always indicate a problem: for example, some file
1538 systems and applications do modify data under I/O for certain
1539 operations. Swap space can also undergo changes while under I/O.
1540
1541 Network data integrity checking tries to identify data modification
1542 during I/O by verifying the checksums on the sender side after sending
1543 the data. If it detects a mismatch, it logs an error. The receiver also
1544 logs an error when it detects a mismatch. Thus, an error logged only on
1545 the receiver side indicates an error on the network, and an error
1546 logged on both sides indicates data modification under I/O.
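
     Both checks are typically configured in the net section. The sketch
     below enables an online-verification algorithm and a network
     data-integrity checksum; sha1 and crc32c are only examples, and any
     digest algorithm supported by the kernel's crypto API can be used:

         net {
             verify-alg sha1;             # used by "drbdadm verify"
             data-integrity-alg crc32c;   # checksums replicated data
         }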
1547
1548 The most recent example of systematic data corruption was identified as
1549 a bug in the TCP offloading engine and driver of a certain type of GBit
1550 NIC in 2007: the data corruption happened on the DMA transfer from core
1551 memory to the card. Because the TCP checksums were calculated on
1552 the card, the TCP/IP protocol checksums did not reveal this problem.
1553
1554VERSION
1555 This document was revised for version 9.0.0 of the DRBD distribution.
1556
1557AUTHOR
1558 Written by Philipp Reisner <philipp.reisner@linbit.com> and Lars
1559 Ellenberg <lars.ellenberg@linbit.com>.
1560
1561REPORTING BUGS
1562 Report bugs to <drbd-user@lists.linbit.com>.
1563
1564COPYRIGHT
1565 Copyright 2001-2018 LINBIT Information Technologies, Philipp Reisner,
1566 Lars Ellenberg. This is free software; see the source for copying
1567 conditions. There is NO warranty; not even for MERCHANTABILITY or
1568 FITNESS FOR A PARTICULAR PURPOSE.
1569
1570SEE ALSO
1571 drbd(8), drbdsetup(8), drbdadm(8), DRBD User's Guide[1], DRBD Web
1572 Site[3]
1573
1574NOTES
1575 1. DRBD User's Guide
1576 http://www.drbd.org/users-guide/
1577
1578 2. Online Usage Counter
1579 http://usage.drbd.org
1582
1583 3. DRBD Web Site
1584 http://www.drbd.org/
1585
1586
1587
1588DRBD 9.0.x 17 January 2018 DRBD.CONF(5)